AHMassBank 1.7.0
Authors: Johannes Rainer [cre] (ORCID: https://orcid.org/0000-0002-6977-7147)
Compiled: Tue Oct 29 16:08:13 2024
MassBank is an open access, community maintained annotation database for small
compounds. Annotations provided by this database comprise names, chemical
formulas, exact masses and other chemical properties for small compounds
(including metabolites, medical treatment agents and others). In addition,
fragment spectra are available which are crucial for the annotation of
untargeted mass spectrometry data. The CompoundDb Bioconductor package supports
conversion of MassBank data into the CompDb (SQLite) format, which simplifies
distribution of the resource and integration into Bioconductor-based annotation
workflows.
CompDb Databases from AnnotationHub
The AHMassBank package provides the metadata for all CompDb SQLite databases
with MassBank annotations in AnnotationHub. To get and use MassBank annotations,
we first load/update the AnnotationHub resource.
library(AnnotationHub)
ah <- AnnotationHub()
Next we list all MassBank entries from AnnotationHub.
query(ah, "MassBank")
## AnnotationHub with 6 records
## # snapshotDate(): 2024-10-24
## # $dataprovider: MassBank
## # $species: NA
## # $rdataclass: CompDb
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## # rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["AH107048"]]'
##
## title
## AH107048 | MassBank CompDb for release 2021.03
## AH107049 | MassBank CompDb for release 2022.06
## AH111334 | MassBank CompDb for release 2022.12.1
## AH116164 | MassBank CompDb for release 2023.06
## AH116165 | MassBank CompDb for release 2023.09
## AH116166 | MassBank CompDb for release 2023.11
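Rather than reading the identifier off the listing, the most recent release can also be picked programmatically. The following is a minimal sketch, assuming that record titles keep the pattern "MassBank CompDb for release <version>" seen in the listing. The `titles` vector is copied by hand from that output so the block runs without network access, and the variable names are illustrative only.

```r
# Titles copied by hand from the listing; in a live session they could be
# obtained with: qr <- query(ah, "MassBank"); titles <- setNames(qr$title, names(qr))
titles <- c(
    AH107048 = "MassBank CompDb for release 2021.03",
    AH107049 = "MassBank CompDb for release 2022.06",
    AH111334 = "MassBank CompDb for release 2022.12.1",
    AH116164 = "MassBank CompDb for release 2023.06",
    AH116165 = "MassBank CompDb for release 2023.09",
    AH116166 = "MassBank CompDb for release 2023.11"
)

# Strip the fixed prefix so that only the release string remains
releases <- sub("^MassBank CompDb for release ", "", titles)

# numeric_version() compares dot-separated integers component by component,
# so "2022.12.1" and "2023.06" are ordered correctly
versions <- numeric_version(releases)

# AnnotationHub identifier of the record with the highest release version
latest_id <- names(titles)[which(versions == max(versions))]
latest_id
```

The resulting identifier could then be used to retrieve the record with `ah[[latest_id]]`.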
We fetch the CompDb with MassBank annotations for release 2021.03.
qr <- query(ah, c("MassBank", "2021.03"))
cdb <- qr[[1]]
## loading from cache
## require("CompoundDb")
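Once loaded, the CompDb object can be queried with functions from the CompoundDb package. The sketch below is hedged: it assumes the database uses the standard CompDb column names (`compound_id`, `name`, `formula`, `exactmass`) and that the Spectra package is installed. The retrieval step is repeated so that the block runs on its own, and it needs network access or a populated AnnotationHub cache.

```r
library(AnnotationHub)
library(CompoundDb)

# Same retrieval as above, repeated so that this block is self-contained
ah <- AnnotationHub()
cdb <- query(ah, c("MassBank", "2021.03"))[[1]]

# Print a short summary of the database (data source, version, content)
cdb

# Compound annotations as a data.frame; the column names are assumed to
# follow the standard CompDb layout
cmps <- compounds(cdb, columns = c("compound_id", "name", "formula",
                                   "exactmass"))
head(cmps)

# Fragment spectra as a Spectra object (requires the Spectra package)
sps <- Spectra(cdb)
length(sps)
```

Returning the spectra as a `Spectra` object allows them to be passed to other Bioconductor tools that operate on this class.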
CompDb Databases from MassBank
MassBank provides its annotation database as a MySQL dump. To simplify its usage
(also for users not experienced with MySQL or with the specific MassBank
database layout), MassBank annotations can also be converted into the
(SQLite-based) CompDb format which can be easily used with the CompoundDb
package. The steps to convert a MassBank MySQL database to a CompDb SQLite
database are described below.
First, the MySQL database dump needs to be downloaded from the MassBank GitHub
page. This database needs to be installed into a local MySQL/MariaDB database
server (using mysql -h localhost -u <username> -p < MassBank.sql, with
<username> being the name of a user with write access to the database server).
To transfer the MassBank data into a CompDb database, a helper function from
the CompoundDb package can be used.
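library(RMariaDB)
# Connect to the local MassBank database; replace the <username> and
# <password> placeholders with the actual credentials.
con <- dbConnect(MariaDB(), host = "localhost", username = "<username>",
                 password = "<password>", dbname = "MassBank")
# Load the helper function shipped with CompoundDb and run the conversion.
source(system.file("scripts", "massbank_to_compdb.R", package = "CompoundDb"))
massbank_to_compdb(con)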
sessionInfo()
## R Under development (unstable) (2024-10-21 r87258)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] CompoundDb_1.11.0 S4Vectors_0.45.0 AnnotationFilter_1.31.0
## [4] AnnotationHub_3.15.0 BiocFileCache_2.15.0 dbplyr_2.5.0
## [7] BiocGenerics_0.53.0 BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 dplyr_1.1.4 blob_1.2.4
## [4] filelock_1.0.3 Biostrings_2.75.0 bitops_1.0-9
## [7] fastmap_1.2.0 lazyeval_0.2.2 RCurl_1.98-1.16
## [10] digest_0.6.37 mime_0.12 lifecycle_1.0.4
## [13] cluster_2.1.6 ProtGenerics_1.39.0 rsvg_2.6.1
## [16] KEGGREST_1.47.0 RSQLite_2.3.7 magrittr_2.0.3
## [19] compiler_4.5.0 rlang_1.1.4 sass_0.4.9
## [22] tools_4.5.0 utf8_1.2.4 yaml_2.3.10
## [25] knitr_1.48 htmlwidgets_1.6.4 bit_4.5.0
## [28] curl_5.2.3 xml2_1.3.6 BiocParallel_1.41.0
## [31] withr_3.0.2 purrr_1.0.2 grid_4.5.0
## [34] fansi_1.0.6 colorspace_2.1-1 ggplot2_3.5.1
## [37] MASS_7.3-61 scales_1.3.0 cli_3.6.3
## [40] rmarkdown_2.28 crayon_1.5.3 generics_0.1.3
## [43] httr_1.4.7 rjson_0.2.23 DBI_1.2.3
## [46] cachem_1.1.0 zlibbioc_1.53.0 parallel_4.5.0
## [49] AnnotationDbi_1.69.0 BiocManager_1.30.25 XVector_0.47.0
## [52] base64enc_0.1-3 vctrs_0.6.5 jsonlite_1.8.9
## [55] bookdown_0.41 IRanges_2.41.0 bit64_4.5.2
## [58] clue_0.3-65 jquerylib_0.1.4 glue_1.8.0
## [61] codetools_0.2-20 DT_0.33 Spectra_1.17.0
## [64] gtable_0.3.6 BiocVersion_3.21.1 GenomeInfoDb_1.43.0
## [67] GenomicRanges_1.59.0 UCSC.utils_1.3.0 munsell_0.5.1
## [70] tibble_3.2.1 pillar_1.9.0 rappdirs_0.3.3
## [73] htmltools_0.5.8.1 GenomeInfoDbData_1.2.13 R6_2.5.1
## [76] evaluate_1.0.1 Biobase_2.67.0 png_0.1-8
## [79] memoise_2.0.1 bslib_0.8.0 MetaboCoreUtils_1.15.0
## [82] Rcpp_1.0.13 gridExtra_2.3 ChemmineR_3.59.0
## [85] xfun_0.48 fs_1.6.4 MsCoreUtils_1.19.0
## [88] pkgconfig_2.0.3