Abstract
This vignette is a guide containing example code for performing real-life tasks. Importantly, it covers some functionality that was not covered in the Quick-Start vignette (because it is too computationally intensive to be reproducible in a vignette). Version 1.9.0
For instructions on installing and configuring SpliceWiz, please see the Quick-Start vignette.
library(SpliceWiz)
#> Loading required package: NxtIRFdata
#> SpliceWiz package loaded with 2 threads
#> Use setSWthreads() to set the number of SpliceWiz threads
First, define the path to the directory in which the reference should be stored. This directory will be made by SpliceWiz, but its parent directory must exist, otherwise an error will be returned.
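As a minimal sketch of this step, the snippet below defines ref_path and confirms that its parent directory exists. The location under tempdir() is only an assumption for illustration; substitute any directory whose parent already exists.

```r
# Hypothetical location for the reference; tempdir() is used purely for illustration
ref_path <- file.path(tempdir(), "Reference")

# Only the parent directory needs to exist beforehand;
# the reference directory itself is created by SpliceWiz
parent_dir <- dirname(ref_path)
stopifnot(dir.exists(parent_dir))
```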
Note that setting genome_type = "hg38" will prompt SpliceWiz to use the default files for nonPolyA and Mappability exclusion references in the generation of its reference. Valid options for genome_type are “hg38”, “hg19”, “mm10” and “mm9”.
buildRef(
reference_path = ref_path,
fasta = "genome.fa", gtf = "transcripts.gtf",
genome_type = "hg38"
)
buildRef() first runs getResources(), which prepares the genome and gene annotations by storing compressed local copies in the resources subdirectory of the given reference path: a binary compressed version of the FASTA file (a.k.a. TwoBitFile) and a gzipped GTF file. If fasta and/or gtf are https or ftp links, the resources will be downloaded from the internet (which may take a while).
After local compressed versions of the genome and gene annotations are prepared, buildRef() will proceed to generate the SpliceWiz reference.
Note that these two steps can be run separately. getResources() will prepare local compressed copies of the FASTA / GTF resources without generating the SpliceWiz reference. Running buildRef(), with reference_path specifying where the resources were prepared previously with getResources(), will perform the 2nd step (SpliceWiz reference generation) without needing to prepare the genome resources (in this case, set the parameters fasta = "" and gtf = "").
As an example, the steps below:
getResources(
reference_path = ref_path,
fasta = "genome.fa",
gtf = "transcripts.gtf"
)
buildRef(
reference_path = ref_path,
fasta = "", gtf = "",
genome_type = "hg38"
)
are equivalent to this:
buildRef(
reference_path = ref_path,
fasta = "genome.fa",
gtf = "transcripts.gtf",
genome_type = "hg38"
)
To re-build and overwrite an existing reference, using the same resource annotations, set overwrite = TRUE:
# Assuming hg38 genome:
buildRef(
reference_path = ref_path,
genome_type = "hg38",
overwrite = TRUE
)
If buildRef() is run without setting overwrite = TRUE, it will terminate if the file SpliceWiz.ref.gz is found within the reference directory.
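One way to decide in advance whether overwrite = TRUE is needed is to look for that file directly. The helper below is a hypothetical convenience function written in base R, not part of SpliceWiz; it assumes only the file name stated above.

```r
# Hypothetical helper (not a SpliceWiz function): TRUE if a built reference
# file is already present in the given reference directory
hasBuiltReference <- function(reference_path) {
    file.exists(file.path(reference_path, "SpliceWiz.ref.gz"))
}

# Demonstration on a throw-away directory
demo_path <- tempfile("Reference")
dir.create(demo_path)
before <- hasBuiltReference(demo_path)   # no reference file yet

# Create an empty placeholder with the expected file name
file.create(file.path(demo_path, "SpliceWiz.ref.gz"))
after <- hasBuiltReference(demo_path)    # placeholder is now detected
```

If hasBuiltReference(ref_path) returns TRUE for a real reference directory, pass overwrite = TRUE to buildRef() to re-build it.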
The following will first download the genome and gene annotation files from the online resource and store local copies of them in a file cache, facilitated by BiocFileCache. It then uses the downloaded resources to create the SpliceWiz reference.
FTP <- "ftp://ftp.ensembl.org/pub/release-94/"
buildRef(
reference_path = ref_path,
fasta = paste0(FTP, "fasta/homo_sapiens/dna/",
"Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz"),
gtf = paste0(FTP, "gtf/homo_sapiens/",
"Homo_sapiens.GRCh38.94.chr.gtf.gz"),
genome_type = "hg38"
)
AnnotationHub contains Ensembl references for many genomes. To browse what is available:
require(AnnotationHub)
#> Loading required package: AnnotationHub
#> Loading required package: BiocGenerics
#>
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#> Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
#> as.data.frame, basename, cbind, colnames, dirname, do.call,
#> duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
#> lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
#> pmin.int, rank, rbind, rownames, sapply, saveRDS, setdiff, table,
#> tapply, union, unique, unsplit, which.max, which.min
#> Loading required package: BiocFileCache
#> Loading required package: dbplyr
ah <- AnnotationHub()
query(ah, "Ensembl")
#> AnnotationHub with 36971 records
#> # snapshotDate(): 2024-10-24
#> # $dataprovider: Ensembl, FANTOM5,DLRP,IUPHAR,HPRD,STRING,SWISSPROT,TREMBL,E...
#> # $species: Mus musculus, Sus scrofa, Homo sapiens, Rattus norvegicus, Danio...
#> # $rdataclass: TwoBitFile, GRanges, EnsDb, SQLiteFile, data.frame, OrgDb, li...
#> # additional mcols(): taxonomyid, genome, description,
#> # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#> # rdatapath, sourceurl, sourcetype
#> # retrieve records with, e.g., 'object[["AH5046"]]'
#>
#> title
#> AH5046 | Ensembl Genes
#> AH5160 | Ensembl Genes
#> AH5311 | Ensembl Genes
#> AH5434 | Ensembl Genes
#> AH5435 | Ensembl EST Genes
#> ... ...
#> AH117289 | LRBaseDb for Taeniopygia guttata (Zebra finch, v008)
#> AH117290 | LRBaseDb for Takifugu rubripes (Fugu, v008)
#> AH117291 | LRBaseDb for Ursus maritimus (Polar bear, v008)
#> AH117292 | LRBaseDb for Vulpes vulpes (Red fox, v008)
#> AH117293 | LRBaseDb for Xenopus tropicalis (Tropical clawed frog, v008)
For a more specific query:
query(ah, c("Homo Sapiens", "release-94"))
#> AnnotationHub with 9 records
#> # snapshotDate(): 2024-10-24
#> # $dataprovider: Ensembl
#> # $species: Homo sapiens
#> # $rdataclass: TwoBitFile, GRanges
#> # additional mcols(): taxonomyid, genome, description,
#> # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#> # rdatapath, sourceurl, sourcetype
#> # retrieve records with, e.g., 'object[["AH64628"]]'
#>
#> title
#> AH64628 | Homo_sapiens.GRCh38.94.abinitio.gtf
#> AH64629 | Homo_sapiens.GRCh38.94.chr.gtf
#> AH64630 | Homo_sapiens.GRCh38.94.chr_patch_hapl_scaff.gtf
#> AH64631 | Homo_sapiens.GRCh38.94.gtf
#> AH65744 | Homo_sapiens.GRCh38.cdna.all.2bit
#> AH65745 | Homo_sapiens.GRCh38.dna.primary_assembly.2bit
#> AH65746 | Homo_sapiens.GRCh38.dna_rm.primary_assembly.2bit
#> AH65747 | Homo_sapiens.GRCh38.dna_sm.primary_assembly.2bit
#> AH65748 | Homo_sapiens.GRCh38.ncrna.2bit
We wish to fetch “AH65745” and “AH64631”, which contain the desired FASTA and GTF files, respectively. To build a reference using these resources:
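A minimal sketch of such a call, using the two AnnotationHub record IDs in place of file paths. The reference location is a hypothetical choice for illustration, and running this requires network access to retrieve the records.

```r
library(SpliceWiz)

# Hypothetical reference location; its parent directory must exist
ref_path <- file.path(tempdir(), "Reference_AH")

buildRef(
    reference_path = ref_path,
    fasta = "AH65745",   # Homo_sapiens.GRCh38.dna.primary_assembly.2bit
    gtf = "AH64631",     # Homo_sapiens.GRCh38.94.gtf
    genome_type = "hg38"
)
```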
Build-Reference-methods will recognise the inputs of fasta and gtf as AnnotationHub resources if they begin with “AH”.
For human and mouse genomes, we highly recommend specifying genome_type, as the default mappability file is used to exclude intronic regions with repeat sequences from intron retention analysis. For other species, one could generate a SpliceWiz reference without this mappability file:
buildRef(
reference_path = ref_path,
fasta = "genome.fa", gtf = "transcripts.gtf",
genome_type = ""
)
If one wishes to prepare a Mappability Exclusion for species other than human or mouse, please see the Calculating Mappability Exclusions using STAR section below.
For human and mouse genomes, gene ontology annotations are automatically generated. The species is inferred when genome_type is set to a human or mouse genome. For other species, or to specify human/mouse explicitly, set the ontologySpecies parameter of buildRef().
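As a hedged illustration for a non-human, non-mouse species, the call below sets ontologySpecies alongside an empty genome_type. The file names and reference location are placeholders, and it is assumed that the species name must match one of the names returned by getAvailableGO().

```r
library(SpliceWiz)

# Hypothetical reference location; its parent directory must exist
ref_path <- file.path(tempdir(), "Reference_Athaliana")

buildRef(
    reference_path = ref_path,
    fasta = "genome.fa", gtf = "transcripts.gtf",   # placeholder file names
    genome_type = "",                               # no default human/mouse resources
    ontologySpecies = "Arabidopsis thaliana"        # assumed to match a getAvailableGO() entry
)
```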
Only Ensembl/OrgDb resources are supported (for now). For a list of available species:
getAvailableGO()
#> [1] "Anopheles gambiae"
#> [2] "Arabidopsis thaliana"
#> [3] "Bos taurus"
#> [4] "Canis familiaris"
#> [5] "Gallus gallus"
#> [6] "Pan troglodytes"
#> [7] "Escherichia coli"
#> [8] "Drosophila melanogaster"
#> [9] "Homo sapiens"
#> [10] "Mus musculus"
#> [11] "Sus scrofa"
#> [12] "Rattus norvegicus"
#> [13] "Macaca mulatta"
#> [14] "Caenorhabditis elegans"
#> [15] "Xenopus laevis"
#> [16] "Saccharomyces cerevisiae"
#> [17] "Danio rerio"
#> [18] "Triticum aestivum"
#> [19] "Triticum aestivum_subsp._aestivum"
#> [20] "Triticum vulgare"
#> [21] "Brassica napus"
#> [22] "Arachis hypogaea"
#> [23] "Hibiscus syriacus"
#> [24] "Acridium cancellatum"
#> [25] "Schistocerca cancellata"
#> [26] "Triticum dicoccoides"
#> [27] "Triticum turgidum_subsp._dicoccoides"
#> [28] "Triticum turgidum_var._dicoccoides"
#> [29] "Dendrohyas sarda"
#> [30] "Hyla arborea_sarda"
#> [31] "Hyla sarda"
#> [32] "Locusta gregaria"
#> [33] "Schistocerca gregaria"
#> [34] "Gossypium hirsutum"
#> [35] "Gossypium hirsutum_subsp._mexicanum"
#> [36] "Gossypium lanceolatum"
#> [37] "Gossypium purpurascens"
#> [38] "Camelina sativa"
#> [39] "Myagrum sativum"
#> [40] "Carassius auratus_gibelio"
#> [41] "Carassius gibelio_gibelio"
#> [42] "Carassius gibelio"
#> [43] "Carassius gibelio_subsp._gibelio"
#> [44] "Cyprinus gibelio"
#> [45] "Schistocerca piceifrons"
#> [46] "Papaver somniferum"
#> [47] "Zingiber officinale"
#> [48] "Trichomonas vaginalis_G3"
#> [49] "Trichomonas vaginalis_strain_G3"
#> [50] "Helianthus annuus"
#> [51] "Schistocerca americana"
#> [52] "Acipenser ruthenus"
#> [53] "Schistocerca serialis_cubense"
#> [54] "Panicum virgatum"
#> [55] "Nicotiana tabacum"
#> [56] "Oncorhynchus mykiss"
#> [57] "Oncorhynchus nerka_mykiss"
#> [58] "Parasalmo mykiss"
#> [59] "Salmo mykiss"
#> [60] "Schistocerca nitens"
#> [61] "Schistocerca vaga"
#> [62] "Salvia splendens"
#> [63] "Carassius carassius"
#> [64] "Cyprinus carassius"
#> [65] "Vicia villosa"
#> [66] "Camellia sinensis"
#> [67] "Thea sinensis"
#> [68] "Oncorhynchus keta"
#> [69] "Salmo keta"
#> [70] "Pisum sativum"
#> [71] "Salmo salar"
#> [72] "Raphanus sativus"
#> [73] "Oncorhynchus kisutch"
#> [74] "Oncorhyncus kisutch"
#> [75] "Salmo kisatch"
#> [76] "Lolium rigidum"
#> [77] "Aegilops squarrosa_subsp._squarrosa"
#> [78] "Aegilops squarrosa"
#> [79] "Aegilops tauschii"
#> [80] "Patropyrum tauschii_subsp._tauschii"
#> [81] "Patropyrum tauschii"
#> [82] "Triticum aegilops"
#> [83] "Triticum tauschii"
#> [84] "Salmo trutta"
#> [85] "Cryptomeria japonica"
#> [86] "Cupressus japonica"
#> [87] "Coregonus clupeaformis"
#> [88] "Salmo clupeaformis"
#> [89] "Oncorhynchus gorbuscha"
#> [90] "Salmo gorbuscha"
#> [91] "Cyprinus carpio"
#> [92] "Glycine max_subsp._soja"
#> [93] "Glycine soja"
#> [94] "Salmo fontinalis"
#> [95] "Salvelinus fontinalis"
#> [96] "Glycine max"
#> [97] "Phaseolus max"
#> [98] "Chenopodium quinoa"
#> [99] "Hordeum sativum"
#> [100] "Hordeum vulgare_subsp._vulgare"
#> [101] "Hordeum vulgare_var._nudum"
#> [102] "Hordeum vulgare_var._vulgare"
#> [103] "Festuca perennis_(L.)_Columbus_&_J.P.Sm.,_2010"
#> [104] "Festuca perennis"
#> [105] "Lolium perenne"
#> [106] "Lolium vulgare"
#> [107] "Coffea arabica"
#> [108] "Barbus grahami"
#> [109] "Sinocyclocheilus grahami"
#> [110] "Sinocyclocheilus rhinocerous"
#> [111] "Gossypium arboreum"
#> [112] "Brassica oleracea"
#> [113] "Malus sylvestris"
#> [114] "Pyrus malus_var._sylvestris"
#> [115] "Astyanax mexicanus"
#> [116] "Tetragonopterus mexicanus"
#> [117] "Arachis stenosperma"
#> [118] "Prosopis alba"
#> [119] "Sinocyclocheilus anshuiensis"
#> [120] "Brassica rapa"
#> [121] "Lactuca sativa"
#> [122] "Dreissena polymorpha"
#> [123] "Mytilus polymorphus"
#> [124] "Hydractinia symbiolongicarpus"
#> [125] "Triticum urartu"
#> [126] "Hevea brasiliensis"
#> [127] "Siphonia brasiliensis"
#> [128] "Oncorhynchus tschawytscha"
#> [129] "Oncorhynchus tshawytscha"
#> [130] "Salmo tshawytscha"
#> [131] "Arachis ipaensis"
#> [132] "Zea mays"
#> [133] "Zea mays_var._japonica"
#> [134] "Salmo namaycush"
#> [135] "Salvelinus namaycush"
#> [136] "Capsicum annuum"
#> [137] "Brienomyrus brachyistius"
#> [138] "Marcusenius brachyistius"
#> [139] "Convolvulus nil"
#> [140] "Ipomoea nil"
#> [141] "Pharbitis nil"
#> [142] "Olea europaea_subsp._europaea_var._sylvestris"
#> [143] "Olea europaea_var._oleaster"
#> [144] "Olea europaea_var._sylvestris"
#> [145] "Olea europea_subsp._sylvestris"
#> [146] "Alosa sapidissima"
#> [147] "Clupea sapidissima"
#> [148] "Carpiodes asiaticus"
#> [149] "Myxocyprinus asiaticus"
#> [150] "Actinidia eriantha"
#> [151] "Gossypium raimondii"
#> [152] "Salmo alpinus"
#> [153] "Salvelinus alpinus"
#> [154] "Catostomus texanus"
#> [155] "Xyrauchen texanus"
#> [156] "Doryrhamphus excisus"
#> [157] "Quercus lobata"
#> [158] "Malus communis"
#> [159] "Malus domestica"
#> [160] "Malus pumila_auct."
#> [161] "Malus pumila_var._domestica"
#> [162] "Malus sylvestris_var._domestica"
#> [163] "Malus x_domestica"
#> [164] "Pyrus malus"
#> [165] "Pyrus malus_var._domestica"
#> [166] "Quercus suber"
#> [167] "Oncorhynchus nerka"
#> [168] "Salmo nerka"
#> [169] "Nicotiana tomentosiformis"
#> [170] "Carya illinoensis"
#> [171] "Carya illinoinensis"
#> [172] "Mercenaria mercenaria"
#> [173] "Venus mercenaria"
#> [174] "Quercus robur"
#> [175] "Durio zibethinus"
#> [176] "Pongo abelii"
#> [177] "Pongo pygmaeus_abelii"
#> [178] "Pongo pygmaeus_abeli"
#> [179] "Mya arenaria"
#> [180] "Arachis duranensis"
#> [181] "Arachis spegazzinii"
#> [182] "Pyrus x_bretschneideri"
#> [183] "Trifolium pratense"
#> [184] "Gorilla gorilla_gorilla"
#> [185] "Cobitis anguillicaudata"
#> [186] "Misgurnus anguillicaudatus"
#> [187] "Scaphiopus bombifrons"
#> [188] "Spea bombifrons"
#> [189] "Haliotis rufenscens"
#> [190] "Haliotis rufescens"
#> [191] "Oreochromis nilotica"
#> [192] "Oreochromis niloticus"
#> [193] "Perca nilotica"
#> [194] "Tilapia nilotica"
#> [195] "Acropora convexa"
#> [196] "Acropora millepora"
#> [197] "Acropora singularis"
#> [198] "Cebus apella"
#> [199] "Sapajus apella"
#> [200] "Simia apella"
#> [201] "Eucalyptus grandis"
#> [202] "Dasypus fenestratus"
#> [203] "Dasypus novemcinctus"
#> [204] "Callithrix jacchus_jacchus"
#> [205] "Callithrix jacchus"
#> [206] "Simia jacchus"
#> [207] "Pistacia vera"
#> [208] "greater Indian_fruit_bat"
#> [209] "Pteropus giganteus"
#> [210] "Pteropus medius"
#> [211] "Salvia miltiorhiza"
#> [212] "Salvia miltiorrhiza"
#> [213] "Daphnia pulicaria"
#> [214] "Magnolia sinica"
#> [215] "Manglietia sinica"
#> [216] "Manglietiastrum sinicum"
#> [217] "Pachylarnax sinica_(Y.W.Law)_N.H.Xia_&_C.Y.Wu"
#> [218] "Rosa chinensis"
#> [219] "Rosa indica_auct.,_non_L."
#> [220] "Mytilus californianus"
#> [221] "Pteropus vampyrus"
#> [222] "Vespertilio vampyrus"
#> [223] "Chinemys reevesii"
#> [224] "Chinemys reevesi"
#> [225] "Emys reevesii"
#> [226] "Geoclemys reevesii"
#> [227] "Geoclemys reevessi"
#> [228] "Mauremys reevesii"
#> [229] "Mauremys reevesi"
#> [230] "Choloepus brasiliensis_Fitzinger_1871"
#> [231] "Choloepus brasiliensis"
#> [232] "Choloepus didactylus"
#> [233] "Macaca nemestrina"
#> [234] "Simia nemestrina"
#> [235] "Lotus corniculatus_var._japonicus"
#> [236] "Lotus japonicus"
#> [237] "Nicotiana sylvestris"
#> [238] "Tupaia belangeri_chinensis"
#> [239] "Tupaia chinensis"
#> [240] "Clarias gariepinus"
#> [241] "Clarias lazera"
#> [242] "Silurus gariepinus"
#> [243] "Barbus tetrazona"
#> [244] "Capoeta tetrazona"
#> [245] "Puntigrus tetrazona"
#> [246] "Puntius tetrazona"
#> [247] "Systomus tetrazona"
#> [248] "Lycium ferocissimum"
#> [249] "Nicotiana attenuata"
#> [250] "Octodon degus"
#> [251] "Haliotis rubra"
#> [252] "Aedes albopictus"
#> [253] "Stegomyia albopicta"
#> [254] "Spinacia oleracea"
#> [255] "Paramecium aurelia_syngen_4"
#> [256] "Paramecium tetraurelia"
#> [257] "Salvia hispanica"
#> [258] "Medicago truncatula"
#> [259] "Crassostrea virginica"
#> [260] "Ostrea virginica"
#> [261] "Felis catus"
#> [262] "Felis domesticus"
#> [263] "Felis silvestris_catus"
#> [264] "Anubis baboon"
#> [265] "Papio anubis"
#> [266] "Papio cynocephalus_anubis"
#> [267] "Papio doguera"
#> [268] "Papio hamadryas_anubis"
#> [269] "Papio hamadryas_doguera"
#> [270] "Simia anubis"
#> [271] "Pongo pygmaeus"
#> [272] "Simia pygmaeus"
#> [273] "Sorex etruscus"
#> [274] "Suncus etruscus"
#> [275] "Mimosa cineraria"
#> [276] "Prosopis cineraria"
#> [277] "Nycticebus coucang"
#> [278] "Tardigradus coucang"
#> [279] "Rhododendron vialii"
#> [280] "Pan paniscus"
#> [281] "Nematostella vectensis"
#> [282] "Ixodes dammini"
#> [283] "Ixodes scapularis"
#> [284] "Lupinus angustifolius"
#> [285] "Ipomoea triloba"
#> [286] "Equus asinus"
#> [287] "Emiliania huxleyi_CCMP1516"
#> [288] "Emiliania huxleyi_CCMP2090"
#> [289] "Mangifera indica"
#> [290] "Pteropus alecto"
#> [291] "Rana temporaria"
#> [292] "Crassostrea gigas"
#> [293] "Magallana gigas"
#> [294] "Ostrea gigas"
#> [295] "Etheostoma spectabile"
#> [296] "Poecilichthys spectabilis"
#> [297] "Macadamia integrifolia"
#> [298] "Megalobrama amblycephala"
#> [299] "Halichoerus grypus"
#> [300] "Phoca grypus"
#> [301] "Juglans regia"
#> [302] "Selaginella moellendorffii"
#> [303] "Selaginella moellendorfii"
#> [304] "Pleuronectes platessa"
#> [305] "Presbytis francoisi"
#> [306] "Trachypithecus francoisi"
#> [307] "Tripterygium wilfordii"
#> [308] "Argiope bruennichi"
#> [309] "Lepus cuniculus"
#> [310] "Oryctolagus cuniculus"
#> [311] "Huro salmoides"
#> [312] "Labrus salmoides"
#> [313] "Labrus salmonides"
#> [314] "Micropterus nigricans"
#> [315] "Micropterus salmoides"
#> [316] "Solanum stenotomum"
#> [317] "Heterocephalus glaber"
#> [318] "Neosciurus carolinensis"
#> [319] "Sciurus carolinensis"
#> [320] "Cervus elaphus"
#> [321] "Polyodon spathula"
#> [322] "Squalus spathula"
#> [323] "Gadus chalcogrammus"
#> [324] "Theragra chalcogramma"
#> [325] "Nothobranchius furzeri"
#> [326] "Bos bubalis"
#> [327] "Bubalus arnee_bubalis"
#> [328] "Bubalus bubalis"
#> [329] "Pleuronectes solea"
#> [330] "Solea solea"
#> [331] "Solea vulgaris"
#> [332] "Mastomys coucha"
#> [333] "Praomys coucha"
#> [334] "Impatiens glandulifera"
#> [335] "Dermacentor andersoni"
#> [336] "Felis nebulosa"
#> [337] "Neofelis nebulosa"
#> [338] "Pteropus egyptiacus"
#> [339] "Rousettus aegyptiacus"
#> [340] "Rousettus aegypticus"
#> [341] "Rousettus egyptiacus"
#> [342] "Phoenix dactylifera"
#> [343] "Pimephales promelas"
#> [344] "Ostrea edulis"
#> [345] "Peromyscus maniculatus_bairdii"
#> [346] "Gasterosteus pungitius"
#> [347] "Pungitius pungitius"
#> [348] "Populus alba"
#> [349] "Cricetus auratus"
#> [350] "Golden hamsters"
#> [351] "Mesocricetus auratus"
#> [352] "Syrian hamsters"
#> [353] "Chromis aureus"
#> [354] "Oreochromis aurea"
#> [355] "Oreochromis aureus"
#> [356] "Daucus carota_subsp._sativus"
#> [357] "Daucus carota_var._sativus"
#> [358] "Dermacentor silvarum"
#> [359] "Hylobates syndactylus"
#> [360] "Simia syndactyla"
#> [361] "Symphalangus syndactylus"
#> [362] "Saccharolobus solfataricus"
#> [363] "Sulfolobus solfataricus"
#> [364] "Felis geoffroyi"
#> [365] "Leopardus geoffroyi"
#> [366] "Oncifelis geoffroyi"
#> [367] "Felis yagouaroundi"
#> [368] "Herpailurus yagouaroundi"
#> [369] "Herpailurus yaguarondi"
#> [370] "Puma yagouaroundii"
#> [371] "Puma yagouaroundi"
#> [372] "Cervus canadensis"
#> [373] "Populus diversifolia"
#> [374] "Populus euphratica"
#> [375] "Cucurbita pepo_subsp._pepo"
#> [376] "Cucurbita pepo_var._medullosa"
#> [377] "Cucurbita pepo_var._pepo"
#> [378] "Macaca cynomolgus"
#> [379] "Macaca fascicularis"
#> [380] "Macaca irus"
#> [381] "Simia fascicularis"
#> [382] "Emys muticus"
#> [383] "Geoclemmys mutica"
#> [384] "Mauremys mutica"
#> [385] "Suricata suricatta"
#> [386] "Viverra suricatta"
#> [387] "Hylobates moloch"
#> [388] "Simia moloch"
#> [389] "Solanum dulcamara"
#> [390] "Cucurbita moschata"
#> [391] "Coffea eugeniodes"
#> [392] "Coffea eugenioides"
#> [393] "Cucurbita maxima"
#> [394] "Labrus bergylta"
#> [395] "Centropristis striata"
#> [396] "Labrus striatus"
#> [397] "Oryza sativa_(japonica_cultivar-group)"
#> [398] "Oryza sativa_Japonica_Group"
#> [399] "Oryza sativa_subsp._japonica"
#> [400] "Jaculus jaculus"
#> [401] "Mus jaculus"
#> [402] "Dioscorea cayenensis_subsp._rotundata_(Poir.)_J.Miege,_1968"
#> [403] "Dioscorea cayenensis_subsp._rotundata"
#> [404] "Dioscorea rotundata"
#> [405] "Cercopithecus aethiops_sabaeus"
#> [406] "Cercopithecus sabaeus"
#> [407] "Cercopithecus sabeus"
#> [408] "Chlorocebus aethiops_sabaeus"
#> [409] "Chlorocebus aethiops_sabeus"
#> [410] "Chlorocebus sabaeus"
#> [411] "Chlorocebus sabeus"
#> [412] "Simia sabaea"
#> [413] "Marmota monax"
#> [414] "Mus monax"
#> [415] "Pygathrix roxellana"
#> [416] "Rhinopithecus roxellana"
#> [417] "Semnopithecus roxellana"
#> [418] "Callorhinus ursinus"
#> [419] "Callorhynus ursius"
#> [420] "Phoca ursina"
#> [421] "Cricetulus barabensis_griseus"
#> [422] "Cricetulus griseus"
#> [423] "Elephantulus edwardii"
#> [424] "Macroscelides edwardii"
#> [425] "Cobitis heteroclita"
#> [426] "Fundulus heteroclitus"
#> [427] "Neothunnus macropterus"
#> [428] "Scomber albacares"
#> [429] "Thunnus albacares"
#> [430] "Telopea speciosissima"
#> [431] "Danio aesculapii"
#> [432] "Danio sp._'snakeskin'"
#> [433] "Danio sp._snakeskin"
#> [434] "Apodemus sylvaticus"
#> [435] "Mus sylvaticus"
#> [436] "Sylvaemus sylvaticus"
#> [437] "Populus balsamifera_subsp._trichocarpa"
#> [438] "Populus trichocarpa"
#> [439] "Mercurialis ambigua"
#> [440] "Mercurialis annua"
#> [441] "Eugenia oleosa"
#> [442] "Syzygium oleosum"
#> [443] "Citellus tridecemlineatus"
#> [444] "Ictidomys tridecemlineatus"
#> [445] "Spermophilus tridecemlineatus"
#> [446] "Ovis ammon_aries"
#> [447] "Ovis aries"
#> [448] "Ovis orientalis_aries"
#> [449] "Ovis ovis"
#> [450] "Solanum verrucosum"
#> [451] "Leo pardus"
#> [452] "Panthera pardus"
#> [453] "Microtus oregoni"
#> [454] "Arabidopsis lyrata_subsp._lyrata"
#> [455] "Arabis lyrata_subsp._lyrata"
#> [456] "Arabis lyrata"
#> [457] "Cardaminopsis lyrata"
#> [458] "Manihot esculenta"
#> [459] "Manihot utilissima"
#> [460] "Mustela erminea"
#> [461] "Dolichos unguiculatus"
#> [462] "Phaseolus unguiculatus"
#> [463] "Vigna unguiculata"
#> [464] "Lycopersicon pennellii_(Correll)_D'Arcy,_1982"
#> [465] "Solanum pennellii_Correll,_1958"
#> [466] "Solanum pennellii"
#> [467] "Panicum viride"
#> [468] "Setaria viridis"
#> [469] "Musa AA_Group"
#> [470] "Musa acuminata_AA_Group"
#> [471] "Musa acuminata"
#> [472] "Musa nana"
#> [473] "Gymnostomus macrolepis"
#> [474] "Onychostoma macrolepis"
#> [475] "Scaphesthes macrolepis"
#> [476] "Varicorhinus macrolepis"
#> [477] "Varicorhinus (Scaphesthes)_macrolepis"
#> [478] "Oryza glaberrima"
#> [479] "Pelteobagrus fulvidraco"
#> [480] "Pimelodus fulvidraco"
#> [481] "Pseudobagrus fulvidraco"
#> [482] "Tachysurus fulvidraco"
#> [483] "Hylobates concolor_leucogenys"
#> [484] "Hylobates concolor_leucogyneus"
#> [485] "Hylobates leucogenys_leucogenys"
#> [486] "Hylobates leucogenys"
#> [487] "Nomascus leucogenys_leucogenys"
#> [488] "Nomascus leucogenys"
#> [489] "Nomascus leukogenys"
#> [490] "Nannospalax ehrenbergi_galili"
#> [491] "Nannospalax galili"
#> [492] "Spalax galili"
#> [493] "Equus caballus"
#> [494] "Equus przewalskii_f._caballus"
#> [495] "Equus przewalskii_forma_caballus"
#> [496] "Thunnus maccoyii"
#> [497] "Thynnus maccoyii"
#> [498] "Chromis diagramma"
#> [499] "Simochromis diagramma"
#> [500] "Diplophysa dalaica"
#> [501] "Triplophysa dalaica"
#> [502] "Felis tigris"
#> [503] "Panthera tigris"
#> [504] "Echinus purpuratus"
#> [505] "Strongylocentrotus purpuratus"
#> [506] "Lucioperca lucioperca"
#> [507] "Perca lucioperca"
#> [508] "Sander lucioperca"
#> [509] "Stizostedion lucioperca"
#> [510] "Dipodomys spectabilis"
#> [511] "Acinonyx jubatus"
#> [512] "Felis jubata"
#> [513] "Conyza canadensis"
#> [514] "Erigeron canadensis"
#> [515] "Mustela lutreola"
#> [516] "Camelus bactrianus_ferus"
#> [517] "Camelus ferus"
#> [518] "Cajanus cajan"
#> [519] "Didelphys domestica"
#> [520] "Monodelphis domestica"
#> [521] "Pygathrix bieti"
#> [522] "Rhinopithecus bieti"
#> [523] "Saimiri boliviensis"
#> [524] "Hesperomys eremicus"
#> [525] "Peromyscus eremicus"
#> [526] "Arabidopsis salsuginea"
#> [527] "Eutrema salsugineum"
#> [528] "Hesperis salsuginea"
#> [529] "Sisymbrium salsugineum"
#> [530] "Stenophragma salsugineum"
#> [531] "Thellungiella salsuginea"
#> [532] "Thelypodium salsugineum"
#> [533] "Coetomys damarensis"
#> [534] "Cryptomys damarensis"
#> [535] "Fukomys damarensis"
#> [536] "Leptonychotes weddellii"
#> [537] "Leptonychotes weddelli"
#> [538] "Otaria weddellii"
#> [539] "Grammomys dolichurus_surdaster"
#> [540] "Grammomys surdaster"
#> [541] "Thamnomys surdaster"
#> [542] "Solanum aracc-papa"
#> [543] "Solanum tuberosum"
#> [544] "Andropogon sorghum"
#> [545] "Sorghum bicolor"
#> [546] "Sorghum bicolor_subsp._bicolor"
#> [547] "Sorghum nervosum"
#> [548] "Sorghum saccharatum"
#> [549] "Sorghum vulgare"
#> [550] "Holocentrus calcarifer"
#> [551] "Lates calcarifer"
#> [552] "Hippopotamus amphibius_kiboko"
#> [553] "Ixodes sanguineus"
#> [554] "Rhipicephalus sanguineus"
#> [555] "Clupea harengus_harengus"
#> [556] "Clupea harengus"
#> [557] "Bos indicus_x_Bos_taurus"
#> [558] "Bos primigenius_indicus_x_Bos_primigenius_taurus"
#> [559] "Bos taurus_indicus_x_Bos_taurus_taurus"
#> [560] "Bos taurus_x_Bos_indicus"
#> [561] "Chrysochloris asiatica"
#> [562] "Talpa asiatica"
#> [563] "Bufo bufo"
#> [564] "Rana bufo"
#> [565] "Maylandia zebra"
#> [566] "Metriaclima zebra"
#> [567] "Pseudotropheus sp._'Pseudotropheus_zebra_complex'"
#> [568] "Pseudotropheus zebra"
#> [569] "Ictalurus punctatus"
#> [570] "Silurus punctatus"
#> [571] "Oryx dammah"
#> [572] "Camelus dromedarius"
#> [573] "Asparagus litoralis"
#> [574] "Asparagus officinalis"
#> [575] "Amaranthus gangeticus"
#> [576] "Amaranthus mangostanus"
#> [577] "Amaranthus tricolor"
#> [578] "Pneumatophorus japonicus"
#> [579] "Scomber japonicus"
#> [580] "Lutra lutra"
#> [581] "Peromyscus leucopus"
#> [582] "Perca fluviatilis"
#> [583] "Pagothenia bernacchii"
#> [584] "Pseudotrematomus bernacchii"
#> [585] "Trematomus bernacchii"
#> [586] "Trematomus bernacchi"
#> [587] "Lipurus cinereus"
#> [588] "Phascolarctos cinereus"
#> [589] "Mustela furo"
#> [590] "Mustela putorius_furo"
#> [591] "Chaetochloa italica"
#> [592] "Panicum italicum"
#> [593] "Pennisetum macrochaetum"
#> [594] "Setaria italica"
#> [595] "Setaria viridis_subsp._italica"
#> [596] "Elaeis guineensis"
#> [597] "Mus rattus"
#> [598] "Rattus rattoides"
#> [599] "Rattus rattus"
#> [600] "Rattus wroughtoni"
#> [601] "Acropora digitifera"
#> [602] "Madrepora digitifera"
#> [603] "Echinops telfairii"
#> [604] "Echinops telfairi"
#> [605] "Madrepora verrucosa"
#> [606] "Pocillopora danae"
#> [607] "Pocillopora verrucosa"
#> [608] "Myotis daubentonii"
#> [609] "Myotis daubentoni"
#> [610] "Vespertilio daubentonii"
#> [611] "Limia formosa"
#> [612] "Mollienesia formosa"
#> [613] "Poecilia formosa"
#> [614] "Phyllostomus discolor"
#> [615] "Microcebus murinus"
#> [616] "Peromyscus californicus_insignis"
#> [617] "Peromyscus californicus_subsp._insignis"
#> [618] "Galago garnettii"
#> [619] "Galago garnetti"
#> [620] "Otolemur garnettii"
#> [621] "Arvicanthis niloticus"
#> [622] "Mus niloticus"
#> [623] "Didelphis ursina"
#> [624] "Vombatus ursinus"
#> [625] "Phaseolus angularis"
#> [626] "Vigna angularis"
#> [627] "Haitia acuta"
#> [628] "Physa acuta"
#> [629] "Physa heterostropha"
#> [630] "Physa integra"
#> [631] "Physella acuta"
#> [632] "Physella heterostropha"
#> [633] "Physella integra"
#> [634] "Ctenopharyngodon idella"
#> [635] "Ctenopharyngodon idellus"
#> [636] "Leuciscus idella"
#> [637] "Thalassophryne amazonica"
#> [638] "Cyprinus rohita"
#> [639] "Labeo rohita"
#> [640] "Talpa occidentalis"
#> [641] "Bombina bombina"
#> [642] "Rana bombina"
#> [643] "Cavia aperea_porcellus"
#> [644] "Cavia cobaya"
#> [645] "Cavia porcellus"
#> [646] "Mus porcellus"
#> [647] "Odocoileus virginianus"
#> [648] "Amphibalanus amphitrite"
#> [649] "Balanus amphitrite"
#> [650] "Panicum hallii"
#> [651] "Angill angill"
#> [652] "Anguilla anguilla_anguilla"
#> [653] "Anguilla anguilla"
#> [654] "Muraena anguilla"
#> [655] "Delphinus orca"
#> [656] "Orcinus orca"
#> [657] "Cannabis sativa"
#> [658] "Penaeus bubulus"
#> [659] "Penaeus carinatus"
#> [660] "Penaeus durbani"
#> [661] "Penaeus monodon"
#> [662] "Penaeus (Penaeus)_monodon"
#> [663] "Didelphis vulpecula"
#> [664] "Trichosurus vulpecula"
#> [665] "Myotis lucifugus"
#> [666] "Vespertilio lucifugus"
#> [667] "Brachypodium distachyon"
#> [668] "Bromus distachyos"
#> [669] "Aotus nancymaae"
#> [670] "Aotus nancymai"
#> [671] "Rhamnus zizyphus"
#> [672] "Ziziphus jujuba"
#> [673] "Ailuropoda melanoleuca"
#> [674] "Micropterus dolomieu"
#> [675] "Micropterus velox"
#> [676] "Lycopersicon esculentum"
#> [677] "Lycopersicon esculentum_var._esculentum"
#> [678] "Solanum esculentum"
#> [679] "Solanum lycopersicum"
#> [680] "Solanum lycopersicum_var._humboldtii"
#> [681] "Poecilia mexicana"
#> [682] "Manis pentadactyla"
#> [683] "Meles meles"
#> [684] "Ursus meles"
#> [685] "Ornithorhynchus anatinus"
#> [686] "Platypus anatinus"
#> [687] "Felis uncia"
#> [688] "Panthera uncia"
#> [689] "Uncia uncia"
#> [690] "Alligator mississippiensis"
#> [691] "Crocodilus mississipiensis"
#> [692] "Myrmecophaga aculeata"
#> [693] "Tachyglossus aculeatus"
#> [694] "Colossoma macropomum"
#> [695] "Myletes macropomus"
#> [696] "Cordylus capensis"
#> [697] "Cordylus (Hemicordylus)_capensis"
#> [698] "Hemicordylus capensis"
#> [699] "Pseudocordylus capensis"
#> [700] "Zonurus capensis"
#> [701] "Eptesicus fuscus"
#> [702] "Vespertilio fuscus"
#> [703] "Dromiciops australis"
#> [704] "Dromiciops gliroides"
#> [705] "Camelus pacos"
#> [706] "Lama guanicoe_pacos"
#> [707] "Lama pacos"
#> [708] "Vicugna pacos"
#> [709] "Mollienesia latipinna"
#> [710] "Poecilia latipinna"
#> [711] "Elephas maximus_indicus"
#> [712] "Corylus avellana"
#> [713] "Ostrea maxima"
#> [714] "Pecten maximus"
#> [715] "Felis viverrina"
#> [716] "Prionailurus viverrinus"
#> [717] "Gymnodraco acuticeps"
#> [718] "Thalarctos maritimus"
#> [719] "Ursus maritimus"
#> [720] "Lemur catta"
#> [721] "Myotis myotis"
#> [722] "Vespertilio myotis"
#> [723] "Lytechinus pictus"
#> [724] "Psammechinus pictus"
#> [725] "Litopenaeus vannamei"
#> [726] "Penaeus (Litopenaeus)_vannamei"
#> [727] "Penaeus vannamei"
#> [728] "Ursus arctos"
#> [729] "Vitis riparia"
#> [730] "Felis bengalensis"
#> [731] "Prionailurus bengalensis"
#> [732] "Clethrionomys glareolus"
#> [733] "Mus glareolus"
#> [734] "Myodes glareolus"
#> [735] "Mustela nigripes"
#> [736] "Putorius nigripes"
#> [737] "Alopex lagopus"
#> [738] "Canis lagopus"
#> [739] "Vulpes lagopus"
#> [740] "Cercocebus atys"
#> [741] "Cercocebus torquatus_atys"
#> [742] "Simia atys"
#> [743] "Lepidosiren annectens"
#> [744] "Protopterus annectens"
#> [745] "Rhinocryptis annectens"
#> [746] "Cerasus avium"
#> [747] "Prunus avium"
#> [748] "Prunus cerasus_var._avium"
#> [749] "Procambarus clarkii"
#> [750] "Sorex fumeus"
#> [751] "Macrorhinus angustirostris"
#> [752] "Mirounga angustirostris"
#> [753] "Beta vulgaris_subsp._vulgaris"
#> [754] "Beta vulgaris_subsp._vulgaris_var._altissima"
#> [755] "Beta vulgaris_Sugar_Beet_Group"
#> [756] "Beta vulgaris_var._altissima"
#> [757] "Eumetopias jubatus"
#> [758] "Phoca jubata"
#> [759] "Centruroides sculpturatus"
#> [760] "Diceros bicornis_minor"
#> [761] "Cicer arietinum"
#> [762] "Cleome hassleriana_Chodat,_1898"
#> [763] "Tarenaya hassleriana"
#> [764] "Sebastes umbrosus"
#> [765] "Sebastichthys umbrosus"
#> [766] "Eriocheir chinensis"
#> [767] "Eriocheir japonica_sinensis"
#> [768] "Eriocheir sinensis"
#> [769] "Dicentrarchus labrax"
#> [770] "Labrax labrax"
#> [771] "Morone labrax"
#> [772] "Perca labrax"
#> [773] "Roccus labrax"
#> [774] "Sciaena labrax"
#> [775] "Acanthopagrus latus"
#> [776] "Sparus latus"
#> [777] "Xiphophorus hellerii"
#> [778] "Xiphophorus helleri"
#> [779] "Acanthochromis polyacanthus"
#> [780] "Acanthochromis polyacathus"
#> [781] "Dascyllus polyacanthus"
#> [782] "Mustela vison"
#> [783] "Neogale vison"
#> [784] "Neovison vison"
#> [785] "Lingula anatina"
#> [786] "Lingula lingua"
#> [787] "Lingula nipponica"
#> [788] "Lingula unguis"
#> [789] "Madrepora faveolata"
#> [790] "Montastraea faveolata"
#> [791] "Montastrea faveolata"
#> [792] "Orbicella faveolata"
#> [793] "Esox lucius"
#> [794] "Chinchilla lanigera"
#> [795] "Chinchilla velligera"
#> [796] "Chinchilla villidera"
#> [797] "Mirounga leonina"
#> [798] "Phoca leonina"
#> [799] "Perognathus longimembris_pacificus"
#> [800] "Cynocephalus variegatus"
#> [801] "Galeopithecus variegatus"
#> [802] "Galeopterus variegatus"
#> [803] "Vigna radiata"
#> [804] "Vitis vinifera"
#> [805] "Vitis vinifera_subsp._vinifera"
#> [806] "Characodon multiradiatus"
#> [807] "Girardinichthys multiradiatus"
#> [808] "Marmota flaviventris"
#> [809] "Phaseolus calcaratus"
#> [810] "Phaseolus chrysanthos"
#> [811] "Phaseolus chrysanthus"
#> [812] "Vigna calcarata"
#> [813] "Vigna umbellata"
#> [814] "Balaenoptera acutorostrata"
#> [815] "Canis procyonoides"
#> [816] "Nyctereutes procyonoides"
#> [817] "Amphioxus floridae"
#> [818] "Branchiostoma floridae"
#> [819] "Moschus berezovskii"
#> [820] "Erythranthe guttata"
#> [821] "Mimulus guttatus_subsp._guttatus"
#> [822] "Mimulus guttatus"
#> [823] "Camelus bactrianus"
#> [824] "Desmodus rotundus"
#> [825] "Phyllostoma rotundum"
#> [826] "Octopus sinensis"
#> [827] "Physeter catodon"
#> [828] "Physeter macrocephalus"
#> [829] "Alexandromys fortis"
#> [830] "Microtus fortis"
#> [831] "Dendronephthya gigantea"
#> [832] "Canis hyaena"
#> [833] "Hyaena hyaena"
#> [834] "Helicophagus hypophthalmus"
#> [835] "Pangasianodon hypophthalmus"
#> [836] "Pangasius hypophthalmus"
#> [837] "Pangasius sutchi"
#> [838] "Pseudochaenichthys georgianus"
#> [839] "Capsella rubella"
#> [840] "Perkinsus marinus_ATCC_50983"
#> [841] "Holocentrus leopardus"
#> [842] "Plectropomus leopardus"
#> [843] "Hippocampus zosterae"
#> [844] "Artibeus jamaicensis"
#> [845] "Citrus sinensis"
#> [846] "Citrus x_sinensis"
#> [847] "Punica granatum"
#> [848] "Abrus cyaneus"
#> [849] "Abrus precatorius"
#> [850] "Polypterus senegalus"
#> [851] "Acomys russatus"
#> [852] "Mus russatus"
#> [853] "Hemibagrus wyckioides"
#> [854] "Macrones wyckioides"
#> [855] "Mystus wyckioides"
#> [856] "Melanotaenia boesemani"
#> [857] "Sturnira hondurensis"
#> [858] "Amphilophus centrarchus"
#> [859] "Archocentrus centrarchus"
#> [860] "Cichlasoma centrarchus"
#> [861] "Heros centrarchus"
#> [862] "Delphinus melas"
#> [863] "Globicephala melaena"
#> [864] "Globicephala melas"
#> [865] "Manis javanica"
#> [866] "Phyllostomus hastatus"
#> [867] "Vespertilio hastatus"
#> [868] "Scyliorhinus canicula"
#> [869] "Squalus canicula"
#> [870] "Silurana tropicalis"
#> [871] "Xenopus laevis_tropicalis"
#> [872] "Xenopus (Silurana)_tropicalis"
#> [873] "Xenopus tropicalis"
#> [874] "Pipistrellus kuhlii"
#> [875] "Pipistrellus kuhli"
#> [876] "Vespertilio kuhlii"
#> [877] "Solea senegalensis"
#> [878] "Mugil cephalotus"
#> [879] "Mugil cephalus"
#> [880] "Mugil galapagensis"
#> [881] "Mugil japonicus"
#> [882] "Siphostoma scovelli"
#> [883] "Syngnathus scovelli"
#> [884] "Capra aegagrus_hircus"
#> [885] "Capra hircus"
#> [886] "Poeciliopsis prolifica"
#> [887] "Gopherus flavomarginatus"
#> [888] "Lontra canadensis"
#> [889] "Lutra canadensis"
#> [890] "Hesperomys torridus"
#> [891] "Onychomys torridus"
#> [892] "Elephas africanus"
#> [893] "Loxodonta africana_africana"
#> [894] "Loxodonta africana"
#> [895] "Boophilus microplus"
#> [896] "Rhipicephalus (Boophilus)_microplus"
#> [897] "Rhipicephalus microplus"
#> [898] "Molossus molossus"
#> [899] "Vespertilio molossus"
#> [900] "Lagenorhynchus obliquidens"
#> [901] "Delphinus cymodoce"
#> [902] "Delphinus truncatus"
#> [903] "Tursiops cymodoce"
#> [904] "Tursiops truncatus"
#> [905] "Morone flavescens"
#> [906] "Perca flavescens"
#> [907] "Euarctos americanus"
#> [908] "Ursus americanus"
#> [909] "Arvicola nivalis"
#> [910] "Chionomys nivalis"
#> [911] "Microtus nivalis"
#> [912] "Felis rufus"
#> [913] "Lynx rufus"
#> [914] "Myotis brandtii"
#> [915] "Vespertilio brandtii"
#> [916] "Astatotilapia burtoni"
#> [917] "Chromis burtoni"
#> [918] "Haplochromis burtoni"
#> [919] "Silurus meridionalis"
#> [920] "Silurus soldatovi_meridionalis"
#> [921] "Cucumis melo"
#> [922] "Hydra attenuata"
#> [923] "Hydra carnea"
#> [924] "Hydra littoralis"
#> [925] "Hydra magnipapillata"
#> [926] "Hydra vulgaris"
#> [927] "Anoplopoma fimbria"
#> [928] "Gadus fimbria"
#> [929] "Alosa alosa"
#> [930] "Clupea alosa"
#> [931] "Chelonia mydas"
#> [932] "Testudo mydas"
#> [933] "Ctenocephalides felis"
#> [934] "Stylophora pistillata"
#> [935] "Cyrtodiopsis dalmanii"
#> [936] "Diopsis dalmanni"
#> [937] "Teleopsis dalmanni"
#> [938] "Rhagoletis zephyria"
#> [939] "Rhodamnia argentea"
#> [940] "Gasterosteus aculeatus"
#> [941] "Labrus celidotus"
#> [942] "Notolabrus celidotus"
#> [943] "Budorcas taxicolor"
#> [944] "Nelumbo nucifera"
#> [945] "Amphiprion ocellaris"
#> [946] "Arvicola amphibius"
#> [947] "Arvicola terrestris_(Linnaeus,_1758)"
#> [948] "Mus amphibius"
#> [949] "Daphnia magna"
#> [950] "Phaseolus vulgaris"
#> [951] "Psammomys obesus"
#> [952] "Carlito syrichta"
#> [953] "Simia syrichta"
#> [954] "Tarsius syrichta"
#> [955] "Cyprinodon tularosa"
#> [956] "Arvicola princeps"
#> [957] "Ochotona princeps"
#> [958] "Phytophthora sojae"
#> [959] "Equus caballus_przewalskii"
#> [960] "Equus ferus_przewalskii"
#> [961] "Equus przewalskii"
#> [962] "Phoca vitulina"
#> [963] "Coecilia bivitatum"
#> [964] "Rhinatrema bivitattum"
#> [965] "Rhinatrema bivittatum"
#> [966] "Lagomys curzoniae"
#> [967] "Ochotona curzonae"
#> [968] "Ochotona curzoniae"
#> [969] "Kogia breviceps"
#> [970] "Physeter breviceps"
#> [971] "Clupea cyprinoides"
#> [972] "Megalops cyprinoides"
#> [973] "Diospyros lotus"
#> [974] "Hippoglossus stenolepis"
#> [975] "Phacochoerus africanus"
#> [976] "Corythoichthys intestinalis"
#> [977] "Syngnatus intestinalis"
#> [978] "Mandrillus leucophaeus"
#> [979] "Papio leucophaeus"
#> [980] "Simia leucophaea"
#> [981] "Epinephelus fuscoguttatus"
#> [982] "Perca summana_fuscoguttata"
#> [983] "Asterias miniata"
#> [984] "Asterina miniata"
#> [985] "Patiria miniata"
#> [986] "Rhinolophus rouxii_sinicus"
#> [987] "Rhinolophus sinicus"
#> [988] "Lampris incognitus"
#> [989] "Monachus schauinslandi"
#> [990] "Neomonachus schauinslandi"
#> [991] "Hippoglossus hippoglossus"
#> [992] "Pleuronectes hippoglossus"
#> [993] "Andrographis paniculata"
#> [994] "Etheostoma cragini"
#> [995] "Perca chuatsi"
#> [996] "Siniperca chuatsi"
#> [997] "Meriones unguiculatus"
#> [998] "Colobus angolensis_palliatus"
#> [999] "Notothenia coriiceps"
#> [1000] "Hypomesus transpacificus"
#> [1001] "Dermochelys coriacea"
#> [1002] "Testudo coriacea"
#> [1003] "Bufo bufo_gargarizans"
#> [1004] "Bufo gargarizans"
#> [1005] "Bufo japonicus_gargarizans"
#> [1006] "Delphinapterus leucas"
#> [1007] "Delphinus leucas"
#> [1008] "Fugu flavidus"
#> [1009] "Takifugu flavidus"
#> [1010] "Pteronotus mesoamericanus"
#> [1011] "Pteronotus parnellii_mesoamericanus"
#> [1012] "Citrus clementina"
#> [1013] "Citrus deliciosa_x_Citrus_sinensis"
#> [1014] "Citrus x_clementina"
#> [1015] "Fugu rubripes"
#> [1016] "Sphaeroides rubripes"
#> [1017] "Takifugu rubripes"
#> [1018] "Tetraodon rubripes"
#> [1019] "Homarus americanus"
#> [1020] "Osteoglossum formosum"
#> [1021] "Scleropages formosus"
#> [1022] "Larimichthys crocea"
#> [1023] "Pseudosciaena amblyceps"
#> [1024] "Pseudosciaena crocea"
#> [1025] "Sciaena crocea"
#> [1026] "Fragaria vesca"
#> [1027] "Folsomia candida"
#> [1028] "Limulus polyphemus"
#> [1029] "Monoculus polyphemus"
#> [1030] "Doryrhamphus dactyliophorus"
#> [1031] "Dunckerocampus dactyliophorus"
#> [1032] "Syngnathus dactyliophorus"
#> [1033] "Epinephelus lanceolatus"
#> [1034] "Holocentrus lanceolatus"
#> [1035] "Promicrops lanceolatus"
#> [1036] "Mizuhopecten yessoensis"
#> [1037] "Patinopecten yessoensis"
#> [1038] "Patiopecten yessoensis"
#> [1039] "Pecten yessoensis"
#> [1040] "Platypoecilus maculatus"
#> [1041] "Xiphophorus maculatus"
#> [1042] "Triplophysa rosa"
#> [1043] "Antechinus flavipes"
#> [1044] "Phascogale flavipes"
#> [1045] "Balaena musculus"
#> [1046] "Balaenoptera musculus"
#> [1047] "Rhinolophus ferrumequinum"
#> [1048] "Vespertilio ferrumequinum"
#> [1049] "Oryza brachyantha"
#> [1050] "Chrysemys picta"
#> [1051] "Testudo picta"
#> [1052] "Trachemys picta"
#> [1053] "Tetrahymena thermophila_SB210"
#> [1054] "Amygdalus communis"
#> [1055] "Amygdalus dulcis"
#> [1056] "Prunus amygdalus"
#> [1057] "Prunus communis"
#> [1058] "Prunus dulcis"
#> [1059] "Prunus dulcis_var._sativa"
#> [1060] "Oryzias latipes"
#> [1061] "Poecilia latipes"
#> [1062] "Sarcophilus harrisii"
#> [1063] "Sarcophilus laniarius_(Owen,_1838)"
#> [1064] "Sarcophilus laniarius"
#> [1065] "Ursinus harrisii"
#> [1066] "Ictalurus furcatus"
#> [1067] "Pimelodus furcatus"
#> [1068] "Amphioxus belcheri"
#> [1069] "Branchiostoma belcheri"
#> [1070] "Gigantopelta aegis"
#> [1071] "Echinus variegatus"
#> [1072] "Lytechinus variegatus"
#> [1073] "Diaphorina citri"
#> [1074] "Epinephelus moara"
#> [1075] "Serranus moara"
#> [1076] "Stegodyphus dumicola"
#> [1077] "Boleophthalmus pectinirostris"
#> [1078] "Gobius pectinirostris"
#> [1079] "Austrofundulus limnaeus"
#> [1080] "Columba livia_domestica"
#> [1081] "Columba livia"
#> [1082] "Latimeria chalumnae"
#> [1083] "Pleuronectes maximus"
#> [1084] "Psetta maxima"
#> [1085] "Rhombus maximus"
#> [1086] "Scophthalmus maximus"
#> [1087] "Sesamum indicum"
#> [1088] "Sesamum orientale"
#> [1089] "Cyclopterus lumpus"
#> [1090] "Armeniaca mume"
#> [1091] "Prunus mume"
#> [1092] "Myotis davidii"
#> [1093] "Vespertilio Davidii"
#> [1094] "Didelphys agilis"
#> [1095] "Gracilinanus agilis"
#> [1096] "Acanthophacelus reticulata"
#> [1097] "Poecilia (Acanthophacelus)_reticulata"
#> [1098] "Poecilia latipinna_reticulata"
#> [1099] "Poecilia reticulata"
#> [1100] "Australorbis glabratus"
#> [1101] "Biomphalaria glabrata"
#> [1102] "Planorbis glabratus"
#> [1103] "Hypudaeus ochrogaster"
#> [1104] "Microtus ochrogaster"
#> [1105] "Amygdalus persica"
#> [1106] "Persica vulgaris"
#> [1107] "Prunus persica"
#> [1108] "Prunus persica_var._densa"
#> [1109] "Chiloscyllium plagiosum"
#> [1110] "Scyllium plagiosum"
#> [1111] "Cheilinus undulatus"
#> [1112] "Phodopus roborovskii"
#> [1113] "Caenorhabditis remanei"
#> [1114] "Caenorhabditis vulgaris"
#> [1115] "Lamprologus brichardi"
#> [1116] "Neolamprologus brichardi"
#> [1117] "Gymnopis unicolor"
#> [1118] "Microcaecilia unicolor"
#> [1119] "Rhinatrema unicolor"
#> [1120] "Sciaena jaculatrix"
#> [1121] "Toxotes jaculatrix"
#> [1122] "Bos indicus"
#> [1123] "Bos primigenius_indicus"
#> [1124] "Bos taurus_indicus"
#> [1125] "Lacerta sicula_raffonei"
#> [1126] "Podarcis raffoneae"
#> [1127] "Podarcis raffonei"
#> [1128] "Podarcis wagleriana_raffonei"
#> [1129] "Benincasa cerifera"
#> [1130] "Benincasa hispida"
#> [1131] "Benincasa pruriens"
#> [1132] "Cucurbita hispida"
#> [1133] "Lagenaria siceraria_var._hispida"
#> [1134] "Dendrobium catenatum"
#> [1135] "Marsupenaeus japonicus"
#> [1136] "Penaeus japonicus"
#> [1137] "Penaeus (Marsupenaeus)_japonicus"
#> [1138] "Penaeus (Melicertus)_japonicus"
#> [1139] "Chaetodon argus"
#> [1140] "Scatophagus argus"
#> [1141] "Chanos chanos"
#> [1142] "Mugil chanos"
#> [1143] "Bison bison_bison"
#> [1144] "Bos bison_bison"
#> [1145] "Amblyraja radiata"
#> [1146] "Raja radiata"
#> [1147] "Amphimedon queenslandica"
#> [1148] "Hippocampus comes"
#> [1149] "Hipposideros armiger"
#> [1150] "Rhinolophus armiger"
#> [1151] "Cynoglossus (Arelia)_semilaevis"
#> [1152] "Cynoglossus semilaevis"
#> [1153] "Alecto japonica"
#> [1154] "Anneissia japonica"
#> [1155] "Oxycomanthus japonicus"
#> [1156] "Ananas comosus"
#> [1157] "Ananas comosus_var._comosus"
#> [1158] "Ananas lucidus"
#> [1159] "Bromelia comosa"
#> [1160] "Callionymus splendidus"
#> [1161] "Pterosynchiropus splendidus"
#> [1162] "Synchiropus splendidus"
#> [1163] "Neophocaena asiaeorientalis_asiaeorientalis"
#> [1164] "Coluber guttatus"
#> [1165] "Elaphe guttata"
#> [1166] "Pantherophis guttatus"
#> [1167] "Pollicipes cornucopia"
#> [1168] "Pollicipes pollicipes"
#> [1169] "Pseudoliparis swirei"
#> [1170] "Chelonoidis abingdonii"
#> [1171] "Chelonoidis abingdoni"
#> [1172] "Chelonoidis nigra_abingdonii"
#> [1173] "Geochelone nigra_abigdonii"
#> [1174] "Geochelone nigra_abingdoni"
#> [1175] "Geochelone nigra_ephippium"
#> [1176] "Testudo abingdonii"
#> [1177] "Rhincodon typus"
#> [1178] "Ricinus communis"
#> [1179] "Ricinus sanguineus"
#> [1180] "Malania oleifera"
#> [1181] "Ceratotherium simum_simum"
#> [1182] "Kryptolebias marmoratus"
#> [1183] "Rivulus marmoratus"
#> [1184] "Patella vulgata"
#> [1185] "Rhagoletis pomonella"
#> [1186] "Trypanosoma cruzi"
#> [1187] "Squalus fasciatus"
#> [1188] "Squalus tigrinus"
#> [1189] "Stegostoma fasciatum"
#> [1190] "Stegostoma tigrinum"
#> [1191] "Cistudo triunguis"
#> [1192] "Terrapene mexicana_triunguis"
#> [1193] "Terrapene triunguis"
#> [1194] "Odobenus rosmarus_divergens"
#> [1195] "Manatus latirostris"
#> [1196] "Trichechus manatus_latirostris"
#> [1197] "Carcharodon carcharias"
#> [1198] "Squalus carcharias"
#> [1199] "Macrognathus armatus"
#> [1200] "Mastacembelus armatus"
#> [1201] "Anas boschas"
#> [1202] "Anas domesticus"
#> [1203] "Anas platyrhynchos_f._domestica"
#> [1204] "Anas platyrhynchos"
#> [1205] "Theobroma cacao"
#> [1206] "Diabrotica virgifera_virgifera"
#> [1207] "Actinia diaphana"
#> [1208] "Aiptasia pallida"
#> [1209] "Aiptasia pulchella"
#> [1210] "Dysactis pallida"
#> [1211] "Exaiptasia diaphana"
#> [1212] "Exaiptasia pallida"
#> [1213] "Syngnathus acus_rubescens"
#> [1214] "Syngnathus acus"
#> [1215] "Syngnathus rubescens"
#> [1216] "Caretta caretta"
#> [1217] "Testudo caretta"
#> [1218] "Guillardia theta_CCMP2712"
#> [1219] "Anarrhichthys ocellatus"
#> [1220] "Pelodiscus sinensis"
#> [1221] "Trionyx sinensis"
#> [1222] "Hippoglossus olivaceus"
#> [1223] "Paralichthys olivaceus"
#> [1224] "Xiphias gladius"
#> [1225] "Cyprinodon variegatus"
#> [1226] "Bos grunniens_mutus"
#> [1227] "Bos mutus"
#> [1228] "Poephagus mutus"
#> [1229] "Alligator sinensis"
#> [1230] "Morus notabilis"
#> [1231] "Nymphaea colorata"
#> [1232] "Photinus pyralis"
#> [1233] "Periophthalmus magnuspinnatus"
#> [1234] "Meleagris gallopavo"
#> [1235] "Pomacea canaliculata"
#> [1236] "Haplochromis nyererei"
#> [1237] "Pundamilia nyererei"
#> [1238] "Caranx dumerili"
#> [1239] "Seriola dumerili"
#> [1240] "Macrosteles (Macrosteles)_quadrilineatus"
#> [1241] "Macrosteles quadrilineatus"
#> [1242] "Enhydra lutris_kenyoni"
#> [1243] "Fluta alba"
#> [1244] "Monopterus albus"
#> [1245] "Muraena alba"
#> [1246] "Caecilia seraphini"
#> [1247] "Caecilia Seraphini"
#> [1248] "Geotrypetes seraphini"
#> [1249] "Hypogeophis seraphini"
#> [1250] "Chaetodon rostratus"
#> [1251] "Chelmon rostratus"
#> [1252] "Cucumis sativus"
#> [1253] "Cyrtodactylus macularius"
#> [1254] "Eublepharis macularius"
#> [1255] "Felis concolor"
#> [1256] "Panthera concolor"
#> [1257] "Puma concolor"
#> [1258] "Cancer chinensis"
#> [1259] "Fenneropenaeus chinensis"
#> [1260] "Penaeus chinensis"
#> [1261] "Pomacentrus partitus"
#> [1262] "Stegastes partitus"
#> [1263] "Phascum patens"
#> [1264] "Physcomitrella patens_subsp._patens"
#> [1265] "Physcomitrella patens"
#> [1266] "Physcomitrium patens"
#> [1267] "Anas jamaicensis"
#> [1268] "Oxyura jamaicensis"
#> [1269] "Drosophila miranda"
#> [1270] "Lottia gigantea"
#> [1271] "Eurytemora affinis"
#> [1272] "Temora affinis"
#> [1273] "Crotalus tigris"
#> [1274] "Argentina anserina"
#> [1275] "Potentilla anserina"
#> [1276] "Achaearanea tepidariorum"
#> [1277] "Parasteatoda tepidariorum"
#> [1278] "Theridion tepidariorum"
#> [1279] "Uranotaenia lowii"
#> [1280] "Cynolebias whitei"
#> [1281] "Nematolebias whitei"
#> [1282] "Simpsonichthys whitei"
#> [1283] "Sceloporus undulatus"
#> [1284] "Stellio undulatus"
#> [1285] "Helobdella robusta"
#> [1286] "Styela clava"
#> [1287] "Manis afer_afer"
#> [1288] "Orycteropus afer_afer"
#> [1289] "Leucoraja erinacea"
#> [1290] "Raja erinacea"
#> [1291] "Raja erinaceus"
#> [1292] "Raja erinacia"
#> [1293] "Phytophthora nicotianae_INRA-310"
#> [1294] "Anas olor"
#> [1295] "Cygnus olor"
#> [1296] "Lacerta agilis"
#> [1297] "Millepora damicornis"
#> [1298] "Pocillopora caespitosa_laysanensis"
#> [1299] "Pocillopora damicornis_laysanensis"
#> [1300] "Pocillopora damicornis"
#> [1301] "Morone saxatilis"
#> [1302] "Perca saxatilis"
#> [1303] "Miniopterus natalensis"
#> [1304] "Miniopterus schreibersii_natalensis"
#> [1305] "Vespertilio natalensis"
#> [1306] "Anas cygnoid"
#> [1307] "Anser cygnoides"
#> [1308] "Actinia tenebrosa"
#> [1309] "Neptunus trituberculatus"
#> [1310] "Portunus (Portunus)_trituberculatus"
#> [1311] "Portunus trituberculatus"
#> [1312] "Lacerta vivipara"
#> [1313] "Zootoca vivipara"
#> [1314] "Propithecus coquereli"
#> [1315] "Propithecus verreauxi_coquereli"
#> [1316] "Erinaceus europaeus"
#> [1317] "Jatropha curcas"
#> [1318] "Caenorhabditis briggsae"
#> [1319] "Rhabditis briggsae"
#> [1320] "Cherax quadricarinatus"
#> [1321] "Homalodisca coagulata"
#> [1322] "Homalodisca vitripennis"
#> [1323] "Tettigonia coagulata"
#> [1324] "Tettigonia vitripennis"
#> [1325] "Anolis carolinensis"
#> [1326] "Python bivittatus"
#> [1327] "Python molurus_bivittatus"
#> [1328] "Chrysemys scripta_elegans"
#> [1329] "Emys elegans"
#> [1330] "Pseudemys scripta_elegans"
#> [1331] "Trachemys scripta_elegans"
#> [1332] "Protobothrops mucrosquamatus"
#> [1333] "Trigonocephalus mucrosquamatus"
#> [1334] "Trimeresurus mucrosquamatus"
#> [1335] "Daphnia pulex"
#> [1336] "Paramacrobiotus metropolitanus"
#> [1337] "Lipotes vexillifer"
#> [1338] "Petromyzon marinus"
#> [1339] "Poephila guttata"
#> [1340] "Taeniopygia guttata"
#> [1341] "Taenopygia guttata"
#> [1342] "Aplysia californica"
#> [1343] "Phalaenopsis equestris"
#> [1344] "Stauroglottis equestris"
#> [1345] "Balanoglossus kowalevskii"
#> [1346] "Saccoglossus kowalevskii"
#> [1347] "Saccoglossus kowalevskyi"
#> [1348] "Numida meleagris"
#> [1349] "Phasianus meleagris"
#> [1350] "Momordica charantia"
#> [1351] "Callorhinchus milii"
#> [1352] "Sphaerodactylus townsendi"
#> [1353] "Eutainia elegans"
#> [1354] "Thamnophis elegans"
#> [1355] "Corvus hawaiiensis"
#> [1356] "Manacus candei"
#> [1357] "Pipra candei"
#> [1358] "Euleptes europaea"
#> [1359] "Euleptes europea"
#> [1360] "Phyllodactylus europaea"
#> [1361] "Phyllodactylus europaeus"
#> [1362] "Ptyodactylus caudivolvolus"
#> [1363] "Lepisosteus oculatus"
#> [1364] "Altirana parkeri"
#> [1365] "Nanorana parkeri"
#> [1366] "Ahaetulla prasina"
#> [1367] "Dryophis prasinus"
#> [1368] "Fusarium oxysporum_f._sp._lycopersici_4287"
#> [1369] "Heteropelma chrysocephalum"
#> [1370] "Neopelma chrysocephalum"
#> [1371] "Musca domestica"
#> [1372] "Pristis pectinata"
#> [1373] "Ischnura elegans"
#> [1374] "Fringilla chalybeata"
#> [1375] "Vidua chalybeata"
#> [1376] "Coturnix coturnix_japanica"
#> [1377] "Coturnix coturnix_japonica"
#> [1378] "Coturnix coturnix_Japonicus"
#> [1379] "Coturnix japonica_japonica"
#> [1380] "Coturnix japonica"
#> [1381] "Gekko japonicus"
#> [1382] "Platydactylus japonicus"
#> [1383] "Nilaparvata lugens"
#> [1384] "Ardea americana"
#> [1385] "Grus americana"
#> [1386] "Grus americanus"
#> [1387] "Harpia harpyja"
#> [1388] "Vultur harpyja"
#> [1389] "Pipra filicauda"
#> [1390] "Herrania umbratica"
#> [1391] "Ilyonectria robusta"
#> [1392] "Cupidonia cupido_pallidicincta"
#> [1393] "Tympanuchus pallidicinctus"
#> [1394] "Topomyia yanbarensis"
#> [1395] "Parus atricapillus"
#> [1396] "Poecile atricapilla"
#> [1397] "Poecile atricapillus"
#> [1398] "Corapipo altera"
#> [1399] "Acyrthosiphon pisum"
#> [1400] "Acyrthosiphum pisum"
#> [1401] "Varanus komodoensis"
#> [1402] "Saprolegnia parasitica_CBS_223.65"
#> [1403] "Fringilla macroura"
#> [1404] "Vidua macroura"
#> [1405] "Carica papaya"
#> [1406] "Chiroxiphia lanceolata"
#> [1407] "Pipra lanceolata"
#> [1408] "Octopus bimaculoides"
#> [1409] "Lagopus muta"
#> [1410] "Tetrao mutus"
#> [1411] "Bradysia coprophila"
#> [1412] "Sciara coprophila"
#> [1413] "Coluber sirtalis"
#> [1414] "Thamnophis sirtalis"
#> [1415] "Falco peregrinus"
#> [1416] "Falco cherrug"
#> [1417] "Asteracanthion distichum"
#> [1418] "Asterias attenuata"
#> [1419] "Asterias clathrata"
#> [1420] "Asterias disticha"
#> [1421] "Asterias gigantea"
#> [1422] "Asterias pallida"
#> [1423] "Asterias rubens"
#> [1424] "Asterias stimpsoni"
#> [1425] "Asterias vulgaris"
#> [1426] "Manduca sexta"
#> [1427] "Sphinx sexta"
#> [1428] "Condylura cristata"
#> [1429] "Sorex cristatus"
#> [1430] "Cuculus canorus"
#> [1431] "Pezoporus wallicus"
#> [1432] "Aedes aegypti"
#> [1433] "Aedes (Stegomyia)_aegypti"
#> [1434] "Culex aegypti"
#> [1435] "Stegomyia aegypti"
#> [1436] "Falco naumanni"
#> [1437] "Corvus kubaryi"
#> [1438] "Necator americanus"
#> [1439] "Larus tridactylus"
#> [1440] "Rissa tridactyla"
#> [1441] "Aphanomyces astaci"
#> [1442] "Culex (Culex)_pipiens_pallens"
#> [1443] "Culex pipiens_pallens"
#> [1444] "Catharus ustulatus"
#> [1445] "Turdus ustulatus"
#> [1446] "Accipiter gentilis"
#> [1447] "Accipiter gentillis"
#> [1448] "Falco gentilis"
#> [1449] "Crocodylus porosus"
#> [1450] "Amborella trichopoda"
#> [1451] "Falco biarmicus"
#> [1452] "Lagopus leucura"
#> [1453] "Lagopus leucurus"
#> [1454] "Tetrao leucurus"
#> [1455] "Falco rusticolus"
#> [1456] "Phasianus colchicus"
#> [1457] "Corvus brachyrhynchos"
#> [1458] "Uloborus diversus"
#> [1459] "Phytophthora infestans_strain_T30-4"
#> [1460] "Phytophthora infestans_T30-4"
#> [1461] "Empidonax traillii"
#> [1462] "Muscicapa traillii"
#> [1463] "Strix alba"
#> [1464] "Tyto alba"
#> [1465] "Parus major"
#> [1466] "Lepeophtheirus salmonis"
#> [1467] "Gavialis gangeticus"
#> [1468] "Lacerta gangetica"
#> [1469] "Daphnia carinata"
#> [1470] "Aphis gossypii"
#> [1471] "Ampithoe aztecus"
#> [1472] "Hyalella azteca"
#> [1473] "Hyalella knickerbockeri"
#> [1474] "Colletotrichum lupini"
#> [1475] "Gloeosporium lupini"
#> [1476] "Sphaeroforma arctica_JP610"
#> [1477] "Suillus fuscotomentosus"
#> [1478] "Mollisia scopiformis"
#> [1479] "Phialocephala scopiformis"
#> [1480] "Muscicapa cayanensis"
#> [1481] "Myiozetetes cayanensis"
#> [1482] "Hyaloscypha bicolor_E"
#> [1483] "Melopsittacus undulatus"
#> [1484] "Psittacus undulatus"
#> [1485] "Fringilla montana"
#> [1486] "Passer montanus"
#> [1487] "Coccinella axyridis"
#> [1488] "Harmonia axyridis"
#> [1489] "Aimophila crissalis"
#> [1490] "Kieneria crissalis"
#> [1491] "Kieneria crissalis_(Vigors,_1839)"
#> [1492] "Melozone crissalis"
#> [1493] "Pipilo crissalis"
#> [1494] "Pipilo fuscus_crissalis"
#> [1495] "Stomoxis calcitrans"
#> [1496] "Stomoxys calcitrans"
#> [1497] "Anas atrata"
#> [1498] "Cygnus atratus"
#> [1499] "Culex fatigans"
#> [1500] "Culex pipiens_fatigans"
#> [1501] "Culex pipiens_quinquefasciatus"
#> [1502] "Culex quinquefasciatus"
#> [1503] "Hirundo rustica"
#> [1504] "Acanthaster planci"
#> [1505] "Asterias planci"
#> [1506] "Molothrus ater"
#> [1507] "Oriolus ater"
#> [1508] "Laccaria bicolor_S238N-H82"
#> [1509] "Anastrepha obliqua"
#> [1510] "Tephritis obliqua"
#> [1511] "Grapholitha glycinivorella"
#> [1512] "Leguminivora glycinivorella"
#> [1513] "Ammodromus caudacutus_nelsoni"
#> [1514] "Ammospiza nelsoni"
#> [1515] "Nylanderia fulva"
#> [1516] "Paratrechina fulva"
#> [1517] "Agelaius phoeniceus"
#> [1518] "Agelaius phoniceus"
#> [1519] "Oriolus phoeniceus"
#> [1520] "Colletotrichum fructicola"
#> [1521] "Colletotrichum ignotum"
#> [1522] "Euthrips occidentalis"
#> [1523] "Frankliniella brunnescens"
#> [1524] "Frankliniella californica"
#> [1525] "Frankliniella occidentalis_brunnescens"
#> [1526] "Frankliniella occidentalis"
#> [1527] "Motacilla alba_alba"
#> [1528] "Fusarium solani"
#> [1529] "Fusisporium solani"
#> [1530] "Neocosmospora solani"
#> [1531] "Sitophilus oryzae"
#> [1532] "Corvus cornix_cornix"
#> [1533] "Fringilla canaria_Linnaeus,_1758"
#> [1534] "Serinus canaria"
#> [1535] "Serinus canarius"
#> [1536] "Drosophila subpulchrella"
#> [1537] "Chlamydomonas reinhardtii"
#> [1538] "Chlamydomonas smithii"
#> [1539] "Puccinia striiformis_f._sp._tritici"
#> [1540] "Bactrocera cucurbitae"
#> [1541] "Bactrocera (Zeugodacus)_cucurbitae"
#> [1542] "Zeugodacus cucurbitae"
#> [1543] "Zeugodacus (Zeugodacus)_cucurbitae"
#> [1544] "Antrodia serialis"
#> [1545] "Fomitopsis serialis"
#> [1546] "Neoantrodia serialis"
#> [1547] "Polyporus serialis"
#> [1548] "Drosophila suzukii"
#> [1549] "Leucophenga suzukii"
#> [1550] "Aedes smithii"
#> [1551] "Wyeomyia smithii"
#> [1552] "Montifringilla ruficollis"
#> [1553] "Pyrgilauda ruficollis"
#> [1554] "Gymnogyps californianus"
#> [1555] "Vultur californianus"
#> [1556] "Bactrocera (Bactrocera)_dorsalis"
#> [1557] "Bactrocera (Bactrocera)_invadens"
#> [1558] "Bactrocera dorsalis"
#> [1559] "Bactrocera invadens"
#> [1560] "Bactrocera papayae"
#> [1561] "Bactrocera philippinensis"
#> [1562] "Dacus dorsalis"
#> [1563] "Trichoplusia ni"
#> [1564] "Leptothorax curvispinosus"
#> [1565] "Temnothorax curvispinosus"
#> [1566] "Saprolegnia declina_VS20"
#> [1567] "Saprolegnia diclina_VS20"
#> [1568] "Fringilla albicollis"
#> [1569] "Zonotrichia albicollis"
#> [1570] "Bactrocera neohumeralis"
#> [1571] "Dacus tryoni_var._neohumeralis"
#> [1572] "Sphaeria pertusa"
#> [1573] "Trematosphaeria pertusa"
#> [1574] "Fusarium oxysporum_var._redolens"
#> [1575] "Fusarium redolens"
#> [1576] "Anastrepha ludens"
#> [1577] "Trypeta ludens"
#> [1578] "Cantharellus anzutake"
#> [1579] "Malurus melanocephalus"
#> [1580] "Muscicapa melanocephala"
#> [1581] "Melitaea cinxia"
#> [1582] "Papilio cinxia"
#> [1583] "Maniola jurtina"
#> [1584] "Papilio jurtina"
#> [1585] "Anas fuligula"
#> [1586] "Aythya fuligula"
#> [1587] "Bombyx mori"
#> [1588] "Phalaena mori"
#> [1589] "Botys furnacalis"
#> [1590] "Ostrinia furnacalis"
#> [1591] "Priapula caudata"
#> [1592] "Priapulus caudatus"
#> [1593] "Apanteles glomeratus"
#> [1594] "Cotesia glomerata"
#> [1595] "Ichneumon glomeratus"
#> [1596] "Centrocercus urophasianus"
#> [1597] "Centrocerus urophasianus"
#> [1598] "Tetrao urophasianus"
#> [1599] "Montifringilla taczanowskii_(Przewalski,_1876)"
#> [1600] "Onychostruthus taczanowskii"
#> [1601] "Monomorium pharaonis"
#> [1602] "Daktulosphaira vitifoliae"
#> [1603] "Pemphigus vitifoliae"
#> [1604] "Viteus vitifoliae"
#> [1605] "Helicoverpa armigera"
#> [1606] "Heliothis armigera"
#> [1607] "Heliothis (Helicoverpa)_armigera"
#> [1608] "Noctua armigera"
#> [1609] "Drosophila biarmipes"
#> [1610] "Myzus (Nectarosiphon)_persicae"
#> [1611] "Myzus persicae"
#> [1612] "Lucilia sericata"
#> [1613] "Phaenicia sericata"
#> [1614] "Tinamus guttatus"
#> [1615] "Solenopsis invicta"
#> [1616] "Fringilla georgiana"
#> [1617] "Melospiza georgiana"
#> [1618] "Helicoverpa zea"
#> [1619] "Heliothis zea"
#> [1620] "Phalaena zea"
#> [1621] "Drosophila ananassae"
#> [1622] "Drosophila annanassae"
#> [1623] "Fusarium odoratissimum_NRRL_54006"
#> [1624] "Coccinella 7-punctata"
#> [1625] "Coccinella septempunctata"
#> [1626] "Spodoptera frugiperda"
#> [1627] "Tigriopus californicus"
#> [1628] "Tisbe californica"
#> [1629] "Ficedula albicollis"
#> [1630] "Muscicapa albicollis"
#> [1631] "Drosophila pseudoobscura"
#> [1632] "Mytilidion resinicola"
#> [1633] "Mytilinidion resinicola"
#> [1634] "Halyomorpha halys"
#> [1635] "Phycomyces blakesleeanus_NRRL_1555(-)"
#> [1636] "Drosophila willistoni"
#> [1637] "Monoraphidium neglectum"
#> [1638] "Sturnus vulgaris"
#> [1639] "Bactrocera tryoni"
#> [1640] "Tephritis tryoni"
#> [1641] "Apus apus"
#> [1642] "Hirundo apus"
#> [1643] "Suillus paluster"
#> [1644] "Naegleria gruberi"
#> [1645] "Suillus discolor"
#> [1646] "Suillus tomentosus_var._discolor"
#> [1647] "Trichina spiralis"
#> [1648] "Trichinella spiralis"
#> [1649] "Onthophagus taurus"
#> [1650] "Epistrophe balteatus"
#> [1651] "Episyrphus balteatus"
#> [1652] "Episyrphus (Episyrphus)_balteatus"
#> [1653] "Musca balteata"
#> [1654] "Leptinotarsa decemlineata"
#> [1655] "Leptinotarsa decimlineata"
#> [1656] "Stilodes decemlineata"
#> [1657] "Boletus plorans"
#> [1658] "Suillus plorans"
#> [1659] "Dryobates pubescens"
#> [1660] "Picoides pubescens_(Linnaeus,_1766)"
#> [1661] "Picoides pubescens"
#> [1662] "Fusarium proliferatum_ET1"
#> [1663] "Fusarium oxysporum_Fo47"
#> [1664] "Drosophila sechellia"
#> [1665] "Schizophyllum commune_H4-8"
#> [1666] "Depressaria gossypiella"
#> [1667] "Pectinophora gossypiella"
#> [1668] "Parus humilis"
#> [1669] "Podoces humilis"
#> [1670] "Pseudopoces humilis"
#> [1671] "Pseudopodoces humilis"
#> [1672] "Ascidia intestinalis"
#> [1673] "Ciona intestinalis"
#> [1674] "Distomum viverrini"
#> [1675] "Opisthorchis viverrini"
#> [1676] "Puccinia graminis_f._sp._tritici_CRL_75-36-700-3"
#> [1677] "Plutella xylostella"
#> [1678] "Melampsora larici-populina_98AG31"
#> [1679] "Drosophila obscura"
#> [1680] "Fusarium verticillioides_7600"
#> [1681] "Anoplophora glabripennis"
#> [1682] "Anoplophora nobilis"
#> [1683] "Cerosterna glabripennis"
#> [1684] "Melanauster nobilis"
#> [1685] "Calypte anna"
#> [1686] "Ornismya anna"
#> [1687] "Microdochium trichocladiopsis"
#> [1688] "Anopheles merus"
#> [1689] "Bactrocera (Daculus)_oleae"
#> [1690] "Bactrocera (Dacus)_oleae"
#> [1691] "Bactrocera oleae"
#> [1692] "Dacus oleae"
#> [1693] "Musca oleae"
#> [1694] "Fusarium mangiferae"
#> [1695] "Drosophila yakuba"
#> [1696] "Contarinia nasturtii"
#> [1697] "Parastagonospora nodorum_SN15"
#> [1698] "Drosophila virilis"
#> [1699] "Zasmidium cellare_ATCC_36951"
#> [1700] "Drosophila mauritiana"
#> [1701] "Geospiza fortis"
#> [1702] "Eupeodes corollae"
#> [1703] "Eupeodes (Eupeodes)_corollae"
#> [1704] "Metasyrphus corollae"
#> [1705] "Spodoptera litura"
#> [1706] "Sitodiplosis mosellana"
#> [1707] "Microgaster mediator"
#> [1708] "Microplitis medianus"
#> [1709] "Microplitis mediator"
#> [1710] "Drosophila kikkawai"
#> [1711] "Diaporthe citri"
#> [1712] "Phomopsis citri"
#> [1713] "Mesites unicolor"
#> [1714] "Mesitornis unicolor"
#> [1715] "Suillus subaureus"
#> [1716] "Colletotrichum capsici"
#> [1717] "Colletotrichum dematium_f._truncatum_(Schwein.)_Arx,_1957"
#> [1718] "Colletotrichum truncatum"
#> [1719] "Glomerella glycines"
#> [1720] "Vermicularia capsici"
#> [1721] "Vermicularia truncata"
#> [1722] "Drosophila simulans"
#> [1723] "Anisochrysa carnea"
#> [1724] "Chrysopa carnea"
#> [1725] "Chrysoperla carnea"
#> [1726] "Drosophila takahashii"
#> [1727] "Lucilia cuprina"
#> [1728] "Drosophila persimilis"
#> [1729] "Falco albicilla"
#> [1730] "Haliaeetus albicilla"
#> [1731] "Antrostomus carolinensis"
#> [1732] "Caprimulgus carolinensis"
#> [1733] "Nasonia vitripennis"
#> [1734] "Colias crocea"
#> [1735] "Colias croceus"
#> [1736] "Papilio croceus"
#> [1737] "Leptidea sinapis"
#> [1738] "Papilio sinapis"
#> [1739] "Anopheles arabiensis"
#> [1740] "Drosophila ficusphila"
#> [1741] "Vollenhovia emeryi"
#> [1742] "Hermetia illucens"
#> [1743] "Fusarium vanettenii_77-13-4"
#> [1744] "Nectria haematococca_mpVI_77-13-4"
#> [1745] "Thrips palmi"
#> [1746] "Falco leucocephalus"
#> [1747] "Haliaeetus leucocephalus"
#> [1748] "Malaya genurostris"
#> [1749] "Colletotrichum gloeosporioides_(Penz.)_Penz._&_Sacc.,_1884"
#> [1750] "Colletotrichum gloeosporioides"
#> [1751] "Glomerella cingulata"
#> [1752] "Glomerella rufomaculans-vaccinii"
#> [1753] "Gnomoniopsis cingulata"
#> [1754] "Vermicularia gloeosporioides"
#> [1755] "Acanthamoeba castellanii_Neff_strain"
#> [1756] "Acanthamoeba castellanii_strain_Neff"
#> [1757] "Acanthamoeba castellanii_str._Neff"
#> [1758] "Drosophila albomicans"
#> [1759] "Drosophila nasuta_albomicans"
#> [1760] "Spinulophila albomicans"
#> [1761] "Diaporthe amygdali"
#> [1762] "Fusicoccum amygdali"
#> [1763] "Phomopsis amygdali"
#> [1764] "Pelecanus crispus"
#> [1765] "Pelecanus philippensis_crispus"
#> [1766] "Drosophila rhopaloa"
#> [1767] "Aphantopus hyperantus"
#> [1768] "Maniola hyperantus"
#> [1769] "Papilio hyperantus"
#> [1770] "Drosophila serrata"
#> [1771] "Leptopilina heterotoma"
#> [1772] "Peronospora halstedii"
#> [1773] "Plasmopara halstedii"
#> [1774] "Cuculus discolor"
#> [1775] "Leptosomus discolor"
#> [1776] "Aphanomyces invadans"
#> [1777] "Drosophila santomea"
#> [1778] "Sipha flava"
#> [1779] "Drosophila teissieri"
#> [1780] "Aptenodytes forsteri"
#> [1781] "Phaethon lepturus"
#> [1782] "Drosophila bipectinata"
#> [1783] "Fulmaris glacialis"
#> [1784] "Fulmarus glacialis"
#> [1785] "Procellaria glacialis"
#> [1786] "Ardea garzetta"
#> [1787] "Egretta garzetta"
#> [1788] "Anopheles mysorensis"
#> [1789] "Anopheles stephensi_mysorensis"
#> [1790] "Anopheles stephensi"
#> [1791] "Anopheles stephensi_var._mysorensis"
#> [1792] "Neocellia intermedia_Rothwell,_1907"
#> [1793] "Neocellia intermedia"
#> [1794] "Cryptotermes secundus"
#> [1795] "Pestalotiopsis fici_W106-1"
#> [1796] "Aricia agestis"
#> [1797] "Papilio agestis"
#> [1798] "Polyommatus agestis"
#> [1799] "Artogeia napi"
#> [1800] "Papilio napi"
#> [1801] "Pieris napi"
#> [1802] "Drosophila eugracilis"
#> [1803] "Wasmannia auropunctata"
#> [1804] "Oppia nitens"
#> [1805] "Adelges cooleyi"
#> [1806] "Chermes cooleyi"
#> [1807] "Gilletteella cooleyi"
#> [1808] "Acanthisitta chloris"
#> [1809] "Sitta chloris"
#> [1810] "Agrilus feretrius"
#> [1811] "Agrilus marcopoli"
#> [1812] "Agrilus planipennis"
#> [1813] "Drosophila elegans"
#> [1814] "Hyposmocoma kahamanoa"
#> [1815] "Cariama cristata"
#> [1816] "Palamedea cristata"
#> [1817] "Aleurodes tabaci"
#> [1818] "Aleyrodes tabaci"
#> [1819] "Bemisia tabaci"
#> [1820] "Ibis nippon"
#> [1821] "Nipponia nippon"
#> [1822] "Balearica gibbericeps"
#> [1823] "Balearica pavonina_gibbericeps"
#> [1824] "Balearica regulorum_gibbericepse"
#> [1825] "Balearica regulorum_gibbericeps"
#> [1826] "Bombus affinis"
#> [1827] "Varroa jacobsoni"
#> [1828] "Drosophila gunungcola"
#> [1829] "Colius striatus"
#> [1830] "Tauraco erythrolophus"
#> [1831] "Colletotrichum aenigma"
#> [1832] "Colletotrichum communis"
#> [1833] "Colletotrichum dianesei"
#> [1834] "Colletotrichum endomangiferae"
#> [1835] "Colletotrichum hymenocallidis"
#> [1836] "Colletotrichum jasmini-sambac"
#> [1837] "Colletotrichum melanocaulon"
#> [1838] "Colletotrichum siamense"
#> [1839] "Pterocles gutturalis"
#> [1840] "Aethina tumida"
#> [1841] "Galleria mellonella"
#> [1842] "Phalaena mellonella"
#> [1843] "Bicyclus anynana"
#> [1844] "Mycalesis anynana"
#> [1845] "Leptopilina boulardi"
#> [1846] "Zootermopsis nevadensis"
#> [1847] "Achroia grisella"
#> [1848] "Tinea grisella"
#> [1849] "Acremonium falciforme"
#> [1850] "Cephalosporium falciforme"
#> [1851] "Fusarium falciforme"
#> [1852] "Neocosmospora falciformis"
#> [1853] "Drosophila mohavensis"
#> [1854] "Drosophila mojavensis"
#> [1855] "Drosophila innubila"
#> [1856] "Bombus huntii"
#> [1857] "Cuculus indicator"
#> [1858] "Indicator indicator"
#> [1859] "Dendroctonus ponderosae"
#> [1860] "Ardea helias"
#> [1861] "Eurypyga helias"
#> [1862] "Dracunculus loa"
#> [1863] "Loa loa"
#> [1864] "Cadophora gregata"
#> [1865] "Cephalosporium gregatum"
#> [1866] "Phialophora gregata"
#> [1867] "Nestor notabilis_notabilis"
#> [1868] "Nestor notabilis"
#> [1869] "Colymbus stellatus"
#> [1870] "Gavia stellata"
#> [1871] "Cynthia cardui"
#> [1872] "Papilio cardui"
#> [1873] "Vanessa cardui"
#> [1874] "Cladosporium fulvum"
#> [1875] "Fulvia fulva"
#> [1876] "Mycovellosiella fulva"
#> [1877] "Passalora fulva"
#> [1878] "Plodia interpunctella"
#> [1879] "Tinea interpunctella"
#> [1880] "Stilbospora angustata"
#> [1881] "Truncatella angustata"
#> [1882] "Truncatella truncata"
#> [1883] "Chlamydotis macqueenii_macqueenii"
#> [1884] "Chlamydotis macqueenii_macqueeni"
#> [1885] "Chlamydotis macqueenii"
#> [1886] "Chlamydotis undulata_macqueenii"
#> [1887] "Otis macqueenii"
#> [1888] "Anopheles funestus"
#> [1889] "Fusarium fujikuroi_IMI_58289"
#> [1890] "Cephalosporium keratoplasticum_(nom._inval.)"
#> [1891] "Fusarium keratoplasticum"
#> [1892] "Neocosmospora keratoplastica"
#> [1893] "Drosophila lebanonensis"
#> [1894] "Scaptodrosophila lebanonensis"
#> [1895] "Merops nubicus"
#> [1896] "Coniothyrium fuckelii_var._sporulosum"
#> [1897] "Coniothyrium sporulosum"
#> [1898] "Paraconiothyrium sporulosum"
#> [1899] "Paraconyotrichium sporulosum"
#> [1900] "Paraphaeosphaeria sporulosa"
#> [1901] "Fusarium venenatum"
#> [1902] "Fusarium venetum"
#> [1903] "Amyelois transitella"
#> [1904] "Nephopteryx transitella"
#> [1905] "Pelecanus carbo"
#> [1906] "Phalacrocorax carbo"
#> [1907] "Naegleria lovaniensis"
#> [1908] "Papilio machaon"
#> [1909] "Gaeumannomyces tritici_R3-111a-1"
#> [1910] "Papilio aegeria"
#> [1911] "Pararge aegeria"
#> [1912] "Lophyrus lecontei"
#> [1913] "Neodiprion lecontei"
#> [1914] "Sclerotinia sclerotiorum_1980_UF-70"
#> [1915] "Aegialitis vocifera"
#> [1916] "Charadrius vociferous"
#> [1917] "Charadrius vociferus"
#> [1918] "Oxyechus vociferus"
#> [1919] "Drosophila erecta"
For example, to specify Arabidopsis thaliana:
buildRef(
reference_path = ref_path,
fasta = "genome.fa", gtf = "transcripts.gtf",
genome_type = "",
ontologySpecies = "Arabidopsis thaliana"
)
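Because the list of supported species is long, it can be easier to search it than to read it. The following is a minimal base R sketch, assuming the names are held in a character vector. The `species` vector here is a hypothetical stand-in that reproduces only a few entries from the listing above, and the sketch does not call any SpliceWiz function.

```r
# Hypothetical stand-in for the full vector of supported species names;
# only a handful of entries from the listing above are reproduced here.
species <- c(
    "Drosophila simulans", "Anopheles stephensi",
    "Bombus affinis", "Drosophila erecta", "Loa loa"
)

# Case-insensitive search for a genus or species of interest
hits <- grep("drosophila", species, ignore.case = TRUE, value = TRUE)
print(hits)

# Check whether a name appears verbatim in the list
"Loa loa" %in% species
```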
To use STAR to align FASTQ files, one must be using a system with STAR installed. This software is not available in Windows. To check if STAR is available:
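Independently of whatever check the package itself offers, the presence of the STAR executable can also be tested from base R. This is a minimal sketch that assumes the executable is named `STAR` and sits on the system PATH; the variable names are illustrative only.

```r
# Look up the STAR executable on the system PATH (base R only)
star_path <- unname(Sys.which("STAR"))
star_available <- nzchar(star_path)

if (star_available) {
    # Ask STAR to report its own version
    star_version <- system2(star_path, "--version", stdout = TRUE)
    message("STAR found at ", star_path,
        " (version ", paste(star_version, collapse = " "), ")")
} else {
    message("STAR was not found on the PATH")
}
```

If `star_available` is `FALSE`, the STAR-dependent steps in this section cannot be run on the current system.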
ref_path = "./Reference"
# Ensure genome resources are prepared from genome FASTA and GTF file:
if(!dir.exists(file.path(ref_path, "resource"))) {
getResources(
reference_path = ref_path,
fasta = "genome.fa",
gtf = "transcripts.gtf"
)
}
# Generate a STAR genome reference:
STAR_buildRef(
reference_path = ref_path,
n_threads = 8
)
Note that, by default, STAR_buildRef will store the STAR genome reference in the STAR subdirectory within reference_path. To override this setting, set the STAR_ref_path parameter to a directory path of your choice, e.g.:
STAR_buildRef(
reference_path = ref_path,
STAR_ref_path = "/path/to/another/directory",
n_threads = 8
)
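Sometimes, one might wish to build a STAR genome reference without first specifying the gene annotations. One reason to do this is to build a single STAR reference per species, which can later be adapted to different projects (for example, different read lengths or spike-ins).
We can use STAR_buildGenome to do this: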
# Generate a STAR genome reference:
STAR_buildGenome(
reference_path = ref_path,
STAR_ref_path = "/path/to/hg38",
n_threads = 8
)
This STAR reference is derived from the genome FASTA file but not the gene annotation GTF file. Prior to alignment, the remaining inputs need to be supplied, a step that should take about 5 minutes. These include the gene annotations (taken from the SpliceWiz reference given by the reference_path parameter) and, as in the example below, sjdbOverhang and extraFASTA.
To generate an on-the-fly (i.e., alignment-ready) STAR reference from a genome-derived reference:
STAR_new_ref <- STAR_loadGenomeGTF(
reference_path = ref_path,
STAR_ref_path = "/path/to/hg38",
STARgenome_output = file.path(tempdir(), "STAR"),
n_threads = 8,
sjdbOverhang = 100,
extraFASTA = "./ercc.fasta"
)
The path to the on-the-fly reference is specified by the return value (STAR_new_ref in the above example).
As already explained, this step allows a single STAR reference to be built for each species, which can be adapted for different projects based on their technical specifications (e.g. a different read length can be accommodated by setting a different sjdbOverhang, and spike-ins can be added by supplying the spike-in FASTA via extraFASTA).
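As a rough guide for choosing sjdbOverhang per project, the STAR manual suggests using the read (or mate) length minus 1, and notes that the default of 100 works well in most cases. The sketch below applies that rule; the `read_lengths` vector is a hypothetical set of projects and is not taken from this vignette.

```r
# STAR's manual suggests sjdbOverhang = (read or mate length) - 1.
# `read_lengths` is a hypothetical set of projects, for illustration only.
read_lengths <- c(projectA = 76, projectB = 101, projectC = 151)

# One sjdbOverhang value per project
sjdb_overhang <- read_lengths - 1
print(sjdb_overhang)
```

Each value could then be supplied as the sjdbOverhang argument when generating the on-the-fly reference for the corresponding project.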
Genomes contain regions of low mappability (i.e. areas to which reads or fragments are difficult to align). Repeat sequences are a common cause of low mappability. IRFinder uses an empirical method to determine regions of low mappability, which we adopted in SpliceWiz. These resources are used automatically when generating the SpliceWiz reference and setting genome_type to one of the supported genomes (hg38, hg19, mm10, mm9).
For other species, one may wish to generate their own annotations of low mappability regions using the STAR aligner.
The STAR_mappability wrapper function will use the STAR aligner to calculate regions of low mappability within the given genome.
STAR_mappability(
reference_path = ref_path,
STAR_ref_path = file.path(ref_path, "STAR"),
map_depth_threshold = 4,
n_threads = 8,
read_len = 70,
read_stride = 10,
error_pos = 35
)
In the above example, STAR_mappability() will use the given STAR reference (inside the STAR_ref_path directory), and the genome found within the reference_path SpliceWiz reference, to generate synthetic reads.
read_len specifies the length of these synthetic reads (default 70)
read_stride specifies the nucleotide distance between adjacent synthetic reads (default 10). These will be generated with alternate + / - strand
error_pos introduces a single nucleotide error at the specified position (default 35), which will generate an SNP at the center of the 70-nt synthetic read.
These synthetic reads will then be aligned back to the STAR genome to create a BAM file, which is later processed to measure the coverage depth of the genome by these synthetic reads.
Finally, regions with coverage depth of map_depth_threshold or below will be defined as regions of “low mappability”. In the above example, 70-nt reads with a 10-nt stride will produce synthetic reads such that each nucleotide is expected to have a coverage depth of 70 / 10 = 7 reads. A coverage depth of 4 or less equates to less than ~60% of the expected depth.
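The arithmetic above can be checked with a small base R simulation. This is only a sketch under stated assumptions, not the SpliceWiz or STAR implementation: the toy genome length and the "repeat" region in which reads fail to align are hypothetical, and real alignment is replaced by a simple rule.

```r
read_len <- 70
read_stride <- 10
map_depth_threshold <- 4
genome_len <- 1000  # hypothetical toy genome length

# Start positions of synthetic reads tiled across the toy genome
starts <- seq(1, genome_len - read_len + 1, by = read_stride)
# Reads alternate between the + and - strands
strands <- rep(c("+", "-"), length.out = length(starts))

# Assumption: reads starting inside a hypothetical repeat (401-500)
# fail to align uniquely and therefore contribute no coverage
aligned <- !(starts >= 401 & starts <= 500)

# Coverage depth at each nucleotide from the aligned reads
depth <- integer(genome_len)
for (s in starts[aligned]) {
    idx <- s:(s + read_len - 1)
    depth[idx] <- depth[idx] + 1L
}

expected_depth <- read_len / read_stride    # 70 / 10 = 7
low_mappability <- depth <= map_depth_threshold

map_depth_threshold / expected_depth        # 4 / 7, a little under 60%
range(which(low_mappability & seq_len(genome_len) > 100 &
    seq_len(genome_len) < 900))             # flagged interval near the repeat
```

Positions close to either end of the toy genome are also flagged, simply because fewer synthetic reads can overlap them; this edge effect is a property of the sketch.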
If STAR is available on the same computer or server where R/RStudio is being run, we can use the one-line function buildFullRef. This function will prepare the genome resources (getResources), build the STAR reference (STAR_buildRef), calculate regions of low mappability (STAR_mappability), and build the SpliceWiz reference (buildRef).
This step is recommended when one wishes to build a non-human/mouse genome in a single step, including generating low-mappability regions so that IR events with low mappability are excluded from measurement.
buildFullRef(
reference_path = ref_path,
fasta = "genome.fa", gtf = "transcripts.gtf",
genome_type = "",
use_STAR_mappability = TRUE,
n_threads = 8
)
n_threads specifies how many threads should be used to build the STAR reference and to calculate the low mappability regions.
If STAR is not available, Rsubread is available on Bioconductor for alignment and can be used to perform mappability calculations. The example code in the manual is displayed here for convenience, to demonstrate how this would be done:
require(Rsubread)
# (1a) Creates genome resource files
ref_path <- file.path(tempdir(), "Reference")
getResources(
reference_path = ref_path,
fasta = chrZ_genome(),
gtf = chrZ_gtf()
)
# (1b) Systematically generate reads based on the SpliceWiz example genome:
generateSyntheticReads(
reference_path = ref_path
)
# (2) Align the generated reads using Rsubread:
# (2a) Build the Rsubread genome index:
subreadIndexPath <- file.path(ref_path, "Rsubread")
if(!dir.exists(subreadIndexPath)) dir.create(subreadIndexPath)
Rsubread::buildindex(
basename = file.path(subreadIndexPath, "reference_index"),
reference = chrZ_genome()
)
# (2b) Align the synthetic reads using Rsubread::subjunc()
Rsubread::subjunc(
index = file.path(subreadIndexPath, "reference_index"),
readfile1 = file.path(ref_path, "Mappability", "Reads.fa"),
output_file = file.path(ref_path, "Mappability", "AlignedReads.bam"),
useAnnotation = TRUE,
annot.ext = chrZ_gtf(),
isGTF = TRUE
)
# (3) Analyse the aligned reads in the BAM file for low-mappability regions:
calculateMappability(
reference_path = ref_path,
aligned_bam = file.path(ref_path, "Mappability", "AlignedReads.bam")
)
# (4) Build the SpliceWiz reference using the calculated Mappability Exclusions
buildRef(ref_path)
Note that the default output file for calculateMappability() (step 3) is Mappability/MappabilityExclusion.bed.gz, found within the reference_path directory. buildRef() (step 4) will then automatically use this file, regardless of the genome_type parameter. The exception is if the MappabilityRef parameter is set to a different file.
This makes it convenient for users to generate their own human/mouse mappability files while still using the default non-polyA reference (for example, by setting genome_type = "hg38" in step 4).
First, remember to check that STAR is available via the command line. A single paired-end sample can then be aligned as follows:
STAR_alignReads(
fastq_1 = "sample1_1.fastq", fastq_2 = "sample1_2.fastq",
STAR_ref_path = file.path(ref_path, "STAR"),
BAM_output_path = "./bams/sample1",
n_threads = 8,
trim_adaptor = "AGATCGGAAG"
)
Note that by default, STAR_alignReads() will “trim” Illumina adapters (in fact they will be soft-clipped using STAR’s --clip3pAdapterSeq option). To disable this feature, set trim_adaptor = "" in the STAR_alignReads() function.
Experiment <- data.frame(
sample = c("sample_A", "sample_B"),
forward = file.path("raw_data", c("sample_A", "sample_B"),
c("sample_A_1.fastq", "sample_B_1.fastq")),
reverse = file.path("raw_data", c("sample_A", "sample_B"),
c("sample_A_2.fastq", "sample_B_2.fastq"))
)
STAR_alignExperiment(
Experiment = Experiment,
STAR_ref_path = file.path("Reference_FTP", "STAR"),
BAM_output_path = "./bams",
n_threads = 8,
two_pass = FALSE
)
To use two-pass mapping, set two_pass = TRUE. We recommend leaving this feature disabled, as one-pass mapping is adequate in typical use cases. Two-pass mapping is recommended if one expects a large number of novel splicing events or if the gene annotations (of transcript isoforms) are likely to be incomplete. Additionally, two-pass mapping is highly memory intensive and should be reserved for systems with high memory resources.
SpliceWiz can identify sequencing FASTQ files recursively from a given directory. It assumes that forward and reverse reads are suffixed as _1 and _2, respectively. Users can choose to identify such files using a specified file extension. For example, to recursively identify FASTQ files of the format {sample}_1.fq.gz and {sample}_2.fq.gz, use the following:
# Assuming sequencing files are named by their respective sample names
fastq_files <- findFASTQ(
sample_path = "./sequencing_files",
paired = TRUE,
fastq_suffix = ".fq.gz", level = 0
)
For gzipped FASTQ files, fastq_suffix should be ".fq.gz" or ".fastq.gz". For uncompressed FASTQ files, it should be ".fq" or ".fastq". Please check your files in order to correctly set this option.
findFASTQ() will return a 2- or 3-column data frame (depending on whether paired was set to FALSE or TRUE, respectively). The first column is the sample name (the file name if level = 0, or the parent directory name if level = 1). The subsequent columns are the paths of the forward and reverse reads.
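To make the naming rules concrete, here is a base R sketch of how such a table could be assembled. It is not the findFASTQ() implementation: the helper `pair_fastq()`, the column names `forward` and `reverse`, and the demo directory are assumptions made for illustration.

```r
# Base R sketch (not the findFASTQ() implementation): pair *_1 / *_2 files
# and derive sample names from the file name or the parent directory.
pair_fastq <- function(sample_path, fastq_suffix = ".fq.gz", level = 0) {
    files <- list.files(sample_path, recursive = TRUE, full.names = TRUE)
    fwd_tag <- paste0("_1", fastq_suffix)
    fwd <- files[endsWith(files, fwd_tag)]
    # Strip "_1<suffix>" to get the shared stem of each read pair
    stem <- substr(fwd, 1, nchar(fwd) - nchar(fwd_tag))
    rev_files <- paste0(stem, "_2", fastq_suffix)
    # Keep only forward files whose reverse partner exists
    keep <- rev_files %in% files
    sample <- if (level == 0) basename(stem) else basename(dirname(fwd))
    data.frame(
        sample = sample[keep], forward = fwd[keep], reverse = rev_files[keep],
        stringsAsFactors = FALSE
    )
}

# Demo on empty placeholder files in a temporary directory
demo_dir <- file.path(tempdir(), "sequencing_files_demo")
dir.create(file.path(demo_dir, "sample_A"), recursive = TRUE,
    showWarnings = FALSE)
invisible(file.create(
    file.path(demo_dir, "sample_A", c("run1_1.fq.gz", "run1_2.fq.gz"))
))

by_file <- pair_fastq(demo_dir, level = 0)  # sample named after the file
by_dir <- pair_fastq(demo_dir, level = 1)   # sample named after the directory
```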
The data.frame returned by the findFASTQ() function can be passed to the STAR_alignExperiment function. This will align all samples contained in the data.frame passed via the Experiment parameter.
STAR_alignExperiment(
Experiment = fastq_files,
STAR_ref_path = file.path("Reference_FTP", "STAR"),
BAM_output_path = "./bams",
n_threads = 8,
two_pass = FALSE
)
Note that, if a directory contains multiple forward and reverse FASTQ files, they will be aligned to the same BAM file. This can be done by setting level = 1 in the findFASTQ() function, resulting in multiple rows with the same sample name.
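Before launching a long alignment job, it may be worth checking which rows share a sample name. The sketch below uses a hypothetical table, `fastq_demo`, in the shape described above; the file paths are invented for illustration.

```r
# Hypothetical table: two runs of sample_A in one directory (as would be
# produced with level = 1) and a single run of sample_B.
fastq_demo <- data.frame(
    sample = c("sample_A", "sample_A", "sample_B"),
    forward = c("sample_A/run1_1.fq.gz", "sample_A/run2_1.fq.gz",
        "sample_B/run1_1.fq.gz"),
    reverse = c("sample_A/run1_2.fq.gz", "sample_A/run2_2.fq.gz",
        "sample_B/run1_2.fq.gz"),
    stringsAsFactors = FALSE
)

# Number of FASTQ pairs per sample name
runs_per_sample <- table(fastq_demo$sample)
print(runs_per_sample)

# Forward-read files grouped by the sample (and hence BAM) they belong to
forward_by_sample <- split(fastq_demo$forward, fastq_demo$sample)
print(forward_by_sample)
```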
To conveniently find all BAM files recursively in a given path:
This convenience function returns the putative sample names, either from the BAM file names themselves (level = 0), or from the names of their parent directories (level = 1).
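The two naming modes can be illustrated with a base R sketch that does not call SpliceWiz. The helper `find_bams_sketch()` and the placeholder file name are hypothetical; the column names `sample` and `path` are chosen to match the `bams$sample` and `bams$path` columns used in the calls that follow.

```r
# Base R sketch (not the SpliceWiz implementation): list BAM files
# recursively and derive a putative sample name for each.
find_bams_sketch <- function(sample_path, level = 0) {
    path <- list.files(sample_path, pattern = "\\.bam$",
        recursive = TRUE, full.names = TRUE)
    sample <- if (level == 0) {
        sub("\\.bam$", "", basename(path))  # name taken from the file
    } else {
        basename(dirname(path))             # name taken from the directory
    }
    data.frame(sample = sample, path = path, stringsAsFactors = FALSE)
}

# Demo on an empty placeholder file in a temporary directory
demo_dir <- file.path(tempdir(), "bams_demo")
dir.create(file.path(demo_dir, "sample_A"), recursive = TRUE,
    showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, "sample_A", "Aligned.out.bam")))

bams_by_file <- find_bams_sketch(demo_dir, level = 0)
bams_by_dir <- find_bams_sketch(demo_dir, level = 1)
```

When every BAM file carries the same generic file name, as in this demo, level = 1 is the more informative choice.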
First, ensure that a SpliceWiz reference has been generated using the buildRef() function. This reference should be passed to the reference_path parameter of the processBAM() function.
To run processBAM() using 4 OpenMP threads:
# assume SpliceWiz reference has been generated in `ref_path` using the
# `buildRef()` function.
processBAM(
bamfiles = bams$path,
sample_names = bams$sample,
reference_path = ref_path,
output_path = "./pb_output",
n_threads = 4,
useOpenMP = TRUE
)
Sometimes one may wish to create a COV file from a BAM file without running processBAM(). One reason might be that a SpliceWiz reference is not available.
To convert a list of BAM files, run BAM2COV(). This function is structurally similar to processBAM() but does not require the path to the SpliceWiz reference:
BAM2COV(
bamfiles = bams$path,
sample_names = bams$sample,
output_path = "./cov_output",
n_threads = 4,
useOpenMP = TRUE
)
Sometimes, users may wish to convert COV files to BigWig. One common reason may be to generate strand-specific coverage to compare with BigWig files on IGV.
For example, to generate a BigWig file containing reads on the negative strand:
se <- SpliceWiz_example_NxtSE()
cov_file <- covfile(se)[1]
cov_negstrand <- getCoverage(cov_file, strand = "-")
bw_file <- file.path(tempdir(), "sample_negstrand.bw")
rtracklayer::export(cov_negstrand, bw_file, "bw")
SpliceWiz processes BAM files using OpenMP-based parallelisation (multi-threading), using our ompBAM C++ library (available via the ompBAM Bioconductor package). The advantage of using this approach (instead of processing multiple BAM files each using a single thread) is that the latter approach uses a lot more memory. Our OpenMP-based approach processes BAM files one at a time, avoiding the memory cost when analysing multiple BAM files simultaneously.
Note that, by default, processBAM and BAM2COV will use OpenMP where available (which is natively supported on Windows and Linux). For MacOS, if OpenMP is not available, these functions will use BiocParallel’s MulticoreParam to process BAM files in parallel (1 BAM per thread). Beware that this may take a lot of RAM (typically 5-10 GB per BAM file)! We highly suggest installing OpenMP libraries on MacOS, as this will lower RAM usage.
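A back-of-the-envelope estimate helps when choosing the thread count for the non-OpenMP fallback. The sketch below simply multiplies the per-BAM range quoted above by the number of BAM files processed at once; `n_threads` is an example value, and actual usage will depend on the data.

```r
n_threads <- 4                        # BAM files processed at once (1 per thread)
gb_per_bam <- c(low = 5, high = 10)   # per-BAM range quoted above, in GB

# Rough total RAM needed when every thread holds one BAM file
ram_estimate_gb <- n_threads * gb_per_bam
print(ram_estimate_gb)
```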
Assuming the SpliceWiz reference is in ref_path, after running processBAM() as shown in the previous section, use the convenience function findSpliceWizOutput() to tabulate a list of samples and their corresponding processBAM() outputs:
This data.frame can be directly used to run collateData:
To enable novel splicing detection, set novelSplicing = TRUE. See the Quick-Start vignette for more details about the various parameters associated with novel splicing detection.
collateData(
Experiment = expr,
reference_path = ref_path,
output_path = "./NxtSE_output_novelSplicing",
novelSplicing = TRUE
)
Then, the collated data can be imported as a NxtSE object, which is an object that inherits from SummarizedExperiment and has specialized containers to hold additional data required by SpliceWiz.
Please refer to the SpliceWiz: Quick-Start vignette for worked examples using the example dataset.
sessionInfo()
#> R Under development (unstable) (2024-10-21 r87258)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/New_York
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] AnnotationHub_3.15.0 BiocFileCache_2.15.0 dbplyr_2.5.0
#> [4] BiocGenerics_0.53.0 SpliceWiz_1.9.0 NxtIRFdata_1.11.1
#>
#> loaded via a namespace (and not attached):
#> [1] RColorBrewer_1.1-3 jsonlite_1.8.9
#> [3] magrittr_2.0.3 farver_2.1.2
#> [5] rmarkdown_2.28 fs_1.6.4
#> [7] BiocIO_1.17.0 zlibbioc_1.53.0
#> [9] vctrs_0.6.5 memoise_2.0.1
#> [11] Rsamtools_2.23.0 DelayedMatrixStats_1.29.0
#> [13] RCurl_1.98-1.16 webshot_0.5.5
#> [15] progress_1.2.3 htmltools_0.5.8.1
#> [17] S4Arrays_1.7.0 curl_5.2.3
#> [19] Rhdf5lib_1.29.0 SparseArray_1.7.0
#> [21] rhdf5_2.51.0 shinyFiles_0.9.3
#> [23] sass_0.4.9 bslib_0.8.0
#> [25] htmlwidgets_1.6.4 plotly_4.10.4
#> [27] cachem_1.1.0 GenomicAlignments_1.43.0
#> [29] iterators_1.0.14 mime_0.12
#> [31] lifecycle_1.0.4 pkgconfig_2.0.3
#> [33] Matrix_1.7-1 R6_2.5.1
#> [35] fastmap_1.2.0 GenomeInfoDbData_1.2.13
#> [37] MatrixGenerics_1.19.0 shiny_1.9.1
#> [39] digest_0.6.37 colorspace_2.1-1
#> [41] ps_1.8.1 patchwork_1.3.0
#> [43] AnnotationDbi_1.69.0 S4Vectors_0.45.0
#> [45] GenomicRanges_1.59.0 RSQLite_2.3.7
#> [47] seriation_1.5.6 filelock_1.0.3
#> [49] fansi_1.0.6 httr_1.4.7
#> [51] abind_1.4-8 compiler_4.5.0
#> [53] withr_3.0.2 bit64_4.5.2
#> [55] BiocParallel_1.41.0 viridis_0.6.5
#> [57] DBI_1.2.3 heatmaply_1.5.0
#> [59] dendextend_1.18.1 HDF5Array_1.35.0
#> [61] R.utils_2.12.3 rappdirs_0.3.3
#> [63] DelayedArray_0.33.0 rjson_0.2.23
#> [65] chromote_0.3.1 tools_4.5.0
#> [67] httpuv_1.6.15 fst_0.9.8
#> [69] R.oo_1.26.0 glue_1.8.0
#> [71] restfulr_0.0.15 rhdf5filters_1.19.0
#> [73] promises_1.3.0 grid_4.5.0
#> [75] generics_0.1.3 gtable_0.3.6
#> [77] BSgenome_1.75.0 ca_0.71.1
#> [79] R.methodsS3_1.8.2 websocket_1.4.2
#> [81] tidyr_1.3.1 hms_1.1.3
#> [83] data.table_1.16.2 xml2_1.3.6
#> [85] utf8_1.2.4 XVector_0.47.0
#> [87] foreach_1.5.2 BiocVersion_3.21.1
#> [89] pillar_1.9.0 genefilter_1.89.0
#> [91] later_1.3.2 splines_4.5.0
#> [93] dplyr_1.1.4 lattice_0.22-6
#> [95] ompBAM_1.11.0 survival_3.7-0
#> [97] rtracklayer_1.67.0 bit_4.5.0
#> [99] annotate_1.85.0 tidyselect_1.2.1
#> [101] registry_0.5-1 Biostrings_2.75.0
#> [103] knitr_1.48 rhandsontable_0.3.8
#> [105] gridExtra_2.3 IRanges_2.41.0
#> [107] SummarizedExperiment_1.37.0 stats4_4.5.0
#> [109] xfun_0.48 shinydashboard_0.7.2
#> [111] Biobase_2.67.0 matrixStats_1.4.1
#> [113] pheatmap_1.0.12 DT_0.33
#> [115] stringi_1.8.4 UCSC.utils_1.3.0
#> [117] lazyeval_0.2.2 yaml_2.3.10
#> [119] shinyWidgets_0.8.7 evaluate_1.0.1
#> [121] codetools_0.2-20 tibble_3.2.1
#> [123] BiocManager_1.30.25 cli_3.6.3
#> [125] xtable_1.8-4 processx_3.8.4
#> [127] munsell_0.5.1 jquerylib_0.1.4
#> [129] Rcpp_1.0.13 GenomeInfoDb_1.43.0
#> [131] png_0.1-8 XML_3.99-0.17
#> [133] parallel_4.5.0 fstcore_0.9.18
#> [135] ggplot2_3.5.1 assertthat_0.2.1
#> [137] blob_1.2.4 prettyunits_1.2.0
#> [139] sparseMatrixStats_1.19.0 bitops_1.0-9
#> [141] viridisLite_0.4.2 scales_1.3.0
#> [143] purrr_1.0.2 crayon_1.5.3
#> [145] rlang_1.1.4 rvest_1.0.4
#> [147] TSP_1.2-4 KEGGREST_1.47.0