Available databases


The phylogenetic ranges covered by the individual, secondarily compiled databases are shown in the tree below.

 

Descriptions of the original sources of the above databases are given below.

Database 1. Human - RefSeq

Human peptides registered in RefSeq were downloaded from NCBI Protein on June 20, 2016.

Database 2. Human - Ensembl 84
All 'known' human peptide sequences were downloaded from the Ensembl FTP site.
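
As a minimal sketch of how a 'known' subset could be selected from a downloaded peptide FASTA file, the snippet below keeps only records whose header carries a status token. It assumes the status appears as a whitespace-delimited `pep:known` token in the header line; the exact header layout should be checked against the downloaded file. The identifiers and sequences in the example are hypothetical, and this is not necessarily the procedure used to build the databases described here.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_known(records: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Keep records whose header contains the token 'pep:known'.

    Assumption: the status is a whitespace-delimited token in the header.
    """
    for header, seq in records:
        if "pep:known" in header.split():
            yield header, seq


# Hypothetical two-record example (identifiers and sequences are made up).
example = """\
>ENSP_EXAMPLE_1 pep:known chromosome:GRCh38:1:100:200:1 gene:ENSG_EXAMPLE_1
MKTAYIAKQR
QISFVKSHFS
>ENSP_EXAMPLE_2 pep:novel chromosome:GRCh38:2:300:400:-1 gene:ENSG_EXAMPLE_2
MSSPLLAAAG
""".splitlines()

known = list(keep_known(read_fasta(example)))
for header, seq in known:
    print(header.split()[0], len(seq))
```

The same filter would apply unchanged to the non-human Ensembl files, provided their headers follow the same assumed layout.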

Database 3. Non-human eutherians - Ensembl 84
Peptide sequences for 'known' genes of all non-human eutherian species listed here were downloaded from the Ensembl FTP site.

Database 4. Non-eutherian mammals - Ensembl 84
Peptide sequences for 'known' genes of all non-eutherian mammals listed here (opossum, tammar wallaby, Tasmanian devil, and platypus) were downloaded from the Ensembl FTP site.

Database 5. Non-mammalian jawed vertebrates - Ensembl 84 and others
Peptide sequences for 'known' genes of all non-mammalian vertebrates listed here (chicken, turkey, zebra finch, anole lizard, soft-shelled turtle, Xenopus tropicalis, coelacanth, zebrafish, Atlantic cod, stickleback, fugu, Tetraodon, medaka, Nile tilapia, platyfish, Mexican cavefish, Amazon molly, and spotted gar) were downloaded from the Ensembl FTP site. In addition, peptides predicted in the genomes of the green sea turtle (Chelonia mydas) and the painted turtle (Chrysemys picta) were downloaded from NCBI Protein under their BioProject entries [ Chelonia | Chrysemys ]. The peptide sequence set for the Chinese alligator was downloaded from GigaDB.

Database 6. Cartilaginous fish (Chondrichthyes) and cyclostomes
Peptide sequences predicted in the whole genome assembly of the elephant shark (published only recently in Nature) were downloaded from here. Peptide sequences for the sea lamprey (Petromyzon marinus) predicted by the genome annotation consortium (Smith et al., 2013. Nat. Genet.) have also been newly included in this database, in addition to the peptides predicted by Ensembl. Peptides predicted on the publicly available genome assembly of Lethenteron camtschaticum (Japanese lamprey or Arctic lamprey) (LetJap1.0), released by the Venkatesh Lab, are derived from our own gene model inference and are available at our own laboratory web site.

Database 7. All vertebrate entries except mammals in NCBI Protein
Peptide sequences registered in NCBI Protein under the taxonomic group 'Craniata' (although this taxon name is no longer valid phylogenetically) [ link ], excluding those of mammals, were downloaded on June 20, 2016.
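
To illustrate the kind of taxon-restricted search described above, the sketch below builds an Entrez-style search term that includes one taxonomic group and excludes others. The exact query used for this download is not stated here, so the term format (quoted taxon names with the `[Organism]` field tag joined by `NOT`) is an assumption, and `taxon_query` is a hypothetical helper, not part of any NCBI library.

```python
from typing import Iterable


def taxon_query(include: str, exclude: Iterable[str] = ()) -> str:
    """Build a search term for one taxon, minus any excluded taxa.

    Assumption: taxa are given by name and matched with the [Organism] field tag.
    """
    term = f'"{include}"[Organism]'
    for name in exclude:
        term += f' NOT "{name}"[Organism]'
    return term


# Craniata entries without mammals, as described for this database.
term = taxon_query("Craniata", ["Mammalia"])
print(term)
```

The resulting string can be pasted into the NCBI Protein search box or passed to a programmatic search client. Where a taxon name is ambiguous, a taxonomy identifier may be a safer choice than a name.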

Database 8. Invertebrate deuterostomes (tunicates, echinoderms and amphioxus)
Peptide sequences of the species listed below were downloaded from the FTP site of Ensembl Genomes:

sea urchin (Strongylocentrotus purpuratus)

Peptide sequences based on the genome assemblies of Ciona intestinalis and Ciona savignyi were downloaded from the Ensembl FTP site. Sequences of Oikopleura dioica were downloaded from the web site of its genome project. A total of 28,623 peptide sequences based on the version 2 Branchiostoma floridae genome assembly by JGI were downloaded from NCBI Protein.

Database 9. Arthropods - EnsemblGenomes 31
Peptide sequences of the species listed below were downloaded from the FTP site of Ensembl Genomes:

waterflea Daphnia pulex
human louse Pediculus humanus
leaf cutter ant Atta cephalotes
red fire ant Solenopsis invicta
honeybee Apis mellifera
monarch butterfly Danaus plexippus
silkworm Bombyx mori
African malaria mosquito Anopheles gambiae
yellow fever mosquito Aedes aegypti
house mosquito Culex quinquefasciatus
12 Drosophila species including Drosophila melanogaster
postman butterfly Heliconius melpomene
small pteromalid parasitoid wasp Nasonia vitripennis
deer tick or blacklegged tick Ixodes scapularis
red flour beetle Tribolium castaneum
mountain pine beetle Dendroctonus ponderosae
pea aphid Acyrthosiphon pisum
Glanville fritillary Melitaea cinxia
termite Zootermopsis nevadensis
Antarctic midge Belgica antarctica
common eastern bumblebee Bombus impatiens
sea louse Lepeophtheirus salmonis
Australian sheep blowfly Lucilia cuprina
itch mite Sarcoptes scabiei
African social velvet spider Stegodyphus mimosarum

Database 10. Nematodes - EnsemblGenomes 31 and others
Peptide sequences of the species listed below were downloaded from the FTP site of Ensembl Genomes:

Trichinella spiralis
Caenorhabditis elegans
Caenorhabditis brenneri
Caenorhabditis briggsae
Caenorhabditis japonica
Caenorhabditis remanei
Brugia malayi
Pristionchus pacificus
Loa loa
Onchocerca volvulus
Strongyloides ratti

Predicted peptides for Meloidogyne incognita were downloaded from its genome project web page.

Database 11. Other protostomes - EnsemblGenomes 31 and others
Peptide sequences of the species listed below were downloaded from the FTP site of Ensembl Genomes:

Schistosoma mansoni
Capitella teleta
Helobdella robusta
Pacific oyster Crassostrea gigas
Octopus bimaculoides
Lingula anatina


Peptide sequences of predicted genes for other lophotrochozoan species (Schistosoma japonicum and the pearl oyster Pinctada fucata) were downloaded from their individual project web sites.

Database 12. Non-bilaterian metazoans (cnidarians, ctenophoran, placozoan & poriferan)
Peptide sequences of the species listed below were downloaded from the FTP site of Ensembl Genomes:

placozoan Trichoplax adhaerens
Nematostella vectensis
poriferan Amphimedon queenslandica
comb jellyfish Mnemiopsis leidyi
myxosporean Thelohanellus kitauei

Predicted peptide sequences for two cnidarians (Acropora digitifera and Hydra magnipapillata) were downloaded from individual sources.

Database 13. All metazoan entries except vertebrates in NCBI Protein
Peptide sequences registered in NCBI Protein under the taxonomic group 'Metazoa' [ link ], excluding those of vertebrates, were downloaded on June 20, 2016.


Notes
Please contact the chief administrator if you know of any source of sequences that seems more authentic or comprehensive than the one covered here.
