RefSeq
Content | |
---|---|
Description | Curated, non-redundant database of genomic, transcript, and protein sequences. |
Contact | |
Research center | National Center for Biotechnology Information |
Primary citation | Pruitt KD et al. (2005)[1] |
Access | |
Website | http://www.ncbi.nlm.nih.gov/RefSeq |
The Reference Sequence (RefSeq) database[1] is an open-access, annotated, and curated collection of publicly available nucleotide sequences (DNA, RNA) and their protein products. It is built by the National Center for Biotechnology Information (NCBI) and, unlike GenBank, provides only a single record for each natural biological molecule (i.e. DNA, RNA, or protein) for major organisms ranging from viruses to bacteria to eukaryotes.
For each model organism, RefSeq aims to provide separate and linked records for the genomic DNA, the gene transcripts, and the proteins arising from those transcripts. RefSeq is limited to major organisms for which sufficient data are available (more than 16,000 distinct “named” organisms as of September 2011),[2] while GenBank includes sequences for any organism submitted (approximately 250,000 different named organisms).
RefSeq categories
Category | Description |
---|---|
NC | Complete genomic molecules |
NG | Incomplete genomic region |
NM | mRNA |
NR | ncRNA |
NP | Protein |
XM | Predicted mRNA model |
XR | Predicted ncRNA model |
XP | Predicted protein model |
For more details and more categories, see Table 1 in Chapter 18 of the book The Reference Sequence (RefSeq) Database.
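As a minimal sketch, the category table can be expressed as a lookup keyed by accession prefix. This assumes that the category code is the part of an accession before its first underscore; the function names and the accession strings used to exercise them are illustrative placeholders rather than part of any NCBI tool, and only the eight categories listed above are covered.

```python
# Prefix-to-category lookup mirroring the category table above.
# Only the eight listed prefixes are included; others return None.
REFSEQ_CATEGORIES = {
    "NC": "Complete genomic molecules",
    "NG": "Incomplete genomic region",
    "NM": "mRNA",
    "NR": "ncRNA",
    "NP": "Protein",
    "XM": "Predicted mRNA model",
    "XR": "Predicted ncRNA model",
    "XP": "Predicted protein model",
}


def refseq_category(accession):
    """Return the table category for an accession, or None if unrecognised.

    Assumption: the category code is the text before the first underscore.
    """
    prefix, sep, _rest = accession.strip().upper().partition("_")
    if not sep:
        # No underscore, so this does not look like a prefixed accession.
        return None
    return REFSEQ_CATEGORIES.get(prefix)


def is_predicted(accession):
    """Return True if the accession falls in one of the predicted (X*) categories."""
    prefix = accession.strip().upper().partition("_")[0]
    return prefix in REFSEQ_CATEGORIES and prefix.startswith("X")


if __name__ == "__main__":
    # Placeholder accession strings, used only to exercise the lookup.
    for acc in ["NM_000001.1", "XP_000001.1", "AB123456"]:
        print(acc, refseq_category(acc), is_predicted(acc))
```

A plain dictionary keeps the mapping easy to extend with further prefixes from the fuller table referenced above.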
See also
- GenBank
- Sequence analysis
- Sequence profiling tool
- Sequence motif
- UniProt
- List of sequenced eukaryotic genomes
- List of sequenced archaeal genomes
Sources
- This article incorporates public domain material from the National Center for Biotechnology Information document "NCBI Handbook".