UCSC Genome Bioinformatics
  Frequently Asked Questions: Assembly Releases and Versions


  List of UCSC genome releases
 

Question:
"How do UCSC's release numbers correspond to those of other organizations, such as NCBI?"

Response:
The first release of an assembly is named using the first three characters of the organism's genus and species names, in the format gggSss#; each subsequent assembly increments the number. Assemblies predating the 2003 introduction of the six-letter naming system were given two-letter names in a similar gs# format, and human assemblies are named hg# (for human genome).
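
The naming rule above can be sketched in a few lines of Python. This is only an illustration of the pattern as described here; the function names are hypothetical, and some entries in the release list (hg# for human, for instance) do not follow the pattern mechanically.

```python
def six_letter_db_name(genus: str, species: str, version: int) -> str:
    """Build a gggSss# style name from genus, species, and version number."""
    # First three letters of the genus in lowercase, first three letters of
    # the species with an initial capital, then the assembly version.
    return f"{genus[:3].lower()}{species[:3].capitalize()}{version}"


def two_letter_db_name(genus: str, species: str, version: int) -> str:
    """Build an older gs# style name (pre-2003 convention)."""
    return f"{genus[0].lower()}{species[0].lower()}{version}"


print(six_letter_db_name("Pan", "troglodytes", 4))      # panTro4
print(two_letter_db_name("Caenorhabditis", "elegans", 1))  # ce1
```

Human assemblies would need a special case, since hg# stands for "human genome" rather than the genus and species initials.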

SPECIES | UCSC VERSION | RELEASE DATE | RELEASE NAME | STATUS

MAMMALS
Human | hg38 | Dec. 2013 | Genome Reference Consortium GRCh38 | Available
 | hg19 | Feb. 2009 | Genome Reference Consortium GRCh37 | Available
 | hg18 | Mar. 2006 | NCBI Build 36.1 | Available
 | hg17 | May 2004 | NCBI Build 35 | Available
 | hg16 | Jul. 2003 | NCBI Build 34 | Available
 | hg15 | Apr. 2003 | NCBI Build 33 | Archived
 | hg13 | Nov. 2002 | NCBI Build 31 | Archived
 | hg12 | Jun. 2002 | NCBI Build 30 | Archived
 | hg11 | Apr. 2002 | NCBI Build 29 | Archived (data only)
 | hg10 | Dec. 2001 | NCBI Build 28 | Archived (data only)
 | hg8 | Aug. 2001 | UCSC-assembled | Archived (data only)
 | hg7 | Apr. 2001 | UCSC-assembled | Archived (data only)
 | hg6 | Dec. 2000 | UCSC-assembled | Archived (data only)
 | hg5 | Oct. 2000 | UCSC-assembled | Archived (data only)
 | hg4 | Sep. 2000 | UCSC-assembled | Archived (data only)
 | hg3 | Jul. 2000 | UCSC-assembled | Archived (data only)
 | hg2 | Jun. 2000 | UCSC-assembled | Archived (data only)
 | hg1 | May 2000 | UCSC-assembled | Archived (data only)
Alpaca | vicPac2 | Mar. 2013 | Broad Institute Vicugna_pacos-2.0.1 | Available
 | vicPac1 | Jul. 2008 | Broad Institute VicPac1.0 | Available
Armadillo | dasNov3 | Dec. 2011 | Broad Institute DasNov3 | Available
Bushbaby | otoGar3 | Mar. 2011 | Broad Institute OtoGar3 | Available
Baboon | papHam1 | Nov. 2008 | Baylor College of Medicine HGSC Pham_1.0 | Available
 | papAnu2 | Mar. 2012 | Baylor College of Medicine Panu_2.0 | Available
Cat | felCat5 | Sep. 2011 | ICGSC Felis_catus-6.2 | Available
 | felCat4 | Dec. 2008 | NHGRI catChrV17e | Available
 | felCat3 | Mar. 2006 | Broad Institute Release 3 | Available
Chimp | panTro4 | Feb. 2011 | CGSC Build 2.1.4 | Available
 | panTro3 | Oct. 2010 | CGSC Build 2.1.3 | Available
 | panTro2 | Mar. 2006 | CGSC Build 2.1 | Available
 | panTro1 | Nov. 2003 | CGSC Build 1.1 | Available
Chinese hamster | criGri1 | Jul. 2013 | Beijing Genomics Institute-Shenzhen C_griseus_v1.0 | Available
Cow | bosTau8 | Jun. 2014 | University of Maryland v3.1.1 | Available
 | bosTau7 | Oct. 2011 | Baylor College of Medicine HGSC Btau_4.6.1 | Available
 | bosTau6 | Nov. 2009 | University of Maryland v3.1 | Available
 | bosTau4 | Oct. 2007 | Baylor College of Medicine HGSC Btau_4.0 | Available
 | bosTau3 | Aug. 2006 | Baylor College of Medicine HGSC Btau_3.1 | Available
 | bosTau2 | Mar. 2005 | Baylor College of Medicine HGSC Btau_2.0 | Available
 | bosTau1 | Sep. 2004 | Baylor College of Medicine HGSC Btau_1.0 | Archived
Dog | canFam3 | Sep. 2011 | Broad Institute v3.1 | Available
 | canFam2 | May 2005 | Broad Institute v2.0 | Available
 | canFam1 | Jul. 2004 | Broad Institute v1.0 | Available
Dolphin | turTru2 | Oct. 2011 | Baylor College of Medicine Ttru_1.4 | Available
Elephant | loxAfr3 | Jul. 2009 | Broad Institute LoxAfr3 | Available
Ferret | musFur1 | Apr. 2011 | Ferret Genome Sequencing Consortium MusPutFur1.0 | Available
Gibbon | nomLeu3 | Oct. 2012 | Gibbon Genome Sequencing Consortium Nleu3.0 | Available
 | nomLeu2 | Jun. 2011 | Gibbon Genome Sequencing Consortium Nleu1.1 | Available
 | nomLeu1 | Jan. 2010 | Gibbon Genome Sequencing Consortium Nleu1.0 | Available
Gorilla | gorGor3 | May 2011 | Wellcome Trust Sanger Institute gorGor3.1 | Available
Guinea pig | cavPor3 | Feb. 2008 | Broad Institute cavPor3 | Available
Hedgehog | eriEur2 | May 2012 | Broad Institute EriEur2.0 | Available
 | eriEur1 | Jun. 2006 | Broad Institute Draft_v1 | Available
Horse | equCab2 | Sep. 2007 | Broad Institute EquCab2 | Available
 | equCab1 | Jan. 2007 | Broad Institute EquCab1 | Available
Kangaroo rat | dipOrd1 | Jul. 2008 | Baylor/Broad Institute DipOrd1.0 | Available
Manatee | triMan1 | Oct. 2011 | Broad Institute TriManLat1.0 | Available
Marmoset | calJac3 | Mar. 2009 | WUSTL Callithrix_jacchus-v3.2 | Available
 | calJac1 | Jun. 2007 | WUSTL Callithrix_jacchus-v2.0.2 | Available
Megabat | pteVam1 | Jul. 2008 | Broad Institute Ptevap1.0 | Available
Microbat | myoLuc2 | Jul. 2010 | Broad Institute MyoLuc2.0 | Available
Minke whale | balAcu1 | Oct. 2013 | KORDI BalAcu1.0 | Available
Mouse | mm10 | Dec. 2011 | Genome Reference Consortium GRCm38 | Available
 | mm9 | Jul. 2007 | NCBI Build 37 | Available
 | mm8 | Feb. 2006 | NCBI Build 36 | Available
 | mm7 | Aug. 2005 | NCBI Build 35 | Available
 | mm6 | Mar. 2005 | NCBI Build 34 | Archived
 | mm5 | May 2004 | NCBI Build 33 | Archived
 | mm4 | Oct. 2003 | NCBI Build 32 | Archived
 | mm3 | Feb. 2003 | NCBI Build 30 | Archived
 | mm2 | Feb. 2002 | MGSCv3 | Archived
 | mm1 | Nov. 2001 | MGSCv2 | Archived (data only)
Mouse lemur | micMur1 | Jul. 2007 | Broad Institute MicMur1.0 | Available
Naked mole-rat | hetGla2 | Jan. 2012 | Broad Institute HetGla_female_1.0 | Available
 | hetGla1 | Jul. 2011 | Beijing Genomics Institute HetGla_1.0 | Available
Opossum | monDom5 | Oct. 2006 | Broad Institute release MonDom5 | Available
 | monDom4 | Jan. 2006 | Broad Institute release MonDom4 | Available
 | monDom1 | Oct. 2004 | Broad Institute release MonDom1 | Available
Orangutan | ponAbe2 | Jul. 2007 | WUSTL Pongo_albelii-2.0.2 | Available
Panda | ailMel1 | Dec. 2009 | BGI-Shenzhen AilMel 1.0 | Available
Pig | susScr3 | Aug. 2011 | Swine Genome Sequencing Consortium Sscrofa10.2 | Available
 | susScr2 | Nov. 2009 | Swine Genome Sequencing Consortium Sscrofa9.2 | Available
Pika | ochPri3 | May 2012 | Broad Institute OchPri3.0 | Available
 | ochPri2 | Jul. 2008 | Broad Institute OchPri2 | Available
Platypus | ornAna1 | Mar. 2007 | WUSTL v5.0.1 | Available
Rabbit | oryCun2 | Apr. 2009 | Broad Institute release OryCun2 | Available
Rat | rn6 | Jul. 2014 | RGSC Rnor_6.0 | Available
 | rn5 | Mar. 2012 | RGSC Rnor_5.0 | Available
 | rn4 | Nov. 2004 | Baylor College of Medicine HGSC v3.4 | Available
 | rn3 | Jun. 2003 | Baylor College of Medicine HGSC v3.1 | Available
 | rn2 | Jan. 2003 | Baylor College of Medicine HGSC v2.1 | Archived
 | rn1 | Nov. 2002 | Baylor College of Medicine HGSC v1.0 | Archived
Rhesus | rheMac3 | Oct. 2010 | Beijing Genomics Institute CR_1.0 | Available
 | rheMac2 | Jan. 2006 | Baylor College of Medicine HGSC v1.0 Mmul_051212 | Available
 | rheMac1 | Jan. 2005 | Baylor College of Medicine HGSC Mmul_0.1 | Archived
Rock hyrax | proCap1 | Jul. 2008 | Baylor College of Medicine HGSC Procap1.0 | Available
Sheep | oviAri3 | Aug. 2012 | ISGC Oar_v3.1 | Available
 | oviAri1 | Feb. 2010 | ISGC Ovis aries 1.0 | Available
Shrew | sorAra2 | Aug. 2008 | Broad Institute SorAra2.0 | Available
 | sorAra1 | Jun. 2006 | Broad Institute SorAra1.0 | Available
Sloth | choHof1 | Jul. 2008 | Broad Institute ChoHof1.0 | Available
Squirrel | speTri2 | Nov. 2011 | Broad Institute SpeTri2.0 | Available
Squirrel monkey | saiBol1 | Oct. 2011 | Broad Institute SaiBol1.0 | Available
Tarsier | tarSyr1 | Aug. 2008 | WUSTL/Broad Institute Tarsyr1.0 | Available
Tasmanian devil | sarHar1 | Feb. 2011 | Wellcome Trust Sanger Institute Devil_refv7.0 | Available
Tenrec | echTel2 | Nov. 2012 | Broad Institute EchTel2.0 | Available
 | echTel1 | Jul. 2005 | Broad Institute echTel1 | Available
Tree shrew | tupBel1 | Dec. 2006 | Broad Institute Tupbel1.0 | Available
Wallaby | macEug2 | Sep. 2009 | Tammar Wallaby Genome Sequencing Consortium Meug_1.1 | Available
White rhinoceros | cerSim1 | May 2012 | Broad Institute CerSimSim1.0 | Available

VERTEBRATES
American alligator | allMis1 | Aug. 2012 | Int. Crocodilian Genomes Working Group allMis0.2 | Available
Atlantic cod | gadMor1 | May 2010 | Genofisk GadMor_May2010 | Available
Budgerigar | melUnd1 | Sep. 2011 | WUSTL v6.3 | Available
Chicken | galGal4 | Nov. 2011 | ICGC Gallus-gallus-4.0 | Available
 | galGal3 | May 2006 | WUSTL Gallus-gallus-2.1 | Available
 | galGal2 | Feb. 2004 | WUSTL Gallus-gallus-1.0 | Available
Coelacanth | latCha1 | Aug. 2011 | Broad Institute LatCha1 | Available
Elephant shark | calMil1 | Dec. 2013 | IMCB Callorhinchus_milli_6.1.3 | Available
Fugu | fr3 | Oct. 2011 | JGI v5.0 | Available
 | fr2 | Oct. 2004 | JGI v4.0 | Available
 | fr1 | Aug. 2002 | JGI v3.0 | Available
Lamprey | petMar2 | Sep. 2010 | WUGSC 7.0 | Available
 | petMar1 | Mar. 2007 | WUSTL v3.0 | Available
Lizard | anoCar2 | May 2010 | Broad Institute AnoCar2 | Available
 | anoCar1 | Feb. 2007 | Broad Institute AnoCar1 | Available
Medaka | oryLat2 | Oct. 2005 | NIG v1.0 | Available
Medium ground finch | geoFor1 | Apr. 2012 | BGI GeoFor_1.0 / NCBI 13302 | Available
Nile tilapia | oreNil2 | Jan. 2011 | Broad Institute Release OreNil1.1 | Available
Painted turtle | chrPic1 | Dec. 2011 | IPTGSC Chrysemys_picta_bellii-3.0.1 | Available
Stickleback | gasAcu1 | Feb. 2006 | Broad Institute Release 1.0 | Available
Tetraodon | tetNig2 | Mar. 2007 | Genoscope v7 | Available
 | tetNig1 | Feb. 2004 | Genoscope v7 | Available
Turkey | melGal1 | Dec. 2009 | Turkey Genome Consortium v2.01 | Available
X. tropicalis | xenTro3 | Nov. 2009 | JGI v.4.2 | Available
 | xenTro2 | Aug. 2005 | JGI v.4.1 | Available
 | xenTro1 | Oct. 2004 | JGI v.3.0 | Available
Zebra finch | taeGut2 | Feb. 2013 | WashU taeGut324 | Available
 | taeGut1 | Jul. 2008 | WUSTL v3.2.4 | Available
Zebrafish | danRer7 | Jul. 2010 | Sanger Institute Zv9 | Available
 | danRer6 | Dec. 2008 | Sanger Institute Zv8 | Available
 | danRer5 | Jul. 2007 | Sanger Institute Zv7 | Available
 | danRer4 | Mar. 2006 | Sanger Institute Zv6 | Available
 | danRer3 | May 2005 | Sanger Institute Zv5 | Available
 | danRer2 | Jun. 2004 | Sanger Institute Zv4 | Archived
 | danRer1 | Nov. 2003 | Sanger Institute Zv3 | Archived

DEUTEROSTOMES
C. intestinalis | ci2 | Mar. 2005 | JGI v2.0 | Available
 | ci1 | Dec. 2002 | JGI v1.0 | Available
Lancelet | braFlo1 | Mar. 2006 | JGI v1.0 | Available
S. purpuratus | strPur2 | Sep. 2006 | Baylor College of Medicine HGSC v. Spur 2.1 | Available
 | strPur1 | Apr. 2005 | Baylor College of Medicine HGSC v. Spur_0.5 | Available

INSECTS
A. mellifera | apiMel2 | Jan. 2005 | Baylor College of Medicine HGSC v.Amel_2.0 | Available
 | apiMel1 | Jul. 2004 | Baylor College of Medicine HGSC v.Amel_1.2 | Available
A. gambiae | anoGam1 | Feb. 2003 | IAGP v.MOZ2 | Available
D. ananassae | droAna2 | Aug. 2005 | Agencourt Arachne release | Available
 | droAna1 | Jul. 2004 | TIGR Celera release | Available
D. erecta | droEre1 | Aug. 2005 | Agencourt Arachne release | Available
D. grimshawi | droGri1 | Aug. 2005 | Agencourt Arachne release | Available
D. melanogaster | dm6 | Aug. 2014 | BDGP Release 6 + ISO1 MT | Available
 | dm3 | Apr. 2006 | BDGP Release 5 | Available
 | dm2 | Apr. 2004 | BDGP Release 4 | Available
 | dm1 | Jan. 2003 | BDGP Release 3 | Available
D. mojavensis | droMoj2 | Aug. 2005 | Agencourt Arachne release | Available
 | droMoj1 | Aug. 2004 | Agencourt Arachne release | Available
D. persimilis | droPer1 | Oct. 2005 | Broad Institute release | Available
D. pseudoobscura | dp3 | Nov. 2004 | Flybase Release 1.0 | Available
 | dp2 | Aug. 2003 | Baylor College of Medicine HGSC Freeze 1 | Available
D. sechellia | droSec1 | Oct. 2005 | Broad Institute Release 1.0 | Available
D. simulans | droSim1 | Apr. 2005 | WUSTL Release 1.0 | Available
D. virilis | droVir2 | Aug. 2005 | Agencourt Arachne release | Available
 | droVir1 | Jul. 2004 | Agencourt Arachne release | Available
D. yakuba | droYak2 | Nov. 2005 | WUSTL Release 2.0 | Available
 | droYak1 | Apr. 2004 | WUSTL Release 1.0 | Available

NEMATODES
C. brenneri | caePb2 | Feb. 2008 | WUSTL 6.0.1 | Available
 | caePb1 | Jan. 2007 | WUSTL 4.0 | Available
C. briggsae | cb3 | Jan. 2007 | WUSTL Cb3 | Available
 | cb1 | Jul. 2002 | WormBase v. cb25.agp8 | Available
C. elegans | ce10 | Oct. 2010 | WormBase v. WS220 | Available
 | ce6 | May 2008 | WormBase v. WS190 | Available
 | ce4 | Jan. 2007 | WormBase v. WS170 | Available
 | ce2 | Mar. 2004 | WormBase v. WS120 | Available
 | ce1 | May 2003 | WormBase v. WS100 | Archived
C. japonica | caeJap1 | Mar. 2008 | WUSTL 3.0.2 | Available
C. remanei | caeRem3 | May 2007 | WUSTL 15.0.1 | Available
 | caeRem2 | Mar. 2006 | WUSTL 1.0 | Available
P. pacificus | priPac1 | Feb. 2007 | WUSTL 5.0 | Available

OTHER
Sea Hare | aplCal1 | Sep. 2008 | Broad Release Aplcal2.0 | Available
Yeast | sacCer3 | Apr. 2011 | SGD April 2011 sequence | Available
 | sacCer2 | Jun. 2008 | SGD June 2008 sequence | Available
 | sacCer1 | Oct. 2003 | SGD 1 Oct 2003 sequence | Available

VIRUSES
Ebola Virus | eboVir3 | Jun. 2014 | Sierra Leone 2014 (G3683/KM034562.1) | Available



  Initial assembly release dates
 

Question:
"When will the next assembly be out?"

Response:
UCSC does not produce its own genome assemblies, but instead obtains them from standard sources. For example, the human assembly is obtained from NCBI. Because of this, you can expect us to release a new version of a genome soon after the assembling organization has released the version. A new assembly release initially consists of the genome sequence and a small set of aligned annotation tracks. Additional annotation tracks are added as they are obtained or generated. Bulk downloads of the data are typically available in the first week after the assembly is released in the browser.



  Data sources - UCSC assemblies
 

Question:
"Where does UCSC obtain the assembly and annotation data displayed in the Genome Browser?"

Response:
All the assembly data displayed in the UCSC Genome Browser are obtained from external sequencing centers. To determine the data source and version for a given assembly, see the assembly's description on the Genome Browser Gateway page or the List of UCSC Genome Releases.

The annotations accompanying an assembly are obtained from a variety of sources. The UCSC Genome Bioinformatics Group generates several of the tracks; the remainder are contributed by collaborators at other sites. Each track has an associated description page that credits the authors of the annotation.

For detailed information about the individuals and organizations who contributed to a specific assembly, see the Credits page.



  Comparison of UCSC and NCBI human assemblies
 

Question:
"How do the human assemblies displayed in the UCSC Genome Browser differ from the NCBI human assemblies?"

Response:
Recent human assemblies displayed in the Genome Browser (hg10 and higher) are identical to the NCBI assemblies.



  Differences between UCSC and NCBI mouse assemblies
 

Question:
"Is the mouse genome assembly displayed in the UCSC Genome Browser the same as the one on the NCBI website?"

Response:
The mouse genome assemblies featured in the UCSC Genome Browser are the same as those on the NCBI website, with one difference: the UCSC versions contain only the reference strain data (C57BL/6J). NCBI provides data for several additional strains in its builds.



  Accessing older assembly versions
 

Question:
"I need to access an older version of a genome assembly that's no longer listed in the Genome Browser menu. What should I do?"

Response:
In addition to the assembly versions currently available in the Genome Browser, you can access the data for older assemblies through our Downloads page.



  Frequency of GenBank data updates
 

Question:
"How frequently does UCSC update its databases with new data from GenBank?"

Response:
Daily and weekly incremental updates of mRNA, RefSeq, and EST data are in place for several of the more recent Genome Browser assemblies. Assemblies that are not on an incremental update schedule are updated whenever we load a new assembly or make a major revision to a table.

Data are updated on the following schedule:

  • Native and xeno mRNA and RefSeq tracks: updated daily for human and mouse assemblies; updated approximately weekly for all other organisms
  • EST data: updated weekly on Saturday morning
  • Downloadable data files: updated weekly on Saturday morning
  • Outdated sequences: removed once per quarter

Mirror sites are not required to use an incremental update process, and should not experience problems as a result of these updates.
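
For scripts that need to reason about data freshness, the schedule above can be written down as a small lookup. This is a minimal sketch in Python: the function name and the data-type keys are hypothetical labels chosen for illustration, and the schedule applies only to assemblies that are on an incremental update schedule.

```python
# Organisms whose mRNA and RefSeq tracks are updated daily, per the list above.
DAILY_ORGANISMS = {"human", "mouse"}


def update_frequency(data_type: str, organism: str = "") -> str:
    """Return the update frequency described in this FAQ for a data type.

    data_type is one of the hypothetical keys "mrna", "refseq", "est",
    "downloads", or "outdated_sequences".
    """
    if data_type in ("mrna", "refseq"):
        if organism.lower() in DAILY_ORGANISMS:
            return "daily"
        return "approximately weekly"
    if data_type in ("est", "downloads"):
        return "weekly, Saturday morning"
    if data_type == "outdated_sequences":
        return "quarterly"
    raise ValueError(f"unknown data type: {data_type!r}")


print(update_frequency("mrna", "human"))       # daily
print(update_frequency("refseq", "zebrafish"))  # approximately weekly
```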



  Coordinate changes between assemblies
 

Question:
"I noticed that the chromosomal coordinates for a particular gene that I'm looking at have changed since the last time I used your browser. What happened?"

Response:
A common source of confusion for users arises from mixing up different assemblies. It is very important to be aware of which assembly you are looking at. Within the Genome Browser display, assemblies are labeled by organism and date. To look up the corresponding UCSC database name or NCBI build number, use the release table.

UCSC database labels are of the form hg#, panTro#, etc. The letters designate the organism, e.g. hg for human genome or panTro for Pan troglodytes. The number denotes the UCSC assembly version for that organism. For example, ce1 refers to the first UCSC assembly of the C. elegans genome.
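
When processing many database labels, it can help to split each one into its organism prefix and version number. The following Python sketch assumes labels consist of a lowercase prefix, an optional capitalized second part, and a trailing number, as in the examples above; the helper name is hypothetical.

```python
import re

# Lowercase letters, optionally followed by one capitalized part, then digits.
_DB_NAME = re.compile(r"^([a-z]+(?:[A-Z][a-z]+)?)(\d+)$")


def split_db_name(name: str) -> tuple:
    """Split a label such as 'panTro4' into ('panTro', 4)."""
    match = _DB_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognized database label: {name!r}")
    return match.group(1), int(match.group(2))


print(split_db_name("hg19"))     # ('hg', 19)
print(split_db_name("panTro4"))  # ('panTro', 4)
print(split_db_name("ce1"))      # ('ce', 1)
```

Comparing the numeric part of two labels with the same prefix shows which assembly version is newer, which is a quick check against mixing assemblies.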

The coordinates of your favorite gene in one assembly may not be the same as those in the next release of the assembly unless the gene happens to lie on a completely sequenced and unrevised chromosome. For information on integrating data from one assembly into another, see the Converting positions between assembly versions section.
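
To illustrate why coordinates shift, the toy Python sketch below maps a position from an old assembly to a new one through a list of aligned blocks. The block values are invented for illustration only and do not describe any real assembly, and the function is not the implementation of any UCSC tool; it only shows the general idea that positions inside aligned blocks move by a block-specific offset, while positions in revised regions may have no counterpart.

```python
from bisect import bisect_right

# Hypothetical aligned blocks, sorted by old_start.
# Each entry is (old_start, old_end, new_start), with half-open intervals.
BLOCKS = [
    (0, 1000, 0),        # unchanged region
    (1000, 5000, 1200),  # shifted by +200, e.g. sequence added upstream
    (6000, 9000, 6100),  # shifted by +100; old 5000-6000 has no counterpart
]


def lift_position(pos, blocks=BLOCKS):
    """Map a 0-based position in the old assembly to the new one, or None."""
    starts = [block[0] for block in blocks]
    index = bisect_right(starts, pos) - 1  # last block starting at or before pos
    if index < 0:
        return None
    old_start, old_end, new_start = blocks[index]
    if pos >= old_end:
        return None  # falls in a region that was not carried over
    return new_start + (pos - old_start)


print(lift_position(500))   # 500
print(lift_position(1500))  # 1700
print(lift_position(5500))  # None
```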



  Converting positions between assembly versions
 

Question:
"I've been researching a specific area of the human genome on the current assembly, and now you've just released a new version. Is there an easy way to locate my area of interest on the new assembly?"

Response:
See the section on converting coordinates for information on assembly migration tools.



  Missing annotation tracks
 

Question:
"Why is my favorite annotation track missing from your latest release?"

Response:
The initial release of a new genome assembly typically contains a small subset of core annotation tracks. New tracks are added as they are generated. In many cases, our annotation tracks are contributed by scientists not affiliated with UCSC who must first obtain the sequence, repeat-masked data, etc. before they can produce their tracks. If you need an annotation that has not appeared on an assembly within a month or so of its release, feel free to send an inquiry to genome@soe.ucsc.edu. Messages sent to this address will be posted to the moderated genome mailing list, which is archived on a SEARCHABLE PUBLIC Google Groups forum.



  What next with the human genome?
 

Question:
"Now that the human genome is 'finished', will there be any more releases?"

Response:
Rest assured that work will continue. There will be updates to the assembly over the next several years. This has been the case for all other finished (i.e. essentially complete) genome assemblies as gaps are closed. For example, the C. elegans genome has been "finished" for several years, but small bits of sequence are still being added and corrections are being made. NCBI will continue to coordinate the human genome assemblies in collaboration with the individual chromosome coordinators, and UCSC will continue to QC the assembly in conjunction with NCBI (and, to a lesser extent, Ensembl). UCSC, NCBI, Ensembl, and others will display the new releases on their sites as they become available.



  Mouse strain used for mouse genome sequence
 

Question:
"What strain of mouse was used for the Mus musculus genome?"

Response:
C57BL/6J.



  UniProt (Swiss-Prot/TrEMBL) display changes
 

Question:
"What has UCSC done to accommodate the changes to display IDs recently introduced by UniProt (aka Swiss-Prot/TrEMBL)?"

Response:
Here is a detailed description of the database changes we have made to accommodate the UniProt changes. If you are using the proteinID field in our knownGene table or the Swiss-Prot/TrEMBL display ID for indexing or cross-referencing other data, we strongly suggest you transition to the UniProt accession number. These changes will also affect anyone who is mirroring our site.

  1. The latest UniProt Knowledgebase (Release 46.0, Feb. 1st, 2005) was parsed and the results were stored in a newly created database sp050201.
  2. A corresponding database, proteins050201, was constructed based on data in sp050201 and other protein data sources.
  3. Two new symbolic database pointers, uniProt and proteome, have been created to point to the two new databases mentioned above. Some parts of our programs use the data in these two DBs.
       uniProt  ---> sp050201
       proteome ---> proteins050201
  4. The existing protein symbolic database pointers, swissProt and proteins remain unchanged. Some parts of our programs still use these two pointers and the data in their associated protein databases.
       swissProt ---> sp041115
       proteins  ---> proteins041115
  5. Two new tables, spOldNew and uniProtAlias, have been added to the proteome database.

    The spOldNew table contains three columns:
    • acc -- primary accession number
    • oldDisplayId -- old display ID
    • newDisplayId -- new display ID

    The uniProtAlias table contains four columns:
    • acc -- UniProt accession number
    • alias -- alias (could be acc, old and new display IDs, etc.)
    • aliasSrc -- source of the alias type
    • aliasSrcDate -- date of the source data

    The aliases include primary accessions, secondary accessions, new display IDs, old display IDs, and old display IDs corresponding to new secondary accessions.

  6. Three new functions have been added to kent/src/hg/spDb.c:
       char *oldSpDisplayId(char *newSpDisplayId);
       /* Convert from new Swiss-Prot display ID to old display ID */
          
       char *newSpDisplayId(char *oldSpDisplayId); 
       /* Convert from old Swiss-Prot display ID to new display ID */
    
       char *uniProtFindPrimAcc(char *id);
       /* Return primary accession given an alias. */

    The uniProtFindPrimAcc() function is enabled by the new uniProtAlias table.
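
The lookups described above can be sketched against an in-memory SQLite database in Python. This is an illustration only: the table and column names come from the description above, but the column types, the sample row values (X00001, OLDID_DEMO, NEWID_DEMO), the aliasSrc labels, and the Python function names are all hypothetical, and the real implementation is the C code in spDb.c.

```python
import sqlite3


def build_demo_db():
    """Create the two tables with made-up demo rows (not real UniProt data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE spOldNew (acc TEXT, oldDisplayId TEXT, newDisplayId TEXT);
        CREATE TABLE uniProtAlias (acc TEXT, alias TEXT,
                                   aliasSrc TEXT, aliasSrcDate TEXT);
        """
    )
    conn.execute(
        "INSERT INTO spOldNew VALUES (?, ?, ?)",
        ("X00001", "OLDID_DEMO", "NEWID_DEMO"),
    )
    conn.executemany(
        "INSERT INTO uniProtAlias VALUES (?, ?, ?, ?)",
        [
            ("X00001", "X00001", "acc", "2005-02-01"),
            ("X00001", "OLDID_DEMO", "oldDisplayId", "2005-02-01"),
            ("X00001", "NEWID_DEMO", "newDisplayId", "2005-02-01"),
        ],
    )
    return conn


def find_primary_acc(conn, alias):
    """Return the accession for any alias, or None (cf. uniProtFindPrimAcc)."""
    row = conn.execute(
        "SELECT acc FROM uniProtAlias WHERE alias = ? LIMIT 1", (alias,)
    ).fetchone()
    return row[0] if row else None


def new_display_id(conn, old_id):
    """Return the new display ID for an old one, or None (cf. newSpDisplayId)."""
    row = conn.execute(
        "SELECT newDisplayId FROM spOldNew WHERE oldDisplayId = ?", (old_id,)
    ).fetchone()
    return row[0] if row else None


conn = build_demo_db()
print(find_primary_acc(conn, "OLDID_DEMO"))  # X00001
print(new_display_id(conn, "OLDID_DEMO"))    # NEWID_DEMO
```

Keying cross-references on the accession returned by the alias lookup, rather than on a display ID, is what makes them robust to the display ID change.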

We anticipate additional changes down the road and may eventually merge the two sets of protein DB pointers into one set.

Currently, the proteinID field of the knownGene table for existing genome releases (hg15, hg16, hg17, mm3, mm4, mm5, rn2, and rn3) uses old Swiss-Prot/TrEMBL display IDs (pre-1 Feb. '05). In the future, we may change this field to show the UniProt accession number. Should we choose not to change the content of the proteinID field, we may consider adding a new field, uniProtAcc.

If you have any questions about these changes and their impact on your work, please email us at [email protected]. Mirror sites may send questions to [email protected]. Messages sent to these addresses will be posted to the moderated mailing lists, which are archived on SEARCHABLE PUBLIC Google Groups forums.