About ENCODE Data

The Encyclopedia of DNA Elements (ENCODE) Consortium is an international collaboration of research groups funded by the National Human Genome Research Institute (NHGRI). The goal of ENCODE is to build a comprehensive parts list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control cells and circumstances in which a gene is active.

ENCODE data are now available for the entire human genome. All ENCODE data are free and available for immediate use.

To search for ENCODE data related to your area of interest and set up a browser view, use the UCSC Experiment Matrix or Track Search tool (Advanced features). The Experiment List (Human) and Experiment List (Mouse) links provide comprehensive listings of ENCODE data that are released or in preparation.

All ENCODE data are freely available for download and analysis. However, before publishing research that uses ENCODE data, please read the ENCODE Data Release Policy, which places some restrictions on the use of data in publications for nine months following data release. Read more about ENCODE data at UCSC.


28 September 2012 - New ENCODE on-line tutorial and Quick Reference Cards

To supplement the existing ENCODE Foundations tutorial, OpenHelix has developed a second tutorial describing newer ENCODE data and access tools. Slides for this tutorial are freely available from the website at: http://openhelix.com/ENCODE2. Also available free (up to 30 cards shipped free in the US) is a Quick Reference Card that provides a handy summary of access features. For more information see the OpenHelix announcement.

27 September 2012 - ENCODE data releases: UNC/BSU ProtGenc, UNC FAIRE (Rel 2), HAIB TFBS (Rel 3), CSHL Long RNA-seq (Rel 3)

One new track and three track updates were released on the human hg19 browser:

Proteogenomics Hg19 and GENCODE Mapping from ENCODE/Univ. North Carolina/Boise State Univ: This track displays mass spectrometry data that have been matched to genomic sequences in GM12878, H1-hESC, H1-neuron, and K562 cell types. Peptides were mapped to an in silico translation and proteolytic digestion of the whole human genome (UCSC Hg19), and to the GENCODE database of translated protein-coding transcripts. The track can be used to identify which parts of the genome are translated into proteins, to verify which transcripts discovered by other ENCODE experiments are protein-coding, and to reveal new genes, splice variants, and proteins with post-translational modifications (PTMs). Of particular interest is the possibility of uncovering the translation of small open reading frames (ORFs), antisense transcripts, or protein-coding regions that were previously annotated as introns.

Open Chromatin by FAIRE from ENCODE/OpenChrom(UNC Chapel Hill) (Release 2): FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) is a method to isolate and identify nucleosome-depleted regions of the genome. Release 2 of this track contains 12 new experiments, including 11 new cell lines.

Transcription Factor Binding Sites by ChIP-seq from ENCODE/HAIB (Release 3): This release contains 110 new experiments including 3 new cell lines. There were also data corrections (see Release Notes section for details).

Long RNA-seq from ENCODE/Cold Spring Harbor Lab (Release 3): This release contains additional files for many experiments including GENCODE V10 referenced Transcripts, Genes and Exons (previous datasets referenced GENCODE V7). Complete details are in the track Release Notes.

25 September 2012 - Mouse ENCODE data releases: UW DNaseI DGF, UW DNaseI HS (Rel 2), LICR Histone (Rel 3), CSHL Long RNA-seq (Rel 3)

One new track and three track updates were released on the Mouse mm9 browser. Read more.

5 September 2012 - ENCODE results published in Nature, Science and other journals

The results of the ENCODE project were published today in a coordinated set of 30 papers across multiple journals. Read more.

