Attribute List for Genome Assembly: hg18


This page summarizes the attribute groups included in EpiGRAPH and provides references to the sources from which the datasets were obtained. Further information is available on the EpiGRAPH Background page and in the EpiGRAPH attribute reference sheet.



DNA_Sequence


Attributes that describe the DNA sequence itself, including base composition and oligonucleotide patterns


Attribute name | Description | Data source for attribute | Score, class and category columns
Base_composition | Strand-specific frequency of occurrence of each nucleotide (A, C, G and T) | Calculated directly from the DNA sequence
All_2mers | Frequency of occurrence, reported separately for each oligonucleotide of size two that does not include any Ns | Calculated directly from the DNA sequence
All_4mers | Frequency of occurrence, reported separately for each oligonucleotide of size four that does not include any Ns | Calculated directly from the DNA sequence
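
The oligonucleotide attributes above can be illustrated with a minimal sketch. It assumes that frequencies are relative counts over all windows free of Ns; the exact normalisation used by EpiGRAPH is not stated on this page, and the function name `kmer_frequencies` is hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k):
    """Return the relative frequency of every k-mer over A, C, G and T.

    Windows that contain N (or any other non-ACGT symbol) are skipped,
    so the frequencies sum to 1 over the windows that were counted.
    Assumption: this normalisation is illustrative, not EpiGRAPH's own.
    """
    sequence = sequence.upper()
    alphabet = "ACGT"
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if all(base in alphabet for base in window):
            counts[window] += 1
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


if __name__ == "__main__":
    # k=1 gives a base composition, k=2 and k=4 give the 2-mer and 4-mer tables.
    print(kmer_frequencies("ACGTNACGT", 2)["AC"])
```

Calling the function with k=1, k=2 and k=4 on the same strand would give values comparable in spirit to Base_composition, All_2mers and All_4mers.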


DNA_Structure


Attributes that describe the DNA structure (as inferred from the DNA sequence), including distortions of the helix and DNA melting temperatures


Attribute name | Description | Data source for attribute | Score, class and category columns
Predicted_Helix_Structure | Helix structure of naked DNA as predicted from octamers with known structure | Calculated by a simple sliding window approach using the simulation data reported in Gardiner et al. (2003) J Mol Biol | twist, roll, tilt, rise, slide, shift
Predicted_Solvent_Accessible_Surface | Solvent accessible surface area of naked DNA as predicted from trimers with known values | Calculated similarly to the UCSC Genome Browser Boston University ORChID track (http://genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=78806550&c=chr7&g=encodeBu_ORChID1) | pk1_mean, pk2_mean, pk3_mean
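
A sliding window approach of the kind mentioned for Predicted_Helix_Structure can be sketched as follows. This is only an illustration under stated assumptions: the lookup table is a toy dictionary with made-up values (not the Gardiner et al. simulation data), the region score is taken to be a plain mean over all windows found in the table, and the name `sliding_window_mean` is hypothetical.

```python
def sliding_window_mean(sequence, table, k):
    """Average table[kmer] over every k-mer window of the sequence.

    Windows that are not present in the table (for example because
    they contain N) are ignored. Returns None if no window matched.
    """
    sequence = sequence.upper()
    values = []
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if kmer in table:
            values.append(table[kmer])
    return sum(values) / len(values) if values else None


if __name__ == "__main__":
    # Toy table with invented values, keyed by dimers to keep the demo short.
    # A real table would be keyed by octamers (helix structure) or trimers
    # (solvent accessible surface) and hold one value per structural parameter.
    toy_twist = {"AC": 1.0, "CG": 3.0}
    print(sliding_window_mean("ACGN", toy_twist, 2))
```

One table per structural parameter (twist, roll, tilt, rise, slide, shift) would yield one score column per parameter for each region.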


Repetitive_DNA


Attributes that describe repetition within the DNA, including transposable elements, tandem repeats and segmental duplications


Attribute name | Description | Data source for attribute | Score, class and category columns
RepeatMasker | Repeats as detected by RepeatMasker. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr7&g=rmsk for details. | UCSC Genome Browser, tables chr1_rmsk to chrY_rmsk | swScore, repStart, repLeft, repClass, repFamily
Simple_Repeats | Tandem repeats as detected by Tandem Repeats Finder. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr7&g=simpleRepeat for details. | UCSC Genome Browser, table simpleRepeat | period, copyNum, score, entropy
Self_Chain | Self-chain alignments as detected by blastz. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr7&g=chainSelf for details. | UCSC Genome Browser, tables chr1_chainSelf to chrY_chainSelf | normScore
Segmental_Dups | Segmental duplications as detected by the 'fuguization' method. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr7&g=genomicSuperDups for details. | UCSC Genome Browser, table genomicSuperDups


Chromosome_Organisation


Attributes that describe the large-scale functional organisation of the chromosomes, including chromosomal bands and special-interest regions


Attribute name | Description | Data source for attribute | Score, class and category columns
Chromosome_Band | Chromosome bands localized by FISH mapping clones. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr7&g=cytoBand for details. | UCSC Genome Browser, table cytoBand | gieStain
ENCODE_Regions | Target regions of the international ENCODE project pilot phase. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=78872892&c=chr2&g=encodeRegions for details. | UCSC Genome Browser, table encodeRegions


Evolutionary_History


Attributes that describe the evolutionary history of the genome, including conservation and local recombination rates


Attribute name | Description | Data source for attribute | Score, class and category columns
Recomb_Rate | Low-resolution recombination rates estimated from genetic maps. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr7&g=recombRate for details. | UCSC Genome Browser, table recombRate | decodeAvg, decodeFemale, decodeMale, marshfieldAvg, genethonAvg
Multiz_Alignment | Alignment of 16 vertebrates, including mammalian, amphibian, bird, and fish species, with the human genome. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr7&g=multiz17way for details. | UCSC Genome Browser, table multiz17waySummary | score, leftStatus, rightStatus, src
Multiz_Conservation | Evolutionary conservation in 17 vertebrates, including mammalian, amphibian, bird, and fish species, based on a phylogenetic hidden Markov model (phastCons). See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr7&g=multiz17way for details. | UCSC Genome Browser, directory multiz17way | score


Population_Variation


Attributes that describe the variability among today's individuals, including SNPs and microdeletions


Attribute name | Description | Data source for attribute | Score, class and category columns
SNPs | Compilation of simple nucleotide polymorphisms from dbSNP build 126. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr7&g=snp126 for details. | UCSC Genome Browser, table snp126 | class, func


Genes


Attributes that describe the distribution of known and predicted protein-coding genes, pseudogenes and non-coding genes within the genome


Attribute name | Description | Data source for attribute | Score, class and category columns
CCDS | High-confidence gene annotations from the Consensus Coding DNA Sequence (CCDS) project. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr2&g=ccdsGene for details. | UCSC Genome Browser, table ccdsGene
Known_Genes | Known protein-coding genes based on protein data from UniProt (SWISS-PROT and TrEMBL) and mRNA data from the NCBI reference sequences collection (RefSeq) and GenBank. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr2&g=knownGene for details. | UCSC Genome Browser, table knownGene
RefSeq_Genes | Known protein-coding genes taken from the NCBI mRNA reference sequences collection (RefSeq). See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr2&g=refGene for details. | UCSC Genome Browser, table refGene
EvoFold | RNA secondary structure predictions made with the EvoFold program, a comparative method that uses multiple-sequence alignments to identify conserved functional RNA structures. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=78858929&c=chr2&g=evofold for details. | UCSC Genome Browser, table evofold | score
sno_miRNA | Annotation of four different types of non-coding RNAs, based on several public databases. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=78858929&c=chr2&g=wgRna for details. | UCSC Genome Browser, table wgRna | type


Regulatory_Regions


Attributes that describe putative regulatory regions and functional elements in the genome


Attribute name | Description | Data source for attribute | Score, class and category columns
TFBS_Conserved | Annotation of TRANSFAC-predicted transcription factor binding sites conserved in the human/mouse/rat alignment. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr2&g=tfbsConsSites for details. | UCSC Genome Browser, table tfbsConsSites | zScore, name
TScanS_miRNA | Annotation of microRNA regulatory targets predicted by TargetScanS. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=78858929&c=chr2&g=targetScanS for details. | UCSC Genome Browser, table targetScanS | score
CpG_Islands | CpG islands according to a UCSC Genome Browser detection algorithm. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr7&g=cpgIslandExt for details. | UCSC Genome Browser, table cpgIslandExt | perGc, obsExp
Bona_Fide_CpG_Islands | CpG island strength predictions and maps of predicted bona fide CpG islands according to three different thresholds (sensitive, balanced and specific, respectively) | C Bock et al. (2007) "CpG island mapping by epigenome prediction". http://neighborhood.bioinf.mpi-inf.mpg.de/CpG_islands_revisited/ | CombinedEpigeneticScore, OptimizedScore, isCGI_sensitive, isCGI_balanced, isCGI_specific
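
The perGc and obsExp columns of the CpG_Islands attribute can be illustrated with a short sketch. It assumes the commonly used definitions: perGc as the percentage of C or G bases, and obsExp as the number of CpG dinucleotides divided by (number of C × number of G / sequence length). The function name `cpg_stats` is hypothetical, and the values stored in cpgIslandExt are precomputed by UCSC rather than by this code.

```python
def cpg_stats(sequence):
    """Return (per_gc, obs_exp) for a DNA sequence.

    per_gc  : percentage of bases that are C or G
    obs_exp : observed CpG count divided by the expected count,
              where expected = (#C * #G) / sequence length
    Both values fall back to 0.0 when they cannot be computed.
    """
    seq = sequence.upper()
    n = len(seq)
    num_c = seq.count("C")
    num_g = seq.count("G")
    num_cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    per_gc = 100.0 * (num_c + num_g) / n if n else 0.0
    expected = num_c * num_g / n if n else 0.0
    obs_exp = num_cpg / expected if expected else 0.0
    return per_gc, obs_exp


if __name__ == "__main__":
    print(cpg_stats("CGCGATCG"))
```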


Transcriptome


Attributes that describe the transcriptional activity, including non-genic transcription and promoter activity


Attribute name | Description | Data source for attribute | Score, class and category columns
GNF_Atlas_2 | Gene expression data from the GNF Gene Expression Atlas for 79 individual samples grouped into 28 tissue types. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=78858929&c=chr2&g=gnfAtlas2 for details. | UCSC Genome Browser, table gnfAtlas2 with significant post-processing | adrenalgland, Appendix, bonemarrow, fetalbrain, fetalliver, fetallung, heart, kidney, liver, lung, OlfactoryBulb, Ovary, pancreas, pituitary, placenta, Pons, Prostate, salivarygland, Skeletal_Muscle, skin, SmoothMuscle, spinalcord, testis, thymus, tongue, Tonsil, trachea, Uterus, type
Human_mRNAs | Annotation of alignments between human mRNAs in GenBank and the genome. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr2&g=mrna for details. | UCSC Genome Browser ENCODE data, table all_mrna
Spliced_ESTs | Annotation of alignments between human spliced ESTs in GenBank and the genome. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr2&g=intronEst for details. | UCSC Genome Browser ENCODE data, tables chr*_intronEst
Human_ESTs | Annotation of alignments between human ESTs in GenBank and the genome. See http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&c=chr2&g=est for details. | UCSC Genome Browser ENCODE data, table all_est


Epigenome_and_Chromatin_Structure


Attributes that describe the chromatin structure and epigenetic modifications, including histone modifications and protein binding


Attribute name | Description | Data source for attribute | Score, class and category columns
DNA_Methylation_Brain | Restriction enzyme digestion and direct sequencing of 3073 unmethylated domains and 2565 methylated domains from human brain DNA. See http://epigenomics.cu-genome.org/html/meth_landscape/ for details. | Rollins et al. (2005) Genome Research | CpG_richness, Methylation
NIH_Chromatin_Blood | Genome-wide ChIP-seq for multiple chromatin modifications (H3K4me1, H3K4me2, H3K4me3, H3K9me1, H3K9me2, H3K9me3, H3K27me1, H3K27me2, H3K27me3, H3K36me1, H3K36me3, H3K79me1, H3K79me2, H3K79me3, H3R2me1, H3R2me2, H4K20me1, H4K20me3, H2A+H4R3me2, H2BK5me1, H2A.Z, PolII and CTCF) | Barski et al. (2007) Cell, downloaded from http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/hgtcell.html/ | chromMod