EpiGRAPH Background Information and Supplementary Website

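This website provides background material on the EpiGRAPH software for genome and epigenome analysis, including video tutorials, supplementary material for the case study on monoallelic gene expression, documentation of EpiGRAPH's default attributes, a description of the X-GRAF XML format, and the EpiGRAPH source code.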

For a less technical overview, please see the EpiGRAPH Introduction Page, or proceed to EpiGRAPH's Login Page to get started using EpiGRAPH.

 

Video tutorials: Getting started with EpiGRAPH

The following video tutorials explain the basic and more advanced features of EpiGRAPH in the context of biological case studies.

Tutorial 1 - DNA methylation analysis on chromosome 21
  Purpose: Introduction to EpiGRAPH
  Description: This tutorial introduces the use of EpiGRAPH for analysis and prediction of epigenome datasets through a simple case study on DNA methylation, which is adapted from a paper published in PLoS Genetics.
  Relevant datasets: The experimental DNA methylation dataset was taken from a paper by Yamada et al., published in Genome Research.

Tutorial 2 - Independent validation of DNA methylation predictions
  Purpose: Demonstration of slightly advanced features, including epigenome prediction on new data and a brief overview of the X-GRAF format used for documenting EpiGRAPH analyses
  Description: This tutorial describes a follow-up analysis to tutorial 1. It illustrates how EpiGRAPH analyses and prediction models can be shared between researchers. We start by uploading the results documentation file from tutorial 1 into the EpiGRAPH account of a different user. Then we perform DNA methylation prediction based on the prediction setup that was originally defined in tutorial 1. The resulting classifier is validated on new experimental data obtained by bisulfite sequencing of a large number of promoter regions on chromosome 21 (Tierling et al., in preparation).

Tutorial 3 (deleted)
  This tutorial partially overlapped with the other tutorials. It has been deleted, and its contents have been merged into tutorials 2 and 4.

Tutorial 4 - (Epi-) genome analysis of highly polymorphic gene promoters: Part 1 and Part 2
  Purpose: Demonstration of the interplay between UCSC Genome Browser, Galaxy and EpiGRAPH
  Description: This tutorial shows how UCSC Genome Browser, Galaxy and EpiGRAPH can work together when analyzing (epi-) genome data. We use the UCSC Genome Browser to retrieve data on putative promoter regions and SNPs, Galaxy to calculate which promoters show the highest vs. lowest degree of polymorphism, and EpiGRAPH to detect significant genomic and epigenomic differences between highly polymorphic promoter regions and promoter regions that contain almost no known SNPs.
  Relevant datasets:
  • All required datasets are directly obtained from the UCSC Genome Browser
  • Results documentation for tutorial 4

Recommendation: Watch tutorial 1 before you start using EpiGRAPH for the first time. Afterwards, try to perform the DNA methylation analysis and prediction yourself, as described in tutorial 1, and refer back to the tutorial if you run into problems. Next, start analyzing your own data. Once you have used EpiGRAPH a few times, you may want to return to the tutorials to see how to perform a more advanced analysis. Tutorial 2 addresses the topics of epigenome prediction and reproducibility of EpiGRAPH analyses, and tutorial 4 highlights the interplay and combined power of several web services for genome analysis.

 

Case study on monoallelic gene expression: Supplementary material

The case study in the EpiGRAPH paper focuses on the analysis and prediction of monoallelic gene expression. All input data, settings and results can be downloaded from here:

Note: To reproduce an analysis, download the corresponding XML file, log into your EpiGRAPH account, press the "Execute Analysis Based on Existing XML File" button and upload the XML file. To inspect the results of an analysis without recalculating them, do the same but tick the "Retain previously calculated results" box on the upload page. The results will then be available inside your EpiGRAPH account as soon as the upload has finished. Note that the upload can take a long time: please wait for the completion page to appear and do not interrupt the process, even if your web browser seems to indicate that it has stopped loading.

 

EpiGRAPH's default attributes

We have prepared extensive background material explaining the different types of attributes included in EpiGRAPH:

EpiGRAPH currently supports five genome assemblies and four species:

For each of these genomes, we manually selected a large number of genomic attributes that are likely to be predictive of interesting genomic phenomena:

Table of EpiGRAPH's Default Attributes

 

The X-GRAF XML format used to specify and store EpiGRAPH analyses

EpiGRAPH stores all analyses and custom attributes in XML files that adhere to the XML Genomic Relationship Analysis Format (X-GRAF).

Syntactically, all X-GRAF-compatible files have to validate against the X-GRAF schema definition document (X-GRAF_schema_v1.00.xsd), which is itself an XML file and compliant with the W3C XML schema format (http://www.w3.org/XML/Schema).

X-GRAF-compatible XML files can incorporate two major subtrees, "attribute definition" and "analysis" (see example). The attribute definition section keeps track of genomic attributes, which are organized in attribute groups and can be defined by embedded tab-separated tables or by referring to external data sources such as a database or a URL. The analysis section documents all analysis steps, including attribute calculation, statistical analysis, diagram generation, machine learning analysis and prediction analysis. Each of these subsections comprises the analysis configuration (a description of what is to be calculated), analysis tracking information (e.g. submission data, current state and error messages) and the results of the analysis (in the form of tables and diagrams directly embedded in the XML file).
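To illustrate the layout described above, the following minimal Python sketch reads a small XML document with an attribute definition subtree and an analysis subtree, and extracts a tab-separated table embedded in an attribute group. Please note that all element and attribute names used here (xgraf, attributeDefinition, attributeGroup, table, analysis, statisticalAnalysis, name, state) are hypothetical placeholders chosen for illustration; they are not taken from X-GRAF_schema_v1.00.xsd, so consult the schema for the actual names. The sketch uses only the Python standard library and does not perform schema validation.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical X-GRAF-like document. The element and attribute names are
# illustrative assumptions, NOT the names defined in the real X-GRAF schema.
EXAMPLE_XML = """<xgraf>
  <attributeDefinition>
    <attributeGroup name="example_group">
      <table>chrom\tstart\tend\tscore
chr21\t100\t200\t0.5
chr21\t300\t450\t0.9</table>
    </attributeGroup>
  </attributeDefinition>
  <analysis>
    <statisticalAnalysis state="completed"/>
  </analysis>
</xgraf>"""


def top_level_sections(xml_text):
    """Return the tag names of the subtrees directly below the root."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root]


def read_embedded_tables(xml_text):
    """Return {group name: list of row dicts} for each embedded table."""
    root = ET.fromstring(xml_text)
    tables = {}
    for group in root.iter("attributeGroup"):
        table = group.find("table")
        if table is None or not (table.text or "").strip():
            # No embedded table, e.g. the group refers to an external source.
            continue
        # The first line of the embedded text is treated as the header row.
        reader = csv.DictReader(io.StringIO(table.text.strip()), delimiter="\t")
        tables[group.get("name")] = list(reader)
    return tables


if __name__ == "__main__":
    print(top_level_sections(EXAMPLE_XML))
    for name, rows in read_embedded_tables(EXAMPLE_XML).items():
        print(name, len(rows), "rows")
```

All values are returned as strings; converting coordinates and scores to numbers is left to the caller, since the appropriate types depend on the attribute in question.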

In addition to the syntax rules defined by the X-GRAF schema definition, all EpiGRAPH-compatible XML files must fulfill a number of semantic rules:

 

Downloading and using EpiGRAPH's source code

You want to run EpiGRAPH locally, extend EpiGRAPH or build your own tool that is much better than EpiGRAPH? That's great, just go ahead and download the EpiGRAPH source code from here: http://epigraph.mpi-inf.mpg.de/DownloadSource.php

 

Having trouble? Please contact us!

We are well aware that EpiGRAPH's functionality is sometimes not easy to understand. After all, EpiGRAPH implements a complex workflow that involves the calculation of up to a thousand attributes and extensive follow-up analysis by statistical and machine learning methods. Hence, please do not hesitate to contact us for help if you have trouble getting the calculations and results you had hoped for. Please write an e-mail to Christoph Bock and try to be as specific as possible (e.g. include an Excel sheet with example data and / or screenshots that show at what point you encountered problems).