gene ontology: go
TRANSCRIPT
http://www.ugr.es/~oliver/ José L. Oliver
• …los bioquímicos caracterizan las proteínas por su actividad y concentración
• …los genéticos caracterizan los genes por sus fenotipos y sus mutaciones
• …todos los biólogos reconocen ahora que probablemente existe un universo limitado de genes y proteínas, conservado en la mayor parte de las formas de vida
Esto ha promovido la gran unificación de la biología; la información acerca de los genes y proteínas compartidos contribuye a comprender mejor las diversas formas de vida
Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., et al. (2000). Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature genetics, 25(1), 25-9. doi: 10.1038/75556.
Regulatory elements
miRNA
CisRed
Transcription Factor Binding Sites
Biocarta pathways
InterPro Motifs
Bioentities from literature:
Diseases terms Chemical terms
Gene Expression in tissues
Keywords Swissprot
Biological databases
Arabidopsis thaliana
Homo sapiens
Mus musculus
Rattus norvegicus
Drosophila melanogaster
Caenorhabditis elegans
Saccharmoyces cerevisae
Gallus gallus
Danio rerio
HGNC symbol
EMBL acc
RefSeq
PDB
Protein Id
IPI….
Genes IDs
Gene Ontology
Biological Process Molecular Function Cellular Component
UniProt/Swiss-Prot
UniProtKB/TrEMBL
Ensembl IDs
EntrezGene
Affymetrix
Agilent
KEGG pathways
Gene Ontology CONSORTIUM
http://www.geneontology.org
• The objective of GO is to provide controlled vocabularies for the description of the molecular function, biological process and cellular component of gene products
• These terms are to be used as attributes of gene products by collaborating databases, facilitating uniform queries across them
• The controlled vocabularies of terms are structured
http://www.ugr.es/~oliver/ José L. Oliver
The three categories of GO
Molecular Function
the tasks performed by individual gene products; examples are transcription factor and DNA helicase
Biological Process
broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions
Cellular Component
subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and origin recognition complex
http://www.ugr.es/~oliver/ José L. Oliver
http://www.ugr.es/~oliver/ José L. Oliver
So what does a Gene Ontology do?
• A Gene Ontology takes a gene product (protein) and gives it a cellular context
• For each of the three ontology's, gene products can be placed where they belong, and various terms can be looked up to find the associated gene products
http://www.ugr.es/~oliver/ José L. Oliver
Go Term
• A decriptive term that is used to give a gene product a cellular, molecular, or biological context
• Terms are standardized across all databases and use synonyms to bridge gaps in spelling or similar function
• Older terms can become obsolete
http://www.ugr.es/~oliver/ José L. Oliver
Example of gene product data
• Look up gene “Q59J86”
• Gives:
• Name(s) “DNA polymerase”
• Type “protein”
• Species “Gallus gallus (chicken)”
• Synonyms “IPI00588123”
• Sequence
• References
• Term associations
http://www.ugr.es/~oliver/ José L. Oliver
Types of Relationships
• Is_a [i]
• Part_of [p]
• Regulates/ positively_regulates / negatively_regulates [r]
GO:0010467 : gene expression [r] GO:0010468 : regulation of gene expression ---[i] GO:0045449 : regulation of transcription [p] GO:0006350 : transcription ---[r] GO:0045449 : regulation of transcription
http://www.ugr.es/~oliver/ José L. Oliver
Example
Solid lines are Is_a relationships
Dotted Lines are Part_of relationships
http://www.ugr.es/~oliver/ José L. Oliver
Graph structure
• The ontologies are structured as directed acyclic graphs (DAG), which are graphs that do not cycle or repeat
• These are similar to hierarchies but differ in that a more specialized term (child) can be related to more than one less specialized term (parent)
• This allows annotations to one GO term to be also annotated to related GO terms connected in the graph structure
http://www.ugr.es/~oliver/ José L. Oliver
Database structure
• All 3 Gene Ontologies, Annotations, and Gene products are stored in one relational database.
• The Database is written in MySQL and is updated with various daily, weekly, and monthly builds in addition to various mirrors and stored previous builds
• The database can be accessed by AmiGO or queried remotely by various methods, or even downloaded
• The Ontology data is in OBO file format (Open Biomedical Ontologies)
http://www.ugr.es/~oliver/ José L. Oliver
http://amigo.geneontology.org/cgi-bin/amigo/search.cgi?session_id=1830amigo1294836292