BridgeDb tutorial: Gene HGNC name to Ensembl identifier
This tutorial explains how to use the BridgeDb identifier mapping service to translate HGNC names to Ensembl identifiers. This step is part of the OpenRiskNet use case to link Adverse Outcome Pathways to WikiPathways.
First we need to load the Python library to allow calls to the BridgeDb REST webservice:
import requests
Let's assume we're interested in the gene with HGNC MECP2 (FIXME: look up a gene in AOPWiki), the API call to make mappings is given below as callUrl. Here, the H indicates that the query (MECP2) is an HGNC symbol:
callUrl = 'http://bridgedb.prod.openrisknet.org/Human/xrefs/H/MECP2'
The default call returns all identifiers, not just for Ensembl:
response = requests.get(callUrl)
response.text
'GO:0001964\tGeneOntology\nuc065cav.1\tUCSC Genome Browser\n312750\tOMIM\nGO:0042551\tGeneOntology\nuc065car.1\tUCSC Genome Browser\nA0A087X1U4\tUniprot-TrEMBL\n4204\tWikiGenes\nGO:0043524\tGeneOntology\nILMN_1702715\tIllumina\n34355_at\tAffy\nGO:0007268\tGeneOntology\nMECP2\tHGNC\nuc065caz.1\tUCSC Genome Browser\nA_33_P3339036\tAgilent\nGO:0006576\tGeneOntology\nuc065cbg.1\tUCSC Genome Browser\nGO:0006342\tGeneOntology\n300496\tOMIM\nGO:0035176\tGeneOntology\nuc065cbc.1\tUCSC Genome Browser\nGO:0033555\tGeneOntology\nGO:0045892\tGeneOntology\nA_23_P114361\tAgilent\nGO:0045893\tGeneOntology\nENSG00000169057\tEnsembl\nGO:0090063\tGeneOntology\nGO:0005515\tGeneOntology\nGO:0002087\tGeneOntology\nGO:0005634\tGeneOntology\nGO:0007416\tGeneOntology\nGO:0008104\tGeneOntology\nGO:0042826\tGeneOntology\nGO:0007420\tGeneOntology\nGO:0035067\tGeneOntology\n300005\tOMIM\nNP_001104262\tRefSeq\nA0A087WVW7\tUniprot-TrEMBL\nNP_004983\tRefSeq\nGO:0046470\tGeneOntology\nGO:0010385\tGeneOntology\n11722682_at\tAffy\nGO:0051965\tGeneOntology\nNM_001316337\tRefSeq\nuc065caw.1\tUCSC Genome Browser\nA0A0D9SFX7\tUniprot-TrEMBL\nA0A140VKC4\tUniprot-TrEMBL\nGO:0003723\tGeneOntology\nGO:0019233\tGeneOntology\nGO:0001666\tGeneOntology\nGO:0003729\tGeneOntology\nGO:0021591\tGeneOntology\nuc065cas.1\tUCSC Genome Browser\nGO:0019230\tGeneOntology\nGO:0003682\tGeneOntology\nGO:0001662\tGeneOntology\nuc065cbh.1\tUCSC Genome Browser\nX99687_at\tAffy\nGO:0008344\tGeneOntology\nGO:0009791\tGeneOntology\nuc065cbd.1\tUCSC Genome Browser\nGO:0019904\tGeneOntology\nGO:0030182\tGeneOntology\nGO:0035197\tGeneOntology\n8175998\tAffy\nGO:0016358\tGeneOntology\nNM_004992\tRefSeq\nGO:0003714\tGeneOntology\nGO:0005739\tGeneOntology\nGO:0005615\tGeneOntology\nGO:0005737\tGeneOntology\nuc004fjv.3\tUCSC Genome Browser\n202617_s_at\tAffy\nGO:0050905\tGeneOntology\nGO:0008327\tGeneOntology\nD3YJ43\tUniprot-TrEMBL\nGO:0003677\tGeneOntology\nGO:0006541\tGeneOntology\nGO:0040029\tGeneOntology\nA_33_P3317211\tAgilent\nNP_001303266\tRefSeq\n11722683_a_at\tAffy\nGO:0008211\tGeneOntology\nGO:0051151\tGeneOntology\nNM_001110792\tRefSeq\nX89430_at\tAffy\nGO:2000820\tGeneOntology\nuc065cat.1\tUCSC Genome Browser\nGO:0003700\tGeneOntology\nGO:0047485\tGeneOntology\n4204\tEntrez Gene\nGO:0009405\tGeneOntology\nA0A0D9SEX1\tUniprot-TrEMBL\nGO:0098794\tGeneOntology\n3C2I\tPDB\nHs.200716\tUniGene\nGO:0000792\tGeneOntology\nuc065cax.1\tUCSC Genome Browser\n300055\tOMIM\n5BT2\tPDB\nGO:0006020\tGeneOntology\nGO:0031175\tGeneOntology\nuc065cbe.1\tUCSC Genome Browser\nGO:0008284\tGeneOntology\nuc065cba.1\tUCSC Genome Browser\nGO:0060291\tGeneOntology\n202618_s_at\tAffy\nGO:0016573\tGeneOntology\n17115453\tAffy\nA0A1B0GTV0\tUniprot-TrEMBL\nuc065cbi.1\tUCSC Genome Browser\nGO:0048167\tGeneOntology\nGO:0007616\tGeneOntology\nGO:0016571\tGeneOntology\nuc004fjw.3\tUCSC Genome Browser\nGO:0007613\tGeneOntology\nGO:0007612\tGeneOntology\nGO:0021549\tGeneOntology\n11722684_a_at\tAffy\nGO:0001078\tGeneOntology\nX94628_rna1_s_at\tAffy\nGO:0007585\tGeneOntology\nGO:0010468\tGeneOntology\nGO:0031061\tGeneOntology\nA_24_P237486\tAgilent\nGO:0050884\tGeneOntology\nGO:0000930\tGeneOntology\nGO:0005829\tGeneOntology\nuc065cau.1\tUCSC Genome Browser\nH7BY72\tUniprot-TrEMBL\n202616_s_at\tAffy\nGO:0006355\tGeneOntology\nuc065cay.1\tUCSC Genome Browser\nGO:0010971\tGeneOntology\n300673\tOMIM\nGO:0008542\tGeneOntology\nGO:0060079\tGeneOntology\nuc065cbf.1\tUCSC Genome Browser\nGO:0006122\tGeneOntology\nuc065cbb.1\tUCSC Genome Browser\nGO:0007052\tGeneOntology\nC9JH89\tUniprot-TrEMBL\nB5MCB4\tUniprot-TrEMBL\nGO:0032048\tGeneOntology\nGO:0050432\tGeneOntology\nGO:0001976\tGeneOntology\nI6LM39\tUniprot-TrEMBL\nGO:0005813\tGeneOntology\nILMN_1682091\tIllumina\nP51608\tUniprot-TrEMBL\n1QK9\tPDB\nGO:0006349\tGeneOntology\nGO:1900114\tGeneOntology\nGO:0000122\tGeneOntology\nGO:0006351\tGeneOntology\nGO:0008134\tGeneOntology\nILMN_1824898\tIllumina\n300260\tOMIM\n0006510725\tIllumina\n'
You can also see the results are returned as a TSV file, consisting of two columns, the identifier and the matching database.
We will want to convert this reply into a Python dictionary (with the identifier as key, as one database may have multiple identifiers):
lines = response.text.split("\n")
mappings = {}
for line in lines:
    if ('\t' in line):
        tuple = line.split('\t')
        identifier = tuple[0]
        database = tuple[1]
        if (database == "Ensembl"):
            mappings[identifier] = database
print(mappings)
{'ENSG00000169057': 'Ensembl'}
Alternatively, we can restrivct the return values from the BridgeDb webservice to just return Ensembl identifiers (system code En). For this, we add the ?dataSource=En call parameter:
callUrl = 'http://bridgedb-swagger.prod.openrisknet.org/Human/xrefs/H/MECP2?dataSource=En'
response = requests.get(callUrl)
response.text
'ENSG00000169057\tEnsembl\n'
Version History
master @ 0f98fd8 (latest) Created 6th Apr 2022 at 14:12 by Marvin Martens
Update genes.ipynb
Frozen
 master
master0f98fd8
    master @ 5f34ac1 Created 6th Apr 2022 at 14:09 by Marvin Martens
added citation info
Frozen
 master
master5f34ac1
    master @ 5f34ac1 (earliest) Created 6th Apr 2022 at 14:02 by Marvin Martens
added citation info
Frozen
 master
master5f34ac1
     Creators and Submitter
 Creators and SubmitterCreator
Additional credit
Egon Willighagen
Submitter
Views: 5645 Downloads: 1403
Created: 6th Apr 2022 at 14:02
Last updated: 6th Apr 2022 at 14:09
 Tags
 Tags Attributions
 AttributionsNone

 View on GitHub
View on GitHub Download RO-Crate
Download RO-Crate


