A new series of reporting guidelines has been developed by researchers from the European Molecular Biology Laboratory (EMBL) in an effort to aid proteomics data integration and comparison.
The guidelines, published in two 'perspective' papers in the journal Nature Biotechnology earlier this month, aim to increase the consistency of biological information to allow full integration, exchange and comparison.
Researchers from Cellzome used the approach in a large-scale profiling study of the interactions of three kinase inhibitor drugs. In the process, they discovered a potential new disease indication for the billion-dollar anticancer drug Gleevec (imatinib).
This data was entered into a variety of databases hosted by EMBL's European Bioinformatics Institute (EBI), allowing researchers to gain access to the proteomic, mass spectral and molecular interaction data.
The data was linked to identified kinases with known functions in the UniProt database, setting it within a wider biological context and making it far more valuable to future drug discovery programs.
"Through the community-wide uptake of agreed minimum reporting standards, we can all benefit from easier identification and use of information that is most relevant to our own areas of work," said Henning Hermjakob, a co-author of both of the perspectives.
"This is the next step in providing freely accessible data repositories of the highest possible quality," he added.
The first of the perspectives, entitled "The minimum information about a proteomics experiment (MIAPE)", details guidelines for the range of information that should be documented as part of a proteomics experiment.
This is particularly important because the results of mass spectrometry (MS) proteomics experiments depend on the separation techniques used, the specific mass spectrometer used, and the protein identification tool and database used to assign protein identities.
The authors write that "to understand an analysis, perform comparisons between datasets, or derive statistics from their aggregation, it is crucial to understand both the biological and methodological contexts [of an experiment]."
This leads the authors to propose that, along with the proteomics data itself, contextual data should be included detailing where the samples came from and how the analyses were performed.
By including the extra information detailed on the HUPO Proteomics Standards Initiative webpage, proteomics researchers will be better able to look at previous research and combine it meaningfully with their own, enhancing the usefulness of literature data.
The second paper details "The minimum information required for reporting a molecular interaction experiment (MIMIx)" and discusses similar issues to the MIAPE paper, but with regard to molecular interaction experiments.
These experiments aim to decipher the molecular mechanisms of cell function by tracking the plethora of interactions between the numerous components of living cells as well as external agents such as pharmaceutical compounds or environmental toxins.
The authors write that: "the single greatest source of data loss in transferring interaction data into a database is the use of ambiguous molecule identifiers, such as gene names."
They claim that as much as 70 per cent of overall database curation time is spent mapping molecule identifiers to common database entries.
The authors believe that molecules should be identified by both a database accession number and a shorter molecule name, to aid curation and readability respectively.
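The ambiguity the authors describe can be illustrated with a minimal Python sketch. It assumes a toy set of records in which the same gene name is used for proteins from two species; the accession strings, the record layout and the helper functions are hypothetical placeholders for illustration only, not real database entries or anything specified in the MIMIx paper.

```python
from collections import defaultdict

# Hypothetical records: (accession, short name, species).
# The accession strings are made-up placeholders, not real database entries.
RECORDS = [
    ("X00001", "ABL1", "human"),
    ("X00002", "ABL1", "mouse"),
    ("X00003", "DDR1", "human"),
]


def index_by(records, field):
    """Group records by one field (0 = accession, 1 = short name)."""
    index = defaultdict(list)
    for record in records:
        index[record[field]].append(record)
    return dict(index)


def ambiguous_keys(index):
    """Return the keys that point at more than one record, sorted."""
    return sorted(key for key, hits in index.items() if len(hits) > 1)


by_name = index_by(RECORDS, 1)
by_accession = index_by(RECORDS, 0)

# A short name alone may match several molecules and would need manual
# curation; an accession number identifies exactly one record.
print("Ambiguous short names:", ambiguous_keys(by_name))
print("Ambiguous accessions:", ambiguous_keys(by_accession))
```

In this toy data, reporting the accession alongside the short name leaves nothing for a curator to disambiguate, while the short name keeps the record readable.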
These guidelines were used by researchers from Cellzome in a paper published earlier this week in Nature Biotechnology entitled: "Quantitative chemical proteomics reveals mechanisms of action of clinical ABL kinase inhibitors."
The authors coupled quantitative mass spectrometry with affinity purification using Cellzome's Kinobead technology to identify the protein targets and binding affinities of three kinase inhibitor drugs: Novartis' Gleevec (imatinib), Bristol-Myers Squibb's Sprycel (dasatinib) and bosutinib (SKI-606), which is currently in Phase II development by Wyeth.
The simultaneous profiling of the hundreds of potential targets of these drugs not only helped predict the compounds' in vivo efficacy and safety but also highlighted a number of previously unidentified targets.
In particular, the Cellzome researchers discovered that Gleevec targets the discoidin domain receptor kinase 1 (DDR1) and the non-kinase oxidoreductase NQO2 more strongly than the BCR-ABL leukaemia target the drug was designed to inhibit.
DDR1 has been implicated in fibrosis of the lung, liver and kidney, diseases that are poorly served by existing medication, suggesting that imatinib could be used as a therapy for these debilitating conditions.