SciBite’s artificial intelligence (AI) software platform is designed to help pharmaceutical researchers and other life-science professionals parse through their data to unlock useful insights. According to the company, the platform pairs machine learning with ontology-based semantic capabilities.
James Malone (JM), SciBite’s chief technology officer, spoke with Outsourcing-Pharma about the progress of AI use and understanding in the pharma industry, and how the company’s AI technology seeks to build upon previous technological capabilities.
OSP: Please talk a bit about the evolution of AI’s use in life sciences—how long it’s been present, how its understanding and application have changed in the industry, and what might lie ahead?
JM: There is a broad spectrum of approaches in AI, some of which have been used for a long time in life sciences. For instance, knowledge engineering using ontologies to describe metadata, expert systems for helping triage symptoms online, and machine learning for image analysis.
Most recently, innovation in deep learning, combined with the availability of big data and powerful compute, has driven huge improvements in the performance of some of these approaches. This is particularly true of areas such as language comprehension, where deep learning now represents the state of the art. It is likely these approaches will increasingly be combined into software in the near future, and that scientists will benefit from the innovation without having to become deep learning experts.
In some domains, the future is already here, with voice recognition software commonplace in many applications. In the area of semantics, this may include approaches in harmonizing electronic medical records, analyzing self-reported patient data, and enabling natural language questions to be asked of large data stores.
OSP: What are some of the challenges the industry has faced that SciBiteAI is designed to help overcome?
JM: The primary goal of SciBiteAI is to combine our expertise in semantics and life sciences to enable as broad an audience as possible to benefit from the machine learning approaches we offer. One of the biggest barriers to building machine learning models is obtaining high-quality training data.
Our existing technology means we can identify and create relevant training sets efficiently and accurately. Our understanding of biomedical entities - drugs, diseases, genes, assays, etc. - is encoded in our ontologies and is in turn built into the machine learning models we derive from them.
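As an illustration of the kind of ontology-driven training-set creation described here, a dictionary of ontology terms can be used to auto-annotate raw text into labeled examples for a named-entity model. This is a minimal, hypothetical sketch (the terms and sentence are invented, and it is not SciBite's actual pipeline):

```python
# Sketch: derive NER training labels from an ontology-style term dictionary.
# The vocabulary and example sentence below are invented for illustration.
ONTOLOGY = {
    "imatinib": "DRUG",
    "chronic myeloid leukemia": "DISEASE",
    "bcr-abl": "GENE",
}

def auto_label(sentence: str):
    """Tag tokens with BIO entity labels via longest-match dictionary lookup."""
    tokens = sentence.split()
    labels = ["O"] * len(tokens)
    for i in range(len(tokens)):
        if labels[i] != "O":
            continue  # already inside a matched entity
        # Try the longest span first so multi-word terms win over prefixes.
        for j in range(len(tokens), i, -1):
            span = " ".join(tokens[i:j]).strip(".,;").lower()
            etype = ONTOLOGY.get(span)
            if etype:
                labels[i] = f"B-{etype}"
                for k in range(i + 1, j):
                    labels[k] = f"I-{etype}"
                break
    return list(zip(tokens, labels))
```

A real pipeline would add tokenization, synonym expansion, and disambiguation, but the core idea is the same: the ontology does the labeling work, so training data comes almost for free.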
We are building our life sciences understanding into SciBiteAI - this is what we call semantics-based deep learning. A huge amount of human understanding is already encoded in computer-readable form via ontologies. However, many AI companies don’t utilize this resource and are in essence using AI with “one hand tied behind its back”; we’re asking AI to make predictions, and it’s much better to arm it with what we already know than to ask it to work “in the dark”.
This combined strategy has been shown many times to outperform any single approach, and was perhaps most famously demonstrated as the strategy IBM’s Watson used to win Jeopardy. Given SciBite’s vast ontology resources, we can create workflows that comprehend scientific data and extract entities and patterns pertinent to those working in the field, with many possible applications - for instance, detecting drug adverse events, finding biomarkers, or identifying novel biologics in text.
OSP: Why is it beneficial that life-science users can incorporate AI technology without having to become AI wizards themselves?
JM: The ethos of SciBite is to enable the widest possible audience to benefit from advances in semantic technology. Tools such as TERMite, for advanced named entity recognition, and CENtree, for democratizing enterprise ontology management, have brought technology often seen as the domain of experts to a large audience of scientists, researchers, and application developers. SciBiteAI follows this same pattern, enabling simple calls to the tool that exploit a lot of powerful deep learning under the hood.
The alternative (collecting data, building training sets, writing code to train and tune models, then wrapping it all up for consistent use by others) can represent a significant time investment and a barrier for many. Data is an incredibly important asset for everyone working in the life sciences, from big pharma to clinics to academic groups. Maximizing the value of this data for anyone using it is our mission.
Another aspect of this is systems integration and, in particular, “productionizing” AI. Many AI models are developed to address specific questions raised in a particular experiment or study, so less attention is paid to how such models are deployed or re-used within other applications.
Our ultimate goal is to have AI-based services integrated into day-to-day scientific applications - indeed, the user may not even know there is an AI-based algorithm operating. All they get is a system that does what they expect: for instance, smart data-entry systems that understand what users are entering into form fields and modify the form’s behavior based on constant assessment of what the user is trying to do.
OSP: Can you share any examples of the SciBiteAI technology being put to use in a real-world situation?
JM: We are about to publish a study on the use of the technology to accurately identify novel interactions between biological molecules, distinguishing inconsequential mentions (e.g., two molecules appearing together in a list) from significant events (‘X’ activates ‘Y’, for example). This is a key part of generating computable knowledge and a hard problem for the field, but these advances will lead to better accuracy of extracted facts and, consequently, better insights and productivity.
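The distinction JM describes, separating a trivial co-mention from an asserted interaction, can be illustrated with a deliberately simple baseline. A real system would use a trained model over semantically annotated text; this hedged sketch just checks whether an interaction verb links the two entity mentions (the verb list and sentences are invented):

```python
import re

# Illustrative baseline for co-mention vs. interaction classification.
# A production system would use a trained relation-extraction model;
# this pattern-based version only demonstrates the task itself.
INTERACTION_VERBS = r"\b(activates|inhibits|binds|phosphorylates|represses)\b"

def is_interaction(sentence: str, entity_a: str, entity_b: str) -> bool:
    """True if an interaction verb appears between the two entity mentions."""
    pattern = (
        re.escape(entity_a) + r"\b.*" + INTERACTION_VERBS + r".*\b" + re.escape(entity_b)
    )
    return re.search(pattern, sentence, re.IGNORECASE) is not None
```

For example, "KRAS activates RAF1 downstream." asserts an event, while "Samples were tested for KRAS, RAF1 and EGFR." merely co-mentions the same pair; a baseline like this gives the trained model something to beat.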
OSP: What would you like to add about the technology that I didn’t touch upon above?
JM: As with any technology area that makes rapid advances, there is much excitement, a degree of hype and a lot of potential. Our approach to utilizing these innovations, in deep learning in particular, is to cherry-pick the most suitable for a given task, ensure they offer real improvements and deploy them appropriately.
We don’t see machine learning as a panacea for all data challenges. SciBiteAI is an exciting addition to the SciBite suite, and it is in combining our tools for making data FAIR (Findable, Accessible, Interoperable, Reusable) across an organization and applying them to critical questions that we see SciBiteAI fulfilling a valuable need.