The laboratory is an important part of the drug discovery equation. However, if the information uncovered and gathered in a lab isn’t shared efficiently, effectively, and with the right parties, all the efforts by those white-coated staff are for naught. Having the right tools and systems in place is vital.
Richard Lee, director of core technologies and capabilities with ACD/Labs, spoke with Outsourcing-Pharma about the tools and technologies that the modern-day R&D laboratory can make use of to ensure optimal data management and sharing.
OSP: Could you please tell us about how the average drug discovery laboratory has evolved in recent years?
RL: Automation has played a significant role in the laboratory in recent years, particularly with data management. More labs are looking to facilitate data sharing and access across departments, groups, and sites—as well as between organizations (e.g., sponsor and partner relationships).
Although having data in a central repository isn’t a new concept, transferring the data effectively and efficiently from one location to another remains an obstacle. As a result, data accessibility and availability for scientists are still significant challenges that many organizations struggle with.
There has also been a shift in the paradigm of who should have access to this data. It’s no longer just chemists; it’s data scientists, too—though not in the traditional sense. Data scientists need the curated, abstracted results of the processed data for machine consumption in their ML/AI frameworks.
While scientists can work well with disparate data types in different formats, that same data must be normalized before machine learning algorithms can use it. This is a huge undertaking for analytical data and not one that can be resolved by electronic laboratory notebooks (ELNs), laboratory information management systems (LIMS), or the existing traditional scientific informatics systems.
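The normalization step described above can be sketched in code. The record layout and the vendor field names below are invented for illustration—they are not a published schema or any real instrument's export format—but they show the basic idea: every source format gets mapped onto one common structure before ML pipelines touch it.

```python
from dataclasses import dataclass, field

# Hypothetical normalized record for a single analytical measurement.
# Field names are illustrative, not a published schema.
@dataclass
class NormalizedSpectrum:
    sample_id: str
    technique: str            # e.g. "LC/UV", "LC/MS", "NMR"
    x_units: str              # e.g. "min", "m/z", "ppm"
    y_units: str              # e.g. "AU", "counts"
    x: list = field(default_factory=list)
    y: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

def normalize_vendor_a(raw: dict) -> NormalizedSpectrum:
    """Map one (invented) vendor export layout onto the common record."""
    return NormalizedSpectrum(
        sample_id=raw["SampleName"],
        technique=raw["Detector"],
        x_units=raw["XUnit"],
        y_units=raw["YUnit"],
        x=raw["Times"],
        y=raw["Intensities"],
        metadata={"instrument": raw.get("Instrument", "unknown")},
    )

# Once every source is mapped through a converter like this, downstream
# ML pipelines consume a single, predictable structure.
record = normalize_vendor_a({
    "SampleName": "RXN-0042-A1",
    "Detector": "LC/UV",
    "XUnit": "min",
    "YUnit": "AU",
    "Times": [0.0, 0.1, 0.2],
    "Intensities": [0.0, 5.2, 1.1],
})
print(record.sample_id, record.technique)
```

In practice one converter is written per instrument or vendor format; the hard part is not the mapping code but agreeing on the common record and keeping every converter faithful to it.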
OSP: Specifically, how have the measurement instruments progressed?
RL: In the last 20–25 years, the industry has clearly updated its instruments—specifically with robotics systems to support high-throughput experimentation. More recently, there have been technology investments to address the vast numbers of samples generated by plate-based experiments—multiplexed HPLCs and CEs—but also non-traditional sample injection for mass spectrometry systems, which can reduce data acquisition for a 96-well plate from hours to a few minutes.
OSP: How has the management of the modern drug discovery lab evolved?
RL: To deal with the influx of analytical data generated, systems need to adjust to account for post-acquisition data availability. Automated data processing upon acquisition can reduce any initial time penalty, but the challenge comes when the data needs to be re-interrogated.
Having systems in place for data access, data visualization, and facile data reprocessing is key to efficient lab data management. New technologies are available to address this need. For example, a new ultra-high-capacity, low-latency data storage server supporting Katalyst D2D (software for HTE from ACD/Labs) can reprocess a 96-well plate’s worth of LC, LC/UV, and LC/UV/MS data within minutes, compared to an hour or more with typical existing workflows.
OSP: What about LIMS—how have they made the lives of lab managers and others along the drug discovery pipeline easier?
RL: LIMS have played a critical role in sample management, as scientists request analysis of samples taken from a reaction. The main uses of LIMS include generating metadata IDs that can be easily tracked across systems/instruments, requesting analysis, and pulling physical reports based on prescribed automated routines.
Generating these identifiers/reports is only one aspect. LIMS can also be integrated with a number of other supporting systems, such as ELNs, to transfer these reports and IDs into the respective ELN records. However, LIMS and ELNs primarily provide tabulated results of the analysis and cannot deliver access to the raw or processed data. Gaps still exist for scientists to conveniently access the acquired raw/processed data and to review and interrogate analytical results reported through LIMS.
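The identifier-tracking role described above can be sketched minimally. This is an illustrative in-memory registry, assuming invented ID formats and field names—real LIMS use databases and site-specific ID rules—but it shows how one tracking ID links an ELN request, an analysis, and a report across systems.

```python
import uuid
from datetime import datetime, timezone

# Minimal sketch of LIMS-style identifier generation; the "LIMS-" prefix
# and record fields are assumptions for illustration only.
class SampleRegistry:
    def __init__(self):
        self._records = {}

    def register_request(self, notebook_ref: str, analysis: str) -> str:
        """Create a tracking ID linking an ELN entry to an analysis request."""
        sample_id = f"LIMS-{uuid.uuid4().hex[:8].upper()}"
        self._records[sample_id] = {
            "notebook_ref": notebook_ref,   # where the request originated
            "analysis": analysis,           # e.g. "LC/MS purity"
            "requested_at": datetime.now(timezone.utc).isoformat(),
            "status": "requested",
        }
        return sample_id

    def attach_result(self, sample_id: str, report_path: str) -> None:
        """Record the report so the same ID resolves it across systems."""
        self._records[sample_id]["report"] = report_path
        self._records[sample_id]["status"] = "reported"

    def lookup(self, sample_id: str) -> dict:
        return self._records[sample_id]

registry = SampleRegistry()
sid = registry.register_request("ELN-2023-0117", "LC/MS purity")
registry.attach_result(sid, "/reports/lcms/" + sid + ".pdf")
print(registry.lookup(sid)["status"])  # -> reported
```

Note what the sketch does and does not hold: the ID resolves to a tabulated report, which mirrors the gap described above—the raw and processed data behind that report live elsewhere and need a separate system to reach.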
OSP: How can organizations like ACD/Labs help lab managers fill the gaps left by ELNs and LIMS?
RL: ACD/Labs software helps fill the gap between the scientist and the data. We are not an ELN/LIMS provider ourselves, but instead, support our customers’ workflows by integrating our technologies with these systems. As mentioned earlier, the main gap left by ELN/LIMS is the inability of scientists to access acquired raw and processed data. Users are also limited in the use of analytical data in downstream applications, like ML and AI.
Lab managers and their organizations need an established data management system first and foremost (into which ELNs and LIMS can be integrated), where automation plays a key role. Manual data entry, data association, and extraction for individual applications diminish access to acquired raw/processed data.
The ACD/Labs infrastructure and applications allow data to be marshaled from source to various destinations—providing access to the raw and processed data—and more importantly the ability to reprocess data on demand. As a result, scientists and chemists have visual access and interactivity for data interrogation, along with the ability to query a central database by metadata, spectra, chemical structure, etc.
Complementary to chemists, the abstracted data is also ready for machine consumption by data scientists via a well-developed API. Finally, project leads, managers, and QA/QC teams—those requiring access to analytical results to track projects, plan hardware acquisition/support, improve productivity/efficiency, or respond to regulatory inquiries—can also access the same rich data. Visualizations of the data can be configured to best fit the requirements of each kind of user.
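As a sketch of the "machine consumption" step above: the JSON payload and field names below are invented for illustration (the source does not document ACD/Labs' actual API schema), but they show how abstracted, tabulated results returned by such an API might be flattened into feature/label rows for an ML framework.

```python
import json

# Hypothetical payload as an analytical-data API might return it;
# endpoint shape and field names are assumptions, not a real schema.
payload = json.loads("""
[
  {"sample_id": "A1", "structure": "CCO",  "purity_pct": 98.2, "main_peak_rt_min": 1.42},
  {"sample_id": "A2", "structure": "CCN",  "purity_pct": 76.5, "main_peak_rt_min": 1.38},
  {"sample_id": "A3", "structure": "CCCl", "purity_pct": 91.0, "main_peak_rt_min": 1.51}
]
""")

def to_feature_rows(records, threshold=90.0):
    """Flatten API records into (features, label) rows.
    The label here is a simple pass/fail on purity, as an example target."""
    rows = []
    for r in records:
        features = {
            "sample_id": r["sample_id"],
            "purity_pct": r["purity_pct"],
            "main_peak_rt_min": r["main_peak_rt_min"],
        }
        rows.append((features, r["purity_pct"] >= threshold))
    return rows

rows = to_feature_rows(payload)
print(sum(label for _, label in rows))  # samples passing the purity cutoff
```

Because the API already serves curated, abstracted results rather than raw instrument files, the data-science side is reduced to this kind of lightweight reshaping instead of per-vendor parsing.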
OSP: Do you have anything to add?
RL: Data has always been the cornerstone of the lab and will continue to be. As more organizations look to gain insights and advanced prediction capabilities with machine learning and artificial intelligence, data management—from data marshaling to sharing to analysis—must be a priority.
ELNs and LIMS serve unique purposes, but they are not the solution for end-to-end data management. To fill those gaps, organizations must seek technologies that support workflows with particular niche requirements (like analytical workflows). It is also essential that those technologies integrate with their existing informatics landscape to ensure effortless data flow and access.