Much has been said about the potential for artificial intelligence and machine learning to speed up drug development. However, applying the same technology to other areas, including drug formulation, could be just as impactful.
In a study published in Nature Communications, a team of researchers applied machine learning to the formulation of long-acting injectable drugs, which they described as one of the most promising therapeutic strategies for the treatment of chronic diseases.
A key challenge with this type of drug formulation is the difficulty of understanding the interplay between multiple parameters, such as the physicochemical properties of the drug and the polymer.
This makes it difficult to predict the performance of different drug formulations. In turn, this necessitates the development and characterization of a wide array of formulation candidates, which is costly and delays arrival at a successful formulation.
To understand whether machine learning could help navigate these challenges, the research team investigated whether tools could be built to accurately predict the rate of drug release. They trained and evaluated eleven machine learning models, including multiple linear regression, random forest, light gradient boosting machine (lightGBM), and neural networks.
“Once we had the data set, we split it into two subsets: one used for training the models and one for testing. We then asked the models to predict the results of the test set and directly compared with previous experimental data. We found that the tree-based models, and specifically lightGBM, delivered the most accurate predictions,” said Pauric Bannigan, research associate with the Allen research group at the Leslie Dan Faculty of Pharmacy, University of Toronto.
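The workflow Bannigan describes (hold out part of the data, train on the rest, then compare each model's predictions with the held-out measurements) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the single descriptor `x` and release fraction `y` are hypothetical, and the two toy models (a one-variable linear regression and a depth-1 regression tree) are simple stand-ins, not the eleven models or the lightGBM implementation used in the study.

```python
import random

# Synthetic stand-in data (an assumption for illustration, not the study's data):
# x is a hypothetical formulation descriptor, y a hypothetical release fraction.
rng = random.Random(42)
data = []
for _ in range(100):
    x = rng.uniform(0.0, 1.0)
    y = (0.2 if x < 0.5 else 0.8) + rng.gauss(0.0, 0.05)
    data.append((x, y))


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and return (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_linear(train):
    """Ordinary least squares fit of y = a * x + b."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def fit_stump(train):
    """Depth-1 regression tree: choose the split with the lowest squared error."""
    xs = sorted(set(x for x, _ in train))
    best = None
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in train if x <= threshold]
        right = [y for x, y in train if x > threshold]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        sse = sum((y - mean_left) ** 2 for y in left)
        sse += sum((y - mean_right) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, mean_left, mean_right)
    _, threshold, mean_left, mean_right = best
    return lambda x: mean_left if x <= threshold else mean_right


def mean_absolute_error(model, rows):
    """Average absolute gap between predictions and held-out measurements."""
    return sum(abs(model(x) - y) for x, y in rows) / len(rows)


# Train on one subset, then score every model on the subset it never saw.
train, test = train_test_split(data)
models = {
    "linear regression": fit_linear(train),
    "decision stump": fit_stump(train),
}
scores = {name: mean_absolute_error(m, test) for name, m in models.items()}
for name, mae in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: test MAE = {mae:.3f}")
```

Scoring only on the held-out subset is what makes the comparison fair: a model that merely memorizes its training data gains no advantage. The ranking this sketch prints reflects only how the synthetic data were generated and says nothing about the study's results.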
Building on these findings, the research team applied the models to inform the design of new long-acting injectables, using analytical techniques to extract design criteria from the lightGBM model. The team then tested the design criteria on a drug used to treat ovarian cancer, and Bannigan said the resulting formulation had a suitably slow release rate.
Bannigan added, “This was significant because in the past it might have taken us several iterations to get to a release profile that looked like this. With machine learning, we got there in one.”
To build on this work, the team is calling for more open-access databases, stating that the lack of open data could hinder further progress in the area. To promote openness across the industry, the team made their datasets and code fully available to fellow researchers.