Analysis Tool Automates Drug Discovery Data Acquisition and Management
- Created: Wednesday, 01 June 2005
Pharmaceutical company uses calculation software to automate data capture, analysis, and visualization.
In the race to deliver drugs to market more quickly, biopharmaceutical companies need to classify and interpret data at unprecedented rates with tools that eliminate redundant analysis tasks and promote collaboration among scientists.
Infinity Pharmaceuticals is an early-stage biopharmaceutical company focused on the discovery of life-saving cancer therapeutics developed through novel chemistry. A typical approach to determining promising molecules involves high-throughput screening against targets of interest. A significant subset of these molecules is tested in dose response, requiring various curve-fitting calculations. In particular, the company generates IC50 values (the concentration of a drug required to inhibit 50% of enzyme activity) for all of the compounds, and KI values for most of them. The KI calculation relies on implicit functions.
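To make the IC50 calculation concrete, the following is a minimal sketch in Python (not the MATLAB code Infinity used, which the article does not show). It assumes a two-parameter Hill model with inhibition running from 0 to 100%, and it estimates the IC50 by linear regression on the logit-transformed data rather than by a full nonlinear fit. The function names `hill_inhibition` and `fit_ic50` are illustrative only.

```python
import math


def hill_inhibition(conc, ic50, hill):
    """Percent inhibition predicted by a two-parameter Hill model (0 to 100%)."""
    return 100.0 / (1.0 + (ic50 / conc) ** hill)


def fit_ic50(concs, inhibitions):
    """Estimate (IC50, Hill slope) by least squares on the Hill (logit) plot.

    Uses the identity log10(I / (100 - I)) = h*log10(c) - h*log10(IC50).
    Points at or outside 0-100% cannot be transformed and are skipped.
    """
    xs, ys = [], []
    for c, inh in zip(concs, inhibitions):
        if c > 0 and 0.0 < inh < 100.0:
            xs.append(math.log10(c))
            ys.append(math.log10(inh / (100.0 - inh)))
    if len(xs) < 2:
        raise ValueError("need at least two usable points")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or sxy == 0:
        raise ValueError("data do not define a dose-response slope")

    hill = sxy / sxx                      # slope of the Hill plot
    intercept = mean_y - hill * mean_x    # equals -h * log10(IC50)
    ic50 = 10 ** (-intercept / hill)
    return ic50, hill
```

The logit linearization is simple but weights points near 0% and 100% heavily, so in practice a nonlinear fit with floating top and bottom plateaus is often preferred.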
Infinity sought to standardize on a data analysis tool that would integrate with its in-house applications and provide efficient analysis of large data sets. To avoid manually entering results into a central database, the company automated the process of capturing, analyzing, and visualizing data. This required software that would standardize the various calculations and ensure the integrity of screening results across projects.
MATLAB graphical analysis software was chosen to integrate with Infinity’s existing applications and calculate the IC50 curves. Infinity’s analysts developed an automated process by which the data generated by a scientist’s instrument is read directly into a database. Project teams review, analyze, and annotate data by removing outliers and dynamically refitting the curves. The team was able to identify and systematically work their way through key bottlenecks. Curve-fitting was a priority, and in particular, the scientists needed to acquire selected data points and recalculate values accordingly.
The process begins with a high-throughput screen against a deck, with the data stored in an Oracle database. A subset of the compounds (“actives”) is then selected based on these results. Serial dilutions of the compounds are prepared and an assay is performed. IC50s and KIs are generated for each series, and the data is stored in the Oracle database. Outliers can be deleted automatically, or the scientists can view the curves through Spotfire, deleting points where appropriate. When points are deleted, the values are recalculated, and the data is again stored in the Oracle database.
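The delete-and-recalculate loop described above can be sketched as follows. This is an illustration under stated assumptions, not Infinity's implementation: an in-memory SQLite database stands in for Oracle, the table and column names (`points`, `results`, `excluded`) are hypothetical, and the IC50 is recomputed by simple log-linear interpolation of the 50% crossing rather than by a curve fit.

```python
import math
import sqlite3


def interpolate_ic50(points):
    """Log-linear interpolation of the concentration giving 50% inhibition.

    `points` is a list of (concentration, percent_inhibition) pairs.
    Returns None when no adjacent pair brackets 50%.
    """
    pts = sorted(points)
    for (c_lo, i_lo), (c_hi, i_hi) in zip(pts, pts[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi != i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


def recalculate(conn, compound_id):
    """Recompute the IC50 from the non-excluded points and store the result."""
    rows = conn.execute(
        "SELECT conc, inhibition FROM points "
        "WHERE compound_id = ? AND excluded = 0",
        (compound_id,),
    ).fetchall()
    ic50 = interpolate_ic50(rows)
    conn.execute(
        "INSERT OR REPLACE INTO results (compound_id, ic50) VALUES (?, ?)",
        (compound_id, ic50),
    )
    conn.commit()
    return ic50


def exclude_point(conn, point_id):
    """Flag one point as an outlier, then recalculate its compound's IC50."""
    conn.execute("UPDATE points SET excluded = 1 WHERE id = ?", (point_id,))
    (compound_id,) = conn.execute(
        "SELECT compound_id FROM points WHERE id = ?", (point_id,)
    ).fetchone()
    return recalculate(conn, compound_id)


def make_demo_db():
    """Build a small in-memory database holding one made-up dilution series."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE points (id INTEGER PRIMARY KEY, compound_id TEXT, "
        "conc REAL, inhibition REAL, excluded INTEGER DEFAULT 0)"
    )
    conn.execute("CREATE TABLE results (compound_id TEXT PRIMARY KEY, ic50 REAL)")
    # Illustrative values only; the 3.0 point is a deliberate outlier.
    series = [(0.1, 10.0), (1.0, 30.0), (3.0, 95.0), (10.0, 70.0), (100.0, 95.0)]
    conn.executemany(
        "INSERT INTO points (compound_id, conc, inhibition) VALUES ('cmpd-1', ?, ?)",
        series,
    )
    conn.commit()
    return conn
```

Flagging points as excluded instead of physically deleting them is a design choice made for this sketch: it keeps the raw instrument data intact so an exclusion can be reviewed or reversed later.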
To calculate the IC50 curves, Infinity developed a Web-based interface that uses Java to communicate with MATLAB through various Java applications. The interface prompts scientists to perform their calculations. Parameters from the interface use M-file templates to execute scripts in MATLAB. Scientists used MATLAB and the Statistics Toolbox to perform batch calculations of hundreds of thousands of data points. Visualizations, along with IC50 values and coefficients in response elements, are displayed to scientists in a browser. The IC50s persist in a database to ensure that data is retained.
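As a rough illustration of how interface parameters might be substituted into an M-file template, the sketch below renders MATLAB script text from Python. The article does not show Infinity's templates or its Java code, so the template body, the placeholder names (`conc`, `resp`, `ic50_guess`), and the `render_m_file` helper are all assumptions. The generated script calls `nlinfit` from the Statistics Toolbox with a two-parameter Hill model chosen for the example.

```python
from string import Template

# Hypothetical M-file template; placeholder names are illustrative only.
M_TEMPLATE = Template("""\
% Generated script: fit a two-parameter Hill model to one dilution series.
conc = [$conc];
resp = [$resp];
model = @(b, x) 100 ./ (1 + (b(1) ./ x) .^ b(2));
beta = nlinfit(conc, resp, model, [$ic50_guess 1]);
""")


def _format_numbers(values):
    """Render values as a space-separated MATLAB row vector body.

    Converting through float() rejects non-numeric input, so arbitrary
    text from the interface cannot end up in the generated script.
    """
    return " ".join(f"{float(v):.6g}" for v in values)


def render_m_file(concs, responses, ic50_guess):
    """Fill the template with one series of concentrations and responses."""
    if len(concs) != len(responses):
        raise ValueError("concs and responses must have the same length")
    if not concs:
        raise ValueError("at least one data point is required")
    return M_TEMPLATE.substitute(
        conc=_format_numbers(concs),
        resp=_format_numbers(responses),
        ic50_guess=f"{float(ic50_guess):.6g}",
    )
```

For example, `render_m_file([0.1, 1, 10], [10, 50, 90], 1)` returns script text that could then be handed to MATLAB for execution; how that hand-off works in Infinity's system is not described in the source.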
Once the scientists approve the modifications, they update the values stored in the database with a single click and publish them to the project leaders and scientific leadership teams for immediate notification.
Engineers are also using MATLAB and related toolboxes across internally developed applications, including KI calculations for finding the enzyme inhibition constant, and for cluster analysis and analytical chemistry result derivation.
This work was done by Nick Encina, senior informatics analyst, at Infinity Pharmaceuticals, using software from The MathWorks.