The Human Genome Project, which aimed to map and sequence the entire human genome, began in 1990 and was completed in 2003, with a projected budget of about $3 billion. It gave us, for the first time, a means of accessing invaluable data through genes: evolution patterns, diseases and their treatments, gene mutations and their effects, anthropological information and more. Now, powerful software and analysis tools are being built that can decode an entire genome in a matter of hours. Data analytics is quickly becoming one of the most important branches of science applied in the biotech industry.
DNA sequencing generates a huge amount of data that needs to be analyzed with care, as the information and conclusions drawn are applicable in a whole range of industries, from medicine to forensic science. Working with this data involves data science at several levels:
Storage: The first step is storing the DNA sequencing data. If we were to sequence the genome of every living thing, from a microbe to a human, we would need powerful data science tools to help us store, track and retrieve the relevant information.
Annotation: Annotation is the process of adding notes to specific genes in the sequence. Tools are being built to put an automated annotation system in place, which requires pattern recognition and identification.
Visualization: DNA can be visualized on many levels and in different dimensions. Data visualization tools present this data in various layouts, showing correlations and helping the user identify problems easily. Data analytics also helps in building robust DNA software, with zoom, pan and other interactive features built into the interface to facilitate quick study. Newer, innovative ways of visualization are coming into the market as well!
Clockwise from top right, the genomes of a human, a chimpanzee, a mouse and a zebrafish are arranged in a circle, with each color square corresponding to a pair of chromosomes. Lines connect similar DNA sequences, visually emphasizing just how much DNA we share with other species. (Image: Martin Krzywinski/EMBO)
Analysis: Data analytics software helps draw inferences from specific gene sequences and mutations, and these inferences are invaluable in the healthcare industry. The information obtained from data analysis can also be applied in drug discovery and development, for targeting specific diseases and customizing treatment approaches.
Illumina, a company that sells DNA sequencing analysis tools, is all set to release two new sequencing machines which allow more accurate insight into genes.
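To make the annotation and analysis steps above a little more concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions: the sequences are made-up toy strings, the motif names (`start_codon`, `tata_like`) and the helper functions `annotate_motifs` and `point_mutations` are hypothetical, and real pipelines rely on far more sophisticated alignment and annotation tools than plain pattern matching.

```python
import re

def annotate_motifs(sequence, motifs):
    """Return (name, start, end) for every occurrence of each motif pattern."""
    annotations = []
    for name, pattern in motifs.items():
        # The lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            start = match.start(1)
            annotations.append((name, start, start + len(match.group(1))))
    # Order the notes by their position in the sequence.
    return sorted(annotations, key=lambda a: a[1])

def point_mutations(reference, sample):
    """List (position, reference_base, sample_base) where two sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("this simple comparison needs equal-length sequences")
    return [(i, r, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]

# Toy data, invented for illustration only.
reference = "ATGCGTATAAAGGCTAGATGA"
sample = "ATGCGTATAAAGGCTCGATGA"
motifs = {"start_codon": "ATG", "tata_like": "TATAAA"}

for name, start, end in annotate_motifs(reference, motifs):
    print(f"{name}: positions {start}-{end}")

for position, ref_base, sample_base in point_mutations(reference, sample):
    print(f"position {position}: {ref_base} -> {sample_base}")
```

The first function attaches notes to a sequence by recognizing patterns, and the second compares a sample against a reference to find single-base differences, which is the kind of raw signal that downstream analysis would then interpret.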
Researchers in biotech are often pressed for time, yet the research needed to achieve a desired result can go on for years. Data analytics, when applied to clinical trials and experiments, helps identify the source of an error more quickly and easily. It also helps build predictive models and provides information on the optimum parameters for achieving the desired outcome of an experiment.
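As a rough illustration of finding optimum parameters and spotting a likely source of error, the sketch below summarizes replicate runs of an experiment. Everything in it is assumed for the example: the incubation temperatures, the yield values, the 0.05 spread cutoff and the function names are hypothetical, and a real study would use proper experimental design and statistical testing rather than a simple comparison of means.

```python
from statistics import mean, stdev

def summarize_runs(runs):
    """Map each parameter setting to (mean outcome, standard deviation)."""
    return {setting: (mean(values), stdev(values)) for setting, values in runs.items()}

def best_setting(runs):
    """Return the setting whose replicates have the highest mean outcome."""
    summary = summarize_runs(runs)
    return max(summary, key=lambda setting: summary[setting][0])

def noisy_settings(runs, max_stdev):
    """Return settings whose replicates vary more than max_stdev."""
    summary = summarize_runs(runs)
    return [setting for setting, (_, spread) in summary.items() if spread > max_stdev]

# Hypothetical data: incubation temperature (deg C) -> yields of three replicate runs.
runs = {
    30: [0.61, 0.64, 0.62],
    34: [0.72, 0.75, 0.74],
    37: [0.70, 0.52, 0.71],
}

print("best setting:", best_setting(runs))
print("settings worth re-checking:", noisy_settings(runs, max_stdev=0.05))
```

In this made-up data set the 37-degree runs contain one replicate far below the others, so that setting is flagged for a closer look before any conclusion is drawn from it.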
Data modelling helps biotech and pharma companies screen drugs before they pick the one that’s most effective, based on the computer-generated feedback. The best options are then taken further to clinical trials. Analysis also helps hospitals monitor and evaluate patient progress and their treatment plans. Genentech has developed a database of patients previously diagnosed and treated for cancer, and this is now helping them choose effective therapies for patients currently being treated. Predilytics, a healthcare predictive analytics company, has recorded the data of about 250 million consumers, creating insights into the where, what and when of patient requirements.
Agricultural biotech firms can also take advantage of data science tools by using them to identify the best performing crops with minimum impact on the environment, especially among the genetically modified plants.
The pharma industry in particular has seen an explosion in the data available to it, and with this, mapping small clinical trials to real-world situations is becoming more and more challenging. The data at its disposal comes in a variety of formats and is often noisy, so scientists have to come up with software that cleans this raw data and provides accurate solutions.
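A minimal sketch of what cleaning such raw data can look like is given below. It assumes a single column of numeric readings recorded inconsistently (strings, decimal commas, placeholder values) and removes extreme values with a median-based modified z-score. The sample readings, the 3.5 threshold and the function names are assumptions made for this illustration, not a description of any particular company's software.

```python
from statistics import median

def parse_measurement(raw):
    """Convert a raw field to float, or return None if it cannot be interpreted."""
    if raw is None:
        return None
    # Assumption: a comma is a decimal comma, not a thousands separator.
    text = str(raw).strip().lower().replace(",", ".")
    if text in {"", "na", "n/a", "null", "-"}:
        return None
    try:
        return float(text)
    except ValueError:
        return None

def drop_outliers(values, threshold=3.5):
    """Drop values whose modified z-score (median and MAD based) exceeds threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread to measure against, so keep everything.
        return list(values)
    return [v for v in values if 0.6745 * abs(v - med) / mad <= threshold]

# Hypothetical readings in mixed formats, including a likely data-entry error (510).
raw_readings = ["5.1", " 4,9 ", "N/A", 5.3, "", "5.0", "510", "five", None, "5.2"]

parsed = [v for v in map(parse_measurement, raw_readings) if v is not None]
cleaned = drop_outliers(parsed)
print(cleaned)
```

The median-based score is used here instead of the mean and standard deviation because a single extreme value, such as the 510 above, would otherwise distort the very statistics used to detect it.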
Big data also helps businesses gain deeper insight into their market and tailor solutions to specific audiences based on their behaviours. Within an organisation, data analytics can help make operations more effective and efficient. A report published by McKinsey outlines eight ways that pharma companies can benefit from big data.
Data analytics provides insightful metrics that help companies identify bottlenecks and overcome challenges. It prompts unambiguous, data-driven decisions, which can strengthen an organisation’s operations, processes, sales and, in turn, its future.