In the last decade, India’s use of genomics has undergone a significant transformation, so much so that the diagnosis, management, and treatment of many diseases, including tuberculosis, cancers, and drug-resistant infections, stand on the cusp of a revolution.
Most recently, in January 2024, the Department of Biotechnology said it had completed sequencing 10,000 genomes from 99 ethnic groups under its ‘Genome India’ project. This national initiative aims to develop a reference genome for Indian people, which will help design genome-wide and disease-specific ‘genetic chips’ for low-cost diagnostics and research.
Earlier, in October 2020, the Council of Scientific and Industrial Research (CSIR) had reportedly sequenced the entire genomes of 1,008 individuals in India, representing diverse ethnic groups, in six months. This effort was part of a mission called ‘IndiGen’, which set out to create a pilot dataset with which researchers could analyse the epidemiology of genetic diseases, help develop affordable screening approaches, optimise treatment, and minimise adverse events.
Other, more disease-specific consortia have also sprung up around the country, and efforts are underway to create new datasets to address individual health problems, ranging from the age-old scourge of tuberculosis to cancers, rare genetic disorders in children, and even antimicrobial resistance. Researchers have also been able to extract more value from these datasets by using artificial intelligence and machine learning, and by combining their contents with other extensive datasets on proteins (proteomics), gene expression in cells (transcriptomics), and chemical changes that regulate gene expression (epigenomics) to develop a ‘multi-omics’ approach to tackle diseases.
Tuberculosis
A recent consortium concerns tuberculosis, a disease whose eradication continues to pose significant challenges in India and around the world. The Indian Tuberculosis Genomic Surveillance Consortium (InTGS) comprises 10 RePORT India sites covering eight states. Its goals are to sequence around 32,000 clinical strains of the tuberculosis bacterium from patients with active disease and to develop a centralised biological repository of clinical Mycobacterium tuberculosis strains in India.
Other major objectives vis-à-vis tuberculosis include mapping the genetic diversity of pulmonary and extra-pulmonary isolates of the tuberculosis bacterium from newly reported active cases in India and the associated treatment outcomes, and correlating mutations with drug resistance patterns, according to Vinay Nandicoori, director of the CSIR-Centre for Cellular and Molecular Biology (CCMB), Hyderabad. The project’s ultimate goal is to validate the identified mutations in order to develop a sequence-based method to determine drug resistance, and to combine the epidemiological data with results from whole-genome sequencing to develop working solutions.
Scientists from several leading research institutes have divided the project into four stages. In the first stage, scientists from the Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry; the National Institute for Research in Tuberculosis, Chennai; the Bhagwan Mahavir Medical Research Centre, Hyderabad; the Byramjee Jeejeebhoy Government Medical College, Pune; and the P.D. Hinduja Hospital, Mumbai, will collect the clinical samples along with the patients’ metadata. Next, scientists at the International Centre for Genetic Engineering and Biotechnology, New Delhi, will isolate the genetic material from the samples and set up a strain repository. In the third stage, scientists at CCMB and the National Institute of Biomedical Genomics, Kalyani, will conduct whole-genome sequencing. In the fourth and final stage, a team at the National Institute of Immunology, New Delhi, will analyse RNA sequencing data and develop AI and ML models that predict drug resistance and use the metadata to detect resistance patterns, according to Dr. Nandicoori.
“This is a huge, huge, huge project,” he added. The starting point is to generate baseline data, a task that has been relatively neglected in India compared with several other countries.
Rare genetic disorders
India has also launched a pan-country mission on Paediatric Rare Genetic Disorders (PRaGeD). Despite their individual rarity, these disorders have become a common public health concern. Mission PRaGeD plans to raise awareness, perform genetic diagnosis, discover and characterise new genes or variants, provide counselling, and develop new therapies for rare genetic diseases that afflict India’s children.
The mission will incorporate IndiGen data into the in-house bioinformatic pipelines it will use to analyse the exome, the part of a genome that codes for proteins. The Centre for DNA Fingerprinting and Diagnostics (CDFD), Hyderabad, in collaboration with 15 centres across India, plans to recruit patients with rare genetic disorders and their families.
“The study aims to identify novel genes for various known as well as unexplained inherited phenotypes (observable traits) but also help the patient and family with management of disease and prenatal diagnosis,” said Ashwin Dalal, group leader for diagnostics and a scientist at CDFD. The team will also characterise novel genes or their variants to determine their function or role in the disorder, using cell lines and/or model organisms such as mice, fruit flies, and zebrafish.
Also on the anvil in the mission is the use of next-generation sequencing, one of the latest tools to manage rare diseases and to assess the probability of developing several chronic ailments, especially when conventional tests give negative results. “Implementing newborn genetic testing at a national level can contribute to the management of rare genetic conditions through faster and more accurate diagnoses,” Dr. Dalal said.
Cancers
The Indian Cancer Genome Consortium (ICGC-India), part of the larger International Cancer Genome Consortium (ICGC) and supported by the Department of Biotechnology, plans to characterise genomic abnormalities in different types of cancers in Indian patients and identify population-specific genetic variations that are linked to cancer risk and treatment response. Such population-wide genome sequencing projects can facilitate the discovery of novel biomarkers, potential new treatment targets, and personalised treatment strategies, according to Dinesh Gupta, group leader of translational bioinformatics at the International Centre for Genetic Engineering and Biotechnology, New Delhi.
Several Indian institutions have established ICGC-like genomic data repositories to facilitate cancer research and precision medicine initiatives that cater to the genetic makeup of Indian people, Dr. Gupta said. Another example is the Indian Cancer Genome Atlas project, a not-for-profit public-private-philanthropic initiative that is trying to create a comprehensive catalogue of genomic alterations across various cancer types prevalent in India. This could help researchers identify novel biomarkers and treatment targets. The Atlas collects and generates detailed genomic data linked with clinical data.
Clinical trials in cancer are also beginning to incorporate genomics in the country, Dr. Gupta added. Indian cancer centres classify patients using genomic profiling for clinical trials that are based on their molecular subtypes, and match potential responders with targeted therapies.
Antimicrobial resistance
Genomics and metagenomics are coming in handy to analyse antimicrobial resistance and to understand how rapidly antibiotic-resistance functions could spread between bacterial species. Some microbes, such as the bacteria that cause tuberculosis, grow very slowly even in laboratory conditions, Bhabatosh Das, associate professor at the Translational Health Science and Technology Institute, Faridabad, explained. “So clinicians prescribe antibiotics without knowing the actual resistance profile of the infectious agents.”
In such cases, genome sequencing is very helpful because it can provide information about the resistance profile of microbes without researchers having to grow them in the lab, he said. “Such information helps clinicians make judicious use of antibiotics.” In tuberculosis, pathogen-specific resistance signatures “should add immense value to antimicrobial-resistance diagnostics and the selection of appropriate drug combinations for successful antimicrobial therapy.”
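The idea of reading a resistance profile off a genome, rather than off a culture, can be illustrated with a minimal Python sketch. It is not the method of any consortium described here. The catalogue below is a tiny, hand-picked subset of widely documented resistance-associated mutations in Mycobacterium tuberculosis, and the function name, data layout, and example isolate are all hypothetical.

```python
# Illustrative only: a three-entry catalogue of well-documented
# resistance-associated mutations. Real catalogues hold far more entries
# and grade each mutation by the strength of evidence.
RESISTANCE_CATALOGUE = {
    ("rpoB", "S450L"): "rifampicin",
    ("katG", "S315T"): "isoniazid",
    ("embB", "M306V"): "ethambutol",
}


def predict_resistance(variants):
    """Return a sorted list of drugs flagged by catalogued mutations.

    `variants` is an iterable of (gene, amino-acid change) pairs, as a
    variant-calling step might report them (hypothetical format).
    """
    drugs = {
        RESISTANCE_CATALOGUE[variant]
        for variant in variants
        if variant in RESISTANCE_CATALOGUE
    }
    return sorted(drugs)


# Hypothetical variant calls from one sequenced isolate. The gyrA change
# is not in the catalogue above, so it is ignored.
isolate_variants = [("katG", "S315T"), ("gyrA", "E21Q"), ("rpoB", "S450L")]
print(predict_resistance(isolate_variants))  # ['isoniazid', 'rifampicin']
```

A lookup like this only recognises mutations that are already catalogued, which is why projects that correlate newly observed mutations with resistance patterns matter for keeping such catalogues current.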
AI, ML, and multi-omics
Meanwhile, artificial intelligence (AI) and machine learning (ML) algorithms are lending a helping hand to genomics in analysing the extensive datasets. These technologies can help predict an individual’s risk of developing cancer, develop diagnostic tools to detect some cancers early, classify them, and develop treatment strategies, Dr. Gupta said.
Researchers have also started using AI and ML to help analyse genome-sequencing data in cases of rare genetic disorders. A single instance of sequencing the entire exome of an individual can yield 5 Gb of data and whole-genome sequencing can yield 90 Gb, Dr. Dalal explained: “Analysis of such massive sequencing is impossible without use of computational tools.” Technicians are using AI- and ML-based approaches, both in in-house bioinformatic pipelines and as part of commercial tools, to analyse the sequencing data and identify disease-causing variants.
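To give a sense of scale, the per-sample figures quoted by Dr. Dalal can be multiplied out for cohorts of the sizes mentioned in this article. The short Python sketch below is a back-of-the-envelope calculation only. It assumes every sample yields exactly the quoted figure, it keeps the unit "Gb" as quoted, and it does not represent the actual data volumes of IndiGen or Genome India, which are not stated here. The function and constant names are illustrative.

```python
EXOME_GB = 5    # per exome, as quoted in the article
GENOME_GB = 90  # per whole genome, as quoted in the article


def cohort_volume_gb(n_samples, per_sample_gb):
    """Total data volume if every sample yields the same amount (assumption)."""
    return n_samples * per_sample_gb


# Cohort sizes taken from the article; the totals are rough estimates.
print(cohort_volume_gb(1_008, GENOME_GB))   # 90720
print(cohort_volume_gb(10_000, GENOME_GB))  # 900000
print(cohort_volume_gb(10_000, EXOME_GB))   # 50000
```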
With the rapid expansion of AI, it is now easy to access multi-omics data and analyse Big Data products rapidly, even with only standard computational facilities, according to Dr. Das. He added that multi-omics is today an emerging technology in clinical science in India.
T.V. Padma is a science journalist in New Delhi.