Computer hardware maker Nvidia is staking out a larger role in genetic research, a field undergoing a transformative shift thanks to pioneering efforts in artificial intelligence (AI).
Developed in a joint effort with Argonne National Laboratory and the University of Chicago, a new Large Language Model named GenSLMs has drawn significant attention for its ability to generate gene sequences that closely mirror real-world variants of the SARS-CoV-2 virus, which causes COVID-19. This suggests that AI can exhibit a sophisticated understanding of complex genetic patterns.
GenSLMs can also distinguish between COVID variants thanks to its training on over 110 million genomes, enabling it to classify and cluster genome sequences.
“The AI’s ability to predict the kinds of gene mutations present in recent COVID strains — despite having only seen the Alpha and Beta variants during training — is a strong validation of its capabilities,” said Arvind Ramanathan, the project's lead researcher from Argonne, in an official statement shared by Nvidia.
For its part in the research, Nvidia provided the team with advanced computational resources, including NVIDIA A100 Tensor Core GPU-powered supercomputers, which proved crucial in processing the extensive dataset of nucleotide sequences.
Impact of Large Language Models in Genetics
Medicine-focused Large Language Models like GenSLMs, Ankh, and CancerGPT represent major advancements in modern genetic research. Large language models learn from extensive text datasets to predict and generate contextually relevant language patterns. In genetics, this translates into the ability to analyze and interpret complex genetic sequences by treating them much like text.
This innovative application of LLMs has opened a new chapter in genetics, where the deep understanding of genetic sequences leads to breakthroughs in identifying disease markers and advancing personalized medicine.
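To make the comparison with language concrete, here is a minimal Python sketch of how a nucleotide sequence could be broken into "words" and turned into the integer IDs a language model consumes. It is an illustration only: the three-letter (codon-sized) tokens are an assumption made for this example, the sequence is made up, and the function names `codon_tokens` and `build_vocab` are hypothetical rather than part of any of the projects described here.

```python
def codon_tokens(sequence: str) -> list[str]:
    """Split a nucleotide string into non-overlapping 3-letter tokens.

    Trailing bases that do not fill a complete token are dropped.
    """
    seq = sequence.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def build_vocab(tokens: list[str]) -> dict[str, int]:
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab: dict[str, int] = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    return vocab


# A made-up toy sequence, not real viral data.
toy_sequence = "ATGGCTATGAAATAA"

tokens = codon_tokens(toy_sequence)
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]

print(tokens)
print(ids)
```

Once a genome is expressed as a sequence of token IDs like this, it can in principle be fed to the same kind of next-token prediction training used for text, which is the sense in which such models "read" genetic sequences.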
Ankh, developed collaboratively by the Universities of Munich and Columbia with the biotech startup Proteinea, delves into the language of proteins, while CancerGPT, a joint project from the University of Texas and the University of Massachusetts, predicts drug interactions in cancer treatment using LLMs. These studies signify a major shift in processing and deriving insights from vast amounts of genetic data.
GenSLMs' ability to forecast viral mutations opens new possibilities for vaccine development and treatment strategies for diseases like COVID-19, Nvidia claims. The applications of Ankh in drug development and CancerGPT in understanding cancer treatments are paving the way for more targeted and effective medical interventions.
Edited by Ryan Ozawa.