AlphaGenome is an artificial intelligence (AI) model developed by Google DeepMind, announced in June 2025 and officially published in Nature in January 2026. It is designed to interpret the non-coding portion of the human genome, roughly 98% of the sequence and sometimes called genomic "dark matter". This portion does not contain instructions for making proteins, but it holds the regulatory elements that control gene activity.
Here is a detailed breakdown of what AlphaGenome is and why it is significant:
1. Core Capabilities and Function
- Deciphering Non-Coding DNA: Only about 2% of the genome codes for proteins. Much of the information that determines when, where, and how strongly genes are switched on lies in the remaining 98%. AlphaGenome predicts the impact of mutations in these regulatory regions.
- Extensive Context (1 Million Bases): Earlier sequence models generally had to trade input length against resolution. AlphaGenome can analyze up to 1 million base pairs of DNA sequence at once while maintaining single-base-pair resolution. This allows it to capture how distant DNA regions interact to control a gene.
- Multimodal Prediction: It simultaneously predicts 11 different types of molecular readout from a single input sequence, including gene expression (RNA-seq), chromatin (DNA) accessibility, and how RNA is spliced.
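- Variant Impact Scoring: The model can take a specific mutation (for example, a single-letter change in DNA) and predict its functional consequence by comparing its predictions for the mutated sequence against those for the unmutated one, for example whether the change will increase or decrease the expression of a gene.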
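The variant-scoring idea can be illustrated with a minimal sketch: predict a track for the reference sequence, predict it again for the mutated sequence, and compare the two. Everything here is an assumption made for illustration. `toy_expression_track` is a hypothetical stand-in that only looks for one motif; it is not AlphaGenome and does not use its API, and the names `apply_snv` and `score_variant` are invented for this example.

```python
from typing import Callable, List

Track = List[float]


def toy_expression_track(seq: str, motif: str = "TATAAA") -> Track:
    """Hypothetical stand-in for a trained model (NOT AlphaGenome).

    Returns one value per base: 1.0 where the motif starts, else 0.0.
    """
    return [1.0 if seq.startswith(motif, i) else 0.0 for i in range(len(seq))]


def apply_snv(seq: str, pos: int, ref: str, alt: str) -> str:
    """Return a copy of seq with the base at 0-based pos changed from ref to alt."""
    if seq[pos] != ref:
        raise ValueError(f"expected {ref!r} at position {pos}, found {seq[pos]!r}")
    return seq[:pos] + alt + seq[pos + 1:]


def score_variant(
    predict: Callable[[str], Track], seq: str, pos: int, ref: str, alt: str
) -> float:
    """Score a variant as (summed alternate track) minus (summed reference track).

    A negative score means the variant lowers the predicted signal.
    """
    ref_track = predict(seq)
    alt_track = predict(apply_snv(seq, pos, ref, alt))
    return sum(alt_track) - sum(ref_track)


reference = "GGCCTATAAAGGCC"
# Changing T to G at position 6 destroys the only TATAAA motif in the toy sequence.
effect = score_variant(toy_expression_track, reference, 6, "T", "G")
print(effect)  # -1.0
```

A real model would return many tracks per sequence and use a more careful comparison than a plain sum, but the reference-versus-alternate structure is the same.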
2. How it Works
AlphaGenome uses a deep learning architecture that combines two kinds of layers:
- Convolutional Layers: To detect short, local patterns (motifs) in the DNA sequence.
- Transformers: To model how distant positions in the 1-megabase sequence interact with each other.
Training Data: The model was trained on large datasets from humans and mice, including data from ENCODE, GTEx, and the 4D Nucleome project.
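To make the division of labor between the two layer types concrete, here is a toy sketch in plain Python. It is not AlphaGenome's architecture: the kernel is hand-set to match a single motif rather than learned, the attention step uses scalar features with no learned projections, and all function names are assumptions made for this example. It only shows that a convolution responds to local patterns while attention lets every position draw on every other position, however far apart.

```python
import math
from typing import List

BASES = "ACGT"


def one_hot(seq: str) -> List[List[float]]:
    """Encode each base as a 4-element indicator vector in A, C, G, T order."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]


def conv1d(x: List[List[float]], kernel: List[List[float]]) -> List[float]:
    """Slide a (width x 4) kernel along the sequence (cross-correlation, no padding)."""
    width = len(kernel)
    return [
        sum(x[i + k][c] * kernel[k][c] for k in range(width) for c in range(4))
        for i in range(len(x) - width + 1)
    ]


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features: List[float]) -> List[float]:
    """Single-head self-attention over scalar features, with identity projections.

    Each output is a weighted average of ALL positions, so distant positions
    with similar features influence each other.
    """
    out = []
    for query in features:
        weights = softmax([query * key for key in features])
        out.append(sum(w * v for w, v in zip(weights, features)))
    return out


seq = "GGTATACCCCTATAGG"
motif_kernel = one_hot("TATA")  # hand-set kernel that matches the TATA motif
motif_scores = conv1d(one_hot(seq), motif_kernel)  # peaks where TATA starts
mixed = self_attention(motif_scores)  # each position now reflects the whole window

print(motif_scores.index(max(motif_scores)))  # 2
print(len(mixed) == len(motif_scores))        # True
```

In this toy, the two TATA sites (at positions 2 and 10) produce identical convolution scores, and the attention step weights them heavily toward each other despite the bases in between, which is the kind of long-range interaction the transformer layers are meant to capture.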
3. Significance for Science and Health
- Rare Disease Diagnosis: It helps researchers assess the functional impact of variants in non-coding regions, which are often overlooked when investigating patients with undiagnosed rare diseases.
- Cancer Research: AlphaGenome can predict how non-coding mutations might activate "oncogenes" (cancer-driving genes).
- Drug Development & Synthetic Biology: By understanding how to control gene expression, scientists can design synthetic DNA for gene therapies or identify new therapeutic targets.
4. Relation to Other Models
AlphaGenome is part of DeepMind's "Alpha" family of scientific AI, which also includes:
- AlphaFold/AlphaFold 3: Which predicts protein structures.