DeepMind’s AlphaGenome API Lets Biologists Query 1-Million-Base DNA Sequences in Seconds

AlphaGenome: Browsing a Million DNA Letters in Seconds

On 25 June 2025, DeepMind quietly pushed a preview of the AlphaGenome API into the hands of researchers. In one stroke, genomics gained a search bar: paste up to one million bases of DNA, hit Run, and watch the system return thousands of functional read-outs before your coffee has cooled. (deepmind.google)

Why this moment feels different

Genomics has lived with an uncomfortable trade-off: choose long-range context or base-pair precision, never both. AlphaGenome’s architecture collapses that dichotomy by coupling convolutional filters (for local motifs) with Transformers (for long-distance crosstalk), giving us a panoramic yet granular view of regulatory DNA. (deepmind.google, eu.36kr.com)

The payoff is immediate. A variant deep in a non-coding desert can be scored across chromatin accessibility, splice-junction shifts, RNA output, and 100-plus epigenomic marks in a single call. What once required stitching together half a dozen niche models and bespoke pipelines now lands in a tidy JSON payload. For bench biologists, that trims days—sometimes weeks—off an experimental cycle.

Benchmarks & raw speed

Early tests show that AlphaGenome scores a single variant against a reference 1 Mb window in “about a second” on DeepMind’s hosted TPU back-end. (biotecnika.org) Through the API, a typical lab can batch a few thousand candidate variants during lunch, iterate on hypotheses by dinner, and only walk to the sequencer when the in-silico evidence looks compelling.

Rule of thumb: expect roughly 1–2 seconds per 1 Mb region for standard variant-effect scoring via the public endpoint. The endpoint is throttled at about 10k calls per hour, so whole-genome sweeps still belong on HPC clusters, but focused biology projects are now interactive.
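To make the rule of thumb concrete, the arithmetic can be sketched in a few lines of Python. The latency and rate-cap figures are the ones quoted above; the function name, the 1.5-second midpoint, and the assumption that calls run one at a time are illustrative choices, not anything DeepMind documents.

```python
def estimate_batch_minutes(n_variants, seconds_per_call=1.5, max_calls_per_hour=10_000):
    """Rough wall-clock estimate (in minutes) for scoring variants one at a time.

    Two limits apply: per-call latency and the hourly rate cap.
    Whichever is slower determines the total time.
    """
    latency_limited = n_variants * seconds_per_call / 60
    rate_limited = n_variants / max_calls_per_hour * 60
    return max(latency_limited, rate_limited)


# 240 GWAS hits, a "lunch-sized" batch, and an illustrative whole-genome-scale sweep.
for n in (240, 3_000, 4_000_000):
    print(f"{n:>9,} variants: ~{estimate_batch_minutes(n):,.0f} min")
```

Under these assumptions, 3,000 variants take about 75 minutes, while four million take roughly 100,000 minutes (about ten weeks), which is why genome-wide sweeps remain a cluster job.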

How does it actually work?

1. Long-context encoding

The input sequence—up to one million base pairs—is first parsed by a stack of convolutions that capture motifs: promoters, splice acceptors, TF binding sites. Those embeddings feed a Transformer that enables any base to “see” every other base, modeling enhancer-promoter loops or insulator effects that might sit hundreds of kilobases apart.
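The two stages can be illustrated with a deliberately tiny, pure-Python toy. This is not AlphaGenome's code: the single hand-built filter, the scalar attention, and all function names are assumptions made for illustration, and the real model learns many filters and operates at a vastly larger scale.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]

def conv1d(x, kernel):
    """Slide one motif filter (width x 4) along the sequence, no padding."""
    width = len(kernel)
    return [
        sum(x[i + k][c] * kernel[k][c] for k in range(width) for c in range(4))
        for i in range(len(x) - width + 1)
    ]

def self_attention(features):
    """Single-head dot-product attention over scalar features.

    Every position attends to every other position, so distant
    elements can influence each other regardless of separation.
    """
    out = []
    for q in features:
        scores = [q * k for k in features]
        top = max(scores)                      # subtract max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(sum(w / total * v for w, v in zip(weights, features)))
    return out

# A hand-built filter that scores 4.0 on a perfect "TATA" match.
tata_filter = one_hot("TATA")

# Two TATA motifs separated by a stretch of G's.
seq = "GG" + "TATA" + "G" * 14 + "TATA" + "CC"
local = conv1d(one_hot(seq), tata_filter)      # local motif scores
mixed = self_attention(local)                  # each position now sees all others
```

Here `local` peaks at the two motif positions (indices 2 and 20), and `mixed` shows how attention lets each position's representation depend on the whole window rather than on its immediate neighbourhood.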

2. Multimodal decoders

Separate heads predict gene expression, splice junctions, chromatin marks, Hi-C contact maps and more. Because the heads share a common backbone, information learned in one modality (say, histone marks) subtly improves another (such as RNA abundance).
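The shared-backbone idea can be sketched as follows. Everything here is a hypothetical stand-in: the two-number "embedding", the head names, and the weights are invented for illustration, and the sketch shows only the forward pass. The cross-modality benefit described above comes from training all heads against one backbone, which this toy does not attempt.

```python
def backbone(seq):
    """Stand-in shared encoder: GC fraction and CpG density of the sequence."""
    n = len(seq)
    gc = sum(base in "GC" for base in seq) / n
    cpg = seq.count("CG") / max(n - 1, 1)
    return [gc, cpg]

# One linear head per modality: (weights, bias). All values are arbitrary.
HEADS = {
    "rna_expression": ([2.0, 1.0], 0.0),
    "chromatin_accessibility": ([1.0, 3.0], 0.1),
    "splice_junction": ([0.5, 0.0], -0.1),
}

def predict(seq):
    """Run the backbone once, then let every head read the same embedding."""
    embedding = backbone(seq)
    return {
        name: sum(w * e for w, e in zip(weights, embedding)) + bias
        for name, (weights, bias) in HEADS.items()
    }

outputs = predict("ACGTACGT")
```

The design point is that the expensive encoder runs once per sequence, and each additional modality costs only a lightweight head.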

3. Variant diffing at light-speed

To score a mutation, AlphaGenome runs two forward passes, one on the wild-type sequence and one on the mutant, then computes the difference for each output modality. Crucially, DeepMind engineered the kernels so that the second pass reuses most cached activations, collapsing runtime to near-constant cost. That optimisation is why “one-second” feels believable even at million-base scale.
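The two-pass scheme reduces to a few lines once a model is available. In this minimal sketch, `toy_model` is a hypothetical stand-in for the real forward pass, and the helper names are invented; the activation caching mentioned above is not modelled.

```python
def apply_snv(sequence, position, ref, alt):
    """Return a copy of `sequence` with the base at 0-based `position` swapped."""
    if sequence[position] != ref:
        raise ValueError(
            f"expected {ref!r} at {position}, found {sequence[position]!r}"
        )
    return sequence[:position] + alt + sequence[position + 1:]

def toy_model(sequence):
    """Hypothetical stand-in for a forward pass: one score per modality."""
    n = len(sequence)
    return {
        "gc_content": sum(b in "GC" for b in sequence) / n,
        "tata_sites": float(sequence.count("TATA")),
    }

def score_variant(sequence, position, ref, alt, model=toy_model):
    """Two forward passes (reference and alternate), then per-modality deltas."""
    ref_out = model(sequence)
    alt_out = model(apply_snv(sequence, position, ref, alt))
    return {name: alt_out[name] - ref_out[name] for name in ref_out}

# An A>C change at index 3 destroys the only TATA site and raises GC content.
deltas = score_variant("GGTATAGG", position=3, ref="A", alt="C")
```

Checking the reference base before substituting is a cheap guard against off-by-one coordinate errors, a common source of silently wrong variant scores.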

Real-world snapshots

Rare-disease gene discovery: In re-analysing T-ALL patient genomes, AlphaGenome linked a non-coding insertion to activation of the TAL1 oncogene, matching years of wet-lab work in a single query. (biotecnika.org)
Synthetic enhancer design: Computational biologists at a Boston biotech are iteratively mutating enhancer scaffolds in silico to achieve neuron-specific expression before ordering gBlocks, cutting synthesis costs by an order of magnitude.
Functional fine-mapping of GWAS hits: An academic lab fed 240 lead SNPs from an obesity GWAS through the API, triaging the list to six loci with plausible chromatin and splice effects for CRISPR follow-up—work that previously consumed an entire PhD rotation.

Getting your hands on the API

The preview is free for non-commercial research. Sign-up requires a Google account and a brief use-case description. Expect rate limiting; DeepMind is still sizing demand. Installation is a single command, pip install alphagenome, after which you supply your API key. Clear tutorials cover variant scoring, visualisation, and ontology navigation. (alphagenomedocs.com)
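Because rate limiting is to be expected, it helps to wrap calls in a retry loop. The sketch below is client-agnostic: `score_fn` stands in for whatever scoring call the official tutorials show, and `RateLimitError` is a hypothetical placeholder for the error your client actually raises when throttled.

```python
import time

class RateLimitError(Exception):
    """Hypothetical error type; substitute whatever your client raises."""

def call_with_backoff(score_fn, item, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call score_fn(item), retrying with exponential backoff on rate limits."""
    for attempt in range(max_retries + 1):
        try:
            return score_fn(item)
        except RateLimitError:
            if attempt == max_retries:
                raise                           # out of retries: surface the error
            sleep(base_delay * 2 ** attempt)    # wait 1 s, 2 s, 4 s, ...

def score_batch(score_fn, items, **kwargs):
    """Score items one at a time, preserving input order."""
    return [call_with_backoff(score_fn, item, **kwargs) for item in items]
```

The `sleep` parameter is injectable so the retry logic can be tested without actually waiting; in normal use the default `time.sleep` applies.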

Caveats worth remembering

Not a clinical tool: The model predicts molecular phenotypes, not disease risk. Environment, developmental timing, and polygenic interactions remain outside scope.
Distance decay: Accuracy wanes beyond ~100 kb for very distal regulatory loops. Interpret long-range predictions with caution. (biotecnika.org)
Human-centric: Training data skews heavily toward human (and some mouse) assays. Cross-species inference is promising but unvalidated.

What’s next?

AlphaGenome feels like the “GPT-3 moment” for regulatory genomics: a foundational model that others will fine-tune, compress, or extend. DeepMind hints at expanding species coverage and adding new modalities—think methylation dynamics or 3D nucleosome positioning. If that materialises, the API could evolve from a query engine into a living atlas of gene regulation.

For now, the message is simple: keep your primers dry a little longer. Run the sequence first, and walk to the bench with sharper questions. Our notebooks just got a potent co-author.
