Short Reads vs Long Reads: What’s the Difference and Why It Matters

Not all sequencing technologies are created equal. If you’ve ever wondered why some people swear by Illumina and others rave about Oxford Nanopore or PacBio, this post is for you. Let’s break down the differences between short and long read sequencing and why you should care.

Summary of Different Technologies

Feature	Illumina	MGI	Element Biosciences	Ultima Genomics	PacBio	ONT
Read Length	50–300 bp	50–300 bp	75–300 bp	225–400 bp	10–25 kb	10 kb – 4 Mb
Accuracy	≥90% of bases above Q30 (99.9%)	≥85% of bases above Q30 (99.9%)	≥90% of bases above Q30 (99.9%)	≥75% of bases above Q30 (99.9%)	≥95% of bases above Q30 (99.9%)	Q20–Q30 (99–99.9%)
Throughput (per flowcell)	Up to 26B reads (7.8 Tb)	Up to 40B reads (12.0 Tb)	Up to 1B reads (300 Gb)	Up to 12B reads (3.0 Tb)	Up to 30 Gb	Up to 290 Gb
Approx. Cost per Gb	~$2–$3	~$1.50–$2	~$5–$7	~$1–$1.50	~$10–$20	~$8–$15
Flagship Instruments	MiSeq, NextSeq, NovaSeq	DNBSEQ-T20x2	AVITI	UG 100	Revio, Sequel II	MinION, PromethION 48
Best For	Population WGS, RNA-seq, WES	High-throughput WGS, RNA-seq	Mid-throughput labs, targeted WGS	Population-scale WGS at ultra-low cost	de novo assembly, structural variants	Ultra-long reads, structural variants
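
The Q-scores in the accuracy row are Phred quality scores, where Q = -10 × log10(probability of a wrong base call). Here is a minimal sketch of that conversion; the function names are illustrative and not taken from any particular library.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def phred_to_accuracy(q: float) -> float:
    """Per-base accuracy implied by Phred score q."""
    return 1.0 - phred_to_error_prob(q)


def error_prob_to_phred(p: float) -> float:
    """Phred score implied by a per-base error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


# Print the accuracy behind the Q-scores quoted in the table.
for q in (20, 30, 40):
    errors_per_base = phred_to_error_prob(q)
    print(f"Q{q}: accuracy {phred_to_accuracy(q):.4%}, "
          f"about 1 error per {round(1 / errors_per_base):,} bases")
```

Each step of 10 on the Phred scale is a tenfold drop in error rate, which is why Q20 versus Q30 matters more than it looks.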

What Are Short Reads?

Short-read sequencing refers to platforms that generate relatively small snippets of DNA, typically 50 to 300 base pairs long. These reads are then aligned or assembled computationally to reconstruct genomes or transcriptomes.

Key Players:

Illumina - The dominant force in short-read sequencing since 2007
MGI - A competitive player out of China offering DNBSeq technology since 2016
Element Biosciences - A US-based newcomer offering high-accuracy short-read sequencing at lower operating costs for mid-throughput labs since 2022
Ultima Genomics - Entered the market in 2022 with a high-throughput short-read sequencer aiming to dramatically lower sequencing costs, with a stated goal of enabling $100 human genomes at scale

Strengths:

Accuracy: Error rates are very low, which makes short reads the gold standard for variant calling and quantification.
Cost-effective: High throughput and low per-base cost make it ideal for large-scale studies like Genome-Wide Association Studies (GWAS) or RNA-seq.
Tool ecosystem: Most software tools are optimized for short-read data.

Weaknesses:

Repetitive regions: Short reads struggle to resolve large structural variations and repetitive elements.
Context loss: Without long reads, it’s hard to understand haplotypes, phasing, and full-length transcripts.
Assembly challenges: Assembling a genome de novo with short reads is like putting together a puzzle with too many identical pieces.
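
To make the repeat problem concrete, here is a toy sketch. The sequences are made up for illustration, and exact string matching stands in for a real aligner. A read that falls entirely inside a repeated element matches every copy, while a longer read that reaches into the unique flanking sequence matches only one place.

```python
def count_exact_hits(genome: str, read: str) -> int:
    """Count every position where the read matches the genome exactly (overlaps allowed)."""
    hits = 0
    start = genome.find(read)
    while start != -1:
        hits += 1
        start = genome.find(read, start + 1)
    return hits


# Hypothetical toy genome: one 20 bp element present twice, separated by unique sequence.
repeat = "ACGTTGCAAGGCTTACCGAT"
genome = "GGATCCTAGA" + repeat + "TTAGGCATCG" + repeat + "CCGATAGCTA"

short_read = repeat[4:16]   # 12 bp, lies entirely inside the repeat
long_read = genome[5:35]    # 30 bp, spans the first repeat copy plus 5 bp of flank on each side

print("short read placements:", count_exact_hits(genome, short_read))
print("long read placements:", count_exact_hits(genome, long_read))
```

The short read is ambiguous because nothing in it distinguishes one copy from the other; the long read is placed uniquely because the flanks act as anchors.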

What Are Long Reads?

Long-read sequencing generates reads that can span tens of thousands (or even millions!) of bases. These are useful for studying structural complexity and building more complete assemblies.

Key Players:

Oxford Nanopore Technologies (ONT) - Known for portability (MinION), real-time sequencing, and extreme read lengths
PacBio - Uses circular consensus sequencing to combine long reads with very high accuracy

Strengths:

Structural resolution: Excellent for detecting insertions, deletions, inversions, translocations
Assembly: Makes genome assembly easier and more contiguous
Transcriptomics: Full-length isoform sequencing (Iso-Seq, cDNA or direct RNA)
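
"More contiguous" is usually reported as N50: the contig length at which contigs of that size or larger cover at least half of the assembly. A minimal sketch follows; the two sets of contig lengths are hypothetical numbers chosen for illustration, not real assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or read) lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths in kb; both assemblies total 200 kb.
fragmented_assembly = [50, 40, 30, 20, 10, 10, 10, 10, 10, 10]
contiguous_assembly = [150, 30, 20]

print("fragmented N50 (kb):", n50(fragmented_assembly))
print("contiguous N50 (kb):", n50(contiguous_assembly))
```

Both toy assemblies contain the same number of bases, but the one with fewer, longer contigs has a much higher N50.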

Weaknesses:

Cost: Higher cost per gigabase (though rapidly decreasing)
Throughput: Lower than short-read platforms for most instruments
Computational overhead: Larger files and different error models call for more compute (basecalling in particular often benefits from a GPU rather than CPU alone) and sometimes custom pipelines

Use Cases in the Wild

Application	Ideal Read Type	Why?
Single Nucleotide Polymorphism (SNP) calling	Short reads	Cheap, accurate, scalable
Differential expression (RNA-seq)	Short reads	High accuracy and precise quantification at low cost
Full-length transcript detection	Long reads	Preserves isoform structure
Bacterial genome sequencing	Long or hybrid	De novo or complete assemblies
Cancer structural variant detection	Long reads	Detect large rearrangements
Metagenomics profiling	Short reads	High throughput, cost-efficient

What About Hybrid Approaches?

One of the most effective strategies today is to combine both short and long reads.

Long reads can scaffold the genome and resolve complex regions
Short reads polish the sequence to improve base-level accuracy

Hybrid sequencing also helps with variant discovery: short reads identify SNPs reliably, while long reads capture structural variants.

Tools like Unicycler, Pilon, or MaSuRCA allow you to integrate both datasets for high-quality assemblies or variant calls.
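
The polishing idea can be sketched as a majority vote. This is a toy under strong assumptions: the short reads are already aligned at known 0-based offsets, and only substitutions are corrected. It is not how Unicycler, Pilon, or MaSuRCA work internally, and the sequences are made up.

```python
from collections import Counter


def polish(draft: str, aligned_reads: list[tuple[int, str]]) -> str:
    """Replace each draft base with the strict-majority base among the reads covering it."""
    votes = [Counter() for _ in draft]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(draft):
                votes[pos][base] += 1

    polished = []
    for draft_base, counter in zip(draft, votes):
        if counter:
            best_base, support = counter.most_common(1)[0]
            # Only overwrite the draft when more than half of the covering reads agree.
            if support > sum(counter.values()) / 2:
                polished.append(best_base)
                continue
        polished.append(draft_base)
    return "".join(polished)


truth = "ACGTACGTTAGC"
draft = "ACGTACCTTAGC"  # hypothetical long-read draft with one substitution error at position 6
short_reads = [(0, "ACGTACGT"), (2, "GTACGTTA"), (4, "ACGTTAGC")]

print("draft:   ", draft)
print("polished:", polish(draft, short_reads))
```

Real polishers also handle insertions, deletions, base qualities, and ambiguous alignments, which is where most of the difficulty lies.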

Things to Consider Before Choosing

Budget: Short-read sequencing remains cheaper per Gb, but ONT is closing the gap on overall cost thanks to its cheaper instruments
Computational skills: Long reads need different QC, alignment, and polishing tools
Downstream needs: Are you calling SNPs or assembling new genomes?
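
For the budget question, a back-of-the-envelope estimate is often enough: gigabases needed = genome size × target coverage, and sequencing cost = gigabases × cost per Gb. The sketch below uses the approximate per-Gb ranges from the summary table; the 3.1 Gb human genome size and the 30× coverage target are assumptions for illustration, and the result ignores library prep, instrument, and labor costs.

```python
# Approximate cost per Gb in USD (low, high), copied from the summary table above.
COST_PER_GB_USD = {
    "Illumina": (2.0, 3.0),
    "MGI": (1.5, 2.0),
    "Element Biosciences": (5.0, 7.0),
    "Ultima Genomics": (1.0, 1.5),
    "PacBio": (10.0, 20.0),
    "ONT": (8.0, 15.0),
}


def required_gb(genome_size_gb: float, coverage: float) -> float:
    """Gigabases of sequence needed to reach the target average coverage."""
    return genome_size_gb * coverage


def cost_range(platform: str, genome_size_gb: float, coverage: float) -> tuple[float, float]:
    """Low and high sequencing-cost estimate in USD for one sample."""
    low, high = COST_PER_GB_USD[platform]
    gb = required_gb(genome_size_gb, coverage)
    return gb * low, gb * high


# Assumed example: a human genome (~3.1 Gb) at 30x coverage.
for platform in COST_PER_GB_USD:
    low, high = cost_range(platform, genome_size_gb=3.1, coverage=30)
    print(f"{platform}: ${low:,.0f} to ${high:,.0f}")
```

Swapping in your own genome size and coverage target gives a quick first-pass comparison before you ask for quotes.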

If you’re just getting started in genomics, short reads might be more accessible, but don’t underestimate the power of long reads for solving complex questions.

How Do These Technologies Actually Work?

Each sequencing platform takes a different approach to reading DNA. Here’s a breakdown of how the six major players do it:

Illumina - Sequencing by Synthesis (SBS)

Illumina uses sequencing by synthesis, where fluorescently labeled nucleotides are added one base at a time. As each base is incorporated, a signal is recorded, allowing the instrument to “read” the DNA.

Strengths: Accuracy, throughput, and cost efficiency
Limitations: Short reads (typically 150-300 bp), limited structural context

MGI - DNA Nanoball Sequencing

MGI (from BGI) uses DNBSeq, which involves amplifying DNA into nanoballs and sequencing them via combinatorial probe-anchor synthesis (cPAS).

Strengths: Low duplication rates, reduced index hopping, high output
Limitations: Still short-read technology, with ecosystem locked to MGI software/hardware

Element Biosciences - Avidity Sequencing

Element uses Avidity Sequencing, a twist on sequencing by synthesis that separates the steps of nucleotide incorporation and signal detection. It uses “avidites”, multivalent binding complexes, to improve accuracy and reduce reagent use.

Strengths: Very high accuracy (Q40+), lower reagent costs, flexible throughput for mid-scale labs
Limitations: Currently limited to short reads (up to 300 bp) and still building a large install base

Ultima Genomics - Open Substrate SBS

Ultima’s platform uses an open substrate and continuous sequencing-by-synthesis chemistry, designed to massively scale throughput while cutting reagent costs. The system runs on large circular wafers rather than flow cells.

Strengths: Extremely low projected cost per genome (goal: $100 WGS at scale), high throughput per run
Limitations: Still short reads (~300-400 bp), limited public performance data as adoption ramps up

Oxford Nanopore - Electrical Signal Detection

ONT devices pass single-stranded DNA through a nanopore and detect changes in ionic current, which correspond to different base sequences.

Strengths: Ultra-long reads, real-time sequencing, portable devices (like MinION)
Limitations: Historically lower accuracy, sensitive to base modifications and homopolymer runs
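
One way to see why homopolymer runs are hard for a signal-based method is this toy sketch. It assumes the current level depends on the k bases sitting in the pore at a given moment; k = 5 here is an illustrative assumption, not a specification of any ONT pore. It counts how often that k-mer changes as the strand advances one base at a time.

```python
def kmer_stream(seq: str, k: int = 5) -> list[str]:
    """Successive k-mers seen as the strand moves through the pore one base at a time."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def signal_changes(seq: str, k: int = 5) -> int:
    """Number of single-base steps at which the k-mer in the pore actually changes."""
    kmers = kmer_stream(seq, k)
    return sum(1 for current, following in zip(kmers, kmers[1:]) if current != following)


mixed = "ACGTACGTAC"          # no long homopolymer
homopolymer = "ACAAAAAAAAGT"  # contains a run of eight A's

for name, seq in (("mixed", mixed), ("homopolymer", homopolymer)):
    steps = len(kmer_stream(seq)) - 1
    print(f"{name}: {signal_changes(seq)} k-mer changes over {steps} steps")
```

Inside the run, several steps leave the k-mer unchanged, so in this simplified picture the signal stays flat and the run length has to be inferred from how long the flat stretch lasts rather than from distinct changes in current.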

PacBio - Single Molecule Real-Time (SMRT) Sequencing

PacBio’s HiFi technology uses circular consensus sequencing: a polymerase reads a circularized template multiple times to generate a consensus with very high accuracy.

Strengths: Long reads with Illumina-like accuracy (>99.9%)
Limitations: Lower throughput compared to short-read platforms, higher cost per Gb
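
The intuition behind circular consensus can be sketched with a simple majority-vote model. It assumes each pass over a base is wrong independently with the same probability, and that the consensus is wrong whenever more than half of the passes are wrong. Real HiFi consensus calling is more sophisticated (it also has to deal with insertions and deletions), and the 10% per-pass error used below is a hypothetical figure, not a PacBio specification.

```python
from math import comb, log10


def majority_error(per_pass_error: float, passes: int) -> float:
    """Probability that more than half of an odd number of independent passes are wrong."""
    if passes < 1 or passes % 2 == 0:
        raise ValueError("use an odd, positive number of passes to avoid ties")
    needed = passes // 2 + 1
    p = per_pass_error
    return sum(
        comb(passes, k) * p**k * (1 - p) ** (passes - k)
        for k in range(needed, passes + 1)
    )


# Hypothetical per-pass error of 10%; watch the consensus error fall as passes accumulate.
for passes in (1, 3, 5, 9, 15):
    err = majority_error(0.10, passes)
    print(f"{passes:>2} passes: consensus error {err:.2e} (about Q{-10 * log10(err):.0f})")
```

Even under this crude model, a noisy single pass becomes a highly accurate consensus after enough passes, which is the trade HiFi makes: extra sequencing time on each molecule in exchange for accuracy.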

What About Roche?

Roche has re-entered the sequencing game in a big way, investing in technologies that aim to combine the strengths of both short and long reads. Roche has announced that in 2026 it plans to launch a short-read platform using its new Sequencing by Expansion (SBX) chemistry. This technology is designed to deliver high-accuracy (Q40+) short reads with faster run times, potentially reshaping the clinical sequencing market.

While full specifications aren’t public yet, Roche is positioning SBX as a complement to its long-read HiFi offerings through AVENIO. If SBX delivers on its promises, it could become a serious contender in both research and clinical genomics, offering an alternative to Illumina, MGI, Element, and Ultima in the short-read space. For now, it’s one to watch, and we’ll be keeping an eye out for its performance data when it hits the market.

What’s Coming Next?

This post kicks off a mini-series on reference-based vs de novo assembly strategies, where we’ll look at:

When and how to use short, long, or hybrid approaches in practice
How reference-based methods align to existing genomes
Why de novo assemblies are important (and hard)

Stay tuned for the next post, and if you’ve got questions about your own dataset, feel free to drop them in the comments.

Mario F. Bisconti
