A Visual Guide to DNA Sequencing

Ella Watkins-Dulaney for Asimov Press.

When the Human Genome Project (HGP) released its initial draft sequence in 2000, President Bill Clinton hailed it as “the most wondrous map ever produced by mankind.” After more than ten years of work, an estimated $3 billion in research costs, and a “genome war” with Craig Venter’s private company, Celera Genomics, the project had produced a nearly complete sequence of a human genome.1

UK Prime Minister Tony Blair predicted that this map would yield “a revolution in medical science whose implications far surpass even the discovery of antibiotics.” (Whether this claim turned out to be true is debatable.) A few months later, the two teams — from HGP and Celera — published cover stories in Nature and Science, respectively.

Although the quest to sequence a human genome began in 1990, the techniques it used had already been in development for more than twenty years. And those DNA sequencing methods, in turn, were directly inspired by protein and RNA sequencing research stretching all the way back to the 1940s.

In the twenty years after the draft human genome was first released, the average sequencing cost per genome fell roughly one hundred thousand-fold, ending up just north of $500. In that same period, the cost to sequence a million letters or “megabase” of DNA fell to six tenths of a cent.2 This plummeting price is due largely to technological innovation, including new sequencing chemistries, computational methods for assembling raw reads into finished genomes, and highly efficient commercial sequencing machines.
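The gap between the per-megabase and per-genome figures can be checked with quick arithmetic. The sketch below assumes a genome of 3,000 megabases and a hypothetical 30-fold depth of coverage; the coverage value, function name, and parameters are illustrative assumptions rather than figures from the cost data.

```python
def genome_cost(cost_per_megabase, genome_megabases=3_000, coverage=30):
    """Raw sequencing cost in dollars for one genome.

    coverage is the number of times each base is read on average;
    30 is an assumed depth, not a number taken from the article.
    """
    return cost_per_megabase * genome_megabases * coverage


single_pass = genome_cost(0.006, coverage=1)   # one pass over the genome
deep = genome_cost(0.006, coverage=30)         # thirty passes
print(f"1x: ${single_pass:.2f}, 30x: ${deep:.2f}")
```

Under these assumptions, a single pass costs about $18 and thirty passes cost about $540, which suggests why a finished genome costs far more than three thousand megabases of raw sequence.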

Out of the many sequencing methods developed over the decades, five are particularly important. These are their histories.


Sanger Sequencing

Fred Sanger was biology’s great decoder. A British biochemist who spent his entire career at the University of Cambridge, Sanger earned two Nobel Prizes in the same field: first, the 1958 Nobel Prize in Chemistry for creating a method to determine the amino acid sequence of proteins (most famously insulin) and, second, a share of the 1980 Nobel Prize in Chemistry for inventing methods to sequence DNA.

After winning his first Nobel, Sanger turned his gaze to RNA, seeking to become the first person to sequence a full strand. He was beaten by Cornell biochemist Robert Holley, however, who reported the full 77-nucleotide sequence of the alanine transfer RNA molecule in 1965.3

Although many scientists today assume that Sanger was the first to figure out how to sequence DNA, that’s not the case. As with RNA, Sanger was edged out by a Cornell biochemist. This time it was Ray Wu, who, in 1970, published a method to “read” specific sections of two bacterial virus genomes, called λ and bacteriophage 186. Wu’s method was only capable of sequencing “cohesive ends,” short single-stranded sections of these particular phage genomes, and so wasn’t considered a “general” solution to the DNA sequencing problem. In 1974, Wu’s lab refined this technique into the first general sequencing method, but it proved extremely labor-intensive and failed to catch on.

Output from Ray Wu’s 2-D homochromatography method. Credit: Jay E. et al. Nucleic Acids Research (1974).

In 1975, Sanger and laboratory technician Alan Coulson published their own DNA sequencing method, called the “plus and minus” technique. First, scientists mixed the DNA strand to be sequenced with an enzyme, DNA polymerase, as well as a primer, three normal dNTPs and one radiolabeled dNTP. Radiolabeled nucleotides are incorporated into growing DNA strands just like normal nucleotides, but are tagged with radioactive isotopes, such as phosphorus-32 or sulfur-35, so they can be detected using radiation-measuring equipment.

This reaction would contain only low concentrations of the dNTPs and relied upon brief incubation times, so that DNA synthesis would stall at random positions along the template and yield a population of DNA fragments with varying lengths. These unfinished DNA fragments were then purified and used as templates in four “minus” and four “plus” reactions.

For each minus reaction, the purified DNA fragments were incubated together with three of the four dNTPs, meaning each fragment would be extended by DNA polymerase until the missing nucleotide was needed, at which point synthesis would halt.

Output from Sanger and Coulson’s Plus-Minus method. Credit: Sanger & Coulson, J. Mol. Biol. (1975).

Plus reactions worked differently: they used T4 DNA polymerase, an enzyme with strong 3’ to 5’ exonuclease activity, meaning it can chew back the end of a DNA strand. In the presence of only one dNTP, T4 DNA polymerase would degrade each fragment from its 3’ end until it reached a nucleotide complementary to that dNTP, at which point the exonuclease activity would be inhibited. This ensured that all fragments in a given plus reaction ended with the same nucleotide.

Since the eight reactions were run on fragments of random length, the eight plus and minus reactions collectively produced DNA fragments of all possible lengths.4 The fragments in these eight reactions were separated by size (using gel electrophoresis) and then imaged with autoradiography. Gels were dried and then placed against X-ray film, allowing the radioactive DNA fragments to expose the film and appear as dark bands, which a scientist could then painstakingly translate into the DNA sequence. In 1977, Sanger and colleagues sequenced the first full DNA genome using this method: a small bacterial virus with 5,386 nucleotides in its genome, called ɸX174 or “PhiX.”

Later that year, Sanger developed a much simpler sequencing method, called “chain termination,” which is today known simply as Sanger sequencing. This technique took advantage of a different type of special nucleotide called a dideoxyribonucleotide, or ddNTP. ddNTPs lack the 3’ hydroxyl group present on a normal dNTP, preventing the chemical reaction necessary to add another nucleotide and terminating DNA elongation.

Sanger sequencing reaction mixtures included purified template DNA, a primer, DNA polymerase, and all four dNTPs. Each reaction also included a radiolabeled ddNTP version of just one of the four nucleotides. Only a small amount of each ddNTP was added, however, to ensure that a fraction of the total DNA fragments produced stopped at each occurrence of that base. As with previous methods, separating fragments via length and performing autoradiography allowed scientists to read the final sequence.
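The logic of reading a chain-termination gel can be sketched in a few lines. This is a simplified model rather than a description of any real protocol: it assumes every possible termination product is present and perfectly resolved, and the strand `GATTACA` and both function names are hypothetical.

```python
def chain_termination_fragments(synthesized, base):
    """Lengths of fragments that end where `base` was added as a ddNTP."""
    return [i + 1 for i, b in enumerate(synthesized) if b == base]


def read_gel(lanes):
    """Read the sequence by ordering bands from shortest to longest."""
    bands = [(length, base)
             for base, lengths in lanes.items()
             for length in lengths]
    return "".join(base for _, base in sorted(bands))


strand = "GATTACA"  # hypothetical newly synthesized strand
# One reaction (gel lane) per ddNTP, each holding its fragment lengths.
lanes = {base: chain_termination_fragments(strand, base) for base in "ACGT"}
print(lanes)
print(read_gel(lanes))
```

Each lane lists the fragment lengths at which its base terminated the chain; sorting all bands by length recovers the strand one position at a time.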

A different sequencing method that chemically cleaved DNA at specific bases, developed by Allan Maxam and Walter Gilbert, was the dominant technology into the 1980s. Radiolabeled DNA samples were incubated in four separate reactions, each of which contained a chemical that cleaved after a different nucleotide — either A/G, G, C, or C/T. By adding the right amount of each chemical, it was possible to produce different fragments chopped off at each individual base. The sequence could then be read using gel electrophoresis and autoradiography. Maxam–Gilbert sequencing was easier than the plus and minus method to run and interpret, but was eventually surpassed by Sanger’s chain termination method, which molecular biologists found both technically preferable and more “elegant” since it mirrored the natural copying of DNA.

Output from the Sanger Sequencing method, with chain-terminating inhibitors. Credit: Sanger et al. PNAS (1977).

While Sanger sequencing was highly accurate and less labor-intensive than its predecessors, it still required the use of radioactive reagents and manual sequence recording. In 1986, Leroy Hood’s lab at Caltech replaced the radiolabeled ddNTPs with fluorescently labeled nucleotides, using fluorophores that emitted different colors of light for each base. They were now able to run the products of all four reactions on the same gel and have a computer read the sequence by detecting the color of each fluorescent signal as fragments passed through a laser beam.

The first commercial Sanger sequencing machine was produced that year by Applied Biosystems (ABI), which Hood had co-founded in 1981. Called the ABI 370A, it retailed for $92,500. Since Sanger never patented his method, other companies were free to develop competing products, and by 1988, there were three Sanger sequencing machines on the market. These were followed by numerous others, including the Perkin-Elmer 3700, used by Celera and the Human Genome Project, and the ABI 3500 Genetic Analyzer, which is still found in many laboratories today.

ABI 370A Sanger sequencing prototype. Source: Science Museum

454 Pyrosequencing

By the time Sanger sequencing was commercialized, the groundwork for an entirely new sequencing chemistry was already well underway. In 1985, Swedish biochemists Pål Nyren and Arne Lundin published a paper illustrating a procedure that measured the concentration of a molecule, called pyrophosphate (PPi), using an enzymatic cascade that emits light. In early 1986, Nyren realized that the method he’d helped develop could be applied to DNA sequencing, because PPi is naturally produced as a byproduct of DNA synthesis.5

Funding limitations prevented Nyren from dedicating much time to the project at first, but in 1993, he was finally able to publish a proof-of-principle. His technique began by mixing the template DNA with a primer, a single dNTP, and three enzymes: the familiar DNA polymerase plus the light cascade enzymes, ATP sulfurylase and firefly luciferase.6 If the dNTP was incorporated into a strand of DNA, PPi would be produced in the chemical reaction. ATP sulfurylase could then convert the PPi into ATP, which would provide energy for the luciferase enzyme, producing light. Thus, it was possible to determine each base in the sequence by adding the dNTPs one at a time, washing out leftover nucleotides between steps, and noting which dNTP produced light. By literally rinsing and repeating, the sequence could be recorded one letter at a time without the use of any gels, which often took hours to run and were difficult to automate.
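The rinse-and-repeat cycle can be illustrated with a small simulation. The sketch below makes the idealized assumption that the light emitted in each step scales exactly with the number of bases incorporated; the flow order, the target strand, and the function names are hypothetical.

```python
def pyrosequence(target, flow_order="ACGT", max_flows=100):
    """Simulate dNTP flows; return (dNTP, light_units) for each flow.

    `target` is the strand being synthesized. A flow with no
    incorporation produces zero light.
    """
    flowgram, pos = [], 0
    for i in range(max_flows):
        if pos >= len(target):
            break
        base = flow_order[i % len(flow_order)]
        count = 0
        # A run of identical bases is incorporated within a single flow.
        while pos < len(target) and target[pos] == base:
            count += 1
            pos += 1
        flowgram.append((base, count))
    return flowgram


def decode(flowgram):
    """Rebuild the sequence from the light intensity of each flow."""
    return "".join(base * light for base, light in flowgram)


flows = pyrosequence("GGATC")
print(flows)
print(decode(flows))
```

In this idealized model, the two consecutive G bases appear as a single flow with twice the light, so decoding depends on measuring intensity and not just the presence of a flash.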

Nyren’s sequencing method earned the name “pyrosequencing” since it revolved around the production of pyrophosphate. At first, pyrosequencing could sequence only short DNA snippets of a few nucleotides. In 1996, however, Nyren’s lab demonstrated sequencing of up to 15 bases by using a modified “A” nucleotide to improve their signal-to-noise ratio.7 In 1998, they increased this to 34 bases by adding another enzyme, called apyrase, to the mix; apyrase degraded unincorporated nucleotides, removing the need for constant wash steps.

The year before, Nyren’s lab had also spun off a company, Pyrosequencing AB, to refine and commercialize the technology. Pyrosequencing AB was not the firm that would bring the technology to market, however; that distinction went to Connecticut-based 454 Life Sciences, which licensed whole-genome pyrosequencing applications in 2003. 454 made chips which enabled highly efficient, parallelized sequencing reactions and released the GS20 sequencer in 2005 for the price of $500,000. The GS20 worked by attaching each individual DNA template molecule to a bead and copying it many times using polymerase chain reaction (PCR). Each bead was then loaded into a well in a microplate, where sequencing reactions would be carried out. The light from luciferase activation could be detected through the bottom of the wells, enabling sequences to be read.

Pyrosequencing wasn’t developed early enough to be employed by the Human Genome Project or Celera, but it was still the first method other than Sanger sequencing to hit the commercial market, marking the start of “next generation” sequencing methods (NGS). Pyrosequencing worked in real-time, though it struggled to accurately capture regions with several of the same nucleotides in a row. This was because the amount of light didn’t always scale cleanly when pyrophosphate was produced through successive reactions.

In 2006, 454 collaborated with Swedish paleogeneticist Svante Pääbo to sequence the first million base pairs of the Neanderthal genome; the project would be completed four years later, albeit with some help from Illumina sequencing. Illumina and other subsequent NGS technologies rendered pyrosequencing non-competitive, and in 2013, 454 was shut down by Roche, which had acquired it six years earlier. The technology is still used today for some applications, but most importantly, it was the first commercially viable alternative to Sanger sequencing, and the first sequencing method that could be fully automated because it didn’t rely on gels or other tedious steps.

A 454 Life Sciences sequencer. Credit: National Museum of American History

Sequencing by Synthesis

In the mid-1990s, University of Cambridge biochemists David Klenerman and Shankar Balasubramanian were trying to solve a fundamental problem: how to watch a single DNA polymerase molecule at work. Their approach used modified nucleotides, called reversible terminators, tagged with four different colors of fluorescent molecules. If one of these “terminators” was grabbed by the DNA polymerase and incorporated onto the replicating DNA strand, it would block the addition of any other bases until removed using a separate chemical reaction.

Klenerman and Balasubramanian’s great insight was that template DNA could be sequenced by synthesizing a complementary strand of reversible terminators; basically, extending the chain one base at a time and determining the identity of each nucleotide by looking at the color of its fluorophore. In 1998, the pair started a company called Solexa to develop the technology.

Detecting fluorescence from a single DNA molecule proved difficult in practice, however. And so, in 2004, Solexa acquired the IP rights to a method called colony sequencing, developed by French scientists Pascal Mayer and Laurent Farinelli, to solve the detection problem. Colony sequencing affixed DNA fragments to a surface and amplified them over and over, generating “colonies” containing massive numbers of identical DNA strands. By reading the fluorescence from each strand in a colony simultaneously, it became possible to determine the base added at each step with much better accuracy, since random errors in individual strands would be averaged out by the consensus signal.
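The averaging effect of a colony can be sketched as a majority vote. This is a simplification: a real instrument measures the combined fluorescence of a cluster rather than counting strands individually, and the per-strand calls and function names below are hypothetical.

```python
from collections import Counter


def consensus_base(calls):
    """Most common base call among the strands in one colony."""
    return Counter(calls).most_common(1)[0][0]


def read_colony(cycles):
    """Call one base per sequencing cycle from the colony's signals."""
    return "".join(consensus_base(calls) for calls in cycles)


# Each inner list holds the calls from five strands in a single cycle.
cycles = [
    ["A", "A", "C", "A", "A"],  # one strand misreads the first base
    ["G", "G", "G", "G", "T"],  # a different strand misreads the second
    ["T", "T", "T", "T", "T"],
]
print(read_colony(cycles))
```

Because the stray calls fall on different strands in different cycles, none of them outweighs the majority, and the colony as a whole reads correctly.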

Now that single-molecule detection was no longer necessary, Solexa was able to develop its signature sequencing chemistry. The process takes place on a chip called a flow cell, which contains a lawn of short DNA sequences affixed to its surface. The template DNA is broken up into small fragments, and adapter sequences, complementary to the DNA on the flow cell’s surface, are added to the ends of each fragment. DNA fragments are then passed over the flow cell, where the adapter sequences bind to spots on the DNA lawn. At this point, primers are added, and an initial round of amplification takes place: the short DNA sequences on the flow cell are extended to create sequences complementary to the bound template DNA fragments, which are then washed away. The sequences present in the fragments of template DNA are now affixed directly to the flow cell.

At this point, each bound sequence exists as a single copy, which produces too faint a signal to detect reliably. Colony sequencing solves this by generating clusters of identical fragments through a process called bridge amplification. The adapter on the free end of each DNA strand will be complementary to some of the original, short sequences on the DNA lawn, and when this binding occurs, the strand bends over to form a bridge shape. Another round of amplification takes place, resulting in two complementary strands each directly affixed to the flow cell. This cycle is repeated over and over until each original fragment has grown into a dense cluster of copies.

From here, the actual sequencing can begin. Primers and fluorescently-labeled chain terminators are added to the reaction mixture, resulting in the addition of one nucleotide to each strand of DNA on the lawn. A picture is taken of the entire chip, then the chain terminators’ blockers are cleaved to allow addition of the next base. This process proceeds until the reaction is complete, resulting in massively parallelized sequencing. The short reads acquired through this process can be combined via a computational technique called paired end analysis, which links reads by analyzing overlapping sections, to generate the whole sequence.
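The idea of linking short reads through their overlapping sections can be sketched with a generic greedy merge. This is a textbook-style illustration and not Illumina's actual pipeline; real assemblers also handle sequencing errors, repeats, and reverse complements. The reads, the minimum overlap of three bases, and the function names are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no overlaps remain; leave the pieces separate
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads


reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))
```

With these three reads, each shares four bases with the next, so the merge yields a single 13-base sequence.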

Solexa’s first product, the Genome Analyzer, launched in 2006 with a retail price of $400,000, and the company was acquired by the American genomics firm Illumina the following year. In 2008, the company published a paper demonstrating their technology’s ability to efficiently sequence whole genomes via short reads. Illumina’s method is commonly known as “sequencing by synthesis.” While the label could technically be applied to other methods, including Sanger’s, which also indirectly assesses sequence by detecting the incorporation of nucleotides complementary to the template strand, it’s most commonly used to refer to Illumina’s chemistry.

Since the release of Solexa’s Genome Analyzer, Illumina has created several new sequencing machines designed to fill different price niches. Illumina’s short reads are highly accurate, and the technique has played a crucial role in reducing average sequencing costs.

An Illumina Genome Analyzer II, ca. 2007. Credit: Jon Callas

Unsurprisingly, Illumina has become by far the most common NGS method, maintaining roughly an 80 percent market share over the last few years. This is largely owing to its versatility. Illumina sequencing has been used to create new reference genomes, including that of the common tomato, but has been especially useful in cases requiring repeated sequencing of short DNA sequences. For example, Illumina machines are routinely used to quantify the activity of genome editors like CRISPR; template DNA will either be edited or unedited, and reading the area around the edit many times provides an accurate quantification of editing percentages. Similarly, large numbers of short reads are useful for sequencing ancient DNA, taken from bones or other remains, since such samples often have degraded stretches. In addition to its role in the Neanderthal Genome Project, Illumina has been used to sequence 10,000-year-old human remains and to track migration and population turnover in Neolithic Denmark.
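Quantifying an edit from many short reads amounts to counting. The sketch below assumes each read covers the target site and contains either the unedited or the edited version of a short marker sequence; the marker sequences, reads, and function name are hypothetical, and real analyses align reads before counting.

```python
def editing_percentage(reads, unedited_site, edited_site):
    """Percentage of reads carrying the edited marker sequence.

    Reads containing neither marker are ignored.
    """
    edited = sum(edited_site in read for read in reads)
    unedited = sum(unedited_site in read for read in reads)
    total = edited + unedited
    return 100.0 * edited / total if total else 0.0


# Hypothetical data: a single C-to-T change inside the marker.
reads = ["AAGGACTTCC"] * 3 + ["AAGGATTTCC"] * 7
print(editing_percentage(reads, unedited_site="GGACTT", edited_site="GGATTT"))
```

With three unedited and seven edited reads, the estimate is 70 percent; the more reads that cover the site, the tighter that estimate becomes.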

PacBio SMRT Sequencing

While Illumina ultimately opted for a method that simultaneously detected massive numbers of DNA strands, others still believed that single-molecule sequencing methods offered a better path forward. Sequencing by synthesis requires repeated amplification, which introduces the possibility of error at each step and biases outputs towards sequences readily amplified by DNA polymerase. Single-molecule techniques were billed as a way to sequence DNA with minimal bias while simultaneously reducing cost.

The first such method was developed in biophysicist Steve Quake’s lab at Caltech and commercialized by Helicos Biosciences, but became unavailable after the company declared bankruptcy, saddled by legal issues and unable to find a market niche. These days, the canonical technique comes from a company called Pacific Biosciences (PacBio). Scientists often refer to single-molecule techniques as “third-generation” DNA sequencing, though they’re often also lumped into the NGS bucket with Illumina.

PacBio was founded in 2004 to develop sequencing methods based on work done in the labs of biophysicist Watt Webb and engineer Harold Craighead, both at Cornell University. The previous year, the two had collaborated to create zero-mode waveguides (ZMWs): small containers just big enough to hold a single DNA polymerase and containing tiny holes at the bottom through which light could be detected. They were able to fix a DNA polymerase to the bottom of a ZMW and detect the incorporation of individual fluorescent “C” nucleotides through the holes, which fed into a microscope capable of detecting light emissions.

A PacBio RSII machine, ca. 2013. Credit: Konrad Förstner

In 2009, PacBio published a paper expanding the principle into a full-blown sequencing technique. Once again, each nucleotide was labeled with a different colored fluorophore detectable by the ZMW to determine which base had been incorporated. The fluorophores were attached such that they would be cleaved off during the chemical reaction incorporating the base into the growing DNA strand; they would then diffuse out of the ZMW so that the next fluorophore could be detected. Sequencing took place on a chip with many wells simultaneously — a different type of parallelization where each well detected a single DNA molecule undergoing the same basic chemical reaction.

The next year, PacBio developed a new method to allow multiple sequencing passes on the same DNA molecule. Double-stranded DNA templates were ligated to two single-stranded adapters, creating what the company called a “SMRTbell template.” Sequencing began at a primer on one of the adapters and could proceed multiple times per molecule due to circularization, in a process called rolling-circle amplification. This helped reduce PacBio’s error rates significantly.
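A rough calculation shows why repeated passes help. The sketch below assumes that errors in each pass are independent and that a base is miscalled only when more than half of the passes are wrong. Both are simplifications, and the 10 percent per-pass error rate is a hypothetical value chosen for illustration, not a PacBio specification.

```python
from math import comb


def majority_error(p, passes):
    """Probability that more than half of `passes` reads are wrong.

    p is the assumed per-pass error rate at a single position.
    """
    need = passes // 2 + 1
    return sum(
        comb(passes, k) * p**k * (1 - p) ** (passes - k)
        for k in range(need, passes + 1)
    )


for n in (1, 3, 5, 9):
    print(n, majority_error(0.10, n))
```

Under these assumptions, five passes bring a 10 percent per-pass error rate down to under 1 percent at each position, and further passes reduce it again.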

With its core technology in place, PacBio was ready to go commercial. In 2011, the company released the RS sequencing machine, and has since created multiple new machines containing chips with increased numbers of sequencing wells. PacBio calls the technique single molecule real time (SMRT) sequencing, though it’s colloquially referred to simply as PacBio sequencing. Rather than producing short overlapping reads like Illumina, PacBio generates very long reads; at first these were a few thousand bases, but today they can be well over 10,000.

PacBio’s ability to produce extremely long reads makes it a useful complement to Illumina. Indeed, PacBio machines are better at sequencing “confusing” genomes, such as those with many copies of the same gene, long stretches of repetitive motifs, and “structural variations” like large insertions or deletions, which may not show up in short-read sequences. For instance, PacBio was used to sequence a very difficult bacterium called Clostridium autoethanogenum, which contains repeats, nine copies of a single gene, and insertions from bacterial virus genomes — basically the genomic equivalent of a Thomas Pynchon novel.

Nanopore Sequencing

Nanopore is the most recently commercialized major sequencing method, collectively developed by several groups starting in the 1990s. A nanopore is a protein or lipid with a small hole in its center through which other materials, such as DNA, can pass. The first nanopore used for sequencing was ⍺-hemolysin, a protein toxin from the bacterium Staphylococcus aureus, though other biological and synthetic nanopores have since been tested.

In 1996, David Deamer and Daniel Branton’s labs at UC Santa Cruz and Harvard collaborated on a paper showing that when an electric current runs through a nanopore, passing purine (A and G) and pyrimidine (T and C) DNA bases through the nanopore disrupted the current to different degrees. While the technique couldn’t yet discriminate between all four bases, the general idea for a new single-molecule sequencing method was there.

In 2001, Hagan Bayley’s lab at Texas A&M demonstrated a limited sequencing method based on the observation that correctly and incorrectly paired DNA bases disrupted nanopore current to different extents. They tethered a short piece of DNA with a few unknown bases to the entrance of the nanopore, then added other short DNA strands with different bases at the position corresponding to the unknown base on the tethered strand. By looking at which base produced the disruption corresponding to a perfect match, they could guess the unknown nucleotide.

In order to directly assess DNA strands going through the nanopore, two major problems needed solving. The first was that DNA moved too fast to reliably detect; the second was that individual bases still could not be differentiated, just purines and pyrimidines. In 2005, Bayley (who by then had moved to Oxford) made progress on the first issue, working with scientists at the Scripps Institute to slow the template DNA down by adding short “hairpin” structures that partially blocked off the pore. That year, Bayley co-founded Oxford Nanopore Technologies (ONT) to develop the emerging sequencing method. ONT quickly brought together various technologies, licensing IP from the labs of Bayley, Deamer, Branton, and others.

Deamer’s original nanopore sketch, ca. 1989. Credit: Oxford Nanopore

In 2010, ONT combined two technologies addressing each of the main outstanding problems. The first was an engineered nanopore developed in collaboration with Bayley’s lab that could discriminate between individual DNA bases, solving the resolution issue. The second was a trick to slow the DNA down to detectable speeds, using the familiar DNA polymerase enzyme. Mark Akeson’s lab at UC Santa Cruz had identified a specific polymerase from the bacterial virus ɸ29 that replicated DNA at an ideal speed for detection via nanopore. Template DNA strands were replicated just before entering the nanopore, passing through slowly enough for individual bases’ effect on the electrical current to be detectable and allowing the DNA sequence to be read one base at a time.

By 2012, ONT had unveiled its first sequencing data, and Nanopore sequencing quickly established itself as a fast method for generating long reads without DNA synthesis, albeit with a higher error rate than some earlier methods. (Nanopore reads had an accuracy of about 85 to 90 percent per base in 2017, compared to over 99 percent for Illumina. Recent improvements, though, have boosted this accuracy to more than 99 percent for most applications.)

ONT released its first commercial product in 2015: a handheld machine called the MinION that retailed for just $1,000, a fraction of the price of most sequencers. Subsequent releases include more traditional benchtop sequencers such as the PromethION and GridION.

The MinION Nanopore. Credit: Oxford Nanopore

Conclusion

In its early days, sequencing was a laborious (and literally radioactive) biochemical process. Today, sequencing machines are ubiquitous, safe, and much less labor-intensive. This evolution was enabled not just by advances in biochemistry but also by insights from biophysics and materials science, as well as manufacturing ingenuity that turned lab sequencing setups into machines ready for shipping to customers.

Ultimately, DNA sequencing technology extends beyond these five techniques, but they represent the most transformative and widely adopted methods of the past fifty years. Together, they have enabled physicians to identify disease-causing variants in patients, allowed researchers to sequence entire microbial communities from ocean water or human guts, and opened windows into deep time by recovering genomes from Neanderthals and early humans.

New sequencing methodologies are still under development, too. In 2025, Roche announced a new single-molecule technique called Sequencing by Expansion, which inserts large engineered molecules called ‘Xpandomers’ between nucleotides for more accurate detection via nanopore. Both new techniques and refinements to existing methods are aimed at further decreasing the cost of sequencing, with some groups looking to read an entire human genome for $100 or less. Ultima Genomics met this target with its UG100 sequencing machine, unveiled in 2022 and shipped in 2024. Element Biosciences’ VITARI system, announced in February and expected to ship in the second half of 2026, achieved the same price point with a smaller device. The $100 price tag advertised by these companies includes only the consumables used by the machine itself, excluding labor, data analysis, and other costs.

Anyone able to approach this target stands to benefit tremendously, given the obvious demand for DNA sequencing. For example, recent years have seen the proliferation of cohort studies focused on clinical analyses of whole-genome sequencing data. These include the Stanford ELITE study, which is focused on identifying genetic determinants of aerobic capacity, and the NIH’s All of Us Research Program, which has sequenced well over 200,000 genomes in order to study genetic diseases.

Innovation in DNA sequencing will surely continue, but these five techniques have already transformed a feat that was impossible just fifty years ago into something that can be done overnight.

Correction: An earlier version of this article incorrectly claimed that Frederick Sanger is the only individual to receive two Nobel Prizes in the same field. We apologize for the error.


Evan DeTurk is an MPhil student at Cambridge in the history of science. He writes about biology and its history on Substack. Previously, Evan researched genome editing at UC Berkeley and earned an A.B. in Molecular Biology from Princeton.

Schematics created by Ella Watkins-Dulaney.

Cite: DeTurk, E. “A Visual Guide to DNA Sequencing.” Asimov Press (2026). DOI: 10.62211/58ew-79yt

1. While both projects sequenced DNA from multiple anonymous donors, Celera had mostly used Venter’s own DNA.

2. The human genome is three billion base pairs long but producing a correct sequence requires sequencing the whole thing many times over and computationally assembling all of those reads. For this reason, sequencing a human genome is much more expensive than sequencing three thousand megabases of raw DNA.

3. Holley’s method was distinct from Sanger’s. He began by isolating the alanine tRNA molecules and then cutting them into shorter pieces of unequal lengths, using enzymes. Then, he separated each fragment by size using column chromatography and ran various chemical techniques to figure out the sequence of each piece. Finally, he “aligned” these fragments by finding overlaps between them, thus reconstructing the full, 77-nucleotide strand.

4. At least, up until the maximum length created by the first DNA polymerase reaction.

5. DNA polymerase adds a nucleotide triphosphate to the growing DNA strand by promoting a cleavage between the phosphates. One of the three phosphates becomes part of the DNA backbone, and the other two are released as PPi.

6. Firefly luciferase is the enzyme found in the firefly abdomen that gives them their characteristic glow. It’s a common reporter in molecular biology because of its immediate read out and ease of detection.

7. The signal to noise ratio is the amount of an observed effect due to the intended process compared to other “background noise”. In this case, light resulting from nucleotide addition versus other chemical reactions. Luciferase recognizes and acts on dATP in solution, creating a spurious signal unrelated to nucleotide addition, which must be accounted for during analysis. Fortunately, the modified “A” nucleotide effectively suppresses this side reaction, improving the signal to noise ratio.

The Appalling Stupidity of Spotify’s AI DJ

Am I naïve in expecting Artificial Intelligence to be smart? Is my interpretation of the word “intelligence” too literal? And when an AI behaves stupidly, who’s to blame? The programmers or the AI entity itself? Is it even proper to make a distinction between the two? Or does the AI work in so mysterious a way that the programmers need no longer take responsibility? ... more ...

Secret Agent Man


Some weeks, the education technology news is incredibly grim, and sorry to say this was one of those weeks. (Warning: this is a long email.) Indeed, anytime education- and child-related tech stories fill a Garbage Day newsletter as they did on Monday -- Garbage Day describes itself as a publication that “doomscrolls so you don’t have to” -- it’s just not a good sign.

Several of those stories fell under the subheader “Roblox, OpenAI, The New Web, And Radicalization,” including The Wall Street Journal’s coverage of the internal debate at OpenAI on whether or not to alert Canadian authorities about what eventual school shooter Jesse Van Rootselaar had been typing to ChatGPT. (OpenAI banned Van Rootselaar from the platform but did not alert police.) And 404 Media wrote about how Van Rootselaar had created a shooting simulator inside Roblox, a video game very popular with young people.

Garbage Day’s Ryan Broderick argues that

...both ChatGPT and Roblox are not traditional social platforms. We are very used to the social media wild goose chases that happen after mass shootings, where users scour public platforms for content that might provide some kind of insight into why the attack happened. The unspoken hope being that if we had just caught it in time, things may have been different. To say nothing of all the would-be attackers that are reported to law enforcement in time because of their Facebook or X posts. But apps like ChatGPT and Roblox are not simple feed-based platforms. They are far more reactive and personalized and we are quickly discovering how hard they will be to moderate.

Later in the week, Vulture published a Q&A with the head of Parental Advocacy at Roblox, who (no surprise) says “we’re all responsible” for kids’ online safety.

No need to worry. No need to regulate. No need to hold the company -- this company, that company -- responsible. We just need better “digital literacy.”


“Literacy.” Honestly, that word is getting to be some real bullshit. Often, what’s framed as a “literacy” problem is actually the technology working by design, urging us to be compliant clickers.

But damn, “literacy” is such a friendly way to frame training and branding exercises. It sounds so progressive, so eminently philanthropically fundable: Digital literacy. Web literacy. Coding literacy. AI literacy. Gambling literacy.

[Record scratch.] Wait what? You haven’t heard of the latter?

Yeah, apparently some folks [cough] are trying to make it “a thing” – and what with the rise of sports gambling and prediction markets, I think we can see what the next ed-tech trend will be. At least Edsurge, bless their hearts, tried to make the case for gambling literacy this week: “The Math Skill Schools Should Teach — Gambling.”

When I texted a friend with a link to the article and my very savvy commentary “what. the. fuck,” I learned that the Alliance for Decision Education exists, its founder a former professional poker player. So that's something to look forward to once education inevitably pivots away from "AI" (as it did with MOOCs and adaptive learning and every other ed-tech trend ever).


Speaking of literacy-laundering, on Monday morning The 74 came out with “the exclusive”: “New Google Partnership a ‘Sizable Investment’ in AI for Teachers” – that is, a three year deal between ISTE+ASCD and Google to “offer AI training to ‘all six million K-12 teachers and higher education faculty’ in the U.S.” (Or as Ben Riley wryly put it, “Google and ISTE+ASCD announce new partnership to destroy US education.”)

This "sizable investment" (an undisclosed amount) will flow into ISTE+ASCD under the guise of "AI training" and “AI literacy,” the latter of which, as MIT’s Justin Reich told The 74, is a phrase that doesn’t really have an agreed-upon meaning, let alone any substantive research supporting its application. (As Justin has argued elsewhere, we got “web literacy” really really wrong for a long time, and we miseducated a couple of generations of students as a result. So why exactly are we rushing into this whole “AI literacy” thing? I mean, other than the obvious grift, of course.)

Interestingly, the Pew Research Center released some survey data this week on teens’ use and views of “AI,” and somehow, without schools providing adequate (or any) “AI training," more than half of them are using “AI” to do their homework. Why, it’s almost as if chatbots are just another in a long line of consumer-facing technologies and, akin to posting on Facebook or watching YouTube, don't actually require any special courses or classes.

To be clear, when Google (or OpenAI or Anthropic or Microsoft or whoever) says they’re offering teachers and/or students “AI training” (let alone promoting “AI literacy”) what they’re really doing is brand marketing. This is simply an effort to get more users to outsource their thinking to their particular product.


Perhaps what we need is not technology training but technology un-training – the former is cognitive surrender; the latter will be the only way we can actually pursue learning. Of course, what anti-democratic billionaire technoauthoritarian would ever pay for that?


“What’s the Point of School When AI Can Do Your Homework?,” asks Matthew Gault in 404 Media. Arguably, one point might be to help people not ask such stupid fucking questions.

The story covers “a new agentic AI called Einstein that will, according to its developers, live the life of a student for them. Einstein’s website claims that the AI will attend lectures for you, write your papers, and even log into EdTech platforms like Canvas to take tests and participate in discussions.”

And I know that we’re all supposed to freak out about this stuff and wring our hands and teachers are supposed to sign up for the “AI” training programs so that they can be “AI” ready and engineer “AI ready” classrooms and churn out “AI ready” students, but my god. This is bullshit. Einstein is bullshit. It’s a scam, a fraud, a con, a grift.

I mean, yes, it’s bad. The promise of a magic button that will do your work for you is bad. This is bad: the founder who says

agentic AIs are a method of freeing people from the labor of education. ‘I think we really need to question what learning even is and whether traditional educational institutions are actually helping or harming us,’ he said. ‘We're seeing a rise in unemployment across degree holders because of AI, and that makes me question whether this is really what humans are born to do. We've been brainwashed as a society into valuing ourselves by the output of our productive work, and I think humanity is a lot more beautiful than that. Is it really education if we're just memorizing things to perform a task well?’

But also: agentic AI cannot do all that, most of the time not at all, and even if it can, certainly not consistently – and that’s despite all the ways in which universities have sought over the past few decades to instrumentalize and standardize everything, primarily through the terrible technology of the learning management system. (Einstein claims to complete coursework hosted on Canvas.) Agentic AI cannot do this because all classes are different and all instruction is messy and no two professors, even those in the same department, teach the same or grade the same or set up their course in an LMS the same way. I know that the “AI” hype-monsters think we’re on the cusp of the “cheat on everything,” “automate education” world. But we’re not. (One demo, hell, 20 demos of an agentic AI posting on a discussion forum or answering a quiz question does not an agentic anti-education revolution make.)

That’s not to say that this threat is meaningless or irrelevant. We have to confront the beliefs and practices that underlie Einstein’s promises, let alone its supposed adoption. We have to challenge them and examine them with students, with professors, with university administrators, with parents, etc etc etc.

For the past seventy years or so, everyone has been told that education is the silver bullet and going to college will lead to financial stability, if not success. That was a lie but not because learning is worthless or because school is a rip-off. Rather, it’s because capitalism never cared about your bachelor’s degree, and society has been purposefully structured and restructured in such a way that opportunity has declined and precarity increased.


Two more school-related stories from that Garbage Day newsletter on Monday:


Yik Yak is back, I learned this week. The pseudonymous social network launched over a decade ago, but quickly shut down after a number of high profile scandals and cyberbullying incidents (and after college students lost interest in its toxicity). It was apparently relaunched a couple of years ago, because Silicon Valley insists on shipping their shitty, exploitative, democracy-destroying ideas to the young.

Any space, any place where there is a potential for community and growth will be surveilled and poisoned.


A little pushback on a comment that Justin Reich makes in that article in The 74 in which he claims that we don’t understand how LLMs work, that even Google engineers don’t understand how LLMs work. We do.

As Rusty Foster insists in a glorious Today in Tabs missive, AI “isn’t a black box. It’s a statistical model of data connected to a mechanism for producing more data that resembles the data in the model.” Yes, sure, there’s a lot more math and a lot more code going on in there (and much of it is beyond my pay grade), but it’s not actually a mystery, despite those heavily invested in the Great and Powerful Oz sort of rhetoric about the technology.

Sam Altman suggested this week (again) that humans are radically inferior to “AI” and, when challenged about the amount of electricity that it takes to manufacture Oz, made some dumb quip about the amount of food it takes to raise a human, an amount that results in a far inferior “intelligence.” What an utter misanthrope.

But Rusty writes something rather lovely/loving while comparing the ways in which “AI” ostensibly and children actually learn (the latter really is marvelous, miraculous, beautiful), as he offers a critique of a recent piece by Gideon Lewis-Kraus in The New Yorker:

Lewis-Kraus writes: “At the dawn of deep learning, a little more than a dozen years ago, machines picked up how to distinguish a cat from a dog… Once they had seen every available image of a cat, they could reliably sort cats from non-cats.” Later he asserts that “If a language model can bootstrap its way to linguistic mastery, we can no longer rule out the possibility that we’re doing the same thing.”

I’ve watched all three of my children learn what a cat is, and in each case the number of pictures of a cat they needed to see was not “all of them.” It was like, two or three? Half a dozen, tops. I helped them learn to speak and read fluently, and the number of Reddit posts required was not “every Reddit post.” I don’t need to know what mechanism underlies human intelligence to rule out the possibility that it’s the same as what a large language model does. The whole trick underlying the apparent magic of modern A.I. is simply giving it tons of data. Give it the whole internet. Give it every book ever written. This is required — it does not work with less training data....

Read all of Rusty's essay, particularly if you think that Anthropic are "the good guys."

And then let's ask this quite serious question: what is to be gained by arguing that humans have been surpassed by “AI”? Why push for an end to education? Why insist that your LLM is more intelligent than your child? What does this say about your belief in humanity, about your vision for the future? Why do "AI" advocates hate humans so much, why are they so committed to engineering away all the complexities and richness of the human mind, the human life, the human psyche, the human experience?


Still more links, mostly without commentary:


“They Built Stepford AI and Called It ‘Agentic’” by Abi Awomosu. "Women’s ‘ick’ for AI isn’t technophobia or a gap to close. It’s wisdom to act on.”

The industry narrative about AI automation tells a story about factories — robots replacing assembly workers, self-driving trucks replacing drivers. This is the visible, masculine-coded story about production.

But look at what’s actually being automated first: customer service (predominantly female), administrative assistants (94% female), data entry (predominantly female), scheduling and coordination (predominantly female), contact centers (70%+ female), emotional support (feminized).

The factory narrative is the cover story. The actual automation is happening in the reproductive economy—the care, attention, organization, and emotional labor that women have always performed.

The labor was always treated as mechanical. If a machine can do it, the implication is the work was never truly human. Essential but not skilled. Now it’s being replaced by software that doesn’t need to be paid.

Women don’t need “AI training.” Teachers don’t need “AI training.” They need their work -- all their productive and reproductive labor -- recognized and valued. Politically. Culturally. Materially.



Today’s bird is the Eurasian hoopoe. According to Wikipedia, “The call is typically a trisyllabic oop-oop-oop, which may give rise to its English and scientific names, although two and four syllables are also common. An alternative explanation of the English and scientific names is that they are derived from the French name for the bird, huppée, which means crested.”

And isn’t that just exemplary of Internet informatics: could be this, could be that, who knows, but let’s hit “publish” anyway. Wonder and curiosity once prompted more scientific investigation, but now we just have intellectual choose-your-own-adventures and chatbots that (wrongly) reassure their users that “that’s just how it is” and “there’s nothing more to know or do.”

There's plenty to do. There's plenty still to know.

Thanks for reading Second Breakfast. I'm exhausted.

Read the whole story
mrmarchant
9 hours ago

The Man Who Stole Infinity

1 Share

When Demian Goos followed Karin Richter into her office on March 12 of last year, the first thing he noticed was the bust. It sat atop a tall pedestal in the corner of the room, depicting a bald, elderly gentleman with a stoic countenance. Goos saw no trace of the anxious, lonely man who had obsessed him for over a year. Instead, this was Georg Cantor as history saw him. An intellectual giant…




Read the whole story
mrmarchant
1 day ago

Say Goodbye to the Undersea Cable That Made the Global Internet Possible

1 Share

Jane Ruffino’s reported feature for Wired opens with what is, so far, my favorite piece of magazine art this year. (Nice one, Rob Vargas.) From there, you’re quickly aboard the Maasvliet, the diesel electric ship whose crew is tasked with hauling up thousands of kilometers of TAT-8—the first fiber-optic cable to span the seabed of the Atlantic Ocean, a feat that Ruffino describes as “practically tantamount to human galactic expansion.” An absolute showcase of explanatory writing, Ruffino’s story goes deep on the history and afterlife of the cables that enable digital communication, and grants the many people who manage such infrastructure some welcome visibility.

Fiber-optic transmission is a near-magical way of carrying information by pulses of light. Most people don’t even think about how quickly we’ve accepted instantaneous communication as normal, even those of us who can remember when an international phone call had to be booked in advance. The more people I meet in this industry, in this network of networks of people and things, the more insulting it sounds to hear that “we” only notice it when it breaks. (Who is this “we,” I always want to know?) Billions of people are able to walk around not noticing this infrastructure because of the daily work of a few thousand people, sometimes at sea, other times buried under piles of permits, surveys, and purchase orders for thousands of kilometers of cables that will join the millions of kilometers of cables on the seabed that ensure that our planet is continuously being hugged by light.

Read the whole story
mrmarchant
2 days ago

They’re Vibe-Coding Spam Now

1 Share

The problem with making coding easier for more people is that it makes spam more conventionally attractive. Which is bad.


I have a problem: Unlike most people, I actually read my spam folder on a regular basis. (Often, the messages in it are some of the most interesting emails I get.) I find spam intriguing, and it often highlights some modern trends.

And sometimes, it surfaces something I actually care about that missed my other folders, like an upcoming interview I’m excited to share with all of you.

But one thing about spam that has been true across the board is that it’s ugly. Really, really ugly. Often, what will happen with spam is that they’ll get your email address through questionable means, say a leak of your information in an exploit, and flood your inbox with some of the worst crap you’ve ever seen.

But recently, some of these clearly trash emails have gotten a design upgrade:

A spam email informing me that my fake cloud storage platform is full.

That is a relatively attractive spam email, trying to sell me on a scam. It is obviously the work of one Claude A. Fakeguy.

It has that swing. Other, less attractive spam emails also have this swing, such as this one:

A less attractive email informing me of upcoming video game addiction litigation. How did they know!?!?

But I think the real tell is that these emails hang together when you have images off, which they did not in the past. This is a problem, because in your spam folder, images are automatically turned off.

Hence why this email warning me that my antivirus plus renewal failed now looks like this:

Oh no, what will I do on my Linux computer that doesn’t support your antivirus program?

This is a funny, if troubling element in the history of spam—and probably a spot of bad news for people who use vibe coding to actually make real things.

Sponsored By … You?

If you find weird or unusual topics like this super-fascinating, the best way to tell us is to give us a nod on Ko-Fi. It helps ensure that we can keep this machine moving, support outside writers, and bring on the tools to support our writing. (Also it’s heartening when someone chips in.)

We accept advertising, too! Check out this page to learn more.

The strange thing about spam is that it tells you what the internet’s underbelly is into.

The slop looks more competent than ever

Put simply: Now that the baseline of what makes something well-designed, albeit spartan, has increased, many of the signs we once used to detect a spam message are getting thrown out the window.

Which means that we’re more likely to get hit by spam that tricks us into clicking. And that’s bad news as we attempt to protect ourselves from the crap hiding in our inbox. We’re likely to trust less and accidentally give away more. And untrustworthy figures who don’t know how to code are more likely to throw more crap our way.

This is a point Anthropic itself made in one of its own reports from last summer, about “no-code” ransomware that can be built by people incapable of actually building ransomware without the help of an LLM.

Despite this, these people can create commercial malware programs that they can sell for up to $1,200 a pop.

The security platform Guard.io makes clear that platforms like Lovable are going to enable a new class of criminal:

Just like with “Vibe-coding”, creating scamming schemes these days requires almost no prior technical skills. All a junior scammer needs is an idea and access to a free AI agent. Want to steal credit card details? No problem. Target a company’s employees and steal their Office365 credentials? Easy. A few prompts, and you’re off. The bar has never been lower, and the potential impact has never been more significant. That’s what we call VibeScamming.

And, for people who vibe code, the real problem is that, long-term, their stuff is going to look very untrustworthy because of the specific mix of chrome, color, and emojis that vibe-coded applications specialize in.

The thing that ultimately makes something look human is the addition of actual design and human flair. I encourage you to actually put a little humanness into what you build if you’re going to do it and share it with the world.

How to spot a vibe-coded faker

But for many, it is going to be harder than ever to tell what’s real and what’s fake. Which means you should probably go out of your way to use techniques like email obfuscation and email aliases to protect yourself. (It makes it easier to tell which bread-baking forum violated your trust, for one thing.)

On the plus side, there are still tells. A key one is if they refer to you not by your name, but by the name of your email address. Another is the from address, which is often some highly obfuscated bit of junk designed to evade detection.

The one that made me laugh recently was when, for the first time, I got really crappy spam emails on an address that had never gotten them before, promoting traditional spam topics with a Claudecore flair. They seemed random, but were extremely easy to get rid of, because they were all emailed from a bare Firebase domain, meaning that I could remove them with the help of a single filter.
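To give a sense of how simple that kind of single filter can be, here is a minimal Python sketch that flags messages by their sender's domain. The suffix list is an assumption (the default Firebase Hosting domains), the function names are hypothetical, and the sample address is invented; this is not the filter actually used here.

```python
from email.utils import parseaddr

# Assumed suffixes for default Firebase Hosting domains; adjust to what you see.
FIREBASE_SUFFIXES = (".firebaseapp.com", ".web.app")


def sender_domain(from_header: str) -> str:
    """Extract the lowercased domain from a From header value."""
    _, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower()


def is_bare_firebase_sender(from_header: str) -> bool:
    """Return True when the sender's domain ends with an assumed Firebase suffix."""
    return sender_domain(from_header).endswith(FIREBASE_SUFFIXES)


# Invented example header, for illustration only.
sample = "Cloud Storage <noreply@some-project-1234.firebaseapp.com>"
print(is_bare_firebase_sender(sample))
```

Most mail clients offer the same idea as a built-in rule on the From field, so no code is needed in practice; the sketch only shows why one rule catches the whole batch.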

Just because spam emails are more attractive now doesn’t mean the people making them aren’t still extremely stupid.

Spam-Free Links

A quick shout-out to the only tool that makes my inbox bearable in 2026, Simplify Gmail.

Oh good, there’s a new web browser for PowerPC Macs in 2026, and per my pal Action Retro, it’s quite good!

Speaking of inboxes, this story of an AI safety exec letting an AI tool delete her inbox is so darkly funny that I’m surprised it’s real.

--

Find this one an interesting read? Share it with a pal!

Want to actually learn how to code with minimal vibes? Check out our sponsor Scrimba, which mixes video lessons with interactive code windows—and makes it feel downright approachable. Sign up here for a 20% discount.

Read the whole story
mrmarchant
2 days ago