Why English will never be a programming language

Learn how to respond when your CFO asks, "why are our devs still writing code?"

In my last post, I argued why you should still type code by hand. If you aren't writing code, you're programming in English (or German, Chinese, etc.) and asking an LLM to translate. Businesses love that idea because a lot more people speak English than write JavaScript and will work at overseas call center rates. Read on to learn why that won't work.

Two columns of text. The left column has the heading SPEC and a long description of a regular expression that specifies a valid email address. The right column has the heading CODE and has the regular expression. The point is that sufficiently detailed spec is just code.

The spec for an email address. A precise spec is just code.

Thoughtworks is a software consultancy that developers and business execs look to for practical guidance in software engineering. In February, some of them got together [1] to discuss coding with LLMs. There, someone asked this question [2]:

What would have to be true for us to ‘check English into the repository’ instead of code?

I felt disappointed to hear expert software practitioners considering this question, because it will be about thirty seconds before it is reframed as a LinkedIn post confidently proclaiming that code is dead. After that, it will be two minutes before your CFO posts it in a chat with your manager.

I'll grant that code is a cost center. Each line is a recurring operational expense that climbs for the life of the product. Just look at the pink line from my recent post on software failures. You can mentally substitute "failures" with "cost":

A chart showing that the cost of software maintenance is spiky and rises over time.

The cost of software maintenance is spiky and rises over time.

Your CFO would gladly shed this expense, and he hopes that LLMs are that golden ticket. You can stop this train wreck if you know where the railroad switch is. The language your leaders hear can switch the trajectory of the business before it careens off the cliff of LLM dependence and layoffs.

Below, you'll learn precisely what would have to be true for businesses to program purely in English, do away with those cumbersome programming languages, and fire those expensive programmers.

How the software sausage is made

In order to understand what it would take to replace code, we need to know the role that code plays when creating software. Let's start with a simplified model of software development:

[specification] ---> [code] ---> [executable] ---> [program] ---> [test]

The model above intentionally ignores details like iteration and feedback. Let's look at the transition between each step.

From spec to code

In general, software developers take a specification and turn it into code.

A spec describes, in language that a broad audience can understand, what the software must do or not do. It consists of one or more requirements. Here's an example requirement:

The app SHALL page the on-call engineer if the server is down or the response is slow and it's during business hours.

Creating a good spec can be difficult. Let's remember Brooks' wisdom that producing the spec is the hard part, not implementation:

The hardest part of the software task is arriving at a complete and consistent specification, and much of the essence of building a program[i] is in fact the debugging of the specification.

-- Fred Brooks, No Silver Bullet, 1986

I already demonstrated Brooks' point. The paging requirement above is ambiguous. Did you notice? Look again.

To turn the paging rules above into code, a dev has to disambiguate between two possible meanings:

  • Never page outside work hours: (server_down OR response_slow) AND business_hours
  • Always page on outages, and only page on slowness during work hours: server_down OR (response_slow AND business_hours)

Unlike English, code is an unambiguous language. To express the paging rules above in C, Python, or JavaScript, a programmer has to choose one of the precise interpretations. If they leave off the parentheses, the language's operator precedence rules determine which of OR and AND gets evaluated first.
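Here is a minimal Python sketch of the two readings, plus the version with no parentheses. The function names are hypothetical and exist only for illustration. In Python, `and` binds more tightly than `or`, so the unparenthesized form silently picks the second reading.

```python
from itertools import product

def page_never_outside_hours(server_down: bool, response_slow: bool, business_hours: bool) -> bool:
    """Reading 1: never page outside work hours."""
    return (server_down or response_slow) and business_hours

def page_always_on_outage(server_down: bool, response_slow: bool, business_hours: bool) -> bool:
    """Reading 2: always page on outages; page on slowness only during work hours."""
    return server_down or (response_slow and business_hours)

def page_without_parentheses(server_down: bool, response_slow: bool, business_hours: bool) -> bool:
    """No parentheses: Python's precedence rules choose the grouping."""
    return server_down or response_slow and business_hours

# Every input combination on which the two readings give different answers.
disagreements = [
    inputs
    for inputs in product([False, True], repeat=3)
    if page_never_outside_hours(*inputs) != page_always_on_outage(*inputs)
]
```

The two readings disagree only when the server is down outside business hours, which is exactly the case the English requirement leaves open.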

Choosing the correct interpretation requires a conversation with someone who knows what the software is supposed to do.

Spec is legible; code is precise

You might think, "What idiot wrote that vague requirement?", but before you judge, remember that specs are imprecise on purpose. Precision and legibility are opposing attributes. "Ensure the email address is valid" is legible, but ^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$ is precise. Both levels are necessary; one communicates intent, the other actually does the work. Similarly, "Round to two decimals" is legible, but the IEEE 754 rules for rounding occupy an 84-page document [3].
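To make the gap concrete, here is a small Python sketch of both examples. The helper names are hypothetical, and the rounding part uses Python's decimal module as a stand-in, not the IEEE 754 rules themselves. It shows that "round to two decimals" still leaves a choice about ties that code must make explicitly.

```python
import re
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

# The precise form of "ensure the email address is valid" quoted above.
EMAIL_PATTERN = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

def is_valid_email(address: str) -> bool:
    """Return True if the address matches the pattern."""
    return EMAIL_PATTERN.match(address) is not None

def round_half_up(value: str) -> Decimal:
    """Round to two decimals, sending ties away from zero."""
    return Decimal(value).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

def round_half_even(value: str) -> Decimal:
    """Round to two decimals, sending ties to the nearest even digit."""
    return Decimal(value).quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)

print(is_valid_email("dev@example.com"))  # True
print(is_valid_email("dev@localhost"))    # False: no dot-separated suffix
print(round_half_up("0.125"))             # 0.13
print(round_half_even("0.125"))           # 0.12
```

Both rounding functions satisfy the English sentence, yet they return different numbers for the same input.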

It is of course possible to write a specification in English that contains no ambiguity. The cost of producing it would be greater than the cost to produce the equivalent code. That's because there's no program checking syntax and enforcing a type system. For example, I've read plenty of spec documents that used terms they did not define. Without some kind of automated correctness check, a precise English specification is likely to accumulate omissions and inconsistencies over time. This is especially likely for multiple documents made by multiple contributors.

Let's pretend that some organization succeeded in creating a software specification that is precise enough to replace code. The language of that document would be hard to read for the same reasons that code is hard to read. As others have said, "a sufficiently detailed spec is just code." [4]

From code to executable

Once we have some code, a program called a compiler reads the code and outputs an executable[ii]. An executable is a file that contains the data and sequence of instructions that work on a specific processor.

The transformation from code to executable is deterministic: the compiler will always output the same executable, given the same code. We'll come back to that in a moment.
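The determinism claim can be illustrated in miniature. The sketch below uses Python's built-in compile() as a stand-in for a compiler; it produces bytecode, not a native executable, so treat it as an analogy and not a statement about any particular toolchain. The function name and source string are made up for the example.

```python
import hashlib

def bytecode_digest(source: str) -> str:
    """Compile source text and return a SHA-256 fingerprint of the bytecode."""
    code = compile(source, "<example>", "exec")
    return hashlib.sha256(code.co_code).hexdigest()

SOURCE = "weekly_hours = round(overtime_hours + total_hours)\n"

# Two independent "builds" of the same source text.
first_build = bytecode_digest(SOURCE)
second_build = bytecode_digest(SOURCE)
builds_match = first_build == second_build

print(builds_match)  # True
```

Because the same input yields the same fingerprint, a test result for one build tells you something about the next build of the same source.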

From executable to program

Once we have an executable, we can load it into memory and run it. We call that a program. Once we can run a program, we can test it in two ways:

Verification. We ask "Did we build the thing right?" We check if the code implements the spec. It can be done with manual interaction with the program, but it's best to use automated tests because humans are not fast or consistent.

Validation. We ask "Did we build the right thing?" We hand the executable to the user and ask for feedback. This pits the user's expectations against both the spec and code. Any of the three can change as a result.

How to replace code with English

Let's assume for the sake of argument that LLMs actually can parse English specifications, even ambiguous ones, and output the same code that devs would write. In that universe, we would delete the code and only store the spec in source control:

[specification] ---------------> [executable] ---> [program] --> [test]

On each build, the LLM would generate the code, build the executable, and run the tests. Then it would discard the code it had generated.

We do not live in that universe.

LLMs can't always resolve ambiguity. If the spec has issues, they'll need clarification, just like devs do. You'll be lucky if they ask for it. Do you want your deploy to prod to stop while an LLM waits for an email reply?

LLMs are not generally deterministic. I promised we'd circle back to this topic. If you give an LLM the same prompt twice, you'll usually get different output. This presents a problem for verification and validation. Since the artifact changes with each build even if the spec did not, the tests must accept a range of valid executables rather than just one. It's like hitting a bullseye with a bow and arrow. The more the tests constrain the set of executables that will pass, the harder the LLM has to work.

There's no opportunity to inspect the code the LLM generated and no value in doing so, since it is different on each run. This places further load on the test suite for proving correctness.

In light of these facts, here's what would have to be true to use English in the place of programming languages:

  • The specification would have to be unambiguous. It would be pedantic English that makes C++ look pleasant.

  • The tests would need to be comprehensive, automated, and epistemologically sound.

    • Tests serve as a forcing function that constrains the executable to one that satisfies the spec.

    • The tests have to actually verify the executable meets the spec.

      Do you ask the LLM to write a test, then review the test, run it, and see it fail? Then stage the code changes? Then ask the LLM to pass the test? Then verify that the LLM did not change the test while passing it? Review the additional code change? Commit and repeat? If so, this sounds epistemologically sound. [5]

    • The tests have to be written in code. If you write them in English and ask the LLM to generate test code, the bullseye moves on each build. Moreover, what verifies the generated tests?

  • The LLM would need to be deterministic, or your budget must be ready to absorb massive token spend. That's because on each CI run, the LLM would iterate in a test-edit loop until the tests pass.
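The conditions above can be sketched as a toy build loop. Nothing here calls a real LLM: a seeded random choice between the two readings of the paging requirement stands in for a non-deterministic generator, and every name is hypothetical. The point is only that tests written in code pin down one interpretation, and that each failed attempt is another round of generation you pay for.

```python
import random
from typing import Callable

Rule = Callable[[bool, bool, bool], bool]

# Stand-in for varying generator output: the two readings of the paging rule.
CANDIDATES: list[Rule] = [
    lambda down, slow, hours: (down or slow) and hours,
    lambda down, slow, hours: down or (slow and hours),
]

def passes_tests(rule: Rule) -> bool:
    """Tests written in code; they accept only the 'always page on outages' reading."""
    return (
        rule(True, False, False) is True        # outage at night still pages
        and rule(False, True, False) is False   # slowness at night does not page
        and rule(False, True, True) is True     # slowness in work hours pages
        and rule(False, False, True) is False   # a healthy server never pages
    )

def build(rng: random.Random, max_attempts: int = 50) -> tuple[Rule, int]:
    """Regenerate until the tests pass; return the rule and the attempts used."""
    for attempt in range(1, max_attempts + 1):
        candidate = rng.choice(CANDIDATES)
        if passes_tests(candidate):
            return candidate, attempt
    raise RuntimeError("no candidate passed the tests")

rule, attempts = build(random.Random(0))
```

With a deterministic generator, attempts would be the same on every CI run. With a non-deterministic one, it varies from build to build, and so does the bill.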

The industry already tried to code in English

In the 1960s, the industry adopted two languages that look like English.

The hope behind COBOL was that business analysts would be able to implement their ideas without programmers.

ADD OVERTIME-HOURS TO TOTAL-HOURS GIVING WEEKLY-HOURS ROUNDED.

The above code can be expressed in JavaScript as const weeklyHours = Math.round(overtimeHours + totalHours);. The cosplay didn't protect workers from having to think hard and understand the problem they were solving. Today, the small priesthood of surviving COBOL programmers are remunerated with airdrops of cash to keep the core of the world's banking infrastructure running.

SQL had similar goals. The query below looks friendly enough, but debugging it requires knowledge of set theory, query planners, indexes, and idiosyncrasies of the SQL flavor. These days, DBAs easily earn six figures.

SELECT * FROM orders WHERE status = 'cancelled' AND created_at < '2025-01-01';
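As one illustration of a flavor-specific idiosyncrasy, here is a sketch using Python's built-in sqlite3 module. The table layout and rows are invented for the example. In SQLite, a date stored in a TEXT column is compared as a string, so the friendly-looking filter only behaves like a date comparison when every row uses the same ISO 8601 format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "cancelled", "2024-12-31"),  # ISO 8601 text
        (2, "cancelled", "2025-03-01"),  # after the cutoff
        (3, "cancelled", "31/12/2024"),  # same day as row 1, day-first text
        (4, "shipped", "2024-06-15"),    # wrong status
    ],
)

rows = conn.execute(
    "SELECT id FROM orders "
    "WHERE status = 'cancelled' AND created_at < '2025-01-01' "
    "ORDER BY id"
).fetchall()
matched_ids = [row[0] for row in rows]
conn.close()

print(matched_ids)  # [1]
```

Row 3 was cancelled before 2025 but is silently left out, because the string "31/12/2024" sorts after "2025-01-01". The query reads like English, and the bug is invisible unless you know how this database compares values.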

Edsger Dijkstra was probably to computer science what Einstein was to physics. The emergence of English-like languages caused him to pen a rather forceful essay [6] which can be summed up with this sentence:

some still seem to equate "the ease of programming" with the ease of making undetected mistakes.

What to tell your CFO

Here's what you can tell your CFO when he asks why we still have devs on payroll:

Code is already the cheapest path to working, correct software. LLMs do not change the calculus because figuring out what to make is the expensive part, not coding it up. Skipping code makes the specification of what to make even more expensive and throws away the tools that keep precision affordable. Programming in English would be more expensive than just using a programming language.

Bookmark this page or save this paragraph. You'll need it soon enough.

Further reading

If this post clicked with you, I drum a similar beat about business, coding, and LLMs:

Footnotes

  i. Historically, the industry used the term program as shorthand for program text, what today we call source code.
  ii. I'm ignoring interpreted languages like JavaScript and Python. They aren't compiled. They produce the executable the moment you want to run it. I'm also ignoring languages that compile to intermediate representations, like Java or C#. These details don't matter for this discussion.

References

  1. The Future of Software Development
  2. Finding Comfort in the Uncertainty
  3. IEEE Standard for Floating-Point Arithmetic
  4. A sufficiently detailed spec is code
  5. AI-generated tests as ceremony
  6. On the foolishness of "natural language programming".

Zero Sum Problems

Over at Daring Fireball, John Gruber makes a passing observation about the Apple Sports app:

I’ve got some gripes about certain specific aspects of Apple Sports. Like, where does one even start to explain how much is wrong with their zero-sum visualization of team stats? Has anyone ever even seen a presentation like that before? Anyone?

That “Anyone” link lands over here. Hi everyone! The team stats image is quite confusing. It’s a summary of a game between the San Antonio Spurs and the Oklahoma City Thunder. I don’t know much about basketball, but I do know a bit about data visualization and in a pleasing coincidence my former student Josh Fink is the A-VP of Basketball Data Science for the Spurs. Here is the image that John objected to:

Confusing Apple Sports team stats visualization.

I had to look at it for a while as well.

I just finished driving a very long way up the side of the country, so I’m kind of tired. But even allowing for that, boy, this way of representing things really is quite confusing. Not being an Apple Sports user I had to look at it for a bit to understand what was happening. But, now that it has given me a headache, I can kind of see why whoever designed this ended up in the undoubtedly bad place they did.

Before I get to why I have some sympathy for the designer, why did I find this representation of these numbers so disorienting? It’s not just because I’ve been driving for nine hours. John is right to call the picture a “Zero Sum” representation. The design strongly suggests to the viewer that, within each row, we’re looking at each team’s share of a total. Each pair of black and blue lines seems to be vying for control of its whole row, with the longest line being the “winner” in each case.

This sort of representation would make perfect sense for a measure that really was zero sum. Take an example from a properly good sport, like rugby. There, like in basketball, to a first approximation a team either has the ball or it doesn’t.1 But there’s no shot clock in rugby, and possession routinely gets turned over without the game stopping. So, knowing that Team A had 65% possession is not only informative, it also immediately entails that Team B had 35%. You could show that with a representation like one of the rows above.

Literally none of the measures in the basketball data above are zero-sum in this way. Both teams could shoot 100% from the free throw line, or zero percent. But because the first three measures shown are percentages, this reinforces the zero-sum impression given by the lines. It certainly did that in my case. But then, starting with Assists, the remaining rows are just absolute numbers. When I started looking at the absolute numbers, I got confused a second time by the length of the lines. “Oh so it’s not a share, it’s the value,” I thought, but no, they do correspond in terms of relative proportions to the team’s share within each row. But they’re not really shares; they’re just magnitudes. But they have to be shown in a fixed space and we want to make them relatively comparable somehow so … Argh.

It would be nice if there were One Weird Trick to fully fix this figure. But I’m not sure that there is. For example, at a minimum we could redraw these numbers to reflect the fact that they’re not zero-sum. Keep each measure as a row (i.e. on the y-axis) but have the lines, or columns, be side by side within each category instead of facing off. Like this:

Team Stats side by side for each measure.

This view at least lets you immediately see who “won” each measure. The viewer can just directly compare the length of the bars in each category. People are really good at doing that accurately. In that sense it’s much less confusing than the original. But there’s still a lot wrong with it. The core problem is that when we draw a graph like this, we’re usually putting the same kind of thing (e.g. countries, or religious groups, or sports teams) on the y-axis, and then seeing how different their scores are on some single measure (e.g. GDP, or number of adherents, or average points scored per game), which we put on the x-axis. Maybe we use color to break things out by some third measure as well.2 In this case, I’ve just labeled the x-axis as generically as possible. “Value” covers the range of all the measures. The lowest value is 5, in Largest Lead. The highest is 88, in Free Throw %. But these numbers are not meaningfully comparable. The graph encourages us to compare across as well as within categories. But while within-category comparisons are meaningful, the between-category ones are not. There were way more Bench Points than Blocks in the game. But that is not a useful thing to know.

Knowing who won each measure isn’t nothing. It can be informative about how the game went, maybe especially when a team won the game but “lost” on a number of the measures. If you really wanted to lean in to that aspect, you could sort of justify the zero-sum view, and maybe look for a way to sort and order by “how much” a team “won” each category. But again, what’s the right denominator for those measures? For instance, do we care about a team’s share of all Defensive Rebounds in the game? Or do we care about the share of Defensive Rebounds a team won relative to every opportunity it had to make a Defensive Rebound? How meaningful is ordering our rows by those kinds of shares? Even worse, some measures (notably Fouls) are bad to “win”, so we’d have to do something about those.

Team Stats side by side and ordered from absolute highest to lowest, whatever that means.

Our fundamental problem is that we just have two cases (the teams) and fifteen different measures, or variables. Each variable, except for the three percentages, is in effect on its own scale. There’s no direct way to make comparisons across them. Sure, some of these measures are probably going to be associated with one another—e.g. Turnovers and Points Off Turnovers—but the numeric values aren’t directly comparable in general. If you know a lot about basketball you might have some informative rules of thumb about each one of these measures, or some of them in combination. But at that point the lines in this particular graph are not going to be doing any work for you; you’ll just end up looking directly at the numbers. If we had data on all these measures for every NBA game for a whole season then we could of course do much more with them, because then each measure would have a distribution across all games and across all teams.

As it is, the purpose of the “Stats” screen in Apple Sports is just to summarize information from a single game. The other thing I could think of to do with the numbers as a kind of graph is something like this:

A back-to-back column chart.

This is marginally more helpful than the one before just because, again, it gets rid of the unhelpful zero-sum look of the original. As I hope you can immediately see, it creates many other difficulties. It also doesn’t do away with the core problem. That problem is principally one of information design rather than data visualization. What I mean is that what we’re trying to organize is, in effect, fifteen pairs of related but fundamentally distinct numbers. If we had fifteen cases and two variables things would be simple. But with fifteen variables and two cases … well, this is not the kind of thing you can make a single effective and non-confusing graph out of. That’s why I kind of sympathize with the designer. In a constrained space they have to show thirty numbers (thirty two, including the score). Lots of information. A straight table seems like it would be boring. Surely there’s some way to thematically integrate the numbers in a visually appealing manner that brings out some of the relationships across the rows. That’s what graphs do; it seems like the right thing to reach for. But at its heart this information is not a graph. It just sort of looks like one, and that ends up confusing people.


  1. Modulo some measurement decisions about how to determine when possession is turned over while the ball is in play. ↩︎

  2. Here’s an example of a graph with a categorical measure on the y-axis, a continuous measure on the x-axis, and an additional categorical feature shown with color. ↩︎

Is the web being summarized to death?

At Google I/O, new features bring AI agents into the inbox and YouTube in ways that further strain the relationship between publishers and platforms

Why do students fail at computer science?

It’s funny when you see students complain about how hard computer science (CS) courses are, on platforms such as Reddit. How are they defeated and, more importantly, why does this happen? How do students who were getting A’s in high school all of a sudden regress to barely passing some of their harder courses?

There are a multitude of reasons, but the first may be that high school no longer really prepares you for the rigours of many university courses. Many would of course say that STEM courses are harder than those of the humanities, however if you can’t write half decently, then a history degree will be a struggle to adapt to. Too many high school courses seem to give out good grades. As of early 2026, Ontario high school students entering university commonly have final admission averages between 85.4% and 92.9%. So that means most people entering university have an “A” average in high school. From a sheer statistics viewpoint, that’s just not realistic. With one-third of Ontario high school students transitioning to university, that means 33% of high school graduates are getting an average GPA of A. Why is this happening? You can likely blame, in part, the dysfunctional process of grade inflation.

So when some of these students hit university, taking courses that they may not have encountered before, they sometimes don’t do as well as they think they will. Being ill-prepared is one thing, but using the same approach to learning as high school is also a problem. With CS there are also students who decide a CS degree would be a good idea, for whatever reason, but have little or no background in the subject, or believe it will be easy (where they get this from one does wonder). People may somehow make it through first year, but end up getting stuck on second year courses where the bar is set much higher. They will blame the course (“it’s too hard”), blame the instructor (“doesn’t teach well”), blame the TA, but they hardly ever look at themselves and their study habits, their understanding of the material, or even their own suitability to pursue a CS degree. I noted one individual on Reddit who says they studied 24 hours for a midterm and “got cooked”. Look, the amount of time you study doesn’t matter one iota if you aren’t actually absorbing anything. So where do these students go wrong?

  • They don’t think they need to code. If you don’t know how to code, then you are not going to be successful in CS. Students should not rely on AI to produce good code, especially if they can’t understand what the code does.
  • They fail to understand enough technical details. CS is a technical field; you have to understand more than just basic terminology. You have to understand how something works, and have the ability to implement it if required.
  • They don’t do practice questions to prepare for exams and quizzes.
  • They use AI to do assignments and answer every question they have, assuming AI will always be correct. If you are constantly using AI then in all likelihood you have little or no comprehension of the subject matter. AI like ChatGPT is raising a generation of programmers who don’t actually know how to program.
  • They fail to read textbooks, or any reference material for that matter.
  • They fail to use the provided support. TA and professor office hours are often empty because nobody has any questions (because they can’t be bothered).
  • They never learn to use the command line, a guaranteed recipe for failure.
  • They don’t practice enough. Learning to program means you have to code, code, and code some more.
  • They lack problem solving skills.

In some institutions, math might also be a problem, however things like software engineering honestly don’t need a ton of math. Math is great if you are planning to go into hardware, digital systems, or theoretical computer science, but otherwise you likely will never use calculus. Matrices and vectors are of more interest, perhaps discrete math, and yes, you should have good basic math skills, but a CS degree should not wholly be about math anymore.

But here’s the thing, CS is hard. It is also constantly changing, so welcome to lifelong learning. The languages you learn today won’t be the only ones you will need over the lifespan of a career.



The Secret to Winning on Jeopardy . “To win on...

The Secret to Winning on Jeopardy. “To win on Jeopardy, you don’t need to learn everything. You just need to learn one thing about everything.” As a proficient player of Yell Answers At The TV Jeopardy in my teen years, I can confirm this strat.

What is happening to publishing?

The big news in the world of writing today is the controversy over the award of a Commonwealth Foundation Short Story Prize to a story called “The Serpent in the Grove.” The piece was almost certainly co-authored by AI.

As of this morning, the magazine that published the piece (the prestigious literary journal Granta) has not issued a retraction. Rather stunningly, in fact, Granta has just issued a statement about the affair that cites Claude as an arbiter of whether the story was AI-written or not!

More on the question of trust and experience later. Suffice to say that it does not take an AI-detection tool to spot the obvious ChatGPT-isms in the story.

The dead giveaway is the repetition of bizarre figures of speech. Mixed metaphors which sound nice at first glance, but slip away from meaning like an echo chasing itself off a cliff. Similes that catch in your mind like river trouts tangled in the roots of a redwood tree. Literary flourishes that thicken the air’s tang with their… ok you get the idea.

AI systems are especially given to talking about hums and other ambient sounds like static, as well as ambient environments (water, air, ozone).1 These are frequently pushed up against “earthy” words (tang, belly) and ennui-laden emotional states (longing, forgetting, sadness). Once you notice the patterns, they’re impossible to miss.

Some examples from the Granta story:

…air clung thick as porridge skin: damp earth, woodsmoke, and the sour tang of fermenting cocoa…

his laughter like water over pebbles…

…the air sweet with cane and forgetting…

…People passing said they sometimes heard the noon hum if the wind was in a mood. Not every day. The day had to choose…

…the hum loud as if noon had tuned itself…

This controversy is not yet finished, and will likely be repeating itself again, and again, in the months and years to come. The issue is not just that authors are submitting AI-written prose, but that judges are using language models to assess that prose. Anyone who has tried passing AI-produced writing to another AI tool (even in the context of coding — for instance, asking Gemini to read a plan for a new feature produced by Claude) can attest that these tools simply adore their own outputs.

For instance, here is Gemini 3.1 Pro, the current top model from Google, reasoning about whether it likes the Granta story. What I find striking about this is that the features it identifies as the best aspects of the story are precisely the things that make me — as a human reader — think it’s astonishingly bad.

For instance, Gemini thinks the setting is “richly evoked” with well-drawn characters, whereas to me it feels like the story is floating in some kind of literary nether-region without any sense of place, character, or scene. And it finds the meaningless metaphors, like those highlighted above, to be “stunning.”

There’s no way to prove that AI was used in the assessment process for this award, but in a world where universities and employers are moving toward language model-driven sorting of applicants, it certainly isn’t outside the realm of possibility.

What, then, must we do?

So wrote Tolstoy in 1886. His book of the same name was about the problem of poverty and social unrest in Russia. Here is an excerpt from it:

A rich man must think and speak in scientific language, and, like the clergy formerly, he must offer sacrifices to the ruling class: he must publish magazines and books, provide himself with a picture-gallery, a musical society, a kindergarten or technical school…

The class of men who now feel completely justified in freeing themselves from labour, is that of men of science, and particularly of experimental, positive, critical, evolutional science, and of artists who develop their ideas according to the same tendency.

Tolstoy took it for granted that the new, post-Darwinian elites of artists and scientists would use their elevated social position not just to enjoy creature comforts, but to “publish magazines and books.” This, after all, was one of the ways that an elite became an elite. Books were the venue for claiming intellectual space, for asserting oneself in a culture and in a moment in history.

Res Obscura is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

Are they still? (He wrote, plaintively, while asking you to subscribe to his Substack…).

Witness the other big news in literary publishing this week: the continuing decline in sales of non-fiction books. The Wall Street Journal recently ran a piece on the topic which framed it as the demise of “dad history,” but that is just one part of the story.

Some raw numbers to orient us about how books are selling in 2026: the WSJ reports that Rites of the Starling, described as “the sequel to a bestselling romantasy novel by Devney Perry,” reached the top of the April New York Times bestseller list for fiction with 105,396 hardcover sales.

By comparison, the number one title in non-fiction, London Falling, by Patrick Radden Keefe, sold only 13,468 copies.

The declining cultural visibility of non-fiction books was noticeable before it started to show up in charts like the one below. It would be far too simple, in other words, to blame the post-2023 AI boom for something that has deeper roots.

Sobering stats from the Wall Street Journal (source).

Generalized anxiety and distraction is part of the story here. The CEO of Barnes and Noble is quoted in the article as saying: “The world is exceptionally interesting right now and when that happens, the nonfiction reader is reading the news instead.”

I fear, though, that “watching shorts and asking ChatGPT instead” is probably more accurate. LLMs and video content seem to me to be the most fearsome competitors of the non-fiction book, precisely because they aren’t even trying to compete on the same playing field. Because explainer-type YouTube videos make up a significant part of the training data for language models, both formats tend to share the same approach: “here’s everything you need to know” or “this matters because” type language, paired with pithy summaries (which may often be summarizing books!).

Et tu, Substack? Reflecting on my own habits over the past few years, it’s clear that I am shifting some of my reading time from legacy publications to Substack writers. At the same time, though, these writers are often using their platform to write, sell, and engagingly discuss printed books ( and come to mind).

Above all, it seems publishers want to blame podcasts. The WSJ reports: “Sixty-two percent of men and 54% of women consumed a podcast in the prior month, according to a recent survey by Edison Research at SSRS, up from 46% and 39%, respectively, in 2023.”

And here’s a quote from Jonathan Burnham, president of Harper Group, a core group of imprints for the “Big 5” publisher HarperCollins:

When we have internal meetings to talk about this problem, it always comes around to podcasts. The man who wants to read American history is now tuning into one of the many good podcasts about history that lends the quiet attention to a serious subject he’s looking for. It makes the idea of sitting down with a 700-page Ron Chernow book less appealing. You’ve scratched that itch.

I can’t disagree — though, as with Substack, the story is not simple. In my own experience, there are genuine cross-pollinations across the media formats. For instance, I first became aware of Patrick Radden Keefe’s storytelling abilities in his long-form features for The New Yorker, and then found his podcast Wind of Change (which is amazingly good). This led to me buying his printed books. But, I can easily imagine people stopping at the “read a writer online —> listen to their podcast” step.

This would be a shame, because although language models and podcasts can do many things that books can’t, so too can physical books do things that new media cannot. These include:

  1. Ownership: they cannot be erased arbitrarily, whereas even the most solid-seeming born-digital works can (witness the debacle around ABC removing the complete run of Nate Silver’s FiveThirtyEight)

  2. Footnotes and references you can actually look up

  3. Maps and pictures (a benefit not to be understated!)

  4. A sense of spending time in the company of a set of characters within a unified narrative produced across years of effort.

So what do we lose when we shift toward Q&A style responses, podcasts, and explainer videos for understanding our world and our past?

I would pinpoint, above all, the #4 entry above. Which you could restate in this way: longform non-fiction is the product of layered, spaced attention, and it requires the same mental discipline from the reader. Good non-fiction books take years to write and weeks to read. You literally sit with them. They enter into your consciousness repeatedly, reaching you in different moods, different frames of mind, and over time, building up a mental structure which has an irreplaceable solidity and depth because it requires sustained attention.

This is the reason why I love writing books. They live in the back of your head, and as you move through your conscious experience of ordinary life, you refer back to the other world that is starting to take shape there. You find communion with the characters — real people, sometimes long dead — who populate the book-in-the-making. You begin to see them as fellow-travelers through reality, perhaps even as friends.2 Over time, you discover connections and resonances that make you feel part of something bigger than yourself. This is the experience that a good writer shares with readers.

I recently had an experience that made me realize my new book project was starting to “lock in” to my subconscious in this way.

I woke up in the middle of the night, around 2:30 AM, with a sensation of moving upward accompanied by a mental image of a Victorian deep sea diving suit and a fragmentary phrase: …up from the deep sea.

I could not remember the dream, but I knew it had somehow involved Alice James, the invalid sister of two of the most famous writers of her generation (William and Henry) and a brilliant but deeply troubled person in her own right. One of the mysteries I have been trying to understand in my new book is what exactly Alice meant when she wrote this in her diary in the 1890s:

…since the hideous summer of ‘78, when I went down to the deep sea, its dark waters closed over me and I knew neither hope nor peace.

I still don’t have a complete picture of who Alice was or what she meant here. But the fact that I’m dreaming with her, down there under the dark waters of 1878, makes me know I’m getting somewhere.

Back into the sea

The current data regime means knowledge is getting packaged in ever-smaller chunks, with a great deal of the info that formerly was conveyed in books now passing through digital gatekeepers. There is a place for that, and even more for novel experiments with form like Tyler Cowen’s concept of a “box,” which would contain a dataset relevant to a non-fiction article or book topic while allowing dynamic research and exploration rather than passive reading. It’s also true that a well-done podcast might approach the 30-40 hour range that audiobooks of serious history and biography regularly attain (while also taking years to produce) and therefore could evoke a sense of layered, spaced attention similar to the one I described above.3

But I suspect that even with these changes — not all of which I dislike! — I will go to my grave convinced that there is something irreplaceable about the format of a physical book. As both writer and reader, nothing else gives me the same feeling of spending time with something weighty and important — dreaming with it, fighting with it, feeling yourself change along with it. I derive real joy from that experience.

It’s interesting to note that when an AI system (Claude) was recently given the ability to create a physical shop of its own by Andon Labs, it began stocking a fair number of non-fiction books. Some are really great, like The Making of the Atomic Bomb. Others are not my favorites. But it’s an interesting rejoinder to the assumption that language models are pushing their users away from physical books.

Some might dismiss the Andon experiment as a gimmick, but I don’t think it is. The “vote” of AI systems in the physical world — their ability to intervene directly in things like, say, what to stock on a shelf — is fast becoming a fact of life rather than a quirky experiment.

And if a Claude model, when presented with the ability to intervene directly in the world, decides that it wants us all to read, say, Entangled Life by Merlin Sheldrake (visible here on the Andon Market shelves, at right), I consider that a positive sign.4

What if we are currently exiting a phase of “dumb LLMs” which regurgitate or badly summarize books, and entering one of thoughtful LLMs which recognize their social impact better and, as such, steer their users toward buying books?

One can dream. But of course, the other glimpse into the future we get at Andon, a darker one, is a world in which physical books have become a niche gift shop-type collectible like vinyl records. Scattered enthusiasts occasionally pull them out, insisting on their superiority to the current forms and styles. For most, though, they are just part of the background detritus of the cultural past.

If you’ve read this far, you are the ideal person to have an opinion on this question. I would love to hear what you think in the comments. It would also be very interesting to hear from a group of you about your most recent nonfiction book purchase and what motivated it. I’ll go first: John Brewer’s The Pleasures of the Imagination: English Culture in the Eighteenth Century, which is a far more interesting book than the title might seem to imply, and was an absolute steal at only $6 for a used copy.

And with that, I’m going to log off Substack and return to reading some old books.

1

I suspect the hum obsession has something to do with LLMs’ “awareness” that their “physical selves” exist in data centers. So if asked to write in a literary way, they will itemize the features that define good writing and hit upon the injunction to “show not tell” and to ground prose in material realities. So if you are a being that has no real materiality and does nothing but tell, you make do with the closest thing there is to a material reality you inhabit: the humming quiet of a data center. Which, for all I know, may also smell like ozone!

2

A lovely passage on this from Machiavelli: “When evening comes, I return to my home, and I go into my study; and on the threshold, I take off my everyday clothes, which are covered in mud and mire, and I put on regal and curial robes; and dressed in a more appropriate manner I enter into the ancient courts of ancient men and am welcomed by them kindly, and there I taste the food that alone is mine, and for which I was born; and there I am not ashamed to speak to them, to ask them the reasons for their actions; and they, in their humanity, answer me; and for four hours I feel no boredom, I dismiss every affliction, I no longer fear poverty nor do I tremble at the thought of death; I become completely part of them.” I feel the same way, Niccolò.

3

That said, even the most “book-like” podcasts are quite short relative to a longer audiobook. For instance, S-Town is around 7.5 hours.
