Artificial intelligence is trained on data. It will process billions of words of human text, countless images, and the inane, ridiculous questions of its human users. It will learn to write in the active voice most of the time, and to keep sentences under 200 characters. It will learn that dogs have four legs and the Sun is normally yellow. And it might learn that Lorraine Woodward of Ontario wants to know how to prevent the buildup of ear wax.
Most of what we feed into AI has been made by a human — human art, human text, human prompts. And so, it’s clear that AI will inherit the biases and prejudices of human intelligence. For example, a lot has been written about how “racist” and “sexist” AI is.
“Draw a picture of a doctor,” we might prompt.
AI whirrs through its stock catalogue, where 80% of its doctor images are white, male, and gray-haired. It creates the most likely image of a “doctor” the user requires. Is this a kind of racism and sexism? It certainly propagates both, but it’s not really the AI’s fault. It’s ours.
In this week’s Mini Philosophy interview, I spoke with anthropologist Christine Webb about human exceptionalism — “the belief that humans are the most superior or important entity in the Universe” — and how it leaks into our science, ethics, and, increasingly, our AI. Her worry isn’t just about the limits of artificial “intelligence,” but also how damaging it might turn out to be.
The value-free science full of values
A lot of Webb’s work, both in and outside of her book, Arrogant Ape, is directed at calling out “anthropocentrism” in science and technology. Webb has argued that human exceptionalism embedded in mainstream scientific practice has shaped what we study, how we study it, and what we conclude — even when science presents itself as value-free. As she put it in a paper with Kristin Andrews and Jonathan Birch, “values drive research questions, methodological choices, statistical interpretation, and the framing of results,” which means those values can “influence empirical knowledge as much as the data.”
Here are three examples:
Research questions: Animal welfare science often asks how to optimize productivity or “reduce stress” within farming systems, rather than what environments animals themselves would choose. A common study question might be, “What cage enrichment reduces feather-pecking in hens?” rather than, “Do hens prefer to live caged or uncaged at all?” The bias is that the first question assumes or glosses over the legitimacy of caging, tacitly accepting the human agricultural system as the baseline.
Methodological choices: In our interview, Webb pointed out that when comparing cognition between humans and other primates, researchers typically use human-designed tasks (touchscreens, puzzles, symbols). These setups often require fine motor control or familiarity with human artifacts. As a result, apes “underperform,” leading to the conclusion that humans are smarter. But the methods themselves privilege human-like skills. The test is designed for humans to win.
Statistical interpretation: Statistical “significance” thresholds (like p < 0.05) are used to declare that an effect exists or not, but these conventions were originally developed for tightly controlled laboratory and industrial experiments. In animal welfare studies, subtle behavioral changes — like shifts in grooming, gaze, or social spacing — may be dismissed as “not significant,” even though they reflect genuine distress or preference.
AI that looks just like us
When we think about artificial intelligence, we are really talking about artificial human intelligence. Large language models and machine learning technologies are built on human data, operate from human prompts, and give human-like responses.
AI research asks how to make systems useful for humans (more accurate, more personalized, more profitable), not how they might affect nonhuman or ecological systems in terms of energy use, resource extraction, or long-term environmental stability. The framing already presupposes anthropocentrism.
The language we use often betrays both anthropocentrism and anthropomorphism — where we imagine nonhuman things as behaving and thinking just like humans. For example, researchers often describe models as “hallucinating,” “reasoning,” or “aligning” — all metaphors that project human cognition onto statistical systems. The framing centers our self-image rather than the system’s actual operations.
But the most obvious example of anthropocentrism in AI is that the entire field is focused on building intelligence in the same way humans do it: with neural networks, symbolic reasoning, and goal-directed behavior.
Lessons from a moss
Of course, this makes sense if we want a product that humans can interact with. It makes sense if we want to develop human technology, human medicine, and human progress. But Webb points out that AI research focuses disproportionately on the good of human intelligence while ignoring the problems of “environmental destruction, decay, and arrogance with how we deal with the environment.”
In our conversation, Webb gave an interesting alternative that I’d love to see as a science fiction short story one day. And that’s to imagine an AI with the intelligence of a moss. As Webb put it:
“Robin Wall Kimmerer writes a lot in her work about how mosses have been around for 500 million years or something like that, compared to humans’ meager, like, 200,000 years. And so if we were really interested in intelligence, in living well, and in evolutionary success, maybe we should turn to other forms of life like mosses, who’ve managed to do it for hundreds of millions of years and ask them for solutions to some of the ecological problems that we’re facing today.
And how would we use that intelligence? Well, mosses are amazing, because they survive not by outcompeting others, but by creating highly diverse, thriving environments for other species to survive in. Like, that’s how they survive: by creating these tight, multi-species communities. So that would be a great thing to learn from about.”
We needn't abandon the entire AI project to see the reasoning in Webb's argument. If AI is human intelligence (HI) but bigger, then it will amplify both the good and the bad. The trade-offs will be on a different scale. And when we're talking about superhuman-like intelligence, does that mean super destruction, super bugs, and super catastrophe? Because human intelligence does all that. If we want to build an intelligence that can change the world, should we include a few more members of that world first?
This post assumes you know algebra, but no linear algebra. Let's dive in.
There are two big ideas I want to introduce in this first chapter: Gaussian elimination (which is not strictly a linear algebra thing, and was around long before linear algebra came along), and the row picture versus the column picture, which is a linear algebra thing.
Money example
Let's say you have a bunch of nickels and pennies, and you want to know how many of each you need to make 23 cents.
You could write that as an equation that looks like this:

5x + 1y = 23
x is the number of nickels you need (each nickel is worth 5 cents), and y is the number of pennies you need (each penny is worth 1 cent). You need to figure out the x and y values that make the left-hand side work out to 23. This one is pretty easy; you can just work it out yourself. You'd need four nickels and three pennies.
So x is four, y is three.
This kind of equation is called a linear equation. And that's because when you plot this equation, everything is flat and smooth. There are no curves or holes. There isn't an x^2 in the equation, for example, to make it curved. Linear equations are great because they're much easier to work with than curved equations.
Aside: Another solution for the above is 23 pennies. Or -4 nickels + 43 pennies.
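If you want to convince yourself there's more than one answer, a tiny Python loop will list every non-negative whole-number combination (just a throwaway check, nothing linear-algebra-specific):

```python
# Every non-negative whole-number way to make 23 cents: 5x + 1y = 23.
for nickels in range(23 // 5 + 1):
    pennies = 23 - 5 * nickels
    print(f"{nickels} nickels + {pennies} pennies = 23 cents")
```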
The point is you have two variables (x and y for nickels and pennies), and you are trying to combine them in different ways to hit one number. The trouble starts when you have two variables, and you need to combine them in different ways to hit two different numbers. That’s when Gaussian elimination comes in. In what world would you have to hit two different numbers? Does that seem outlandish? It’s actually very common! Read on for an example.
Food example
Now let's look at a different example. In the last one we were trying to make 23 cents with nickels and pennies. Here we have two foods. One is milk, the other is bread. They both have some macros in terms of carbs and protein:

Milk: 1 carb, 2 protein
Bread: 2 carbs, 1 protein

and now we want to figure out how many of each we need to eat to hit this target of 5 carbs and 7 protein.
This is a very similar question to the one we just asked with nickels and pennies, except instead of one equation, we have two equations:

x + 2y = 5 (carbs)
2x + y = 7 (protein)
Again we have an x and a y. Let's find their values. To solve these kinds of questions, we usually use Gaussian elimination. If you've never used Gaussian elimination, strap in.
Gaussian elimination
Step one is to write this out as a set of two equations:

x + 2y = 5 (carbs)
2x + y = 7 (protein)
Now you add or subtract multiples of one equation from another to try to narrow down the value of one variable. Let's multiply that second equation by -2:

-4x - 2y = -14
See how we have a +2y and a -2y now? Now we can add the two equations together to eliminate y:

(x + 2y) + (-4x - 2y) = 5 + (-14)
We're left with one equation and one variable. We can solve for x:

-3x = -9
x = 3
Aha, we know x = 3. Now we can plug that into one of the equations to find y.
We plug that into the first equation: 3 + 2y = 5, so 2y = 2, and y = 1. There we have our answer: three milks and one bread is what we need.
This method is called Gaussian elimination, even though it was not discovered by Gauss. If you hadn't seen Gaussian elimination before, congratulations, you just learned a big idea! Gaussian elimination is something we will talk about more. It's part of what makes linear algebra useful.
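If you like seeing the same steps in code, here's a small Python sketch (it assumes the milk and bread numbers from above, and uses NumPy only to double-check the answer):

```python
import numpy as np

# The food system:
#   1x + 2y = 5   (carbs)
#   2x + 1y = 7   (protein)

# Elimination by hand, mirroring the steps above:
# multiply the second equation by -2 and add the first.
#   (x + 2y) + (-4x - 2y) = 5 - 14   ->   -3x = -9
x = -9 / -3
y = (5 - x) / 2          # plug x back into the carbs equation
print(x, y)              # 3.0 1.0

# NumPy's solver does the same kind of elimination for us.
A = np.array([[1.0, 2.0],
              [2.0, 1.0]])
b = np.array([5.0, 7.0])
print(np.linalg.solve(A, b))   # [3. 1.]
```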
We can also find the solution by drawing pictures. Let’s see how that works.
Picture version
Let's plot one of these lines. First, we need to rearrange each equation so that y is written in terms of x:

y = (5 - x) / 2 (carbs)
y = 7 - 2x (protein)
Reminder: first equation is for carbs, second for protein. x is number of milks, y is number of breads.
Now let’s plot the graph for the first equation.
Now, what does this line represent?
It’s all the combinations of bread and milk that you can have to get exactly five carbs:
So you can eat no milk and two-and-a-half breads, or two milks and one-and-a-half breads, or five milks and no bread, to get to exactly five carbs. All of those combinations would mean you have eaten exactly five carbs. You can pick any point that sits on this line to get to your goal of eating five carbs.
Note: You can see the line goes into the negative as well. Technically, 5 breads and -5 milks will give you 5 carbs as well, but you can’t drink negative milks. For these examples, let’s assume only positive numbers for the variables.
Now, let’s plot the other one. This is the same thing, but for protein.
If you eat any of these combinations, you’ll have met the protein goal:
You can pick a point that sits on the first line to meet the carb goal. You can pick a point that sits on the second line to meet the protein goal. But you need a point that sits on both lines to hit both goals.
How would a point sit on both lines? Well, it would be where the lines cross. Since these are two straight lines that aren't parallel, they cross exactly once, which makes sense because there's only a single milk and bread combo that would get you to exactly five grams of carbs and seven grams of protein.
Now we plot the lines together, see where they intersect, and that’s our answer:
Bam! We just found the solution using pictures.
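If you want to recreate that picture yourself, here's a short matplotlib sketch (assuming the same carb and protein equations as above):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 6, 100)      # number of milks
y_carbs = (5 - x) / 2           # from  x + 2y = 5
y_protein = 7 - 2 * x           # from 2x +  y = 7

plt.plot(x, y_carbs, label="carbs: x + 2y = 5")
plt.plot(x, y_protein, label="protein: 2x + y = 7")
plt.plot(3, 1, "ko", label="solution: 3 milks, 1 bread")
plt.xlabel("milks (x)")
plt.ylabel("breads (y)")
plt.legend()
plt.show()
```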
So that’s a quick intro to Gaussian elimination. But you don’t need linear algebra to do Gaussian elimination. This is a technique that has been around for 2,000 years. It was discovered in Asia, it was rediscovered in Europe, I think in the 1600s or something, and no one was really talking about “linear algebra”. This trick is just very useful.
That’s the first big idea you learned. You can stop there if you want. You can practice doing this sort of elimination. It’s a very common and useful thing.
The column picture
What we just saw is called the “row picture”. Now I want to show you the column picture. I’m going to introduce a new idea, which is: instead of writing this series of equations, what if we write just one equation? Remember how we had one equation for the nickels and pennies question?
What if we write one like that for food? Not a system of equations, just a single equation? What do you think that would look like? Something like this:

x [1 2] + y [2 1] = [5 7]

It's an equation where the coefficients aren't numbers, they're an "array" of numbers. The big idea here is: what if we have a linear equation, but instead of numbers, we have arrays of numbers? What if we treat [1 2] the way we treat a number?
Can that actually work? If so, it is pretty revolutionary. Our whole lives we have been looking at just numbers, and now we’re saying, what if we look at arrays of numbers instead?
Let's see how it could work in our food example. What if the coefficients are arrays of numbers? Well, this way of thinking is actually kind of intuitive. You might find it even more intuitive than the system of equations version.
Each of these coefficients is called a vector. If you're coming from computer science, you can kind of think of a vector as an array of numbers (i.e. the order matters).
Let's see how we can use vectors to find a solution to the bread and milk question.
Step one: graph the vectors.
Yeah, we can graph vectors. We can graph them either as a point, like I’ve done for the target vector here, or as an arrow, which is what I’ve done with the vector for bread and the vector for milk:
Use the two numbers in the vector as the x and y coordinates.
That is another big idea here: we usually think of a set of coordinates as giving us a point, but you can also think of a vector as an arrow instead of just a point.
Now what we’re asking is how much milk and how much bread do we need, to get to that point?
This is a pretty simple question. It’s simple enough that we can actually see it. Let me add some milks:
And let me add a bread. Bingo bango, we’re at the point:
Yeah, we literally add them on, visually. I personally find this more intuitive. The system of equations picture can confuse me sometimes, because the initial question was, "how much bread and how much milk should I eat?" The vector way, you see it directly in terms of breads and milks. The row way, one line is the carbs, the other line is the protein, and the x and y axes are the amounts of milk and bread. It gets you to the same answer, but it's a little more roundabout, a little more abstract. The vector way is very direct.
The algebra way
We just saw that we can graph vectors too. Graphing vectors works differently from graphing the rows, but there is a graph we can make, and it works, which is pretty cool. What about the algebra way?
Here is the equation again:

x [1 2] + y [2 1] = [5 7]
Since we already know the answer, I'll just plug that in:

3 [1 2] + 1 [2 1] = [5 7]
Now, the question is: how does the left side equal the right side? First, how do you even define this kind of multiplication? Well, in linear algebra, multiplying a scalar by a vector means multiplying each number in the vector by that scalar:

3 [1 2] = [3 6]
1 [2 1] = [2 1]
Now you are left with two vectors. How do you add two vectors? Well, in linear algebra you just add the individual elements of each vector:

[3 6] + [2 1] = [5 7]
And you end up with the answer.
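That scalar-times-vector and vector-plus-vector arithmetic is exactly what a library like NumPy gives you; here's a tiny sketch using the same food vectors:

```python
import numpy as np

milk = np.array([1, 2])    # 1 carb, 2 protein
bread = np.array([2, 1])   # 2 carbs, 1 protein

# Scalar times vector: multiply each element.
# Vector plus vector: add element by element.
result = 3 * milk + 1 * bread
print(result)              # [5 7] -- exactly our target
```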
Congratulations, you’ve just had your first taste of linear algebra. It’s a pretty big step, right? Instead of numbers, we’re working with arrays of numbers. In future chapters, we will see why this is so powerful.
That’s the first big concept of linear algebra: row picture vs column picture.
Finally, I'll just leave you with this last teaser, which is: how would you write these two equations in matrix notation? Like this:

[ 1 2 ] [ x ]   [ 5 ]
[ 2 1 ] [ y ] = [ 7 ]
This is the exact same thing as before. You can write it as scalars times columns, as we had done before:

x [1 2] + y [2 1] = [5 7]
or you can write it as a matrix times a vector, as above. Either one works.
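Here's that equivalence as a quick NumPy sketch: the matrix-times-vector product on one side, the scalars-times-columns sum on the other, same answer either way (again using the food numbers):

```python
import numpy as np

A = np.array([[1, 2],      # row 1: carbs in one milk, one bread
              [2, 1]])     # row 2: protein in one milk, one bread
v = np.array([3, 1])       # 3 milks, 1 bread

print(A @ v)                        # matrix times vector   -> [5 7]
print(3 * A[:, 0] + 1 * A[:, 1])    # scalars times columns -> [5 7]
```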
Matrices are a big part of linear algebra. But before we talk about matrices, we will talk about the dot product, which is coming up next.
Additional reading
Check out Gilbert Strang’s lectures on linear algebra on YouTube.
Summary

Gaussian elimination lets you solve a system of linear equations by adding and subtracting multiples of the equations until one variable drops out.
The same system can be viewed two ways: the row picture (each equation is a line, and the solution is where the lines cross) and the column picture (the solution is the combination of column vectors that reaches the target vector).
I can juggle. It’s a fun little skill to pull out from time to time. Most years I try to teach students the basics of juggling at some point, maybe during homeroom or when we have some free time after state testing or something like that. Usually leads to some goofy fun.
Students often think that to learn how to juggle, they should try juggling with three balls right away. They always want to grab three, and start trying to keep them all in the air.
The actual best way to learn to juggle is to start with one ball. Practice throwing from both hands so the ball reaches about the top of your head (experienced jugglers will keep their tosses lower, but head height is good for beginners). After some practice with one ball, move to two balls. Toss one, then toss the other. The goal is for the two tosses to look a bit like the McDonald's arches: about the same height, offset horizontally. Here it's important to practice for a while. Get better at that exchange — tossing the second ball while that hand prepares to catch the first ball. Practice starting with both the right hand and the left hand. Make sure the tosses stay the same distance from your body and don't creep forward. Make sure they're the same height. Only after a lot of practice with two balls is it time to go to three.
Break It Down
I don’t take teaching juggling too seriously. It’s a fun activity to kill a bit of time with students. But juggling is a good illustration of a broader principle: the best way to learn skill X often isn’t to practice skill X. The best way to learn is often to break skill X into sub-skills A, B, C, and so on, then practice A, B, and C, and work your way up to X.
Here are a bunch of education examples:
To help students get better at word problems, don’t just give students lots of word problems.
To help students get better at taking standardized tests, don’t just give students lots of mock standardized tests.
To help students become better readers, don’t just ask students to read lots of books.
To help students get better at math fact fluency, don’t just give students lots of math fact practice.
To help students write better essays, don’t just ask them to write lots of essays.
To help students write better explanations, don’t just ask them to write lots of explanations.
To help students improve their problem-solving, don’t just give students lots of challenging problems.
To help students get better at note-taking, don’t just ask them to take notes all the time.
To help students get better at collaboration, don’t just assign lots of group projects.
I think every teacher has been guilty of this type of thinking. I know I have.
Juggling isn’t the best example here. I don’t think I’ve met a middle schooler who already knows how to juggle. Students start on a pretty level playing field. But that’s not the case for academic learning. If I’m teaching word problems, some students already have pretty strong word problem skills. Others really struggle. The students who already have strong skills will probably benefit from some broad practice where I throw a bunch of random word problems together and say “practice your word problem skills.” But others will flounder unless that skill is broken down into smaller pieces.
Helpful For All, Harmful For None, Crucial For Some
There’s a nice quote about teaching phonics from Snow & Juel: phonics is “helpful for all children, harmful for none, and crucial for some.” Phonics is just one example of breaking a complex skill down into small pieces. And the tricky thing about this type of teaching is that not all students need it. But for some students it’s absolutely essential. The reasons for that difference are beyond the scope of this post, but the consequences are straightforward. Teaching without breaking learning down into small, manageable steps will work for some students, but not for others. Breaking learning into small steps is the best strategy to help all students make progress. It might be intuitive that, if I want to help students write better explanations for their mathematical thinking, I should ask them to explain their reasoning all the time. It’s also an easy way out: ask for lots of explanations, watch some students write great explanations, pat myself on the back, and call it a day. Much harder is to figure out the component parts of writing a great explanation, teach them one at a time, and help many more students develop a new skill they struggled with before.
The toughest part here is figuring out the components of these skills. They’re not obvious. Unfortunately, that’s how the human mind works: as we become proficient with a skill, the little pieces of that skill melt away and become invisible to us. One of the most important challenges of teaching is finding new ways to break skills down into smaller, more manageable pieces. It never ends. Every year I find new ways to break things down, to figure out which pieces are tough for students, to connect those pieces back together into a larger whole. That’s some of the most important and most challenging intellectual work of teaching. The better I get at seeing the hidden elements of learning, the more students I can help find success in math class.
This video from YouTube science explainer channel Kurzgesagt says a lot of the things I would say about "AI" in this moment, namely: At first it seemed cool, but then it quickly became apparent that the version of it presented to consumers as a creative tool was both deeply flawed and also based on the theft of work from literally millions of creators (including myself!). The bullshit it is generating is now quickly eating the Internet, to the detriment of the actual creative people who make their livelihoods there and also to the detriment of, you know, truth and facts.
In the video, the folks at Kurzgesagt outline how they will and won’t use “AI” — basically not for writing or factchecking, but occasionally for things like automating animation processes and other such backend stuff. I think this is reasonable — and indeed, if one is using creative tools more involved than a pen and a piece of paper, “AI” is damn near unavoidable these days, even allowing for the fact that “AI” is mostly a marketing phrase for a bunch of different processes and tools which in a different era would have been called “machine learning” or “neural networks” or something else now horribly unsexy.
This is also how I’m approaching my writing here on Whatever. Every word you see here is written by an actual live human, usually either me or Athena, but also the individual authors of the Big Idea posts. Good, bad or indifferent, it came out of someone’s skull, and not out of a prompt field. I do this because a) I care about the quality of the posts you see here, and also b) as Athena and I are both actually decent writers with substantial experience, it’s easier just to write things ourselves than to prompt an “AI” to do it and then spend twice as much time editing for facts and tone. That’s right! “AI” doesn’t make our writing job easier! Quite the opposite in fact!
(Also: I don’t use generative AI to create images here — there are a few from years ago, before it became clear to me the generators were trained on copyrighted images, and I stopped when it was made clear this was done without creator consent — so images are almost all photographed/created by me (or Athena) directly, are non-AI-generated stock images I have a license for (or are Creative Commons or in public domain), or are publicity photos/images which are given out for promotional purposes. I do often tweak them with photo editing tools, primarily Photoshop. But none of the images comes out of a prompt.)
I think there’s a long conversation to be had about at what point the use of software means that something is less about the human creation and more about the machine generation, where someone scratching words onto paper with a fountain pen is on one end of that line, and someone dropping a short prompt into an LLM is on the other, and I strongly suspect that point is a technological moving target, and is probably not on a single axis. That said, for Whatever, I’m pretty satisfied that what we do here is significantly human-forward. The Internet may yet be inundated with “AI” slop, but Whatever is and will remain a small island of human activity.
When OpenAI launched the Sora 2 video generator last week, the company wrote that it was taking measures to "block depictions of public figures" by default. But creators and viewers of Sora 2 videos are finding that prohibition has a rather large loophole, allowing for videos of public figures that happen to be dead.
OpenAI places a moving Sora watermark over each generated video, which limits the risk of viewers being fooled by fake footage of real people. Still, seeing these deceased celebrities used as props by an AI tool can obviously be upsetting to their living relatives and fans.
I have always been haunted in some way by Day-Lewis. He is clearly among the greatest living screen actors, with a career that includes several performances that no one else could have accomplished at his level. But from when I quit acting through to when I wrote my own book on The Method, until now, I have always wondered whether the brilliance he is capable of requires the lengths to which he drives himself. As his techniques have been adopted by a whole generation of self-serious actors both good (Christian Bale) and not (Jared Leto), I have also come to wonder if the legends are even true. It turns out that the answers to both questions are far more complicated than I thought.
As someone who used to write quite a bit about relaxed concentration, I was especially interested in this bit:
Another reason, the one I find most persuasive, is that if you are able to live as fully as possible in the imagined reality of the character, you enter a flow state where you stop thinking and start doing and being. Day-Lewis struggles most in interviews to answer questions that require what he terms “objectifying,” or thinking outside of the headspace of the character. When he is on set, he wants to never be objective, to never interrupt the process of being and doing in order to think. When asked once about specific physical gestures he made in There Will Be Blood, he replied, “my decision-making process has to happen in such a way that I’m absolutely unaware of it, otherwise I’m objectifying a situation that demands something different.” When asked about the meaning of Bill the Butcher in Gangs of New York, he said he can’t answer the question, because “there was no conscious intention to show him as one way or another.”
This state of pure being is the actor’s equivalent of when great athletes are “in the zone,” or the trance that a jazz improviser enters when they’re really cooking. The name for it is a Russian word, perezhivanie, which means experiencing.
“When did you become such an adventurous eater?” my mom often asks me, after I’ve squealed about some meal involving jamón ibérico or numbing spices. The answer is, I don’t know, but I can think of moments throughout my life where food erupted as more than a mere meal: My cousin and his Ivy League rowing team hand-making pumpkin ravioli for me at Thanksgiving. Going to the pre-Amazon Whole Foods and giddily deciding to buy bison bacon for breakfast sandwiches assembled in a dorm kitchen. Eating paneer for the first time in India. Slurping a raw oyster in New Orleans.
What made me even want to try a raw oyster in 2004, despite everything about an oyster telling me NO, was an entire culture emerging promising me I’d be better for it. Food, I was beginning to understand from TV and magazines and whatever blogs existed then, was important. It could be an expression of culture or creativity or cachet, folk art or surrealism or science, but it was something to pay attention to. Mostly, I gleaned that to reject foodieism was to give up on a new and powerful form of social currency. I would, then, become a foodie.
To be a foodie in the mid-aughts meant it wasn’t enough to enjoy French wines and Michelin-starred restaurants. The pursuit of the “best” food, with the broadest definition possible, became a defining trait: a pastry deserving of a two-hour wait, an international trip worth taking just for a bowl of noodles. Knowing the name of a restaurant’s chef was good, but knowing the last four places he’d worked at was better — like knowing the specs of Prince’s guitars. This knowledge was meant to be shared. Foodies traded in Yelp reviews and Chowhound posts, offering tips on the most authentic tortillas and treatises on ramps. Ultimately, we foodies were fans, gleefully devoted to our subculture.
Which inevitably leads to some problems, when, say, the celebrities the subculture has put on a pedestal are revealed to be less-than-honorable actors, or when values like authenticity and craft are inevitably challenged. What it’s historically meant to be a foodie, a fan, has shifted and cracked and been reborn.
And ultimately, it has died. Or at least the term has. To be called a “foodie” now is the equivalent of being hit with an “Okay, boomer.” But while the slang may have changed, the ideals the foodie embodied have been absorbed into all aspects of American culture. There may be different words now, or no words at all, but the story of American food over the past 20 years is one of a speedrun of cultural importance. At this point, who isn’t a foodie?
Once upon a time, there was the gourmand, which even in 1825, lawyer and self-proclaimed gourmand Jean Anthelme Brillat-Savarin felt was misunderstood. “There is a perpetual confusion of gourmandism in its proper connotation with gluttony and voracity,” he writes in his seminal The Physiology of Taste. Gourmandism was not about mere excess, but about appreciation. It was “an impassioned, considered, and habitual preference for whatever pleases the taste,” he writes, a love of delicacies and an “enemy of overindulgence.”
As for who can be a gourmand, Brillat-Savarin posits, in the scientific fashion of the time, that some are chosen by nature to have a heightened sense of taste. And although anyone may be born a gourmand, just as anyone may be born blind or blond, to take advantage of that innate sense requires capital. Being rich doesn’t automatically give one good taste, but “anyone who can pile up a great deal of money easily is almost forced, willy-nilly, to be a gourmand.”
For the next centuries, things mostly stayed that way. It was the wealthy who spent on the finest wines and meats, and in the public imagination, to be a gourmand was in many ways to perform wealth and flaunt access. This was true in a lot of places, whether it was a royal Chinese banquet or through the development of Mughal cuisine, though Brillat-Savarin was speaking squarely from a European stage.
As gourmandism crossed the ocean from Brillat-Savarin in 1800s France to 20th-century America, it was often limited to fine dining and French cuisine; finding joy in the offerings of Grandma’s pot or the Automat did not earn you a culinary title. But in the later 20th century, the purviews of American gourmands were changing, as both access to fine ingredients and knowledge about their preparation became more populist. Craig Claiborne turned restaurant reviews into sites of true arts criticism, and Julia Child and James Beard insisted that the greatest food was completely achievable in your own kitchen, often using humble ingredients. Alice Waters celebrated the fruits of California, and Ruth Reichl championed places like New York Noodletown, a Chinatown spot that she described as “a bare, bright, loud restaurant where the only music was the sound of noodles being slurped at tables all around.”
The scope was widening. But “the thing that makes food both challenging and interesting as a cultural vector is that food is not a mechanically reproducible experience,” says Helen Rosner, food critic at the New Yorker. You still had to be physically in those locations, or have those ingredients in your own kitchen, for it to work. It seemed absurd for someone to care what Chez Panisse was like if they never even had a chance of going. So while new technologies had made other cultural products — music, film, television — easier and cheaper to engage with than ever, allowing new communities to form over their shared interests, food was still a more localized obsession. “If I have an opinion about a movie and I live in Los Angeles, my opinion is still relevant to somebody who lives in Toronto,” says Rosner. “If I have an opinion about bagels and I live in Queens, my opinion is barely relevant to someone who lives more than 10 blocks from my apartment.”
And yet, at the turn of the last century, two platforms developed in food culture that shifted it from an individual identity to a shared one, turning food from culture to pop culture: food television, and the internet.
Chef Hubert Keller looks skeptically at contestant Ken Lee’s pan-seared halibut. The two pieces rest against each other over a soybean puree, encircled by tomato compote and a ring of fig gastrique, like a glamorous mandala. But during Top Chef’s first-ever Quickfire Challenge, Lee has already gotten into trouble by tasting a sauce with his fingers, and arguing after being told that was unsanitary. The cast has turned against him, questioning his hubris in the face of bland fish. Later that episode, he becomes the show’s first chef asked to pack his knives and go.
Top Chef, which premiered in 2006, immersed viewers in the world of the professional kitchen. Chefs use “plate” as a verb, hand things off to the “pass,” don their “whites.” I probably didn’t even need to put those words in quotes, as you already know what they mean. They’re part of our cultural vocabulary now.
How did we get to chefs-holding-squeeze-bottles as entertainment? The 1984 Cable Communications Policy Act deregulated the industry, and by 1992, more than 60 percent of American households had a cable subscription. Food Network launched in 1993, and compared to Julia Child or Joyce Chen drawing adoring viewers on public broadcasting programs, the channel was all killer, no filler, with shows for every mood. By the early 2000s, you could geek out with Alton Brown on Good Eats, experience Italian sensuality with Molto Mario or Everyday Italian, fantasize about a richer life with Barefoot Contessa, or have fun in your busy suburban kitchen with 30 Minute Meals. Anthony Bourdain’s A Cook’s Tour gave viewers an initial taste of his particular brand of smart-alecky wonder, and there were even competition shows, like the Japanese import Iron Chef.
The premiere of 2005’s The Next Food Network Star, which later gave us Guy Fieri, baron of the big bite, was the network’s first admission that we were ready to think of food shows in terms of entertainment, not just instruction and education. But Food Network was still a food network. The mid-aughts brought the revelation that food programming didn’t have to live just there, but could be popular primetime television — when that was an actual time and not just a saying.
Then came Top Chef, inspired by the success of Bravo’s other reality competition series, Project Runway. There is no overstating Top Chef’s lasting influence on food entertainment, but off the bat it did one thing that further cemented foodieism as a bona fide subculture: Its air of professionalism gave people a vocabulary. “The real pushback from the network was but the viewers can’t taste the food,” says Lauren Zalaznick, president of Bravo at the time. But just like the experts on Project Runway could explain good draping to someone who didn’t know how to sew, Top Chef “committed to telling the story of the food in such a way that it would become attainable no matter where you were,” she says.
This gave viewers a shared language to speak about food in their own lives. Now, people who would never taste these dishes had a visual and linguistic reference for molecular gastronomy, and could speculate about Marcel Vigneron’s foams. If you didn’t know what a scallop was, you learned, as Top Chef was awash in them. Yes, you could hear Tom Colicchio critique a classic beurre blanc, but also poke, al pastor, and laksa, and now that language was yours too. And you could hear chefs speak about their own influences and inspirations, learning why exactly they thought to pair watermelon and gnocchi.
The food scene then “was more bifurcated,” says Evan Kleiman, chef and longtime host of KCRW’s Good Food. “There were super-high-end restaurants that were expensive, maybe exclusive, and for the most part represented European cuisines. And then what was called ‘ethnic food’ was often relegated to casual, family-run kind of spots.” Top Chef may have been entertainment for the upwardly mobile foodie, but in 2005, Bourdain’s No Reservations premiered on the Travel Channel, similarly emphasizing storytelling and narrative. In his hands, the best meals often didn’t even require a plate. His was a romantic appreciation of the authentic, the hole-in-the-wall, the kind of stuff that would never be served in a dining room. It set off an entire generation of (often less respectful, less considered) foodie adventurism.
“No Reservations is what got me interested in the culture of eating,” says Elazar Sontag, currently the restaurant editor at Bon Appétit. Because it was about food as culture, not as profession. But there was programming for it all. Also in 2005, Hell’s Kitchen premiered on Fox, with an amped-up recreation of a dinner service in each night’s challenge. “Hell’s Kitchen’s high-octane, insane, intense environment of a restaurant kitchen is actually what made me think, when I was maybe 12 or 13, that I want to work in restaurants,” says Sontag.
All these shows were first and foremost about gathering knowledge, whether it was what, indeed, a gastrique was, or the history of boat noodles in Thailand. It didn’t matter if you’d ever been there. The point was that you knew. “Food was becoming a different kind of cultural currency,” says Sontag. “I didn’t clock that shift happening at the time, but it’s very much continued.”
Language is meant to be spoken; knowledge is meant to be shared. Now that everyone knew there were multiple styles of ramen, there was no better place to flex about it than with a new tool: the social internet. Online, “talking about restaurants and going to restaurants became something that people could have a shared identity about,” says Rosner. “There was this perfect storm of a national explosion of gastronomic vocabulary and a platform on which everybody could show off how much they knew, learn from each other, and engage in this discovery together.” Your opinion about your corner bagel shop suddenly had a much wider relevance.
Sites like Chowhound and eGullet launched in 1997 and 2001, respectively, and became ever more popular hubs for people seeking out interesting food, and homes for seminal food writers like Jonathan Gold and Robert Sietsema. If you were in college in 2004, you might have already been on (the) Facebook; more crucially, that was the year Yelp launched, which allowed users to review local businesses. Almost immediately, it became restaurant-centric. And anyone could start a blog to document their own food opinions.
In 2005, Michelin released its first guide to an American city (New York), and a few years before, The World’s 50 Best Restaurants list declared that anyone who was anyone had better be dining at El Bulli in Spain. But while anyone online could review high-end restaurants like French Laundry and Gramercy Tavern if they wanted to, they’d likely be competing with experienced, professional reviewers. Where the foodies of the internet shined was in highlighting “ethnic food,” following in Bourdain’s worn boots to champion casual places that may not have traditionally gotten mainstream media attention.
In 2006, Zach Brooks was in his 30s, living in Manhattan, and like plenty of other office workers with disposable income, was “stuck in Midtown for lunch,” he says. So in the vein of food blogs he read like Chowhound and, yes, Eater, he began documenting his meals. “To me, lunch hour is sacred- and I’m not going to waste it in some generic overpriced ‘deli,’” he wrote on his blog, Midtown Lunch. Instead, he was dedicated to finding “gems” like the best taco trucks and halal carts in a sea of mediocrity. “There were just so many different immigrant groups, so you had access to so many different kinds of food, and I think there’s a natural curiosity,” says Brooks. Like many early food bloggers, he was white, and took an almost explorational attitude toward his mission, traipsing to the carts and counters of Midtown like points on a globe. “Like, why wouldn’t you want to try everything?”
It might sound obvious now, but the internet allowed you to find opinions and experiences outside of your immediate social circle; your coworker might not have known where to go for lunch, but some guy online knew where you could get a plate of Ecuadorian food three blocks away. And the enthusiasm with which bloggers began to share and review “lowbrow” meals created a culture in which those meals began to rise in value. “I think what came out of this time period was that it wasn’t just about the fine dining world anymore,” says chef Sam Yoo. “It was cool to go to Jackson Heights or Flushing, and find hole-in-the-wall momos.”
Because the 2008 recession made it even harder for most people to experience fine dining, food trucks and cheap eats moved closer to the center of the culinary world, such that a chef like Roy Choi could open his Kogi BBQ truck that year and be named Bon Appétit’s Best New Chef for it shortly after. This shift was also happening as social media began to be ever more convenient (the iPhone came out in 2007) and visual (the first YouTube video was uploaded in 2005; Instagram launched in 2010). All this further flattened the culinary landscape of the internet. You could now, for the first time ever, take a photo of what you were eating and upload it to the internet before you even took a bite. A $3 taco and a plate of duck at Momofuku Ko would show up the same size on an Instagram grid — and could get the same number of likes.
Ultimately, the internet fueled a great democratization of knowledge and experience around food. “Your access to information is so much easier than it was before,” says Kleiman. “You don’t need to get on a plane and fly to Switzerland to learn about some dish, or even to try and make it in your home. All you have to do is look at your phone and click on it.” Through television and the internet, you could become well-informed about Indian cuisine without ever having been to the country, and then debate with strangers which restaurant in your city has the best tandoori. You could learn how to make sushi on YouTube, or just watch one of Epic Meal Time’s videos of bacon-covered monstrosities for a laugh. There wasn’t just Top Chef, but Top Chef recap blogs and cooking parties, subreddits where fans developed parasocial relationships with the stars of the Bon Appétit test kitchen, drama in Yelp Elite circles, and food festivals. Everyone started a food blog, and one of them was turned into a movie, Julie & Julia. Everyone posted photos of their lunch.
“Once upon a time, food was about where you came from,” wrote John Lanchester for the New Yorker in 2014. “Now, for many of us, it is about where we want to go — about who we want to be, how we choose to live.” The gourmand was dead. The foodie had been born.
The greatest innovation of food in the 21st century is that diners aren’t just diners anymore, they are fans. Literally, by definition. “There are three essential components of something that is a fandom,” says Mel Stanfill, an associate professor at the University of Central Florida who studies media and fandom. “There is an emotional attachment. It’s something that’s being interpreted. And there is a community, so you’re doing that interpretation in relation to each other.” This is how Marvel characters went from being the purview of nerds to the subject of mainstream action films, how Fifty Shades of Grey went from Twilight fanfic to best-seller, and how food went from something you enjoyed to something you consumed with every part of your life, not just your mouth.
Foodieism checks all the boxes of fandom: You absorb the stories being told on TV, you iterate on recipes in your own kitchen, and you post online about what you eat to a wider community. But when any fandom explodes into wider visibility, gathering new fans and bigger communities, there are always conflicts, often stemming from newer participants bringing a critical eye to how things have been done. And with foodieism, there was plenty to criticize.
Foodie culture, at its start, was bolstered largely by white bloggers and chefs who, perhaps admirably, wanted to break out from Euronormativity and geek out over other cultures. “It was an incredible expansion of the white gaze, but it was, nevertheless, the gaze,” says Rosner. “Authenticity” was the bar to meet, a very real concern when plenty of Americans equated Mexican cuisine with Old El Paso, and every day it seemed like another white person was opening an “Asian” restaurant while at the same time disparaging Asian traditions.
People of color have been part of this conversation from the jump. But the devotion to authenticity, often by people who came from the outside to the cultures they were defending, started to feel like a trap, like we had to live up to others’ expectations instead of our own. The rise of third-culture cooking — multicultural cuisine often done by members of a diaspora who meld family tradition with wider influences — is in part a reaction to the white foodie’s feverish classifications, a way to say we will define our cuisine, or invent an entirely new one, for ourselves, thank you very much.
There was also the machoness of it all, the fawning over the badass, boys' club, rock-star chef who gives no fucks and makes no menu substitutions. There was a fetishization of the tough world of the kitchen, the yelling and punishment and hedonism it seemed to require. In hindsight it was perhaps predictable that many of the chefs lauded as the hardest and brashest turned out to be accused of abuse, racism, or sexual misconduct in the kitchen.
There were genuine attempts to improve the industry, both the professional kitchen and the media that surrounds it. There were widespread conversations about the white gaze’s role in food culture, and how often white voices and tastes were elevated over others. Chefs spoke out more about mental health struggles and discrimination. And for a moment, it looked like there could be a reckoning in the role of the foodie, too. Should we really be fans of such fallible people? Was there a better way to engage?
When the locus of a fandom falls from grace, “people experience that something has been taken away from them, something they used to like,” says Stanfill. But instead of pulling back on the intensity of fandom, usually something else just fills the hole. The internet allows foodies to find community and engage in fandom together, but also find new people to fixate on. And as the social internet grew, everyone could become their own content creator.
“There was no plan, because back then, there was no such thing as an ‘influencer’ or ‘content creator.’ Those words didn’t exist yet,” says Mike Chau, who since 2013 has been operating his food account, @foodbabyny. Chau, who still has a full-time job, says he started and continues just for the fun of trying new things with his family around New York, with the occasional perk of an invitation to a restaurant opening. But more recently he’s noticed shifts, from the paid opportunities available for influencers to the increased opportunity for “virality,” as Instagram and TikTok algorithms can give anyone, no matter their following, their 15 minutes of fame. In Chau’s opinion, this has altered the way influencers do business (namely that they are doing business at all) by focusing on engagement rather than their own enjoyment of the food at the center of their content. “Talking to other influencers, you hear them say, I’ve got to go here, this place would do well on my page. I think that’s the main driving force,” he says.
“We live and die by our superlatives in a way that I think speaks less to the whims of food media leaders and more to the state of the internet,” says Sontag. If the initial draw of being a foodie is that being a part of this culture will enrich your life in some way, the algorithm has made that enrichment a matter of the “best” places to go. “I think that has created a significant shift in food culture, the need for everything to be a promise of a superior experience,” says Sontag.
What was once the promise that food could be a source of knowledge, culture, and joy now feels more like the pressure that every meal must be the best one, that the risk of trying something unvetted — once the whole point — is too great. “You have this exploration, and then these loud and charismatic people declare that actually we found everything, and these are the best, you can stop looking,” says Rosner. The world of fandom whittled down to a checklist.
“You have to try the mouthfeel of the mignonette,” Tyler gushes at his date, Margot. He won’t stop saying things like this. It’s awful. He cries over scallops and scolds Margot for smoking, which he says will ruin her palate. You want him to die, which, luckily, he does.
The Menu is a movie about a chef so fed up with simpering, obsessive foodies like Tyler (Nicholas Hoult) that he’s willing to destroy himself and everything he’s ever created just to get back at them for ruining his profession. As part of this punishment, chef Slowik (Ralph Fiennes) invites Tyler into the kitchen, insisting that if he’s so knowledgeable, why doesn’t he cook dinner himself? Tyler fumbles, asking for ingredients and equipment he does not know how to wield. “Shallots for the great foodie!” mocks Slowik, before tasting Tyler’s creation and spitting it out. “You are why the mystery has been drained from our art.”
Who would wanna be like that guy? Tyler’s supposed superior knowledge and appreciation of food doesn’t make him a great critic or thinker. It just makes him annoying, and crucially, isolates him from any community he could seek through it. (He spends half the meal sneering at other diners for being unworthy — the narcissism of small differences.) He doesn’t want to connect over food, he wants to brag. This is where “foodie” has landed. At worst, it’s an insult that marks you as a melodramatic know-it-all who turns every meal into a lesson.
It’s also plain outdated. To call anyone a “foodie” with any sincerity in 2025 is like asking a “metrosexual” about the “truthiness” of his “blog.” You’d sound ridiculous. That’s because there’s no real need anymore to name the idea that one should be stimulated by and curious about food, and no point in making it your entire personality. Because in a fifth of the time it took, say, film, to achieve the same results, being “into” food went from niche interest to a fandom to mass culture. This is just what we do now. “Now when people say that they’re a foodie, I’m like, yeah, you, me, and my uncle,” says Sontag.
Maybe it’s easy to think we’ve grown beyond whoever we were in 2009, drinking bacon-washed cocktails out of mason jars and demanding to know which farm exactly the pig came from. But the undercurrent of foodieism — food as culture, worthy of active engagement — thrives, even if the title has died. Enough people are familiar with the inner workings of the brigade system such that movies like Ratatouille and The Menu, or a TV show like The Bear, can be not just legible, but successes. There’s Thai curry sauce at the Buffalo Wild Wings. People magazine wrote an article about “tomato girls.”
Even so, I find myself nostalgic for the era when “foodie” was a badge of honor, as every restaurant opening seems to be a steakhouse, as Gen Z opts for suburban chains like Chili’s over anything new and independent, and as tariffs threaten access to spices and other global ingredients, especially from non-European origins. I get the desire for the safety of the known, especially when even the most mediocre meal can set you back in rent payments. But I miss when the coolest thing you could do was geek out over where your food came from, who was making it, and what made it special. “A term like foodie was an indicator that you put in some level of legwork,” says Sontag.
But just because that legwork is now part of the cultural fabric, and just because it’s easier to do, doesn’t mean it’s not still work. You can watch a million TikToks, but to engage, you still need to go to the hot bakery. You still need to actually make the ramen you saw on YouTube. You still need to get the reservation, and then your taste buds have to wrap themselves around a chutney pizza made by a second-generation chef and open themselves up to what is happening. To paraphrase a modern poet, no one else can taste it for you. Hell, you still need to watch the TikToks. There is no outsourcing this. And as the past 20 years have cemented, even if we could, few would want to. We’re foodies through and through. Even if we don’t want to say it.
The moment I learned how to program, I wanted to experiment with my new superpowers. Building a BMI calculator in the command line wouldn't cut it. I didn't want to read another book, or follow any other tutorial. What I wanted was to experience chaos. Controlled, beautiful, instructive chaos that comes from building something real and watching it spectacularly fail.
That's why whenever someone asks me how they can practice their newfound skill, I suggest something that might sound old-fashioned in our framework-obsessed world. Build your own blog from scratch. Not with WordPress. Not with Next.js or Gatsby or whatever the cool kids are using this week. I mean actually build it. Write every messy, imperfect line of code.
A blog is deceptively simple. On the surface, it's just text on a page. But underneath? It's a complete web application in miniature. It accepts input (your writing). It stores data (your posts). It processes logic (routing, formatting, displaying). It generates output (the pages people read).
When I was in college, I found myself increasingly frustrated with the abstract nature of what we were learning. We'd implement different sorting algorithms, and I'd think: "Okay, but when does this actually matter?" We'd study data structures in isolation, divorced from any practical purpose. It all felt theoretical, like memorizing chess moves without ever playing a game.
Building a blog changed that completely. Suddenly, a data structure wasn't just an abstract concept floating in a textbook. It was the actual list of blog posts I needed to sort by date. A database wasn't a theoretical collection of tables; it was the real place where my article drafts lived, where I could accidentally delete something important at 2 AM and learn about backups the hard way.
This is what makes a blog such a powerful learning tool. You can deploy it. Share it. Watch people actually read the words your code is serving up. It's real. That feedback loop, the connection between your code and something tangible in the world, is irreplaceable.
You Don't Need Tutorials Anymore
So how do you start? I'm not going to give you a step-by-step tutorial. You've probably already done a dozen of those. You follow along, copy the code, everything works perfectly, and then... you close the browser tab and realize you've learned almost nothing. The code evaporates from your memory because you never truly owned it.
Instead, I'm giving you permission to experiment. To fumble. To build something weird and uniquely yours.
You can start with a single file. Maybe it's an index.php that clumsily echoes "Hello World" onto a blank page. Or perhaps you're feeling adventurous and fire up a Node.js server with an index.js that doesn't use Express to handle a simple GET request. Pick any language you are familiar with and make it respond to a web request.
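If the Node.js route appeals to you, that first file might look something like the sketch below. The filename and port are arbitrary choices of mine; the point is only that a handful of lines using the built-in http module is enough to answer a browser.

```js
// index.js: a bare Node.js server with no framework, just the built-in http module.
// Run it with `node index.js`, then open http://localhost:3000 in a browser.
const http = require('http');

const server = http.createServer((req, res) => {
  // For now, every request gets the same page, regardless of the URL.
  res.writeHead(200, { 'Content-Type': 'text/html' });
  res.end('<h1>Hello World</h1>');
});

server.listen(3000, () => {
  console.log('Listening on http://localhost:3000');
});
```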
That's your seed. Everything else grows from there.
The Problems You'll Encounter
Once you have that first file responding, the questions start arriving. Not abstract homework questions, but real problems that need solving.
Where do your blog posts live? Will you store them as simple Markdown or JSON files in a folder? Or will you take the plunge into databases, setting up MySQL or PostgreSQL and learning SQL to INSERT and SELECT your articles?
I started my first blog with flat files. There's something beautiful about the simplicity. Each post is just a text file you can open in any editor. But then I wanted tags, and search, and suddenly I was reinventing databases poorly. That's when I learned why databases exist. Not from a lecture, but from feeling the pain of their absence.
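To make the flat-file idea concrete, here is a minimal sketch of what that "database" can be at first. Everything about it is an assumption of mine: a posts/ folder of JSON files, each with a title, date, and body, plus a slug taken from the filename.

```js
// posts.js: a hypothetical flat-file "database", one JSON file per post in ./posts
const fs = require('fs');
const path = require('path');

function loadPosts(dir = path.join(__dirname, 'posts')) {
  return fs.readdirSync(dir)
    .filter((name) => name.endsWith('.json'))
    .map((name) => ({
      slug: name.replace(/\.json$/, ''), // URL-friendly id taken from the filename
      ...JSON.parse(fs.readFileSync(path.join(dir, name), 'utf8')),
    }))
    // Newest first. The moment you want tags or full-text search, this is where it starts to hurt.
    .sort((a, b) => new Date(b.date) - new Date(a.date));
}

module.exports = { loadPosts };
```

Reading every file on every request is obviously wasteful, and noticing that is exactly how the caching and database questions announce themselves.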
You write your first post. Great! You write your second post. Cool! On the third post, you realize you're copying and pasting the same HTML header and footer, and you remember learning something about DRY (don't repeat yourself) in class.
This is where you'll inevitably invent your own primitive templating system. Maybe you start with simple includes: include('header.php') at the top of each page in PHP. Maybe you write a JavaScript function that stitches together HTML strings. Maybe you create your own bizarre templating syntax. It will feel like magic when it works. It will feel like a nightmare when you need to change something and it breaks everywhere.
And that's the moment you'll understand why templating engines exist.
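A string-stitching version of that homemade system might look something like this. It is a sketch of the approach, not a recommendation; note that nothing here escapes HTML, which is exactly the kind of hole you will rediscover later.

```js
// layout.js: a homemade "templating engine", one function that wraps content in the shared shell.
function renderPage(title, bodyHtml) {
  return `<!DOCTYPE html>
<html>
  <head><title>${title}</title></head>
  <body>
    <header><a href="/">My Blog</a></header>
    ${bodyHtml}
    <footer>Handcrafted HTML, no framework harmed</footer>
  </body>
</html>`;
}

module.exports = { renderPage };
```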
I had a few blog posts written down on my computer when I started thinking about this next problem: How do you write a new post? Do you SSH into your server and directly edit a post-1.json file with vim? Do you build a crude, password-protected /admin page with a textarea that writes to your flat files? Do you create a whole separate submission form?
This is where you'll grapple with forms, authentication (or a hilariously insecure makeshift version of it), file permissions, and the difference between GET and POST requests. You'll probably build something that would make a security professional weep, and that's okay. You'll learn by making it better.
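For a sense of what that crude version looks like in practice, here is a sketch of the kind of /admin endpoint described above, building on the hypothetical posts/ folder from the earlier sketch. The hard-coded password and the form fields are stand-ins, and the whole thing is exactly the sort of code you will later be glad you learned to distrust.

```js
// admin.js: a deliberately naive admin endpoint. GET shows a form, POST writes a post to disk.
const http = require('http');
const fs = require('fs');
const path = require('path');

const POSTS_DIR = path.join(__dirname, 'posts');
const SECRET = 'hunter2'; // placeholder "authentication", not how you do this for real

fs.mkdirSync(POSTS_DIR, { recursive: true });

http.createServer((req, res) => {
  if (req.url === '/admin' && req.method === 'GET') {
    // GET: show the world's plainest publishing form.
    res.writeHead(200, { 'Content-Type': 'text/html' });
    res.end(`<form method="POST" action="/admin">
      <input name="password" type="password" placeholder="password">
      <input name="title" placeholder="title">
      <textarea name="body"></textarea>
      <button>Publish</button>
    </form>`);
  } else if (req.url === '/admin' && req.method === 'POST') {
    // POST: collect the url-encoded body, check the "password", write a JSON file.
    let raw = '';
    req.on('data', (chunk) => { raw += chunk; });
    req.on('end', () => {
      const form = new URLSearchParams(raw);
      if (form.get('password') !== SECRET) {
        res.writeHead(403);
        return res.end('Nope.');
      }
      const post = { title: form.get('title'), body: form.get('body'), date: new Date().toISOString() };
      fs.writeFileSync(path.join(POSTS_DIR, `${Date.now()}.json`), JSON.stringify(post, null, 2));
      res.writeHead(302, { Location: '/' });
      res.end();
    });
  } else {
    res.writeHead(404);
    res.end('Not found');
  }
}).listen(3000);
```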
It's one thing to write code in a sandbox, but a blog needs to be accessible on the Internet. That means getting a domain name (ten bucks a year). Finding a cheap VPS (five bucks a month). Learning to ssh into that server. Wrestling with Nginx or Apache to actually serve your files. Discovering what "port 80" means, why your site isn't loading, why DNS takes forever to propagate, and why everything works on your laptop but breaks in production.
These aren't inconveniences; they're the entire point. This is the knowledge that separates someone who can write code from someone who can ship code.
Your Homegrown Solutions Will Be Terrible
Your blog won't use battle-tested frameworks or well-documented libraries. It will use your solutions. Your weird routing system. Your questionable caching mechanism. Your creative interpretation of MVC architecture.
Your homemade caching will fail spectacularly under traffic (what traffic?!). Your clever URL routing will throw mysterious 404 errors. You'll accidentally delete a post and discover your backup system doesn't work. You'll misspell a variable name and spend three hours debugging before you spot it. You'll introduce a security vulnerability so obvious that even you'll laugh when you finally notice it.
None of this is failure. This is the entire point.
When your blog breaks, you'll be forced to understand the why behind everything. Why do frameworks exist? Because you just spent six hours solving a problem that Express handles in three lines. Why do ORMs exist? Because you just wrote 200 lines of SQL validation logic that Sequelize does automatically. Why do people use TypeScript? Because you just had a bug caused by accidentally treating a string like a number.
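For a sense of scale, here is roughly what the routing collapses into once a framework is involved. This is a hedged sketch: it assumes Express (the real, widely used library) plus the hypothetical loadPosts helper from the flat-file sketch above.

```js
// The same routing idea with Express: URL parsing, parameters, and the 404 path are handled for you.
const express = require('express'); // npm install express
const { loadPosts } = require('./posts'); // the hypothetical flat-file helper sketched earlier

const app = express();

app.get('/posts/:slug', (req, res) => {
  const post = loadPosts().find((p) => p.slug === req.params.slug);
  if (!post) return res.status(404).send('Not found');
  res.send(post.body);
});

app.listen(3000);
```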
You'll emerge from this experience not just as someone who can use tools, but as someone who understands what problems those tools were built to solve. That understanding is what transforms a code-copier into a developer.
We've Lost Something Important
Building your own blogging engine used to be a rite of passage. Before Medium and WordPress and Ghost, before React and Vue and Svelte, developers learned by building exactly this. A simple CMS. A place to write. Something that was theirs.
We've lost a bit of that spirit. Now everyone's already decided they'll use React on the frontend and Node on the backend before they even know why. The tools have become the default, not the solution.
Your blog is your chance to recover that exploratory mindset. It's your sandbox. Nobody's judging. Nobody's watching. You're not optimizing for scale or maintainability or impressing your coworkers. You're learning, deeply and permanently, by building something that matters to you.
So here's my challenge: Stop reading. Stop planning. Stop researching the "best" way to do this.
Create a folder. Create a file. Pick a language and make it print "Hello World" in a browser. Then ask yourself: "How do I make this show a blog post?" And then: "How do I make it show two blog posts?" And then: "How do I make it show the most recent one first?"
Build something uniquely, personally, wonderfully yours. Make it ugly. Make it weird. Make it work, then break it, then fix it again.
Embrace the technical chaos. This is how you learn. Not by following instructions, but by discovering problems, attempting solutions, failing, iterating, and eventually (accidentally) building something real.
Your blog won't be perfect. It will probably be kind of a mess. But it will be yours, and you will understand every line of code in it, and that understanding is worth more than any tutorial completion certificate.
If you don't know what that first blog post will be, I have an idea. Document your process of building your very own blog from scratch. The blog you build to learn programming becomes the perfect place to share what programming taught you.
Welcome to development. The real kind, where things break and you figure out why. You're going to love it.
I believed that giving users such a simple way to navigate the internet would unlock creativity and collaboration on a global scale. If you could put anything on it, then after a while, it would have everything on it.
But for the web to have everything on it, everyone had to be able to use it, and want to do so. This was already asking a lot. I couldn’t also ask that they pay for each search or upload they made. In order to succeed, therefore, it would have to be free. That’s why, in 1993, I convinced my Cern managers to donate the intellectual property of the world wide web, putting it into the public domain. We gave the web away to everyone.
— Tim Berners-Lee, Why I gave the world wide web away for free
Language models are statistical techniques for learning a probability distribution over linguistic data by observing samples of language use and encoding their observed presence in a large number of probability parameters. This probability distribution is a representation of human language. With increased processing power and well-designed process models, the resulting language model can be used as a component in a system to generate language for some purpose. Today’s generative language models are tasked to output the most probable or plausible sequence of strings given some window of relevant preceding discourse. Such a generative language model is not a communicative agent in its own right, but a representation of observations of general language use, much as a lexicon and a grammar would be, albeit one which can be interrogated with less expertise.
Probability Distributions Alone Do Not a Language Make
The variation inherent in the probability distributions of linguistic items is enormous, and there is no a priori single most plausible, most appropriate, and most useful path through the possible continuations of a string provided by preceding discourse. To help manage the decision space, the probability distribution can be modulated by modifying the representation of the model with a smaller selected sample of task-specific texts in a fine-tuning,6 instruction training,9 and alignment1 training process, and through carefully curated human feedback sessions where human assessors rate variant outputs for acceptability.8
The intention of such additional fine-tuning, alignment, and instruction is to enable a generative language model to generate task- and situation-appropriate material to fit a discourse of interest, essentially to create a voice out of language. Doing so is not a value-neutral process, and imposes normative constraints on the linguistic capacity of language models.
We Are in an Eliza Moment
Humans practice language skills over their lifetimes and put great value on using language effectively. Educational institutions spend great effort instructing us to follow conventions that are based on general virtues, largely arbitrary in their details, external to the code of language itself, and culturally defined. When human language users encounter conversational counterparts that are able to produce fluent and coherent contributions to discourse, they ascribe sophistication to those counterparts and view them in a positive light. Generative language models produce impressively fluent language, and human users of such models are prone to adopt an intentional stance2 toward them, and to believe that a generative language model is an erudite and intelligent counterpart. This is, on balance, unfortunate, and results from conflating linguistic competence with other desirable human characteristics.
This deep-seated human instinct leads us to use terminology such as “trustworthiness” and “truthfulness” to describe qualities of the output of a generative language model, and to label some of the output “hallucination.” In fact, in light of the current architecture, representation scheme, and processing model, the entire output of a typical generative language model is hallucinated: language model output is not grounded in anything language-external. The target notion of a generative language model is to provide plausible strings as specified by its probability distribution over string segments, and “truthful” or “trustworthy” are not relevant concepts for describing such output. The fact that those strings occasionally, or even mostly, constitute language that conforms with human experience does not change the fact that labels such as “truthful” and “untruthful” do not apply to the language model itself.
This is as it should be: if the linguistic capacity of a system were constrained by adherence to some notion of truth, it would be less versatile and useful, and our human language processing components are perfectly capable of prevarication, deception, and dishonesty, intentionally or not. This is a desirable property of both language and language users. Language is not truthful or untruthful in itself: utterances are, by virtue of being pronounced in some context, for some purpose, by some language user.
Language models by themselves lack such purpose. Agents built on top of language models, however, might be implemented to have purpose. Consider an agent (or a bot) that fetches information from a database using input and output in natural language, and that uses a language model to manage the language interaction. If such an agent produces an output that is inconsistent with the information in the database, it would be relevant to talk about truthfulness, but this truthfulness would be a property of the entire system, and not of the language model.
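A caricature of such a system may help make the division of labor concrete. In the sketch below (every name, and the fake model call, is invented for illustration), the database supplies the facts and the language model only supplies the phrasing; whether the final sentence is truthful is a question about the whole pipeline's relation to the database, not about the model.

```js
// A toy database-backed agent: facts come from the store, wording comes from the "model".
const inventory = new Map([['widget-7', { inStock: 4 }]]); // stand-in database

// Stand-in for a call to some generative language model: it only rephrases the facts it is given.
function phraseWithLanguageModel(facts) {
  return `We currently have ${facts.inStock} units of ${facts.item} in stock.`;
}

function answer(question) {
  const item = (question.match(/widget-\d+/) || [])[0]; // crude extraction of the lookup key
  const record = item && inventory.get(item);
  if (!record) return 'I could not find that item.';
  // Truthfulness is checkable here: does the generated sentence agree with the record?
  return phraseWithLanguageModel({ item, inStock: record.inStock });
}

console.log(answer('How many widget-7 do you have?'));
```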
Behavioral Conventions
Human communicative behavior is governed by conventions and rules at widely varying levels of abstraction, and many of those conventions are fairly universal. There are human conversational principles that are easy to agree with: most everyone in most every culture will agree that one’s verbal behavior should be honest, relevant, and mindful of one’s counterparts’ feelings; in general, we wish to express ourselves in ways that are informative and that simultaneously establish our social position with an appropriate level of gloss and shine.a How these partially contradictory objectives are reconciled and realized varies across cultures. Linguistics, as a field of study, offers theoretical tools for the analysis of such behavioral choices and actions.
The British linguist Paul Grice formulated a Cooperative Principle for human-human conversation, which in short is “Make your contribution such as is required for the purposes of the conversation in which you are engaged,” and which is further specialized into four more specific Maxims of Conversation: the maxims of Quantity, Quality, Manner, and Relevance. Broadly, contributions to a conversation should be neither too verbose nor too terse, should be truthful, should hold to the topic of the conversation at hand, and should be comprehensible and fluent.3
Elaborating on these maxims, linguists study how social considerations and politeness modulate linguistic choices, and how they facilitate or get in the way of efficient interaction.b Partial formalizations include the Rules of Politeness, such as “Don’t impose,” “Give options,” and “Be friendly,”4 and the Politeness Principle, “Minimize the expression of impolite beliefs,” which is applied by weighing relative costs and benefits and the extent of interlocutor praise and criticism.5
Most everyone will agree with the principles and maxims as stated. Adhering to such conversational conventions is good practice; departing from them will be viewed as an anomaly and carries social costs, whether done on purpose to achieve some desired effect or inadvertently. On the level of abstraction that the Maxims of Conversation and Rules of Politeness are given, they seem eminently reasonable, to the point of being self-evident and thus unhelpful as a theoretical tool.
Yet we find in the lively research areas of pragmatics, sociolinguistics, and conversational behavior analysis that conversational style differs noticeably and significantly across situations and counterpart configurationsc and, more importantly, across cultures. How the maxims are operationalized, and how their various goals are balanced against each other, varies from situation to situation, from language to language, and from culture to culture. In some cultural areas terseness is interpreted as rudeness; in others, verbosity is considered overbearing. Requests can in some cultural contexts be given as explicit imperatives (“Get the notes for me”), while in others they must be reformulated more indirectly as modal questions (“Could I borrow the notes from you?”).7
Conversational conventions are only rarely explicitly accessible to participants in conversation; mostly they are acquired through interaction with others. Anecdotes abound of second-language learners blundering through a conversation in disregard of local conventions that are thought to be obvious. The language user must tread a careful path between brusqueness and obsequiousness: errors in either direction detract from the perceived trustworthiness and reliability of the conversational participant.
Conversational conventions are realized as linguistic and para-linguistic behavior in ways that are present in any collection of language. Appropriate behavioral guidelines are thus accessible to a language model with sufficient training data, but how these are realized in the generative phase depends crucially on the fine-tuning, alignment, and instruction training mentioned in this Opinion column. Those processes are not value-neutral, nor are they universal, even if the abstract principles they are based on are. This generalizes obviously to notions of bias, safety, harmfulness, and trustworthiness, all of which are generally accepted but encoded variously in situations, communities, and cultures.
Conclusion
Language is quite similar across populations, cultures, and individuals; the purpose of communication is approximately the same wherever and whoever you are. Our systems of expression involve choices related to modality, certainty, mindfulness, and courtesy in every language. Many features of language are hence quite amenable to transfer learning. But how these human universalia are instantiated across languages is governed by culturally specific conventions and preferences. Linguistic principles could be formalized further into cultural parameters, which would allow us to avoid culturally specific markers that may come across as insensitive, impertinent, or imperious if translated blindly.
Today, most datasets for instruction tuning are translated from one language (almost always North American English) into others using automatic translation tools. Current translation tools are not built to modify instruction datasets to fit cultural patterns and sensitivities; they render training items rather literally, occasionally to the point of making them nonsensical.d Such instruction sets need grounding in the communicative practices of a culture. Instruction sets will also outlast most of the models trained on them, so to ensure their quality, especially for smaller speaker communities less well served by large-scale technology, a community-based effort to formulate them around local concerns is a sustainable investment that will be valuable in the long run.
In general, fine-tuning and alignment of a model’s parameters is a non-reversible operation, executed through some cultural or ideological lens. A model which has been rendered acceptable for a certain cultural or ideological context is likely to be of less use in the general case, and such modifications should be done openly and transparently by properly recording the provenance of alignment and instruction training sets on model cards; models that claim to be open should be made available both in unaligned and aligned form. The analogy to grammars and lexica is appropriate here: a lexicon and a grammar unable to describe rude behavior, verbal aggression, or cursing is both less generally useful and less empirically accurate than one that does.
Alignment, instruction training, and fine-tuning cannot, in general, be done by reference to universal values, however virtuous, since while such values are easy to agree with, they mean different things to different people, depending on a wide range of cultural and situational factors.