
When does ‘good at maths’ actually mean good at maths?


I almost failed one of my first year university mathematics exams. I suspect some people have a vision of university maths as just doing bigger calculations. More numbers. Longer equations. Solve for x, y and z.

But new undergrads soon hit a steep learning curve with what’s known as ‘mathematical analysis’. Suddenly, maths is no longer just about performing calculations; it’s about building rigorous proofs.

This means being able to solve problems like the one below:

Let (an) be a real sequence. Prove that if (an) converges to both L and M, then L = M.

Now you might be thinking ‘well, if (an) converges to both L and M, then it’s obvious L = M’. And if you thought this then, much like first-year undergraduate Adam, you’d leave the exam with very few marks¹.
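For the curious, here is one standard way the rigorous argument goes (a sketch of the classic triangle-inequality proof, not the exact model solution from that exam):

```latex
\textbf{Claim.} If $(a_n)$ converges to both $L$ and $M$, then $L = M$.

\textbf{Proof sketch.} Suppose $L \neq M$ and set $\varepsilon = |L - M|/2 > 0$.
Convergence gives $N_1, N_2$ such that
\[
  |a_n - L| < \varepsilon \ \text{for all } n \ge N_1,
  \qquad
  |a_n - M| < \varepsilon \ \text{for all } n \ge N_2.
\]
Then for any $n \ge \max(N_1, N_2)$, the triangle inequality yields
\[
  |L - M| \le |L - a_n| + |a_n - M| < 2\varepsilon = |L - M|,
\]
a contradiction. Hence $L = M$. $\blacksquare$
```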

There’s good reason for demanding such rigour; in the 19th Century, many theorems assumed to be ‘obvious’ would end up collapsing when they encountered concepts like infinity. It would take mathematicians like Karl Weierstrass, Bernhard Riemann and others to put things back on a solid footing, and make sure proofs behaved even when dealing with things that were infinitely small or infinitely large.

Over time, I learned how to be good and rigorous at mathematical analysis, and by the end of my degree, these exams produced some of my highest marks. In turn, I’d spend my PhD getting better at Bayesian inference, a topic that I didn’t focus on so much as an undergrad. In the process, some previous expertise fell by the wayside. If you’d asked me in sixth form to write down Newton’s equations of motion off the cuff, it would have been easy for me. But if you asked me as a PhD student, I’d have had to look them up.

Exams that were once hard had become easy, and vice versa. I remember one morning during my PhD, I’d been sat discussing some questions from that year’s Sixth Term Examination Paper (STEP) with my fellow students. STEP is an exam for A-Level students hoping to get on to leading UK maths degree courses; it’s designed to reach beyond school level, examining university-like mathematical thinking. It’s administered by the University of Cambridge, and PhD students would often help out with marking.

When I’d taken STEP at school, I’d found the physics questions easiest. The main challenge was adapting familiar equations to new questions. Conceptual proof-like problems about properties of numbers or functions seemed much harder. Yet when I looked over that STEP paper years later in the Cambridge coffee room, it hit me that the opposite was now true. The abstract questions now seemed easy and the physics questions drew a blank. I’d forgotten the physics equations I’d rote-learned, while building the logical toolkit I needed to tackle pure maths questions.

In other words, my ability hadn’t increased evenly; it had become spiky. Training to build depth in one area had come at the cost of breadth in another.

This ‘spikiness’ in knowledge is now a common theme in discussion of AI skill. For leading models, the performance profile isn’t rounded - it sticks out far in certain areas, and not much at all in others.

This isn’t an accident. AI models are effectively targeting top marks on the same narrow set of exams. And that makes it difficult to define what ‘good’ means.

In a recent post, Sukhareva pointed out that many LLM performance benchmarks aren’t necessarily as impressive as they might seem. Take the AIME 2025 mathematical benchmark, with questions taken from the 2025 American Invitational Mathematics Examination. GPT-5.2 scored 100% in AIME 2025. As Sukhareva notes:

But could OpenAI fine-tune their model on AIME 2025 to get 100%?

They don’t even need to. The questions and answers are all over the internet. These thirty questions are public, they could have just trained on them or fine-tune an already-trained model on it if the dataset was published after knowledge cut-off.

If this is the case, it’s like a student boasting they’ve got top marks on a past exam paper, after having used that same paper as a revision tool.

This can explain why LLMs that appear very good at some narrow tasks can perform very poorly on others. When learning is focused on a narrow region of the space of possible problems, it can deliver impressive results so long as the task remains stable and repetitive. But, much like a maths student relying heavily on past exam experience, it can lead to confident overfitting when the structure of the problem changes.
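The past-exam analogy can be made concrete with a toy curve-fitting sketch (my own illustration, not from the post; the choice of exp(x), ten points, and a degree-9 polynomial is arbitrary): a model flexible enough to "memorise" a narrow training region can look flawless there and still fail badly once the problem moves outside it.

```python
import numpy as np

# "Past papers": 10 training points from exp(x), all drawn from the
# narrow region [0, 1].
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.exp(x_train)

# A degree-9 polynomial has enough free parameters to pass through all
# ten training points -- the curve-fitting analogue of memorising the
# mark scheme.
coeffs = np.polyfit(x_train, y_train, deg=9)

# On the data it was fitted to, the model looks essentially perfect...
in_sample_err = np.max(np.abs(np.polyval(coeffs, x_train) - y_train))

# ...but outside the training region the fit falls apart, because a
# low-order polynomial cannot track exp(x) far from its fitting points.
x_new = 6.0
out_of_sample_err = abs(np.polyval(coeffs, x_new) - np.exp(x_new))

print(f"in-sample error:     {in_sample_err:.2e}")
print(f"out-of-sample error: {out_of_sample_err:.2e}")
```

The in-sample error is negligible while the extrapolation error is large: 'good' performance on the familiar problems says little about the unfamiliar ones.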

This can also be true of human expertise. Great pure mathematicians may struggle with data-driven problems. Strong statisticians may be uncomfortable with physics. Talented physicists may be weak at pure maths.

The reason researchers poured so much effort into the field of mathematical analysis in the 19th Century? They wanted to be confident that their results would actually hold true. They didn’t want awkward counterexamples – or ‘mathematical monsters’, as some called them – to come along in future and trample on their work.

But the spikiness of AI knowledge, and risk of overfitting to narrow tasks, means we don’t currently have that confidence when it comes to artificial skill. What seems like ‘good’ performance in one situation won’t necessarily translate to another. We may get mastery – or we may get a monster.


If you’re interested in reading more about Weierstrass, Riemann and mathematical monsters, you might like my latest book Proof: The Uncertain Science of Certainty.


Cover image: Antoine Dautry

1

The person who wrote the exam later got a Fields Medal, so I guess I shouldn’t feel too bad that it was hard.




Meet Veronika, the tool-using cow


Far Side fans might recall a classic 1982 cartoon called "Cow Tools," featuring a cow standing next to a jumble of strange objects—the joke being that cows don't use tools. That's why a pet Swiss brown cow in Austria named Veronika has caused a bit of a sensation: she likes to pick up random sticks and use them to scratch herself. According to a new paper published in the journal Current Biology, this is a form of multipurpose tool use and suggests that the cognitive capabilities of cows have been underestimated by scientists.

As previously reported, tool use was once thought to be one of the defining features of humans, but examples of it were eventually observed in primates and other mammals. Dolphins can toss objects as a form of play which some scientists consider to be a type of tool use, particularly when it involves another member of the same species. Potential purposes include a means of communication, social bonding, or aggressiveness. (Octopuses have also been observed engaging in similar throwing behavior.)

But the biggest surprise came when birds were observed using tools in the wild. After all, birds are the only surviving dinosaurs, and mammals and dinosaurs hadn’t shared a common ancestor for hundreds of millions of years. In the wild, observed tool use has been limited to the corvids (crows and jays), which show a variety of other complex behaviors—they’ll remember your face and recognize the passing of their dead.





The toil of (blog) art


When writing a technical blog, the first 90% of every article is a lot easier than the final 10%. Sometimes, the challenge is collecting your own thoughts; I remember walking through the forest and talking to myself about the articles about Gödel’s beavers or infinity. Other times, the difficulty is the implementation of an idea. I sometimes spend days in the workshop or writing code to get, say, the throwaway image of a square-wave spectrogram at the end of a whimsical post.

That said, by far the most consistent challenge is art. Illustrations are important, easy to half-ass, and fiendishly difficult to get right. I’m fortunate enough that photography has been my lifelong hobby, so I have little difficulty capturing good photos of the physical items I want to talk about:

A macro photo of a photodiode sensor. By author.

Similarly, because I’ve been interested in CAD and CAM for nearly two decades, I know how to draw shapes in 3D and know enough about rendering tech to make the result look good:

An explanation of resin casting, by author.

Alas, both approaches have their limits. Photography just doesn’t work for conceptual diagrams; 3D could, but it’s slow and makes little sense for two-dimensional diagrams, such as circuit schematics or most function plots.

Over the past three years, this forced me to step outside my comfort zone and develop a new toolkit for simple, technical visualizations. If you’re a long-time subscriber, you might have seen the changing art style of the posts. What you probably don’t know is that I often revise older articles to try out new visualizations and hone my skills. So, let’s talk shop!

Circuit schematics

Electronic circuits are a common theme of my posts; the lifeblood of this trade is the circuit schematic. I’m old enough to remember the beautiful look of hand-drawn schematics in the era before the advent of electronic design automation (EDA) software:

An old circuit schematic.

Unfortunately, the industry no longer takes pride in this craft; the output from modern schematic capture tools, such as KiCad, is uniformly hideous:

An example of KiCad schematic capture.

I used this style for some of the electronics-related articles I published in the 2010s, but for this Substack, I wanted to do better. This meant ditching EDA for general-purpose drawing software. At first, I experimented with the same CAD software I use for 3D part design, Rhino3D:

Chicken coop controller in Rhino3D. By author.

This approach had several advantages. First, I was already familiar with the software. Second, CAD tools are tailored for technical drawings: it’s a breeze to precisely align shapes, parametrically transform and duplicate objects, and so forth. At the same time, while the schematics looked more readable, they were nothing to write home about.

In a quest for software that would allow me to give the schematics a more organic look, I eventually came across Excalidraw. Excalidraw is an exceedingly simple, web-based vector drawing tool. It’s limited and clunky, but with time, I’ve gotten good at working around many of its flaws:

A schematic of a microphone amplifier in Excalidraw, by author.

What I learned from these two tools is that consistency is key. There is a temptation to start every new diagram with a clean slate, but it’s almost always the wrong call. You need to develop a set of conventions you follow every time: scale, line thickness, font colors, a library of reusable design elements to copy-and-paste into new designs. This both makes the tool faster to use — rivaling any EDA package — and allows you to refine the style over time, discarding failed ideas and preserving the tricks that worked well.

This brings us to Affinity. Affinity is a “grown-up” image editing suite that supports bitmap and vector files; I’ve been using it for photo editing ever since Adobe moved to a predatory subscription model for Photoshop. It took me longer to figure out the vector features, in part because of the overwhelming feature set. This is where the lessons from Rhino3D and Excalidraw paid off: on the latest attempt, I knew not to get distracted and to focus on a simple, reusable workflow first.

My own library of electronic components in Affinity.

This allowed me to finally get in the groove and replicate the hand-drawn vibe I’ve been after. The new style hasn’t been featured in any recent articles yet, but I’ve gone ahead and updated some older posts. For example, the earlier microphone amplifier circuit now looks like this:

A decent microphone amplifier. By author.

Explanatory illustrations

Electronic schematics are about the simplest case of technical illustrations. They’re just a map of connections between standard symbols, laid out according to simple rules. There’s no need to make use of depth, color, or motion.

Many other technical drawings aren’t as easy; the challenge isn’t putting lines on paper, it’s figuring out the most effective way to convey the information in the first place. You need to decide which elements you want to draw attention to, and how to provide visual hints of the dynamics you’re trying to illustrate.

I confess that I wasn’t putting much thought into it early on. For example, here’s the original 2024 illustration for an article on photodiodes:

Photodiode structure.

It’s not unusable, but it’s also not good. It’s hard to read and doesn’t make a clear distinction between different materials (solid color) and an electrical region that forms at the junction (hatched overlay).

Here’s my more recent take:

A better version of the same.

Once again, the trick isn’t pulling off a single illustration like this; it’s building a standardized workflow that lets you crank out dozens of them. You need to converge on backgrounds, line styles, shading, typefaces, arrows, and so on. With this done, you can take an old and janky illustration, such as the following visual from an article on magnetism:

A simple model of a conductor.

…and then turn it into the following:

A prettier model of the same. By author.

As hinted earlier, in many 2D drawings, it’s a challenge to imply a specific three-dimensional order of objects or to suggest that some of them are in motion. Arrows and annotations don’t always cut it. After a fair amount of trial and error, I settled on subtle outlines, nonlinear shadows, and “afterimages”, as shown in this illustration of a simple rotary encoder:

Explaining a rotary encoder.

The next time you see a blog illustration that doesn’t look like 💩 and wasn’t cranked out by AI, remember that more time might have gone into making that single picture than into writing all of the surrounding text.



The Computational Web and the Old AI Switcharoo


For two grand I can buy a laptop with unfathomable levels of computational power. Stock, that laptop comes with silicon made of transistors so impossibly tiny, they operate under different laws of physics. For just a few hundred dollars more, I get a hard drive that stores more documents than I could write if I wrote 24 hours a day for the rest of my life. It fits in my pocket. If I'm worried about losing those documents, or if I want them synced across all my devices, I pay Apple for iCloud. Begrudgingly.

If I were on a budget, any two-hundred-dollar laptop and a twenty-dollar USB drive could handle the computational load required for the drivel I pour daily into plain text files (shout / out / to / .txt).

Yet, the average note-taking app charges ten bucks per month in perpetuity. It stores my writings in proprietary file formats that lock me into the app. In exchange, I get access to compute located 300 miles away, storage I don't need, and sync-and-share capabilities that I already pay for. Now, I can also expect a 20% hike on all my subscriptions for BETA-level AI solutions desperately searching for a problem to solve.

Welcome to the Computational Web.

I define the “Computational Web” as the era of increasingly gargantuan levels of computational power (compute) required to run the modern Internet, supplied by a small group of firms uniquely positioned to meet those demands.

The Computational Web is the commodification of computational power. The Computational Web marks the achievement of absolute control over the modern technology stack. The Computational Web signals a future where all of our personal computers devolve into mere cloud portals. These devices would be sleek, and thin, and inexpensive, and incapable of answering "how many 'R's are in the word strawberry” without an internet connection (and maybe not even then).

The cloud used to be the place we stored our files as backups, and kept our devices synchronized.

But increasingly, we are seeing the cloud take over everything our computers are capable of doing. Tasks once handled by our MacBooks' CPUs and GPUs are being sent to an edge server to finish. No product on the market facilitates this process better than cloud-based AI solutions.

Aside: (I think) Sam Altman fundamentally misunderstood the role his product plays in the Computational Web. Now he's scrambling, begging companies with infrastructure for more compute.

Artificial intelligence, manifested in Chatbots and agents, isn't the product. The product is the trillion-dollar data center kingdoms required to power those bots. ChatGPT might be OpenAI's Ford F150, but datacenters are Microsoft's gasoline. Without Microsoft's infrastructure, ChatGPT is a $500 billion paperweight. I don't know when Sam Altman realized AI is just a means to sell retail compute to the masses. Probably just before the ink dried on the pair's partnership agreement.

Compute is expensive and difficult to scale. AI is the most compute-hungry consumer technology in the history of the web. So, shoehorning AI features into our apps isn't just tech bros following their tail. It's setting the expectation that all consumer technology requires AI. If all technology requires AI, and only a handful of companies are equipped to handle the computational load that AI requires, then compute itself becomes a moat too deep for competition to enter, and consumers to flee from.

Compute is a scarce resource, turning the tech industry into a cloud-oligopoly (Google, Amazon, Microsoft, and Meta (GAMM)). Our devices—laptops, desktops, phones—have grown dependent on the cloud, not just for storage, but to complete the types of tasks that our devices are largely capable of handling.

Our level of dependency on cloud-computing has made, by and large, local-computing redundant. Something has got to go. Can you guess which it'll be?

An interesting side effect of late-stage capitalism is the gained ability to forecast business strategies. Some in the blogosphere loathe this type of navel-gazing, but I find it fun. Because all you must do to extrapolate big tech's strategies is realize that morality, ethics, and sometimes even the law, are not considerations when one develops a “corner the market“ business plan. It's Murphy's Law but for big tech. If it's possible and profitable; if it causes dependency and monopolies, then that's the plan. Competition is for losers, after all.

The ol’ switcharoo #

Each iteration of the web starts with a promise to the people, and ends with that promise broken, and more money in the pockets of the folks making the promises.

Through the early nineties, the “proto-web” promised us a non-commercial, interconnected system for cataloging the world’s academic knowledge. We spent millions in tax dollars building out the Internet’s infrastructure. The proto-web then ended with the Telecommunications Act of 1996— the wholesale redistribution of the people’s internet to the American Fortune 500.

Web 1.0, or the “Static Web,” promised a democratization of information, and the ability to order your Pizza Hut on the Net. The Static Web ended with the dot-com bubble.

Web 2.0, or the “Social Web,” promised to connect the world, even if that meant people dying. So we gave them our contact info and placed their JavaScript snippets into our websites. The result of Web 2.0, an era that ended around 2020, is the platform era: techno-oligarchs and fascists who control all of our communication infrastructure and use black-box algorithms to keep us on-platform for as long as possible.

We are halfway into Web 3.0. The Computational Web has tossed a lot of hefty promises into that Trojan Horse we call AI: ending world hunger, poverty, and global warming, just to name a few. But this is for all the marbles. Promises of utopia are not enough. They must scare the shit out of us, too, by implying that AI in the wrong hands can bring about a literal apocalypse.

So, how will Web 3.0 end?

On New Year's, I made a couple of silly tech predictions for 2026 (because it's fun guessing what our tech overlords will do to become a literal Prometheus).  The first prediction is innocuous enough. I think personal website URLs will become a status symbol on social media bios for mainstream content creators. Linktrees are out. Funky blogs with GIFs and neon typography are in. God, how I hope this one happens. Not even for nostalgia. In the hyper-scaled, for-profit web, personal websites are an act of defiance. It's subversive. It's punk.

My second 2026 prediction, something, for the record, I do not hope happens, is an attack on local computing. We'll see a mainstream politician and/or tech elite call for outlawing local computing. This is big tech's end goal—position AI (LLM, agentic, or whatever buzzword of the time) as critical infrastructure needed to run our software, leverage fear tactics into regulatory capture, then, the long game is to work towards a cloud-tethered world where local compute is a thing of the past. Thin clients with a hefty egress invoice each month. Google, Amazon, Microsoft, and Meta (GAMM) will become the Comcast of computational power. (Not all of this is happening in 2026. I kind of went off the rails a bit.)

To get that ball rolling, all big tech needs is a picture of a brown man next to a group of daisy-chained Mac Minis, and the headline AI-assisted Terrorist Attack.


State High Points to Skip (Hike These Instead)

From: HikingGuy.com
Duration: 22:36
Views: 3,226

State High Points you can actually hike. Plus safer alternatives when the real high point is technical (Rainier, Denali, and more).
*Gear I'm Using Now:*
* inReach: https://hkgy.co/inreach
* Hiking App: https://hkgy.co/app
* Watch: https://hkgy.co/watch
* Shoes: https://hkgy.co/shoes
* Pack: https://hkgy.co/backpack
* Rescue Insurance: https://hkgy.co/insurance

*Full Hiking Gear List (What I Use Now - Tested & Not Sponsored):*
https://hikingguy.com/gear

*Links:*
* Guide List: https://hikingguy.com/guides/hikeable-state-high-points/
* Website: https://hikingguy.com
* Patreon: https://www.patreon.com/c/hikingguy
* Monthly Hiking News (free): https://www.patreon.com/c/hikingguy/
* Subscribe: https://www.youtube.com/c/Hikingguy

“Hikeable” = a normal hiking route without ropes, glacier travel, or technical climbing. When a state high point is technical or dangerous, I list the best hikeable alternative.

00:00 The High Point Problem
00:44 Alaska: Denali vs Flattop Mountain
01:42 California: Mt Whitney Hike
02:27 Colorado: Mt Elbert Hike
03:00 Washington: Rainier vs Mt St Helens
03:44 Wyoming: Gannett Peak vs Medicine Bow Peak
04:30 Hawaii: Mauna Kea Hike
05:12 Utah: Kings Peak Hike
05:40 New Mexico: Wheeler Peak
05:56 Nevada: Boundary Peak vs Wheeler Peak
06:49 Montana: Granite Peak vs Trapper Peak
07:18 Idaho: Borah Peak vs Scotchman Peak
07:44 Arizona: Humphreys Peak
08:19 Oregon: Mt Hood vs South Sister
09:10 Texas: Guadalupe Peak
09:40 South Dakota: Black Elk Peak
10:06 North Carolina: Mt Mitchell
10:40 Tennessee: Kuwohi (Clingmans Dome)
11:07 New Hampshire: Mt Washington
11:38 Virginia: Mt Rogers
12:04 Nebraska: Panorama Point
12:25 New York: Mt Marcy
12:38 Maine: Katahdin
13:00 Oklahoma: Black Mesa
13:14 West Virginia: Spruce Knob
13:34 Georgia: Brasstown Bald
13:59 Vermont: Mt Mansfield
14:22 Kentucky: Black Mountain
14:48 Kansas: Mt Sunflower
15:00 South Carolina: Sassafras Mountain
15:30 North Dakota: White Butte
15:58 Massachusetts: Mt Greylock
16:24 Maryland: Hoye-Crest
16:48 Pennsylvania: Mt Davis
17:07 Arkansas: Mount Magazine
17:30 Alabama: Cheaha Mountain
17:48 Connecticut: Mt Frissell
18:00 Minnesota: Eagle Mountain
18:25 Michigan: Mt Arvon
18:49 Wisconsin: Timms Hill
19:04 New Jersey: High Point
19:31 Missouri: Taum Sauk Mountain
19:50 Iowa: Hawkeye Point
20:13 Ohio: Campbell Hill
20:31 Indiana: Hoosier Hill
20:41 Illinois: Charles Mound
21:00 Rhode Island: Jerimoth Hill
21:19 Mississippi: Woodall Mountain
21:29 Louisiana: Driskill Mountain
21:46 Delaware: Ebright Azimuth
22:04 Florida: Britton Hill

---

*Disclaimer:*
Some of these links are affiliate links where I’ll earn a small commission if you make a purchase at no additional cost to you. I only recommend gear I actually use and trust.


3:10 to Yuma


It is harder to automate the harvesting of lettuce than it was to send a human to the moon, to get any product in the world shipped to you within a few hours, to develop vaccines for mumps, chicken pox, and more, or to put a connected computer in almost every person’s pocket.

I had the privilege to visit multiple harvesting operations during my visit to Yuma, Arizona, last week. I also had the chance to connect with other players in the ecosystem who have worked on various aspects of the problem.

These included forward-thinking Farm Labor Contracting (FLC) companies, local manufacturers, AgTech adoption services companies, research institutions, and, of course, many growers, shippers, and packers. Just as with any population, I encountered people with extremely strong opinions about how things should be done!

Harvesting lettuce (and many other specialty crops) is still very much a majority human-driven endeavor.

A brief history of automation and mechanization attempts

According to a 1973 research paper, serious attempts to mechanize lettuce harvesting at public research universities began in 1962. We have been trying to fully automate lettuce harvesting for almost 64 years, but we have not succeeded yet.

And it is not for lack of trying.

There are multiple startups and other custom shops working to automate this process.

According to the latest crop robotics report from Better Food Ventures, there are more than 450 robotics companies, with many focused on harvesting operations.

The Bracero Program

US growers, shippers, and packers prior to the 1960s benefited from the Bracero program.

This was a government-to-government agreement. The US government acted as the labor contractor in the early years, transporting workers to distribution centers. It was massive in scale and focused almost exclusively on Mexican nationals.

Growers paid low wages, and the program itself was accused of suppressing domestic wages. There was little financial incentive to replace Braceros with machines, since labor was so affordable.

The Bracero program was terminated in 1964. This created a huge challenge for growers, unless they had access to undocumented workers.

Right after the program was terminated, UC Davis began commercializing a tomato harvester for processed tomatoes, which was quickly adopted. Some acreage for labor-intensive crops, such as asparagus and strawberries, moved south to Mexico.

The Blackwelder Tomato Harvester, developed by the University of California, Davis, is supposed to have saved the processing tomato industry in California in the 1960s (Image Source: UC Davis)

The Bracero Program ended in 1964, and many farmers expected to continue to hire Mexican workers under the H-2 program. However, the DOL issued regulations on December 19, 1964, that required employers of H-2 workers to offer and pay the AEWR to any U.S. workers they employed (Martin, 2007, pp. 15–16). Unlike bracero employers, H-2 employers were also required to offer and provide U.S. workers with the same housing and transportation guarantees that were included in bracero contracts. At a time when U.S. farm workers were not covered by minimum wage laws, most U.S. farmers adjusted to the end of the Bracero Program by mechanizing or changing crops rather than continuing to rely on migrant workers (Clemens, Lewis, and Postel, 2018).

The post-Bracero period saw many attempts to mechanize lettuce harvesting. But these early machines were heavy, slow, and often crushed the lettuce. The technology of the time (analog sensors and hydraulics) lacked the finesse of a human hand.

With the rise in undocumented workers, the pressure to automate harvesting decreased a bit.


From the 1980s to the 2010s, the industry paused full automation and switched to “harvest aid” machines.

The harvest aid machine typically consists of a slow-moving tractor that pulls a long conveyor belt wing. A crew of about 20 people walks behind the conveyor belt. They continue to cut, trim, and place lettuce heads on a belt.

Harvesting can start in the middle of the night and can go till early in the morning (Photo by Rhishi Pethe)

On the right-hand side of the picture, you can see about 20-25 people selecting, cutting, and cleaning lettuce heads, then placing them on a conveyor belt that moves from left to right.

There are about 3-6 people at the end of the conveyor belt who are responsible for quality-checking the lettuce. The QC people are called “ojeros” as they keep an eye on the quality of produce coming through.

If the field has a high-quality product, the harvesting crew will have more cutters and fewer cleaners. If the product quality in the field is not top-notch, the harvesting crew will have more QC personnel and fewer cutters.

For shed pack products (as seen in the picture above), which will be used for processing, the lettuce heads are kept clean with a spray of water and then end up in large bins (left of the picture), which are immediately moved to a cooling facility for further processing into salads, etc.

The two or three people standing by the bins make sure the bins are filled uniformly. This part of the harvesting crew is called “reinas” (queens, as they stand and work higher up on the machine).

For field packs, the lettuce heads are immediately wrapped in plastic and placed in boxes for shipping to a cooler.

Harvest aid machines reduce physical strain on workers by reducing walking and bending (depending on the product) and increasing harvesting speed.

The most critical process in harvesting lettuce requires the ability to spot a mature head, cut it at the right angle and at the right spot, and trim the bad leaves. The robotics of the 1990s through the 2010s couldn’t replicate this cost-effectively, whereas a trained human can do it quite effectively.

(This should quell the notion that farming is an unskilled job)

As a result, human harvesting still accounts for more than 50% of the total production cost for specialty crops like lettuce and berries.

The University of California, Davis, publishes detailed cost breakdowns by different crop types in different parts of California. The table below shows a screenshot of the total operating costs per acre for field-packed iceberg lettuce.

I have highlighted the harvest/field pack costs of $7,200 per acre, out of the total operating costs/acre of $14,584. This data is for 2023 for iceberg lettuce in the Central Coast region of California. Labor costs will be slightly lower in Yuma, Arizona, but the percentages will remain the same.
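As a quick back-of-the-envelope check on those proportions (a trivial sketch using only the two figures quoted above):

```python
# UC Davis 2023 cost-study figures for field-packed iceberg lettuce,
# Central Coast region (quoted in the text above), in dollars per acre.
harvest_and_field_pack = 7_200
total_operating_costs = 14_584

# Harvest/field pack share of total operating costs -- roughly half.
harvest_share = harvest_and_field_pack / total_operating_costs
print(f"Harvest/field pack share of operating costs: {harvest_share:.1%}")
```

That works out to roughly 49% of operating costs going to harvest and field pack alone in this example.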

The H-2A Program

The H-2A program began in 1986, though the first visas were not issued until 1992. The H-2A program's goals and structure were in sharp contrast to those of the Bracero program.

After the Bracero program was terminated for suppressing wages, the Department of Labor (DOL) implemented strict rules for H-2 visas. They created the Adverse Effect Wage Rate (AEWR).

AEWR is a minimum wage floor designed to prevent foreign workers from being paid less than US workers.

For example, today in Arizona, the hourly legal rate for an H-2A worker is around $18, and they are required to have housing and other amenities, which bumps the hourly rate to $25 to $30 per hour.

For its first 20-25 years, the H-2A program was hardly used, as it was cheaper and easier for farmers to hire undocumented workers; only 30,000 to 40,000 visas were issued nationwide each year.

Around 2010, due to stricter border enforcement and demographic changes in Mexico, the cheap, undocumented labor supply started to dry up. It forced farmers into the H-2A program as the “labor of last resort.”

Image source: Choice Magazine

As H-2A program costs have risen, it has once again created a financial incentive for growers to automate as much of the harvesting process as possible to reduce costs and stay competitive with other parts of the world, such as Mexico and South America. My friend and fellow AgTech Alchemist, Walt Duflock, has written extensively about these challenges, and you should follow him on LinkedIn.


As you can see from the images below, harvesting lettuce requires speed, keen observation, precision, and stamina. Harvesting lettuce across Salinas, California, and Yuma, Arizona, is a 6-day-a-week, 52-weeks-a-year, around-the-clock operation.

Humans are very dexterous and versatile in their skill sets. Picking romaine or iceberg lettuce is not straightforward.

For example, lettuce picked for shed packing for value-added products like salads has to be cut at the right angle and at the right place on the stem; a few leaves have to be removed, and the core is stabbed with a coring knife.

This lettuce is then put into large bins, which are sent to the cooler for further processing.

For a product like romaine heart lettuce, the human has to discern which lettuce to pick and which to leave behind, cut it at the right spot, and then remove the right amount of leaves to leave a beautiful-looking heart, which is then immediately packaged.

This is a bit of a consumer expectation problem: consumers expect beautiful-looking produce that stays fresh for at least a reasonable amount of time in their refrigerator before it is used.

Romaine Heart (Photo by Rhishi Pethe)

As a result, a large amount of potentially usable product is left behind (by some estimates, 40-50%). This is plowed back into the ground before the next crop goes in, providing some nutrients to the soil.

The bottom part of the field in the image above has already been harvested by the machine. The top half is yet to be harvested. (Photo by Rhishi Pethe)

As you can see from some of these examples, it is challenging for a machine to completely replace a human, as it must match the speed, quality, and cost of a human crew. The harvesting crew has to perform three main operations for lettuce: cutting, cleaning, and packing.


Harvesting process capabilities and requirements

There are 5 different buckets of requirements that an automated machine must meet to significantly reduce, or nearly eliminate, the human labor used.

These required capabilities will illustrate why, even though the ecosystem has been working for more than 64 years, we still have not fully automated lettuce harvesting.

That said, with powerful vision systems, sophisticated AI algorithms, edge processing, advances in materials engineering, and rising H-2A costs, it may now be feasible to address this problem.

Performance and Throughput

The system must match or exceed the efficiency of manual labor crews to justify the capital expenditure. Current crews move through fields quickly, easily processing thousands of heads per hour.
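To make "thousands of heads per hour" concrete, here is a hypothetical cycle-time target; the crew throughput figure below is an assumption for illustration, not a number from this article:

```python
# Assumed whole-crew throughput; the article says only that crews
# process "thousands of heads per hour".
crew_heads_per_hour = 6_000
seconds_per_hour = 3_600

# Cycle time a single machine would need to match the whole crew.
machine_cycle_s = seconds_per_hour / crew_heads_per_hour
print(f"Required cycle time: {machine_cycle_s:.1f} s per head")  # 0.6 s
```

Even at this rough level, sub-second cutting, trimming, and packing per head shows why matching a crew is such a hard robotics problem.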

The crew has to maintain the quality of the cut, the cleaning, and the packing while maintaining a high throughput.

The image above shows the precise cut the human operator must make. If the operator cuts above or below the ideal cut point, it results in wasted product. (Photo by Rhishi Pethe)

As I said earlier, lettuce harvesting happens 6 days a week, all year round. The system must be capable of near-continuous operation during the harvest window of April–November in regions like Salinas, CA, and November through March/April in Yuma, Arizona. The changeover time between fields has to be kept to a minimum.

Economic and Commercial Viability

The machine must reduce the human workforce requirement over the long term, and it should offer a reasonable ROI timeline to drive adoption.

Obviously, any new machine will not meet these requirements on Day 1, and so there should be a plan to go down the cost curve by learning with enough reps in the field.
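One way to think about that ROI timeline is a simple payback-period calculation. Every number below is a hypothetical assumption for illustration (the loaded rate roughly matches the H-2A figures discussed earlier); real machine prices, crew sizes, and hours vary widely:

```python
# All figures are illustrative assumptions, not data from the article.
machine_capex = 1_500_000   # assumed purchase price of the harvester, USD
workers_displaced = 10      # assumed reduction in crew size
loaded_hourly_cost = 27.00  # assumed loaded H-2A labor cost, USD/hour
hours_per_year = 2_500      # assumed 6-day-week, near-year-round schedule

annual_labor_savings = workers_displaced * loaded_hourly_cost * hours_per_year
payback_years = machine_capex / annual_labor_savings
print(f"Annual labor savings: ${annual_labor_savings:,.0f}")  # $675,000
print(f"Simple payback: {payback_years:.1f} years")           # 2.2 years
```

Under these assumptions the machine pays for itself in a couple of seasons, which is why rising labor costs shift the economics toward automation; early versions with higher costs or lower reliability stretch that timeline considerably.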

Food Safety and Hygienic Design

The machine must adhere to the Produce Safety Rule (PSR) under the Food Safety Modernization Act (FSMA). The Produce Safety Rule has specific requirements around agricultural water, biological soil amendments, sprouts, domesticated and wild animals, worker training and health and hygiene, and equipment, tools, and buildings.

Food safety is absolutely critical. All the operations I saw took food safety very seriously. We always had to wear gloves, hair nets, and beard nets (whether you had a beard or not)!

Machines are washed, and surfaces are cleaned regularly. Everyone has to constantly wash their hands to maintain a hygienic environment.

One of the growers had a pet peeve about birds! Birds can be a nuisance as they can damage the crop. If birds do their “business” while flying over the field, it makes harvesting the product challenging from a food-safety standpoint.

I heard stories about how hundreds of acres of crops had to be abandoned due to excessive bird excrement! This particular grower had hired an 8-person crew whose sole responsibility was to scare the birds away. Innovative attempts to scare birds away include drones chasing them, specific-frequency sounds, etc.

Product Quality and Grading

The cut must be made exactly at the “collar” (where the leaves meet the stem) to prevent a reduction in shelf life. The automated system must identify and reject heads with visible defects. Mechanical damage should be kept to a minimum and meet USDA standards for different grades of produce.

Operation and Environment

Meeting operational and environmental requirements is one of the most challenging aspects for these machines. The harvesting environment and the time period are characterized by mud, dust, and high heat.

Furrows might be wet and muddy. Machines have to turn around at field edges, which can vary in condition. A machine should be able to enter a field easily and be simple to transport from one field to the next.

Growers are very particular about how they grow their crop. Some are extremely attached to their bed width, and it would be very hard to get them to change it, even if a different width would make things easier for a harvesting machine.

Due to this, a harvesting machine should be flexible and able to adjust to different bed widths (ranging from the low 30s to the mid 80s, in inches).

Ecosystem Support

For any product, especially one from a startup, it is impossible to get all of these capabilities right in the first version. The product development process has to start with a particular capability and then build experience in the field through a real growing operation.

Given the number of startups working on robotics, it is often challenging for them to reach the right type of growers for product feedback, and then to have the infrastructure to test and iterate on their product.

This is where the ecosystem comes into play. I had the privilege to connect with a few public and private players in the space in Yuma.

For example, YCEDA (Yuma Center of Excellence for Desert Agriculture), under the University of Arizona umbrella, is "dedicated to bringing scientific research and industry together to find solutions that bring value to stakeholders by addressing 'on-the-ground' needs of Desert Ag production."

YCEDA members can provide technical and research assistance, field acres for testing, and ecosystem support to startups and other players working on bringing innovation to market.

An organization like The Reservoir provides support, office space, field testing infrastructure, and grower connections.

Reservoir helps founders build, test, and scale technologies that farmers can put to work—faster, fitter, and with lasting impact.

The Western Growers Association serves as a voice for specialty crop growers and provides important connections, expertise, and perspectives on technology adoption, policy, and industry needs.

A private organization like Axis Ag provides “Technology services for the agricultural community.”

Axis Ag empowers farmers with innovative agricultural solutions that drive sustainable growth, optimize productivity, and foster environmental stewardship. We are dedicated to advancing technology and partnerships that enhance the future of farming, building resilient food systems, and supporting the communities we serve.

Axis Ag has an interesting model. They have years of experience in specialty crop agriculture, strong grower connections, and the expertise to field-test different AgTech solutions, provide feedback to startups, and find the right set of early adopters to test and scale them.

They have partnered with multiple AgTech companies to help them bring their products to market and to scale.

The services and support provided by YCEDA, The Western Growers Association, The Reservoir, Axis Ag, and many others are critical to bringing these very much needed AgTech innovations to market and scaling them.

Everyday conversations within AgTech can feel depressing. The farm economy is definitely in trouble, but my trip to Yuma, Arizona, showed me that the AgTech ecosystem is working hard, collaborating, and focusing on solving the right type of problems with rigor.

It might take longer than sending a human to the moon, but many of the existing problems will get solved in the near future!

Next week’s edition

I will write about the amazing irrigation infrastructure in Yuma, Arizona; how Yuma became the winter salad bowl of the US; the challenges the region faces; and the role AgTech can play.

Happenings

I will be leading a two-and-a-half-hour session during the “AI in Agriculture Forum” at World Agritech San Francisco on March 16, 2026. The focus of the session will be on AI benchmarking. I hope to see many of you in San Francisco in March.

I will be hosting a session titled Robotics & Autonomy: The New Agricultural Workforce on March 17th, 2026, at World Agritech, featuring speakers from CNHi, Kubota, Tavant, and AgZen.

AgTech Alchemy will host an event in Monterey, California, on February 17th, in collaboration with AgSafe. The AgTech Alchemy team will share more details shortly.
