Stats from a dying web

This is a column about AI. My boyfriend works at Anthropic. See my full ethics disclosure here.

I.

For more than a year now, I’ve had one eye trained on how generative AI will reshape the web. My primary fear has been that large language models like those found in ChatGPT are now good enough that large numbers of people are beginning to abandon traditional search engines, starving publishers and websites of the traffic and money they need to continue operating. While some publishers have made lucrative deals with AI labs, on balance the number of jobs in journalism is shrinking. And I’m not the only person worried: last month, Pew Research reported that about half of Americans believe that AI will be bad for journalism.

Until now, the effects of AI on the web have mostly been theoretical. But this week, an Apple executive testifying in the remedy phase of Google’s antitrust trial offered a piece of information that spooked observers of both companies.

Here are Emma Roth and Lauren Feiner at The Verge:

Google searches fell in Safari for the first time ever last month, Apple’s senior vice president of services, Eddy Cue, said during Google’s antitrust trial on Wednesday. “That has never happened in 22 years,” Cue added.

Cue linked the dip in searches to the growing use of AI, which Apple is now considering integrating into Safari. The rise of web search in AI tools like ChatGPT, Perplexity, Gemini, and Microsoft Copilot may make users less inclined to visit Google as their primary way of finding information.  

The fallout from Cue’s revelation was swift: Google’s stock price declined 7.5 percent, erasing $150 billion or so of market value, before recovering 1.93 percent today. In response, Google hustled up a statement with the urgency of a flight attendant telling passengers not to panic during a particularly jarring moment of turbulence. The company said:

We continue to see overall query growth in Search. That includes an increase in total queries coming from Apple’s devices and platforms. More generally, as we enhance Search with new features, people are seeing that Google Search is more useful for more of their queries — and they’re accessing it for new things and in new ways, whether from browsers or the Google app, using their voice or Google Lens.

The first thing this statement does, quietly, is confirm that Cue’s testimony is accurate. (Something you can’t always take for granted with Apple executives these days.) The second thing it attempts to do is to reassure shareholders that it doesn’t matter. People might be making fewer searches in Safari, but they’re making more searches elsewhere. Google presents this as a natural evolution of search. You couldn’t always search by uploading a photo on your phone, for example; now you can using Lens, which is a feature of the Google app on iOS. If you love to search by taking photos, you might use the Google app more, and Safari a lot less.

I believe Google on this point. Rand Fishkin, an expert in search engine optimization and founder of the audience research company SparkToro, published a report last month that sought to put the rise of AI search in context against Google’s continued dominance. It found that Google searches rose 21.64 percent from 2023 to 2024. As Fishkin notes, this seems to be consistent with what Google CEO Sundar Pichai said when the company rolled out its AI overviews last year: that people who see them tend to do more searches.

Of course, the health of the web is not determined by the number of Google searches alone. Equally important is where people get their answers — and increasingly, they are getting their answers on Google. Multiple analyses have now found that Google’s AI overviews have resulted in declines of 70 to 80 percent in the click-through rates to the web pages from which they derive their information. That’s 70 to 80 percent fewer visits to web pages, and one of the primary web page-producing industries is shrinking accordingly: CNN, Vox Media, HuffPost, and NBC are among the publishers that have announced layoffs in 2025 so far.

If only to reassure its shareholders, Google would prefer to leave the story there: its search engine still legitimately ascendant, and the threat of disruption largely neutralized. (Disruption from challengers, at least. There’s still the risk of state-mandated disruption: the US government is currently seeking to force the company to spin out its Chrome browser and divest itself of core pieces of its advertising monopoly.)

But investors’ panic earlier this week was not entirely unwarranted. There are signs of an actual shift happening in search, and it is not to Google’s benefit. 

It is not to the benefit of the web, either. 

II.

One of the week’s best-read feature stories was written by James D. Walsh at New York Magazine. Its headline: “Everyone is cheating their way through college.”

It opens on Roy Lee, the Columbia University student who made headlines earlier this year by releasing software designed to help engineers cheat their way through technical interviews at tech companies. (He subsequently raised $5.3 million to build an app that helps you “cheat on everything”; we talked to him on Hard Fork, too.)

To Walsh, Lee is only the most visible example of an increasingly undeniable trend. “In January 2023, just two months after OpenAI launched ChatGPT, a survey of 1,000 college students found that nearly 90 percent of them had used the chatbot to help with homework assignments,” Walsh writes. 

It has only accelerated since then; one study found that AI usage among college students increased from 66 percent last year to 92 percent this year. Professors who can no longer reliably tell AI-generated assignments from student-written ones speak to Walsh of an existential despair.

In Silicon Valley, this is what is known as product-market fit.

And this market, unlike search, is competitive: as Walsh notes, students don’t only use ChatGPT: they use Google’s Gemini, Anthropic's Claude, and Microsoft’s Copilot, among others. 

For the moment, Google’s AI overviews seem to have quelled the possibility of a sudden mass defection away from its core search engine. But it’s now clear that for the first time in decades, a generation is growing up with the possibility of using something other than Google as its default search. Today, they’re using ChatGPT to do all their homework assignments. By the time they graduate, they may be using it to do almost everything else. 

Google knows this, which is why it is seeking to pre-install Gemini on smartphones wherever it can. The company is having some success; Gemini is the second-most used chatbot after ChatGPT, according to Fishkin’s analysis, and Google overall handles 373 times as many searches as OpenAI’s bot.

But while there was once a time when the company could hand Apple $20 billion for default search placement and spend the rest of the year relaxing, Cue’s testimony shows why that is no longer the case. Bit by bit, current and future generations are shifting their habits away from traditional search and toward chatbots. Google can spend its many remaining billions of dollars on implementing Plan B. But it remains unclear what anyone else is supposed to do.


Elsewhere in the case: Mark Gurman argues persuasively that Cue's primary objective was to convince the judge that the search market is already so competitive that he should allow Google to continue paying Apple $20 billion a year. And M.G. Siegler has some more theories on why search declined on Safari.


On the podcast this week: Kevin and I discuss how freedom came to the App Store, and Apple's disastrous loss in court last week. Then author Karen Hao stops by to discuss her new book Empire of AI and the downsides of pursuing massive scale. And finally, I introduce Kevin to the world of Italian brainrot.

Apple | Spotify | Stitcher | Amazon | Google | YouTube



Industry

I talked with Grindr CEO George Arison for Fast Company. "We’re partly in the dating business, but we’re actually a social network," he told me. "So we don’t see dating fatigue here. What I do see is we need to do a much better job of making it easier for people who want to date to date. If there is one thing that people try other products for, it’s dating—and then they come back to Grindr." Click to learn more about the company's surprising new strategy, which includes declaring war on erectile dysfunction.


Those good posts

For more good posts every day, follow Casey’s Instagram stories.


Talk to us

Send us tips, comments, questions, and Safari searches: casey@platformer.news. Read our ethics policy here.


What Even Is Vibe Coding?


When I first heard the phrase vibe coding, I rolled my eyes a little. It sounded like another fleeting buzzword, one of those things that shows up in a thread, gets memed into the ground, and disappears before it ever really lands. As someone who respects the craft of building software, the idea of just vibing through development felt like it was missing the point entirely.

And honestly, if you had asked me, even just four weeks ago, whether I’d be writing a blog post about vibe coding, I probably would’ve LOL'ed.

But then I saw where it came from.

The term popped up in early 2025 from Andrej Karpathy, who described it like this:

Fully giving in to the vibes. See things, say things, run things, copy paste things. No reading the diffs. Just vibes.

At first glance, it reads like satire. But underneath the humor is something real: a different way of working. Vibe coding, at its core, is about using natural language to tell an AI coding assistant what you want, and then watching it try to build it for you. You’re not manually managing logic or carefully threading state through components; you’re describing intent and letting the model do the busywork.

The first time I saw it in action, I thought, “There’s no way this scales.” But the more I tried it myself and the more I watched others experiment, the harder it became to write it off.

The Room’s a Bit Divided

When I brought up vibe coding on Threads, people had feelings. Some immediately dismissed it as sloppy or unserious. Others were curious but skeptical. It felt like one of those moments where the tech community splits into “this is the future” and “this is a joke,” and neither side is really wrong.

Over on Bluesky, the tone was a little more exploratory. Devs were sharing their wins and fails, people building quick prototypes with nothing more than a vision, or talking about how easy it was to fall into the trap of blindly shipping untested code. The consensus seemed to be: it’s fast, it’s fun, but you still have to think. Vibe coding might be a useful tool in the early stages of a project, but it’s not quite ready to carry the full weight of production.

And that’s fair. The whole conversation hints at something deeper: how do we evolve our workflows without letting quality slip through the cracks?

So What Is Vibe Coding Now?

Originally, vibe coding meant letting go entirely: describe what you want, hit run, and see what happens. You don’t read the code. You don’t edit it. You trust the output, or at least pretend to. That was the bit that made a lot of us nervous.

Simon Willison pushed back on the way the term started evolving. He said:

Vibe coding does not mean ‘using AI tools to help write code.’ It means ‘generating code with AI without caring about the code that is produced.’

And he’s not wrong. But language shifts. And lately, “vibe coding” has been getting used as shorthand for any AI-assisted development, even when the dev is still reviewing the output, writing tests, and guiding the structure. In other words: we started using “vibe coding” to describe something much more responsible than the name implies.

That kind of semantic drift isn’t new. We’ve seen it happen with words like “cloud” and “serverless.” And while it can be frustrating, it also reflects the reality that people are experimenting. They’re using the tools in ways that make sense to them, even if the vocabulary gets a little messy along the way.

A Little Skepticism Never Hurt

I’m not new to side-eyeing shiny new tech. I remember wondering if we really needed iPads when they first came out. Why would I want a giant iPhone that doesn’t make calls? And yet, here we are, many of us using them daily (myself included) for things we didn’t even know we’d care about.

Same story with the cloud. Not the infrastructure itself, but the early marketing around it. It felt like buzzwords stacked on buzzwords. Elastic, scalable, serverless magic. But now? Try building anything serious without it. It’s the backbone of how we work.

I’ve been around long enough to know that new tools often sound silly before they sound useful. And honestly, that’s a good thing. A little skepticism keeps us grounded. It forces us to ask the uncomfortable questions early: What does this break? What does it replace? What does it make possible and for whom?

But I’ve also learned that some things need time to show their value. Especially when the first pitch is… let’s say, aspirational. AI coding assistants are kind of in that stage now. Some days, they feel like a party trick. Other days, they feel like the start of something real. I’m not convinced the dust has settled yet, but I’m paying attention. I think we all should be.

Vibe Coding Without Realizing It

Lately, I’ve been spending more time with GitHub Copilot’s Agent mode, and it turns out I’ve been vibe coding without even realizing it.

As someone who builds creative side projects, quick experiments, and weird little tools that don’t need enterprise-grade rigor, I sometimes just open an empty repo, sketch out a vision in a README, and hand it off to GitHub Copilot in agent mode to see what happens. I describe the structure, the feel (yes, the vibe), and what I want the user to be able to do. Then I let the Agent take the first pass.

It scaffolds layouts, creates routes, fills in placeholder content, basically roughs out the shape of the thing I described. I still review it, refactor it, test it, and shape it into something I’d actually ship. But that first pass? It saves me hours. And more importantly, it frees up mental space so I can focus on the interesting parts.

A few weeks ago, I was doing exactly that, just messing around with a fun idea, seeing how far GitHub Copilot could take it, and afterward, it hit me: I had basically vibe coded the whole thing. I wasn’t ignoring the code. I still owned the final product. But I was definitely working from instinct, exploring an idea through natural language and iteration instead of planning every detail from the start.

Sometimes vibe coding doesn’t look like a radical new workflow. Sometimes it just looks like creative play and that’s not a bad thing.

A Word About Ethics (and Who This Helps… and Who It Doesn’t)

AI generated code isn’t magic. It can be fast and impressive, but it still needs review. It can introduce bugs, security vulnerabilities, or quietly license code in ways that create downstream problems. Just like any other tool, it’s only as safe or responsible as the person using it.

Still, I’ve been thinking a lot about who this helps. Vibe coding lowers barriers. Whether you’re someone who struggles with focus, or just dipping in on a weekend to try something fun, it lets more people actually start. It helps people get unstuck when they can’t remember the syntax or the exact method name. That freedom to just start, that’s not nothing. For some folks, it’s what makes creating possible.

But I’ve also had my own quiet concerns about what this means for early-career developers. So much of how I learned came from chasing bugs in broken tutorials and seeing how all the pieces connected, or didn’t. There was value in that. And maybe I’ve been a little protective of it.

A mentor challenged that. He pointed out that debugging AI generated code is a lot like onboarding into a legacy codebase, making sense of decisions you didn’t make, finding where things break, and learning to trust (or rewrite) what’s already there. That’s the kind of work a lot of developers end up doing anyway.

So maybe this shift isn’t as dramatic as it feels. But it does mean we need to be intentional. As AI takes on more of the tedious setup and glue work, we need to rethink what we hand to junior devs. The path in might be different, but it still needs to exist. And it needs to be supported.

Of course, lowering barriers doesn’t just help the good actors. It also opens the door for people to build fast, messy, and potentially harmful systems without having the experience or ethics to understand what they’re unleashing. That kind of scale, in the wrong hands, is more than a nuisance. It’s a real risk.

And then there’s the labor side of this. As companies start using AI to justify shrinking engineering teams, flattening pay, or skipping mentorship entirely, that’s not the future of work. That’s just cutting corners. We’ve seen this story before, and the only thing new is the tool.

So yes, vibe coding has a lot of promise. But we also need to stay vigilant. To protect the parts of the craft that matter.

Where the Industry’s Headed

Whether you’re ready or not, AI assisted development is already here and the industry isn’t slowing down. Tools like GitHub Copilot are getting better at generating code in context. Product teams are building faster, shipping more, and rethinking what “developer productivity” even means.

There’s momentum, no doubt. But with that momentum comes pressure. Pressure to automate more. Pressure to reduce costs. Pressure to deliver without stopping to ask, is this still good? Or are we just moving fast because we can?

And while it’s tempting to treat these AI coding assistants as an automatic upgrade to the dev stack, we haven’t really finished asking what’s downstream of that shift. Are we solving problems, or just doing more work faster? Are we giving teams more space to think or less?

I think vibe coding sits right in the middle of that tension. It’s more than just a cheeky phrase. It’s a reflection of the moment we’re in, a stand-in for the bigger questions we’re grappling with. How do we build responsibly when the scaffolding is done for us? What does creativity look like when the keyboard isn’t always in our hands? Who gets to build, and who benefits when they do?

Some days it feels like we’re evolving the craft. Other days it feels like we’re just accelerating it. But either way, it’s worth paying attention to the values we carry with us while we move forward.

Final Thoughts

So… what is vibe coding?

It’s still being figured out. Right now, it’s part meme, part mindset, and part reflection of how AI is changing the way we work. Some people use the term lightly. Others use it to describe a very real shift in how they interact with code.

I’m not here to tell you whether to love it or hate it. But I do think it’s worth exploring.

Try it. Prompt your way through something you’d normally scaffold by hand. See what the model comes up with. And then apply your judgment, your experience, to shape it into something real.

Because this isn’t about replacing the craft. It’s about redefining what the craft can include.

And if all you take from this is that you’re allowed to start with vibes and follow up with rigor, that’s a pretty good place to start.




Reservoir Sampling


Reservoir sampling is a technique for selecting a fair random sample when you don't know the size of the set you're sampling from. By the end of this essay you will know:

  • When you would need reservoir sampling.
  • The mathematics behind how it works, using only basic operations: subtraction, multiplication, and division. No math notation, I promise.
  • A simple way to implement reservoir sampling if you want to use it.
Before you scroll! This post has been sponsored by the wonderful folks at ittybit, and their API for working with videos, images, and audio. If you need to store, encode, or get intelligence from the media files in your app, check them out!

# Sampling when you know the size

In front of you are 10 playing cards and I ask you to pick 3 at random. How do you do it?

The first technique that might come to mind from your childhood is to mix them all up in the middle. Then you can straighten them out and pick the first 3. You can see this happen below by clicking "Shuffle."

Every time you click "Shuffle," the chart below tracks what the first 3 cards were.

At first you'll notice some cards are selected more than others, but if you keep going it will even out. All cards have an equal chance of being selected. This makes it "fair."

Click "Shuffle 100 times" until the chart evens out. You can reset the chart if you'd like to start over.

This method works fine with 10 cards, but what if you had 1 million cards? Mixing those up won't be easy. Instead, we could use a random number generator to pick 3 indices. These would be our 3 chosen cards.

We no longer have to move all of the cards, and if we click the "Select" button enough times we'll see that this method is just as fair as the mix-up method.

I'm stretching the analogy a little here. It would take a long time to count through the deck to get to, say, index 436,234. But when it's an array in memory, computers have no trouble finding an element by its index.

Now let me throw you a curveball: what if I were to show you 1 card at a time, and you had to pick 1 at random?

How many cards are you going to show me?

That's the curveball: you don't know.

Can I hold on to all the cards you give me and then pick 1 after you stop?

No, you can only hold on to 1 card at a time. You're free to swap your card with the newest one each time I show you a card, but you can only hold one and you can't go back to a card you've already seen.

Then it's impossible! Why would I ever need to do this anyway?

Believe it or not, this is a real problem and it has a real and elegant solution.

For example, let's say you're building a log collection service. Text logs, not wooden ones. This service receives log messages from other services and stores them so that it's easy to search them in one place.

One of the things you need to think about when building a service like this is what do you do when another service starts sending you way too many logs. Maybe it's a bad release, maybe one of your videos goes viral. Whatever the reason, it threatens to overwhelm your log collection service.

Let's simulate this. Below you can see a stream of logs that experiences periodic spikes. A horizontal line indicates the threshold of logs per second that the log collection service can handle, which in this example is 5 logs per second.

You can see that every so often, logs per second spikes above the threshold. One way to deal with this is “sampling”: deciding to send only a fraction of the logs to the log collection service. Let’s send 10% of the logs.

Below we will see the same simulation again, but this time logs that don't get sent to our log collection service will be greyed out. The graph has 2 lines: a black line tracks sent logs, the logs that are sent to our log collection service, and a grey line tracks total logs.

The rate of sent logs never exceeds the threshold, so we never overwhelm our log collection service. However, in the quieter periods we're throwing away 90% of the logs when we don't need to!

What we really want is to send at most 5 logs per second. This would mean that during quiet periods you get all the logs, but during spikes you discard logs to protect the log collection service.

The simple way to achieve this would be to send the first 5 logs you see each second, but this isn't fair. You aren't giving all logs an equal chance of being selected.

# Sampling when you don't know the size

We instead want to pick a fair sample of all the logs we see each second. The problem is that we don't know how many we will see. Reservoir sampling is an algorithm that solves this exact problem.

1 second isn't a long time, can't we just store all the messages we see and then use the select method from way back up there?

You could, but why live with that uncertainty? You'd be holding on to an unknown number of logs in memory. A sufficiently big spike could cause you problems. Reservoir sampling solves this problem, and does so without ever using more memory than you ask it to.

Let's go back to our curveball of me showing you 1 card at a time. Here's a recap of the rules:

  1. I'll draw cards one at a time from a deck.
  2. Each time I show you a card, you have to choose to hold it or discard it.
  3. If you were already holding a card, you discard your held card before replacing it with the new card.
  4. At any point I can stop drawing cards and whatever card you're holding is the one you've chosen.

How would you play this game in a way that ensures all cards have been given an equal chance to be selected when I decide to stop?

How about we flip a coin every new card? If it's heads, we keep the card we have. If it's tails, we swap it out for the new card.

You're on the right track. Let's have a look at how the coin flip idea plays out in practice. Below you see a deck of cards. Clicking "Deal" will draw a card and 50% of the time it will go to the discard pile on the right, and 50% of the time it will become your held card in the center, with any previously held card moving to the discard pile.

The problem is that while the hold vs discard counts are roughly equal, which feels fair, later cards are much more likely to be held when I stop than earlier cards. The first card drawn has to win 10 coin flips to still be in your hand after the 10th card is drawn. The last card only has to win 1.

Scrub the slider below to see how the chances change as we draw more cards. Each bar represents a card in the deck, and the height of the bar is the chance we're holding that card when I stop. Below the slider are the chances we're holding the first card drawn vs. the last card drawn.

Anything drawn more than 15 cards ago has less than a 0.01% chance of being held when I stop.

You said I was on the right track! How can this be the right track when I'm more likely to win the lottery than to be holding the card I saw 24 draws ago?

Because believe it or not, we only have to make one small change to this idea to make it fair.

Instead of flipping a coin to decide whether to hold the card, we give each new card a 1/n chance of being held, where n is the number of cards we've seen so far.

Wait, that's it? That makes it fair?

Yep! In order to be fair, every card must have an equal chance of being selected. So for the 2nd card, we want both cards to have a 1/2 chance. For the 3rd card, we want all 3 cards to have a 1/3 chance. For the 4th card, we want all 4 cards to have a 1/4 chance, and so on. So if we use 1/n for the new card, we can at least say that the new card has had a fair shot.

Let's have a look at the chances as you draw more cards with this new method.

I get how each new card has the right chance of being selected, but how does that make the older cards fair?

So far we've focused on the chance of the new card being selected, but we also need to consider the chance of the card you're holding staying in your hand. Let's walk through the numbers.

# Card 1

The first card is easy: we're not holding anything, so we always choose to hold the first card. The chance we're holding this card is 1/1, or 100%.

Hold: 100%
Replace: —

# Card 2

This time we have a real choice. We can keep hold of the card we have, or replace it with the new one. We've said that we're going to do this with a 1/n chance, where n is the number of cards we've seen so far. So our chance of replacing the first card is 1/2, or 50%, and our chance of keeping hold of the first card is its chance of being chosen last time multiplied by its chance of not being replaced, so 100% * 1/2, which is again 50%.

Hold: 100% × 1/2 = 50%
Replace: 1/2 = 50%

# Card 3

The card we're holding has a 50% chance of being there. This is true regardless of what happened up to this point. No matter whether we're holding card 1 or card 2, it's 50%.

The new card has a 1/3 chance of being selected, so the card we're holding has a 1/3 chance of being replaced. This means that our held card has a 2/3 chance of remaining held. So its chances of "surviving" this round are 50% * 2/3.

Hold: 50% × 2/3 = 33.33%
Replace: 1/3 = 33.33%

# Card N

This pattern continues for as many cards as you want to draw. We can express both options as formulas. Drag the slider to substitute n with real numbers and see that the two formulas are always equal.

Hold: 1/(n-1) × (1 − 1/n)
Replace: 1/n

If 1/n is the chance of selecting the new card on this draw, then 1/(n-1) was the chance of holding our current card after the previous draw. The chance of not selecting the new card is the complement of 1/n, which is 1-(1/n).
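To check with real numbers (my own arithmetic here, standing in for the slider): with n = 3, the hold chance is 1/2 × (1 − 1/3) = 1/2 × 2/3 = 1/3, exactly the same 1/3 chance the new card gets.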

Below are the cards again except this time set up to use 1/n instead of a coin flip. Click to the end of the deck. Does it feel fair to you?

There's a good chance that through the 2nd half of the deck, you never swap your chosen card. This feels wrong, at least to me, but as we saw above the numbers say it is completely fair.
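If you'd rather see the single-card version as code, here is a minimal sketch in Python (my own illustration; the original post explains this with interactive cards rather than code):

```python
import random

def reservoir_sample_one(stream):
    """Hold one item at a time; return a uniformly random item from a stream of unknown length."""
    held = None
    for n, item in enumerate(stream, start=1):
        # The nth item we see replaces the held item with probability 1/n.
        if random.random() < 1 / n:
            held = item
    return held
```

Run it over `range(10)` a few thousand times and each number should come up roughly 10% of the time.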

# Choosing multiple cards

Now that we know how to select a single card, we can extend this to selecting multiple cards. There are 2 changes we need to make:

  1. Rather than new cards having a 1/n chance of being selected, they now have a k/n chance, where k is the number of cards we want to choose.
  2. When we decide to replace a held card, we choose one of the k cards we're holding at random.

So our new previous-draw formula becomes k/(n-1), because we're now holding k cards. And the chance that any given held card survives the draw is still 1-(1/n): the new card is selected with probability k/n, and when it is, each of the k held cards is equally likely to be the one replaced, so each held card is replaced with probability 1/n.

Let's see how this plays out with real numbers.

Hold: k/(n-1) × (1 − 1/n)
Replace: k/n
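Plugging in real numbers (my own arithmetic again): with k = 3 and n = 10, the hold chance is 3/9 × (1 − 1/10) = 3/9 × 9/10 = 3/10, which matches the k/n = 3/10 chance the new card gets.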

The fairness still holds, and will hold for any k and n pair. This is because all held cards have an equal chance of being replaced, which keeps them at an equal likelihood of still being in your hand every draw.

A nice way to implement this is to use an array of size k. For each new card, generate a random integer between 0 and n-1. If the random number is less than k, replace the card at that index with the new card. Otherwise, discard the new card.
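Here's that idea as a short Python sketch (my own code, not from the post; note that the first k cards simply fill the array, which the description above glosses over):

```python
import random

def reservoir_sample(stream, k):
    """Return k items chosen uniformly at random from a stream of unknown length."""
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # The first k items fill the reservoir directly.
            reservoir.append(item)
        else:
            # Pick a random index in [0, n). It lands inside the reservoir
            # with probability k/n, and each slot is equally likely.
            j = random.randrange(n)
            if j < k:
                reservoir[j] = item
    return reservoir
```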

And that's how reservoir sampling works!

# Applying this to log collection

Let's take what we now know about reservoir sampling and apply it to our log collection service. We'll set k=5, so we're "holding" at most 5 log messages at a time, and every second we will send the selected logs to the log collection service. After we've done that, we empty our array of size 5 and start again.

This creates a “lumpy” pattern in the graph below, and highlights a trade-off when using reservoir sampling. It’s no longer a real-time stream of logs, but chunks of logs sent at an interval. However, the rate of sent logs never exceeds the threshold, and during quiet periods the two lines track each other almost perfectly.

No logs lost during quiet periods, and never more than threshold logs per second sent during spikes. The best of both worlds. It also doesn't store more than k=5 logs, so it will have predictable memory usage.
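As a rough sketch of how this might look in a log pipeline (my own illustration; the LogSampler class and the send_batch callback are hypothetical names, not from the post):

```python
import random

class LogSampler:
    """Keep a fair sample of at most k logs per flush interval using reservoir sampling."""

    def __init__(self, k=5):
        self.k = k
        self.reservoir = []
        self.seen = 0  # logs offered since the last flush

    def offer(self, log):
        self.seen += 1
        if len(self.reservoir) < self.k:
            self.reservoir.append(log)
        else:
            # Keep this log with probability k/seen, evicting a random held log.
            j = random.randrange(self.seen)
            if j < self.k:
                self.reservoir[j] = log

    def flush(self, send_batch):
        # Call once per second: ship whatever sample we ended up with, then reset.
        send_batch(self.reservoir)
        self.reservoir = []
        self.seen = 0
```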

# Further reading

Something you may have thought while reading this post is that some logs are more valuable than others. You almost certainly want to keep all error logs, for example.

For that use-case there is a weighted variant of reservoir sampling. I wasn't able to find a simpler explanation of it, so that link is to Wikipedia which I personally find a bit hard to follow. But the key point is that it exists and if you need it you can use it.

# Conclusion

Reservoir sampling is one of my favourite algorithms, and I've been wanting to write about it for years now. It allows you to solve a problem that at first seems impossible, in a way that is both elegant and efficient.

Thank you again to ittybit for sponsoring this post. I really couldn't have hoped for a more supportive first sponsor. Thank you for believing in and understanding what I'm doing here.

Thank you to everyone who read this post and gave their feedback. You made this post much better than I could have done on my own, and steered me away from several paths that just weren't working.

If you want to tell me what you thought of this post by sending me an anonymous message that goes directly to my phone, go to https://samwho.dev/ping.


How To Start A School With Your Friends


FractalU is a “school” for adults, taught from living rooms in New York City. We’ve run over 100 classes and taught thousands of students. Classes meet weekly and are held on evenings and weekends, since most of our students and teachers are working professionals.

Here's a small sampling of our recent courses:

We teach out of our homes, and keep the administrative burden low. That way, teachers can offer their classes at affordable rates. And adult students can make taking classes and learning new skills a regular part of their lives.

How It Started…

A few years ago, Mari dropped a message in the group chat. She was going through Andrej Karpathy's online course on AI …did anyone want to take it with her?


It turned out that several other friends were also going through the course. A plan was soon hatched: my friends would meet weekly to watch the lectures and hack on the homework together. They'd do it in person at Merlin's place, our friend’s apartment, which doubles as a community third space.

Mari and Morgan give a presentation.

The average self-paced online course has a completion rate of 5-15%. Perhaps you’ve had the experience of enrolling in a self-paced course only to drop it midway through. Maybe, like me, you even spent money on an online course only to drop out.

It turns out, if you meet up with friends every week to take a course, two things happen:

  1. You have a lot of fun

  2. The average completion rate skyrockets

My friends discovered that the true value of school is the social container. The best classes in the world are freely available online. But going through an online class alone requires willpower that many of us don't have. It’s easy and fun to do focused work when you’re doing it with your friends! And to come back every week until you finish what you set out to do.

You Can Just Learn With Your Friends

One of the participants in this weekly event was my husband Andrew. Considering how enjoyable and useful the container proved to be, Andrew wondered: why aren't we taking classes with our friends regularly?

Thus began FractalU: a low-overhead low-cost “school” where we learn with our friends, from our living rooms.

The first semester of FractalU we ran 4 classes. Two classes continued with this model of co-learning by taking an online course together. But a few of us wanted to teach original material and designed courses from scratch.

Our first classes were:

  1. Foundations of Computing: From NAND to Tetris, an online class TA’ed by Andrew

  2. Building LLMs in Practice, TA’ed by Chris

  3. Body, Mind, World, an original class created by Tyler and Alicia

  4. How to Live Near Your Friends, an original class created by Priya, the author of this post

Eric and Jesse give a guest lecture

We had about 50 students total that first semester.

But then something surprising happened — five of those students wanted to teach a class the next semester. Plus, two people who had guest taught in a class wanted to run their own classes. We also pitched a few other friends who we admired on running their own classes. In our second semester, we ran 18 classes with over 200 students!

How It’s Going…

We just finished our fifth semester of FractalU. We now offer 20 to 30 courses per semester. We’ve had 62 awesome instructors, who have taught thousands of students.

Most FractalU instructors aren't professional teachers; they have regular day jobs. They teach because they love their subject and want to make friends who share their niche interests. Sometimes they teach in order to gain a better understanding of the subject themselves.

Teaching painting from a tiny New York City living room.

FractalU Doesn’t Exist 👻

So how did we grow from a few humble classes taught by and to our friends to what we have now?

FractalU isn't a business or a nonprofit. In fact, it's not a formal organization at all. It's simply a coordination mechanism with a tiny bit of volunteer admin work done by a team of four.

Three times a year we email instructors and send them a Notion Guide for the upcoming semester. The step-by-step guide helps instructors write a high-quality syllabus and list their class.

The guide also includes a list of spaces where they can host a class. These are living rooms volunteered by our community, in exchange for a small fee. (In addition to local living rooms, we've found two spaces in New York City willing to host: the Google Headquarters and a local dance studio).

After the listing deadline has passed, the admin team advertises the semester. We post to our email list, on social media, and in various local newsletters and lots of group chats.

The instructors handle everything from there: they review applications and email accepted and rejected students.

Most instructors choose to charge a sliding scale rate for their classes. However, since FractalU is not a formal organization, we don't employ anyone. So instructors must deal with collecting money and managing taxes on their own.

We keep admin overhead very low. The admin team only has 3 roles:

  1. Emailing instructors a few times a year.

  2. Advertising online by poasting good.

  3. Curating teachers and classes.

A movement class, on the roof!

Curating teachers and classes

The final task of the admin team is curating teachers and classes. FractalU has grown organically: most of our teachers were first students in another class, and then asked us if they could teach their own class.

We believe the best instructors create high quality social containers. Thus we prefer to choose people we've interacted with a lot both in class and at community events. We want a good sense of their character and social skills before we entrust them to teach.

We also maintain a form where people can apply to teach at FractalU. But it’s rare for us to accept teachers who aren’t already part of the community; it’s hard to vet their competence and social skills from an online form and a Zoom call alone.

Our application process is pretty informal. If someone approaches us to teach, and we have enough data to trust that they’d be a good teacher, we say yes. To keep overhead low, any of the admin team is empowered to independently say yes to a class. If we receive a compelling application from a stranger, we do a video call with them to see if they would be a good fit.

Want To Start a School?

This is the beginning of a two part series. Tomorrow I’ll drop part 2, which is a step-by-step guide on how to start something like this.

If you want even more hands-on help, Tyler and I are running an accelerator this summer to help people start their own campuses, complete with a school, co-living, and regular events. Check it out here.

Fractal Campus Accelerator ↗


Conan O’Brien’s (Mostly Serious) Tips for Traveling the World With Just a Carry-On

1 Share

Rick Steves has nothing on Conan O’Brien.

The comedian and erstwhile late-night host has stuffed a suitcase full of fish in Norway, herded sheep in Armenia, rented a fake family in Japan, and told a South Korean soldier he looked like a Janet Jackson backup dancer.

In fact, O’Brien has gone to some 20 countries for his travel comedy shows, including two seasons of Conan Without Borders and its successor, the Emmy Award–winning Conan O’Brien Must Go, now streaming its second season on MAX. Recently, I spoke with O’Brien in a video call to discuss packing tips and product recommendations hard-won by his years of crisscrossing the globe.

And while O’Brien did suggest bringing items like an “emotional support scorpion” and “as garish and horrific a fanny pack as possible,” he actually has some very good packing advice, including remembering key items, a few of which are also Wirecutter favorites. Below are some of his tried-and-true tips for traveling the world.


When it comes to brains, size doesn’t matter much


Image credit: Ionut Stefan

It all started from a reasonable assumption: bigger brains pack more neurons, and more neurons make one smarter, ergo bigger brain = smarter cookie. But if there’s one thing brains love, it’s nonlinearities (to be read as “freaking messes!”). To understand what that means, we’ll talk about how brains of different sizes are structured, and if it’s not size, what features actually make them smart.

We can approach this discussion at two levels: between species and within species. At the first level, brain size varies dramatically (imagine mouse vs. elephant), giving us a coarse-grained understanding of why size alone doesn’t explain intelligence. At the second one, we’ll look at comparisons between people, where the differences in brain size are much smaller, but the data is richer, giving us a more fine-grained picture of what might be going on.

What is intelligence

In both cases, we need to define what “smart” means. Intelligence, like a lot of other higher-order cognitive concepts, suffers from definition fuzziness: there are a bunch of ways to define it, and if we don’t clarify this upfront, we risk talking past each other. For this article, we’ll focus on general intelligence, or mental ability.

Another point to consider is that we’ll be talking about both humans and other species. For all of them, general intelligence includes the ability to solve problems and come up with novel solutions, the capacity to learn and change one’s behavior based on experience, and the ability to think abstractly.

In humans, however, intelligence tests obviously rely heavily on verbal ability, and many studies use something called the g factor. This variable summarizes positive correlations between different cognitive tasks, and is used to reflect something we’ve observed empirically: that people who do well on one cognitive test usually do well on others too.

There have been attempts to use the g factor for quantifying intelligence in other animals, but as you can imagine, the lack of verbal ability makes standardized inter-species comparisons extremely challenging. Instead, researchers rely on a set of indirect tests. These let them measure things such as learning and problem-solving, memory capacity, or even the ability for self-recognition (using the mirror test).

The lack of standardization makes it difficult to say with certainty that, for example, a crow is more intelligent than a baboon. Yes, crows can do geometry and baboons apparently can’t, but is that all it takes to be smart? Still, imperfect as they are, these measures allow us to challenge the assumption that bigger brains are smarter: if a crow, with a much smaller brain than a baboon’s, can do geometry, clearly there’s more at play than sheer brain size.

The comparison between species

But let’s back up a bit. In the section above, it seemed reasonable to define “smart”, but what if I told you we also need a definition for “brain size”? It might seem a bit ridiculous, but there really is more than one interpretation for this term. We saw the first one in the crow-baboon example, where we introduced absolute brain size. The problem with this is that absolute brain size correlates strongly with body size: larger animals have larger brains. And we don’t even need to compare crows and baboons, we can compare ourselves to whales and elephants. Even though they’re quite intelligent animals, they’re still not exactly on our level. So absolute brain size is not a good indicator of intelligence.

Another way to define brain size would be relative to the body weight. Take humans and whales again: the brain of a human weighs about 1.5 kg, and that of a whale about 9 kg. In terms of absolute brain size, whales win hands down. But if we look at brain weight as a percentage of the body weight, we get about 2.5% in humans and a measly 0.02% in whales. We now have a data point indicating that a species with a higher relative brain size is also more intelligent. We can now expand this to as many species as possible and see if it still holds. It’s not a big surprise that it doesn’t, but I bet you won’t guess which animal breaks the pattern. It’s the Etruscan shrew (this little guy), with a brain weight of about 0.1 g and a body weight of only 2 g, giving us a 5% value, double that of humans!

Alright, that’s another simple explanation gone down the drain. Back to the drawing board it is. We said that larger bodies go hand in hand with larger brains. Now, if intelligence didn’t play any role at all, we could assume that the brain is simply increasing in size because it needs to manage a larger body. In that case, if we knew an animal’s body size, we could mathematically predict what its absolute brain size should be. If we saw any increase in size on top of that, we could only assume that’s due to the extra brain being used for intelligence. That’s the simplest explanation for what scientists termed the third way of defining brain size, the encephalization quotient (EQ). As you might already guess, that didn’t work out very well either. Humans came up pretty well on this metric, but chimpanzees, gorillas, and whales, animals which we know are fairly smart, scored quite low EQs. More attempts were made to improve the EQ calculation formula. These only succeeded in making it more complicated, so I won’t bore you with the details. Bottom line is, the idea that brain size is related to intelligence across species was examined from multiple angles and it always came up short.

But why? Why? Why?

Well, for a bunch of reasons. Let’s start with our first assumption: “bigger brains pack more neurons”. Remember that? Across species, that’s not always true. What’s more, which brain regions have more neurons is also very important. As an example, elephants have 3 times (!) more neurons than humans. But in elephants, a lot of these neurons are found in the cerebellum, not in the cortex (presumably to control the fine-grained movements of the trunk), the neurons themselves are larger, and their number per cubic millimeter is much lower (only 6,000-7,000 neurons/mm³, compared to 25,000-30,000 neurons/mm³ in humans).

In contrast, although crows have tiny brains compared both to humans and elephants, their neuronal density is nothing short of impressive: in the nidopallium, a region used for executive tasks, it can reach about 130,000-160,000 neurons/mm³. With such numbers, it’s no wonder crows can rival and even outperform some primates on cognitive tasks.

However, as we’ll see more clearly in the next section, the number, density, or location of neurons aren’t the only factors that matter. How they are connected and how fast information can travel between them also play important roles.

The comparison within species

The comparisons above showed us that, across species, brain size isn’t a good predictor of intelligence. Brains don’t just scale up, but some of their properties, such as neuron density or size, also change between species. But within species, and more specifically between people, we don’t expect such significant differences. That’s why, before diving into more complex structural features, it’s worth asking again how the relation between brain size and intelligence holds up in humans.

It turns out that it works slightly better. There is a small, positive correlation between brain size and the g factor. The correlation coefficient r (which goes from -1 to 1, with 1 indicating perfectly correlated, 0 meaning no correlation, and -1 perfectly anticorrelated) sits in the range of 0.2-0.3. It’s not much, but it’s honest work. What’s more, separating the brain into gray and white matter and correlating their volumes with intelligence shows that gray matter is the driver of this effect. But even so, this only explains a small part of the picture. So… where is the rest coming from?

We’ve already hinted above that it has to do with connections and information speed. In more concrete terms, we know the brain is basically a network of neurons. And the information-processing capacity of this network is what translates into intelligence. Unfortunately, studying large networks like the brain and how their properties relate to constructs such as intelligence tends to be a bit…complicated. That’s why, even though gathering relevant data in humans is much easier compared to other species, the full picture is still quite murky.

What we know so far is that networks connected more efficiently appear to correlate with higher intelligence and that good myelination is important for cognitive processing speed. In terms of theories, perhaps the most well-established one is the parieto-frontal integration theory, which tells us that a network formed by lateral frontal and parietal areas is highly relevant for intelligence. However, newer studies suggest it’s not just these regions, but how the entire brain is structured, that determines intelligence.

To sum up

Brains are complicated. And although it seems easy to assume that larger brains with more neurons can do more, nature doesn’t agree. There’s still a lot of work needed to determine what makes a brain smart. But so far we’ve learned that where those neurons are situated and how efficiently information can flow between them trumps simple upscaling. Maybe something to keep in mind for other so-called “brain-like” systems.

What did you think about this post? Let us know in the comments below. And if you’d like to support our work, feel free to share it with your friends, buy us a coffee here, or even both.


References
Barbey, A. K., Karama, S., & Haier, R. J. (Eds.). (2021). The Cambridge Handbook of Intelligence and Cognitive Neuroscience. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108635462

Coyle, T. R. (2021). Defining and Measuring Intelligence. The Cambridge Handbook of Intelligence and Cognitive Neuroscience, 3–25. https://doi.org/10.1017/9781108635462.003

Dicke, U., & Roth, G. (2016). Neuronal factors determining high intelligence. Philosophical Transactions of the Royal Society B: Biological Sciences, 371(1685), 20150180. https://doi.org/10.1098/rstb.2015.0180

Herculano-Houzel, S., Avelino-de-Souza, K., Neves, K., Porfírio, J., Messeder, D., Mattos Feijó, L., Maldonado, J., & Manger, P. R. (2014). The elephant brain in numbers. Frontiers in Neuroanatomy, 8. https://doi.org/10.3389/fnana.2014.00046

Sablé-Meyer, M., Fagot, J., Caparos, S., van Kerkoerle, T., Amalric, M., & Dehaene, S. (2020). Sensitivity to geometric shape regularity in humans and baboons: A putative signature of human singularity. https://doi.org/10.31234/osf.io/hj3m6

Schmidbauer, P., Hahn, M., & Nieder, A. (2025). Crows recognize geometric regularity. Science Advances, 11(15). https://doi.org/10.1126/sciadv.adt3718

Ströckens, F., Neves, K., Kirchem, S., Schwab, C., Herculano‐Houzel, S., & Güntürkün, O. (2022). High associative neuron numbers could drive cognitive performance in corvid species. Journal of Comparative Neurology, 530(10), 1588–1605. Portico. https://doi.org/10.1002/cne.25298

van den Heuvel, M. P., Stam, C. J., Kahn, R. S., & Hulshoff Pol, H. E. (2009). Efficiency of Functional Brain Networks and Intellectual Performance. Journal of Neuroscience, 29(23), 7619–7624. https://doi.org/10.1523/jneurosci.1443-09.2009

