The Entire Internet Is a UGC Reaction Video Now

I keep a folder in Apple Notes called “cursed websites,” where I save various artefacts that make me feel like the social contract has dissolved. Call it an act of self-loathing. Call it collecting evidence of the fall. Dansugc.com went straight into the folder this morning.

It’s a site where marketers / entrepreneurs (and I find the line between those two groups has blurred to the point of being illegible) can buy pre-recorded “Reaction” videos for $3 apiece. You browse a library of 2,000 clips, sorted by emotion (shocked! happy! crying! excited!), pick a face you find particularly appealing, download a 5-10 second clip of a stranger performing surprise / delight at nothing in particular, and splice it into your own content. The idea is to make it look like an actual someone had an actual emotional response to your app on TikTok. Custom orders run to $8 and let you specify outfits, emotional arcs, and so on.

The tagline reads: “100% Real Humans. Zero AI.”

And I think that tells you almost everything you need to know about where we are. And where we are is a place where “at least the fake fuckery was produced by a biological organism” counts as a premium feature. The pitch for selling manufactured authenticity at scale is that at least the people in the factory are still real people. That’s the floor. That’s what passes for premium. We are drowning in content that is functionally indistinguishable from its fakes, like designer handbags stitched up in the same factory as the dupes.

The internet is the most powerful communication technology in human history, and we’re using it to sell each other $3 clips of faked surprise.

I don’t blame Dan, if that’s even his real name. He’s running a business that fills a niche. He’s recognised that the entire internet advertising “ecosystem” now runs on simulated, casual, spontaneous “cool girl” energy. He’s simply the shovel-seller in an authenticity gold rush, except the gold is parasocial trust, and the shovels are clips of various women pretending to have their minds blown by your calorie-counting app.

100 ready-to-post UGC videos per month cost $800. A fully managed campaign of 500 videos goes for $10k. Dan claims over 5 billion total views generated, and I don’t doubt his numbers at all. But if this stuff doesn’t set off your alarms, even a little, you’ve probably been marinating in it so long you’ve lost the ability to smell it.

What Dan’s business lays bare - if you actually sit with it - is that the internet, as a social and cultural space, is now almost entirely performance. The whole apparatus has been hollowed into a content mill that grinds human attention into micro-conversions. I’m aware that I’m not the first person to complain that the internet sucks - but every point of suck has now compounded into the final boss of shitty experiences. The algorithmic timelines, the social media homogeneity, the death of truth, the proliferation of monetisation strategies and side hustles have all contributed to this moment: a growth-hacked, engagement-optimised, brand-building logic that has destroyed our ability to distinguish between a person sharing something they give a shit about and a person executing a “content” strategy.

Open TikTok right now and try to find a video that isn’t, at some level, trying to sell you something. A political identity, a digital product, a lifestyle, a personal brand. It's next to impossible. Every piece of content carries a faint whiff of “strategy” behind it. The girl doing a “Get Ready With Me” video has an affiliate link in her bio, and the asshole ranting about immigration has a Substack he can’t wait to funnel you to. The therapist explaining attachment styles is, naturally, building a course she’ll launch next month, and the couple doing a “day in our van-life” vlog is negotiating a brand deal in their DMs. There is always a funnel, always a CTA, and the output, no matter how “down to earth” it’s designed to feel, is always doubling as a mechanism to convert your attention into revenue.

Jean Baudrillard (read Simulacra and Simulation) identified how modern society replaces reality with the symbols and signs of reality. He mapped the process in four stages: first the image reflects reality, then it masks reality, then it masks the absence of reality, and finally it has no relation to reality whatsoever. A UGC reaction video purchased for $3 and spliced into a TikTok ad is operating at that fourth stage, because the reaction doesn't reference a real reaction, there was never a real reaction, and the whole thing is a sign pointing at nothing, wearing the costume of spontaneity.

You might say who cares, advertising has always been manipulative, and sure, that's true. When Grigory Potemkin allegedly erected fake village facades along the Dnieper River in 1787 to impress Empress Catherine II during her tour of Crimea, he was doing UGC marketing for the Russian Empire (the historical consensus is that the villages were probably real settlements that had been tidied up rather than total fabrications, but the legend stuck because the concept is so useful as shorthand). The instinct to manufacture the appearance of prosperity for the benefit of powerful onlookers is old as dirt. What's different now is the scale and the fact that regular people are doing it to each other all day long, voluntarily, for free or for pennies.

There's a phrase you hear in marketing: "everyone is a creator now." It sounds democratizing, hopeful even, like the whole internet has become a Renaissance workshop where artisans and thinkers reach audiences directly. In practice, everyone is a marketer now. The "creator economy" turned out to be an economy where the thing being created, more and more, is demand for more of yourself. Your aesthetic, your opinions, your morning routine, your trauma, your fitness journey, your face: all raw material for the content machine, all measured against growth metrics that would make a mid-career product manager feel right at home.

The result is an internet that feels, to use a technical term, like shit. Scroll any platform and you're wading through a river of optimized slop, and what makes it depressing is how same-y it all is despite the theoretically infinite diversity of human expression available online.

Political content looks like beauty content and beauty content looks like finance content and finance content looks like fitness content, because they’re all using the same hooks and they’re all built on the same emotional beats. The provocative claim in the first two seconds, the false tension, the extreme language, comment if you agree, like and subscribe and so on and on. Don’t forget to share this with someone who “needs to hear this”. Make sure you follow for part 2. The playbook is identical whether someone’s raving about the best skin serum or about their least favourite ethnic groups...

AI makes all of this both worse and darkly funny at the same time. AI slop and human slop have now converged to the point that Dan can credibly market “zero AI” as a premium feature, while his customers’ output offers no real elevation from the realm of deepfakes. And as much as “Real Humans” is a selling point for the internet today, the AI is getting better, too. It can produce damn-near the same hooks, the same engagement-bait captions, the same dead-eyed reactions that a human can churn out today. AI content creators aren’t even poisoning the well; not really. They’re simply drawing from a well we already puked in years ago.

AI slop is human slop with the labor costs removed, and that's why nobody can tell the difference, and that's why Dan has to specify that his product is made by real humans, like a carton of eggs stamped "cage-free."

There's a moment in Don DeLillo's White Noise where a character visits "The Most Photographed Barn in America" and realizes that nobody can actually see the barn anymore because the barn has been completely replaced by the aura of the photographs of the barn. Once you've seen the signs about the barn, he says, it becomes impossible to see the barn. The internet has done this to basically everything. You scroll past enough UGC reaction videos and you can't encounter a real reaction without wondering if it's bought, you read enough performative vulnerability posts and you can't encounter real vulnerability without suspecting it's a hook. The constant presence of the fake thing corrodes your ability to trust the real thing, and the really vicious part is that a lot of the "real things" were fake too, which means the thing you're mourning the loss of may never have existed in the form you remember it.

This is, I suspect, why nostalgia for "the old internet" has become its own genre of content (which is, of course, itself being optimized for engagement, because there's no exit door). People remember a time when someone's blog was their blog and nothing more, when a forum post was written because a person had a thing to say and they said it and moved on. Whether that era was actually as good as we remember is debatable, and I think there's a strong case that we're romanticizing it. Sturgeon's Law applied then too: 90% of everything was crap. But the crap was sincere crap. The crap was some guy with a Blogspot writing 3,000 words about his favorite Star Trek episodes because he liked Star Trek and had opinions about the Borg, with zero intention of building an audience or selling a course called "How I Built a 6-Figure Blog About Star Trek."


How Small Can A Linux Executable Be?

A hex dump of the first iteration of the small ELF file

With the ever-increasing size of various programs (video games being notorious for this), the question of size optimization comes up more and more often. [Nathan Otterness] shows us how it’s done by minifying a Linux “Hello, World!” program to the extreme.

A naive attempt at a minimal hello world in C might land you somewhere around 12-15 kB, but [Nathan] can do much better. He starts by writing everything in assembly, using Linux system calls. This initial version without optimization is 383 bytes. The first major thing to go is the section headers; they are not needed to actually run the program. Now he’s down to 173 bytes. And this is without any shenanigans!
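
If you want to poke at this yourself, all of the section-header bookkeeping lives in the fixed 64-byte Elf64 file header at the front of the binary. Here’s a quick Python sketch (field layout per the standard Elf64 format; point it at any 64-bit ELF binary of your own):

    import struct, sys

    # Read the fixed 64-byte Elf64 file header from the front of the binary
    with open(sys.argv[1], "rb") as f:
        header = f.read(64)
    assert header[:4] == b"\x7fELF", "not an ELF file"

    # The 13 fields that follow the 16-byte e_ident array (little-endian)
    (e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
     e_flags, e_ehsize, e_phentsize, e_phnum,
     e_shentsize, e_shnum, e_shstrndx) = struct.unpack_from(
        "<HHIQQQIHHHHHH", header, 16)

    print(f"entry point {e_entry:#x}, {e_phnum} program header(s) at offset {e_phoff}")
    # The kernel's loader only reads the program headers; the
    # section-header fields can all be zeroed and the binary still runs
    print(f"{e_shnum} section header(s) at offset {e_shoff}, string table index {e_shstrndx}")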

The final tiny ELF file

The first shenanigans are extreme code size optimizations: by selecting instructions carefully (and in a way a C compiler never would), he shaves another 16 bytes off. But the real shenanigans begin when he starts looking for spaces in the ELF header that he can clobber while the program is still accepted by Linux: now he can move his already tiny x86_64 code into these “vacant” spaces in the ELF and program headers for a final tiny ELF file weighing in at just 120 bytes.
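
That final number is no accident, by the way: an Elf64 file header is 64 bytes and a single program header entry is 56 bytes, so 120 bytes suggests the file is exactly one of each and nothing else, with every instruction squatting inside “don’t care” bytes of those two structures. The struct sizes check out in Python:

    import struct

    ELF64_EHDR = "<16sHHIQQQIHHHHHH"  # e_ident plus the 13 fields after it
    ELF64_PHDR = "<IIQQQQQQ"          # p_type, p_flags, then six 8-byte fields

    print(struct.calcsize(ELF64_EHDR))  # 64
    print(struct.calcsize(ELF64_PHDR))  # 56
    print(struct.calcsize(ELF64_EHDR) + struct.calcsize(ELF64_PHDR))  # 120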

P.S.: We know it is possible to make this smaller, but we’ll leave that as an exercise for the reader.


A Very Unscientific History of Scientific Hoaxes


The past, present, and future of academic deception


How to Get to Tomorrow


Mr. Chatterbox is a (weak) Victorian-era ethically trained model you can run on your own computer


Trip Venturella released Mr. Chatterbox, a language model trained entirely on out-of-copyright text from the British Library. Here's how he describes it:

Mr. Chatterbox is a language model trained entirely from scratch on a corpus of over 28,000 Victorian-era British texts published between 1837 and 1899, drawn from a dataset made available by the British Library. The model has absolutely no training inputs from after 1899 — the vocabulary and ideas are formed exclusively from nineteenth-century literature.

Mr. Chatterbox's training corpus was 28,035 books, with an estimated 2.93 billion input tokens after filtering. The model has roughly 340 million parameters, about the same size as GPT-2-Medium. The difference is, of course, that unlike GPT-2, Mr. Chatterbox is trained entirely on historical data.

Given how hard it is to train a useful LLM without using vast amounts of scraped, unlicensed data, I've been dreaming of a model like this for a couple of years now. What would a model trained on out-of-copyright text be like to chat with?

Thanks to Trip we can now find out for ourselves!

The model itself is tiny, at least by Large Language Model standards - just 2.05GB on disk. You can try it out using Trip's HuggingFace Spaces demo:

Screenshot of the demo: a Victorian-themed chat interface titled "🎩 Mr. Chatterbox (Beta)", subtitled "The Victorian Gentleman Chatbot". Asked "How should I behave at dinner?", the bot replies "My good fellow, one might presume that such trivialities could not engage your attention during an evening's discourse!" Asked "What are good topics?", it responds "The most pressing subjects of our society— Indeed, a gentleman must endeavor to engage the conversation with grace and vivacity. Such pursuits serve as vital antidotes against ennui when engaged in agreeable company."

Honestly, it's pretty terrible. Talking with it feels more like chatting with a Markov chain than an LLM - the responses may have a delightfully Victorian flavor to them but it's hard to get a response that usefully answers a question.

The 2022 Chinchilla paper suggests a ratio of roughly 20 training tokens per parameter. For a 340m model that works out to around 7 billion tokens, more than twice the British Library corpus used here. The smallest Qwen 3.5 model is 600m parameters and that model family starts to get interesting at 2b - so my hunch is we would need 4x or more the training data to get something that starts to feel like a useful conversational partner.
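
As a sanity check, here's that back-of-the-envelope arithmetic (the 20:1 ratio is the Chinchilla rule of thumb; the parameter and token counts are from Trip's write-up):

    params = 340e6            # Mr. Chatterbox parameter count
    corpus = 2.93e9           # filtered British Library tokens
    optimal = 20 * params     # Chinchilla-optimal token budget: ~6.8e9
    print(f"{optimal:.2e} tokens suggested, {optimal / corpus:.1f}x the corpus")  # ~2.3x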

But what a fun project!

Running it locally with LLM

I decided to see if I could run the model on my own machine using my LLM framework.

I got Claude Code to do most of the work - here's the transcript.

Trip trained the model using Andrej Karpathy's nanochat, so I cloned that project, pulled the model weights and told Claude to build a Python script to run the model. Once we had that working (which ended up needing some extra details from the Space demo source code) I had Claude read the LLM plugin tutorial and build the rest of the plugin.
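
If you haven't seen that tutorial, the pattern it teaches is pleasantly small. This is just a minimal sketch of the shape, not the actual llm-mrchatterbox code - fake_generate() here is a made-up stand-in for the real nanochat loading and inference:

    import llm

    @llm.hookimpl
    def register_models(register):
        # Tell LLM that this plugin provides a model
        register(MrChatterbox())

    class MrChatterbox(llm.Model):
        model_id = "mrchatterbox"

        def execute(self, prompt, stream, response, conversation):
            # A real plugin would load the cached weights and stream
            # generated tokens here; this sketch yields a canned reply
            yield fake_generate(prompt.prompt)

    def fake_generate(text):
        # Hypothetical stand-in for the nanochat inference code
        return f"My good fellow! You remarked: {text}"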

llm-mrchatterbox is the result. Install the plugin like this:

llm install llm-mrchatterbox

The first time you run a prompt it will fetch the 2.05GB model file from Hugging Face. Try that like this:

llm -m mrchatterbox "Good day, sir"

Or start an ongoing chat session like this:

llm chat -m mrchatterbox

If you don't have LLM installed you can still get a chat session started from scratch using uvx like this:

uvx --with llm-mrchatterbox llm chat -m mrchatterbox

When you are finished with the model you can delete the cached file using:

llm mrchatterbox delete-model

This is the first time I've had Claude Code build a full LLM model plugin from scratch and it worked really well. I expect I'll be using this method again in the future.

I continue to hope we can get a useful model from entirely public domain data. The fact that Trip was able to get this far using nanochat and 2.93 billion training tokens is a promising start.

Tags: ai, andrej-karpathy, generative-ai, local-llms, llms, ai-assisted-programming, hugging-face, llm, training-data, uv, ai-ethics, claude-code


As Slow As Possible

Pippin Barr tests your attention span with weirdly meditative slowed-down remakes of Pong, Breakout, and Missile Command