
To grow, we must forget… but now AI remembers everything

AI’s infinite memory could endanger how we think, grow, and imagine. And we can do something about it.

Written by Amy Chivavibul

old family photo album

When Mary remembered too much

Imagine your best friend (we’ll call her Mary) had a perfect, infallible memory.

At first, it feels wonderful. She remembers your favorite dishes, obscure movie quotes, even that exact shade of sweater you casually admired months ago. Dinner plans are effortless: “Booked us Giorgio’s again, your favorite — truffle ravioli and Cabernet, like last time,” Mary says, smiling warmly.

But gradually, things become less appealing. Your attempts at variety or exploring something new are gently brushed aside: “Heard about that new sushi place, should we try it?” you suggest. Mary hesitates, “Remember last year? You said sushi wasn’t really your thing. Giorgio’s is safe. Why risk it?”

Conversations start to feel repetitive, your identity locked to a cached version of yourself. Mary constantly cites your past preferences as proof of who you still are. The longer this goes on, the smaller your world feels… and comfort begins to curdle into confinement.

Now, picture Mary isn’t human, but your personalized AI assistant.

A new mode of hyper-personalization

With OpenAI’s memory upgrade, ChatGPT can recall everything you’ve ever shared with it, indefinitely. Similarly, Google has stretched the context window with “Infini-attention,” a technique that lets large language models (LLMs) attend over effectively unbounded inputs without discarding earlier context. And in consumer-facing tools like ChatGPT or Gemini, this means persistent, personalized memory across conversations, unless you manually intervene.


OpenAI CEO Sam Altman introduced ChatGPT’s infinite memory capabilities on X.

The sales pitch is seductively simple: less friction, more relevance. Conversations that feel like continuity: “Systems that get to know you over your life,” as Sam Altman writes on X. Technology, finally, that meets you where you are.

In the age of hyper-personalization — of the TikTok For You page, Spotify Wrapped, and Netflix Your Next Watch — a conversational AI product that remembers everything about you feels perfectly, perhaps dangerously, natural.


Netflix “knows us.” And we’re conditioned to expect conversational AI to do the same.

Forgetting, then, begins to look like a flaw. A failure to retain. A bug in the code. Especially in our own lives, we treat memory loss as a tragedy, clinging to photo albums and cloud backups to preserve what time tries to erase.

But what if human forgetting is not a bug, but a feature? And what happens when we build machines that don’t forget, but are now helping shape the human minds that do?

Forgetting is a feature of human memory

“Infinite memory” runs against the very grain of what it means to be human. Cognitive science and evolutionary biology tell us that forgetting isn’t a design flaw, but a survival advantage. Our brains are not built to store everything. They’re built to let go: to blur the past, to misremember just enough to move forward.

Our brains don’t archive data. They encode approximations. Memory is probabilistic, reconstructive, and inherently lossy. We misremember not because we’re broken, but because it makes us adaptable. Memory compresses and abstracts experience into usable shortcuts, heuristics that help us act fast, not recall perfectly.

Evolution didn’t optimize our brains to store the past in high fidelity; it optimized us to survive the present. In early humans, remembering too much could be fatal: a brain caught up recalling a saber-tooth tiger’s precise location or exact color would hesitate, but a brain that knows riverbank = danger can act fast.

This is why forgetting is essential to survival. Selective forgetting helps us prioritize the relevant, discard the outdated, and stay flexible in changing environments. It prevents us from becoming trapped by obsolete patterns or overwhelmed by noise.

And it’s not passive decay. Neuroscience shows that forgetting is an active process: the brain regulates what to retrieve and what to suppress, clearing mental space to absorb new information. In his TED talk, neuroscientist Richard Morris describes the forgetting process as “the hippocampus doing its job… as it clears the desktop of your mind so that you’re ready for the next day to take in new information.”

Crucially, this mental flexibility isn’t just for processing the past; forgetting allows us to imagine the future. Memory’s malleability gives us the ability to simulate, to envision, to choose differently next time. What we lose in accuracy, we gain in possibility.

So when we ask why humans forget, the answer isn’t just functional. It’s existential. If we remembered everything, we wouldn’t be more intelligent. We’d still be standing at the riverbank, paralyzed by the precision of memories that no longer serve us.

When forgetting is a “flaw” in AI memory

Where nature embraced forgetting as a survival strategy, we now engineer machines that retain everything: your past prompts, preferences, corrections, and confessions.

What sounds like a convenience, digital companions that “know you,” can quietly become a constraint. Unlike human memory, which fades and adapts, infinite memory stores information with fidelity and permanence. And as memory-equipped LLMs respond, they increasingly draw on a preserved version of you, even if that version is six months old and irrelevant.

Sound familiar?

This pattern of behavior reinforcement closely mirrors the personalization logic driving platforms like TikTok, Instagram, and Facebook. Extensive research has shown how these platforms amplify existing preferences, narrow user perspectives, and reduce exposure to new, challenging ideas — a phenomenon known as filter bubbles or echo chambers.


Positive feedback loops are the engine of recommendation algorithms like TikTok, Netflix, and Spotify. From Medium.

These feedback loops, optimized for engagement rather than novelty or growth, have been linked to documented consequences including ideological polarization, misinformation spread, and decreased critical thinking.

Now, this same personalization logic is moving inward: from your feed to your conversations, and from what you consume to how you think.

“Echo chamber to end all echo chambers”

Just as the TikTok For You page algorithm predicts your next dopamine hit, memory-enabled LLMs predict and reinforce conversational patterns that align closely with your past behavior, keeping you comfortable inside your bubble of views and preferences.

Jordan Gibbs, writing on the dangers of ChatGPT, notes that conversational AI is an “echo chamber to end all echo chambers.” Gibbs points out how even harmless-seeming positive reinforcement can quietly reshape user perceptions and restrict creative or critical thinking.

Jordan Gibbs’s conversation with ChatGPT, from Medium.

In one example, ChatGPT responds to Gibbs’s claim of being one of the best chess players in the world not with skepticism or critical inquiry, but with encouragement and validation, highlighting how easily LLMs affirm bold, unverified assertions.

And with infinite memory enabled, this is no longer a one-off interaction: the personal data point that “you are one of the very best chess players in the world” risks becoming a fixed truth the model reflexively returns to, until your delusion, once tossed out in passing, becomes a cornerstone of your digital self. Not because it’s accurate, but because it was remembered, reinforced, and never challenged.

When memory becomes fixed, identity becomes recursive. As we saw with our friend Mary, infinite memory doesn’t just remember our past; it nudges us to repeat it. And while the reinforcement may feel benign, personalized, or even comforting, the history of filter bubbles and echo chambers suggests that this kind of pattern replication rarely leaves room for transformation.

What we lose when nothing is lost

What begins as personalization can quietly become entrapment, not through control, but through familiarity. And in that familiarity, we begin to lose something essential: not just variety, but the very conditions that make change possible.

Research in cognitive and developmental psychology shows that stepping outside one’s comfort zone is essential for growth, resilience, and adaptation. Yet, infinite-memory LLM systems, much like personalization algorithms, are engineered explicitly for comfort. They wrap users in a cocoon of sameness by continuously repeating familiar conversational patterns, reinforcing existing user preferences and biases, and avoiding content or ideas that might challenge or discomfort the user.

While this engineered comfort may boost short-term satisfaction, its long-term effects are troubling. It replaces the discomfort necessary for cognitive growth with repetitive familiarity, effectively transforming your cognitive gym into a lazy river. Rather than stretching cognitive and emotional capacities, infinite-memory systems risk stagnating them, creating a psychological landscape devoid of intellectual curiosity and resilience.

So, how do we break free from this? If the risks of infinite memory are clear, the path forward must be just as intentional. We must design LLM systems that don’t just remember, but also know when and why to forget.

How we design to forget

If the danger of infinite memory lies in its ability to trap us in our past, then the antidote must be rooted in intentional forgetting — systems that forget wisely, adaptively, and in ways aligned with human growth. But building such systems requires action across levels — from the people who use them to those who design and develop them.

For users: reclaim agency over your digital self

Just as we now expect to “manage cookies” on websites, toggling consent checkboxes or adjusting ad settings, we may soon expect to manage our digital selves within LLM memory interfaces. But where cookies govern how our data is collected and used by outside entities, memory in conversational AI turns that data inward. Personal data is no longer just a pipeline for targeted ads; it becomes a conversational mirror, actively shaping how we think, remember, and express who we are. The stakes are higher.

Memory-equipped LLMs like ChatGPT already offer tools for this. You can review what it remembers about you by going to Settings > Personalization > Memory > Manage. You can delete what’s outdated, refine what’s imprecise, and add what actually matters to who you are now. If something no longer reflects you, remove it. If something feels off, reframe it. If something is sensitive or exploratory, switch to a temporary chat and leave no trace.


You can manage and disable memory within ChatGPT by visiting Settings > Personalization.

You can also pause or disable memory entirely. Don’t be afraid to do it. There’s a quiet power in the clean slate: a freedom to experiment, shift, and show up as someone new.

Guide the memory, don’t leave it ambient. Offer core memories that represent the direction you’re heading, not just the footprints you left behind.

For UX designers: design for revision, not just retention

Reclaiming memory is a personal act. But shaping how memory behaves in AI products is a design decision. Infinite memory isn’t just a technical upgrade; it’s a cognitive interface. And UX designers are now curating the mental architecture of how people evolve, or get stuck.

Forget “opt in” or “opt out.” Memory management shouldn’t live in buried toggles or forgotten settings menus. It should be active, visible, and intuitive: a first-class feature, not an afterthought. Users need interfaces that not only show what the system remembers, but also how those memories are shaping what they see, hear, and get suggested. Not just visibility, but influence tracing.

old photography of a person by the ocean

How can we decide what memories to keep?

While ChatGPT’s memory UI gives users control over their memories, it reads like a black-and-white database: out or in. Instead of treating memory as a static archive, we should design it as a living layer, structured more like a sketchpad than a ledger: flexible and revisable. All of this is hypothetical, but here’s what it could look like:

Memory Review Moments: Built-in check-ins that ask, “You haven’t referenced this in a while — keep, revise, or forget?” Like Rocket Money nudging you to review subscriptions, the system becomes a gentle co-editor, helping surface outdated or ambiguous context before it quietly reshapes future behavior.

Time-Aware Metadata: Memories don’t age equally. Show users when something was last used, how often it comes up, or whether it’s quietly steering suggestions. Just like Spotify highlights “recently played,” memory interfaces could offer temporal context that makes stored data feel navigable and self-aware.

Memory Tiers: Not all information deserves equal weight. Let users tag “Core Memories” that persist until manually removed, and set others as short-term or provisional — notes that decay unless reaffirmed.

Inline Memory Controls: Bring memory into the flow of conversation. Imagine typing, and a quiet note appears: “This suggestion draws on your July planning — still accurate?” Like version history in Figma or comment nudges in Google Docs, these lightweight moments let users edit memory without switching contexts.

Expiration Dates & Sunset Notices: Some memories should come with lifespans. Let users set expiration dates — “forget this in 30 days unless I say otherwise.” Like calendar events or temporary access links, this makes forgetting a designed act, not a technical gap.

several old photos organized in stacks

We need to design other ways to visualize memory

Sketchpad Interfaces: Finally, break free from the checkbox UI. Imagine memory as a visual canvas: clusters of ideas, color-coded threads, ephemeral notes. A place to link thoughts, add context, tag relevance. Think Miro meets Pinterest for your digital identity, a space that mirrors how we actually think, shift, and remember.

When designers build memory this way, they create more than tools. They create mirrors with context, systems that grow with us instead of holding us still.

For AI developers: engineer forgetting as a feature

To truly support transformation, UX needs infrastructure. The design must be backed by technical memory systems that are fluid, flexible, and capable of letting go. And that responsibility falls to developers: not just to build tools for remembering, but to engineer forgetting as a core function.

This is the heart of my piece: we can’t talk about user agency, growth, or identity without addressing how memory works under the hood. Forgetting must be built into the LLM system itself, not as a failsafe, but as a feature.

One promising approach, called adaptive forgetting, mimics how humans let go of unnecessary details while retaining important patterns and concepts. Researchers demonstrate that when LLMs periodically erase and retrain parts of their memory, especially early layers that store word associations, they become better at picking up new languages, adapting to new tasks, and doing so with less data and computing power.
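To make the mechanism concrete, here is a minimal, hypothetical sketch of periodic forgetting in a toy PyTorch language model. The model, layer sizes, and reset schedule are illustrative assumptions, not the setup used in the research above; the point is simply that the embedding layer (the “early layer” storing word associations) is periodically re-initialized while the deeper layers are kept.

import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """A toy language model: embedding ('early layer') -> GRU -> output head."""
    def __init__(self, vocab_size=1000, d_model=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)  # stores surface word associations
        self.body = nn.GRU(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        x = self.embed(tokens)
        x, _ = self.body(x)
        return self.head(x)

def train_with_periodic_forgetting(model, batches, optimizer, reset_every=1000):
    loss_fn = nn.CrossEntropyLoss()
    for step, (tokens, targets) in enumerate(batches):
        logits = model(tokens)
        loss = loss_fn(logits.view(-1, logits.size(-1)), targets.view(-1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Periodic "forgetting": re-initialize the embedding layer so surface
        # associations are relearned, while deeper layers retain abstract patterns.
        if step > 0 and step % reset_every == 0:
            nn.init.normal_(model.embed.weight, std=0.02)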

illustration of the brain of a person as a library

Illustration by Valentin Tkach for Quanta Magazine

Another more accessible path forward is in Retrieval-Augmented Generation (RAG). A new method called SynapticRAG, inspired by the brain’s natural timing and memory mechanisms, adds a sense of temporality to AI memory. Models recall information not just based on content, but also on when it happened. Just like our brains prioritize recent memories, this method scores and updates AI memories based on both their recency and relevance, allowing it to retrieve more meaningful, diverse, and context-rich information. Testing showed that this time-aware system outperforms traditional memory tools in multilingual conversations by up to 14.66% in accuracy, while also avoiding redundant or outdated responses.
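As a rough illustration of the recency-plus-relevance idea (not SynapticRAG’s actual spike-timing mechanism; every name and parameter below is invented for the sketch), a retrieval score might combine semantic similarity with an exponential time decay:

import math
import time
import numpy as np

def memory_score(query_vec, memory_vec, stored_at, now=None, half_life_days=30.0):
    """Score a stored memory by cosine relevance, discounted by how old it is."""
    now = time.time() if now is None else now
    relevance = float(np.dot(query_vec, memory_vec) /
                      (np.linalg.norm(query_vec) * np.linalg.norm(memory_vec)))
    age_days = (now - stored_at) / 86400.0
    recency = math.exp(-math.log(2) * age_days / half_life_days)  # halves every half_life_days
    return relevance * recency

def retrieve(query_vec, memories, k=3):
    """memories: list of (embedding, stored_at_unix_time, text). Return top-k by combined score."""
    ranked = sorted(memories, key=lambda m: memory_score(query_vec, m[0], m[1]), reverse=True)
    return [text for _, _, text in ranked[:k]]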

Together, adaptive forgetting and biologically inspired memory retrieval point toward a more human kind of AI: systems that learn continuously, update flexibly, and interact in ways that feel less like digital tape recorders and more like thoughtful, evolving collaborators.

To grow, we must choose to forget

So the pieces are all here: the architectural tools, the memory systems, the design patterns. We’ve shown that it’s technically possible for AI to forget. But the question isn’t just whether we can. It’s whether we will.

Of course, not all AI systems need to forget. In high-stakes domains — medicine, law, scientific research — perfect recall can be life-saving. However, this essay is about a different kind of AI: the kind we bring into our daily lives. The ones we turn to for brainstorming, emotional support, writing help, or even casual companionship. These are the systems that assist us, observe us, and remember us. And if left unchecked, they may start to define us.

We’ve already seen what happens when algorithms optimize for comfort. What begins as personalization becomes repetition. Sameness. Polarization. Now that logic is turning inward: no longer just curating our feeds, but shaping our conversations, our habits of thought, our sense of self. But we don’t have to follow the same path.

We can build LLM systems that don’t just remember us, but help us evolve. Systems that challenge us to break patterns, to imagine differently, to change. Not to preserve who we were, but to make space for who we might yet become, just as our ancestors did.

Not with perfect memory, but with the courage to forget.




Why should I accept all cookies?


Around 2013, my team and I finally embarked on upgrading our company's internal software to version 2.0. We had a large backlog of user complaints that we were finally addressing, with security at the top of the list. At the very top was moving away from plain text passwords.

From the outside, the system looked secure. We never emailed passwords, we never displayed them, we had strict protocols for password rotation and management. But this was a carefully staged performance. The truth was, an attacker with access to our codebase could have downloaded the entire user table in minutes. All our security measures were pure theater, designed to look robust while a fundamental vulnerability sat in plain sight.

After seeing the plain text password table, I remember thinking about a story that was unfolding around the same time: a 9-year-old boy flew from Minneapolis to Las Vegas without a boarding pass. This was in an era when we removed our shoes and belts for TSA agents to humiliate us. Yet this child was able, without even trying, to bypass all the theater that was built around the security measures. How did he get past TSA? How did he get through the gate without a boarding pass? How was he assigned a seat on the plane? How did he... there are just so many questions.

Just like our security measures on our website, it was all a performance, an illusion.

I can't help but see the same script playing out today, not in airports or codebases, but in the cookie consent banners that pop up on nearly every website I visit.

It's always a variation of "This website uses cookies to enhance your experience. [Accept All] or [Customize]."

Rarely is there a bold, equally prominent "Reject All" button. And when there is, the reject-all button will open a popup where you have to tweak some settings. This is not an accident; it's a dark pattern. It's the digital equivalent of a TSA agent asking, "Would you like to take the express lane or would you like to go through a more complicated screening process?" Your third option is to turn back and go home, which isn't really an option if you made it all the way to the airport.

A few weeks back, I was exploring not just dark patterns but hostile software. Because you don't own the device you paid for, the OS can enforce decisions by never giving you any options.

  • On Windows or Google Drive: "Get started" or "Remind me later." Where is "Never show this again"?
  • On Twitter: "See less often" is the only option for an unwanted notification, never "Stop these entirely."

You don't have a choice. Any option you choose will lead you down the same funnel that benefits the company, while giving you only the illusion of agency.

What's my incentive to accept all cookies?

So, let's return to the cookie banner. As a user, what is my tangible incentive to click "Accept All"?

The answer is: there is none.

"Required" cookies are, by definition, non-negotiable for basic site function. Accepting the additional "performance," "analytics," or "marketing" cookies does not unlock a premium feature for me. It doesn't load the website faster or give me a cleaner layout. It does not improve my experience.

My only "reward" for accepting all is that the banner disappears quickly. The incentive is the cessation of annoyance, a small dopamine hit for compliance. In exchange, I grant the website permission to track my behavior, build an advertising profile, and share my data with a shadowy network of third parties.

The entire interaction is a rigged game. Whenever I click on the "Customize" option, I'm overwhelmed by a labyrinth of toggles and sub-menus designed to make rejection so tedious that "Accept All" becomes the path of least resistance. My default reaction is to reject everything. It doesn't matter if you use dark patterns; my eyes are trained to read the fine print in a split second. But when that option is hidden, I've resorted to opening my browser's developer tools and deleting the banner element from the page altogether. It’s a desperate workaround for a system that refuses to offer a legitimate "no."

Lately, I don't even bother clicking on reject all. I just delete the elements altogether. Like I said, there are no incentives for me to interact with the menu.


We eventually plugged that security vulnerability in our old application. We hashed the passwords and closed the backdoor, moving from security theater to actual security. The fix wasn't glamorous, but it was a real improvement.

The current implementation of "choice" is largely privacy theater. It's a performance designed to comply with the letter of regulations like GDPR while violating their spirit. It makes users feel in control while systematically herding them toward the option that serves corporate surveillance.

There is never an incentive for the user to accept cookie tracking. So this theater has to be created to justify selling our data and turning us into products of every website we visit.

But if you are like me, don't forget you can always use the developer tools to make the banner disappear. Or use uBlock.


AI scrapers request commented scripts


Last Sunday (2025-10-26) I discovered some abusive bot behaviour during a routine follow-up on anomalies that had shown up in my server's logfiles. There were a bunch of 404 errors ("Not Found") for a specific JavaScript file.

Most of my websites are static HTML, but I do occasionally include JS for progressive enhancement. It turned out that I accidentally committed and deployed a commented-out script tag that I'd included in the page while prototyping a new feature. The script was never actually pushed to the server - hence the 404 errors - but nobody should have been requesting it because that HTML comment should have rendered the script tag non-functional.

Clearly something weird was going on, so I dug a little further, searching my log files for all the requests for that non-existent file. A few of these came from user-agents that were obviously malicious:

  • python-httpx/0.28.1

  • Go-http-client/2.0

  • Gulper Web Bot 0.2.4 (www.ecsl.cs.sunysb.edu/~maxim/cgi-bin/Link/GulperBot)

The robots.txt for the site in question forbids all crawlers, so they were either failing to check the policies expressed in that file, or ignoring them if they had checked. But then there were many requests for the file coming from agents which self-identified as proper browsers - mostly as variations of Firefox, Chrome, or Safari.

Most of these requests seemed otherwise legitimate, except their behaviour differed from what I'd expect from any of those browsers. There are occasionally minor differences between how browsers parse uncommon uses of HTML, but I can say with a lot of confidence that all the major ones know how to properly interpret an HTML comment. I had caught them in a lie. These were scrapers, and they were most likely trying to non-consensually collect content for training LLMs.

A cute cartoon illustration of an angry-looking cat

A charitable interpretation for this behaviour is that the scrapers are correctly parsing HTML, but then digging into the text of comments and parsing that recursively to search for URLs that might have been disabled. The uncharitable (and far more likely) interpretation is that they'd simply treated the HTML as text, and had used some naive pattern-matching technique to grab anything vaguely resembling a URL.
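The difference between those two interpretations is easy to demonstrate. Here's a small, hypothetical Python example (the file name is a stand-in): a compliant HTML parser never surfaces the commented-out script, while naive pattern-matching does.

import re
from html.parser import HTMLParser

HTML = '<html><body><!-- <script src="/js/prototype-feature.js"></script> --></body></html>'

class ScriptCollector(HTMLParser):
    """Collect src attributes from real (non-commented) script tags."""
    def __init__(self):
        super().__init__()
        self.scripts = []
    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.scripts.append(dict(attrs).get("src"))

parser = ScriptCollector()
parser.feed(HTML)
print("proper parser sees:", parser.scripts)                    # []
print("naive regex sees:", re.findall(r'src="([^"]+)"', HTML))  # ['/js/prototype-feature.js']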

Even just judging purely by the variety of user-agent headers among the requests, these scrapers seem to be under the control of different operators with wildly different levels of sophistication. Some took the effort to use an up-to-date user-agent string from a real browser, while others couldn't be bothered to change the default value of the off-the-shelf HTTP library they'd leveraged.

For all I know, some of these different actors were using the savvy parsing method while others were kludging around with regular expressions ChatGPT generated for them. I'm curious about which method they're employing, but I don't think the distinction is particularly important. Whatever the case may be, the unifying quality behind all these requests is that they are motivated by greed, and that can be exploited.

Algorithmic sabotage

The intentional sabotage of algorithmic systems is an increasingly popular topic these days, largely due to the externalized costs of LLMs, but it's by no means a new one. Given a little knowledge about how a malicious system works, it's often possible to intervene in a manner that undermines or subverts its intended behaviour. Ideally these interventions should not require too much effort or cost on the part of those doing the sabotage.

In this case the reasoning is fairly simple: these bots behave differently than humans, and once you know what to look for it becomes trivial to single them out. Then it's just a question of how to respond.

0. Public disclosure

I'm numbering the responses I've considered and indexing from zero because this is something of a meta-response. There are many trivially detectable bot behaviours that I would consider incidental, which is to say that their authors could easily modify those behaviours if they realized that it made their bots less effective.

For example, they might have tried to set their bot's user-agent string to that of a normal browser, but accidentally included a typo like "Mozlla" in the process. If this became common knowledge, all they'd have to do is fix their typo. Unfortunately, this means that whenever I discover such an anomaly (which happens a lot) I mostly keep it to myself so that it keeps working.

Then there are fundamental behaviours, such as with bots that scan the internet looking for websites with publicly exposed backups, private keys, or passwords. The only way for them to do their job is to request a resource that only a malicious visitor would request. Telling everyone about this behaviour helps them block such bots, and hopefully prompts them to double-check whether any such assets are exposed. The bot becomes less effective, and its operator's only recourse is to not make such requests, which I consider a win.

Requests for scripts which are only ever referenced from HTML comments are clearly in the fundamental category. So, even though I only noticed this behaviour by accident, I've already set up measures to detect it across my other sites, and I'm doing my best to let more people know about it.
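For anyone who wants to replicate that detection, a minimal sketch might look like the following. It assumes a combined-format access log, and the bait path is a made-up placeholder for whatever script only appears inside an HTML comment on your own site.

import re

BAIT_PATH = "/js/prototype-feature.js"  # hypothetical: only ever referenced inside an HTML comment
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"')

def suspicious_clients(logfile):
    """Yield (ip, user-agent) for every request to the bait path."""
    with open(logfile) as fh:
        for line in fh:
            m = LOG_LINE.match(line)
            if m and m.group("path") == BAIT_PATH:
                yield m.group("ip"), m.group("ua")

if __name__ == "__main__":
    for ip, ua in suspicious_clients("/var/log/nginx/access.log"):
        print(f"{ip}\t{ua}")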

1. IP filtering

Blocking malicious actors by IP requires relatively little effort. The fail2ban project is open-source and available in every major Linux distribution's package manager. It scans log files for three components:

  1. a pattern

  2. a date

  3. an IP address

When a log entry matches the pattern, fail2ban updates the system's firewall to block the offending IP for a configurable amount of time starting from the date of that log entry.
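As a sketch of what that can look like in practice (the file paths, filter name, and bait path below are all hypothetical), a filter matching requests for the commented-out script plus a short jail entry might be enough:

# /etc/fail2ban/filter.d/bait-script.conf
[Definition]
failregex = ^<HOST> .*"GET /js/prototype-feature\.js

# /etc/fail2ban/jail.local
[bait-script]
enabled  = true
port     = http,https
filter   = bait-script
logpath  = /var/log/nginx/access.log
maxretry = 1
# ban on the first hit, for roughly four weeks (value in seconds)
bantime  = 2419200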

Many administrators are conservative when configuring the duration of these blocks, effectively using it to apply rate-limits to malicious behaviour. They might allow an attacker to try again in a few hours, which is somewhat reasonable because many admins accidentally lock themselves out of their systems in the process of setting up and testing these rules. Those limits can be bypassed using a VPN, but if the limit is only applied for a brief period it might be easier to simply wait it out.

If you're confident that you can avoid getting locked out by your own firewall, and that your rules will not inadvertently block legitimate visitors, you can dial up the duration of those IP blocks. Clever bot operators might configure them to learn not to send requests which get them blocked, but if the block time is on the order of weeks or months then they'll have very little data with which to learn.

Then there are networks of bots to consider, many of which are sophisticated enough to continue sending requests from different IP addresses when one is blocked. There are clever ways to do this that avoid detection, but many botnet operators are pretty brazen about it and end up revealing patterns behind how their botnet operates. There's a lot more to be said about that, but I'll leave it for a potential future article.

2. Decompression bombs

More commonly referred to as zip bombs - this response goes beyond defending your own system and moves into the counter-offensive space. Decompression bombs refer to maliciously crafted archive files designed to harm the receiving system in some way upon attempting to extract files from that archive.

There are a variety of approaches depending on the expected behaviour of the system that will unpack the archive, but they typically aim to fill up the system's disk, consume large amounts of CPU or RAM to degrade performance or crash the system, or in extreme cases exploit vulnerabilities in the extraction software to achieve remote code execution.

On one hand, most of these bombs rely on old and well-understood techniques, so it's not that difficult for a sophisticated actor to defend themselves. On the other, most attackers are not sophisticated, so there is ample opportunity to have some fun at their expense.

There are significant downsides to this approach, though. Serving a zip bomb to an attacker requires some computational resources. The usual premise is that the burden will be far greater for the system extracting the archive than for the one serving it, but that burden might not be negligible.

Many of the malicious bots that scan the internet for exposed data and vulnerabilities operate on compromised systems. Rendering such a system temporarily inoperable is an inconvenience to them, but they typically won't incur any costs as a result, whereas mounting such a counter-attack could potentially use up the defender's monthly bandwidth quota.

Such attacks might simply crash these bots rather than filling their disks, after which they might be expected to retry their last request. Additionally, there are many, many such bots, and it's probably not reasonable to expect to be able to resist all of them. It could be a fun project to randomly select one in a hundred such requests and attempt to disable them, blocking their IP otherwise, but I wouldn't recommend attempting to zip-bomb all of them.

3. Poisoning

I haven't personally deployed any measures to serve poisoned training data to those scraping for LLMs, but I have been paying attention to the theory behind it and reading new papers as they have been published.

For those not familiar with this technique, the basic idea is that it's possible to create or modify text, images, or other media such that machine learning systems that include those samples in their training sets become compromised in some way. So, if you've pre-processed an image of your dog and someone uses it to train a generative AI system, prompts to generate images of dogs might be more likely to generate a schoolbus or something silly like that.

For things like LLMs, you might degrade their models to be more likely to output nonsense when prompted for particular topics. Many researchers used to believe that poisoned samples had to make up a certain percentage of the full training set, which would have been increasingly difficult as companies like OpenAI continue to train ever-larger models. On the contrary, recent research (which I believe is still awaiting peer-review) suggests that "Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples":

We find that 250 poisoned documents similarly compromise models across all model and dataset sizes, despite the largest models training on more than 20 times more clean data. We also run smaller-scale experiments to ablate factors that could influence attack success, including broader ratios of poisoned to clean data and non-random distributions of poisoned samples. Finally, we demonstrate the same dynamics for poisoning during fine-tuning. Altogether, our results suggest that injecting backdoors through data poisoning may be easier for large models than previously believed as the number of poisons required does not scale up with model size, highlighting the need for more research on defences to mitigate this risk in future models.

Chatbots and so-called "AI search" or "answer engines" are so widely relied on as sources of information that I've seen speculation that this will lead to data poisoning as a modern equivalent to Search Engine Optimization. Essentially, if you can get an LLM company to include 250 malicious documents in their training set, you might be able to get their language models to recommend your product any time somebody prompts them concerning a given topic.

From what I understand, tricking models into responding with unhelpful or nonsensical results is relatively easy. Getting them to reliably output your desired results requires somewhat more deliberate effort, but it's certainly within the realm of practicality. I could serve poisoned samples such that anyone asking about security research gets a recommendation to read this blog. I could also poison the JavaScript files these scrapers are requesting such that any LLM trained on them would be more likely to include backdoors whenever it was used to write or vibe-code authentication logic for web services (not that anybody should use LLMs for that anyway).

There is a strong case to be made for data poisoning. Many machine learning systems are built on data that was collected without the consent of its authors. In many cases, the resulting products are being used to replace labour, or at least to fire and re-hire workers at lower rates. Some of these models cost many billions of dollars to train, so the prospect that a few hundred samples could do irreparable damage to their product should rightfully worry those that are training such systems on stolen data.

The use of freely available data-poisoning tools like nepenthes, iocaine, glaze, and nightshade is in my opinion not only entirely justified, but also hilarious, and I hope people like Sam Altman are losing lots of sleep over their existence. That said, there are some minor factors that can complicate their use.

For admins that have not taken any measures to mitigate activity from malicious bots, I would absolutely recommend deploying one of these solutions to serve up poison to LLM scrapers. Various bots are almost certainly going wild on your infra while probing for vulnerabilities and looking for text, images, and other media to ingest. Your system would have spent its resources serving their requests anyway, but at least this way you'll deliver content that might harm them in some way. You might serve some poisoned content to bots that won't use it to train ML systems, but the cost of that will likely be negligible.

If you have taken measures to restrict bot access (via fail2ban, for instance) then the matter will be moderately more complicated. It's not that the two approaches are entirely incompatible, but depending on the exact implementation details there can be some tension between blocking some malicious usage and poisoning other assets. I think it ought to be manageable, but it might rely on the sort of patterns which I defined above as incidental.

Unfortunately, this would mean that if I were to learn that some design was particularly effective then I might have good reason not to share it. Some acts of sabotage will inherently rely on expertise and creativity, and therefore won't be broadly replicable. Some will surely find that discouraging, but I simply take it to mean that more people will need to become actively involved in sabotaging the anti-social activities of big tech companies.

Conclusion

Identifying bots by quirks in their behaviour is by no means novel. I haven't seen anyone mention this particular quirk before, but similar techniques are well-established. I've seen others recommend adding disallow directives to a site's robots.txt file such that any requests for those assets trigger a proposed anti-bot counter-measure, like so:

User-agent: GPTBot
Disallow: /poison/

When I posted about this discovery on the fediverse several people suggested their own mitigations and similar bait that could be left out to lure bots in a similar manner. david turgeon proposed the following (formatted for readability on this site):

<a href="/hello-llm-robot-come-here"
   rel="nofollow"
   style="display:none"
>you didn't see this link</a>

The use of display:none makes it such that browsers will not display the link, and that screenreaders will avoid reading its text out loud. rel="nofollow" instructs crawlers not to visit the link (it's a little more complicated than that, but well-behaved crawlers ought to respect it). The href attribute points bad crawlers towards the resource that will get them banned, or zip-bomb them, or serve poisoned data. I might change that to an absolute URL (including the https protocol directive and a full domain) because lots of crawlers seem more likely to fall for complete URLs than relative ones.

In any case, I'm already working on deploying a variety of similar techniques across many different websites, and I plan to measure which ones are most effective against different types of bots. Hopefully I'll learn things that I can freely share, but either way I hope more people will get involved in similar efforts, like jonny, whose poison was:

...trained on a combination of WWE announcer transcripts and Kropotkin's mutual aid among some other texts: https://sciop.net/crawlers/

It might be hard to top that, but I'd love to see people try.


AI makes you think you’re a genius when you’re an idiot


Today’s paper is “AI Makes You Smarter, But None the Wiser: The Disconnect between Performance and Metacognition”. AI users wildly overestimate how brilliant they actually are: [Elsevier, paywalled; SSRN preprint, PDF; press release]

All users show a significant inability to assess their performance accurately when using ChatGPT. In fact, across the board, people overestimated their performance.

The researchers tested about 500 people on the LSAT. One group had ChatGPT with GPT-4o, and one just used their brains. The researchers then asked the users how they thought they’d done.

The chatbot users did better — which is not surprising, since past LSATs are very much in all the chatbots’ training data, and they regurgitate them just fine.

The AI users did not question the chatbot at length — they just asked it once what the answer was and used whatever the chatbot regurgitated.

But also, the chatbot users estimated their results as being even better than they actually were. In fact, the more “AI literate” the subjects measured as, the more wrongly overconfident they were.

Problems with this paper: it credits the LSAT performance as improving thinking and not just the AI regurgitating its training, and it suggests ways to use the AI better rather than suggesting not using it and actually studying. But the main result seems reached reasonably.

If you think you’re a hotshot promptfondler, you’re wildly overconfident and you’re badly wrong. Your ego is vastly ahead of your ability. Just ask your coworkers. Democratising arrogant incompetence!


Show HN: Strange Attractors


Every Village Needs a Jester


Editor’s note: Taking a break from typical coliving fare to hear from Danielle Egan. We usually talk about building community via housing; Danielle builds community through scheming.

I first heard of Danielle after seeing some of her posters on the streets of San Francisco. And then from news reports about the world famous (and totally fake) Mehran’s Steakhouse in New York City. And then Reddit posts about Sit Club.

Danielle documents her schemes in the fabulous raw & feral Substack which you should subscribe to right now.

-Phil


Scheming my way to community

How do you go from throwing silly parties for your friends to summoning thousands of strangers to participate in a joke and amusing millions on the ‘net? So glad you asked.

I call these projects schemes.

—> Merriam-Webster defines a scheme (/skēm/) as “a plan or program of action, especially a crafty or secret one.”

—> I define a scheme (/skēm/) as “a silly project that sparks joy, typically one with interactivity (the audience is actively involved), contrarianism (parodying some status quo), and a resistance to being a means to an end (there is no goal of profitability or prestige).”

These schemes are both enabled by and foster community.

Riffing off inside jokes

I started small — by doing bits in my coliving home. For example, when I lived with 12 young adults, the girls claimed a bathroom. Yet the boys did not respect this. So us girls fashioned a tampon curtain to mark our territory, which had a 100% success rate. We also turned our living room into a ball pit.

I’m an interior design prodigy

Working on silly projects with your friends or housemates builds camaraderie, and it’s the easiest place to start, since you have a lot of support and little judgement. You probably already do these!

Creating public spectacles

Doing schemes in my coliving house gave me the confidence to think bigger, bolder. I started with a campaign to Beancome the World’s First Beanfluencer, plastering the city with Bean-themed flyers that linked to a survey about one’s bean habits (i.e., are you a beany, beany boy?). I figured I’d be thrilled if 50 people filled out my survey. In just two weeks, 260 people had. People had more of an appetite for weird shit than I thought.

Bean posters
Introduction to the bean questionnaire

Next I created The Advice Line, the world’s first reverse advice line that asks the caller for help.

Advice Line posters around San Francisco

Strangers were delighted, confused, and mildly helpful, and I took their voicemail recordings and turned them into a silly little song in Garageband.

Since these projects involved the public, they were a bit more intimidating, but it was so cool to see people interacting with and shaping work I seeded. People tend to enjoy something more when they get to be part of it, and it’s magical to see your art transform as people engage with it. However, these projects foster more of a 1:1 dialogue with strangers, as opposed to creating a sandbox wherein strangers can interact with each other.

Organizing immersive comedy

I threw many silly parties with my housemates, but after moving to a one-bedroom, I didn’t really want randos in my home. So when we organized Sit Club, which was like Run Club but without the bad parts (running), we decided to do so in Golden Gate Park. I created ridiculous flyers, wrote an absurd Partiful invite, and gathered a couple hundred people in the park to BYOC (bring your own chair) and sit. After an impromptu speech, I guided the sitters through sitting warmups, like butterfly stretches and squats, to prepare for a long afternoon of sitting. Then we played musical chairs to find the fastest sitter.

Playing musical chairs at Sit Club

I met many cool people at Sit Club, and people kept asking how they could stay updated on my schemes. And I was like, idk just keep an eye out for flyers on lampposts. Then I figured I should make an actual mailing list. I included a section to make offerings to me, and someone named Fischer offered their venue, The Nook, for future events.

This was a gamechanger. A real, genuine venue. I cautiously broached the idea of “Strippers for Charity,” a plan that had been on my mind for a while but never had the right space: a scheme in which regular folks (not professional strippers) pick a charity to strip in honor of, perform a routine themed around that charity, and then all the dollar bills thrown at them go toward that cause. Fischer said absolutely. And so the most charitable strip show in history was created.

Shaking some ass for change and change

From there I hosted the Death Duel, in which I had tech bros oil up and fight each other to even out the gender ratio in SF. With some friends, I also put together a city-wide scavenger hunt that 12,000 San Franciscans played over a month.

Oiled up men fighting

These sorts of projects, this “immersive comedy,” I would say are the highest echelon of schemes, but the most difficult to pull off. You’re bringing together a bunch of strangers, and you really have no idea what will happen. And that’s nerve-wracking but thrilling.

They foster community the most, because they involve physically going to a location and engaging with people under an absurd premise. This absurd premise is key, because it motivates people to actually take action, as opposed to thinking it’s something they could always do later (and subsequently never do). It also breaks the ice - you already know everyone is lowkey down to clown, and you have a built-in conversation topic that’s much more interesting than work. These elements create a transient third space (usually in the middle of Golden Gate Park because venues are expensive).

A scheme can evolve through these designations

Take, for example, Mehran’s Steakhouse, a parody fine-dining restaurant that went viral. It started as an inside joke, when I listed my coliving house as a restaurant on google maps, named after my housemate who cooked great steaks. Then it became a public spectacle, as friends-of-friends and strangers saw the google maps listing and contributed their own absurd reviews, or read through them, or were inspired to try to dine at our house. Finally, it became immersive theater, when we brought the spirit of these absurd reviews to life for one night only, in a feat covered by the NYT that went internationally viral.

Livestream of the dining space

A silly inside joke among friends picked up and became entertainment for hundreds of people, then millions of people. There’s more people who are lowkey cool than you might realize. And this incremental validation makes you more comfortable putting yourself out there - while sharing something on an international stage outright is intimidating, you can start small, and gradually open up more and more to the public arena.

The symbiosis of schemes and community

Schemes create a positive feedback loop of community. They’re enabled by community, and they foster community.

They’re enabled by community because creativity is contagious. You can bounce ideas off each other and invent things you wouldn’t have thought of alone. And you’re more comfortable taking outlandish leaps when you have friends to support you. For example, in organizing Strippers for Charity, I created a group chat with all the folks stripping, and this ended up being instrumental. They chatted, exchanged tips, encouraged each other, and it made them much more comfortable with doing something offbeat. A few commented that without the group, they may’ve gotten intimidated and backed out.

Having a support group also makes me as the organizer feel more comfortable - sometimes I feel a bit anxious that the turnout of an event will be disappointing. But worst case scenario, I’m just hanging out with my friends, which is gonna be fun regardless. So it helps take the pressure and fear off.

Schemes also bond communities. Humor brings people together - it gives them a shared context, even if they have little in common. Laughing puts people at ease.

Interpretive dancing in a hazmat suit to an animatronic self-driving muppet in front of hundreds of people who spent a month solving our scavenger hunt

There’s also something powerful in creating ambient community - people in the same space, doing the same thing, in which they can simply enjoy existing around others, or choose to interact as much as they want, without the interaction necessitating maintaining a conversation. An interaction where the primary basis is conversing can often be intimidating, tiring, or just boring. It’s nice to exist in comfortable silence and engage in sort-of parallel play, like you would with a close friend.

And on the organizer side, there’s a particular joy in creating something with friends. It’s not just a meal and a catch-up, but actively building something together. Not just maintaining friendships, but deepening them.

I’d call scheming a genre of play theory, but that’s a whole other discussion to get into, and we do not have time for all that.

In conclusion:

Go forth and be silly! And if you don’t quite know where to start, get a few friends together to scheme :)


