
Predicting Average IMDb Movie Ratings Using Text Embeddings of Movie Metadata


Months ago, I saw a post titled "Rejected from DS Role with no feedback" on Reddit's Data Science subreddit, in which a prospective candidate for a data science position shared a Colab Notebook documenting their submission for a take-home assignment and asked for feedback as to why they were rejected. Per the Reddit user, the assignment was:

Use the publicly available IMDB Datasets to build a model that predicts a movie’s average rating. Please document your approach and present your results in the notebook. Make sure your code is well-organized so that we can follow your modeling process.

IMDb, the Internet Movie Database owned by Amazon, allows users to rate movies on a scale from 1 to 10; the average rating is then displayed prominently on the movie's page:

The Shawshank Redemption is currently the highest-rated movie on IMDb with an average rating of 9.3 derived from 3.1 million user votes.


In their notebook, the Redditor identifies a few intuitive features for such a model, including the year in which the movie was released, the genre(s) of the movie, and the actors/directors of the movie. However, the model they built is a TensorFlow and Keras-based neural network, with all the bells and whistles such as batch normalization and dropout. The immediate response from other data scientists on /r/datascience was, at its most polite, "why did you use a neural network when it's a black box that you can't explain?"

Reading those replies made me nostalgic. Way back in 2017, before my first job as a data scientist, neural networks built with frameworks such as TensorFlow and Keras were all the rage for their ability to "solve any problem," but using them was often seen as lazy and unskilled compared to traditional statistical modeling such as ordinary least squares linear regression or even gradient boosted trees. Although it's funny to see that this perception of neural networks in the data science community hasn't changed since, nowadays the black box nature of neural networks can be an acceptable business tradeoff if the prediction results are higher quality and interpretability is not required.

Looking back at the assignment description, the objective is only to "predict a movie's average rating." For data science interview take-homes, this is unusual: those assignments typically have an extra instruction along the lines of "explain your model and what decisions stakeholders should make as a result of it", which is a strong hint that you need to use an explainable model like linear regression to obtain feature coefficients, or even a middle ground like gradient boosted trees, whose variable importance quantifies the relative feature contributions to the model. 1 In the absence of that particular constraint, it's arguable that anything goes, including neural networks.

The quality of neural networks has improved significantly since 2017, even more so due to the massive rise of LLMs. Why not just feed an LLM all the raw metadata for a movie, encode it into a text embedding, and build a statistical model on top of that? Would a neural network do better than a traditional statistical model in that instance? Let's find out!

About IMDb Data

The IMDb Non-Commercial Datasets are famous sets of data that have been around for nearly a decade 2 but are still updated daily. Back in 2018, as a budding data scientist, I performed a fun exploratory data analysis using these datasets, although the results aren't too surprising.

The average rating for a movie is around 6 and tends to skew higher: a common trend in internet rating systems.


But in truth, these datasets are a terrible choice for companies to use in a take-home assignment. Although the datasets are released under a non-commercial license, IMDb doesn't want to give too much information to its competitors, which results in a severely limited set of features that could be used to build a good predictive model. Here are the common movie-performance-related features present in the title.basics.tsv.gz file:

  • tconst: unique identifier of the title
  • titleType: the type/format of the title (e.g. movie, tvMovie, short, tvSeries, etc.)
  • primaryTitle: the more popular title / the title used by the filmmakers on promotional materials at the point of release
  • isAdult: 0: non-adult title; 1: adult title
  • startYear: the release year of the title
  • runtimeMinutes: primary runtime of the title, in minutes
  • genres: includes up to three genres associated with the title

This is a sensible schema for describing a movie, although it lacks some important information that would be very useful for determining movie quality, such as production company, summary blurbs, granular genres/tags, and plot/setting — all of which are available on the IMDb movie page itself and presumably accessible through the paid API. Of note, since the assignment explicitly asks for a movie's average rating, we need to filter the data to only movie and tvMovie entries, which the original notebook failed to do.

The ratings data in title.ratings.tsv.gz is what you’d expect:

  • tconst: unique identifier of the title (which can therefore be mapped to movie metadata using a JOIN)
  • averageRating: average of all the individual user ratings
  • numVotes: number of votes the title has received

To ensure that the average ratings used for modeling are stable and indicative of user sentiment, I will only analyze movies that have at least 30 user votes: as of May 10th, 2025, that's about 242k movies total. Additionally, I will not use numVotes as a model feature, since that metric reflects extrinsic movie popularity more than the movie itself.
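As a rough sketch (with hypothetical local file names and frame names; the notebook's actual code may differ), that filtering looks something like this in Polars:

import polars as pl

# File names assume the official TSVs have been downloaded and decompressed locally.
df_basics = pl.read_csv("title.basics.tsv", separator="\t", null_values="\\N", quote_char=None)
df_ratings = pl.read_csv("title.ratings.tsv", separator="\t", null_values="\\N", quote_char=None)

df_movies = (
    df_basics.filter(pl.col("titleType").is_in(["movie", "tvMovie"]))
    .join(df_ratings, on="tconst", how="inner")
    .filter(pl.col("numVotes") >= 30)  # keep only titles with stable average ratings
)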

The last major dataset is title.principals.tsv.gz, which has very helpful information on metadata such as the roles people play in the production of a movie:

  • tconst: unique identifier of the title (which can be mapped to movie data using a JOIN)
  • nconst: unique identifier of the principal (this is mapped to name.basics.tsv.gz to get the principal’s primaryName, but nothing else useful)
  • category: the role the principal served in the title, such as actor, actress, writer, producer, etc.
  • ordering: the ordering of the principals within the title, which correlates to the order the principals appear on IMDb’s movie cast pages.

Additionally, because the datasets are so popular, this isn't the first time someone has built an IMDb ratings predictor, and prior attempts are easy to Google.

Instead of using the official IMDb datasets, those analyses are based on the smaller IMDB 5000 Movie Dataset hosted on Kaggle, which adds metadata such as movie rating, budget, and further actor metadata that make building a model much easier (albeit "number of likes on the lead actor's Facebook page" is very extrinsic to movie quality). Using the official datasets, with much less metadata, means building the models on hard mode, and they will likely have lower predictive performance.

Although IMDb data is very popular and very well documented, that doesn’t mean it’s easy to work with.

The Initial Assignment and “Feature Engineering”

Data science take-home assignments are typically one half exploratory data analysis to identify impactful dataset features and one half building, iterating on, and explaining the model. For real-world datasets, these are all very difficult problems with many possible solutions, and the goal from the employer's perspective is more to see how these problems are solved than the actual quantitative results.

The initial Reddit post engineered some expected features using pandas, such as an is_sequel flag derived by checking whether a number other than 1 is present at the end of a movie title, and one-hot encoded each distinct genre of a movie. These are fine for an initial approach, albeit sequel titles can be idiosyncratic, which suggests that a more NLP-driven approach to identifying sequels and other related media may be useful.

The main trick with this assignment is how to handle the principals. The common data science approach would be to use a sparse binary encoding of the actors/directors/writers, e.g. a vector where actors present in the movie are 1 and every other actor is 0; there are a number of ways to encode this data performantly, such as scikit-learn's MultiLabelBinarizer. The problem with this approach is that there is a very large number of unique actors (high cardinality: more unique actors than data points themselves), which leads to curse-of-dimensionality issues, and workarounds such as encoding only the top N actors render the feature uninformative, since even a generous N fails to capture the majority of actors.
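For illustration, here is roughly what that rejected encoding approach looks like on a toy actor list; on the real dataset the output matrix would have one column per unique actor:

from sklearn.preprocessing import MultiLabelBinarizer

# Toy example: each row is one movie's actor list.
movie_actors = [
    ["Mark Hamill", "Harrison Ford", "Carrie Fisher"],
    ["Harrison Ford", "Sean Connery"],
    ["Uma Thurman", "John Travolta"],
]

mlb = MultiLabelBinarizer(sparse_output=True)
X_actors = mlb.fit_transform(movie_actors)

# One column per unique actor: (3, 6) here, but roughly (242k, 624k) on the full data.
print(X_actors.shape)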

There are actually 624k unique actors in this dataset (Jupyter Notebook), the chart just becomes hard to read at that point.


Additionally, most statistical modeling approaches cannot account for the ordering of actors, as they treat each feature as independent; since the billing order of actors is generally correlated with their importance in the movie, that's an omission of information relevant to the problem.

These constraints gave me an idea: why not use an LLM to encode all movie data, and build a model using the downstream embedding representation? LLMs have attention mechanisms, which will not only respect the relative ordering of actors (giving higher predictive priority to higher-billed actors, along with actor co-occurrences), but also identify patterns within movie title text (to identify sequels and related media semantically).

I started by aggregating and denormalizing all the data locally (Jupyter Notebook). Each of the IMDb datasets is hundreds of megabytes and hundreds of thousands of rows at minimum: not quite big data, but enough to be more cognizant of tooling, especially since computationally intensive JOINs are required. Therefore, I used the Polars library in Python, which not only loads data super fast, but is also one of the fastest libraries at performing JOINs and other aggregation tasks. Polars's syntax also allows for some cool tricks: for example, I want to spread out and aggregate the principals (4.1 million rows after prefiltering) for each movie into directors, writers, producers, actors, and all other principals as nested lists, while simultaneously having them sorted by ordering as noted above. This is much easier to do in Polars than in any other data processing library I've used, and on millions of rows it takes less than a second:

# Sort so that principals within each movie stay in billing order, then
# aggregate per movie into role-specific nested lists.
df_principals_agg = (
    df_principals.sort(["tconst", "ordering"])
    .group_by("tconst")
    .agg(
        director_names=pl.col("primaryName").filter(pl.col("category") == "director"),
        writer_names=pl.col("primaryName").filter(pl.col("category") == "writer"),
        producer_names=pl.col("primaryName").filter(pl.col("category") == "producer"),
        actor_names=pl.col("primaryName").filter(
            pl.col("category").is_in(["actor", "actress"])
        ),
        principal_names=pl.col("primaryName").filter(
            ~pl.col("category").is_in(
                ["director", "writer", "producer", "actor", "actress"]
            )
        ),
        principal_roles=pl.col("category").filter(
            ~pl.col("category").is_in(
                ["director", "writer", "producer", "actor", "actress"]
            )
        ),
    )
)

After some cleanup and field renaming, here’s an example JSON document for Star Wars: Episode IV - A New Hope:

{
  "title": "Star Wars: Episode IV - A New Hope",
  "genres": [
    "Action",
    "Adventure",
    "Fantasy"
  ],
  "is_adult": false,
  "release_year": 1977,
  "runtime_minutes": 121,
  "directors": [
    "George Lucas"
  ],
  "writers": [
    "George Lucas"
  ],
  "producers": [
    "Gary Kurtz",
    "Rick McCallum"
  ],
  "actors": [
    "Mark Hamill",
    "Harrison Ford",
    "Carrie Fisher",
    "Alec Guinness",
    "Peter Cushing",
    "Anthony Daniels",
    "Kenny Baker",
    "Peter Mayhew",
    "David Prowse",
    "Phil Brown"
  ],
  "principals": [
    {
      "John Williams": "composer"
    },
    {
      "Gilbert Taylor": "cinematographer"
    },
    {
      "Richard Chew": "editor"
    },
    {
      "T.M. Christopher": "editor"
    },
    {
      "Paul Hirsch": "editor"
    },
    {
      "Marcia Lucas": "editor"
    },
    {
      "Dianne Crittenden": "casting_director"
    },
    {
      "Irene Lamb": "casting_director"
    },
    {
      "Vic Ramos": "casting_director"
    },
    {
      "John Barry": "production_designer"
    }
  ]
}

I was tempted to claim that I used zero feature engineering, but that wouldn’t be accurate. The selection and ordering of the JSON fields here is itself feature engineering: for example, actors and principals are intentionally last in this JSON encoding because they can have wildly varying lengths while the prior fields are more consistent, which should make downstream encodings more comparable and consistent.
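As a sketch of what that serialization step might look like (the movie_to_doc helper is hypothetical, not the notebook's actual function):

import json

def movie_to_doc(movie: dict) -> str:
    # Stable, short fields first; variable-length lists (actors, principals) last.
    ordered_keys = [
        "title", "genres", "is_adult", "release_year", "runtime_minutes",
        "directors", "writers", "producers", "actors", "principals",
    ]
    doc = {key: movie[key] for key in ordered_keys}
    # indent=2 preserves the nested-array indentation that the embedding model tokenizes.
    return json.dumps(doc, indent=2)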

Now, let’s discuss how to convert these JSON representations of movies into embeddings.

Creating And Visualizing the Movie Embeddings

LLMs that are trained to output text embeddings are not much different from LLMs like ChatGPT that just predict the next token in a loop. Models such as BERT and GPT can generate "embeddings" out of the box by skipping the prediction heads and instead taking an encoded value from the last hidden state of the model (e.g. for BERT, the first positional vector of the hidden state, representing the [CLS] token). However, text embedding models are further optimized, using contrastive learning, for the distinctiveness of a given input text document. These embeddings can be used for many things, from finding similar inputs by measuring the similarity between their embeddings to, of course, building a statistical model on top of them.

Text embeddings that leverage LLMs are typically generated in batches on a GPU due to the increased amount of computation needed. Python libraries such as Hugging Face transformers and sentence-transformers can load these embedding models. For this experiment, I used the very new Alibaba-NLP/gte-modernbert-base text embedding model, finetuned from the ModernBERT model specifically for the embedding use case, for two reasons: it uses the ModernBERT architecture, which is optimized for fast inference, and the base ModernBERT model is trained to be more code-aware, so it should understand JSON-nested input strings more robustly. That's also why I intentionally left in the indentation for nested JSON arrays, as it's semantically meaningful and explicitly tokenized. 3

The code (Jupyter Notebook) — with extra considerations to avoid running out of memory on either the CPU or GPU 4 — looks something like this:

import torch
import torch.nn.functional as F
from tqdm import tqdm
from transformers import AutoModel, AutoTokenizer

# Load the gte-modernbert-base tokenizer and model.
device = "cuda:0"
tokenizer = AutoTokenizer.from_pretrained("Alibaba-NLP/gte-modernbert-base")
model = AutoModel.from_pretrained("Alibaba-NLP/gte-modernbert-base").to(device).eval()

dataloader = torch.utils.data.DataLoader(docs, batch_size=32,
                                         shuffle=False,
                                         pin_memory=True,
                                         pin_memory_device=device)

dataset_embeddings = []
for batch in tqdm(dataloader, smoothing=0):
    tokenized_batch = tokenizer(
        batch, max_length=8192, padding=True, truncation=True, return_tensors="pt"
    ).to(device)

    with torch.no_grad():
        outputs = model(**tokenized_batch)
        # take the [CLS] vector as the document embedding; detach() lets the GPU free memory
        embeddings = outputs.last_hidden_state[:, 0].detach().cpu()
    dataset_embeddings.append(embeddings)

dataset_embeddings = torch.cat(dataset_embeddings)
dataset_embeddings = F.normalize(dataset_embeddings, p=2, dim=1)

I used a Spot L4 GPU on Google Cloud Platform at a pricing of $0.28/hour, and it took 21 minutes to encode all 242k movie embeddings: about $0.10 total, which is surprisingly efficient.

Each of these embeddings is a vector of 768 numbers (768D). If the embeddings are unit normalized (the F.normalize() step), then calculating the dot product between embeddings returns the cosine similarity of those movies, which can then be used to identify the most similar movies. But "similar" is open-ended, as there are many dimensions along which a movie could be considered similar.
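A minimal sketch of that lookup, reusing dataset_embeddings from the code above (the helper name and the query index are hypothetical):

import torch

def most_similar(query_idx, embeddings, k=10):
    # Embeddings are already L2-normalized, so a matrix-vector product against
    # the query embedding yields cosine similarities directly.
    cossims = embeddings @ embeddings[query_idx]
    return torch.topk(cossims, k=k)

# e.g. scores, indices = most_similar(fellowship_idx, dataset_embeddings)
# where fellowship_idx is the (hypothetical) row index of the query movie.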

Let’s try a few movie similarity test cases where I calculate the cosine similarity between one query movie and all movies, then sort by cosine similarity to find the most similar (Jupyter Notebook). How about Peter Jackson’s Lord of the Rings: The Fellowship of the Ring? Ideally, not only would it surface the two other movies of the original trilogy, but also its prequel Hobbit trilogy.

title cossim
The Lord of the Rings: The Fellowship of the Ring (2001) 1.0
The Lord of the Rings: The Two Towers (2002) 0.922
The Lord of the Rings: The Return of the King (2003) 0.92
National Geographic: Beyond the Movie - The Lord of the Rings: The Fellowship of the Ring (2001) 0.915
A Passage to Middle-earth: The Making of ‘Lord of the Rings’ (2001) 0.915
Quest for the Ring (2001) 0.906
The Lord of the Rings (1978) 0.893
The Hobbit: The Battle of the Five Armies (2014) 0.891
The Hobbit: The Desolation of Smaug (2013) 0.883
The Hobbit: An Unexpected Journey (2012) 0.883

Indeed, it worked and surfaced both trilogies! The other movies listed are about the original work, so having high similarity would be fair.

Compare these results to the “More like this” section on the IMDb page for the movie itself, which has the two sequels to the original Lord of the Rings and two other suggestions that I am not entirely sure are actually related.

What about more elaborate franchises, such as the Marvel Cinematic Universe? If you asked for movies similar to Avengers: Endgame, would other MCU films be the most similar?

title cossim
Avengers: Endgame (2019) 1.0
Avengers: Infinity War (2018) 0.909
The Avengers (2012) 0.896
Endgame (2009) 0.894
Captain Marvel (2019) 0.89
Avengers: Age of Ultron (2015) 0.882
Captain America: Civil War (2016) 0.882
Endgame (2001) 0.881
The Avengers (1998) 0.877
Iron Man 2 (2010) 0.876

The answer is yes, which isn't a surprise since those movies share many principals. However, there are instances of other movies named "Endgame" and "The Avengers" that are completely unrelated to Marvel, which implies that the similarities may be fixating on the titles.

What about movies of a smaller franchise but a specific domain, such as Disney’s Frozen that only has one sequel? Would it surface other 3D animated movies by Walt Disney Animation Studios, or something else?

title cossim
Frozen (2013) 1.0
Frozen II (2019) 0.93
Frozen (2010) 0.92
Frozen (2010) [a different one] 0.917
Frozen (1996) 0.909
Frozen (2005) 0.9
The Frozen (2012) 0.898
The Story of Frozen: Making a Disney Animated Classic (2014) 0.894
Frozen (2007) 0.889
Frozen in Time (2014) 0.888

…okay, it’s definitely fixating on the name. Let’s try a different approach to see if we can find more meaningful patterns in these embeddings.

In order to visualize the embeddings, we can project them to a lower dimensionality with a dimensionality reduction algorithm such as PCA or UMAP: UMAP is preferred, as it simultaneously reorganizes the data into more meaningful clusters. UMAP's construction of a neighborhood graph can, in theory, refine the similarities by leveraging many possible connections and hopefully avoid fixating on the movie name. However, with this amount of input data and the relatively high initial 768D vector size, the computational cost of UMAP is a concern, as both factors cause UMAP training time to scale sharply. Fortunately, NVIDIA's cuML library was recently updated so that you can run UMAP on a GPU with very large amounts of data at a very high number of epochs to ensure the reduction fully converges, so I did just that (Jupyter Notebook). What patterns can we find? Let's try plotting the reduced points, colored by their user rating.
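A hedged sketch of the cuML call (the parameter values here are illustrative, not necessarily what the notebook used):

from cuml.manifold import UMAP

# GPU-accelerated UMAP; high n_epochs to let the reduction fully converge.
reducer = UMAP(n_components=2, n_neighbors=15, min_dist=0.1, n_epochs=5000)
coords_2d = reducer.fit_transform(dataset_embeddings.numpy())  # shape: (n_movies, 2)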

There are a few things going on here. Indeed, most of the points are high-rating green, as is evident in the source data. But the points and ratings aren't random, and there are trends. In the central giga-cluster, there are soft subclusters of movies at high ratings and low ratings. Smaller discrete clusters did indeed form, but what is the deal with that extremely isolated cluster at the top? After investigation, that cluster only has movies released in 2008, which is another feature I should have considered when defining movie similarity.

As a sanity check, I faceted out the points by movie release year to better visualize where these clusters are forming:

This shows that the ratings are spread out even within these clusters, but it also unintentionally visualizes how the embeddings drift over time. 2024 is also a bizarrely clustered year: I have no idea why those two years specifically (2008 and 2024) are weird for movies.

The UMAP approach is more for fun, since it's better for the downstream model building to use the raw 768D vectors and have the model learn the features from them. At the least, there's some semantic signal preserved in these embeddings, which makes me optimistic that these embeddings alone can be used to train a viable movie rating predictor.

Predicting Average IMDb Movie Scores

So, we now have hundreds of thousands of 768D embeddings. How do we get them to predict movie ratings? What many don’t know is that all methods of traditional statistical modeling also work with embeddings — assumptions such as feature independence are invalid so the results aren’t explainable, but you can still get a valid predictive model.

First, we will shuffle and split the dataset into a training set and a test set: for the test set, I chose 20,000 movies (roughly 10% of the data), which is more than enough for stable results. To decide the best model, we will select the one that minimizes the mean squared error (MSE) on the test set, which is a standard approach for regression problems that predict a single numeric value.
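A sketch of that setup, reusing the hypothetical df_movies frame from the earlier Polars sketch and assuming its rows are aligned with the embedding order:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

X = dataset_embeddings.numpy()               # (n_movies, 768)
y = df_movies["averageRating"].to_numpy()    # targets, aligned row-for-row with X

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=20_000, shuffle=True, random_state=42
)

# The laziest possible baseline: predict the training-set mean rating for every test movie.
baseline_preds = np.full_like(y_test, y_train.mean())
baseline_mse = mean_squared_error(y_test, baseline_preds)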

Here are three approaches to using LLMs for solving non-next-token-prediction tasks.

Method #1: Traditional Modeling (w/ GPU Acceleration!)

You can still fit a linear regression on top of the embeddings, even if the feature coefficients are completely useless, and it serves as a decent baseline (Jupyter Notebook). The absolute laziest "model", which just predicts the mean of the training set for everything, results in a test MSE of 1.637; a simple linear regression on top of the 768D embeddings instead results in a more reasonable test MSE of 1.187. We should be able to beat that handily with a more advanced model.

Data scientists familiar with scikit-learn know there's a rabbit hole of model options, but most of them are CPU-bound and single-threaded and would take a considerable amount of time on a dataset of this size. That's where cuML, the same library I used to create the UMAP projection, comes in: cuML has GPU-native implementations of the most popular scikit-learn models with a similar API. This notably includes support vector machines, which play especially nicely with embeddings. And because we have the extra compute, we can also perform a brute-force hyperparameter grid search to find the best parameters for fitting each model.
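A hedged sketch of the GPU-accelerated SVM fit (the hyperparameter grid is an illustrative guess, not the grid the notebook actually searched):

from itertools import product

from cuml.svm import SVR
from sklearn.metrics import mean_squared_error

best_mse, best_params = float("inf"), None
for C, epsilon in product([0.1, 1.0, 10.0], [0.01, 0.1, 0.5]):
    svr = SVR(kernel="rbf", C=C, epsilon=epsilon)  # GPU-native support vector regression
    svr.fit(X_train, y_train)
    mse = mean_squared_error(y_test, svr.predict(X_test))
    if mse < best_mse:
        best_mse, best_params = mse, {"C": C, "epsilon": epsilon}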

Here are the results on the test dataset for a few of these new model types, using the hyperparameter combination for each model type that best minimizes MSE:

The winner is the Support Vector Machine, with a test MSE of 1.087! This is a good start for a simple approach that handily beats the linear regression baseline, and it also beats the model trained in the Redditor's original notebook, which had a test MSE of 1.096 5. In all cases, the train set MSE was close to the test set MSE, which means the models did not overfit either.

Method #2: Neural Network on top of Embeddings

Since we're already dealing with AI models and already have PyTorch installed to generate the embeddings, we might as well try the traditional approach of training a multilayer perceptron (MLP) neural network on top of the embeddings (Jupyter Notebook). This workflow sounds much more complicated than fitting a traditional model as above, but PyTorch makes MLP construction straightforward, and Hugging Face's Trainer class incorporates best model training practices by default, although its compute_loss function has to be tweaked to minimize MSE specifically.
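A sketch of what that compute_loss tweak can look like (the class name and the "targets" input key are assumptions, and exact Trainer argument handling varies between transformers versions):

import torch.nn.functional as F
from transformers import Trainer

class RegressionTrainer(Trainer):
    # Override compute_loss so the Trainer minimizes MSE against the rating targets.
    # Newer transformers versions pass extra kwargs (e.g. num_items_in_batch),
    # which **kwargs absorbs.
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        targets = inputs.pop("targets")   # assumes the collator emits a "targets" key
        preds = model(**inputs)           # the RatingsModel below returns 1D predictions
        loss = F.mse_loss(preds, targets.float())
        return (loss, preds) if return_outputs else loss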

The PyTorch model, using a loop to set up the MLP blocks, looks something like this:

class RatingsModel(nn.Module):
    def __init__(self, linear_dims=256, num_layers=6):
        super().__init__()

        dims = [768] + [linear_dims] * num_layers
        self.mlp = nn.ModuleList([
            nn.Sequential(
                nn.Linear(dims[i], dims[i+1]),
                nn.GELU(),
                nn.BatchNorm1d(dims[i+1]),
                nn.Dropout(0.6)
            ) for i in range(len(dims)-1)
        ])

        self.output = nn.Linear(dims[-1], 1)

    def forward(self, x, targets=None):
        for layer in self.mlp:
            x = layer(x)

        return self.output(x).squeeze()  # return 1D output if batched inputs

This MLP has 529k parameters in total: large for an MLP, but given the 222k-row input dataset, not egregiously so.

The real difficulty with this MLP approach is that it's too effective: even with fewer than 1 million parameters, the model overfits severely and quickly converges to a 0.00 train MSE while the test set MSE explodes. That's why Dropout is set to the atypically high probability of 0.6.

Fortunately, MLPs are fast to train: training for 600 epochs (total passes through the full training dataset) took about 17 minutes on the GPU. Here’s the training results:

The lowest logged test MSE was 1.074: a slight improvement over the Support Vector Machine approach.

Method #3: Just Train a LLM From Scratch Dammit

There is a possibility that using a pretrained embedding model trained on the entire internet could intrinsically contain relevant signal about popular movies, such as movies winning awards (which would imply a high IMDb rating), and that knowledge could leak into the test set and provide misleading results. This may not be a significant issue in practice, since such knowledge is a small part of the gte-modernbert-base model, which is too small to memorize exact information anyway.

For the sake of comparison, let's try training an LLM from scratch on top of the raw movie JSON representations to see if we can get better results without the possibility of leakage (Jupyter Notebook). I had specifically avoided this approach because the compute required to train an LLM is much, much higher than for an SVM or MLP model, and leveraging a pretrained model generally gives better results. In this case, since we don't need an LLM that has all the knowledge of human existence, we can train a much smaller model that only knows how to work with the movie JSON representations and can figure out relationships between actors, and whether titles are sequels, by itself. Hugging Face transformers makes this workflow surprisingly straightforward: it not only has functionality to train your own custom tokenizer (in this case, going from a 50k vocabulary to a 5k vocabulary) that encodes the data more efficiently, but it also allows the construction of a ModernBERT model with any number of layers and units. I opted for a 5M parameter LLM (SLM?), albeit with less dropout, since high dropout causes learning issues for LLMs specifically.
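A hedged sketch of that setup (the base tokenizer checkpoint and the layer/width values are assumptions, chosen only to land near the ~5M parameter budget; docs is the list of JSON strings from earlier):

from transformers import AutoTokenizer, ModernBertConfig, ModernBertModel

# Retrain a small tokenizer on the movie JSON docs.
base_tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
tokenizer = base_tokenizer.train_new_from_iterator(iter(docs), vocab_size=5_000)

# A deliberately tiny ModernBERT configuration (sizes are illustrative).
config = ModernBertConfig(
    vocab_size=len(tokenizer),
    hidden_size=128,
    num_hidden_layers=4,
    num_attention_heads=4,
    intermediate_size=256,
)
model = ModernBertModel(config)
hidden_size = config.hidden_size  # used by the output head below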

The actual PyTorch model code is surprisingly more concise than the MLP approach:

class RatingsModel(nn.Module):
    def __init__(self, model):
        super().__init__()
        self.transformer_model = model
        self.output = nn.Linear(hidden_size, 1)

    def forward(self, input_ids, attention_mask, targets=None):
        x = self.transformer_model.forward(
            input_ids=input_ids,
            attention_mask=attention_mask,
            output_hidden_states=True,
        )
        x = x.last_hidden_state[:, 0]  # the "[CLS] vector"

        return self.output(x).squeeze()  # return 1D output if batched inputs

Essentially, the model trains its own “text embedding,” although in this case instead of an embedding optimized for textual similarity, the embedding is just a representation that can easily be translated into a numeric rating.

Because the computation needed for training an LLM from scratch is much higher, I only trained the model for 10 epochs, which still took about twice as long as the 600 epochs of the MLP approach. Given that, the results are surprising:

The LLM approach did much better than my previous attempts, with a new lowest test MSE of 1.026 after only 4 passes through the data! And then it definitely overfit. I tried other, smaller configurations for the LLM to avoid the overfitting, but none of them ever hit a test MSE that low.

Conclusion

Let’s look at the model comparison again, this time adding the results from training a MLP and training a LLM from scratch:

Coming into this post, I genuinely thought that training the MLP on top of embeddings would be the winner, given the base embedding model's knowledge of everything, but maybe there's something to just YOLOing and feeding raw JSON input data to a completely new LLM. More research and development is needed.

The differences in model performance across these approaches aren't dramatic, but the iteration is indeed interesting, and it was a long shot anyway given the scarce amount of metadata. The fact that building a model off of text embeddings alone didn't result in a perfect model doesn't mean this approach was a waste of time. The embedding and modeling pipelines I constructed in the process of trying to solve this problem have already paid significant dividends on easier problems, such as identifying the efficiency of storing embeddings in Parquet and manipulating them with Polars.

It's impossible and pointless to pinpoint the exact reason the original Reddit poster got rejected: it could have been the neural network approach, or even something out of their control, such as the company actually stopping hiring and being too disorganized to tell the candidate. To be clear, if I myself were applying for a data science role, I wouldn't use the techniques in this blog post (that UMAP data visualization would get me instantly rejected!) and would instead do more traditional EDA and non-neural-network modeling to showcase my data science knowledge to the hiring manager. But for my professional work, I will definitely try starting any modeling exploration with an embeddings-based approach wherever possible: at the absolute worst, it's a very strong baseline that will be hard to beat.

All of the Jupyter Notebooks and data visualization code for this blog post is available open-source in this GitHub repository.


  1. I am not a fan of using GBT variable importance as a decision-making metric: variable importance does not tell you magnitude or direction of the feature in the real world, but it does help identify which features can be pruned for model development iteration. ↩︎

  2. To get a sense of how old they are, they are only available as TSV files, a data format so old and prone to errors that many data libraries have dropped explicit support for it. Amazon, please release the datasets as CSV or Parquet files instead! ↩︎

  3. Two other useful features of gte-modernbert-base, not strictly relevant to these movie embeddings, are a) it's a cased model, so it can identify meaning from upper-case text, and b) it does not require a prefix such as search_query or search_document, as nomic-embed-text-v1.5 does, to guide its results, which is an annoying requirement of those models. ↩︎

  4. The trick here is the detach() function for the computed embeddings, otherwise the GPU doesn’t free up the memory once moved back to the CPU. I may or may not have discovered that the hard way. ↩︎

  5. As noted earlier, minimizing MSE isn’t a competition, but the comparison on roughly the same dataset is good for a sanity check. ↩︎


‘AI is no longer optional’ — Microsoft admits AI doesn’t help at work


An internal Microsoft memo has leaked. It was written by Julia Liuson, president of the Developer Division at Microsoft and GitHub. The memo tells managers to evaluate employees based on how much they use internal AI tools like the various Copilots: [Business Insider]

AI is now a fundamental part of how we work. Just like collaboration, data-driven thinking, and effective communication, using AI is no longer optional — it’s core to every role and every level.

Liuson told managers that AI “should be part of your holistic reflections on an individual’s performance and impact.”

Let’s be clear: this is a confession of abject failure.

Microsoft’s AI tools don’t work. Microsoft AI doesn’t make you more effective. Microsoft AI won’t do the job better.

If it did, Microsoft staff would be using it already. The competition inside Microsoft is vicious. If AI would get them ahead of the other guy, they’d use it.

We already know that when AI saves someone time at work, it’s because they can fob work off onto someone else. Total work doesn’t go down, and total productivity doesn’t go up.

But Microsoft is desperate to sell AI to anyone it can, because the CEO, Satya Nadella, has a bee in his bonnet. Nadella has decreed: everyone will use AI.

Even though it doesn’t work.

We should expect some enterprising Microsoft coder to come up with an automated AI agent system that racks up chatbot metrics for them — while they get on with their actual job.


There Are No New Ideas in AI… Only New Datasets


Most people know that AI has made unbelievable progress over the last fifteen years– especially in the last five. It might feel like that progress is *inevitable* – although large paradigm-shift-level breakthroughs are uncommon, we march on anyway through a stream of slow & steady progress. In fact, some researchers have recently declared a “Moore’s Law for AI” where the computer’s ability to do certain things (in this case, certain types of coding tasks) increases exponentially with time:

[Chart: "Length of tasks AIs can do is doubling every 7 months", the proposed "Moore's Law for AI". (By the way, anyone who thinks they can run an autonomous agent for an hour with no intervention as of April 2025 is fooling themselves.)]

Although I don’t really agree with this specific framing for a number of reasons, I can’t deny the trend of progress. Every year, our AIs get a little bit smarter, a little bit faster, and a little bit cheaper, with no end in sight.


Most people think that this continuous improvement comes from a steady supply of ideas from the research community across academia – mostly MIT, Stanford, CMU – and industry – mostly Meta, Google, and a handful of Chinese labs, with lots of research done at other places that we’ll never get to learn about.

And we certainly have made a lot of progress due to research, especially on the systems side of things. This is how we’ve made models cheaper in particular. Let me cherry-pick a few notable examples from the last couple years:

- in 2022 Stanford researchers gave us FlashAttention, a better way to utilize memory in language models that’s used literally everywhere;

- in 2023 Google researchers developed speculative decoding, which all model providers use to speed up inference (also developed at DeepMind, I believe concurrently?)

- in 2024 a ragtag group of internet fanatics developed Muon, which seems to be a better optimizer than SGD or Adam and may end up as the way we train language models in the future

- in 2025 DeepSeek released DeepSeek-R1, an open-source model that has equivalent reasoning power to similar closed-source models from AI labs (specifically Google and OpenAI)

So we’re definitely figuring stuff out. And the reality is actually cooler than that: we’re engaged in a decentralized globalized exercise of Science, where findings are shared openly on ArXiv and at conferences and on social media and every month we’re getting incrementally smarter.

If we're doing so much important research, why do some argue that progress is slowing down? People are still complaining. The two most recent huge models, Grok 3 and GPT-4.5, only obtained a marginal improvement over the capabilities of their predecessors. In one particularly salient example, when language models were evaluated on the latest math olympiad exam, they scored only 5%, indicating that recent announcements may have been overblown when reporting system ability.

And if we try to chronicle the *big* breakthroughs, the real paradigm shifts, they seem to be happening at a different rate. Let me go through a few that come to mind:

LLMs in four breakthroughs

1. Deep neural networks: Deep neural networks first took off after the AlexNet model won an image recognition competition in 2012

2. Transformers + LLMs: in 2017 Google proposed transformers in Attention Is All You Need, which led to BERT (Google, 2018) and the original GPT (OpenAI, 2018)

3. RLHF: first proposed (to my knowledge) in the InstructGPT paper from OpenAI in 2022

4. Reasoning: in 2024 OpenAI released O1, which led to DeepSeek R1

If you squint just a little, these four things (DNNs → Transformer LMs → RLHF → Reasoning) summarize everything that’s happened in AI. We had DNNs (mostly image recognition systems), then we had text classifiers, then we had chatbots, now we have reasoning models (whatever those are).

Say we want to make a fifth such breakthrough; it could help to study the four cases we have here. What new research ideas led to these groundbreaking events?

It’s not crazy to argue that all the underlying mechanisms of these breakthroughs existed in the 1990s, if not before. We’re applying relatively simple neural network architectures and doing either supervised learning (1 and 2) or reinforcement learning (3 and 4).

Supervised learning via cross-entropy, the main way we pre-train language models, emerged from Claude Shannon’s work in the 1940s.

Reinforcement learning, the main way we post-train language models via RLHF and reasoning training, is slightly newer. It can be traced to the introduction of policy-gradient methods in 1992 (and these ideas were certainly around for the first edition of the Sutton & Barto “Reinforcement Learning” textbook in 1998).
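Written loosely, those two objectives are the next-token cross-entropy used for pretraining and the REINFORCE-style policy-gradient estimator behind RLHF and reasoning training:

\mathcal{L}_{\text{CE}}(\theta) = -\sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t})

\nabla_\theta J(\theta) = \mathbb{E}_{y \sim \pi_\theta}\!\left[ R(y)\, \nabla_\theta \log \pi_\theta(y) \right]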

If our ideas aren’t new, then what is?

Ok, let's agree for now that these "major breakthroughs" were arguably fresh applications of things that we'd known for a while. First of all – this tells us something about the *next* major breakthrough (that "secret fifth thing" I mentioned above). Our breakthrough is probably not going to come from a completely new idea; rather, it'll be the resurfacing of something we've known for a while.

But there’s a missing piece here: each of these four breakthroughs enabled us to learn from a new data source:

1. AlexNet and its follow-ups unlocked ImageNet, a large database of class-labeled images that drove fifteen years of progress in computer vision

2. Transformers unlocked training on “The Internet” and a race to download, categorize, and parse all the text on The Web (which it seems we’ve mostly done by now)

3. RLHF allowed us to learn from human labels indicating what “good text” is (mostly a vibes thing)

4. Reasoning seems to let us learn from “verifiers”, things like calculators and compilers that can evaluate the outputs of language models

Remind yourself that each of these milestones marks the first time the respective data source (ImageNet, The Web, Humans, Verifiers) was used at scale. Each milestone was followed by a frenzy of activity: researchers compete to (a) siphon up the remaining useful data from any and all available sources and (b) make better use of the data we have through new tricks to make our systems more efficient and less data-hungry. (I expect we’ll see this trend in reasoning models throughout 2025 and 2026 as researchers compete to find, categorize, and verify everything that might be verified.)

[Figure: how to train and validate on ImageNet. Progress in AI may have been inevitable once we gathered ImageNet, at the time the largest public collection of images from the Web.]

How much do new ideas matter?

There’s something to be said for the fact that our actual technical innovations may not make a huge difference in these cases. Examine the counterfactual. If we hadn’t invented AlexNet, maybe another architecture would have come along that could handle ImageNet. If we never discovered Transformers, perhaps we would’ve settled with LSTMs or SSMs or found something else entirely to learn from the mass of useful training data we have available on the Web.

This jibes with the theory some people have that nothing matters but data. Some researchers have observed that for all the training techniques, modeling tricks, and hyperparameter tweaks we make, the thing that makes the biggest difference by-and-large is changing the data.

As one salient example, some researchers worked on developing a new BERT-like model using an architecture other than transformers. They spent a year or so tweaking the architecture in hundreds of different ways, and managed to produce a different type of model (this is a state-space model or “SSM”) that performed about equivalently to the original transformer when trained on the same data.

This discovered equivalence is really profound because it hints that *there is an upper bound to what we might learn from a given dataset*. All the training tricks and model upgrades in the world won’t get around the cold hard fact that there is only so much you can learn from a given dataset.

And maybe this apathy to new ideas is what we were supposed to take away from The Bitter Lesson. If data is the only thing that matters, why are 95% of people working on new methods?

Where will our next paradigm shift come from? (YouTube… maybe?)

The obvious takeaway is that our next paradigm shift isn’t going to come from an improvement to RL or a fancy new type of neural net. It’s going to come when we unlock a source of data that we haven’t accessed before, or haven’t properly harnessed yet.

One obvious source of information that a lot of people are working towards harnessing is video. According to a random site on the Web, about 500 hours of video footage are uploaded to YouTube *per minute*. This is a ridiculous amount of data, much more than is available as text on the entire internet. It’s potentially a much richer source of information too as videos contain not just words but the inflection behind them as well as rich information about physics and culture that just can’t be gleaned from text.

It’s safe to say that as soon as our models get efficient enough, or our computers grow beefy enough, Google is going to start training models on YouTube. They own the thing, after all; it would be silly not to use the data to their advantage.

A final contender for the next "big paradigm" in AI is data-gathering systems that are in some way embodied; or, in the words of a regular person, robots. We're currently not able to gather and process information from cameras and sensors in a way that's amenable to training large models on GPUs. If we could build smarter sensors, or scale our computers up until they can handle the massive influx of data from a robot with ease, we might be able to use this data in a beneficial way.

It’s hard to say whether YouTube or robots or something else will be the Next Big Thing for AI. We seem pretty deeply entrenched in the camp of language models right now, but we also seem to be running out of language data pretty quickly. But if we want to make progress in AI, maybe we should stop looking for new ideas, and start looking for new data.


Fashion tips for writing math.


Many authors will tell you how to write mathematics clearly and correctly.

But few will tell you how to write it with style and panache, so as to attract the oohs, aahs, and swiveling heads of passers-by.

In that spirit, allow me to channel the men’s fashion guy on Twitter. (Note to the men’s fashion guy on Twitter: please never look at me or my clothes.) Here are a few side-by-side case studies in how to make your mathematics look good:

Sure, some polynomials look fabulous when factored. Also, some athletic 23-year-olds look good in midriff-baring tops. This doesn’t mean we should all try it.

Better to leave something to the imagination; it’s a sign of maturity.

They’re called radicals for a reason, folks. Don’t conform to algebraic conventions. Give the people something to talk about.

I'm not against f⁻¹ notation in general. That would be like opposing casual wear at the office; no point shaking one's fist at a ship that long ago sailed. (And anyway, why should x⁻¹ be reserved for reciprocals? Isn't that just a clever and illuminating convention? Any use of negative exponents is already a high-fashion abstraction.)

Anyway, in this particular case, it’s madness to use the dainty superscript when there’s a robust and appealing alternative.

Okay, yes, if you’re actually calculating anything from the limit definition of a derivative, you should go with the more familiar h going to 0 definition.

But be honest. Are you working with the definition of a derivative? Is this the 19th century? Are you a yeoman farmer and/or a Cauchy-era analyst?

No?

Well, then, you’re not bringing this definition to work. You’re using it to make a point: namely, that the derivative is what happens to a slope as the two points draw closer together. And that point is best made with this stylish latter version.
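(For reference, the two forms being contrasted are presumably:

f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \qquad \text{vs.} \qquad f'(a) = \lim_{b \to a} \frac{f(b) - f(a)}{b - a}

with the latter making the two-points-drawing-together picture explicit.)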

I hesitate to wade into the long-simmering \phi vs. \varphi debates.

But c’mon, folks.

If we can't agree on such an obvious matter of aesthetics, then I fear we may be approaching the end of our existence as a coherent civilization. Perhaps, in a few decades, the \phi advocates can be resettled on the surface of the moon, where they can build their own sorry little society, beyond the intimidating shadow of our superior fashion sense.

Ah, variance, you minxy concept.

I almost went the other way on this one. After all, is this not the opposite of my advice on the definition of a derivative? Here, am I not promoting easy manipulation over conceptual illumination?

Indeed I am. And that’s because we’re perpetually manipulating variance. The only thing you want to do with that first definition is change speedily into the second one.
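(The two definitions in question are presumably the definitional form and the computational shortcut:

\operatorname{Var}(X) = \mathbb{E}\left[(X - \mu)^2\right] \qquad \text{vs.} \qquad \operatorname{Var}(X) = \mathbb{E}[X^2] - \left(\mathbb{E}[X]\right)^2

the second being the one you actually manipulate.)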

I know I’ll ruffle some feathers with this one. Good. Those feathers look silly. They need ruffling.

Now, have I disrupted the beauty of an equation that “unites the five fundamental constants of mathematics”?

Or, have I just revealed that “-1 + 1 = 0” is not as profound a sentiment as some folks think?
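(Presumably the two renderings in question are e^{i\pi} + 1 = 0 and the plainer e^{i\pi} = -1.)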


What Does a Post-Google Internet Look Like


With the rise of the internet came the need to find information more quickly. The concept of search engines came into this space to fill this need, with a relatively basic initial design.


This is the basis of the giant megacorp Google, whose claim to fame was they made the best one of these. Into this stack they inject ads, both ads inside the sites themselves and then turning the search results themselves into ads.


As time went on, what we understood to be "Google search" was actually a pretty sophisticated machine that effectively determined what websites lived or died. It was the only portal that niche websites had to get traffic. Google had the only userbase large enough for a website dedicated to retro gaming or VR headsets or whatever to get enough clicks to pay their bills.


Despite the complexity, the basic premise remained. Google steers traffic towards your site, the user gets the answer from your site and then everyone is happy. Google showed some ads, you showed some ads, everyone showed everyone on Earth ads.

This incredibly lucrative setup was not enough, however, to drive endless continuous growth, which is now the new expectation of all tech companies. It is not enough to be fabulously profitable; you must become Weyland-Yutani. So now Google is going to break this long-standing agreement with the internet and move everything we understand to be "internet search" inside their silo.


Zero-Click Results

In March 2024 Google moved to embed LLM answers in their search results (source). The AI Overview takes the first 100 results from your search query, combines their answers and then returns what it thinks is the best answer. As expected, websites across the internet saw a drop in traffic from Google. You started to see a flood of smaller websites launch panic membership programs, sell off their sites, etc.

It became clear that Google has decided to abandon the previous concept of how internet search worked, likely in the face of what it considers to be an existential threat from OpenAI. Maybe the plan was always to bring the entire search process in-house, maybe not, but OpenAI and its rise to fame seems to have forced Google's hand in this space.

This is not a new thing; Google has been moving in this direction for years. It was a trend people noticed going back to 2019.


It appears the future of Google Search is going to be a closed loop that looks like the following:

  • Google LLM takes the information from the results it has already ingested to respond to most questions.
  • Companies will at some point pay for their product or service to be "the answer" in different categories. Maybe this gets disclosed, maybe not, maybe there's just a little i in the corner that says "these answers may be influenced by marketing partners" or something.
  • Google will attempt to reassure strategic partners that they aren't going to kill them, while at the same time turning to their relationship with Reddit to supply their "new data".

This is all backed up by data from outside the Google ecosystem confirming that the ratio of scrapes to clicks is going up. Basically, it's costing more for these services to make their content available to LLMs, and they're getting less traffic from them.


This new global strategy makes sense, especially in the context of the frequent Google layoffs. Previously it made strategic sense to hold onto all the talent they could; now it doesn't matter because the gates are closing. Even if you had all the ex-Google engineers money could buy, you can't make a better search engine because the concept is obsolete. Google has taken everything it needs from the internet; it no longer requires the cooperation or goodwill of the people who produce that content.

What happens next?

So the source of traffic for the internet is going to go away. My guess is there will be some effort to prevent this, some sort of alternative Google search either embraced or pushed by people. This is going to fail, because Google is an unregulated monopoly. Effectively because the US government is so bad at regulating companies and so corrupt with legalized bribery in the form of lobbying, you couldn't stop Google at this point even if you wanted to.

  • Android is the dominant mobile platform on Earth
  • Chrome is the dominant web browser
  • Apple gets paid to make the other mobile platform default to Google
  • Firefox gets paid to make the other web browser default to Google

While the US Department of Justice has finally decided to do something, it's almost too late to make a difference. https://www.justice.gov/opa/pr/department-justice-prevails-landmark-antitrust-case-against-google

Even if you wanted to and had a lot of money to throw at the problem, it's too late. If Apple made their own search engine, pointed iOS to it as the default, and paid Firefox to make it the default, it still wouldn't matter. The AI Overview is a good enough answer for most questions, so convincing consumers to:

  1. switch platforms
  2. go back to a two/three/four-step process compared to a one-step process

is a waste of time.

I'm confident there will still be sites doing web search, but I suspect, given the explosion in AI-generated slop, that it's going to be impossible to use them even if you wanted to. We're quickly reaching a point where it would be possible to generate a web page on demand, meaning the capacity for slop generation exceeds the capacity of humans to fight it.

Because we didn't regulate the internet, we're going to end up with an unbreakable monopoly on all human knowledge held by Microsoft and Google. Then, because we didn't learn anything, we're going to end up with a system that can produce false data on demand and make it impossible to fact-check anything that the LLM companies return. Paid services like Kagi will be the only search engines worth trying.

Impact down the line

So I think you are going to see a rush of shutdowns and paywalls like you've never seen before. In some respects, it is going to be a return to the pre-Google internet, where it will once again be important that consumers know your domain name and go directly to your site. It's going to be a massive consolidation of the internet, and I think the ad-based economy of the modern web will collapse. Google was the ad broker, but now they're going to operate like Meta and keep the entire cycle inside their system.

My prediction is that this is going to basically destroy any small or medium sized business that attempts to survive with the model of "produce content, get paid per visitor through ads". Everything instead is going to get moved behind aggressive paywalls, blocking archive.org. You'll also see prices go way up for memberships. Access to raw, human produced information is going to be a premium product, not something for everyday people. Fake information will be free.

Anyone attempting to make an online store is gonna get a mob-style shakedown. You can either pay Amazon to let consumers see your product or you can pay Google to have their LLM recommend your product or you can (eventually) pay OpenAI/Microsoft to do it. I also think these companies will use this opportunity to dramatically reprice their advertising offerings. I don't think it'll be cheap to get the AI Summary to recommend your frying pan.

I suspect there will be a brief spike in other forms of marketing spend, like podcasts, billboards, etc. When companies see the sticker shock from Google they're going to explore other avenues like social media spend, influencers, etc. But all those channels are going to be eaten by the LLM snake at the same time.

If consumers are willing to engage with an LLM-generated influencer, that'll be the direction companies go in, because they'll be cheaper and more reliable. Podcast search results are gonna be flooded with LLM-generated shows, and my guess is that they're going to take more of the market share than anyone wants to admit. Twitch streaming has already moved from seeing the person to seeing an anime-style virtual overlay where you don't see the person's face. There won't be a reason for an actual human to be involved in that process.

End Game

My prediction is that a lot of the places that employ technical people are going to disappear. FAANG isn't going to be hiring at anywhere near the same rate they were before, because they won't need to. You don't need 10,000 people maintaining relationships with ad sellers and ad buyers, or any of the staff involved in the maintenance or improvement of those systems.

The internet is going to return to more of its original roots, which are niche fan websites you largely find through social media or word of mouth. These sites aren't going to be ad driven, they'll be membership driven. Very few of them are going to survive. Subscription fatigue is a real thing and the math of "it costs a lot of money to pay people to write high quality content" isn't going to go away.

In a relatively short period of time, it will go from "very difficult" to absolutely impossible to launch a new commercially viable website and have users organically discover that website. You'll have to block LLM scrapers and need a tremendous amount of money to get a new site bootstrapped. Welcome to the future, where asking a question costs $4.99 and you'll never be able to find out if the answer is right or not.

Make Fun Of Them

1 Share

Have you ever heard Sam Altman speak?

I’m serious, have you ever heard this man say words from his mouth? 

Here is but one of the trenchant insights from Sam Altman in his agonizing 37-minute-long podcast conversation with his brother Jack Altman from last week:

“I think there will be incredible other products. There will be crazy new social experiences. There will be, like, Google Docs style AI workflows that are just way more productive. You’ll start to see, you’ll have these virtual employees, but the thing that I think will be most impactful on that five to ten year timeframe is AI will actually discover new science.”

When asked why he believes AI will “discover new science,” Altman says that “I think we’ve cracked reasoning in the models,” adding that “we’ve a long way to go,” and that he “think[s] we know what to do,” adding that OpenAI’s o3 model “is already pretty smart,” and that he’s heard people say “wow, this is like a good PhD.”

That’s the entire answer! It’s complete nonsense! Sam Altman, the CEO of OpenAI, a company allegedly worth $300 billion to venture capitalists and SoftBank, kind of sounds like a huge idiot!

“But Ed!” you cry. “You can’t just call Sam Altman an idiot! He isn’t stupid! He runs a big company, and he’s super successful!”

My counter to that is, first, yes I can, I’m doing it right now. Second, if Altman didn’t want to be called stupid, he wouldn’t say stupid shit with a straight face to a massive global audience.

My favourite part of the interview is near the beginning:

Jack Altman: So reasoning will lead to science going faster or just new stuff or both?

Sam Altman: I mean, you already hear scientists who say they’re faster with AI, like we don’t have AI maybe autonomously doing science, but if a human scientist is three times as productive using o3, that’s still a pretty big deal.

Jack Altman: Yeah

Sam Altman: And then as that keeps going and the AI can autonomously do some science, figure out novel physics-

Jack Altman: Is it all that happening as a copilot right now? [Editor’s note: this is exactly what Jack Altman says]

Sam Altman: Yeah there’s definitely not… You definitely can’t go say like, “Hey ChatGPT, figure out new physics” and expect that to work. So I think it is currently copilot-like, but I’ve heard like, anecdotal reports from biologists where it’s like, “wow, it really did figure out an idea. I had to develop it, but it made a fundamental leap.” 

This is a nonsensical conversation, and both of them sound very, very stupid. 

“So, is this going to make new science or make science faster?” “Yeah, I hear scientists are using AI to go faster [CITATION NEEDED], but if a human scientist goes three times faster [CITATION NEEDED] using my model that would be good. Also I heard from a guy that he heard a guy who did biology who said ‘this helped.’”

Phenomenal! Give this guy $40 billion or more dollars every year until he creates a superintelligence, that’ll fucking work.

Here are some other incredible quotes from the genius mind of Sam Altman:

  • “You hear these stories of people who use AI to do market research and figure out new products and then email some manufacturer and get some dumb thing made and sell it on Amazon and run ads…there are people that have actually figured out at small scale in the most boring ways possible how to put a dollar into AI and get the AI to run a toy business, but it’s actually working. So that’ll climb the gradient.” 
    • You may wonder if “the gradient” is mentioned at some point elsewhere. It is not.
  •  “So every year before the last maybe up until last year I would’ve said, ‘hey I think this is going to go really far,’ but it still seems like there’s a lot we’ve got to figure out.” 
  • “If something goes wrong, I would say somehow it’s that we build legitimate super intelligence and it doesn’t make the world much better, it doesn’t change things as much as it sounds like it should.”
  • “So yeah, I think the relativistic point is really important, but to us, our jobs feel incredibly important and stressful and satisfying. And if we're all just making better entertainment for each other in the future, maybe that's kind of what at least one of us is doing right now.” 

This is gobbledygook, nonsense, bullshit peddled by a guy who has only the most tangential understanding of the technology his company is building. 

Every single interview with Sam Altman is like this, every single one, ever since he became a prominent tech investor and founder. Without fail. And the sad part is that Altman isn’t alone in this.

Sundar Pichai, when asked one of Nilay Patel’s patented 100-word-plus-questions about Jony Ive and Sam Altman’s new (and likely heavily delayed) hardware startup:

I think AI is going to be bigger than the internet. There are going to be companies, products, and categories created that we aren’t aware of today. I think the future looks exciting. I think there’s a lot of opportunity to innovate around hardware form factors at this moment with this platform shift. I’m looking forward to seeing what they do. We are going to be doing a lot as well. I think it’s an exciting time to be a consumer, it’s an exciting time to be a developer. I’m looking forward to it.

The fuck are you on about, Sundar? Your answer to a question about whether you anticipate more competition is to say “yeah I think people are gonna make shit we haven’t come up with and uhh, hardware, can’t wait!”

Pichai is likely a little smarter than Altman, in the same way that Satya Nadella is a little smarter than Pichai, and in the same way that a golden retriever is smarter than a chihuahua. That said, none of these men are superintelligences, nor, when pressed, do they ever seem to have any actual answers.

Let’s see what Satya Nadella of Microsoft answered when asked about how exactly it’s going to get to (and I paraphrase Dwarkesh Patel’s mealy-mouthed question) $130 billion in AI revenue “through AGI”:

The way I come at it, Dwarkesh, it's a great question because at some level, if you're going to have this explosion, abundance, whatever, commodity of intelligence available, the first thing we have to observe is GDP growth.

Before I get to what Microsoft's revenue will look like, there's only one governor in all of this. This is where we get a little bit ahead of ourselves with all this AGI hype. Remember the developed world, which is what? 2% growth and if you adjust for inflation it’s zero?

So in 2025, as we sit here, I'm not an economist, at least I look at it and say we have a real growth challenge. So, the first thing that we all have to do is, when we say this is like the Industrial Revolution, let's have that Industrial Revolution type of growth.

That means to me, 10%, 7%, developed world, inflation-adjusted, growing at 5%. That's the real marker. It can't just be supply-side.

In fact that’s the thing, a lot of people are writing about it, and I'm glad they are, which is the big winners here are not going to be tech companies. The winners are going to be the broader industry that uses this commodity that, by the way, is abundant. Suddenly productivity goes up and the economy is growing at a faster rate. When that happens, we'll be fine as an industry.

But that's to me the moment. Us self-claiming some AGI milestone, that's just nonsensical benchmark hacking to me. The real benchmark is: the world growing at 10%.

This quote has been used as a means of suggesting that Nadella is saying that “generative AI is generating basically no value,” which, while somewhat true, obfuscates its true meaning: Satya Nadella isn’t saying a fucking thing. 

The question was “how do you get Microsoft to $130 billion in revenue,” and Satya Nadella’s answer was to say “uhhh, abundance, uhhh, explosion, uhhhhh, GDP! Growth! Industrial revolution! Inflation-adjusted! Percentages! The winners will be the people who do stuff, and then productivity will go up!”

This is fucking nonsense, and it’s time to stop idolizing these speciously-informed goobers. While kinder souls or Zitron-haters may read this and say “ahh, actually, what Nadella was saying was…” stop. I want to stop you there and suggest that perhaps a smart person should be able to speak clearly enough that their intent is obvious. 

It’s tempting to believe that there is some sort of intellectual barrier between you and the powerful — that the confusing and obtuse way that they speak is the sound of genius, rather than somebody who has learned a lot of smart-sounding words without ever learning what they mean.

“But Ed, they’re trained to do this!”

As someone who has media trained hundreds of people, there is only so much you can do to steer someone’s language. You cannot say to Sundar Pichai “hey man, can you sound more confusing?” You can, however, tell them what not to talk about and hope for the best. Sure, you can make them practice, sure, you can give them feedback, but people past a certain stage of power or popularity are going to talk however they want, and if they’re a big stupid idiot pretending to be smart, they’re going to sound exactly like this.

Why? Because nobody in the media ever asks them to explain themselves. When you’ve spent your entire career being asked friendly-or-friendly-adjacent questions and never having someone say “wait, what does that mean?” you will continue to mutate into a pseudo-communicator that spits out information-adjacent bullshit.

I am, to be clear, being very specific about that question. Powerful CEOs and founders never, ever get asked to explain what they’re saying, even when what they’re saying barely resembles an actual answer. 

Pichai, Altman and Nadella have always given this kind of empty-brained intellectual slop in response to questions because the media coddles them. These people are product managers and/or management consultants — and in Altman’s case, a savvy negotiator and manipulator known for “an absenteeism that rankled his peers and some of the startups he was supposed to nurture” as an investor at Y Combinator, according to the Washington Post.

I’ll try and explain this with a little aside.

Let’s think about a hypothetical question about your friend whose dog died:

You: Oh no, what happened?

Them: Well, my dog had a tragic yet ultimately final distinction between their ideal and non-ideal state, due to the involvement of a kind of automatic mechanical device, and when that happened, we realized we’d have to move on from the current paradigm of dog ownership and into a new era, which we both feel a great deal of emotion about and see the opportunities within.

You would probably be a little confused and ask them to explain what they meant.

You: Wait, what do you mean automatic mechanical what? Huh?

Them: Yeah, exactly, and that was part of the challenge. You see, like, the various interactions we have in our day are challenging, and we see a lot of opportunities in assailing those challenges, but part of the road to getting around them is facing them head on, which is ultimately what happened there. And while we were involved, we didn’t want to be, and so we had to make some dramatic changes. 

You still, at this point, do not really know what happened. Did a car hit the dog? Did they run over their dog?

In this scenario, would you nod and say “wow man, that sucks, I’m sorry,” or would you ask them to explain what they’re saying? Would you, perhaps, ask what it is they mean?

By “coddle,” I mean that these people are allowed to force the reader or listener into a combination of detective work and amnesia, simultaneously trying to divine the meaning of their answer while also not thinking too hard about the question the interviewer asked.

Look at most modern business interviews. They involve a journalist asking a question, somebody giving an answer, and the journalist saying “okay!” and moving on to the next question, occasionally saying “but what about this?” when the appropriate response to many of the answers is to ask for them to be simplified so that their meaning is clearer.

A common response to all of this is to say that “interviewers can’t be antagonistic,” and I don’t think a lot of people understand what that means. It isn’t “antagonistic” to ask somebody to clearly articulate what they’re saying, nor is it “antagonistic” to say that you don’t understand, or that they didn’t answer the question you asked. If this is “antagonistic” to you, you are, intellectually-speaking, a giant fucking coward, because what you’re suggesting is that somebody cannot ask somebody to explain themselves, which is what an interview is.

And I imagine nobody really wants to do this, because if you actually put these people on the spot, you’d realize the dark truth that I spoke of a few weeks ago: that the reason the powerful sound like idiots is because, well, they’re idiots. They sound like Business Idiots and create products to sell to Business Idiots, because Business Idiots run most companies and buy solutions based on what the last Business Idiot told them. 

To quote the excellent Nik Suresh:

While I like Snowflake as a piece of software, it is probably not a high priority to move to it at most large companies for various reasons I won't get into here. Fine, I'll get into one of them. It's just a really good data warehouse, you absolute maniacs, it isn't the cure for cancer, why the fuck is it valued at $53B?

Because everyone is buying it, and this has to be driven by non-technical leadership because there aren't enough technical leaders to drive that sort of valuation. Why would non-technicians be so focused on a database of all things, a concept so dull that it is Effective Communication 101 to try and avoid using the term in front of a lay audience? It's because if you buy Snowflake then you're allowed to get onto stages at large venues and talk about how revolutionary Snowflake was for your business, which on the surface looks like a brag about Snowflake, but is actually a brag about the great decisions you've been making and the wealth you can deploy if someone becomes your friend. And the audience is full of people that are now thinking "If I buy Snowflake, I can be on that stage, and everyone will finally recognize my brilliance".

I know some of you might read this and say “these people can’t be stupid! These people run companies! They make huge deals! They read all these books!” and my answer is that some of the stupidest people I’ve ever met have read more books than you or I will read in a lifetime. While they might be smart when it comes to corporate chess moves or saying “this product category should do this,” none of these men — not Altman, Pichai or Nadella — actually has a hand in the design or creation of any of the things their companies make, and they never, ever have. 

Regardless, I have a larger point: it’s time to start mocking these people and tearing down their legends as geniuses of industry. They are not better than us, nor are they responsible for anything that their companies build other than the share price (which is a meaningless figure) and the accumulation of power and resources. 

These men are neither smart nor intellectually superior, and it’s time to start treating them as such.


These people are powerful because they have names that are protected by the press. They are powerful because it is seen as unseemly to mock them because they are rich and “running a company,” a kind of corporate fealty that I find deeply unbecoming of an adult. 

We are, at most, customers. We do not “owe them” anything. We are long past the point when any of the people running these companies actually invented anything they sell. If anything, they owe us something, because they are selling us a product, even if said product is free and monetised by advertising.

While reporters — as anyone — should have some degree of professionalism in interviews or covering subjects, there is no reason to treat these people as special, even if they have managed to raise a lot of money or their product is popular, because if that were the case we’d have far more coverage of defense contractor Lockheed Martin. It made $1.71 billion in profit last quarter, and hasn’t had a single quarter under a billion dollars in the last year. 

I’m being a little glib, but the logic behind covering OpenAI is, at this point, “it makes a lot of money and its product is popular,” which is also a fitting description of Lockheed Martin. The difference is that OpenAI has a consumer product that loses billions of dollars, and Lockheed Martin has products that make billions of dollars by removing consumers from the Earth. Both of them are environmentally destructive.

Covering OpenAI sure doesn’t seem to be about the tech, because if you looked at the tech you’d have to understand it, and you’d see that the user numbers aren’t there outside of the 500 million people using ChatGPT, of which very few are actually paying for the product, and that the term “user” encompasses everything from the most occasional people who log in out of curiosity to people who are actually using it as part of their daily lives.

If covering OpenAI was about the tech, you’d read about how the tech itself doesn’t seem to have a ton of mass-market use cases, and those use cases aren’t really the kind of things that you’d pay for. If they did, there’d be articles that definitively discussed them versus articles in the New York Times about “everybody using AI” that boil down to “I use ChatGPT as search now” and “I heard a guy who asked it to teach him about modern art.”

Yet men like Dario Amodei and Sam Altman continue to be elevated because they are “building the future,” even if they don’t seem to have built it yet, or have the ability to clearly articulate what that future actually looks like. 

Anthropic has now put out multiple stories suggesting that its generative AI will “blackmail” people as a means of stopping a user from turning off the system, something which is so obviously the company prompting its models to do so. Every member of the media covering this uncritically should feel ashamed of themselves.

Sadly, this is all a result of the halo effect of being a Guy Who Raised Money or Guy Who Runs Big Company. We must, as human beings, assume that these people are smart, and that they’d never mislead us, because if we accept that they aren’t smart and that they willingly mislead us, we’d have to accept that the powerful are, well, bad and possibly unremarkable. 

And if they’re untrustworthy people that don’t seem that smart, we have to accept that the world is deeply unfair, and caters to people like them far more than it caters to people like us.

We do not owe Satya Nadella any respect because he’s the CEO of Microsoft. If anything, we should show him outright scorn for the state of Microsoft products. Microsoft Teams is an insulting mess that only sometimes works, leaving workers spending 57% of their time either in Teams Chat, Teams Meetings, or sending emails, according to a Microsoft study.

MSN.com is an abomination read by hundreds of millions of people a month, bloated with intrusive advertisements, attempts to trick you into downloading an app, and quasi-content that may or may not be AI generated. There are few products on the modern internet that show more contempt for the user -- other than, of course, Skype, a product that Microsoft let languish for more than a decade, so thoroughly engorged with spam that leaving it unattended for more than a month left you with a hundred unread messages from Eastern European romance scammers. Microsoft finally killed it in May.

Products like Word and Excel don’t need improving, but that doesn’t stop Microsoft from trying, bloating them with odd user interface choices and forcing users to fight with popups to use an AI-powered Copilot that most of them hate.

Why, exactly, are we meant to show these people respect? Because they run a company that provides a continually-disintegrating service? Because that service has such a powerful monopoly that it’s difficult to leave it if you’re interacting with other people or businesses? 

I think it’s because we live in Hell. The modern tech ecosystem is so utterly vile. Every single day our tech breaks in new and inventive ways, our iPhones resetting at random, random apps not accepting button presses, our Bluetooth disconnecting, our word processors harassing us to “try and use AI” while no longer offering us suggestions for typos, and our useful products replaced with useless shit, like how Google’s previously-functional assistants were replaced with generative AI that makes them tangibly worse so that Google can claim it has 350 million monthly active Gemini users.

Yet the tech and business media acts as if everything is fine.

It isn’t fine! It’s all really fucked! You can call me a cynic or a pessimist or every name under the sun, but the stakes have never been higher, and the damage never more wide-spread. Everything feels broken, and covering these companies as if it isn’t is insulting to your readers and your own intelligence.

Look at the state of your computer or phone and tell me anything feels congruent or intentional rather than an endless battle of incentives. Look at the notifications on your phone and count the number of them that have absolutely nothing to do with information you actively need. As we speak, I have a notification from Adobe Lightroom, an app I use occasionally to edit photos, that tells me “Elevate any scene - now enhance people, sky, water and more with Quick Actions.” Zerocam, an app that brands itself “the first anti-AI camera app” where you “capture moments, not megapixels,” gave me a notification asking if I took a photo today. Amazon notified me that there is a deal picked just for me — a battery pack that I bought several months ago.

Every single company that sends notifications like these should be mocked, but we have accepted such vile conditions as the norm. Apple should be tarred and feathered for allowing companies to send spam notifications, and yet it isn’t, because, by and large, Apple is less vile and less exploitative than Microsoft, Google or Amazon.

If you are reading this as a member of the tech press, seriously, please look at your daily experience with tech. Count the number of times that your day or a task is interrupted by poorly-designed software or hardware (such as the many, many times Zoom or Teams has a problem with Bluetooth, or a website just doesn’t load, or you type something into your browser and it just doesn’t do anything), or when the software you use either actively impedes you (hey, did you want to use AI? No? You sure?) or refuses to work in a logical way (see: Google Drive). There are tens of thousands of stories like this every day, and if you talked to people, you’d see how widespread it is…or maybe, I dunno, see that it’s happening to you too?

There are people responsible, and the tech media writes about them every day. I realize it seems weird to constantly write that a company is releasing broken, convoluted software, but hey, if we can write 300,000 stories about how crime-ridden New York City is, why can’t we write three of them about how fucked Microsoft Office or Google Search have become?

And why can’t we talk to the people in power about it? Is it because the questions are too hard to ask? Is it because it feels icky to interrupt Satya Nadella as he waffles on about using Copilot all the time by saying “hey man, Microsoft Teams is broken, tons of people feel this way, why?” or “why have you let MSN.com turn into a hub of AI slop and outright disinformation?”

Oh no! You won’t get your access! Wahh!

Who cares? Write a story about how Microsoft has become so unbelievably profitable as its products get worse, and talk about how weird and bad that is for the world! Ask Nadella those tough questions, or publish that Microsoft’s PR wouldn’t let you! 

These people are neither articulate nor wise, and whatever “intelligence” they may claim to have doesn’t seem to manifest in good products or intelligent statements. So why treat them like they’re smart? Why show them deference or pleasantries? These people have crapped up our digital lives at scale, and they deserve contempt, or at the very least a stern fucking reception.

I realize I’m repeating points I’ve made again and again, but why is there such a halo around these fucking bozos? I’m serious! Why are we so protective of these guys? We’re more than happy to criticise celebrities, musicians, professional sports players, and politicians (fucking barely), but the business class is somehow protected outside of the occasional willingness to say that Elon Musk might have sort of done something wrong.

I’m not denying there are critics. We have Molly White, Edward Ongweso Jr, Brian Merchant and — at a major outlet like CNN, no less! — one of the greatest living business writers in Allison Morrow. I believe that tech criticism is barely explored, and would be a hugely profitable industry if we treated tech journalism less like the society pages and more like a force to hold the most powerful people in the world accountable as they continually harm billions of people in subtle ways. People are angry, and they aren’t stupid, and they want to see that anger reflected in the stories they read — and the meek deference we show to dumb fucking tech leaders is the opposite of that.

As I’ve said before: we live in an era of digital tinnitus, nagged by notifications, warring with software ostensibly built for us that acts as if we’re the enemy. And if we’re the enemy, we should treat those building this software as the enemy in return. We are their customers, and they have failed us.

The entire approach to business owners, especially in tech, is ridiculous. These people are selling us a product and the product fucking stinks! Put aside however you feel about generative AI for a second and face one very simple point: it doesn’t do enough, it’s really not cool at all, and we’re being forced to use it. 

I realize that some of you may want them to succeed, or want to be the person who tells everybody that they did so. I get that there are rewards for you — promotions, new positions, TV appearances repeating exactly what the powerful did and why they did it, or a plush role as that company’s head of communications — but I am telling you, your readers and viewers are waking up to it, and they feel like you have contempt for them and contempt for the truth. 

It’s easy — and common! — to try and dismiss my work as some sort of hater’s screed, a “cynical” approach to a tech industry that’s trying “brave new things” or whatever. 

In my opinion, there’s nothing more cynical than watching billions of people get shipped increasingly-shitty and expensive solutions and then get defensive of the people shipping them, and hostile to the people who are complaining that the products they use suck. 

I am angry at these companies because they have, at scale, torn down a tech industry that allowed me to be who I am today, and their intentional and disgraceful moves fill me full of disgust. I have watched the tech media move away from covering “technology” and more toward covering the people behind it, to the point that the actual outputs — the software and hardware we use every day — have taken a backseat to stories about whether Elon Musk does or doesn’t use a computer, which is meaningless, empty gossip journalism built to be shared by peers and nothing else.

And please, please do not talk about optimism. If you are blindly saying that everything OpenAI does is cool and awesome and interesting, you aren’t being optimistic — you’re telling other people to be optimistic about a company’s success. It isn’t “optimistic” to believe that a company is going to build powerful AI despite it failing to do so. It’s propaganda, and yes, this is also the case if you simply don’t do the research to form a real opinion.

I am not a pessimist because I criticize these companies, and framing me as one is cowardly and ignorant. If you are so weak-willed and speciously-informed that you can’t see somebody criticise a company without outright dismissing them as “a hater” or “pessimist,” you are an insult to journalism or analysis, and you know it in your wretched little heart. My heart sings with a firm belief in the things I think, founded on rigorous structures of knowledge that I’ve gained from reading things and talking to people, because something in me is incapable of being swayed by something just because everybody else is. 

You are assuming people are right because it is inconvenient and uncomfortable to accept they may not be, because doing so requires you to reckon with a market-wide hysteria founded on desperation and a lack of hyper-growth markets left in the tech industry.

Worse still, in engaging with faux-optimism, you are failing to protect your readers and the general public.  

And if that’s what you want to do, ask yourself why! Why do you want these companies to win? What is it you want them to win? Do you want them to be rich? Do you want to be the person that told people they would be first? What is the world you want, and what does it look like, and how does doing your job in this way work toward creating that world?

This isn’t optimism — it’s horse-trading, or strategic alignment behind powerful entities. It is choosing a side, because your side isn’t with the reader or the truth. If it was — even if you believed generative AI was powerful and that they simply didn’t understand — your duty would be to educate the reader in a clear-set and obvious way, and if you can’t find a way to do so, acknowledging that and explaining why.

True optimism requires you to have a deep, meaningful understanding of things so that you can engage in real hope — a magical feeling, one that can buoy you in the most challenging times.

What many claim is “optimism” is actually blind faith, the likes of which you’ll see at a roulette table. Or, of course, knowingly peddling propaganda.


Let’s even take a different tack: say you actually want these companies to “build powerful AI,” and believe they’re smart enough to do so. Say that, somehow, looking at their decaying finances, the lack of revenue, the lack of growth, and the remarkable lack of use cases, you still come out of it saying “sure, I think they’re going to do this!”

How? Why haven’t they done it yet? Why, three years in, are we still unable to describe what ChatGPT actually does, and why we need it? Take away how much money OpenAI makes for a second (and, indeed, how much it loses). Does this product actually really inspire anything in you? What is it that’s magical about this? 

And, on a business level, what is it I’m meant to be impressed by, exactly? OpenAI has — allegedly — hit “$10 billion in annualized revenue” (essentially the biggest month it can find, multiplied by 12), which is…not that much, really, considering it’s the most prominent company in the software world, with the biggest brand, and with the attention of the entirety of the world’s media. 

It has, allegedly, 500 million weekly active users — and, by the last count, only 15.5 million paying subscribers, an absolutely putrid conversion rate even before you realize that an honest conversion rate would use monthly active users as its denominator. That’s how any real software company actually defines its metrics, by the fucking way.
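For what it’s worth, the arithmetic behind that “putrid” rate is one line long, using nothing but the two figures quoted above:

```python
# Conversion rate implied by the figures quoted above. Using weekly actives as
# the denominator already flatters OpenAI; a monthly-active figure would be
# larger, and the rate would look even worse.
weekly_active_users = 500_000_000
paying_subscribers = 15_500_000
print(f"{paying_subscribers / weekly_active_users:.1%}")  # ~3.1%
```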

Why is this impressive? Because it grew fast? It literally had more PR and more marketing and more attention and more opportunities to sell to more people than any company has ever had in the history of anything. Every single industry has been told to think about AI for three years, and they’ve been told to do so because of a company called OpenAI. There isn’t a single god damn product since Google or Facebook that has had this level of media pressure, and both of those companies launched without the massive amount of media (and social media) that we have today. 

Having literally everybody talking about your product all the time for years is pretty useful! Why isn’t it making more money? 

Why are we taking any of these people seriously? Mark Zuckerberg paid $14.3 billion for Scale AI, an AI data company, as a means of hiring its CEO Alexandr Wang to run his “superintelligence” team; has been offering random OpenAI employees $100 million to join Meta; thought about buying both AI search company Perplexity and generative video company Runway; and even tried to buy OpenAI co-founder Ilya Sutskever’s pre-product “$32bn valuation” non-company Safe Superintelligence, settling instead on hiring its CEO Daniel Gross and buying his venture fund for some fucking reason.

When you put aside the big numbers, these are the actions of a desperate dimwit with a failing product trying to buy his way to making generative AI into a “superintelligence,” something that Meta’s own Chief AI scientist Yann LeCun says isn’t going to work.

By assuming that there is some sort of grand strategy behind these moves beyond “if we get enough smart people together something will happen,” you help boost the powerful’s messaging and buoy their stock valuations. You are not educating anybody by humouring these goofballs. In fact, the right way to approach this would be to ask why Meta, a multi-trillion dollar market cap company with a near-monopoly over all social media, is spending billions of dollars in what appears to be a totally irresponsible way. Instead, people are suggesting this is Mark Zuckerberg’s genius at work.

Anyway, putting that aside, what exactly is the impressive part of generative AI again? The fucking code? Enough about the code, I’m tired of hearing about the code, I swear to god you people think that being a software engineer is only coding and that it’s fine if you ship “mediocre code,” as if bad code can’t bring down entire organizations. What do you think a software engineer does? Is all they do code? If you think the answer is yes, you are wrong!

Human beings may make mistakes in writing code, but they at least know what a mistake looks like, which a generative AI does not, because a generative AI doesn’t know what anything is, or anything at all, because it is a probabilistic model. 

Congratulations! You made another way in which software engineers can automate parts of their jobs — stop being so fucking excited about the idea that people are going to lose their livelihoods! It’s nasty, and founded on absolutely nothing other than your adulation for the powerful!

These models are dangerous and chaotic, built with little intention or regard for the future, just like the rest of big tech’s products. ChatGPT would’ve been a much smaller deal if Google had any interest in turning Google Search into a product that truly answered a query (as opposed to generating more of them to show more impressions to advertisers) — a nuanced search engine that took a user’s query and spat out a series of websites that might help answer said question rather than just summarising a few of them for an answer. 

And if you ever need proof that Google just doesn’t know how to fucking innovate anymore, look at AI Summaries, a product that both misunderstands search and why people use ChatGPT as a search replacement. While OpenAI may “summarise” stuff to give an answer, it at the very least gives something approximating a true answer, rather than a summary that feels like an absentee parent trying to get rid of you and then throwing you $20 in the hopes you’ll leave them alone. If Google Search truly evolved, ChatGPT wouldn’t really matter, because the idea of a machine that can theoretically answer a question is kind of why people used fucking Google in the fucking first place.

Again, why are we not describing this company as the business equivalent of a banana republic? It’s actively making its shit worse to juice growth, and it’s really obvious how badly it sucks. 

Why doesn’t the state of Google dominate tech news, just like how random ketamine-fuelled tweets from Elon Musk do? Why aren’t we, collectively, repulsed by Google as a company? Why aren’t we, collectively, repulsed by OpenAI? 

No matter how big ChatGPT is, the fact that there’s a product out there with hundreds of millions of users that constantly gets answers wrong is a genuinely worrying thing for society, and that’s before you get to the environmental damage, the fact it trained its models on millions of people’s art and writing, and oh, I dunno, the fact it plans to lose over a hundred billion dollars before becoming profitable?

Why are we not more horrified? Why are we not more forlorn that this is where hundreds of billions of dollars are being forced? The most prominent company in the tech industry is an unstable monolith with a vague product that can only make $10 billion a year (revenue, not profit) as the very fabric of its existence is shoved down the throat of every executive in the world at once. Also, if it’s not fed $20 billion to $40 billion a year, it will die. 

Give me a fucking break.

I don’t know, I sound pretty ornery, I get accused of being a hater or missing the grand mystery of this bullshit every few minutes by somebody with an AI avatar of a guy who looks like he’s banned from multiple branches of Best Buy, I understand there’s things that people do with Large Language Models, I am aware, but none of it matters because the way they’re being discussed is like we’re two steps from digitally replacing hundreds of millions of people.

The reality is far simpler: we have an industry that has spent nearly half a trillion dollars between its capital expenditures and venture capital funding to create another industry with the combined revenue of the fucking smartwatch industry. What I’m writing isn’t inflammatory — in fact, it’s far more deeply rooted in reality than those claiming that OpenAI is building the future.

Let’s do some fucking mathematics!

Projected Big Tech Capital Expenditures in 2025 and revenue from AI:

That’s $327 billion this year, with a total revenue of…what, $18 billion of revenue? And that’s not profit! And that’s if we include OpenAI’s spend on Azure. Even if every single one of these companies was making $18 billion in revenue a year from this it wouldn’t be great, but it’s more than likely that these chunderfucks can’t even pull together the projected revenue ($32 billion) of the global smartwatch industry! What a joke! 

“Wuhh, but what about OpenAI?” 

What about OpenAI? I’ve written about this so much. So what, OpenAI makes $12.7 billion this year, but loses $14 billion, what does that mean to you, exactly? What’re you going to say? The cost of inference is coming down? No, the cost that people are being charged is going down, we have no firm data on the actual costs because the companies don’t want to talk about it, and yes, it will absolutely lower prices to compete with other companies. The Information just reported that OpenAI was doing this to compete with Microsoft last week!

Hey, quick question — wasn’t SoftBank meant to spend $3 billion annually on OpenAI’s software? Did that happen?  

Anyway, even if we add OpenAI’s revenue to the pot, we are at $30.7 billion. If we add the supposed $1 billion in revenue from training data startup Surge, $300 million in “annualized revenue” from Turing, optimistically assume that Perplexity will have $100 million (up from $34 million in 2024, where it burned $65 million) in revenue in 2025, and assume that Anysphere’s (which makes Cursor) $200 million run rate stays consistent through 2025, we are at…$32.3 billion. 

But I'm not being fair, am I? I didn’t include many of the names from The Information’s generative AI database. Prepare yourself, this is gonna be annoying!

So let's add some more. We’ve got $3 billion from Anthropic, $870 million from Scale (now part of Meta), another alleged $300 million for Anysphere (The Information claims $500 million in ARR), we consider Neo4j’s “>$200 million ARR” to mean “$200 million,” Midjourney’s “>$200 million ARR” to mean $200m, Ironclad’s “>$150 million ARR” to mean $150 million ARR, Glean’s $103 million ARR, Together AI’s $100 million ARR, Moveworks’ $100 million ARR, Abridge’s $100 million ARR, Synthesia’s $100 million ARR, WEKA’s “>$100 million ARR” to mean $100m ARR, Windsurf’s $100m ARR, Runway’s $84 million ARR, Elevenlabs’ “>$100m ARR” to mean $100m ARR, Cohere’s $70m ARR, Jasper’s “>$60m ARR” to mean $60m, Harvey’s $50m ARR, Ada’s “>$50m ARR” to mean $50m, Photoroom’s $50m ARR…and then assumed the combined ARR of the remainders are somewhere in the region of a very generous $200m, we get…

Less than $39 billion dollars of total revenue in the entire generative AI industry. Jesus fucking christ! 

According to The Information, generative AI companies have raised more than $18.8 billion in the first quarter of 2025, after raising $21 billion in Q4 2024 and $4 billion in Q3 2024, for a grand total of $43.8 billion in funding, or $370.8 billion of investment and capital expenditures combined, for an industry that, despite being the single-most talked about thing on the planet, can barely create a tenth of the dollars it requires to make it work.
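If you want to check my tally, here is the whole sum in one place, using only the rounded figures quoted above (every “>$X ARR” treated as exactly $X, and the leftovers kept at the generous $200 million guess). None of these numbers come from anywhere other than this piece:

```python
# Back-of-the-envelope tally of the generative AI revenue figures quoted above,
# in millions of dollars, alongside the capex and funding totals. These are the
# article's rounded numbers, not an independent dataset.
quoted_revenue_m = {
    "Big Tech AI revenue": 18_000, "OpenAI": 12_700,
    "Surge": 1_000, "Turing": 300, "Perplexity": 100, "Anysphere (initial)": 200,
    "Anthropic": 3_000, "Scale": 870, "Anysphere (to reach claimed $500M)": 300,
    "Neo4j": 200, "Midjourney": 200, "Ironclad": 150, "Glean": 103,
    "Together AI": 100, "Moveworks": 100, "Abridge": 100, "Synthesia": 100,
    "WEKA": 100, "Windsurf": 100, "Runway": 84, "Elevenlabs": 100,
    "Cohere": 70, "Jasper": 60, "Harvey": 50, "Ada": 50, "Photoroom": 50,
    "Everyone else (generous guess)": 200,
}

revenue_b = sum(quoted_revenue_m.values()) / 1_000  # ~38.4 billion
spend_b = 327 + 18.8 + 21 + 4                       # capex plus three quarters of funding

print(f"Estimated industry revenue: ~${revenue_b:.1f}B")
print(f"Capex plus VC funding:       ${spend_b:.1f}B")
```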

These companies are predominantly unprofitable, perpetually searching for product-market fit, and even when they find it, seem incapable of generating revenue numbers that remotely justify their valuations. 

If I’m honest, I think the truly radical position here is the one taken by most tech reporters that would rather take the lazy position of “well Uber lost a lot of money!” than think for two seconds about whether we’re all being sold a line of shit.

What we’re watching is a mountain of waste perpetuated by the least-charming failsons of our generation. Nobody should be giving Satya Nadella or Sam Altman a glossy profile — they should be asking direct, brutal questions, much like Joanna Stern just did of Apple’s Craig Federighi, who had absolutely fucking nothing to share because he has never been pushed like this. 

Put aside the money for a second and be honest: these men are pathetic, unimpressive, uninventive, and dreadfully, dreadfully boring. Anthropic’s Wario (Sorry, Dario) Amodei and OpenAI’s Sam Altman have far more in common with televangelist Joel Osteen than they’ll ever have with Steve Jobs or any number of people that have actually invented things, and they got that way because we took them seriously instead of saying “wait, what do you mean?” to a single one of their wrongheaded, oafish and dim-witted hype-burps.

It’s boring! I’m terribly, horribly bored, and if you’re interested in this shit I am genuinely curious why, especially if you’re a reporter, because right now the “innovation” happening in AI is, at best, further mutations of the Software As A Service business model, providing far less value than previous innovations at a calamitous cost. 

Reasoning models don’t even reason, as proven by an Apple paper released a few weeks ago, and agents as a concept are fucked because large language models are inherently unreliable — and yes, a study out of fucking Salesforce found that agents began to break down when given multi-step tasks, such as “any task you’d want to have an agent automate.” 

So, here’s my radical suggestion: start making fun of these people.

They are not charming. They are not building anything. They have scooted along amassing billions of dollars promising the world and delivering you a hill of dirt. They deserve our derision — or, at the very least, our deep, unerring suspicion, if not for what they’ve done, but for what they’ve not done. Sam Altman is nowhere near delivering a functioning agent, let alone anything approaching intelligence, and really only has one skill: making other companies risk a bunch of money on his stupid ideas.

No, really! He convinced Oracle to buy $40 billion of NVIDIA chips to put in the Abilene Texas “Stargate” data center, despite the fact that the Stargate organization has yet to be formed (as reported by The Information). SoftBank and Microsoft pay all of OpenAI’s bills, and the media does his marketing for him. 

OpenAI is, as I said, quite literally a banana republic. It requires the media and the markets to make up why it has to exist, it requires other companies to pump it full of money and build its infrastructure, and it doesn’t even make products that matter, with Sam Altman constantly talking about all the exciting shit other people will build.

You can keep honking about how “it built the API that will power the future,” but if that’s the case, where’s the fucking future, exactly? Where is it? What am I looking at here? Where’s the economic activity? Where’s the productivity? The returns suck! The costs are too high! 

Why am I the radical person for saying this? This entire situation is absolutely god damn ridiculous, an incomparable waste even if it somehow went in the green. For the horrendous amounts of capital invested in generative AI to make sense, the industry would have to have revenue that dwarfed the smartphone and enterprise SaaS market combined, rather than less than half of that of the mobile gaming industry.

Satya Nadella, Sam Altman, Wario Amodei, Tim Cook, Andy Jassy — they deserve to be laughed at, mocked, or at the very least interrogated vigorously, because their combined might has produced no exciting or interesting products outside of, at best, what will amount to a productivity upgrade for integrated development environments and faster ways to throw out code that may or may not be reliable. These things aren’t nothing, but they’re nowhere near the something that we’re being promised.

So I put it to you, dear reader: why are we taking them seriously? What is there to take seriously other than their ability to force stuff on people?

And I’ll leave you with a question: how do they manage to keep doing this, exactly? They always seem to find new growth, every single quarter, without fail. Is it because they keep coming up with new ideas? Or is it because they come up with new ideas to get more money, a vastly different choice that involves increasing the prices of products or making them worse so that they can show you more advertisements?

My positions are not radical, and if you believe they are, your deference to the powerful disgusts me.


In any case, I want to end this with something inspirational, because I believe that things change when regular people feel stronger and more capable.

I want you to know that you are fully capable of understanding all of this. I don’t care if you “aren’t a numbers person” or “don’t get business.” I don’t have a single iota of economics training, and everything you’ve ever read me write has been something I’ve had to learn. I was a layperson right up until I learned the stuff, then I became a stuff-knower, just like you can be.

The tech industry, the finance industry, the entire mechanisms of capitalism want you to believe that everything they do is magical and complex, when it’s all far more obvious than you’d believe. You don’t have to understand the entire fundamentals of finance to know how venture capital works — they buy percentages of companies at a valuation that they hope is much lower than what the company will be worth in the future. You don’t need to be technical to know that a Large Language Model generates a response by guessing at what the next bit of text in a line should be, based on patterns it has seen across billions of pieces of training data.
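If it helps to see how unmagical that “guessing” is, here is a toy sketch of the loop in Python. The probability table is entirely made up for illustration; a real model learns billions of these statistics from its training data and conditions on far more than the previous word, but the basic move (pick a likely next word, append it, repeat) is the same:

```python
import random

# Hypothetical next-word probabilities keyed by the previous word. A real LLM
# learns these from training data rather than having them written by hand.
next_word_probs = {
    "the": {"model": 0.5, "future": 0.3, "hype": 0.2},
    "model": {"guesses": 0.6, "fails": 0.4},
    "guesses": {"the": 0.7, "wrong": 0.3},
}

def generate(prompt_word: str, steps: int = 5) -> str:
    """Repeatedly pick a next word, weighted by what followed the current word."""
    words = [prompt_word]
    for _ in range(steps):
        options = next_word_probs.get(words[-1])
        if not options:
            break  # nothing "learned" about this word, so stop
        choices, weights = zip(*options.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))
```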

These people love to say “ah, but didn’t you see-” and present an anecdote, when no anecdote will ever defeat the basics of “your business doesn’t make any money, the software doesn’t do the things you claim it’s meant to, and you have no path to profitability.” They can yammer at you all they want about “lots of people using ChatGPT,” but that doesn’t change the fact that ChatGPT just isn’t that revolutionary, and their only play here is to make you feel stupid rather than actually showing you why it’s so fucking revolutionary.

This is the argument of a manipulator and a coward, and you are above such things.

You don’t really have to be a specialist in anything to pry this shit apart, which is why so much of my work is either engaging to those who learn something from it or frustrating to those that intentionally deceive others through gobbledygook hype-schpiel. I will sit here and explain every fucking part of this horrid chain of freaks, and break it down into whatever pieces it takes to educate as many people as I have to to make things change.

I also must be clear that I am nobody. I started writing this newsletter with 300 subscribers and no reason other than the fact I wanted to, and four years later I have nearly 64,000 subscribers and an award-winning podcast.  I have no economics training, no special access, no deep sources, just the ability to look at things that are happening and say stuff. I taught myself everything I know about this industry, and there is nothing stopping you from doing the same.

I was convinced I was stupid until around two years ago, though if I’m honest it might have been last year. I have felt othered the majority of my life, convinced by people that I am incapable or unwelcome, and as I’ve become more articulate and confident in who I am and what I believe in, I have noticed that the only people that seek to degrade or suppress are those of weak minds and weaker wills — Business Idiots in different forms and flavors. I have learned to accept who I am — that I am not like most people — and people conflate my passion and vigor with anger or hate, when what they’re experiencing is somebody different who deeply resents what the powerful have done to the computer.  

And while I complain about the state of media, what I’ve seen in the last year is that there are many, many people like me — both readers and peers — that resent things in the same way. I conflated being different with being alone, and I couldn’t be more wrong. For those of you that don’t wish to lick the boots of the people fucking up every tech product, the tent is large, it’s a big club, and you’re absolutely in it.

A better tech industry is one where the people writing about it hold it accountable, pushing it toward creating the experiences and connectivity that truly change the world rather than repeating and reinforcing the status quo. 

Don’t watch the mouth, watch the hands. These companies will tell you that they’re amazing as many times as they want, but you don’t need to prove that — they do. I don’t care if you tell a single human soul about my work, but if it helps you understand these people better, use it to teach other people. 

These people may seem all-powerful, but they’ve built the Rot Economy on a combination of anonymity and a pliant press, and pressure against them starts with you and those you know understanding how their businesses work, and trusting that you can understand because you absolutely can. Millions of people understanding how these people run their companies and how poorly they’ve built their software will stop people like Sundar Pichai from being able to quietly burn Google Search to the ground.

People like Sam Altman are gambling that you are easily-confused, easily-defeated and incurious, when you could be writing thousands of words on a newsletter that you never, ever edit for brevity. You can understand every fucking part of their business — the economics of OpenAI, the flimsy promises of Salesforce, the destruction of Google Search — and you can tell everybody you know about it, and suddenly it won’t be so easy for these wretched creeps to continue thriving.

I know it sounds small, and like your role is even smaller, but the reason they’ve grown so rapaciously is driven by the sense that the work they do is some sort of black magic, when it’s really fucking stupid and boring finance stapled onto a tech industry that’s run out of ideas.

You are more than capable of understanding this entire world — including the technology, along with the finances that ultimately decide what technology gets made next.

These people have got rich and famous and escaped all blame by casting themselves as somehow above us, when if I’m honest, I’ve never looked down on somebody quite as much as I do the current gaggle of management consultant fucks that have driven Silicon Valley into the ground.
