
Your AWS Certificate Makes You an AWS Salesman


I must have been the last developer still confused by the AWS interface. I knew how to access DynamoDB; that was the only tool I needed for my daily work. But everything else was a mystery. How do I access web hosting? If I needed a small server to host a static website, what service would I use? Searching for "web hosting" inside the AWS console yielded nothing.

After digging through the web, I found the answer: an Elastic Compute Cloud instance, better known as EC2. I learned that I could use it under the "Free Tier." Amazon offers free tiers for many services, but figuring out the actual cost beyond that introductory period requires elaborate calculation tools. In fact, I’ve often seen independent developers build tools specifically to help people decipher AWS pricing.

If you want to use AWS effectively, it seems the only path is to get certified. Companies send employees to conferences and courses to learn the platform. I took some of those courses and they taught me how to navigate the interface and build very specific things. But that skill isn't transferable. In the course, I wasn't exactly learning a new engineering skill. Instead, I was learning Amazon.

Amazon has created a complex suite of tools that has become the industry standard. Behind its moat of confusion, we are trained to believe it is the only option. Its complexity justifies the high cost, and the Free Tier lures in new users who settle into the idea that this is just "the way" to do web development.

When you are presented with a simple interface like DigitalOcean or Linode and a much cheaper price tag, you tend to think that something is missing. Surely, a cheaper, simpler service must lack half the features, right? The reality is, you don't need half the stuff AWS offers. Where other companies create tutorials to help you build, Amazon offers certificates. It is a powerful signal for enterprise legitimacy, but for most developers, it is overkill.

This isn't to say AWS is "bad," but it obscures the reality of running a web service. Running one is much easier than it seems. There are hundreds of alternatives for hosting. You can run your services reliably on a VPS without ever breaking the bank.

Most web programming is free, or at the very least, affordable.


Feldspars


Returning from a trip to New Mexico to explore some Puebloan ruins, I picked up this beautiful chunk of labradorite in the town of Quartzsite. This mineral creates an eerie blue shimmer in the sunlight: a phenomenon called ‘labradorescence’. Reading up on it, I discovered it’s a form of feldspar.

60% of the Earth’s crust is feldspar, and I know so little about this stuff! It turns out there are 3 fundamental kinds:

orthoclase is potassium aluminosilicate, KAlSi₃O₈
albite is sodium aluminosilicate, NaAlSi₃O₈
anorthite is calcium aluminosilicate, CaAl₂Si₂O₈

Then there are lots of feldspars that contain different amounts of potassium, sodium and calcium. We get a triangle of feldspars with orthoclase, albite and anorthite at the corners. You can find labradorite on this triangle:
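
In coordinates, a point of this triangle is a composition (x_or, x_ab, x_an) with x_or + x_ab + x_an = 1 and each fraction nonnegative: the proportions of orthoclase, albite and anorthite. The corners are the pure endmembers, and the edges are mixtures of just two.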

But not all points in this triangle are possible kinds of feldspar! There’s a big region called the ‘miscibility gap’, where as you cool the molten mix it separates out. Apparently this is because the radius of potassium is too much bigger than that of calcium for them to get along. Sodium has an intermediate radius so it gets along with either calcium or potassium.

And there are also subtler issues. When you cool down the feldspar called labradorite, it separates out a little, forming tiny layers of two different kinds of stuff. When the thickness of these layers is comparable to the wavelength of visible light, you get a weird optical effect: labradorescence! You really need a movie to see the strange shimmer as you turn a piece of labradorite in the sunlight.

In fact there are 3 kinds of feldspar that separate out slightly as they cool and harden, forming thin alternating layers of two substances:

• The ‘peristerite gap’ produces layers in feldspars with 2-16% anorthite and the rest albite: these layers create the beauty of moonstone!

• The ‘Bøggild gap’ produces layers in feldspars with 47-58% anorthite and the rest albite: these are labradorites!

• The ‘Huttenlocher gap’ produces layers in feldspars with 67-90% anorthite and the rest albite: these are called ‘bytownites’. For some reason these layers do not seem to produce an interesting visual effect. Maybe their thickness is too far from the wavelength of visible light.

All these gaps are ‘miscibility gaps’: that is, feldspars with these concentrations of anorthite and albite are unstable: they want to separate out. That’s why they form layers.

The physics and math of all this stuff is fascinating. Crystals try to do whatever it takes to minimize free energy, which is energy minus entropy times temperature. That’s why many feldspars have different high- and low-temperature forms. But sometimes when molten rock cools quickly, it doesn’t have time to reach its free energy minimizing state.
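
In symbols, the free energy is

F = U − TS

where U is the energy, S the entropy and T the temperature. At high temperatures the −TS term dominates, so disordered, high-entropy arrangements win; at low temperatures minimizing the energy U wins, favoring ordered ones.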

For feldspar all of these issues are complex, because feldspar crystals are complicated structures:

Aluminum and silicon have to be distributed among the corners of the tetrahedra here, and there are various ways to do this. The distribution is determined by the relative amounts of potassium, sodium and calcium, which are the white balls. The distribution of aluminum and silicon in turn controls the symmetry of the crystal, which can be either ‘monoclinic’ or the less symmetrical ‘triclinic’.

The picture here shows the difference between monoclinic and triclinic crystals:

But the picture doesn’t fully capture the symmetry group of an actual crystal—because there’s more to a crystal than just the shape of a parallelepiped! There may be the same atoms at all corners of the parallelepiped, or not, and there may also be other atoms not on the corners.

Let’s get into a bit of the math.

The symmetry group G of a crystal, called its ‘space group’, fits into a short exact sequence:

0 → T → G → P → 1

where T ≅ ℤ³ is the group of translational symmetries and P is the group of symmetries that fix a point, called the ‘point group’. This sequence may or may not split! It splits iff G is a semidirect product of P and T.
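
When it does split, every element of G can be written uniquely as a pair (t, p) with t ∈ T and p ∈ P, multiplying by the rule

(t₁, p₁)(t₂, p₂) = (t₁ + p₁·t₂, p₁p₂)

where p₁·t₂ denotes the point group acting on the translation lattice.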

For a triclinic crystal, there are only two possible space groups G, and both are semidirect products. P is either trivial or ℤ/2, acting by negation.

For a monoclinic crystal, there are 3 choices of the point group P as a subgroup of O(3):

• P = ℤ/2 (generated by a single 2-fold rotation)
• P = ℤ/2 (generated by a single reflection)
• P = ℤ/2 × ℤ/2 (generated by a 2-fold rotation and the inversion (𝑥,𝑦,𝑧) ↦ −(𝑥,𝑦,𝑧); their product is a reflection)

For each choice of P there are 2 fundamentally different choices of lattice T ≅ ℤ³ it can act on. One is made up of copies of the parallelepiped I showed you. The other is twice as dense; then we call the lattice ‘base-centered monoclinic’:

So, we get 3 × 2 = 6 space groups G that are semidirect products.

But there are 7 other non-split extensions! These other 7 give nontrivial elements of the cohomology group H²(P, T). It’s not obvious that there are just 7 options. Thus, the hardest part of the classification of all 13 monoclinic space groups is essentially the computation of H²(P, ℤ³) for all 6 choices of groups P and their actions on ℤ³.

I knew that cohomology rocks. But it turns out cohomology helps classify rocks!

Now, which of these various groups are symmetry groups of feldspars?

Apparently all the feldspars in the triangle have just two different symmetry groups:

• For the monoclinic feldspars (including sanidine, orthoclase, and high-temperature albite), the crystal has a 2-fold rotational symmetry, a mirror plane, and inversion symmetry

(𝑥,𝑦,𝑧) ↦ -(𝑥,𝑦,𝑧).

The point group is the Klein four-group ℤ/2 × ℤ/2. The lattice is base-centered monoclinic, so there’s an extra translational symmetry shifting by half a cell diagonally across one face of the parallelepiped.

• For the triclinic feldspars (including microcline, low-temperature albite, and anorthite), the only symmetry beyond translation is inversion. So the point group is just ℤ/2. And there are no extra generators of translation symmetry beyond the three edges of the parallelepiped.

Alas, each of these space groups G is the semidirect product of its point group P and its translation symmetry group T ≅ ℤ³. So, no interesting cohomology classes show up!

Nontrivial cohomology classes show up only in crystals where you can’t cleanly separate the translations from the symmetries that fix a point of the crystal. This happens when your crystal has ‘screw axes’ or ‘glide planes’. A screw axis is an axis where you’ve got a symmetry of translating along that axis, but only if you also rotate around it:
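
For example, a 2-fold screw axis along the y axis acts by

(x, y, z) ↦ (−x, y + ½, −z)

Doing this twice gives translation by (0, 1, 0), so the rotation never appears as a symmetry on its own; it comes welded to a half-step translation.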

A glide plane is a plane where you’ve got a symmetry of translating along that plane, but only if you also reflect across it:
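
For example, the glide reflection

(x, y, z) ↦ (x, −y, z + ½)

reflects across the xz plane while shifting half a lattice step along z, and its square is translation by (0, 0, 1). This particular glide will reappear in the details below.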

But wait! There’s a rarer kind of feldspar made with barium. It’s called celsian, after Anders Celsius, the guy who invented the temperature scale. Chemically it’s barium aluminosilicate. And its crystal structure has both screw axes and glide planes! So its space group G is not a semidirect product! It’s an extension of ℤ³ by the point group P = ℤ/2 × ℤ/2 that gives a nonzero element of H²(P, ℤ³). See the end of this post for some details.

All this is lots of fun to me: you start with a pretty rock, and before long you’re doing group cohomology. But the classification of symmetry groups is just the start. For mathematical physicists, one fun thing about feldspars is their phase transitions, especially the symmetry-breaking phase transition from the more symmetrical monoclinic feldspars to the less symmetrical triclinic ones! There’s a whole body of work—by Salje, Carpenter, and others—applying Landau’s theory of symmetry-breaking phase transitions to map out the space of different possible feldspar crystals! Here’s one way to get started:

• Ekhard Salje, Application of Landau theory for the analysis of phase transitions in minerals, Physics Reports 215 (1992), 49–99.

Even if you don’t particularly care about feldspars, there are a lot of good general principles of physics to learn here!

Details

Let me sketch out why barium aluminosilicate, or celsian, has a space group G that’s described by a non-split short exact sequence:

0 → T → G → P → 1

Its point group is P = {e, r, m, i} ≅ ℤ/2 × ℤ/2, where we can take r to be a 180° rotation about the y axis and m to be a reflection that negates the y coordinate, so that i = rm is inversion. In coordinates:

r acts as (x, y, z) ↦ (−x, y, −z)
m acts as (x, y, z) ↦ (x, −y, z)
i acts as (x, y, z) ↦ (−x, −y, −z)

We can take the translation lattice T ≅ ℤ³ to be the lattice generated by

f₁ = (1,0,0), f₂ = (0,1,0), f₃ = (½,½,½)

Note that (0,0,½) is not in T.
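
Explicitly, T consists of the vectors a·f₁ + b·f₂ + c·f₃ = (a + c/2, b + c/2, c/2) with a, b, c integers. So the z coordinate of a lattice vector is a half-integer exactly when its x and y coordinates are too; this coupling between the coordinates is what drives the argument below.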

To compute the 2-cocycle we need a set-theoretic section s: P → G. We choose

s(e) = identity
s(m) = a glide reflection: (x, y, z) → (x, −y, z + ½)
s(i) = inversion: (x, y, z) → (−x, −y, −z)
s(r) = s(m)·s(i) (first invert, then glide): (x, y, z) → (−x, y, −z + ½)

As usual, the 2-cocycle c: P² → T is defined by

c(g,h) = s(g)·s(h)·s(gh)⁻¹

The interesting value is c(m, m): the glide composed with itself gives (x, y, z) → (x, −y, z+½) → (x, y, z+1), so s(m)² = translation by (0, 0, 1), while s(m²) = s(e) is the identity. Thus c(m, m) = (0, 0, 1). The diagonal values c(i, i) and c(r, r) are trivial, since s(i) and s(r) both square to the identity.

Now, is this cocycle nontrivial in H²(P, T)? It would be trivial if we could find a different section that makes the cocycle zero—that is, find a function b: P → T such that replacing s(g) with s'(g) = b(g)·s(g) makes

c'(g,h) = s'(g)·s'(h)·s'(gh)⁻¹

be the identity for all g,h. I will spare you the calculation proving this is impossible. The idea is simply this: the reflection m squares to the identity in the point group, but no matter how we choose b, s'(m) is a glide reflection, so it squares to a nontrivial translation. On the other hand, s'(m²) is trivial since m² is, so

c'(m,m) = s'(m)·s'(m)·s'(m²)⁻¹

is nontrivial.
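
If you'd like a machine to double-check this, here is a small Python sketch of my own (not from any crystallography library): it encodes the reflection action and the lattice T, takes c(m, m) = (0, 0, 1) as computed above, and searches a finite window of lattice vectors b for a coboundary that would trivialize it. The parity argument above shows the search must fail for every b, not just those in the window.

    from fractions import Fraction as F
    from itertools import product

    # Generators of the translation lattice T (base-centered cell).
    f1 = (F(1), F(0), F(0))
    f2 = (F(0), F(1), F(0))
    f3 = (F(1, 2), F(1, 2), F(1, 2))

    def add(u, v):
        return tuple(a + b for a, b in zip(u, v))

    def scale(k, v):
        return tuple(k * x for x in v)

    def m_action(v):
        # The point group element m acts by negating the y coordinate.
        return (v[0], -v[1], v[2])

    # s(m) is the glide (x, y, z) -> (x, -y, z + 1/2); its square is
    # translation by (0, 0, 1), while s(m^2) = s(e) is the identity.
    c_mm = (F(0), F(0), F(1))

    # Replacing s(m) by s'(m) = b·s(m) with b in T turns s'(m)^2 into
    # translation by b + m(b) + (0, 0, 1); triviality needs this to vanish.
    N = 5  # coordinates of b range over -N..N in the basis f1, f2, f3
    solutions = []
    for a, b2, c in product(range(-N, N + 1), repeat=3):
        b = add(add(scale(a, f1), scale(b2, f2)), scale(c, f3))
        if add(add(b, m_action(b)), c_mm) == (F(0), F(0), F(0)):
            solutions.append(b)

    print(solutions)  # -> []: no b in the window kills the cocycle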






Lighter, not faster


I keep hearing variants of this complaint lately:

“This (tool / workflow / service / slot machine) is slower than me doing it manually.

Therefore, it's not worth using.”

These people are missing the point. Speed is easy to measure - that’s great. But focusing on speed overlooks the importance of subjective effort and mental load.

Let's talk about grocery stores, naval signaling flags, and the value beyond time saved.

The grocery store self-checkout

In the grocery store, do you choose a human cashier or the self-checkout machine?

People who prefer self-checkout often believe that it's faster. But in my highly-scientific study (loitering in the soup aisle with a stopwatch, n=24), the fastest self-checkout user was only equal in speed to the average cashier in scanning items. Once you add in the time to bag items and pay (not to mention "unexpected item in bagging area"), most people have no chance to outpace a human cashier.

The true value of the self-checkout is to offload social effort, the weight of interaction. Arthur Schopenhauer lived too early to scan his own groceries, but he did write, "A man can be himself only so long as he is alone... for it is only when he is alone that he is really free."

Artist’s depiction

Decoding naval signaling flags (semaphore)

Quick, what is this sailor saying to you?

Image via Wikimedia Commons

It's flag semaphore for the letter "U", of course. One weekend, I was tired of staring at the lookup table for flag semaphore (a common cipher used in puzzle hunts), so I made an interactive graphical tool to help me decode it.

Try it out here: Semaphore Decoder

It worked great and I felt that it was way faster, just like our self-checkout users above. But after a highly-scientific evaluation (decoding four phrases from my friend Johnny), I was surprised to learn the decoder was only equal in speed to the lookup table.

The true value of the semaphore decoder is to offload cognitive effort. Instead of burning mental energy trying to match a shape in a lookup table, I can mechanically use my tool to grind through the rote decoding process - no faster, but still easier.

That means I'm free to focus my efforts on more interesting, fun, and challenging aspects of the puzzle. I keep more energy for the next puzzles in the hunt.

Lighter burdens make your journey feel faster

Via my third and final highly-scientific study (my personal vibes), I've come to believe we fixate on speed and time measures for two reasons:

  1. "Time is money" is a pervasive metaphor in our culture.

  2. It's easy to measure and compare.

But even if a new approach is equally slow or even slower, the value of reducing effort is real.

I've previously written a bit on the history of the command bar/palette seen in apps like VS Code, Notion, and Slack. That’s a UI pattern that is actually slower than dedicated keyboard shortcuts, but it offers a better user experience through better discoverability and reduced cognitive load.

The self-checkout is slower, but I can relax a little. The semaphore decoder isn't any faster, but I feel less mentally tired. And the command bar is slower, but I can always find what I was looking for.

Sure, if all things are equal, I prefer faster. Who wouldn’t? But if you only prioritize hard numbers and squeezing out every moment of savings, you are going to miss the opportunity to make your effort lighter - and relax for the same result.

I like the tools that make my work lighter, not just faster.


“Approximately 21 times the estimated age of the universe”


A few years ago, some sort of a bug at my work caused all of the timestamps to appear as “54 years ago,” a seemingly arbitrary interval. It took me a bit to realize: “Wait, you know what year was 54 years ago? 1970!” “Why is 1970 important?” asked another designer. I explained that by convention, Unix time counts up from Jan 1, 1970 – and so if the time “value” is zero or unavailable, as it was because of the bug, it would be rendered not as an error, but as that specific day long ago.

Computing is filled with all sorts of arbitrary numbers like these. The most famous one was Y2K (99 + 1 = 00 if you only allocate two digits), Pac-Man’s kill screen was number 256, people still bring up the infamous and likely non-existent “640 kilobytes should be enough for everybody” quote, and the Deep Impact space probe died a lonely and undignified death after its timers overflowed the two pairs of bytes given to them.

Here’s a new magic number to remember: macOS Tahoe has, for a while at least, a kill screen of its own – after 49 days, 17 hours, 2 minutes, and 47 seconds (or, 4,294,967,295 milliseconds), one of its time counters overflows and no new network connections can be made, rendering the machine rather useless. The only solution is a reboot. Talk about a deadline!
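
The figure checks out: 4,294,967,295 is 2³² − 1, the largest value an unsigned 32-bit counter of milliseconds can hold. A quick sanity check in Python:

    ms = 2**32 - 1                      # max unsigned 32-bit value, in milliseconds
    seconds, _ = divmod(ms, 1000)
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    print(f"{days} days, {hours} hours, {minutes} minutes, {secs} seconds")
    # -> 49 days, 17 hours, 2 minutes, 47 seconds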

(Well, new-ish. In perhaps a bit of karmic payback, Windows 95 and 98 once had a similar problem with the exact same threshold of 49.7 days.)

Wikipedia has a nice list of other time storage bugs. The next big one? The problem of the year 2038. The technical fix, as always, is to give the numbers a bit more room to breathe. This is, in a way, kicking the can down the road, but that might be okay since the road is rather long:

Modern systems and software updates address this problem by using signed 64-bit integers, which will take 292 billion years to overflow—approximately 21 times the estimated age of the universe.
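
Both numbers are easy to reproduce. A signed 32-bit count of seconds runs out 2³¹ seconds after the epoch, early on January 19, 2038, while a signed 64-bit count lasts on the order of 292 billion years:

    from datetime import datetime, timedelta, timezone

    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

    # Signed 32-bit time_t: the last representable second.
    print(epoch + timedelta(seconds=2**31 - 1))   # 2038-01-19 03:14:07+00:00

    # Signed 64-bit: convert 2**63 seconds to years and to ages of the universe.
    seconds_per_year = 365.25 * 86400
    years = 2**63 / seconds_per_year
    print(f"{years:.3g} years")                   # ~2.92e+11: 292 billion years
    print(f"{years / 13.8e9:.1f} universe ages")  # ~21.2, for a 13.8-billion-year universe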

However, as always, the technical side won’t be the hard part.

#bugs


Reflections on Classroom Technology


There are two articles out this week about my tech-free experiment, one in The Atlantic (gift link) and one in Chalkbeat. I thought the articles did a good job capturing my experiment so I won’t retread everything. Go read the articles, or if you want to see my writing on technology you can check it out here, and here, and here. I do have a few quick reflections on some responses I saw to the articles.

I’ve Moved On

It’s funny these articles came out now because I did the experiment in January. I feel like I’ve moved on. It’s not an experiment anymore. It’s just the way I teach. I radically reduced technology use in my class, I’m happy with that, and I’m thinking about a lot of other aspects of my teaching these days. I emphasize this because there’s a growing movement advocating for less technology in schools. I think that’s a good thing! But classroom technology is just one of many, many things that matter for student learning. Some of the big claims I’ve seen about technology use harming learning don’t seem in line with the evidence. Let’s reduce classroom technology use, sure, but let’s be realistic about the impact that will have on student achievement, and also work on the dozens of other things that can help students learn.

Just a Tool

A few responses said something along the lines of, “Sure, technology can be overused, but computers are a tool just like any other. Why is it such a big deal not to use student-facing technology at all? Why not only use it when it benefits learning?”

Before my experiment, I would have said that I’m someone who doesn’t use technology very much. I’d estimate we used Chromebooks about 20% of class time. Going cold turkey for a month was a helpful way to see that even that 20% was way too much. I’m also not completely zero-tech now. We’ve used Chromebooks twice in the last three months, both for narrow practical purposes.1 I’d recommend teachers give a tech-free experiment a try, just to see how they feel about it.

A related point here is that schools spend a ton of time, energy, and money on student technology. Something like 90% of US schools provide one-to-one devices to students. On Monday, when I asked students to pull out their Chromebooks, one looked like this:

This is the type of stuff teachers and other staff deal with when students carry Chromebooks around all day every day. The practical question here is whether schools should be providing one-to-one devices. Maybe schools should have a computer lab or shared Chromebook cart instead.

Assessment

A final response I saw a few times was surprise that I was giving quizzes and tests on Chromebooks before my tech-free experiment. I get it. That was dumb. In retrospect, it seems obvious students should be assessed on paper. I started giving assessments online during covid for practical reasons, and kind of just never thought twice about it again. They were easy to grade and there was no paper to keep track of for absent students. Students take most high-stakes assessments online now, so there’s a common logic in schools that students should practice taking other assessments online. There’s also a bit of cachet that comes with doing things with technology. I remember one time a few years ago the superintendent was doing a classroom walkthrough and I was giving a quiz. She complimented how nifty one of the interactive online questions was. They can be pretty nifty! That doesn’t mean it was good teaching, but there’s an assumption in education that using technology is innovative or whatever. This is a long-winded way of saying: trying a tech-free experiment can help teachers look at their practices from a new perspective, and maybe realize that they’re using technology for the wrong reasons.

There’s a lot of talk these days about how students are addicted to their phones. I think teachers can just as easily get addicted to classroom technology. The classroom tech I used before definitely made my life a bit easier. I work a bit harder now. That’s fine with me. I think it’s worth the trade. Students work a bit harder too! Those are exactly the types of conversations I hope schools start having about their technology use.

1

If you’re curious what I’ve used Chromebooks for: Once was about 15 minutes for this Desmos activity that I find helpful for teaching the triangle inequality theorem. Shoutout to Desmos, if you’re going to use technology in math class it’s the best tool out there. The second time happened to be on Monday — my students took the computer-based state test on Tuesday/Wednesday/Thursday this week, so we spent the last 10 minutes of class on Monday practicing typing math notation on computers.


Admissions Officers Beware: Some Advanced Placement Scores Are Inflated



While high school GPAs have been gliding upwards for years, college admissions officers have relied on Advanced Placement (AP) exams as a more stable, rigorous measure of college readiness. That confidence is now misplaced—at least for most of the exams that dominate the AP landscape.

The College Board has phased in a new scoring system that has inflated student results on nine of the most frequently taken AP exams. The share of students receiving the top score of 5 on these exams has jumped by an average of 61 percent in just four years. The share receiving a passing score (3 or higher) has risen by 37 percent.

Some less common AP exams still appear to function as reliable indicators of high academic achievement. But for the most popular exams, high school counselors and college admissions committees must go beyond a quick glance at the AP scores listed on an application. They now need to look closely at which AP exams a student took, and in which years.

Trevor Packer, the senior vice president in charge of AP programs, denies that any score inflation has occurred. He has described the claim that AP is being “dumbed down” as “entirely false.” This essay explains how the scoring system has changed, demonstrates that inflation has occurred, and shows why the official denials are misleading.

Why AP Matters

High performance on AP exams is an important way students signal readiness for rigorous college work:

  • Scores of 5 on multiple tests serve as a positive signal that a student is prepared for admission to Ivy League and other highly selective institutions.
  • Many selective but non-elite colleges award course credit or waive introductory course requirements for scores of 4 or 5.
  • Most non-selective colleges grant credit for a passing score of 3 or higher.

The financial stakes are high. By substituting a high school AP course for a college course, students can reduce college costs and shorten their time to earning a degree. Reflecting this, more than 1.3 million high school students in 2025 paid a $99 fee for each of over 4.8 million AP exams.

Given these stakes, the integrity of AP exams depends on a scoring system that is stable across subjects and from year to year.

How AP Scoring Used to Work

Until 2022, the College Board used a relatively consistent procedure to set score distributions:

  • Each AP exam was reviewed every 5–10 years by a panel of approximately 10 to 18 experienced college professors and high school teachers.
  • These experts had deep subject-matter knowledge and a clear sense of what level of performance justified advanced placement in college.
  • They determined what share of test takers should receive each of the five AP scores (1 through 5).

Under this system, the distribution of students awarded scores of 5, 4, 3, and so on was anchored to the standards of a carefully selected expert group and remained fairly stable over time.

The Shift to a New Scoring System

After 2021, the College Board began phasing in a different approach for nine of its most popular exams:

  • English Language and Composition
  • U.S. History
  • English Literature and Composition
  • World History
  • U.S. Government and Politics
  • Psychology
  • Biology
  • Human Geography
  • Chemistry

Less commonly taken exams—such as Music Theory, Art History, Japanese, Italian, and Physics (Electricity and Magnetism)—continue to be scored under the traditional expert-judgment system.

What Happened to the Scores?

Under the new system, performance on the nine popular exams suddenly “improved” in ways that are historically unprecedented:

  • Top score (5): The share of students earning a 5 increased from about 10 percent in 2021 to 17 percent in 2025, on average—a 61 percent increase. Under the old system, the share of 5s awarded in these subjects, on average, hardly changed over the previous six years.
  • Top two scores (4 or 5): In 2021, just 28 percent of test takers received a 4 or 5 on these nine exams. By 2025, that had jumped to 45 percent, a gain of 17 percentage points—or a roughly 63 percent increase.
  • Passing scores (3 or higher): The share of students receiving a 3 or better rose from roughly 52 percent to 71 percent over the same period, a 19 percentage-point increase—resulting in a 37 percent jump in passing rates.
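
Percentage points and percent changes are easy to conflate, so here is the arithmetic behind that last bullet, as a quick Python check:

    before, after = 0.52, 0.71                  # passing rates, 2021 vs. 2025
    point_gain = (after - before) * 100         # gain in percentage points
    percent_change = (after - before) / before  # relative jump in the passing rate
    print(round(point_gain), round(percent_change * 100))  # -> 19 37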

Such large, rapid gains call for explanation.

Three Possible Explanations

Three broad explanations (or some combination of them) could account for this sudden surge in scores:

The test-taking pool became more selective. Perhaps weaker students stopped taking AP exams, leaving a stronger group of test takers.

This is easily rejected. Since 2021, the number of AP test takers has increased, not decreased. The pool has expanded rather than narrowed to a high-performing elite.

Teaching and learning improved dramatically. Perhaps teachers and students suddenly found far more effective ways to teach and learn AP material.

If students’ knowledge truly improved so dramatically, we would expect to see similar gains on other large-scale, independent tests. In fact, national and international data tell a different story:

  • NAEP (National Assessment of Educational Progress) scores in 8th-grade math, reading, and science were already slipping before Covid and have fallen sharply since. 12th-grade math and reading scores also declined between 2019 and 2024.
  • PISA (Programme for International Student Assessment) scores show stagnation in science and reading for U.S. 15-year-olds since 2015, and a decline in math.

These results provide no evidence of a sudden, broad-based leap in academic achievement. If anything, they point to stagnation and decline.

That leaves a third explanation.

The scoring system was relaxed. Perhaps a new evaluative approach altered the way tests were scored so that higher scores were given for the same level of performance. Let’s take a look.

Evidence-Based Standard Setting (EBSS): The New Method

After 2021, the College Board introduced what it calls “Evidence-Based Standard Setting” (EBSS) to determine score distributions on its most popular AP exams.

Under EBSS, the College Board consults hundreds of college instructors instead of relying on a small panel of carefully selected experts. These instructors are asked to recommend what proportion of students should receive each AP score.

In practice, the standards produced by this large, dispersed group are substantially lower than those set by the traditional expert panels.

The Impact of EBSS

With the implementation of EBSS, the share of passing scores rose sharply across the nine popular courses that used it. The size of the increase varies by subject:

  • In English Literature, U.S. History, and U.S. Government and Politics, the share of 4s and 5s rose by 24 percentage points or more (see Figure 1).
  • In Psychology and English Language and Composition, the increase was smaller but still substantial—about 9 or 10 percentage points.
  • In each of the nine subjects, EBSS is associated with higher scores and higher passing rates between 2021 and 2025.
  • The largest score increases on each exam within this period correspond to the specific year when EBSS was first applied.

These patterns are precisely what we would expect if the scoring standards had been relaxed.

Figure 1: Evidence of score inflation

Parsing the Official Denials

Trevor Packer insists there has been no “dumbing down” of AP exams, stating, “The exams themselves have not changed . . . Well-established equating processes ensure the difficulty of AP Exams remains consistent from year to year.”

This statement is technically correct but strategically framed. It emphasizes one piece of the puzzle (the difficulty of the test questions) while ignoring another (the conversion of raw test scores into AP scores).

Dumbing down does not require easier questions. It can be achieved just as effectively by changing how test scores are mapped onto the 1–5 scale—exactly what EBSS does.

According to one report of a public appearance, Packer acknowledged that the College Board aimed “to bring all exams to between a 60 and 80 percent success rate.” In 2025, the average passing rate on the nine EBSS exams was 71 percent, almost exactly the midpoint of that target range. EBSS appears to have been used to recruit scorers whose standards would produce the desired “success” rates.

Packer further claims that fluctuations in passing rates are driven by changes in student performance, pointing to recent declines in pass rates for AP Calculus BC, AP Statistics, AP Physics C: Mechanics, and AP Government and Politics courses. He neglects to highlight that:

  • All of these courses except AP U.S. Government and Politics have never been subjected to EBSS.
  • For AP U.S. Government and Politics, the 2025 pass rate is only slightly below its 2024 level—after a 20 percentage-point increase following the adoption of EBSS.

The pattern is consistent: where EBSS is applied, scores rise substantially; where it is not, scores tend to reflect the stagnation or decline seen in broader national tests.

AP vs. IB and the Role of Marketing

To justify higher AP passing rates, Packer points to the International Baccalaureate (IB) program, where roughly 80 percent of candidates succeed. The comparison is misleading:

  • IB is an integrated two-year program, not a set of independent single-course exams.
  • Earning the IB diploma requires sustained performance across multiple subjects and assessments over time.

Nonetheless, the comparison reveals something important: the College Board is attentive to market positioning. If IB can boast an 80 percent “success” rate, AP’s passing rates must appear competitive to students, parents, schools, and policymakers.

Financial Incentives and Score Inflation

Market considerations are not incidental to the College Board. They are central to its operations:

  • In 2024, over 86 percent of College Board revenue came from fees and similar payments, including 48 percent from the basic AP exam fee.
  • In 2024, total revenues exceeded $1.17 billion, and the organization held reserves of over $2 billion.

Generous compensation at the top reinforces these incentives:

  • The CEO received $2.3 million in total compensation in 2024, comparable to the pay of the president of Stanford University, though Stanford’s operating budget is about ten times larger.
  • The second-in-command earned $1.5 million.

To sustain these revenues and salaries, the College Board must keep AP attractive to schools and students. Guaranteeing that more than two-thirds of test takers “succeed”—via relaxed scoring standards—serves that purpose well.

If this requires inflating AP scores, so be it. The more troubling question is why a senior vice president feels compelled to deny the inflation and to frame it instead as a story of scoring becoming “more precise.”




Implications for Admissions Officers and Counselors

For college admissions officers, high school counselors, and policymakers, several implications follow:

  • AP scores are no longer directly comparable across subjects and years. A score of 4 in AP U.S. History today (post-EBSS) does not mean the same thing as a 4 in U.S. History before 2021, nor as a 4 in AP Music Theory (still scored under the old system).
  • The most popular exams are the most inflated. The very tests taken by the largest number of students—those that dominate application profiles—are the ones whose standards have been relaxed.
  • Context now matters critically. Evaluators should:
    • Note which AP courses and exams a student took.
    • Check the year(s) in which those exams were taken.
    • Recognize that high scores on EBSS-affected exams are far less informative than scores on exams that retained traditional standard setting.

AP exams, once a gold-standard external check on grade inflation, now vary in reliability. Without close attention to subject and year, admissions decisions risk being distorted by hidden inflation.

Conclusion

The College Board’s shift to Evidence-Based Standard Setting for its most popular AP exams has produced an unmistakable pattern of score inflation, even as broader measures of student achievement show stagnation or decline. Official statements that “the exams themselves have not changed” hide the central fact: the scoring system has changed, in ways that dramatically raise reported performance.

AP remains influential in college admissions and credit decisions, but its signals are no longer uniform or stable. Those who rely on AP scores must recognize that some exam results reflect not a surge in student learning but a quiet lowering of the bar.

Paul E. Peterson is the Henry Lee Shattuck Professor of Government and Director of the Program on Education Policy and Governance at Harvard University. He is also a Senior Fellow at the Hoover Institution, Stanford University.

