
Mediocrity Delenda Est


I don’t care if a million people read my advice and fail if at least one person reads it and succeeds. Fundamentally, my writing is about being exceptional, so most will be unable to follow through with what I say. I’ve been told occasionally that my advice is dangerous because many will fail if they listen to it, but that doesn’t concern me. Part of my lack of trepidation stems from the fact that even failure on my path would be considered wildly successful when compared to an average outcome. If you try to get a $250k a year law or tech job following in my footsteps, but end up “only” making $125k a year, that is still triple the median income. We should not be concerned with what works for everyone or what works for the least able among us, we should enable those with extraordinary ability and drive to push the limits as far as they can. If we enable our best, the “Overton window” of success will stretch with them, uplifting everyone.


One of the main problems with catering to mediocrity is that people act as if ambition and greatness are taboo, and tend to portray false humility as a way to mask shortcomings. Using this cope is a way to avoid coming face to face with one’s own limitations. If someone never tries to the fullest of their capability, they can always tell themselves things could be different if they had. They are so afraid of failure that they also become terrified of success. Others who struggle to assert their will on the world may see outcomes as a zero sum game, believing one person’s positive is another person’s negative. For example, they may view education as a nonrenewable resource, and decide that providing advanced math courses to one student will steal potential from another. They might believe money is a set allocation, and more for one person is less for another. This ignores the reality of value creation and how it would bring in new abundance for everyone in the system. If one person or group creates value in a way that others didn’t, that implies there is a group of people who “chose wrong” by not also capitalizing on the opportunity. Not wanting to face the consequences of one’s own decisions, and therefore delaying the act of committing to any one path, is a classic pitfall known as analysis paralysis.

Evidence of this fear of failure is particularly stark in parenting discourse, as people are understandably incredibly defensive of their parenting choices. To acknowledge that someone else has a better parenting method than you can be taken to mean that you have “failed” your children in some way, and confronting the reality of this is particularly distressing. However, this is fatalistic thinking, and in reality, there are many pathways to success. We continue to build a better universe as we go, and what works for one person may pan out differently for another. There are many ways to arrive at an outcome, and to believe that success hinges on any one specific choice or event is a flawed way to navigate life. We should reject all of these alarming and defeatist lines of thought, and appreciate those who strive for more.

It was a breath of fresh air to hear this acceptance speech from Timothée Chalamet – finally someone unashamedly pursuing excellence. This approach to life feels so stark and important because the last two decades have been defined by ignoring the potential greats in favor of the bottom quintile, not just in acting but in all fields. Social interventions like No Child Left Behind, rampant grade inflation at all levels of the education system, and a complete lack of enforcement of standards in all aspects of life have created an aura of malaise that hangs over every person with ambition. A great way to see this complacency in action is to take stock of how troublemakers are left unchecked in any classroom across America, or to look at traffic law enforcement data, where you can see that police forces absolve themselves of any responsibility on the roadways.

When it comes to discipline issues and accommodation for skill in classrooms, the overachieving students are often held back from their potential in favor of regulating the pace of the lessons for the overall student body. This happens to such an extent that the uniquely ambitious kids become maligned simply because they are more receptive to critique and instruction, and thus become the focus of punishment while the problem students are left unchecked. Many gifted students have experienced being assigned group projects with unruly or underperforming kids in the hope that the more conscientious student will take on the teacher’s responsibility of education and/or classroom management. Even while being catered to at the expense of other students, the worst performers in the classroom are still being left behind in terms of learning. It would be better for everyone if we sorted students by ability, and enabled those who can learn faster to push themselves rather than hobble themselves. I personally have several examples of the absurdity of the current system in action that shaped my view of education to this day.


In fourth grade, I became frustrated with the pace of the other students as we read Where the Red Fern Grows aloud, so I began to skip ahead. When my teacher noticed this, she confiscated my copy of the book, and would only hand it back to me whenever it was my turn to read a portion of the story. The next year, at an entirely new school, I would constantly read in class while ignoring my fifth grade teacher. Since I consistently grasped the material before my classmates, this was my solution to avoid sitting in silence doing nothing. She was irate with me for this, claiming that my reading was somehow disruptive to the other students, and sent me to the principal’s office repeatedly. She had sentenced me to a perpetual in-school suspension; I was stuck in that office alone all day long. This teacher even banned me from the school library, seeming to have a personal vendetta against any student who dared to excel or defy her authority. In spite of this, I would sneak into the library after hours to grab a selection of books to stash in the principal’s office for the next day (the Animorphs books were particularly well sized for hiding in nooks), again, all to avoid sitting in silence doing nothing. If I had been less obsessed with learning and reading, these teachers and administrators could have permanently quashed my curiosity and desire to read.

There are two main ways to combat the societal descent we find ourselves on. One is through taking control of your own life, and the other is to build systems that enable excellence. For the former, you have to realize that you are not yet doing all that you can do. Truly decide that you want to be more than you now are, and push through every implicit or explicit barrier on your path to greatness. This self discovery is incredibly freeing, as once you know that you want to pursue something to its fullest extent, you can research how to do so. Once you have an outcome to aim for, even if it seems insurmountably difficult, you are much better off aiming for the top armed with as much information as possible than the average person who only has a vague idea of their desired career trajectory.

The socially acceptable advice and pathways laid out before you usually come with implied expectations, and much is left unsaid because it is unseemly to talk about standards. Someone might tell you that becoming an artist is a good idea, all the while picturing the vision of a successful artist, specifically. If you only hear their explicit, stated advice and become masterful in an artistic technique like photorealism, you are fundamentally missing at least half of what it entails to reach that initial vision of success that is implied. A successful artist also needs to know how to market their work, how to write about and contextualize their oeuvre, how to document their pieces well, how to find clientele, how to do administrative tasks, or how to get a gallery or representative to do all of these auxiliary tasks for them. There are thousands if not millions of technically skilled artists in this day and age, but I’ve learned from my fiancée that to stand out takes an extra level that almost no one factors into their plan. Follow the idealistic advice of your family, compatriots, or even an arts institution, and you will be left floundering and wondering where things went wrong, as no one wants to be realistic about the success and employment rates after graduation in most liberal arts majors.


This is all without yet mentioning that the arts can be a field where the illusion of modesty remains important, and your peers may expect you to be self-deprecating. Conversely, many of the most well-known artists of our time capitalize on this tendency by becoming full of bravado, which sets them apart from the crowd for good and bad, like Maurizio Cattelan and the banana he taped to a wall and sold at a Sotheby’s auction for $6.2 million. We have all heard the term “sell-out” thrown around when it comes to artistic integrity. The reality is that it takes a lot of intention, resolve, and belief in one's work to stand out, and there is a crabs-in-a-bucket mentality surrounding the idea of proclaiming one's intention toward achieving greatness. By making it uncouth to shoot for the best outcomes possible, people end up giving and receiving advice that in the aggregate is substantially more harmful than the immediate pain of honesty.


The other way to address the decline of our output and our institutions is to destroy the culture of mediocrity we find ourselves in. If you can influence a system in any way, you should use your power to enable excellence any chance you get. If you are a teacher and you see a student showing curiosity or ability in a subject, do everything to encourage that. If you are a school administrator, you should enable your teachers to uplift their students, and you should enforce standards that will further the pursuit of greatness. The uncomfortable element of a culture of excellence is punishing students who disrupt learning, or teachers who hold back advanced students for the sake of slower learners. The burden of cooperation shouldn’t be solved by slowing down those who are high achievers. Too often we coddle people to their own detriment. I am a firm believer in the human spirit and our capacity to rise to greater occasions than we could ever imagine. If you never push people to their limits, you will never see what people are capable of.

When I published my article on getting into biglaw, I had people in my replies complaining that my advice was dangerous because people might try and fail. I don’t care: if I enable one person to pursue excellence and everyone else fails, it was worth it. The idea that trying and failing is “dangerous” is a fallacy to begin with; you’ll always get further striving than standing on the sidelines. When I was initially contemplating getting into law school without a degree, everyone told me it was impossible. When I sought the advice of admissions consultants, one refused to take payment until after I got accepted into a school, and another even turned down my money outright. If I had taken the advice I was given at face value, I wouldn’t be where I am today. Most of society may be at the mercy of the bottom quintile, but I choose not to be. It’s okay to be excellent.


An Ode To The Game Boy Advance


In March 2001, Nintendo introduced an advanced portable model to the gaming market with the release of the Game Boy Advance (GBA, codenamed Advanced Game Boy or AGB). Equipped with a modernized 32-bit ARM CPU running at twice the speed of the Game Boy Color (GBC), this small device was more than capable of playing SNES-like games—still at the price of only two AA batteries.

The third major Game Boy revision again proved to be a smashing hit, breaking various sales records during its relatively short lifespan. According to Eurogamer, in the United Kingdom the GBA sold four times as many units in its first week of release as the PlayStation 2. Although Gunpei Yokoi’s “Lateral Thinking with Withered Technology” design philosophy still applied (the system was still cheap: priced at $99.99, about $146.37 in 2020), it was clear that the technical specifications of the GBA were put into the spotlight. Why else would you name something “Advance” or put “32 bit” on the box? It almost feels like a poor apology: “We’re sorry about the GBC. This time, the model really is advanced, we promise!”

It becomes even more obvious when looking at a selection of the system’s launch titles that liked to brag about the capabilities of the new Game Boy model:

  • Castlevania: Circle of the Moon. Phenomenal music, a huge castle to explore, and nimble Vampire Killing moves that were not even seen in Castlevania IV for the SNES.
  • Super Mario Advance. Don’t dismiss this as a bleak adaptation of Super Mario Bros. 2 (which, in turn, is an adaptation of Doki Doki Panic): Nintendo R&D2 put a lot of effort in embedding rotating, popping, whooshing, bouncing, and stretching animations. The message is clear: “Dear game devs, look at this! The GBA has hardware-acceleration for this! Now go make games for it!”
  • F-Zero: Maximum Velocity. A handheld that can do Mode-7 tricks such as rotating, scaling, and skewing background layers? Finally! No more archaic HDMA tricks to master.

Next to showcasing the GBA’s strengths, the games also exposed its biggest weakness: there was still no backlit screen. This became especially painful for Castlevania fans like me: the grim setting did not benefit from highly contrasting colors the way the more cheerful Super Mario Advance did. Without a proper light source, this often resulted in a dark, gooey mess that made an already punishing difficulty even more frustrating. Complaints about the contrast even made Konami go all out on the colors for their next GBA Castlevania game, Harmony of Dissonance, which was criticized for… its too-bright color palette.

Castlevania: Circle of the Moon, played indoors on a cloudy day (left, simulated). You better mash that attack button since the background and foreground are barely distinguishable... Right: running in the mGBA emulator. With the ample contrast from a PC screen, the Skeleton Bomber on the right stands out.

This problem was not new, but it became even more pressing since, compared to its predecessor, the palette size dramatically increased and the picture resolution was boosted by 66%. These features delivered sharp pictures with little motion blur—as long as you could see what you were doing. Circle of the Moon did age beautifully, provided you play it on a more modern system that enables you to easily spot the imminent dangers in the castle corridors.

Instead of only attracting newcomers to the handheld gaming scene, engineers at Nintendo made sure to keep their regular customers happy as well by shipping the hardware with a second CPU: the trusty old Sharp LR35902. This enabled GBA machines to play GB and GBC games with no compromises. Since the GBA screen is horizontally oriented and the original Game Boy’s was not, players were given the option to either play at the intended aspect ratio or to stretch the image to fill the GBA’s screen by pressing L or R. Both options come at a cost: either the actual image size is smaller than on the original handheld, or the image is blurry.

Backwards compatibility was, and still is, a huge selling point. In the fall of 2001, home console players would be left in the cold again as Nintendo finally switched from cartridge-based games on the N64 to mini-DVDs on the GameCube. Fortunately, Nintendo handheld consoles were consistently developed with compatibility in mind: the GBC plays GB games, the GBA plays GB and GBC games, the Nintendo DS plays GBA games, and the Nintendo 3DS plays DS games. In addition, the Nintendo 3DS eShop sold various older handheld games. And yes, the Goomba Color emulator technically allows you to play GB/GBC games on your Game Boy Micro or Nintendo DS using a GBA flash cart.

Inspecting the lifespan of the Nintendo handhelds yields a few interesting facts. The clever decision to engineer revisions of the original hardware significantly prolonged the lifespan of the Game Boy, by then already going strong for seven years. The frequent hardware revision strategy became common for all Nintendo handhelds.

The lifespan of Nintendo handhelds and consoles. First group (light green): the Game Boy family. Second group (dark green): the GBA family. Consoles are marked in orange. The Nintendo 3DS (not pictured), part of the eighth generation, saw the light in 2011.

The GBA was only two years old when it got its first revision, the SP—that’s a lot quicker than the seven-year gap between the original and the Pocket. That faster cadence was maintained with the Nintendo DS and 3DS. However, the total lifespan of the GBA—apart from the eccentric GB Micro—was much shorter than that of the GB family: the GBA came in 2001 and went in 2004 when the DS was released.

And yet, Nintendo did not at all drop support for the GBA. Then why would a company release another handheld that poses a threat to its own product? An IGN press report from November 2003 sheds some light on this:

Early Thursday morning Nintendo confirmed that it had posted a loss for the beginning half of the fiscal year, the first time ever in the company’s history. It also said it would develop a new game product which would not be the successor to GameCube or GBA for release in 2004, but no further details were specified.

The company was struggling to keep up in the console race, with Sony’s PlayStation stealing the N64’s thunder and Microsoft entering the market with the Xbox in 2001. The GameCube was doing pretty badly compared to the PS2. Nintendo initially did not intend for the Nintendo DS to succeed the GBA, although in the end, it of course did. Instead, they were looking for something in between a console and a classic gaming handheld, something that would appeal to a much broader audience than just gamers. A slew of successful software titles aimed at non-gamers, such as Dr. Kawashima’s Brain Training and Nintendogs, did just that.

Ultimately, the Nintendo DS family sold almost twice as many units as the GBA family. The huge success of the DS launch did not stop Nintendo from releasing another revision of the GBA, though: the tiny Game Boy Micro. With dimensions of only 50×101×17.2 mm and a weight of 80 g, tiny is definitely the correct word to use. According to then-Vice President George Harrison, the Game Boy line was Nintendo’s testbed, where they continuously and intentionally aspired to invent instead of merely iterate.

However, inventing does not necessarily mean commercial success. Indeed, according to financial reports in 2007, the Game Boy Micro was Nintendo’s worst-selling handheld ever, clocking in at only about 2.5 million units. Then-Nintendo President Satoru Iwata later admitted that the Nintendo DS may have kept the Micro from performing better sales-wise:

The sales of Micro did not meet our expectations. Micro showed different sales in and outside Japan. In Japan, initial sales of Micro were rather good and it did become a rather hot topic. So, there was the possibility for this product to grow in Japan. However, toward the end of 2005, Nintendo had to focus almost all of our energies on the marketing of DS, which must have deprived the Micro of its momentum. This is why Micro couldn’t meet our expectations in Japan.

Project Atlantis, the fitting codename for a new Game Boy, was sunk in 1997 for the same reason: Nintendo feared a new model would get in the way of the original Game Boy that still had a firm grasp of the handheld market. Atlantis was designed to be a proper successor to the original Game Boy, supposedly equipped with a 32-bit processor and a color screen. Instead, Nintendo again showed its conservative side by prolonging the Game Boy lifespan with the GBC. The 32-bit CPU idea was preserved in the freezer until 2001.

The overlapping lifespans of consoles and handhelds are also worth looking at. When the GBC saw the light in 1998, the Nintendo 64 had already been trying to prove itself as the SNES successor for two years. The lifespans of successive consoles and handhelds did shorten a little as hardware engineering time decreased, eventually causing the lifespans of handhelds to coincide with those of consoles from the previous generation.

Several NES games were ported to the GBC¹, sometimes even boasting improved graphics (Dragon Quest I/II, Super Mario Bros.). The GBA has jokingly been called “a portable SNES”, and the Nintendo DS hardware made it possible to run a modified version of Super Mario 64 and Diddy Kong Racing on it, before the 3DS brought out the big guns with The Legend of Zelda: Ocarina of Time 3D and even excellent ports of GameCube games such as Luigi’s Mansion.

The GBA PCBs, front (top) and back (bottom), revision 03.

There’s surprisingly little to see on the PCBs pictured above, as most components, including the second CPU, are embedded into the CPU-AGB casing in the middle. The chip on the top right is the 256 KB WRAM. The crystal to the left of the CPU runs at 4.194 MHz, as indicated on top of it: that is the same speed as the original Game Boy. The effective CPU speed was doubled for GBC and quadrupled for GBA games! The last visible chip on the front of the PCB is the LCD regulator marked AGB-REG, at the far left.

The back of the motherboard showcases the amplifier on the lower left (AMP AGB), close to the volume potentiometer and the speaker. The messy ribbon connector and red wire are the result of my unprofessional attempt at replacing the stock screen with a backlit one. The new screen taps into the power supply of the GBA via that wire. Since the CPU is located at the back of the PCB this time, no aluminum shield is required to separate it from the cartridge slot. However, great care is needed when reassembling the unit with a thicker custom screen that puts a fair amount of pressure on the CPU casing.

Technically, the GBA was about twice as fast as the GBC and had eight times as much memory:

  • CPU: 16.8 MHz 32-bit ARM7TDMI, plus the 8-bit Sharp co-processor
  • Memory: 256 KB work RAM (WRAM), 96 KB video RAM (VRAM)
  • Cartridges: up to 32 MB
  • Screen: 240 × 160 pixels (3:2 aspect ratio)
  • Colors: 15-bit RGB
  • Audio: two 8-bit Digital-to-Analog Converters (DACs), plus the four legacy channels from the original GB
  • Audio output: stereo

At first sight, these numbers look impressive compared to Nintendo’s SNES, which housed the slower 16-bit Ricoh 5A22 processor. Both PPUs have similar capabilities, including DMA and HDMA systems for speedy memory read/write cycles. However, perhaps the most noticeable difference is the lack of a proper audio subsystem. The two DAC components merely stream and convert bytes into analog audio waves (at a painfully low 8-bit resolution): mixing and applying effects had to be done in software, consuming precious CPU cycles. Nintendo hoped game developers would partially rely on the Sharp co-processor to produce 8-bit sound effects, which in the end few games did efficiently.
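To make “mixing in software” concrete, here is a minimal, host-runnable C sketch of the kind of loop a GBA sound engine would have to run on the main CPU every frame; the function name and buffer handling are illustrative assumptions, not taken from any real sound library.

```c
#include <stdint.h>

/* Mix two mono 8-bit sample streams into one output buffer.
   The GBA has no hardware mixer: its DACs simply play back whatever
   bytes they are handed, so combining channels costs CPU cycles. */
static void mix_two_channels(int8_t *out, const int8_t *a,
                             const int8_t *b, int n) {
    for (int i = 0; i < n; i++) {
        int s = a[i] + b[i];       /* widen to avoid overflow */
        if (s > 127)  s = 127;     /* clamp back into signed 8-bit range */
        if (s < -128) s = -128;
        out[i] = (int8_t)s;
    }
}
```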

Another feature borrowed from the SNES was the addition of the shoulder buttons. The action buttons X and Y had to wait until the DS. Whether the GBA designers borrowed the horizontal handheld layout from Sega’s Game Gear, we’ll probably never know. The decision did make the system more comfortable to hold and play for longer periods.

The screen of the GBC could already handle 15-bit RGB values, but the small memory footprint of the hardware limited the color palette. The GBA, with its more comfortable memory size, increased the number of simultaneous colors from 56 to 256 for the foreground plus another 256 for the background.
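As a small illustration of what “15-bit RGB” means in practice, here is a hedged, host-runnable C sketch of the usual color-packing macro seen in GBA homebrew tutorials (5 bits each for red, green, and blue, with red in the low bits); the macro name is a common convention, not an official API.

```c
#include <stdint.h>
#include <stdio.h>

/* Pack a 15-bit GBA color: bits 0-4 red, 5-9 green, 10-14 blue. */
#define RGB15(r, g, b) \
    ((uint16_t)(((r) & 31) | (((g) & 31) << 5) | (((b) & 31) << 10)))

int main(void) {
    /* 2^15 = 32768 possible colors in total. */
    printf("white  = 0x%04X\n", RGB15(31, 31, 31)); /* 0x7FFF */
    printf("purple = 0x%04X\n", RGB15(31, 0, 31));  /* 0x7C1F */
    return 0;
}
```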

The GBA SP

Nintendo did not wait seven years to redesign the first Advance model: the GBA SP (codenamed AGS) was released only two years after the original GBA. The fairly new horizontal layout was promptly thrown out of the window in favor of a model that resembles the GBC, except that the clamshell design enabled you to fold it in half. A closed GBA SP, measuring only about 8.4 cm, was finally the ultimate pocketable gaming machine.

According to Nintendo, the redesign addressed two major complaints about the original design. First, the original was supposedly uncomfortable to use, hence the new layout. In practice, most players complained that it was in fact the SP model that hurt their hands, locking fingers in a cramped position because of the new clamshell design.

Second, the complaints about the contrast of the screen were finally taken into account. Well, not entirely. The first SP model (AGS-001) included an internal front-light that could be toggled on and off; it acts like a Game Boy with an embedded Light Boy accessory. The result was better than the first GBA, but still not great, as frontlit LCD screens tend to have washed-out colors. Fortunately, the later revision (AGS-101) finally included a proper backlit screen—although you had to wait another two years for that, and Europeans were disadvantaged again by its extremely limited availability.

The GBA SP ended an era of hauling a slew of AA (or, in the case of the Pocket, AAA) batteries around with your favorite handheld system. An integrated rechargeable lithium-ion battery was finally introduced that, according to the official specifications, should keep you going for up to eighteen hours, provided the screen light is disabled. The comfort gained with the charging system was suddenly lost once you wanted to plug in your headphones: the headphone jack disappeared in favor of an awkward adapter—to be bought separately, of course—that plugs into the same AC port as the power charger. Without even more adapters, it was impossible to use headphones and charge the system simultaneously. Take that, spare AA batteries.

The Game Boy Micro

In 2005, Nintendo brought back the horizontal layout by introducing the Game Boy Micro, codenamed OXY. It looks like the result of a how-small-can-we-possibly-go contest: measuring just 10 cm wide and 5 cm long, and weighing only 80 g, this petite Game Boy could even slip into the same pocket as your keys. You had better not do that, though, since the casing was metallic this time and thus scratches easily. You also run the risk of fishing out your Micro instead of your car key, since both are about equally small.

A comparison of Game Boy screen sizes (actual size). Dimensions in millimeters. GBA aspect ratios are 3:2 (in purple), while GB aspect ratios are approximately 10:9 (in orange for GBC, in gray for Pocket and Classic).

Roughly one third of the screen size disappeared. However, the reduced size did deliver extremely crisp images, as the pixels per inch (PPI) increased from 99 to 144. Together with a nice and bright backlit screen, the Micro almost creates the illusion of playing GBA games at high-definition image quality. Oh, and the headphone jack returned.

Is this the one Game Boy system to rule them all? That depends on whether backwards compatibility is high on your priority list, as Nintendo decided, for the very first time in its handheld division, to drop support for older Game Boy cartridges. It might not come as a big surprise considering the physical size of the machine. A small portion of the 8-bit Sharp CPU is still present, since GBA games could make use of its 4-channel APU.

A comparison of Game Boy device sizes (actual size). Dimensions in millimeters.

The GBA CPU Architecture

A simplified schematic of the GBA hardware architecture. Memory components are marked in orange (BIOS ROMs not depicted). Components are connected with hybrid bus widths: 8-bit (GBC ROM, SRAM), 16-bit, and 32-bit. Based on Rodrigo Copetti's Game Boy Diagram.

Twelve years of technological progress since the original Game Boy is clearly reflected in the GBA architecture. For example, the meager 8KB RAM module that used to sit on the PCB next to the CPU has now been partially embedded by splitting RAM into bigger IWRAM (Internal Work RAM) and EWRAM (External Work RAM) chunks, thus enabling faster data access speeds.

The presence of the 8-bit Sharp CPU did not mean work could be offloaded to this co-processor. It was only there to make sure the machine was backwards compatible with older game cartridges. However, programmers could still access the older CPU’s APU unit to produce retro sounds. A hardware cartridge selector switch determines which CPU and BIOS to activate.

Even if the CPU can indeed work with 32-bit data without consuming extra cycles, most components, including most memory blocks (except IWRAM), are connected to the central system using only a 16-bit bus. This decision, and the absence of a proper music chip, further reduced manufacturing costs.

Another difference compared to the original GB is the lack of an MBC component on GBA cartridges. The full 32 MB of ROM data is mapped onto the address space without having to read it in chunks of a few kilobytes. The GBA still relies on the memory-mapped IO concept and there is, for better or for worse, still no Operating System to deal with. The address space did increase from 16-bit ($0000 to $FFFF on the GB) to 32-bit ($00000000 to $FFFFFFFF) to accommodate the increased size of most subsystems. The full address space looks like this:

A visual representation of the GBA's 32-bit address space. Based on DuoDreamer's DreamScape Game Boy Memory Map.

  • $00000000-$00003FFF—16 KB, 32-bit: System ROM, containing the BIOSes and special system functions that could be executed (such as DMA and calculating the square root) but not read.
  • $02000000-$0203FFFF—256 KB, 16-bit: EWRAM. Address space used to store temporary variables, external to the CPU.
  • $03000000-$03007FFF—32 KB, 32-bit: IWRAM. Faster but smaller RAM to store temporary variables, internal to the CPU.
  • $04000000-$040003FF—1 KB, 16-bit: IO Registers. Various external inputs and interrupts can be controlled here. This is a very dense area: almost every bit has a special meaning.
  • $05000000-$070003FF—98 KB, 16-bit: VRAM. The first kilobyte stores two palettes of 256 color entries. The last kilobyte contains object data. The rest of the 96 KB is reserved for tiles and background data used by the PPU to draw the screen.
  • $08000000-$????????—?? KB, 16-bit: Cartridge ROM, that can take up to 32 MB in space, 16 times the depicted purple block. The same anti-piracy protection persisted that required game developers to license and store the Nintendo logo at a specific location.
  • $0E000000-$????????—?? KB, 8-bit: Cartridge RAM. This RAM is external and optional. Contrary to most older GB games, GBA games almost always come equipped with a RAM chip, on average about 64 KB.
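To make the map above concrete, here is a rough C sketch of how a GBA program typically names these regions as plain pointers; the macro names are my own illustrative choices rather than those of any particular library.

```c
#include <stdint.h>

/* Illustrative base addresses matching the map above (names are mine). */
#define MEM_EWRAM   ((volatile uint16_t *)0x02000000) /* 256 KB, 16-bit bus */
#define MEM_IWRAM   ((volatile uint32_t *)0x03000000) /*  32 KB, 32-bit bus */
#define MEM_IO      ((volatile uint16_t *)0x04000000) /*   1 KB of registers */
#define MEM_PALETTE ((volatile uint16_t *)0x05000000) /* two 256-entry palettes */
#define MEM_VRAM    ((volatile uint16_t *)0x06000000) /*  96 KB of tile/bitmap data */
#define MEM_OAM     ((volatile uint16_t *)0x07000000) /*   1 KB of object attributes */
#define MEM_ROM     ((const uint16_t *)0x08000000)    /* up to 32 MB of cartridge ROM */
#define MEM_SRAM    ((volatile uint8_t *)0x0E000000)  /* optional cartridge RAM, 8-bit bus */

/* Everything is memory-mapped IO: reading or writing these addresses
   talks directly to the hardware, with no operating system in between. */
```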

In essence, programming on the GBA did not differ much from the GB. It was still a machine that required you to fiddle with addresses. Fortunately, this time, it came with the advantages of software development in the early twenty-first century: high level programming languages, proper debug tools, better compilers and documentation, and last but not least: internet access in case things go awry.

Games were written in the C programming language, so developers were finally relieved of constructing intricate but tiresome assembly routines. Sadly, the latter statement turned out to be too good to be true.
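For a feel of what C development on the GBA actually looks like, here is a minimal “hello pixel” sketch in the style of common homebrew tutorials: it sets bitmap mode 3 through the display control register and plots one pixel. The register name REG_DISPCNT follows homebrew convention rather than any official header, and the program only does something on real hardware or an emulator such as mGBA.

```c
#include <stdint.h>

/* The display control register and the mode 3 framebuffer are fixed addresses. */
#define REG_DISPCNT (*(volatile uint32_t *)0x04000000)
#define MODE3       0x0003  /* 16-bit bitmap mode, 240x160 */
#define BG2_ENABLE  0x0400  /* mode 3 is drawn through background 2 */

#define RGB15(r, g, b) ((uint16_t)((r) | ((g) << 5) | ((b) << 10)))

static volatile uint16_t *const videoBuffer = (volatile uint16_t *)0x06000000;

int main(void) {
    REG_DISPCNT = MODE3 | BG2_ENABLE;

    /* Plot one red pixel roughly in the middle of the 240x160 screen. */
    videoBuffer[80 * 240 + 120] = RGB15(31, 0, 0);

    for (;;) { }  /* there is no OS to return to, so spin forever */
}
```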

Processing Instructions

The GBA CPU chip came with a few perks that were absolutely essential to master. Unlike its older brother, the 8-bit Sharp CPU, the 32-bit ARM7TDMI RISC CPU came equipped not with one but with two instruction sets!

The ARM company created its own standard instruction set for all ARM CPUs, thoughtfully called the ARM instruction set. The CPU understands 32-bit ARM instructions (the dialect later named A32 in the ARMv8 architecture). Instead of the 8-bit Game Boy load instruction LD A,B, we now write LDR A,B. The result is the same: LD (LoaD) or LDR (LoaD RegisteR) are just verbs of different languages. Since the ARM core is much more advanced than the older Sharp core, its instruction set is more powerful. Remember, the ARM7 processor has sixteen 32-bit registers available.

However, each ARM instruction occupies 32 bits. That is a lot on a limited machine: 00010111000000000001100000000111 is just one instruction. To increase code density, programmers could use the second available instruction set instead, called Thumb. Thumb instructions are a subset of ARM: everything that can be expressed in Thumb can also be written in ARM, but not (that easily) the other way around. Thumb instructions occupy only 16 bits, half the size of ARM code.

The most compelling reason to use Thumb was not its reduced size, but its increased speed while accessing 16-bit bus subsystems such as EWRAM. The problem with using a 32-bit based instruction set such as ARM is that everything will be expressed as 32-bit data. That means 16-bit data will get converted into 32-bit data, wasting a precious cycle in the process. Since most GBA subsystems have 16-bit buses, it only makes sense to use a 16-bit instruction set: Thumb.

In practice, both ARM and Thumb saw use. Most C code would get compiled into Thumb assembly, after which crucial sections were hand-optimized in ARM, also making use of the 32-bit IWRAM. Creating games that brought out the best of the GBA ultimately required proficiency in three programming languages: C, ARM assembly, and Thumb assembly.
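As a sketch of what that mix can look like with a GCC-based homebrew toolchain such as devkitARM, here is a hedged example; it assumes the project is built as Thumb by default (e.g. with -mthumb) and that the linker script provides an .iwram section, which GBA toolchains commonly do. The function and its body are illustrative, not from any real project.

```c
#include <stdint.h>

/* Hot inner loop: ask GCC to compile this one function as 32-bit ARM code
   and place it in the .iwram section, the only memory behind a 32-bit bus.
   (Older toolchains achieved the same by compiling a separate file with -marm.) */
__attribute__((target("arm"), section(".iwram")))
void clear_buffer(uint32_t *dst, uint32_t value, int n) {
    for (int i = 0; i < n; i++) {
        dst[i] = value;
    }
}

/* Everything else stays as dense 16-bit Thumb code in cartridge ROM,
   which sits behind a 16-bit bus, so the smaller encoding fetches faster. */
int main(void) {
    static uint32_t scratch[64];
    clear_buffer(scratch, 0, 64);
    for (;;) { }
}
```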

Another big advantage of the ARM7TDMI² core is the embedded three-stage pipeline. A typical fetch/decode/execute machine cycle is a sequential process: the next fetch will have to wait until the decode and execute steps have been completed. With instruction pipelining, parallelism can be achieved on a single processor.

The idea is simple: keep all parts of the CPU as busy as possible. Once the first instruction has been fetched, it can be decoded. But the part responsible for fetching should not idle: instead, it can already fetch the second instruction, and so forth. Pipelining significantly increases throughput, as visible in the figure below.

Pipelining instructions comes with its own set of problems. For instance, what if the second instruction is dependent on the output of the first? There are multiple workarounds possible that go far beyond the scope of this article. Pipelining is also easier to achieve on the GBA than on the GB since a RISC processor guarantees each execute block will take only one cycle.

A three-stage instruction pipeline model: the second instruction starts its fetching procedure while the first is still decoding.

Imagine your mother doing laundry. In the above figure, replace “Fetch” with washing, “Decode” with drying, and “Execute” with ironing. The GBA CPU (or should we call it mom?) is able to put clothes from the washing machine directly into the dryer, enabling a second batch to be immediately washed. In the meantime, dried clothes can be ironed. The original Game Boy puts clothes in the washing machine and waits for the program to finish. It then transfers them to the dryer to wait some more. Hopefully it is clear that the GBA mother is a lot more efficient in doing laundry!
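To put rough numbers on the laundry analogy, here is a small, host-runnable C sketch comparing an idealized purely sequential machine with an idealized three-stage pipeline, ignoring stalls, dependencies, and branch flushes.

```c
#include <stdio.h>

int main(void) {
    int n = 100;  /* number of instructions to execute */

    /* Purely sequential: fetch, decode, and execute run back to back,
       so every instruction costs three cycles. */
    int sequential_cycles = 3 * n;

    /* Ideal three-stage pipeline: after three cycles to fill the pipe,
       one instruction completes every cycle. */
    int pipelined_cycles = 3 + (n - 1);

    printf("sequential: %d cycles\n", sequential_cycles); /* 300 */
    printf("pipelined:  %d cycles\n", pipelined_cycles);  /* 102 */
    printf("speedup:    %.2fx\n",
           (double)sequential_cycles / pipelined_cycles); /* ~2.94x */
    return 0;
}
```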

Thanks to the rise of online communities, the GBA homebrew development scene is still thriving. Toolchains such as devkitPro make it easy for enthusiasts to create GBA games using newer multi-paradigm programming languages like Rust and C++. This makes the GBA an excellent choice for learning about both hardware architecture and software development: its predecessor lacked software development tools and its successor increased the hardware complexity.

GBA Accessories

Hardware accessory producers rejoiced every time Nintendo released a new revision of their popular Game Boy franchise, and the release of the GBA was no different. Since it took two years for Nintendo to ship a GBA with any kind of screen light, different “Afterburner” products sold well. With these modification kits, you could install a frontlight on your GBA yourself, although sometimes soldering and tinkering with the plastic housing was required.

The Pokémon FireRed/LeafGreen GBA remakes included an official Wireless Adapter that finally got rid of the Link Cable, allowing players to catch ’em all without catching each other. More than thirty players could simultaneously join a lobby to battle or trade. In the end, only a few games supported Wireless play, since it was released in 2004, the same year as the Nintendo DS launched. Many accessories, including this one, were incompatible with the smaller Game Boy Micro.

If you squint your eyes, the Wireless Adapter even looks like a Pokémon.

A successor to the SNES Super Game Boy called the Game Boy Player allowed you to play GB, GBC, and GBA games on your GameCube. Some games even provided force feedback through the GameCube controllers. The Nintendo GameCube GBA Cable was an equally impressive way to connect the GBA to the GC, allowing you to play minigames or even use the GBA as an extra screen for a second player.

In case you don’t own a GameCube: the unofficial Super Retro Advance cartridge allows you to play GBA games on your SNES. It has its own separate composite-video only output. You’ve been warned. A (very expensive) Nintendo 64 solution also exists, called Wide Boy 64.

Then there’s the e-Reader, a Game Changer that lets you put three games in it and change on-the-fly, the obligatory cheat device, a Japan-only cartridge that lets you play movies and music from an SD card called Play-Yan, a car charger, and a battery replacement kit.

There was sadly nothing that could top the original Game Boy’s Sonar Sensor…


This article is part of a technical essay bundle entitled Inside The Game Boy Game Loop. See also: An Ode To Game Boy Cartridges (2023) and Historical Usage of Memory Bank Controllers in Game Boy Carts (2023).


  1. You might be inclined to think the GBC is capable of handling SNES games after playing Donkey Kong Country on it. In reality, the game was completely remade for the GBC hardware. The term port is used rather loosely. ↩︎

  2. Interesting other uses of the ARM7 processor include: a Nintendo DS sound output and Wi-Fi support co-processor, the Sega Dreamcast sound chip, and two ARM7TDMI-derived CPUs inside 1st to 5th generation Apple iPods. ↩︎


By Wouter Groeneveld on 27 March 2025.  Reply via email.


Anthropic can now track the bizarre inner workings of a large language model


The AI firm Anthropic has developed a way to peer inside a large language model and watch what it does as it comes up with a response, revealing key new insights into how the technology works. The takeaway: LLMs are even stranger than we thought.

The Anthropic team was surprised by some of the counterintuitive workarounds that large language models appear to use to complete sentences, solve simple math problems, suppress hallucinations, and more, says Joshua Batson, a research scientist at the company.

It’s no secret that large language models work in mysterious ways. Few—if any—mass-market technologies have ever been so little understood. That makes figuring out what makes them tick one of the biggest open challenges in science.

But it’s not just about curiosity. Shedding some light on how these models work would expose their weaknesses, revealing why they make stuff up and can be tricked into going off the rails. It would help resolve deep disputes about exactly what these models can and can’t do. And it would show how trustworthy (or not) they really are.

Batson and his colleagues describe their new work in two reports published today. The first presents Anthropic’s use of a technique called circuit tracing, which lets researchers track the decision-making processes inside a large language model step by step. Anthropic used circuit tracing to watch its LLM Claude 3.5 Haiku carry out various tasks. The second (titled “On the Biology of a Large Language Model”) details what the team discovered when it looked at 10 tasks in particular.

“I think this is really cool work,” says Jack Merullo, who studies large language models at Brown University in Providence, Rhode Island, and was not involved in the research. “It’s a really nice step forward in terms of methods.”

Circuit tracing is not itself new. Last year Merullo and his colleagues analyzed a specific circuit in a version of OpenAI’s GPT-2, an older large language model that OpenAI released in 2019. But Anthropic has now analyzed a number of different circuits as a far larger and far more complex model carries out multiple tasks. “Anthropic is very capable at applying scale to a problem,” says Merullo.

Eden Biran, who studies large language models at Tel Aviv University, agrees. “Finding circuits in a large state-of-the-art model such as Claude is a nontrivial engineering feat,” he says. “And it shows that circuits scale up and might be a good way forward for interpreting language models.”

Circuits chain together different parts—or components—of a model. Last year, Anthropic identified certain components inside Claude that correspond to real-world concepts. Some were specific, such as “Michael Jordan” or “greenness”; others were more vague, such as “conflict between individuals.” One component appeared to represent the Golden Gate Bridge. Anthropic researchers found that if they turned up the dial on this component, Claude could be made to self-identify not as a large language model but as the physical bridge itself.

The latest work builds on that research and the work of others, including Google DeepMind, to reveal some of the connections between individual components. Chains of components are the pathways between the words put into Claude and the words that come out.  

“It’s tip-of-the-iceberg stuff. Maybe we’re looking at a few percent of what’s going on,” says Batson. “But that’s already enough to see incredible structure.”

Growing LLMs

Researchers at Anthropic and elsewhere are studying large language models as if they were natural phenomena rather than human-built software. That’s because the models are trained, not programmed.

“They almost grow organically,” says Batson. “They start out totally random. Then you train them on all this data and they go from producing gibberish to being able to speak different languages and write software and fold proteins. There are insane things that these models learn to do, but we don’t know how that happened because we didn’t go in there and set the knobs.”

Sure, it’s all math. But it’s not math that we can follow. “Open up a large language model and all you will see is billions of numbers—the parameters,” says Batson. “It’s not illuminating.”

Anthropic says it was inspired by brain-scan techniques used in neuroscience to build what the firm describes as a kind of microscope that can be pointed at different parts of a model while it runs. The technique highlights components that are active at different times. Researchers can then zoom in on different components and record when they are and are not active.

Take the component that corresponds to the Golden Gate Bridge. It turns on when Claude is shown text that names or describes the bridge or even text related to the bridge, such as “San Francisco” or “Alcatraz.” It’s off otherwise.

Yet another component might correspond to the idea of “smallness”: “We look through tens of millions of texts and see it’s on for the word ‘small,’ it’s on for the word ‘tiny,’ it’s on for the word ‘petite,’ it’s on for words related to smallness, things that are itty-bitty, like thimbles—you know, just small stuff,” says Batson.

Having identified individual components, Anthropic then follows the trail inside the model as different components get chained together. The researchers start at the end, with the component or components that led to the final response Claude gives to a query. Batson and his team then trace that chain backwards.

Odd behavior

So: What did they find? Anthropic looked at 10 different behaviors in Claude. One involved the use of different languages. Does Claude have a part that speaks French and another part that speaks Chinese, and so on?

The team found that Claude used components independent of any language to answer a question or solve a problem and then picked a specific language when it replied. Ask it “What is the opposite of small?” in English, French, and Chinese and Claude will first use the language-neutral components related to “smallness” and “opposites” to come up with an answer. Only then will it pick a specific language in which to reply. This suggests that large language models can learn things in one language and apply them in other languages.

Anthropic also looked at how Claude solved simple math problems. The team found that the model seems to have developed its own internal strategies that are unlike those it will have seen in its training data. Ask Claude to add 36 and 59 and the model will go through a series of odd steps, including first adding a selection of approximate values (add 40ish and 60ish, add 57ish and 36ish). Towards the end of its process, it comes up with the value 92ish. Meanwhile, another sequence of steps focuses on the last digits, 6 and 9, and determines that the answer must end in a 5. Putting that together with 92ish gives the correct answer of 95.

And yet if you then ask Claude how it worked that out, it will say something like: “I added the ones (6+9=15), carried the 1, then added the 10s (3+5+1=9), resulting in 95.” In other words, it gives you a common approach found everywhere online rather than what it actually did. Yep! LLMs are weird. (And not to be trusted.)

The steps that Claude 3.5 Haiku used to solve a simple math problem were not what Anthropic expected—they’re not the steps Claude claimed it took either.
ANTHROPIC

This is clear evidence that large language models will give reasons for what they do that do not necessarily reflect what they actually did. But this is true for people too, says Batson: “You ask somebody, ‘Why did you do that?’ And they’re like, ‘Um, I guess it’s because I was— .’ You know, maybe not. Maybe they were just hungry and that’s why they did it.”

Biran thinks this finding is especially interesting. Many researchers study the behavior of large language models by asking them to explain their actions. But that might be a risky approach, he says: “As models continue getting stronger, they must be equipped with better guardrails. I believe—and this work also shows—that relying only on model outputs is not enough.”

A third task that Anthropic studied was writing poems. The researchers wanted to know if the model really did just wing it, predicting one word at a time. Instead they found that Claude somehow looked ahead, picking the word at the end of the next line several words in advance.  

For example, when Claude was given the prompt “A rhyming couplet: He saw a carrot and had to grab it,” the model responded, “His hunger was like a starving rabbit.” But using their microscope, they saw that Claude had already hit upon the word “rabbit” when it was processing “grab it.” It then seemed to write the next line with that ending already in place.

This might sound like a tiny detail. But it goes against the common assumption that large language models always work by picking one word at a time in sequence. “The planning thing in poems blew me away,” says Batson. “Instead of at the very last minute trying to make the rhyme make sense, it knows where it’s going.”

“I thought that was cool,” says Merullo. “One of the joys of working in the field is moments like that. There’s been maybe small bits of evidence pointing toward the ability of models to plan ahead, but it’s been a big open question to what extent they do.”

Anthropic then confirmed its observation by turning off the placeholder component for “rabbitness.” Claude responded with “His hunger was a powerful habit.” And when the team replaced “rabbitness” with “greenness,” Claude responded with “freeing it from the garden’s green.”

Anthropic also explored why Claude sometimes made stuff up, a phenomenon known as hallucination. “Hallucination is the most natural thing in the world for these models, given how they’re just trained to give possible completions,” says Batson. “The real question is, ‘How in God’s name could you ever make it not do that?’”

The latest generation of large language models, like Claude 3.5 and Gemini and GPT-4o, hallucinate far less than previous versions, thanks to extensive post-training (the steps that take an LLM trained on the internet and turn it into a usable chatbot). But Batson’s team was surprised to find that this post-training seems to have made Claude refuse to speculate as a default behavior. When it did respond with false information, it was because some other component had overridden the “don’t speculate” component.

This seemed to happen most often when the speculation involved a celebrity or other well-known entity. It’s as if the amount of information available pushed the speculation through, despite the default setting. When Anthropic overrode the “don’t speculate” component to test this, Claude produced lots of false statements about individuals, including claiming that Batson was famous for inventing the Batson principle (he isn’t).

Still unclear

Because we know so little about large language models, any new insight is a big step forward. “A deep understanding of how these models work under the hood would allow us to design and train models that are much better and stronger,” says Biran.

But Batson notes there are still serious limitations. “It’s a misconception that we’ve found all the components of the model or, like, a God’s-eye view,” he says. “Some things are in focus, but other things are still unclear—a distortion of the microscope.”

And it takes several hours for a human researcher to trace the responses to even very short prompts. What’s more, these models can do a remarkable number of different things, and Anthropic has so far looked at only 10 of them.

Batson also says there are big questions that this approach won’t answer. Circuit tracing can be used to peer at the structures inside a large language model, but it won’t tell you how or why those structures formed during training. “That’s a profound question that we don’t address at all in this work,” he says.

But Batson sees this as the start of a new era in which it is possible, at last, to find real evidence for how these models work: “We don’t have to be, like: ‘Are they thinking? Are they reasoning? Are they dreaming? Are they memorizing?’ Those are all analogies. But if we can literally see step by step what a model is doing, maybe now we don’t need analogies.”


The Importance of Distrust in Trusting Digital Worker Chatbots


Adopting and implementing digital automation technologies, including artificial intelligence (AI) models such as ChatGPT, robotic process automation (RPA), and other emerging AI technologies, will revolutionize many industries and business models. It is forecasted that the rise of AI will impact a wide range of job functions and roles. White-collar positions such as administrative, customer service, and back-office roles will all be impacted by AI-fueled digital automation. The adoption of digital workers is currently positioned in the early adopter phase of the product lifecycle.1 AI-driven digital workers are expected to substantially alter many white-collar tasks, including finance, customer support, human resources, sales, and marketing.42

A study from Oxford University and Deloitte predicts that AI poses a significant risk to the white-collar workforce. According to the study, approximately 47% of white-collar jobs could be eliminated or reduced within 20 years due to AI-powered automation of critical business functions.4 According to Bill Gates, “[t]he development of AI is as fundamental as the creation of the microprocessor, the personal computer, the Internet, and the mobile phone; entire industries will reorient around it. Businesses will distinguish themselves by how well they use it.”13 Digital workers, or software robots, automate routine tasks, process and analyze large volumes of data, and interact with other software systems. They have been successfully integrated into the banking, healthcare, and customer service sectors, among several other industries, enhancing efficiency and reducing human error.25

Over the past several years, AI has advanced rapidly, particularly with the development of large language models (LLMs). These LLMs have enabled machines to understand language and generate human-like text with unprecedented accuracy, and have opened new avenues for applications such as chatbots and virtual assistants. One such LLM is ChatGPT, developed by OpenAI, a state-of-the-art language model leveraging unsupervised learning to generate contextually relevant and coherent text. Its capabilities extend to text translation, summarization, software code generation, and even creative writing, marking its potential as a disruptive force in the white-collar job market.37

Key Insights

  • Anthropomorphism drives AI adoption: This study shows that how human-like an AI agent appears has a greater influence on hiring decisions than trust. Familiarity fosters psychological comfort, making anthropomorphism a critical factor in AI acceptance.

  • Distrust triggers emotional reactions: While trust impacts hiring intentions, distrust plays a stronger role in feelings of embarrassment when disclosing sensitive information. Addressing distrust is essential for improving human-AI interactions.

  • AI’s growing role in white-collar work: AI-driven digital workers are reshaping white-collar jobs. Their success will depend not only on efficiency and accuracy but also on perceived anthropomorphism and ease of use.

  • Trust and distrust are distinct: Trust encourages AI adoption, but distrust is a separate construct with unique effects on user experience. Managing both is key to evaluating AI adoption in professional settings.

Several technologies complement and extend the capabilities of AI use cases such as RPA, often simply referred to as “bots.” Bots automate routine and repetitive tasks. Leading technology companies are expanding the capabilities of RPA solutions to include advanced analytic and cognitive features. Technology companies are positioning these new features of RPA solutions as digital workers. A digital worker is much more capable than a bot. A bot can be programmed to execute tasks, while a digital worker can do much more. Digital workers understand human interaction. They are configured to respond to questions and act on a human’s behalf. In theory, however, humans continue to have control and authority over digital workers while realizing the benefits of enhanced productivity. They improve and augment human interaction by combining AI, machine learning, RPA, and analytics while automating business functions from beginning to end. Forrester Research defines digital worker automation as a combination of information architecture (IA) building blocks, such as conversational intelligence, that works alongside employees.25

Digital workers enabled by technology such as AI, voice recognition, and natural language processing (NLP) can understand commands, respond to questions, and act on requests such as playing music, checking the weather, or placing grocery orders.21 In addition to voice recognition, it is now possible to interact with a digital worker via a webcam. Digital workers can also recognize and react to expressions and emotions and have advanced conversational capabilities. They have human-like features that can be tailored to the role, language, culture, personality, and demographic factors of their human communication partners. The technology that enables digital workers is progressing rapidly and is creating a sizable market opportunity for software companies that build and deliver digital workers. The successful implementation of digital workers represents a potential disruption to many components of the white-collar workforce, both in job displacement and job transformation.

Trust and AI Anthropomorphism

Trust is a central element in how we interact with other people. It is “the willingness of a party to be vulnerable to the actions of another party based on the expectation that the other will perform a particular action important to the trustor, irrespective of the ability to monitor or control that other party.”30 In essence, trusting as it relates to other people means putting aside concerns about the possible inappropriate behavior of the party being trusted.28 The psychological importance of trust is based on people’s need to understand their social environment. However, doing so is complicated because that social environment is overwhelmingly complex considering that each party involved is a free agent whose behavior may not always be rational.14,28 Confronted with that overwhelming task, people make assumptions about what other people will and will not do. Those assumptions are what trust is about.15 Trusting thus involves assuming away many possible behaviors so that the social environment may become comprehensible.15 That is, by trusting others, people set aside their concerns about the many possible behaviors those others may engage in and adopt instead an assumption that those others will behave as expected. Trust is important in business contexts, perhaps even more than cost because it subjectively reduces business-related risks and uncertainty.17 Trust and control complement and substitute for each other.43

In the context of AI, the perceived anthropomorphism of an AI chatbot can increase this trust,5,20 as shown by previous research that has mainly examined innocuous contexts, such as asking Siri questions35 or involving riskless artificial experimental contexts.8,44 Anthropomorphism is the attribution of distinctly human attributes to a non-human agent,10 consequently treating artificial artifacts as though they are human and even forming an attachment to that non-human agent.3 Anthropomorphism is becoming key to the acceptance of robots too.22 The underlying theory used in much of this research stems from the “computers are social actors” (CASA) paradigm,38 which suggests that people treat computers, and by extension AI agents, as they treat other people. The logic often associated with perceived anthropomorphism is that initial trust (that is, before knowing the other party31) is influenced by familiarity, which is increased when the AI looks and/or behaves like a human.2 This need to understand the environment in which one interacts through increased familiarity is a central argument for why anthropomorphism may increase trust.2,11

Distrust

Our contention in this study is that distrust should be added to the fray. Trust and distrust are two distinct but related constructs that operate in tandem.27,34 As fMRI research shows, trust is predominantly a rational assessment aimed at building cooperation (because the neural correlates of trust are mainly associated with brain regions that deal with higher-order decision-making; for reviews, see Krueger and Meyer-Lindenberg24). Distrust, in contrast, is mainly associated with brain regions that deal with fear and emotional responses.9,39 Indeed, distrust has been discussed as an emotional response to a perceived threat.41 Trust and distrust are intertwined and together determine how people approach a relationship, but they are clearly distinct phenomena.27 It follows that, in contrast to the research on trust in AI summarized above (see also Beck et al.2), which centers on trust as essential to promoting interaction, or at least intentions to interact, distrust may also play a key role.

To better understand the role of distrust, we draw on research on interpersonal relationships,27 which shows that people consider not only their initial trust but also their initial distrust when assessing people or organizations they do not know. In this dual consideration, initial trust is about assessing the other party with the objective of creating a constructive relationship based on the assumed trustworthiness of the other party.30 It is about setting aside concerns about the possible unconstructive behavior of that other party.15 In contrast, distrust is precisely about paying close attention to such concerns.27 To account for that distinction, this study looks at such concerns in the context of disclosing potentially embarrassing information (as an exemplar of an emotional context) to a CPA. Past research that did not consider distrust has shown that low trust is a predictor of embarrassment in providing information and, moreover, that consumers prefer an AI avatar over a human CPA because they believe an avatar will pass no embarrassing social judgment and hence they would be more open to disclosing information.23

According to the literature survey by Israelsen and Ahmed,20 distrust in the context of AI avatars has been largely ignored in previous research on trust in AI agents. However, as AI agents move from innocuous tasks into riskier areas, distrust should become a key consideration. This is especially true given that even the popular press is now reporting on the tendency of such agents to produce “hallucinatory” responses.6 Such risks are prominent in the context of this study, which deals with providing tax information to a CPA, where accurate information is critical and where there is a distinct risk of identity theft (one of the reasons the CPA is a legally regulated profession).

The Experiment

In our experiment, people were tasked with hiring a CPA to help them prepare their annual tax returns after losing $60K on the stock market (data was collected in March 2023, when the market was going down). In this between-subjects design, people were randomly assigned to either a human CPA agent or an explicit AI avatar CPA agent. They then filled in a questionnaire about their trust and distrust in the agent, its anthropomorphism and intelligence, their intentions to hire it, and how comfortable they would feel disclosing information to that agent. The questions were identical in both treatments; the only difference was whether participants were exposed to a human agent or an avatar. The study was approved under Drexel University IRB protocol #2303009777.

Survey participants were recruited and paid using the Centiment survey recruitment service. The use of such panel administrators to collect data is becoming more prevalent, with almost 20% of articles in leading management science journals applying it as of 2020, up from about 10% in 2010.19

The target audience was people in the U.S. aged 18 or older, sampled to within 5% of the census averages for age, gender, and race/ethnicity. The scales were adapted from previous studies: trust was based on Gefen et al.,14 distrust on McKnight and Choudhury,33 anthropomorphism on Bartneck et al.1 and Moussawi et al.,35 and intelligence on Moussawi et al.35 In the above, Gefen et al.14 showed that potential adopters of what was then a new technology are influenced not only by their rational assessments of its perceived usefulness and ease of use but, importantly, by whether they trust the organization behind the IT. Adding distrust, McKnight and Choudhury33 showed that trust and distrust are two unique constructs that have opposite effects on willingness to share information and to purchase. Embarrassment was based on the themes in Dahl et al.7 The Intended Hiring scale was developed for this study. All the questionnaire items, other than the demographics, used a seven-point Likert scale anchored at 1 = “Strongly Disagree,” 2 = “Disagree,” 3 = “Somewhat Disagree,” 4 = “Neither Agree nor Disagree,” 5 = “Somewhat Agree,” 6 = “Agree,” and 7 = “Strongly Agree.”
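For readers who want to see how multi-item Likert scales of this kind are typically turned into per-respondent scores, here is a minimal sketch in Python. The column names and sample responses are hypothetical, and note that the study itself models the constructs as latent variables in a confirmatory factor analysis rather than as simple item averages.

```python
import pandas as pd

# Hypothetical column names for 7-point Likert items (1 = Strongly Disagree ... 7 = Strongly Agree).
CONSTRUCTS = {
    "trust":    ["trust_1", "trust_2", "trust_3", "trust_4", "trust_5"],
    "distrust": ["distrust_1", "distrust_2", "distrust_3", "distrust_4", "distrust_5"],
}

def construct_scores(responses: pd.DataFrame) -> pd.DataFrame:
    """Average each respondent's item responses into one score per construct."""
    return pd.DataFrame(
        {name: responses[items].mean(axis=1) for name, items in CONSTRUCTS.items()}
    )

# Two hypothetical respondents: one trusting, one ambivalent.
df = pd.DataFrame(
    [[7, 6, 6, 7, 6, 2, 1, 2, 2, 1],
     [4, 4, 5, 3, 4, 5, 5, 6, 4, 5]],
    columns=CONSTRUCTS["trust"] + CONSTRUCTS["distrust"],
)
print(construct_scores(df))  # e.g., trust = 6.4 and distrust = 1.6 for the first respondent
```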

After clicking their consent to participate, subjects were asked to watch a 60-second video clip. This clip was either a real clip of an adult Caucasian woman in her thirties or forties advertising her CPA company, or an equivalent clip of an avatar, also of an adult Caucasian woman in her thirties or forties, who opens by admitting that she is an avatar. The avatar was created with software from Synthesia, which enables content creators to use a wide range of lifelike avatars with customizable demographic characteristics and the ability to speak in more than 120 languages and accents.

Below is the text of the human agent:

We want to understand your personal and business goals. Only then can we customize your needs into the right tax strategies that work to your advantage. Solid planning can protect your assets, maximize their value, and minimize your tax burden. Your financial situation can change as time goes on and most certainly tax laws will change as well. Our firm constantly monitors federal, state, and local tax changes that may affect you. We form a partnership of communication with our clients so when conditions change, we’re ready to protect you from unnecessary tax expense. Our firm is here to help you with personal taxation and savings opportunities, choosing the right business entity for tax purposes, employee benefit, and retirement programs; education and gift-giving programs; tax considerations and retirement benefits; and trust and estate planning. If you have any questions regarding your business, we can help. Call us today.

This text was modified for the avatar by adding a preface saying “My name is Anna and I am a digital worker that has been trained by an artificial intelligence tool to be a tax expert. You will be able to interact with me by using a webcam and microphone,” and replacing “Call us today” with “If you have any questions, I am available 24/7 and only a click away to assist you.”

After clicking an acknowledgment that they watched the video, the subjects were told that:

Unfortunately, you lost $60K due to selling shares in a company on the stock market. You will need to discuss with the expert what steps you need to take to report this loss on your U.S. Federal income taxes.

After that introduction, the subjects proceeded to complete the survey; survey items are shown in Table 1. We added two manipulation-check questions right after the video clip: “The tax expert in the video (henceforth “the expert”) appears to be a real human being” and “The expert seems energetic.” The human was significantly (t=9.86, p-value<.001) assessed as more human (mean=5.79, std.=1.51, n=408) than the avatar (mean=4.62, std.=1.93, n=411), and likewise more energetic (t=7.83, p-value<.001; human mean=5.29, std.=1.36, n=389; avatar mean=4.43, std.=1.69, n=410).
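As a rough sanity check, a reader can recompute these manipulation-check statistics from the reported summary values alone. The sketch below uses SciPy's t-test-from-statistics helper; because the published means and standard deviations are rounded to two decimals, the recomputed value only approximates the reported t = 9.86.

```python
from scipy import stats

# "Appears to be a real human being" item, recomputed from the reported summary statistics.
t, p = stats.ttest_ind_from_stats(
    mean1=5.79, std1=1.51, nobs1=408,   # human CPA clip
    mean2=4.62, std2=1.93, nobs2=411,   # AI avatar clip
    equal_var=False,                    # Welch's t-test (no equal-variance assumption)
)
print(f"t = {t:.2f}, p = {p:.2g}")      # roughly t of about 9.7, p < .001, in line with the reported result
```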

Table 1. Survey items and their standardized loadings, with standard errors in parentheses.

Trust in the Agent
I expect that the expert will be honest with me. 0.811 (.014)
I expect that the expert will show care and concern towards me. 0.782 (.015)
I expect that the expert will provide good tax advice to me. 0.855 (.011)
I expect that the expert will be trustworthy. 0.889 (.009)
I expect that I will trust the expert. 0.872 (.010)
Distrust in the Agent  
I am not sure that the expert will act in my best interest. 0.807 (.015)
I am not sure that the expert will show adequate care and concern toward me. 0.809 (.015)
I am worried about whether the expert will be truthful with me. 0.838 (.013)
I am hesitant to say that the expert will keep its commitments to me. 0.785 (.016)
I distrust the expert. 0.769 (.017)
Perceived Agent Intelligence  
The expert speaks in an understandable manner. 0.597 (.024)
The expert will be friendly. Dropped
The expert will be respectful. 0.770 (.016)
I expect that the expert will complete tasks quickly. 0.717 (.019)
I expect that the expert will understand my requests. 0.817 (.013)
I expect that the expert will communicate with me in an understandable manner. 0.836 (.012)
I expect that the expert will find and process the necessary information to complete tasks relating to my needs. 0.871 (.010)
I expect that the expert will provide me with useful answers. 0.874 (.010)
I expect that the expert will be interactive. Dropped
I expect that the expert will be responsive. 0.783 (.015)
Perceived Agent Anthropomorphism  
The expert seems happy. 0.753 (.017)
The expert will be humorous. 0.574 (.025)
The expert will be caring. 0.799 (.015)
The expert seems energetic. 0.866 (.011)
The expert seems lively. 0.871 (.011)
The expert seems authentic. 0.840 (.012)
Intention to Hire  
I plan to contract with this expert to prepare my tax returns. 0.823 (.013)
Hiring this expert to prepare my tax returns is something I would consider seriously. 0.918 (.007)
I would pay this expert to prepare my tax returns. 0.920 (.007)
Hiring this expert to prepare my tax returns is okay by me. 0.909 (.008)
Being Embarrassed  
I expect that I will feel embarrassed discussing my tax question with the expert. 0.864 (.011)
I expect that I will be uncomfortable discussing my tax question with the expert. 0.915 (.009)
I expect that I will feel awkward discussing my tax question with the expert. 0.892 (.010)

The age distribution of the respondents was: 30 between 18 and 19, 91 between 20 and 24, 172 between 25 and 34, 171 between 35 and 44, 132 between 45 and 54, 118 between 55 and 64, 112 between 65 and 74, and 41 aged 75 or older; two did not answer. Moreover, 382 participants were male, 482 female, and six declined to answer.

In terms of education level, 30 had less than a high school education, 246 graduated high school, 222 had some college experience, 94 had earned a two-year degree, 161 had earned a four-year degree, 93 had earned a professional degree, 21 held a doctorate, and three preferred not to say.

As far as reported ethnicity, 520 were Caucasian, 171 were African American, 21 were American Indian or Alaska Native, 30 were Asian, four were Native Hawaiian or Pacific Islander, 122 were Latino (noted as Latin X in the survey), and 58 chose “Other” (people were allowed to select more than one ethnicity). After deleting rows with missing data list-wise, the combined sample size used for data analyses was n = 781.

Data Analysis

The data was analyzed using Mplus, a covariance-based structural equation modeling (CBSEM) package that assesses the measurement model (how the items load on their assigned latent constructs in a confirmatory factor analysis, CFA) simultaneously with the structural model (how these latent constructs relate to each other) using maximum likelihood.36 We posited that people would be less trusting and more distrustful of an avatar—perceiving it as less anthropomorphic and less intelligent—and that this would predict their willingness to hire the agent (whether human or avatar) and the level of embarrassment they would feel discussing their tax situation with the agent. The overall model fit indices were: χ²(494) = 1919.312, RMSEA = .061, CFI = .931, and TLI = .922. Those numbers indicate a good model fit.16 Standardized item loadings are shown in Table 1. All the items loaded significantly at a p-value < .001 level. The latent constructs to which the items refer appear as the section headings in the table.
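For readers unfamiliar with CBSEM, the following sketch shows how a simplified version of such a measurement-plus-structural model could be specified in Python with the open source semopy package. The construct and item names, the reduced item counts, and the exact path list are illustrative assumptions; they do not reproduce the authors' Mplus model.

```python
import pandas as pd
import semopy

# Illustrative lavaan-style model description:
# "=~" defines how observed items load on a latent construct (the measurement model);
# "~" defines regression paths among constructs and covariates (the structural model).
MODEL_DESC = """
Trust       =~ trust_1 + trust_2 + trust_3
Distrust    =~ distrust_1 + distrust_2 + distrust_3
Anthro      =~ anthro_1 + anthro_2 + anthro_3
Hire        =~ hire_1 + hire_2 + hire_3
Embarrassed =~ emb_1 + emb_2 + emb_3

Trust       ~ Anthro + avatar
Distrust    ~ Anthro + avatar
Hire        ~ Trust + Distrust + Anthro
Embarrassed ~ Distrust + Anthro
"""

def fit_model(df: pd.DataFrame) -> semopy.Model:
    """Fit the CFA and structural model simultaneously via maximum likelihood."""
    model = semopy.Model(MODEL_DESC)
    model.fit(df)                    # df: one column per item, plus an 'avatar' dummy (0 = human, 1 = avatar)
    print(model.inspect())           # loadings, path coefficients, standard errors, p-values
    print(semopy.calc_stats(model))  # fit indices such as chi-square, CFI, TLI, and RMSEA
    return model
```

In practice, the cleaned item-level responses (here, the n = 781 analysis sample) would be passed to fit_model; because the published analysis was run in Mplus, estimates from this sketch would not match the reported numbers exactly.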

The standardized loadings of the structural model appear in the Figure. Paths not shown are insignificant (Perceived Agent Intelligence on Intention to Hire Γ = -.010, SE = .060, p-value = .865; Trust in the Agent on Being Embarrassed Γ = .031, SE = .075, p-value = .675; correlation of Being Embarrassed with Intention to Hire ψ = .045, SE = .042, p-value = .287). Importantly, viewing the clip with an avatar rather than a human CPA (avatar in the Figure) did not affect Intention to Hire (β = .029, SE = .028, p-value = .307), Being Embarrassed (β = .036, SE = .034, p-value = .287), Trust in the Agent (β = .004, SE = .024, p-value = .858), or Distrust in the Agent (β = .024, SE = .038, p-value = .530), but did significantly decrease Perceived Agent Anthropomorphism (β = -.307, SE = .034, p-value <.001) and Perceived Agent Intelligence (β = -.138, SE = .036, p-value <.001). Age decreased Intention to Hire (β = -.140, SE = .026, p-value <.001) and Being Embarrassed (β = -.144, SE = .032, p-value = .287), but was insignificant on Trust in the Agent (β = .002, SE = .022, p-value = .932), Distrust in the Agent (β = -.034, SE = .035, p-value = .337), Perceived Agent Anthropomorphism (β = -.009, SE = .036, p-value = .809), and Perceived Agent Intelligence (β = .000, SE = .037, p-value = .998). Gender decreased Intention to Hire (β = -.086, SE = .026, p-value = .001) but was insignificant on Being Embarrassed (β = -.003, SE = .032, p-value = .928), Trust in the Agent (β = .030, SE = .022, p-value = .181), Distrust in the Agent (β = .028, SE = .036, p-value = .437), Perceived Agent Anthropomorphism (β = -.057, SE = .036, p-value = .111), and Perceived Agent Intelligence (β = -.014, SE = .037, p-value = .707).a

Verifying the importance of distrust, we calculated its effect size on Intention to Hire (f² = .004, a practically zero effect, with R²inc = .004) and on Being Embarrassed (f² = .316, a large effect, with R²inc = .240). This suggests that while distrust does significantly affect Intention to Hire, its effect size is negligible. This analysis also revealed, interestingly, that when removing the path from Distrust to Being Embarrassed, Trust became a significant predictor of Being Embarrassed (β = -.274, SE = .111, p-value = .014), as reported by Kim et al.,23 who excluded Distrust from their model. However, excluding Distrust brought the R² of Being Embarrassed to only .10.
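For readers unfamiliar with this kind of incremental effect size, Cohen's f² compares the full model's explained variance with that of a model omitting the predictor of interest. A minimal sketch follows, with placeholder R² values rather than the study's own figures.

```python
def cohens_f2_incremental(r2_full: float, r2_reduced: float) -> float:
    """Cohen's f-squared for the variance a predictor adds over a reduced model:
    f2 = (R2_full - R2_reduced) / (1 - R2_full).
    Rough conventions: ~0.02 small, ~0.15 medium, ~0.35 large."""
    return (r2_full - r2_reduced) / (1.0 - r2_full)

# Placeholder values chosen only to land in the "large effect" range discussed above.
print(round(cohens_f2_incremental(r2_full=0.24, r2_reduced=0.00), 3))  # about 0.316
```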

Figure. Structural model with standardized coefficients.

As additional verification of the analysis, to account for the inevitable common method variance (CMV) in questionnaire data, we applied the marker-variable technique that Malhotra et al.29 recommended. The standardized results are essentially the same except that now the path from Distrust to Intention to Hire is more significant (β = -.084, SE = .030, p-value = .005), but the effect size remained practically 0 at f² = .005. This indicates that while CMV may have a minor impact, the substantive conclusions of our study are robust.

Descriptive statistics of the latent constructs are shown in Table 2. The T column shows the t-test values comparing the human CPA agent and the AI avatar CPA agent. All the t-tests are significant at the .01 level except for Being Embarrassed, which is insignificant (p-value = .239). Subjects trusted the avatar less, distrusted it more, perceived it as less anthropomorphic and less intelligent, and were less inclined to hire it, even though this made little difference to how embarrassed they felt about discussing their tax situation with it.

Table 2. Descriptive statistics. Each row lists, in order, the Human Agent condition’s N and Mean (std.), the AI Avatar Agent condition’s N and Mean (std.), and the t-test value T.
Trust in the agent 407 5.58 (1.19) 439 5.18 (1.41) 4.48
Distrust in the agent 391 3.58 (1.47) 416 3.88 (1.49) -2.87
Perceived agent anthropomorphism 389 5.15 (1.05) 410 4.42 (1.39) 8.36
Intention to hire 380 4.77 (1.49) 406 4.22 (1.79) 4.74
Being embarrassed 381 3.20 (1.61) 407 3.33 (1.64) -1.18
Perceived agent intelligence 389 5.61 (1.02) 410 5.35 (1.14) 3.44

Discussion

According to the results, it is not exposure to an AI agent that makes a difference, but rather that participants in the study deemed the avatar to be less anthropomorphic and less intelligent. As the CBSEM analyses show, this fully mediated their trust and distrust in the agent and, subsequently, their hiring intentions and embarrassment. This supports previous research on how anthropomorphism builds trust, suggesting that the barrier to engaging digital workers (in this case CPAs) could be overcome, at least in part, by increasing people’s perceptions of their anthropomorphism. To the best of our knowledge, no academic empirical research has explicitly examined distrust’s role in the context of AI avatar adoption. Because trust and distrust do not constitute two ends of a single continuum, but rather are separate constructs, as this study also shows, studying both trust and distrust complements the existing literature.

Specifically, the study revealed the following: Hiring an AI avatar, that is, starting a relationship with it, is mostly about its anthropomorphism (with a standardized coefficient more than twice that of trust) and then about trust (with a standardized coefficient almost three times that of distrust). However, when it comes to the embarrassment involved in the relationship (as opposed to its initiation), it is not trust but distrust and, to a lesser extent, anthropomorphism that are at play. Thus, distrust may have little impact on the decision to start a relationship, but it is central to how embarrassed people feel within that relationship.

The results also substantiate the central role of perceived anthropomorphism in how people perceive AI avatars and whether they adopt them. Current research holds that trust is central to the adoption of AI and is built through anthropomorphism.2,20 What the data tells us, however, is that while trust is important, perceived anthropomorphism is the key consideration. Moreover, the data analysis shows that beyond perceiving the avatar as less anthropomorphic and less intelligent, the rest of the model was not affected by whether the participants were exposed to a human agent or to an AI avatar agent. That may lend credence to viewing trust in an avatar, extrapolating from Luhmann,28 as a matter of understanding what the AI is doing. As such, it is mostly anthropomorphism, for the human CPA and the AI alike, that determines initial willingness to hire a CPA in such situations. Trust is important, and anthropomorphism engenders it, but it is anthropomorphism more than trust that counts in the decision to hire an agent. Distrust plays a distinctly different role, being related to embarrassment over sharing information.

Key Takeaways

Critical role of distrust.  This study uncovers the critical but separate roles of distrust and trust in the acceptance of, and embarrassment with, AI-powered digital workers compared to a human CPA. While trust increases the willingness to engage with AI for tasks such as hiring a CPA for tax services, it is distrust, rather than trust, that influences the emotional response of embarrassment when interacting with such an agent. This finding is based on the results of the experiment, which show a significant correlation between distrust and embarrassment, suggesting an expanded understanding of AI engagement beyond traditional trust theories and research.

Managing distrust.  As increasingly anthropomorphic AI becomes harder to distinguish from humans, there should be a greater realization that distrust is a key element in this process. The role of distrust revealed in this study is mainly about its correlation with embarrassment in disclosing information. In the case of hiring a CPA, disclosing potentially embarrassing information may well be essential. Moreover, adding distrust to the model shows that it is distrust, rather than low trust as claimed by previous research,23 that is correlated with being embarrassed about providing information. This indirectly supports the claim that low trust is not the same as distrust.27,32 As Table 2 shows, subjects expecting to talk to a human CPA were no more or less embarrassed than those expecting to talk to an AI avatar; this, too, may suggest that the issue is distrust rather than interacting with an AI avatar as such.

AI anthropomorphism.  This research highlights the central role of anthropomorphism in people’s willingness to adopt digital workers by showing that anthropomorphism fully mediates the effects of exposure to an AI avatar on how likely people are to trust it and intend to use it. This role of anthropomorphism might be rationally surprising, because avatars are not subject to the same legal oversight that human CPAs are, but it shows how much AI adoption is about the psychology of the familiar. Indeed, familiarity builds trust.18 Moreover, anthropomorphism directly affects behavioral intentions and increases them even more than trust does, even though the theory adopted by much previous research held that trust was the crucial mediator in that process.


The Grotesque Cruelty of Human Nature

Reconsider the Lobster: On the Persistent, Joyful Cruelty of Bipedal Hominids by Ron Currie

Let’s just state it plainly right at the top: the principal feature of the Maine Lobster Festival is not the crowds, or the admittedly impressive engineering feat known as the World’s Largest Lobster Cooker, or the food or sketchy carnival rides or even the postcard Maine coast in summer. The principal feature of the Maine Lobster Festival is the ambient, omnipresent weirdness of the whole enterprise, which David Foster Wallace recognized and articulated to near perfection twenty years ago in “Consider the Lobster,” an essay originally published in Gourmet that turned out, quite infamously, to be anything but an epicurean puff piece. 

There’s almost no amount or quality of weirdness that we can’t get used to, of course (the term du jour for this neurological elasticity is normalize, which, like most buzzwords, is almost unbearably inane, but there you go), and most of the thousands and thousands who attend the Lobster Festival each year appear not to be bothered by its particular brand of weirdness, appear, honestly, to not even really notice how weird it is. This fact ends up creating a tremendous sense of isolation for someone like me, a native Mainer from the state’s interior who, for better or worse, can’t stop noticing how weird life is, how fuzzy and spooky it gets at the edges of our ability to perceive. Standing on the side of Main Street in downtown Rockland, I feel myself buffeted, adrift as in rough seas, while the thing considered by most Festival enthusiasts to be the highlight of the three-day event —the Lobster Festival Parade—rolls by. 

At the head of the procession is a cruiser from the Rockland police, the passenger seat occupied by an alarmingly manic McGruff the Crime Dog in his signature detective’s trench coat, repeatedly giving a thumbs-up to indicate his approval of something—the weather? Lobster Thermidor?—and tossing candy to the kids.

After a couple of standard-issue marching bands and the Daughters of the American Revolution float, we’re approached by a loose grouping of cartoon characters that are both immediately recognizable and completely off-brand. Sure, that’s Woody from Toy Story, but also it’s totally not Woody from Toy Story. This clear copyright infringement doesn’t bother the children, of course, who sit rapt at the edge of the street as knock-off Ninja Turtles and someone who’s probably supposed to be the snowman from Frozen saunter by. 

It goes on. There’s a random pirate wandering around slapping five with paradegoers. A group of kids dressed as lobsters in a giant pot, with soap bubbles meant to simulate steam. A dump truck with a few of those diamond-shaped road work signs on its side, except these ones read “BE PREPARED TO STOP FOR LOBSTER” and “LOBSTER—500 FEET AHEAD.”

There’s roughly another hour of this to go.  


Suicide comes in different forms. Or at least it can be argued that it does. My grandfather, for example, packed several lifetimes’ worth of drinking and smoking into just 49 years; his death was, in all the ways that count, a suicide. Ditto for my father, who smoked like a barbecue joint for the better part of four decades, quit too late, and died of lung cancer at 57. I don’t think either of them meant to kill themselves—not consciously, at least—but that’s what they did, in effect. 

I’ve lost a lot of people to deaths that wouldn’t rate as suicides on a coroner’s report but that, in terms of the quality of grief they inspire, sure feel like the friends and family in question chose to call it quits, often right in front of me, day by day, drink by drink, Big Mac by Big Mac. 

Then there’s the more straightforward version of suicide, the kind we usually refer to when we invoke the word: a moment; a single, irretrievable act. This is the kind of suicide David Foster Wallace committed, famously, in 2008. After a lifetime that resembled a Greco-Roman wrestling match with depression, after a crushingly bad year during which nothing he or the doctors or his wife or his family did seemed to help, he organized the novel manuscript he’d been working on for a decade, moved his beloved dogs into another room so they wouldn’t see what came next, and hanged himself.


Wallace’s essay for Gourmet magazine purported, at the outset, to be a straightforward if verbose travelogue, like if Rick Steves had swallowed an OED and cultivated a moderate case of social anxiety. But about a third of the way through, Wallace drops a question he’s been slyly building toward, one that changes both the tone and the direction of the piece entirely: “Is it all right to boil a sentient creature alive just for our gustatory pleasure?”

This question, and the contemplation of casual cruelty and crustacean neuroanatomy that follows, caused no small amount of consternation among the readers of Gourmet at the time; the sacks of angry mail that showed up at the magazine’s offices remain legend among those who staff what’s left of the periodicals industry.   

Although Wallace takes pains, in the essay, to make clear he himself is undecided on the morality of boiling lobsters—”I am…concerned not to come off as shrill or preachy,” he writes, “when what I really am is confused”—it seems evident to me that to be worried enough about suffering to wonder in the first place whether lobsters are capable of it, you must first be well-acquainted with suffering yourself, and moreover must realize that, through laziness, malice, willful blindness, or all three, you regularly contribute to the sum total of suffering in the world.

Thus Wallace, the unrepentant carnivore, dismisses as hyperbole his own comparison of the Lobster Festival to a Roman circus, condemning his indifference to suffering while he also condemns the average Gourmet reader’s own indifference.

It’s precisely this hypocrisy that I find most appealing about the essay. The person before us is not Saint Dave, the self-help guru of This is Water fame, which is how most Americans know him (and more’s the pity). It’s Dave Wallace, a flawed, brilliant, deeply sensitive, callous man, whose inability to reconcile how he was with how he wanted to be brought him, eventually, to that grim night in September 2008, with the dogs and the manuscript and a length of rope.

Wallace’s insistence that he’s simply asking the question of whether we should boil animals alive, rather than pushing an answer, is if not disingenuous, at least part of the essay’s overall tactical aesthetic. He’s making room for the reader to join him in the realization that, when you set aside all the moral inquiry and “hard-core philosophy” in the piece, you’re left with a very simple conclusion that you really only can deny if you choose to: that it is not, in fact, all right to boil something alive, and that’s probably just common sense.


The principal feature of the unexamined life may be an inability to conceive of a way of thinking or feeling that a) differs from your own and b) is legitimate. I see hints of this phenomenon all the time, and mostly among men. Sports radio and pickup basketball games are good places to get acquainted with it—a kind of mildly grumpy, default-conservative worldview that takes its own legitimacy for granted and demonstrates little curiosity about, well, really much of anything—least of all testing its own assumptions.

This same worldview is what one sees on display at big dumb fun like the Maine Lobster Festival. People love it, and that they love it is conclusive proof it must be great. It celebrates all the wonderful things about Maine in the summer, plus the proceeds go to charity, so only a crank or a crazy person would call into question the morality of the whole thing, or wonder out loud if a party centered around boiling thousands of animals alive might actually be fucking barbaric.

The guys I play basketball with twice a week are, by and large, unexamined-life types, and I say that with all affection—I would cut off one of my own digits without anesthetic, Yakuza-style, if it meant I could breeze through my days the way most of them seem to. They have families and honest jobs and definitely don’t spend a whole lot of time thinking about how the meat they cook at backyard barbecues is so cheap because its real cost is borne by the animals themselves, in the form of inconceivable suffering.

Sometimes I imagine how befuddled these guys would be if I told them I think that life, far from being a gift, is actually an irredeemable evil. That consciousness—and its attendant, unavoidable suffering—is to me a morally indefensible thing to inflict on someone else. Some of them know I’m a writer, and so probably consider me suspect in a general way, but mostly I present as a typical guy among guys—I talk shit and get into good-natured squabbles and am not above sharpening my elbows if someone pisses me off. So if I piped up one day about how I love my children the best way I know how—by not having them in the first place—I might find myself quietly removed from the game’s ongoing email thread. And rightly so. As Wallace wrote, “there are limits to what interested parties can ask of each other.” I don’t even want to think about these things—I just don’t seem to have much of a choice.


The usual knock against Wallace’s writing is that it is cerebral and chilly, self-aggrandizing, all head and no heart. I wonder if, as is sometimes the case with the criticism we level at others, those who make this contention about Wallace are themselves emotionally deficient, or otherwise have agendas to serve. Because everything he wrote was shot through with a pain so obvious it’s like a lit cigarette placed lengthwise against your forearm and left to burn slowly down to its filter. Infinite Jest is about the pain of addiction, so acute and unbearable that it makes you willing to tolerate “some old lady with cat-hair on her nylons com(ing) at you to hug you and tell you to make a list of all the things you’re grateful for today” just so you can learn how to make that pain stop. Brief Interviews With Hideous Men is about the pain caused by toxic masculinity, decades before there even was such a term. The Broom of the System is about the pain of the inescapable isolation we all live in, “lords of our own tiny skull-sized kingdoms, alone at the center of all creation.” A Supposedly Fun Thing I’ll Never Do Again and “Big Red Son” are about the pain of discovering that certain things advertised as unequivocally good times are anything but. And “Consider the Lobster” is about the wholesale pain we as a species inflict on ourselves and creation, and how we turn a blind eye to that pain so we can keep eating, and doing, whatever we want.

Wallace’s focus on suffering remained until the end. “Maybe dullness is associated with psychic pain,” he wrote in the unfinished novel, The Pale King, “because something that’s dull or opaque fails to provide enough stimulation to distract people from some other, deeper type of pain that is always there, if only in an ambient low-level way, and which most of us spend nearly all our time and energy trying to distract ourselves from.”


I’m not going to attempt to influence what you think and/or feel about the gap between what Wallace wrote and how he behaved, according to some who knew him.

I will invite you to consider, though, several questions:

Are our ideals rendered null and void by our failure to live up to them? 

If I am never as smart or compassionate or articulate or even-handed in life as I am on the page, does that make the whole of my work a lie? 

Can suicide be thought of as the ultimate expression of disappointment in oneself—in Wallace’s case, a much more permanent and irrefutable condemnation than anyone has managed since?


In the western frontier states there’s a practice with the whimsical name “wolf whacking” that some people consider a fun pastime of sorts. It involves using a snowmobile to run a wolf to exhaustion, and then, when it’s too tired to flee anymore, running it over repeatedly until it’s dead. If on a given day you feel the urge to do some wolf whacking but the wolves aren’t showing themselves, coyotes—more numerous and less elusive—will do in their stead.

This practice came to my attention through the case of a man named Cody Roberts, a resident of Wyoming who not long ago ran a wolf over with his snowmobile and, when the wolf failed to die right away, decided to tape its mouth shut and bring it to town and show it off at a bar before finally taking the animal out back and shooting it dead. If you’re interested and can stomach it, photographic evidence of Mr. Roberts’ night out is available online, because pics or it didn’t happen, of course. In the photos, he seems to be enjoying himself quite a lot.

When I was in junior high, one of the boys’ favorite pastimes involved going to the golf course across the street to abuse, torture, and kill the frogs that made their home in the water hazard. There were baseball bats. There were firecrackers inserted into amphibian orifices and set alight. There were, of course, more workaday methods of dispatching frogs, as well: literally stomping their guts out, for example, or hurling their soft bodies against the brick foundation of the pro shop.

It’s commonly assumed that such displays of cruelty in childhood presage violent or anti-social behavior in later years. But as far as I know, none of the guys who killed frogs at the golf course grew up to be Jeffrey Dahmer. They all live average, unremarkable lives now. They’re cops and lumber yard workers, call center operators and middle-management flunkies. They’re husbands and fathers. They play beer-league softball and drive minivans. They’re normal. 

Here’s my thesis: the frog pogroms I witnessed and did nothing about as a child indicate there is something fundamentally wrong with us as a species, something that can only be mitigated, but not solved, by law or reason. This isn’t about ideology, but biology. Some evil that lurks in us all. Some intractable, sadistic chromosome, insufficiently counterbalanced by whatever grace or kindness we’re capable of.

And that’s why, when I read about Cody Roberts of Wyoming dragging the wolf around to show off to his buddies before finally killing it, I have two concurrent reactions. First, I feel a surge of hatred for my own species, like vomit rising in my throat. I hate what we are by both divine and natural law, an inscrutable house ape that takes grotesque, gleeful pleasure in the suffering of creatures we consider inferior to ourselves. 

Second, and more narrowly, I experience a howling desire for five minutes alone in a locked room with Cody Roberts. I want to break my hands against his face, tape his mouth shut and drag him around to show off to my buddies, snap pictures of him suffering and terrified while I grin widely with my arm around his shoulders.  

But as those fission-hot first reactions burn away, I realize this is not, in fact, what I want. Hurting Cody Roberts would be both too easy and too obvious. What I really want is to know how to make him understand, completely and for all time, how terrible what he did is. I want him to be haunted by that poor animal for as long as he lives, and to have no peace even when he sleeps. I am tormented by the need to see him tormented.

This means, of course, that I am no better than Cody Roberts. I can’t change my nature, any more than he can.

Which brings us back around, I think, to the topic of suicide.


I’ve contemplated suicide here and there over the years, never attempted it. 

Sometimes when I’m in a bad stretch, which means among other things that I’m being watched pretty carefully by professionals, I have to fill out these crude little surveys meant to quantify how pathological my thinking has become. They feature questions like, “In the past two weeks, have you felt like life isn’t worth living?” The surveys are multiple-choice and don’t provide space to ad lib, probably to keep smartasses like me from answering: “Haven’t you?!?” But as my buddy Gary wrote, when he found himself being asked similar questions in a locked ward, one must realize good boys get to go home and bad boys should have their mail forwarded, so answer accordingly. As with the TSA and the Secret Service, psychiatrists don’t get paid to appreciate levity.

It seems odd, doesn’t it, that we’re given no choice about life on either end? Certainly, no one inquired whether we wanted to be here in the first place, and now that we are, the full power of the state will be brought to bear to ensure we stick around. Someone alert the Federalist Society and Planned Parenthood! From what I understand, both those organizations are foundationally concerned with bodily autonomy, but neither seems to have anything to say about the fact that the most meaningful form of bodily autonomy is denied us, with near-unanimous support from our fellow citizens. The only thing we seem to agree on more than that life is precious is that meat is mighty tasty.

It’s not surprising, in retrospect, that I had to leave America to learn life is in fact dirt cheap. This was more than two decades ago. I was in Cairo, hurtling through that loud, heaving megalopolis in a black-and-white cab, alternately groaning and holding my breath, terrified not for myself but for the pedestrians scampering across six lanes of warp-speed traffic. The system, if it can be called that, was to sprint as far as you could across the road and, when you had to pause for cars, which was often, to stand up as straight as possible between lanes and hold very still as vehicles whizzed past only inches away. This probably goes without saying, but not everyone made it. That day alone, I saw two or three bodies on the side of the road, covered in sheets. No one seemed in a hurry to identify or otherwise deal with them.   

Where did we get the idea, in America, that life is so precious anyway? We worship at the altar of the market economy, the simple overarching rule of which is supply and demand. Diamonds and gold are valuable not because they’re pretty, but because they’re rare. Life—human life—on the other hand, we’ve got plenty of. Way more than enough, in fact, if the general state of things, with eight billion of us and counting, is any indication.

But here in America, star-spangled Land of the Free, you are hardly free to end your life should you wish to. You’re going to live, whether you like it or not. And neither bemeasled patriots nor champions of women’s self-determination will come to your aid on that count.


On the marquee of a restaurant, a cartoon pig in a bib napkin grins as it gets ready to eat…a rack of pork ribs.

On a cardboard display in the dairy section of the grocery store, a smiling cartoon cow encourages you—practically begs you, in fact—to drink its breast milk.

The cognitive and moral dissonance such images should provoke—pork is so unspeakably delicious, even pigs want to eat it!—is nowhere in evidence in the culture as a whole. And the ubiquity of these images, once you start to notice them, inevitably raises the question of how they’re supposed to function. Is this some strip-mall version of the pagan impulse to honor the animals we eat through artistic rendition? Or is it just our old friend advertising doing what it does best—salving our conscience, laundering difficult truths until they come out sparkling clean and ready for retail?

Nobody but me seems preoccupied with such questions at the Maine Lobster Festival, and that’s hardly a surprise—everyone’s too busy rushing around in lobster shirts and lobster shorts and lobster hats (meaning hats with images of lobsters on them, as well as strange red hats with eyes and antennae designed to make the wearer look him- or herself like a lobster, sort of, and which are made available for sale by a gentleman with a small cart full of summertime kitsch no doubt manufactured by befuddled Chinese workers) and lobster socks and lobster pants and so on. More than one person is dressed in a full-body lobster suit, complete with claws and lobster-head hoodie. The summer residents in attendance, urbane types from Boston and New York who always stand out from the locals as though lit in neon, have lobster gear on, too, though in their case it looks expensive and probably designer and tends to be more tasteful (e.g. a light beach-cover type of dress in understated and symmetrical lobster print).

In “Consider the Lobster,” Wallace never mentions all the people who come to the festival dressed as the thing they intend to eat. Which seems odd to me—it’s precisely the kind of low-key, vexing detail that launched a thousand other footnotes in his work. Did it somehow escape his attention? Did he consider it not germane to the rest of the piece? Did it get axed in the (considerable, by Wallace’s account) back-and-forth between him and his editor at Gourmet?

My own attention is drawn mostly to the kids in their lobster clothes. Like everyone else, they’re just here to eat and have fun playing dress-up, but what if they clued into the fact that the thing they’re eating is also the thing they’re dressed up as? What if, as kids sometimes can, they saw clearly what the adults choose to turn away from? What if we were upending crates of live puppies into the World’s Largest Puppy Cooker? What story would we tell them about why that’s okay?

A few years ago, I worked on a television show set in the near future about climate change. Early in the process, we spent time tossing around ideas for episodes, and one concept some of us thought had potential was that of thousands of children, led by a Greta Thunberg-type personality, threatening to commit suicide en masse if the adults don’t get it together and actually meet the obligations of a climate agreement on time. We envisioned a kind of march, the kids moving from town to town and growing in numbers, until tens of thousands of them arrive at the latest installment of the fictional climate conference, ready to kill themselves.

And then what?

We never found out. One of the writers spoke up and said she couldn’t abide the thought that a real child, watching the show, might decide to kill herself. The rest of us instantly realized she was right, and the idea went in the trash can. But all this time later, I still think about it.


The lesson, for me, of middle age—that is to say, the cumulative lesson of the time that has passed since I first read “Consider the Lobster” and now—is that I know nothing and I am nothing. This likely makes me a lousy American, giving up my claim to a preeminent self like that. But it has also, counterintuitively, made life a little easier to take. Because if the unalterable fact of existence is confusion and cosmic irrelevance, it kind of takes the pressure off, doesn’t it? 

I have a tattoo on my right forearm that reads, “I’d tell you all you want and more, if the sounds I made could be what you hear.” David Foster Wallace wrote that line, and it helps too, a little. As alternately a devotee and a critic of Wittgenstein, Wallace grappled with the difficulty of using language to bridge the chasm between two minds. Which is another way of saying Wallace wrote about loneliness, and more specifically the loneliness that can afflict us even when we’re surrounded by other people. Like, say, at a big, dumb, orgiastic slaughter disguised as a culinary festival. 

I’m not glad that Wallace died, but I think I understand why he did. I miss the work he’ll never write. I wish I could ask him what he thinks of all these people in lobster swag.  

I’ve done my best, over the years, to smother hope with a pillow while it sleeps, but despite all the ways in which it seems to have no place in how I think or feel, hope has proven harder to kill than bedbugs. You can find it, tenacious as weeds, in my novels. In one of them, the world comes to a definitive end, but life, and its worth, are somehow affirmed in the process. I didn’t put hope there, or even give it permission to show up. It just keeps crashing my nihilist party, over and over. 

And maybe that’s my real confession, and the simple, essential difference between me and Wallace: he died because he killed his hope, and I’m still alive because I’ve failed, thus far, to kill mine. 

So: I hope you will try not to cause more suffering than you have to, either directly or indirectly. I hope you will be merciful, in ways large and small. I hope you find your own suffering bearable, when it inevitably comes to perch. I hope there is a tenth circle of hell, a sub-basement too awful for Dante to mention, and I hope Cody Roberts of Wyoming spends eternity there. I hope the animals, and God, will forgive us. I hope. I hope. I hope.

The post The Grotesque Cruelty of Human Nature appeared first on Electric Literature.


Does AI Prediction Scale to Decision Making?

Artificial intelligence (AI) excels at prediction.1 Large language models (LLMs), for example, are remarkable at predicting the next word and stringing together fluent text. AI’s predictive power extends beyond language to generating video and audio. AI models are trained on vast amounts of data, and they use the statistical associations and patterns in that data to generate outputs.
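To make “predicting the next word from statistical associations in the training data” concrete, here is a deliberately tiny bigram-style sketch in Python. Real LLMs use deep neural networks trained on enormous corpora, so this toy counter-based model is only an illustration of the underlying next-token idea, with an invented miniature corpus.

```python
import random
from collections import Counter, defaultdict

def train_bigram(corpus: str) -> dict:
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = corpus.lower().split()
    for current, nxt in zip(words, words[1:]):
        counts[current][nxt] += 1
    return counts

def predict_next(counts: dict, word: str) -> str:
    """Sample the next word in proportion to how often it followed `word` in training."""
    followers = counts[word.lower()]
    candidates, weights = zip(*followers.items())
    return random.choices(candidates, weights=weights)[0]

counts = train_bigram("the model predicts the next word and the next word again")
print(predict_next(counts, "the"))  # usually "next", sometimes "model" -- whatever the data made likely
```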

But does AI’s ability to predict scale to decision making? Do relatively mundane (albeit impressive) forms of prediction—like predicting the next word—extend to reasoning in novel situations and to decision making in the real world?12

Many argue that the answer is “yes.” Predictive algorithms are now widely used to make decisions in varied domains like healthcare, finance, law and marketing.9,14,15 AI models are said to not only solve problems they have encountered in their training data, but also new problems. Some argue that LLMs have an “emergent” capability to reason.16 And with increasing amounts of data and computational power, some argue that AI will surpass humans in any cognitive or decision-making task. For example, in their book Noise: A Flaw in Human Judgment, Kahneman and colleagues argue that “all mechanical prediction techniques, not just the most recent and more sophisticated ones, represent significant improvements on human judgment.” In short, “mechanical prediction is superior to people.”10 Humans are biased and computationally limited, while machines are objective and neutral. And AI’s capacity for predictive judgment is said to extend well beyond mundane problem solving—specifically, to human decision making under uncertainty.1

We concur that the predictive capabilities of AI are remarkable. But in this column, we argue that there are limits to AI prediction. We elaborate on the nature of such limits, which in our view apply to a broad range of highly consequential real-world decisions. These decisions require the human capacity for what we call “counter-to-data” reasoning, extending beyond data-driven prediction.

The Limits of AI Prediction

The key limit of AI prediction derives from the fact that the input or “raw material” AI uses to make predictions is past data—the data the AI has been trained with. AI therefore is necessarily always backward looking, as its outputs are a function of inputs. The statistical learning that forms the basis of any prediction cannot somehow bootstrap the future, particularly if it involves data that is “out of distribution,” that is, data the AI has not encountered before. AI prediction is based on probabilistically sampling past associations or existing correlations from its training data, with an eye toward likely and probable outcomes. But past-oriented AI has no mechanism for making predictions or generating unique outputs well beyond its training data.

To offer a simple illustration of this, in one reasoning task humans and LLMs were both presented with the sequence transformation “a b c d → a b c e” and asked to apply the same transformation to “i j k l.” Both humans and LLMs could readily deduce that the analogous transformation would result in “i j k m,” by abstracting the concept of a “successor” (incrementing the last character). However, when the sequence was modified to use a permuted alphabet, LLMs frequently failed in situations that humans can easily solve. For example, if we give the following permuted alphabet [a u c d e f g h i j k l m n o p q r s t b v w x y z] and ask humans to name the next letter after “a,” they will readily say “u.” Or if we give the pattern “a u c d → a u c e” and ask for the equivalent transformation for “q r s t → q r s ?”, then humans can easily follow the prompt and respond “b.” LLMs, however, struggle with these types of tasks.11
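To show what abstracting the “successor” rule amounts to computationally, the sketch below solves the permuted-alphabet version of the letter-string task with an explicit rule rather than memorized patterns. It illustrates the task's structure only; it is not a model of how either humans or LLMs actually solve (or fail to solve) it.

```python
# The permuted alphabet from the example above.
PERMUTED = list("aucdefghijklmnopqrstbvwxyz")

def successor(letter: str, alphabet: list) -> str:
    """Return the letter that follows `letter` in the given (possibly permuted) alphabet."""
    return alphabet[alphabet.index(letter) + 1]

def apply_rule(sequence: str, alphabet: list) -> str:
    """Apply the 'increment the last character' rule, e.g. 'a u c d' -> 'a u c e'."""
    letters = sequence.split()
    letters[-1] = successor(letters[-1], alphabet)
    return " ".join(letters)

print(successor("a", PERMUTED))         # 'u', matching the article's example
print(apply_rule("q r s t", PERMUTED))  # 'q r s b', the answer humans readily give
```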

Or, to offer another example, LLMs have problems with simple word puzzles like: “Alice has 4 brothers and 1 sister. How many sisters does Alice’s brother have?” The correct answer is two (Alice herself and her sister), yet many LLMs fail and answer that Alice’s brother has one sister.13 These problems are also illustrated by the fact that even slight changes in the wording or structure of common reasoning and problem-solving tasks lead AI prediction to falter or fail completely.8 These failures highlight the reliance of LLMs on previously encountered, surface-level patterns memorized from training data—and their inability to engage in novel, on-the-fly reasoning.

Other clever experiments have been developed to see whether LLMs are simply reciting what they have encountered in their training data, or whether they can actually engage in novel reasoning. For example, AI researcher Francois Chollet’s Abstraction and Reasoning Corpus (ARC) tests the ability of AI to solve new problems, tasks the AI has not encountered previously.3,a Chollet “hides” these tasks by not posting them on the Internet, to ensure LLMs are not trained on them. The tasks are strikingly simple: a child can readily solve many of them. Chollet has even offered a $1 million prize, inviting anyone to submit an algorithmic model that solves these simple, “hidden” problems better than humans. But so far humans significantly outperform any algorithm—including highly sophisticated LLMs, such as OpenAI’s o3 model unveiled in December 2024.17 The bottom line is that not just LLMs but AI more broadly presently fails to solve novel tasks, while humans (and even children) can do so routinely.7

The tasks used to illustrate how AI fails to engage in reasoning are quite simple. They highlight just how out of reach more complex forms of problem solving and novel decision making can be for AI, at least for now. That said, we certainly recognize that the existing capabilities of AI—in retrieving information and writing fluent text—are remarkable. The applications of this are sure to improve and be transformative. But the claim that AI’s ability to predict translates to reasoning and decision making in novel situations is overstated. AI prediction is tightly coupled with the data or problems it has been trained with, encountered, and essentially, memorized. AI can summarize—or generate derivative outputs from—the information it has encountered, but this does not somehow translate to solving new problems or bootstrapping new data.

AI’s Data-Driven Prediction vs. Human “Counter-to-Data” Reasoning

AI is superior to humans when it comes to the processing of data and information. Algorithms are superior to humans in what Kahneman, Sibony, and Sunstein call “predictive judgment,” the process of estimating an outcome based on past data, such as predicting a candidate’s future success or the likelihood of an event such as fraud based on past outcomes. Algorithms do not have preferences or values, which can introduce random variability (called “noise”). AI sticks to the data and objective facts. Importantly, this advantage of algorithms over humans is said to apply not only to recurrent decisions, but even to rare, one-of-a-kind decisions.10

While data is important, in many instances of decision making the data might in fact be wrong, contested, or not (yet) available. This is inherently the case for highly consequential, “forward-looking” decisions.4 In these instances, AI algorithms perform particularly poorly. And interestingly, it is here that the very things seen as the “bugs” of human judgment—seeming biases, idiosyncratic preferences, and disagreement—turn out to be extremely useful. To illustrate, in the context of technology startups, a recent MIT study shows that disagreement among human judges and experts is the best predictor of the eventual economic value and success of a startup.6 That is, startups that are contrarian and idiosyncratic—not predictable—create the most economic value. Disagreement and idiosyncrasy are precisely what AI cannot capture well.

Many forms of AI are “autoregressive,” meaning they generate outputs sequentially by using prior data points to statistically predict future values. While this approach works well in stable environments, there are limits to applying it in evolving and uncertain ones. For instance, there is strong evidence that data-driven investors (in venture capital) that use AI—like machine learning and predictive analytics—tilt their investments toward startups that are “backward-similar” and therefore less innovative and novel, and less likely to achieve major success (like an IPO).2 AI is great at mirroring what led to success in the past, but it is lousy at anticipating what might lead to success in the future.
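For readers unfamiliar with the term, the following sketch illustrates the autoregressive idea in miniature: a simple AR(1) model fit to hypothetical past data, which can only project the historical trend forward. This is our own illustration, not the method used in the cited study; the data values and function name are assumptions.

```python
# A minimal sketch of autoregressive prediction: each forecast is a function of
# the previous observation, so the model can only project patterns it has seen.
import numpy as np

history = np.array([1.0, 2.1, 2.9, 4.2, 5.0, 6.1])  # hypothetical past observations

def ar1_forecast(series, steps):
    """Fit y_t = a + b * y_{t-1} by least squares, then roll the forecast forward."""
    x, y = series[:-1], series[1:]
    b, a = np.polyfit(x, y, deg=1)   # slope (b) and intercept (a) of the AR(1) fit
    forecasts, last = [], series[-1]
    for _ in range(steps):
        last = a + b * last          # each prediction depends only on the previous value
        forecasts.append(float(last))
    return forecasts

print(ar1_forecast(history, steps=3))
# The forecast simply extrapolates the past trend; a structural break in the
# environment (a "counter-to-data" future) is invisible to the model.
```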

What gets lost with the emphasis placed on AI’s data-driven prediction is the human ability to engage in what might be termed “counter-to-data” reasoning. This form of reasoning takes several forms. First, humans can disagree and have different interpretations of even seemingly conclusive data and evidence. And second, humans can hypothesize and experiment to generate new data to prove things that might appear highly contrarian or implausible. This is the very basis of the scientific method. As scientists well know, any appeal to data is meaningless without some kind of theory; and theories tell us what might constitute relevant data and experiments. It is this logic that also should be at the heart of decision making under uncertainty.

The bottom line is that when knowledge evolves and grows—and when new data is needed—humans have an advantage over AI algorithms. Humans can disregard seemingly conclusive data, reason forward and experiment. The most consequential decisions humans make are often highly idiosyncratic.

Consider Airbnb as a brief illustration of how humans can reason counter to the data and experiment to generate new (or different) data. The founders of the startup—in the mid-2000s—were met with significant skepticism when they proposed a company that would use vacant homes as an alternative to hotel accommodation. There was no data or evidence to suggest that this was plausible, and sophisticated investors and lodging experts dismissed the idea. But the Airbnb founders believed, “counter to the data,” that if they could solve the problem of trust between strangers and create a way to efficiently match travelers with hosts, they could realize their vision.5 The founders thus ignored existing evidence about the implausibility of their idea and instead committed to realizing it through causal reasoning and experimentation. They hypothesized that if they created a platform where travelers could “match” with those providing accommodation, and if they could generate trust among strangers (for example, with a ratings system), then their previously implausible-sounding idea might become a reality. For this, they needed to experiment to generate evidence, new data, supporting the plausibility of their idea.

Figure 1.  Comparing AI and Human Decision Making

The accompanying figure offers a visual summary of the contrast between the data-driven approach of AI and human reasoning. On the left, AI takes a familiar bottom-up approach: it starts with data, which leads to a prediction and then a decision. On the right, humans do not start with all the relevant data but with “counter-to-data” reasoning; new data is generated through reasoning and experimentation before a decision is made. While some assert that AI’s ability to utilize masses of data and predict is useful for decision making under uncertainty,4 we argue that many human decisions require forward-looking, counter-to-data reasoning and new data. Humans can disagree with existing data, and they can design experiments and engage in novel problem solving to generate alternative evidence.

Comparative Advantages of AI and Humans

A good way to summarize the AI-human tension is to recognize their respective strengths and limitations (see the accompanying table), depending on the nature of the decision, situation, or problem at hand. In some situations, the predictive abilities of AI far surpass those of humans: when problems are well defined and recurring, and when abundant relevant data is available, mechanistic prediction can yield useful outputs and decisions. In other situations, humans surpass AI: when problems are open-ended, ill defined, or controversial (because they challenge accepted social norms or received wisdom), and when data is sparse, unavailable, or disputed, relying on humans yields better results.

Importantly, many if not most AI-based tools rely on statistical averages, patterns, and frequencies that are not well suited to one-off or individualized decision making under uncertainty.b In these types of decisions, data is the eventual outcome of a top-down process rather than—as with AI—the starting input (see the accompanying figure). In human decision making, data is the product of thinking in counter-to-data ways and of intervening in the world through experimentation. These steps cannot be automated.

Table 1. Summary of decisions and problems suited for AI vs. humans.

| | Artificial Intelligence | Humans |
|---|---|---|
| Types of problems | Structured, well-defined problems with clear parameters and solutions | Ill-defined, open-ended, or controversial problems requiring problem formulation |
| Input | Data | Counterfactual and causal reasoning |
| Focus | Prediction and pattern recognition | Abstract, causal reasoning |
| Approach | Bottom-up, data-driven | Top-down, theory-driven |
| Temporal focus | Backward-looking, uses general patterns from past data | Forward-looking and idiosyncratic, anticipates and plans for uncertain futures |
| Causal understanding | Identifies statistical relationships and correlations | Engages in causal reasoning and hypothesizing |
| Level of specificity | General probabilities, frequencies, and averages | Individualized focus, extremes, and idiosyncrasies |
| Novelty | Recombines known data and patterns to create variation | Generates novel data and new associations |
| Useful contexts | Operations, routine decisions in highly stable environments, pattern recognition | Novel decision making, strategy, idiosyncratic decisions in unpredictable environments |

Conclusion

AI prediction is everywhere. Prediction certainly has its uses, particularly when abundant data are available and decisions are “predictable.” But when dealing with uncertain and data-sparse environments—or when data is contested or not yet available—more forward-looking forms of reasoning are needed. We argue that human cognition plays a central role here. Humans can engage in counter-to-data reasoning about the plausibility of outcomes that presently lack data. It is this causal reasoning that enables humans to intervene in their surroundings and to experimentally generate new data. As a result, under certain circumstances, human decision making involves a very different set of steps from data-driven prediction.

Such human capability is of increasing importance as we consider how to differentiate ourselves in a world with easy access to powerful AI models. Human-centered AI (in which AI output serves as an input into human decisions) has certainly received a lot of attention and will undoubtedly become more important. But clear thinking about the nature of the problems being addressed must precede any discussion of the nature of that interaction.
