Let’s Think Step-by-Step

Last updated: 3/10/2025 | Originally published: 3/10/2025

Sorry, another LLM newsletter. I promise next week I'll have a different topic.

A whiteboard showing sticky notes with ideas for the topic "library"
(But first, some ideation for zine 4!)

Recently there’s been some Discourse™️ about the use of LLMs, with some noting a backlash against LLMs (primarily on Bluesky) and others noting a backlash against the backlash. There’s been plenty of newslettering about it, too.

Frankly, a lot of it is exhausting, and some of it seems like folks talking past each other — a great example of the farmer-forager dichotomy I defined a while back. (At one point Bjarnason accuses Sloan of being a bullshitter à la Harry Frankfurt, which is a very farmer-on-forager move!)

That said, I never want to be accused of having underthought things, and while I mostly stand by my last post, it does feel a little too… forager-y?

So here is:

Russell’s List of Every Objection to LLMs (That I Could Think Of)

(No, I didn’t ask Claude to generate a list of examples. I used my squishy wet human brain.)

A sign reading "no student access"
This was in a random stairwell at my workplace. 🤷‍♀️

So How Do I Feel?

Uneasy, definitely uneasy.

I don’t think LLMs are at all fake or useless — I listed genuine use cases in the last post!

But I am skeptical of the AI labs’ claims that we’re “just a few months” from a general intelligence, let alone a superintelligence (and that’s without getting into the murky waters of comparing intelligences…). Just today, I found myself nodding along to this bear case for AI progress on LessWrong, which argues we’re likely close to the ceiling on reasoning performance (cf. the disappointing release of GPT-4.5 just last week) and that most of the benchmarks aren’t really measuring genuine intelligence. As the article puts it:

“It seems to me that ‘vibe checks’ for how smart a model feels are easily gameable by making it have a better personality. […] Deep Research was this for me, at first. Some of its summaries were just pleasant to read, they felt so information-dense and intelligent! Not like typical AI slop at all! But then it turned out most of it was just AI slop underneath anyway, and now my slop-recognition function has adjusted and the effect is gone.”

The fact that I could even use the term “mechanical sympathy” meaningfully in the last post is a clue that this really is “just a machine”.

And, really, most of the non-programming uses I listed in the last newsletter are useful, but not that useful: certainly not multi-percentage-point-GDP-growth useful, but maybe as useful as a spreadsheet. One of the joys of having a newsletter is seeing that I was already asking these questions almost a year ago. At the end of the day, I have a few niche tasks that I use LLMs for — converting equations into clean LaTeX is pretty darn useful when I need it — but I really don’t use LLMs that often.
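(For the curious, here's the kind of conversion I mean — a made-up example, not one from my actual notes: you paste in a half-garbled plain-text equation and get back compilable LaTeX.)

```latex
% Hypothetical input, pasted as plain text:
%   sigma^2 = 1/N * sum_(i=1)^N (x_i - mu)^2
% What the LLM hands back, clean and compilable:
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2
```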

So, where do I stand, given the critiques above?

Okay, whew. Hopefully you know that wasn’t written by an LLM because (it’s barely edited and) it sounds a lot like me. I am tired and need to go take a nap. I promise I have a non-LLM topic for next time.

Reply by email!