Let’s Think Step-by-Step

Last updated: Mon Mar 10 2025

Originally published: Mon Mar 10 2025

Sorry, another LLM newsletter. I promise next week I have a different topic.

[Image: a whiteboard showing sticky notes with ideas for the topic “library”.] But first, some ideation for zine 4!

Recently there’s been some Discourse™️ about the use of LLMs, with some noting a backlash against LLMs (primarily on Bluesky) and others noting a backlash against the backlash. There’s been plenty of newslettering, too, like:

Frankly, a lot of it is exhausting, and some of it seems like folks talking past each other — a great example of the farmer-forager dichotomy I defined a while back. (At one point Bjarnason accuses Sloan of being a bullshitter à la Harry Frankfurt, which is a very farmer-on-forager move!)

That said, I never want to be accused of having underthought things, and while I mostly stand by my last post, it does feel a little too… forager-y?

So here is:

Russell’s List of Every Objection to LLMs (That I Could Think Of)

(No, I didn’t ask Claude to generate a list of examples. I used my squishy wet human brain.)

[Image: a sign reading “no student access”.] This was at a random stairwell at my workplace. 🤷‍♀️

So How Do I Feel?

Uneasy, definitely uneasy.

I don’t think LLMs are at all fake or useless — I listed genuine use cases in the last post!

But I am skeptical of the AI labs’ claims that we’re “just a few months” from a general intelligence, let alone a superintelligence (and that’s without getting into the murky waters of comparing intelligences…). Just today, I found myself nodding along to this bear case for AI progress on LessWrong — it argues we’re likely close to the ceiling on reasoning performance (cf. the disappointing release of GPT-4.5 just last week) and that most of the benchmarks aren’t really measuring genuine intelligence. As the article puts it: “It seems to me that ‘vibe checks’ for how smart a model feels are easily gameable by making it have a better personality. […] Deep Research was this for me, at first. Some of its summaries were just pleasant to read, they felt so information-dense and intelligent! Not like typical AI slop at all! But then it turned out most of it was just AI slop underneath anyway, and now my slop-recognition function has adjusted and the effect is gone.” The fact that I could even use the term “mechanical sympathy” meaningfully in the last post is a clue that this really is “just a machine”.

And, really, most of the non-programming uses I listed in the last newsletter are useful, but not that useful — not multi-percentage-point-GDP-growth useful, certainly, but maybe as useful as a spreadsheet.³ At the end of the day, I have a few niche tasks that I use LLMs for — converting equations into clean LaTeX is pretty darn useful when I need it — but I really don’t use LLMs that often.

So, where do I stand, given the critiques above?

Okay, whew. Hopefully you know that wasn’t written by an LLM because (it’s barely edited and) it sounds a lot like me. I am tired and need to go take a nap. I promise I have a non-LLM topic for next time.

Footnotes

  1. I’ve complained before about folks who brush off the possibility of LLM consciousness using the term “stochastic parrots”. But the original Bender-and-Koller paper, “Climbing Towards NLU”, is great! Still, I stand by what I said — the “stochastic parrot” criticism works for grounded understanding, but not for phenomenal consciousness: LLMs may lack understanding of what they’re manipulating while still being phenomenally conscious. But as I always point out, I’m an illusionist about consciousness and believe the United States is meaningfully conscious — so LLMs being phenomenally conscious doesn’t say much.

  2. If Bender and Koller are right that genuine understanding is impossible to learn from syntax, then how come LLMs are useful at all? How do the new reasoning models reason? I think that means a lot of reasoning just is purely syntactic. If I ask a question like “how much sugar is in 2 tbsp of simple syrup,” you don’t need to know what sugar or simple syrup or tablespoons are “really”; you just need to know how they all relate to each other semantically and then do some probabilistic substitutions.

  3. One of the joys of having a newsletter is to see that I was already asking these questions almost a year ago.
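The substitution picture in footnote 2 can be made concrete with a toy sketch. This is not how an LLM actually works — the relations and numbers below are made up for illustration — but it shows how the sugar question can be “answered” by chaining symbolic relations, with no grounded notion of what sugar or syrup really are:

```python
# Toy sketch (hypothetical relations and made-up numbers): answering
# "how much sugar is in 2 tbsp of simple syrup" by chaining purely
# symbolic relations between tokens.

# Relational facts: (subject, relation) -> conversion factor.
# A system that has only learned these relations never needs to know
# what "sugar" or "simple_syrup" refer to in the world.
relations = {
    ("tbsp", "grams_of_syrup"): 19.0,                   # assumed density
    ("simple_syrup", "sugar_fraction_by_weight"): 0.5,  # assumed 1:1 syrup
}

def sugar_in_syrup(tbsp: float) -> float:
    """Chain two relations: tbsp -> grams of syrup -> grams of sugar."""
    grams_syrup = tbsp * relations[("tbsp", "grams_of_syrup")]
    return grams_syrup * relations[("simple_syrup", "sugar_fraction_by_weight")]

print(sugar_in_syrup(2.0))  # 19.0
```

The point is just that the answer falls out of relations between symbols, plus substitution — an LLM does something looser and probabilistic, but the information it needs is of the same relational kind.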

Reply by email!