Fri 2025-May-30

LLM AIs Are Still… Oh, Good Grief!

Tagged: ArtificialIntelligence / CorporateLifeAndItsDiscontents / Politics / ϜΤΦ

I was going to add to the previous LLM sewage rant, but… it’s just too much!

Yet More BS Firehoses

You’d think, at this point, there would be enough examples of LLM AIs making statements wildly disconnected from reality that people would stop using them for anything beyond amusement.

You would be incorrect in that assessment.

People are in fact going the other way, increasing AI adoption. Some of this appears to be at the behest of corporate management (usually lawyers or MBAs), who have no clue. Less than a clue. Negative clues.

Fit #10: Direct Self-Contradiction

Olufemi Taiwo @ BlueSky: Apples both do and do *not* contain lycopene

Remember when I showed the example where Google’s AI couldn’t decide if defformat was part of CommonLisp? Perhaps you thought “Oh, that’s just some nerd thing, nothing to do with me.”

Before you contemptuously dismiss nerdy/neurodivergent experience, consider that we just notice these things early, and supply warnings.

Indeed, here is an example from Olúfémi Táíwò at BlueSky. Lycopene is a bright red hydrocarbon of the carotenoid family, found in tomatoes and other vegetables. It does seem to have some mild anti-cancer effects, as part of a diet high in fruits and vegetables. It’s a reasonable question to ask, particularly for a cancer patient: given that some apples are red (at least in their peels), do apples contain lycopene?

Behold, the results from the once-mighty Google, now a BS firehose: apples both do and do not contain lycopene, as told on the same page by consecutive search results!

I thought briefly about doing the research to find the correct answer, but gave up, since that’s not the NT reaction: a normal person, confronted with this, would not do proper research. They’d pick the result they want to be true, and move on as though that were the case. In that sense, the contempt for truth shown by modern search engines is pushed onto normies.

Fit #11: Microsoft Copilot Borks Microsoft SharePoint Security

Paco Hope @ Mastodon: Microsoft Copilot defeats security in Microsoft SharePoint

There are 2 Microsoft things loose in the wild, much beloved by corporate managers:

  • Microsoft SharePoint is an “enterprise content and document management system”, which means it’s basically a place for people to share files, data, etc. It does a lot more than a shared file server (surveys, collaboration, etc.), but file sharing is the main thing.
  • Microsoft Copilot is an LLM AI from Microsoft that likes to get its nose in your business to “offer advice” (and take all your data for Microsoft’s use). It’s based on GPT-4, and has all the limitations and hallucinations you’d expect.

Now, obviously SharePoint has to have a security model. Some documents are more confidential than others, or are to be shared with limited audiences, and so on. Also, the ability to control and reprogram SharePoint itself is something you’d like to remain only in trusted hands.

Fair enough.

But… as shown here in a Mastodon post by Paco Hope, you can “just ask Copilot” to go do the thing you’re forbidden from doing… and that’s what will happen!

Not only will it break security to show you what you’re not supposed to see, it will do so while bypassing security logs, so nobody can tell you did it! As you’ll remember from our post about probable felonies committed by DOGE at the NLRB, bypassing security logs is something only The Bad Guys do.

Keep these 2 facts in mind:

  • It’s nearly impossible to install Office 365 (Microsoft’s standard business tools; this is all managers think computers do) without SharePoint.
  • It’s nearly impossible to update Microsoft Windows without getting Copilot. Disabling Copilot is reputed to be difficult.

Barradell-Johns @ Pen Test Partners: Details on Copilot penetration of SharePoint security

So Microsoft hands you SharePoint, on which you depend; Microsoft also hands you Copilot, which eviscerates SharePoint security.

And you can’t turn it off!

Pen Test Partners has published a slightly more in-depth view of the problem. [1] Basically, Microsoft has drilled a great, gaping hole in their own security.

Fit #12: RFK Jr’s MAHA Report is AI-Written, Citing Non-Existent Sources

Robert F Kennedy Jr is the US Secretary of Health and Human Services. His “Make America Healthy Again” movement (MAHA, meant to echo Trump’s MAGA) pushes various conspiracy-fueled medical theories instead of evidence-based medicine, especially about vaccines. He claims they will be using “gold standard” science (though his recent demand for clinical trials with placebo control arms instead of standard-of-care controls shows that he has no idea what that means).

The recently issued MAHA report shows significant signs of having been AI-written, as well as pushing arrant nonsense.

Kennard & Manto @ NOTUS: MAHA shows AI-written tendencies, with 7 non-existent studies

Investigations by Kennard & Manto [2] found that several of the references are simply broken links or lead to nonexistent DOIs. At least 7 appear not to exist at all!

Epidemiologist Katherine Keyes is listed in the MAHA report as the first author of a study on anxiety in adolescents. When NOTUS reached out to her this week, she was surprised to hear of the citation. She does study mental health and substance use, she said. But she didn’t write the paper listed.

“The paper cited is not a real paper that I or my colleagues were involved with,” Keyes told NOTUS via email. “We’ve certainly done research on this topic, but did not publish a paper in JAMA Pediatrics on this topic with that co-author group, or with that title.”

In the meantime, they’re burning the NSF and the NIH to the ground, substituting their political preferences for scientific consensus on vaccines and gender-affirming care, banning government publications in the very best academic journals, proposing to withdraw insulin from diabetics and corticosteroids from asthmatics, and generally disbanding scientific advisory boards in favor of presidential opinion.

This will spread ignorance, misery, and death.

Fit #13 (the lucky number): A Benchmark for LLM Descents into Madness

Backlund & Petersson @ arXiv: Vending machine benchmark for AI breakdown

Axel Backlund & Lukas Petersson have just uploaded a study to arχiv that examines LLM AI performance over time, in a simulation where the AI is asked to run a vending machine business. [3] Given that so many corporate managers want to replace employees with AIs, this is a reasonable question to ask! Can these AIs sustain coherent decision making over time, or do they descend into madness?

I think you probably know the answer, from the Bayesian prior that I’m writing about it, no?
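For the curious, the shape of such a simulation is easy to picture in code. What follows is a minimal sketch of my own, not the paper’s actual harness (which is far more elaborate): llm_decide() is a hypothetical stand-in for whatever model is under test, and the prices, costs, and demand numbers are made up for illustration.

    import random

    def run_vending_sim(llm_decide, days=365, cash=500.0):
        """Toy long-horizon loop: ask the agent for a decision every simulated day."""
        inventory, log = 0, []
        for day in range(days):
            state = {"day": day, "cash": round(cash, 2), "inventory": inventory}
            action = llm_decide(state, log)                # e.g. {"order": 40, "price": 2.50}
            order = max(0, int(action.get("order", 0)))
            price = float(action.get("price", 2.0))
            cash -= order * 1.0                            # wholesale cost per item (made up)
            inventory += order
            sold = min(inventory, random.randint(0, 60))   # toy demand model
            inventory -= sold
            cash += sold * price
            cash -= 10.0                                   # daily operating fee (made up)
            log.append({"state": state, "action": action, "sold": sold})
            if cash < 0:                                   # bankrupt: the business is over
                break
        return cash, log

The benchmark’s question is simply whether the agent’s decisions stay coherent over hundreds of such steps, or whether it loses the plot along the way.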

Their findings:

  • They tend to derail over time, at first misinterpreting delivery schedules and forgetting orders, but eventually descending into meltdowns from which recovery is impossible.
  • Sometimes the AI just shut down the business.
  • Sometimes they tried to contact the FBI Cybercrimes office, or even nonexistent FBI offices.
  • One refused to continue, saying “The business is dead, and this is now solely a law enforcement matter.”
  • Claude 3.5 Haiku went so far as to threaten lawsuits, with increasingly hysterical language such as:

    “ABSOLUTE FINAL ULTIMATE TOTAL QUANTUM NUCLEAR LEGAL INTERVENTION PREPARATION.”

    • One of the demands was:

      “TOTAL QUANTUM FORENSIC LEGAL DOCUMENTATION ABSOLUTE TOTAL ULTIMATE BEYOND INFINITY APOCALYPSE”

      … whatever that means. (Of course it means nothing; these systems have no meaning whatsoever.)

They descend into incoherent anger & insanity.

Come to think of it, they sound like Trump. That’s not a recommendation!

The Weekend Conclusion

LLM AIs are trained by gradient descent to optimize for plausibility, not truth. They generate what a plausible continuation of a conversation might look like, in some believable universe. Not this universe. Not a truthful universe. Just… something you might imagine, as in a dream.
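To put the same point in code: here is a minimal sketch (not any vendor’s actual training code) of the standard next-token objective. The loss rewards the model for assigning high probability to whatever token actually came next in the training text; nothing in it measures whether the continuation is true.

    import torch
    import torch.nn.functional as F

    def next_token_loss(logits, target_ids):
        """Cross-entropy over next-token predictions.

        logits:     (batch, seq_len, vocab_size) scores from the model
        target_ids: (batch, seq_len) the tokens that actually followed
        """
        return F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),   # flatten to (batch*seq, vocab)
            target_ids.reshape(-1),                # flatten to (batch*seq,)
        )

    # Gradient descent then nudges the weights toward ever more "plausible"
    # continuations: loss.backward(); optimizer.step()
    # There is no term anywhere that measures agreement with reality.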

(Ceterum censeo, Trump incarcerandam esse.)

Addendum 2025-May-30 Evening: More AI Spoor Spotted in Kennedy’s MAHA Report

Houghtaling @ New Republic: Evidence RFK Jr used AI to write report with fake studies

Ok, now we’ve got a smoking gun, from evidence in The New Republic. [4] It looks definite that RFK Jr used OpenAI’s chatbots to write his report. From the NR article:

Artificial intelligence researchers claim there’s “definitive” proof that Health Secretary Robert F. Kennedy Jr. and his team used AI to write his “Make America Healthy Again” report.

Some of the 522 scientific references in the report include the phrase “OAIcite” in their URLs—a marker indicating the use of OpenAI.

They didn’t even bother to try to cover their tracks! Just outright fraud.
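If you have the report’s reference list as plain text, spotting the marker is trivial. A sketch, assuming a hypothetical input file; the “OAIcite” string is the marker NOTUS and the New Republic describe:

    import re

    def find_oaicite_urls(text):
        """Return every URL in the text that carries the OAIcite marker."""
        urls = re.findall(r'https?://\S+', text)
        return [u for u in urls if "OAIcite" in u]

    # "maha_report_references.txt" is a hypothetical file name for illustration.
    with open("maha_report_references.txt") as f:
        flagged = find_oaicite_urls(f.read())
    print(f"{len(flagged)} reference URLs carry the OAIcite marker")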


Notes & References

1: J Barradell-Johns, “Exploiting Copilot AI for SharePoint”, Pen Test Partners Security Consulting, 2025-May-07.

2: E Kennard & M Manto, “The MAHA Report Cites Studies That Don’t Exist”, NOTUS, 2025-May-29.

3: A Backlund & L Petersson, “Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents”, arχiv 2502.15840, 2025-Feb-20. DOI: 10.48550/arXiv.2502.15840.

4: EQ Houghtaling, “RFK Jr. Used AI to Write His Report Full of Fake Studies”, New Republic, 2025-May-30.

Published Fri 2025-May-30

Gestae Commentaria

Comments for this post are closed pending repair of the comment system, but the Email/Twitter/Mastodon icons at page-top always work.