Recent remarks by Apple commentator John Gruber labeled Siri’s current abilities as “an unfunny joke,” using its failure to identify the winner of Super Bowl 13 as an example, arguing that such basic information should be easily accessible to any US-based chatbot.
This particular example wasn’t arbitrary: it came from a test run by Paul Kafasis, a friend of Gruber’s, who asked Siri for the winner of each Super Bowl from 1 through 60. The results were far from impressive.
Kafasis documented the findings in a blog entry.
So, how did Siri perform? With the utmost leniency, Siri accurately identified only 20 winners out of the 58 Super Bowls played. This translates to a dismal 34% success rate. If Siri were a quarterback, it would be kicked out of the NFL.
It managed to get four consecutive winners correct (Super Bowls IX through XII), albeit only by counting an answer that was right for the wrong reason. More realistically, it got three in a row correct on three occasions (Super Bowls V through VII, XXXV through XXXVII, and LVII through LIX). At its lowest point, it got 15 consecutive answers wrong (Super Bowls XVII through XXXII).
Siri appears to have a strong preference for the Eagles.
Interestingly, it credited the Philadelphia Eagles with a remarkable 33 Super Bowl victories that they haven’t actually achieved, in addition to the one they do have.
The “correct answer for the wrong reason” scenario refers to Siri being asked about the winner of Super Bowl X. Strangely, Siri responded with an elaborate explanation about Super Bowl IX, which coincidentally had the same winner.
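The tallies quoted above (a success rate, plus runs of consecutive right or wrong answers) are simple to compute once each answer has been marked right or wrong. Here is a minimal Python sketch of that scoring. The function names and the `sample` list are hypothetical and used only for illustration; they are not Kafasis’s actual data or method. Only the 20-of-58 figure comes from his post.

```python
from itertools import groupby


def success_rate(correct: int, total: int) -> float:
    """Return the share of correct answers as a percentage."""
    return 100.0 * correct / total


def longest_streak(results: list[bool], value: bool) -> int:
    """Return the length of the longest run of `value` in `results`."""
    return max(
        (sum(1 for _ in run) for key, run in groupby(results) if key == value),
        default=0,
    )


# Figure quoted in the article: 20 correct answers out of 58 games played.
print(f"{success_rate(20, 58):.1f}%")  # 34.5%, which the post reports as 34%

# Hypothetical per-game results (NOT the real test data), for illustration only.
sample = [True, False, False, False, True, True, False]
print(longest_streak(sample, True))   # 2
print(longest_streak(sample, False))  # 3
```

Feeding in the real per-game results, in Super Bowl order, would reproduce the streak figures Kafasis reports.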
At times, Siri strayed completely off-topic, providing unrelated Wikipedia excerpts instead of answering the question directly.
“Who won Super Bowl 23?”
Bill Belichick holds the record for the most Super Bowl victories (eight) and appearances (twelve: nine times as head coach, once as assistant head coach, and twice as defensive coordinator) by any individual.
Perhaps the usage of Roman numerals is perplexing, leading other AI models to falter similarly? Gruber opted to conduct a few impromptu checks.
I haven’t executed a thorough test from Super Bowls 1 through 60 due to my laziness, but a few random checks indicate that every other question-answering agent I use accurately names the winners.
I tested ChatGPT, Kagi, DuckDuckGo, and Google. All four fared well even with trickier questions about Super Bowls 59 and 60, which haven’t taken place yet. For instance, when asked about the Super Bowl 59 winner, Kagi’s “Quick Answer” states: “Super Bowl 59 is scheduled for February 9, 2025. The game has not occurred yet, so there is no winner to report.”
Super Bowl champions aren’t an obscure topic; consider asking “Who won the 2004 North Dakota high school boys’ state basketball championship?” — a question I just fabricated, and surprisingly, Kagi answered that correctly for Class A, while ChatGPT provided accurate answers for both Class A and Class B, including a link to a video of the Class A championship game on YouTube.
That’s impressive! I chose an obscure state (no offense to North or South Dakotans), a relatively distant year, and the high school sport I was most adept at and fond of. Both Kagi and ChatGPT handled it well. (Kagi earns an A, and ChatGPT receives an A+ for recognizing champions in both classes, plus extra credit for the YouTube links.)
Gruber notes that the older version of Siri (specifically on macOS 15.1.1) performed better. It may seem less capable, since it falls back on the classic “Here’s what I found on the web” response, but at least it directs users to the correct answer. The new Siri does not offer even that.
The updated Siri, now powered by Apple Intelligence™ with ChatGPT integration, often gets the answer completely, yet misleadingly, wrong, which is the worst way to answer incorrectly. It also tends to be inconsistently wrong: when I asked the same question four separate times, I received a different incorrect answer each time. It’s an utter failure.