The Turing test was created to distinguish intelligent machines from human thinkers, and the question it raises is nearly as old as computer science itself. Over time, countless variations of the test have focused on one familiar idea: spotting errors, awkwardness, or other signs of imperfection that might reveal a machine behind the answer. That approach is becoming less useful as modern systems grow better at sounding polished and convincing. In this post, I explore a simple extension to the classic test: instead of asking whether an answer is flawless, I ask whether it is stereotypical. My argument is that this shift reveals a more subtle weakness, one that many of today’s leading AI systems still struggle to hide.

The New Test Harness

The core idea behind this new test harness is straightforward: instead of judging an answer in isolation, the evaluator runs it through multiple AI systems and measures how strongly it reflects the same familiar patterns (see the image below). The ...
In mathematics and computer science, asymptotic expansions provide a fundamental framework for managing complexity and reasoning about underlying processes at an appropriate level of granularity. For example, the n-th harmonic number, $H_n$, is a discrete quantity that can be approximated as $H_n = \ln n + O(1)$. This approximation conveys that harmonic numbers grow logarithmically, and also that the omitted terms remain bounded by a constant. When greater precision is required, additional terms may be introduced, for instance Euler’s constant $\gamma$, yielding the refined expression $H_n = \ln n + \gamma + O(1/n)$. A similar idea applies to AI-friendly publications, such as my study helper for An Introduction to the Analysis of Algorithms, Second Edition, by RS and PF. This type of document is built around expandable AI prompts, which function in a manner analogous to asymptotic expansions for quantities of interest. The initial prompt (a solution of...
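
The harmonic-number approximations mentioned above can be checked numerically. The following is a minimal sketch, not part of the study helper itself: the function names `harmonic`, `coarse`, and `refined` are illustrative assumptions, and Euler's constant is hard-coded to double precision. It compares the exact sum with ln n and with ln n + γ, and prints the refined error scaled by n, which should stay bounded if the remainder really is O(1/n).

```python
import math

# Euler's constant (gamma), hard-coded to double precision (assumption: enough for this check).
EULER_GAMMA = 0.5772156649015329


def harmonic(n: int) -> float:
    """Compute H_n = 1 + 1/2 + ... + 1/n by direct summation."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


def coarse(n: int) -> float:
    """First-order approximation: H_n = ln n + O(1)."""
    return math.log(n)


def refined(n: int) -> float:
    """Second-order approximation: H_n = ln n + gamma + O(1/n)."""
    return math.log(n) + EULER_GAMMA


if __name__ == "__main__":
    print(f"{'n':>8} {'H_n':>12} {'H_n - ln n':>12} {'n * (H_n - ln n - gamma)':>26}")
    for n in (10, 100, 1_000, 10_000):
        h = harmonic(n)
        err_coarse = h - coarse(n)      # stays bounded: the O(1) term
        err_refined = h - refined(n)    # shrinks like 1/n
        print(f"{n:>8} {h:>12.6f} {err_coarse:>12.6f} {n * err_refined:>26.6f}")
```

In this sketch the coarse error settles near γ, while the scaled refined error stays close to 1/2, which matches the next term of the standard expansion, 1/(2n).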