Methodology

What Makes a TV Show Intelligent?

TV Intelligentsia — March 2026 — 7 min read

Cosmos scores 200. Reality dating shows score 87. The Wire scores 185. What are we actually measuring, and why does it produce results that feel intuitively right even when they're surprising?

The most common question we get is some version of: who decides this? Fair question. When you see a number — Breaking Bad: 163 — you want to know what's behind it and whether you should trust it.

Here's the honest answer: the IQ Score is a structured framework, not a judgment call. It measures three specific, independently scored dimensions. The final number is a weighted combination of those three scores. The methodology is fully documented, the formula is public, and the scores are reproducible.

But the more interesting question is: why these three dimensions? Why not critical consensus? Why not cultural impact? Why not originality?

The three dimensions

Dimension 1 — 40% weight

Cognitive Stimulation

How much does the show ask of you mentally? This measures narrative complexity, the sophistication of the ideas presented, whether the show rewards close attention and penalises inattention, and how much active engagement it demands versus passive reception. A show that can be watched while doing something else scores low here. A show that requires you to track multiple timelines, interpret ambiguous information, or engage with genuinely complex ideas scores high.

Dimension 2 — 35% weight

Educational Value

What does the show teach? This doesn't only apply to documentaries. A drama can have exceptional educational value if it requires you to build knowledge of history, science, psychology, or social systems to understand it. Shogun (190) requires you to understand feudal Japan. Breaking Bad (163) requires you to understand chemistry, institutional failure, and moral corruption in a way that's genuinely instructive. Educational value is about what you know after watching that you didn't know before.

Dimension 3 — 25% weight

Entertainment Quality

Is it any good? This is deliberately the lowest-weighted dimension — not because entertainment doesn't matter, but because it's the most subjective and the most easily faked. A show can be technically accomplished and emotionally flat. Entertainment quality here means something specific: the quality of the craft applied to delivering the experience — writing, pacing, performance, direction, structure. Not whether it's fun, but whether it's done well.

The formula

IQ = round((c × 0.4 + e × 0.35 + q × 0.25) × 4)

c = cognitive (0–50) | e = educational (0–50) | q = entertainment (0–50) | scale: 0–200

Each dimension is scored 0–50. The weighted combination is then scaled to a 0–200 range. The categories break down as: Masterclass (160+), Stimulating (130–159), Competent (100–129), Passive (70–99), Numbing (<70).
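As a rough illustration of the arithmetic, the formula and the category bands above can be sketched in Python. The function names iq_score and category are illustrative only and are not part of the published methodology, and the sample inputs at the bottom are made up rather than real dimension scores. One caveat: Python's round() sends exact halves to the nearest even number, which may differ from the rounding rule the site applies; this only matters when the weighted total lands exactly on .5.

```python
def iq_score(c: float, e: float, q: float) -> int:
    """Combine three 0-50 dimension scores into a 0-200 IQ Score.

    c = cognitive, e = educational, q = entertainment.
    """
    for name, value in (("c", c), ("e", e), ("q", q)):
        if not 0 <= value <= 50:
            raise ValueError(f"{name} must be between 0 and 50, got {value}")
    # Weighted combination (0-50), then scaled by 4 to reach 0-200.
    return round((c * 0.4 + e * 0.35 + q * 0.25) * 4)


def category(score: int) -> str:
    """Map an IQ Score to the category bands listed in the text."""
    if score >= 160:
        return "Masterclass"
    if score >= 130:
        return "Stimulating"
    if score >= 100:
        return "Competent"
    if score >= 70:
        return "Passive"
    return "Numbing"


if __name__ == "__main__":
    # Hypothetical inputs: low cognitive, low educational, high entertainment.
    score = iq_score(c=20, e=15, q=40)
    print(score, category(score))
```

With those hypothetical inputs the weighted sum is 8 + 5.25 + 10 = 23.25, which scales to 93 and falls in the Passive band.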

Why cognitive stimulation is weighted highest

This is the most debated design decision. Why not weight educational value highest, given that it seems most obviously "intellectual"?

Because education without engagement is inert. A show can contain objectively valuable information and still not produce learning if the viewer isn't cognitively engaged while watching. Cognitive stimulation is the mechanism by which a show's content gets processed and retained. It's the delivery system. Weight it highest, and the score correctly identifies shows that actively develop the viewer's mind rather than just presenting information.

Why entertainment quality is weighted lowest

Not because we don't care about it. We do. But entertainment quality is not a reliable proxy for intellectual value, and we're measuring intellectual value specifically.

Some of the most entertaining television ever made scores in the 110–129 range. It's brilliantly produced, enormously enjoyable, and asks relatively little of you. That's not a flaw; it's a design choice. But it belongs in the Competent category, not Masterclass. The score reflects that accurately.
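To make the effect of the weights concrete, here is a small sketch that derives, from the published formula alone, how many IQ points one dimension point is worth and the most each dimension can contribute. The constant and function names are assumptions made for illustration; only the weights, the ×4 scale, and the 0–50 range come from the methodology.

```python
# Weights, scale factor, and dimension maximum as stated in the formula.
WEIGHTS = {"cognitive": 0.40, "educational": 0.35, "entertainment": 0.25}
SCALE = 4
MAX_DIMENSION_SCORE = 50


def per_point_value(weights=WEIGHTS, scale=SCALE):
    """IQ points gained for each one-point increase in a dimension."""
    return {name: weight * scale for name, weight in weights.items()}


def max_contribution(weights=WEIGHTS, scale=SCALE, top=MAX_DIMENSION_SCORE):
    """Largest number of IQ points each dimension can add on its own."""
    return {name: weight * scale * top for name, weight in weights.items()}


if __name__ == "__main__":
    ceilings = max_contribution()
    for name, value in per_point_value().items():
        print(f"{name}: {value:.1f} IQ points per dimension point, "
              f"up to {ceilings[name]:.0f} in total")
```

Each cognitive point is worth 1.6 IQ points, each educational point 1.4, and each entertainment point 1.0, so the ceilings are 80, 70, and 50 respectively. A show with a perfect entertainment score and nothing else would therefore land at 50, inside the Numbing band.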

What the score doesn't measure

Popularity. Cultural impact. Originality. Critical consensus. Personal taste. Whether it's appropriate for all audiences. Whether it's enjoyable in a specific mood. Whether it has re-watch value.

The score answers one question: what does this show do to your mind while you watch it? That's a narrower question than "is this good", and it produces more useful answers precisely because it's narrower.

Some scores that illustrate the system

The Wire (185) and Succession (162) both score high because they demand active cognitive engagement across multiple characters, institutional systems, and moral frameworks simultaneously. The difference in their scores comes primarily from educational value — The Wire requires deeper knowledge of urban economics, policing, politics, and labour to fully appreciate.

Reality dating shows in the 85–95 range typically score well on entertainment quality (they're well produced and watchable) but low on the cognitive and educational dimensions. The score isn't saying such a show is bad; it's saying the show isn't doing intellectual work.

The full methodology, including sub-metrics for each dimension, is documented at tvintelligentsia.com/methodology. If you think a score is wrong, you can request a re-score — we review every submission.