The title is a misdirection. The token counts may be higher, but the cost-per-ta...

alach11 · 2026-04-17T21:15:34 1776460534

I ran an internal (oil and gas focused) benchmark yesterday and found Opus 4.7 was 50% cheaper than Opus 4.6, driven by significantly fewer output tokens for reasoning. It also scored 80% (vs. 60%).

stingraycharles · 2026-04-18T00:23:02 1776471782

That’s just adaptive reasoning, not related to the increased tokenizer costs.

simianwords · 2026-04-18T07:56:13 1776498973

Why would I as a user be concerned about one over the other?

stingraycharles · 2026-04-18T08:28:59 1776500939

Because it teaches you cause and effect in terms of costs and quality.

Unless you want to keep complaining about the model being nerfed.

bisonbear · 2026-04-17T18:25:50 1776450350

yep, ran a controlled experiment on 28 tasks comparing old opus 4.6 vs new opus 4.6 vs 4.7, and found that 4.7 is comparable in cost to old 4.6, and ~20% more expensive than new 4.6 (because new 4.6 is thinking less)

https://www.stet.sh/blog/opus-4-7-zod

cced · 2026-04-17T18:47:09 1776451629

So they nerfed 4.6 to make way for 4.7?

Progress. /s

bisonbear · 2026-04-17T18:47:46 1776451666

> they nerfed 4.6 to make way for 4.7?

> Progress. /s

pretty much, lmao. my theory is 4.6 started thinking less to save compute for 4.7 release. but who knows what's going on at anthropic

GorbachevyChase · 2026-04-18T02:33:40 1776479620

A fun conspiracy theory I have is that Mythos isn’t actually dangerous in any serious sense. They just can’t reliably serve a 10T model. So they have to make up a reason to limit customers.

kirubakaran · 2026-04-17T19:33:11 1776454391

"but who knows what's going on at anthropic"

People at Anthropic, of course

dang · 2026-04-17T21:43:16 1776462196

(Submitted title was "Claude Opus 4.7 costs 20–30% more per session". We've since changed it to a (more neutral) version of what the article's title says.)

jofzar · 2026-04-18T00:47:15 1776473235

I think it's time to have previous titles show as an edit * icon that can show the previous title.

This is not the first time that the more neutral title (which imo is better) has left me confused about why everyone is saying something different in the comments.

dang · 2026-04-18T04:27:07 1776486427

That's probably too much ceremony for HN but petercooper made a really nice HN title edit tracker which is probably still running. Let me see if I can dig it up for you...

Edit: hmm - maybe not: https://news.ycombinator.com/item?id=21617016.

aray07 · 2026-04-17T17:16:11 1776446171

I'm running some experiments on this, but based on what I have seen in my own personal data, I don't think this is true:

"given that Opus 4.7 on Low thinking is strictly better than Opus 4.6 on Medium, etc., etc."

Opus 4.7 in general is more expensive for similar usage. Now we can argue that it provides better performance, all else being equal, but I haven't been able to see that.

namnnumbr · 2026-04-17T19:40:11 1776454811

Following up on "strictly better" via plot in release announcement:

https://www.anthropic.com/_next/image?url=https%3A%2F%2Fwww-...

unpwn · 2026-04-17T17:17:32 1776446252

Very unlikely that the article is wrong. The 4.7 intelligence bump is not that big, plus most of the token spend is in inputs/tool calls etc., much of which won't change even with this bump.

namnnumbr · 2026-04-17T19:43:48 1776455028

IMO, you're incorrect:

1. In my own use, since 1 Apr this month, very heavy coding:

> 472.8K Input Tokens +299.3M cached > 2.2M Output Tokens

My workloads generate ~5x more output than input, and output tokens cost 5x more per token... output dominates my bill at roughly 25x the cost of input. (Even more so when you consider cache hits!) If Opus 4.7 was more efficient with reasoning (and thus output), I'd likely save considerable money (were I paying per-token).

2. Anthropic's benchmarks DO show strictly-better (granted they are Anthropic's benchmarks, so salt may be needed) https://www.anthropic.com/_next/image?url=https%3A%2F%2Fwww-...
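
For anyone who wants to check the arithmetic in point 1, here is a rough sketch. It only uses the two token counts quoted above and the "output costs 5x input per token" ratio stated in the comment; that ratio is taken as an assumption, not a quoted price sheet. Cached tokens are left out because no cache price is given in the thread, and the function name is made up for illustration.

```python
# Token counts quoted in point 1 (cached tokens deliberately excluded).
INPUT_TOKENS = 472_800
OUTPUT_TOKENS = 2_200_000

# Assumption taken from the comment: an output token costs 5x an input token.
OUTPUT_PRICE_MULTIPLIER = 5.0


def output_to_input_cost_ratio(input_tokens, output_tokens, price_multiplier):
    """How many times larger the output bill is than the (uncached) input bill."""
    return (output_tokens / input_tokens) * price_multiplier


volume_ratio = OUTPUT_TOKENS / INPUT_TOKENS
cost_ratio = output_to_input_cost_ratio(
    INPUT_TOKENS, OUTPUT_TOKENS, OUTPUT_PRICE_MULTIPLIER
)

# Share of the uncached bill that comes from output tokens.
output_share = cost_ratio / (cost_ratio + 1)

print(f"output/input token volume: {volume_ratio:.2f}x")
print(f"output/input cost:         {cost_ratio:.1f}x")
print(f"output share of bill:      {output_share:.1%}")
```

With the quoted counts the volume ratio comes out a bit under 5x, so the cost ratio lands slightly under the "roughly 25x" figure, but the conclusion is the same: under these assumptions nearly all of the uncached spend is output, so any reduction in reasoning tokens shows up almost one-for-one in the bill.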