Hacker News | simonw's comments

Huh, I had not realized my comment was [dead] - it's here https://news.ycombinator.com/item?id=48492313 (I see it as not-dead, which I guess is how the dead system works) (UPDATE: it's no longer dead)

I was calling out the video for starting with:

> If you paid for that usage through a standard API, those 10 billion tokens would cost you around $15,000 a year. That is the real unsubsidized price. No discounts, no incentives, just the raw compute costs.

"Raw compute costs" is an entirely misleading way to describe API pricing.


If by "got caught" you mean "published it in their system card paper".

(Admittedly it was buried pretty deep in that 300+ page PDF, but they did at least disclose it. If they hadn't I imagine it would have taken quite some time for the research community to figure out what was going on.)


It was in the announcement, too. I’m 99% sure they edited it after they changed their mind, because I knew about it from reading that, and never opened the model card.

On the earliest web archive snapshot I can find [0], I do not see any mention of the safeguard/sabotage under discussion [1].

And to be clear, this isn't the safeguard where the model is explicitly downgraded to Opus, but rather where the Fable/Mythos model's "effectiveness" is transparently "limited" via "prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT)".

[0]: https://web.archive.org/web/20260609173222/https://www.anthr...

[1]: https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-...


It wasn't buried, it was on the third page after the ToC

Yes, I actually do mean that. I skimmed the system card. Them stating it openly, doing it, and being called out on it just doesn't have any meaningful difference.

They could have simply told people "we do not permit using Claude models to perform frontier AI research," which is defensible from a policy point of view. This particular usage of their products requires no deception, nor hiding information to prevent abuse.

However, instead, they chose for some reason to publicly display a morally poor way to execute a reasonable business decision (preventing abuse, defending your business interests, etc.).


It doesn't make sense to include the capex cost to train a model in this kind of discussion, because that cost is fixed.

Consider a model that costs $100m to train.

If the vendor then prices it such that each inference token has a margin of 10% over the variable costs to serve (power + server costs), whether or not they cover their costs is based entirely on how many tokens they can sell.

If they sell less than $1bn of tokens, they lose money - the break even point is 10x100m = $1bn.

If they sell $10bn of tokens they make a ton of money.

This also means you can't credibly calculate how much of the fixed training expense is covered by your token spend, because until the model is retired and you can account for how much inference it ran you don't know what percentage of the training cost each sold token was responsible for.
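
A quick sketch of that break-even arithmetic, using the $100m training cost and 10% margin from the example above. The function names are made up for illustration. One assumption worth flagging: "10% margin" gives exactly $1bn only if it means 10% of revenue; read as a 10% markup over variable cost, the break-even works out to $1.1bn instead.

```python
from fractions import Fraction

def break_even_revenue(training_cost, margin_on_revenue):
    """Token revenue needed before the margin repays a fixed training cost."""
    return Fraction(training_cost) / Fraction(margin_on_revenue)

def margin_from_markup(markup_on_cost):
    """Convert a markup over variable cost into a share of revenue."""
    markup = Fraction(markup_on_cost)
    return markup / (1 + markup)

def training_cost_per_token(training_cost, tokens_sold):
    """Amortized training cost per token; only known once tokens_sold is final."""
    return Fraction(training_cost) / tokens_sold

TRAINING_COST = 100_000_000  # the $100m example model
TEN_PERCENT = Fraction(1, 10)

# Reading 1: the 10% is a share of revenue.
as_margin = break_even_revenue(TRAINING_COST, TEN_PERCENT)
# Reading 2: the 10% is a markup over the variable cost to serve.
as_markup = break_even_revenue(TRAINING_COST, margin_from_markup(TEN_PERCENT))

print(f"10% of revenue: ${int(as_margin):,}")
print(f"10% over cost:  ${int(as_markup):,}")
```

training_cost_per_token takes the total number of tokens sold as an input, which is the point of the last paragraph: that number isn't known until the model is retired.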


The cost is fixed if you train a model once every several years. If you have to train 3-4 times per year to stay competitive, training cost is a thing.

You also have to include failed training runs and experiments in the math.

There are no official figures, but given how fast new models are rolled out, I wouldn't be surprised if neither Anthropic nor OAI manages to cover the full cost of their models.


I think the capex being fixed assumes you can just stop training the next model. But it's not clear that you can afford to do that and keep selling tokens.

And if capabilities plateau such that training the next one is useless, then the margins will drop fast due to competition.


Model inference:training compute for frontier models is estimated to be over 10:1 now.

Driven mostly by just how much inference they sell nowadays - but also by things like base model reuse.


This is not a credible presentation:

> If you paid for that usage through a standard API, those 10 billion tokens would cost you around $15,000 a year. That is the real unsubsidized price. No discounts, no incentives, just the raw compute costs.

The standard API pricing is not the raw compute cost. Making that claim in the first minute of the video discredits the entire thing.
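
For scale, a rough sketch of the blended list price that the video's own two numbers imply (the function name is just for illustration). The result is a price set by the vendor, not a measurement of what the compute costs.

```python
def price_per_million_tokens(total_cost_usd, total_tokens):
    """Blended price per one million tokens implied by a total bill."""
    return total_cost_usd / (total_tokens / 1_000_000)

# $15,000 a year for 10 billion tokens, as quoted in the video.
implied = price_per_million_tokens(15_000, 10_000_000_000)
print(f"${implied:.2f} per million tokens")
```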

(Here's the full MacWhisper transcript: https://gist.github.com/simonw/991dde81b95fa4436f46517c3c1a4... )

And yeah, if you work your way through the whole thing it's mostly breathless hype based around shaky premises.


There are a whole lot of foods that are incredibly wasteful if you truly care about water consumption.

Saying "if it's edible then it doesn't matter how much water it uses, it's justified" isn't a good position to take.


Hah, this is pretty funny.

The official AWS account posted a thread to promote their podcast interview with Charity Majors: https://art19.com/shows/aws-for-software-companies-podcast/e...

But they forgot that users who are NOT signed into Twitter don't see threads at all, so it looks to anyone who follows the link that they just took an official AWS position at odds with the entire industry that they're trying to make billions of dollars from.

Here's the full text of that thread:

> More AI-generated code doesn't make your team faster. It might actually slow you down.

> The real bottleneck was never writing code. It's releasing it, debugging it, & keeping it running well. So when @Honeycombio CTO Charity Majors set a productivity target, she didn't chase 10x. She chose 2x, & built from there.

> Her team also skipped the mandates & built a set of AI values instead:

> "Every AI output has to have a human owner. If you don't want your name on it, it's probably not good work."

> Quality first, quantity second.

> Hear how @mipsytipsy built it on the AWS for Software Companies podcast.


According to https://chromestatus.com/metrics/feature/timeline/popularity... WebAssembly runs on about 6.11% of Chrome page loads, up from 3.37% in January 2024.

Probably all of that is Figma.

Sure, figma is 6% of the internet

There are many designers out there.

Suno also runs on WASM. Pretty good showcases, both, imho.

I'm unreasonably excited about WASI. WASI is the thing which takes WebAssembly from a tool for running stuff in a browser to a tool that can run entire portable sandboxed applications on a computer - with controlled filesystem and network access.

I don't ever want to run untrusted code from the internet outside of a sandbox ever again. If WASI lives up to its full potential I won't have to - we'll have a robust, cross-platform sandboxing solution for running real applications.


> I don't ever want to run untrusted code from the internet outside of a sandbox ever again

WASM is great, but I think it's the wrong approach to the sandboxing problem. It's technically possible to sandbox native applications (compiled into target machine code) using OS-builtin mechanisms, but it's not done for compatibility reasons, because this is the way things have been done for the last 50 years or so.


sandboxing native apps just gives you security. with wasm you also get a single portable binary that can run on x86 windows, arm64 linux and in your browser with zero modification. you dont need to write platform specific code or use third party frameworks.

> you dont need to write platform specific code

You don't need to write platform-specific code if you use some cross-platform framework. For simple programs it may be enough to use only the standard library of your language of choice.

> single portable binary that can run on x86 windows, arm64 linux and in your browser with zero modification

It has little value. Compiling a separate binary for each OS isn't that hard, since only a handful of architectures and operating systems are actually in use. Using an abstract cross-platform binary (like WASM), on the other hand, adds extra performance costs and other user-side overhead, which isn't strictly necessary.


No you don't, because WASM is only compute, and you need exactly that runtime-specific code and third-party frameworks for everything else, exposed as imported functions.

That's what WASI is for

And like CORBA or POSIX, the portability does not work 100% as sold.

Exactly. It is entirely a misconception to believe that WASM is this silver bullet on sandboxing and it is not that great security-wise I’m afraid.

It is only now being inspected by researchers and attackers, who have found sandbox escapes [0] (Chrome 0-day), out-of-bounds [1] / use-after-free [2] and many other [3] flaws [4] in WebAssembly. I also agree that it is not enough for sandboxing at all.

[0] https://nvd.nist.gov/vuln/detail/CVE-2026-11645

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=2009901

[2] https://bugzilla.mozilla.org/show_bug.cgi?id=2013741

[3] https://www.miggo.io/vulnerability-database/cve/CVE-2026-269...

[4] https://github.com/bytecodealliance/wasm-micro-runtime/secur...


There's no reason to believe that [0] has anything to do with WASM, [1] and [2] are runtime implementation bugs, [3] is a vulnerability in a "weak" sandboxing library VM2 - it has nothing to do with WASM as such, and [4] is another implementation bug in an experimental WASI feature of that specific runtime which is gated behind a build flag.

------

[Re: 3] https://github.com/patriksimek/vm2

> vm2 attempts to sandbox untrusted JavaScript code within the same Node.js process as your application. It does this through a complex network of Proxies that intercept and mediate every interaction between the sandbox and the host environment.

> JavaScript is an extraordinarily dynamic language. Objects can be accessed through prototype chains, constructors can be reached via error objects, symbols provide protocol hooks, and async execution creates timing windows. The sheer number of ways to traverse from one object to another in JavaScript makes building an airtight in-process sandbox extremely difficult.

[Re: 4] https://github.com/search?q=repo%3Abytecodealliance%2Fwasm-m...


Those are not flaws in WASM itself, but in different WASM runtimes.

Might as well run java or flash.

But WASM is "lawnmower not included"!

Use cases I am more excited about:

1) Replace webhooks in web apps with wasm binaries provided by the customer, but that run in the web app servers.

2) Safer plugin system for professional software (plugins for photoshop, plugins for IDEs, etc)

3) Safer mod system for games and server-side mods that run on the game-maker server.


We had that in the 90's with Java. Why would this approach succeed today?

WASM sandbox is miles better than the JVM

WASI is a standard for where to poke holes in the sandbox for your specific use-case

WASM+WASI as a compilation target allows any program written for modern operating systems to work on any WASM runtime


Why wouldn't it? It's a different technology stack, developed with the benefit of decades of additional experience running hostile code on the web.

You mean like the CLR?

Kind of strange that such experience still allows for WASM to be the target of C and C++ compilers, and there is no bounds checking support inside linear memory regions.


Because Java is a language, not a compilation target?

From the introduction section of the Java specification [1]:

"The Java Virtual Machine is the cornerstone of the Java platform. It is the component of the technology responsible for its hardware- and operating system-independence, the small size of its compiled code, and its ability to protect users from malicious programs."

[1] https://docs.oracle.com/javase/specs/jvms/se26/html/jvms-1.h...


From the same link, opening sentence:

"The Java® programming language is a general-purpose, concurrent, object-oriented language."

Edit: Having thought a little, I appreciate that it's possible to compile for the JVM from source code which is not Java, which makes the JVM a compilation target. As far as I'm aware the JVM doesn't have first class support for this though; it's been tacked on as an afterthought. Compiling C to JVM bytecode for example doesn't appear to be an enjoyable process. WASM on the other hand was designed explicitly to function as a compilation target for arbitrary languages.

Maybe I'm missing something, happy to be proven wrong.


There are (or have been) lots of languages using the JVM as a compilation target, whether it is well-suited for this or not. Wikipedia has a partial list: https://en.wikipedia.org/wiki/List_of_JVM_languages

My point is that it isn't well suited for it. Hence WASM.

Check out https://extism.org, it is built for those kinds of use cases. However I think WASI and components could enhance it.

sorry I meant "most excited about", WASI and components should be useful for the use cases I mentioned too.

For example, a SaaS service that accepts WASM plugins could provide a WASI interface that lets the plugin write to an object-store filesystem (like AWS S3) provided by the SaaS owner.


i had this same vision when i created hyper-mcp with a modular plugin system via WASM plugins. Too bad the community has moved on from MCP to CLIs with coding agents

https://github.com/hyper-mcp-rs/hyper-mcp


Hey, this is also my interest. I was just looking into whether it was possible to e.g. build an archive extractor that runs like a normal program but does the actual extraction completely in wasm. Unfortunately, AFAICT it's possible but requires custom code; you can't (yet, I hope) just compile unzip/libarchive/whatever with CC=wasicompiler and get a sandboxed binary. But we're getting close.

What do you mean? You absolutely can run compression in WASM.

For example here is Gzip in WASM: https://github.com/ColinTimBarndt/wasm-gzip


You should be able to do exactly that though? Why do you think you can't?

You will of course need to include a lot of support code to provide the relevant syscalls and otherwise emulate the environment that the code expects. But there are plenty of examples of that at this point.


The thing that interests me the most is that execution is deterministic. If the inputs to a WASM module are logged you get durable execution and rr style reverse debugging as part of the package.
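
A minimal sketch of that record/replay idea, as a pure-Python analogy rather than an actual WASM runtime. Everything here (module, host_call, record, replay) is a hypothetical name for illustration; the assumption is that the guest is deterministic apart from the values it gets back from host functions.

```python
import random

def module(host_call):
    """Stand-in for a deterministic guest: all outside data comes via host_call."""
    total = 0
    for _ in range(3):
        total = total * 31 + host_call("random_u8")
    return total

def record(run):
    """Run once against real host functions, logging every value returned."""
    log = []
    def host_call(name):
        value = random.randrange(256)  # the only nondeterministic step
        log.append((name, value))
        return value
    return run(host_call), log

def replay(run, log):
    """Re-run against the log alone; no real host functions are touched."""
    entries = iter(log)
    def host_call(name):
        logged_name, value = next(entries)
        assert logged_name == name, "guest diverged from the recorded run"
        return value
    return run(host_call)

result, log = record(module)
print(replay(module, log) == result)
```

Because the replay only consumes the log, it can be repeated after a crash or stepped through in a debugger and should reach the same state each time.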

If you're interested in this, then you should check out https://github.com/golemcloud/golem

Golem is a durable workflow platform and can run any wasm.


Sorry but how exactly does the sandboxing help? You download and run an app that you expect to be useful and that you need. The app needs permission to access your data. If you want to use the application what choice do you have except to grant it access?

Point being you wouldn't run untrusted code in the first place and for "trusted code" you end up accepting its access requirements anyway.

So logically I'd think that the malware would just get piggybacked into actual non-obvious utility apps and nothing is gained.

Second problem is that the security model hoops make for terrible APIs and user experiences. Just look at the current filesystem browser APIs. It must be mentally challenging to design APIs to be usable and then nerf them for security purposes to make them "not too usable".

Finally one must note that at least right now the webasm ecosystem is rather immature and the de-facto only tool (emscripten) is an amateur hour hobby project. So it's going to take some decades still before the tooling is really getting there.


> The app needs permission to access your data. If you want to use the application what choice do you have except to grant it access?

But it doesn't need network access to be useful, so it doesn't have that permission and can't exfiltrate your data?



In general, what's the point of a link to a sandbox in a conversation about the benefits of sandboxing?

But specifically, this sandbox also kills all interop with your system and other apps/utilities, so it's way too disruptive for the purpose of isolating just from the network.


Just like any WebAssembly runtime, without imports of external functions, the code can only warm CPUs.

So? Will they not have imports of external functions?

The point being that WASM doesn't improve anything over sandboxing native applications, on the OSes that actually are serious about it.

It should confine itself to being the evolution of browser plugins.


So it does improve things, since the OSes are not actually serious about it; otherwise they would've fixed these basic usability concerns years ago.

I haven't said any such thing.

I'm curious if people have a good story for why WASI will succeed where Java failed

My main one is that WASI has benefitted from an additional 31 years of accumulated industry-wide experience compared to when Java was first released.

So where has the experience gone in the support of C and C++ for WASM?

Plus, Larry Ellison doesn't own WASM: "Lawnmower Not Included"!

Programs written in Java require installation of a middleware called Java runtime. It adds extra friction for end-users. And even if one has Java runtime installed, a newer version may be necessary for a recently-published application.

With WASM it may be the same, unless all major OS vendors integrate a WASM runtime so that it doesn't need to be installed separately.


>And even if one has Java runtime installed, a newer version may be necessary for a recently-published application.

WASM doesn't remove version churn, the linked article literally discusses a newer version. Oh and the wonderful web compat story.


It is exactly the same for WASM outside of the browser, and Java has Android as the counterpart to a built-in runtime.

Java's VM does not start in milliseconds, nor does it have a dozen independent implementations in every ecosystem

Folks tend to think only Oracle does Java, and the only place to get it is from https://www.java.com.

Maybe they should broaden their horizons.

https://en.wikipedia.org/wiki/List_of_Java_virtual_machines

And some like PTC or microEJ are missing from the list.


Yes, but inside the browser is a freaking big use case.

Not really, I don't need COM / CORBA on the browser.

> Programs written in Java require installation of a middleware called Java runtime.

It's possible to link or embed a Java runtime in an existing application.

> It adds extra friction for end-users

It doesn't have to, the program can bundle its own JRE, as is often the case, and then you also don't have to worry about JRE compatibility. Downside is then you have many JREs installed and of course you can't trust their sandboxing.


Because Java was doing nothing similar; a better comparison would be the .NET CLR, which actually tried to be a decent compilation target.

Also security, Java has reflection so you cannot reliably sandbox java libraries


My main one is: distribution & access. If major browsers implement the WASI runtime then using and distributing a WASI app will be way simpler than the Java equivalent ever was.

I like the technical design of WASM, but I feel that better OS sandboxes for regular native code will be the common approach to running untrusted code.

As soon as you compile to WASM you no longer have the C FFI and the ability to call the OS system interfaces for files, network and others.

It is extra work to move something to WASM vs just compiling it and running it in a sandbox.


Why not just run a vm?

Far more overhead.

In particular (as I just learned when looking it up), WASM can dynamically allocate host memory with the memory.grow instruction, so you don't have to waste a huge chunk of statically allocated memory per VM: https://developer.mozilla.org/en-US/docs/WebAssembly/Referen...

Although... it doesn't say anything about releasing memory back to the host (I don't see a memory.shrink instruction) so maybe it's not all that helpful? Will WASM applications continue hogging the maximum amount of memory they've ever used until they're restarted?

A VM could release memory back to the host using memory ballooning, but this has to be managed manually somehow, at least with QEMU.
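
On the memory.grow question: as far as I know, core WebAssembly has no instruction for shrinking a linear memory, so a module's memory only grows until the instance is dropped. Here is a toy Python model of how memory.grow behaves, in 64 KiB pages. It is not a real runtime, and the class and its names are made up for illustration.

```python
PAGE_SIZE = 64 * 1024  # WebAssembly linear memory is sized in 64 KiB pages

class LinearMemory:
    """Toy model of a Wasm linear memory: it can grow but has no shrink."""

    def __init__(self, initial_pages, max_pages):
        self.max_pages = max_pages
        self.data = bytearray(initial_pages * PAGE_SIZE)

    @property
    def pages(self):
        return len(self.data) // PAGE_SIZE

    def grow(self, delta_pages):
        """Mimic memory.grow: return the old page count, or -1 on failure."""
        old = self.pages
        if old + delta_pages > self.max_pages:
            return -1
        self.data.extend(bytes(delta_pages * PAGE_SIZE))  # new pages are zeroed
        return old

mem = LinearMemory(initial_pages=1, max_pages=4)
print(mem.grow(2))  # previous size in pages
print(mem.pages)    # size after growing
print(mem.grow(8))  # would exceed max_pages, so it fails
```

The high-water mark stays allocated for the life of the instance, which is why tearing down and re-instantiating the module is the usual way to hand memory back to the host.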


It's only available on the $100-$200/month subscription plans until June 22nd, after which they're going to be charging everyone the full API token price. Then it's going to be really expensive.

(I think it may even be expensive enough for them to recoup their costs, unless OpenAI or Gemini put out a similar capability model before Anthropic have had a few months to make bank from it. There was that rumored customer who spent $500m in a single month, and that was just for Opus!)


Ah, well that makes sense. Will be interesting to see what adoption looks like for a model that’s billed at a more realistic price-point towards profitability.

Suggestion: run this command:

  uvx agentsview usage daily

That should tell you the token total and dollar cost across all of your local coding agents.

If it doesn't seem to know how much Fable costs you can follow these instructions to teach it the Fable pricing scheme, then run the command a second time to see the results: https://til.simonwillison.net/llms/agentsview-custom-model-p...

I've not run into the same Fable quota problems that others have reported myself yet. I wonder if people are running prompts that are somehow spinning up a ton of subagents?

