Good. The true mark of AGI is when a company accepts liability and doesn’t bury “for entertainment purposes only” deep in their TOS. Same as it works with employees.
Same for self-driving. Your car is not self-driving until it accepts liability and you count as just a passenger.
But watch as Germany soon loses AI Google results.
To clarify a few comments here: this is not only OCI containers: container machines add support for persistence and filesystem mounting, making container machines a great lightweight Linux environment for developers using macOS. More details here: https://developer.apple.com/videos/play/wwdc2026/389
So Google has established a product called Search.
For that product rules have been established.
Google has monopolized that product.
Now Google is replacing that product with a new product.
But they keep calling it the same thing.
Because they want to keep their monopoly.
That is what has been deemed illegal.
Gemini is not illegal. Pretending the worst version of Gemini is Search is illegal, because it breaks the rules established for Search.
It's exhausting that the "solution" to problems like this is getting tens or hundreds of thousands of citizens stressed until enough public attention gives some small chance of redress. I'm not calling for violence, but if we can't get these things fixed in court there has to be a more effective and more forceful avenue for protest than venting on internet forums.
As a non-web dev, I have a question about this part:
> There was a sad coda; as is the way of contract work, I moved on. I explained what I had built to my replacement, that it always worked even without javascript. He was appalled and said, “but that’s a lot more work for us.”
Why is it more work? The approach described in the article seems honestly reasonably simple: just write the standard <input> components for the form, have a submit button at the bottom. When I was making my own websites many years ago now, that's how it worked, and it wasn't that hard. Maybe it's reflecting my ignorance in this field, but doing fancy front-ends seems much harder to me.
Just remember that Google is essentially an advertising company and that they were always going to squeeze this opening closed as soon as they could get away with it.
I do fear for a future where even Firefox ends up caving in. Ladybird browser might be our only hope until something legal comes along to block functionality.
Not missing the forest for the trees, this effectively means in 3-5 months China will drop open source models that are every bit as capable and dangerous as current day Mythos except with no safeguards.
And the only companies safe from this are the large corporations that shook hands with Anthropic? Because Fable doesn't seem to have actual safeguards, more like 'if you talk about this you will be talking to Opus.' It doesn't guard against offensive use, it prevents all use (offensive AND defensive).
Rationalists are inventing oligopolies from first principles, absolutely incredible things happening in SF
You cannot trust companies to communicate an unbiased vision of the future, because they will always build what they're capable of selling. Microsoft and Meta are incapable of selling phones and laptops; they're certainly capable of building them, but few people will buy them. So instead Meta builds smart-glasses and Microsoft presents this weird vision of "connected thin devices" by keeping the hardware itself very abstract and unknowable. The hardware doesn't matter to Microsoft, not because the hardware doesn't actually matter, but because it cannot matter, because Microsoft cannot win in hardware. It's not a vision of the future; it's a vision of what Microsoft can meaningfully sell. Microsoft can meaningfully sell their weird constellation of 365 subscriptions that no one knows what they do or if they remotely do what you buy them for; thus their marketing now wears that idea of "unknowable capability" like a mask.
The thing I love about OpenCV is that it remains hands down the best library for simply loading images and video. I've never even used any of its fancy computer vision features, but if I need to load a video file and look at the pixels - which I did need to do recently for an art project - OpenCV does it in about four lines of code.
The pelican has looked very same-y across all frontier models, same color bike, same camera angle, etc. I suspect this challenge is already too embedded in the training data to be a good signal when it succeeds, and maybe even when it fails in pathological ways mirroring existing AI pelicans on the internet.
Still satisfied with my switch to codex/chatgpt. I couldn't imagine switching away from claude code when it first launched, but with the drastically more generous usage on codex for the same subscription tier I just can't justify it.
I like fixing code made by AIs and others (outsourcing code is similar, as someone else said already). Last week we found out some client tried to vibe some departmental tool; the result is some massive crap in nextjs that needs 10GB of memory to compile, has 1000s of lint errors, dev logs in git (very noisy ones) and so on. Now we have to fix it: it's basically free 10k-50k euros over and over again for this type of work. Very easy if you know what you are doing; impossible if you don't. Keep 'em coming.
Bad title. This isn't an agent "running amok", this is an early experiment in carrying out an Xz attack by using an agent to build trust (and hacking/impersonating a known-good contributor identity). The agent is obeying commands it was given, the exact opposite of running amok, and although the execution isn't particularly effective, it is having some success (patches have been accepted).
This is deeply scary, not because "agents are running amok" but because a huge amount of our infrastructure is vulnerable to this kind of attack, and if bad people are utilising LLM agents to carry them out, we're in for a wild ride over the next few years.
This all feels like a race where the model companies try to solve doing work locally in a way that doesn't suck, before the major operating system companies figure out AI integration into their OS that doesn't suck. It also makes me wonder why Google, which has both Gemini and Android, can't figure this out, and if there are lessons to draw from that.
It seems like Fable will refuse to do any work when it comes to developing LLMs, or even answer questions about topics related to LLMs. Simple things like asking it to explain a paper fail!
From the model card:
> In light of the ability of recent models to accelerate their own development, we've implemented new interventions that limit Claude's effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms. Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user.
It is actually worse than that. It is at least 30 days. There is an "almost" that is doing a ton of heavy lifting here "deletion after 30 days in almost all cases". My read of that is they can hang onto data for as long as they want, even if they usually won't. And "all traffic" with an agentic harness is basically your entire codebase you work on.
> We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces. We won’t use this data to train new Claude models, or for any non-safety-related purpose, and we’ve instituted new privacy protections including logging all human access to the data and ensuring its deletion after 30 days in almost all cases (see this post for further details). The data will help us defend against complex and novel attacks (including new jailbreaks and attacks that operate across many requests) as well as help us identify and reduce false positives.
It is as if JetBrains said "you can't use IntelliJ IDEA to develop a frontier IDE. We can introduce slight compilation errors if we detect you doing so".
I'm assuming this is vibe coded, because it's got a bunch of the usual tells, so to the people who do this: can you please stop making stupid scrolling presentations where I can see less than a slide of information at a time? Please tell your clanker to just write a blog post instead, or better yet, write it yourself.
The idea that 1:1s with devs add very little value to the team is… pretty wild.
If you think 1:1s don’t add value, your slice of the reality of what even modestly sized teams need to operate smoothly is so far from my experience I don’t think we’re likely to bridge the divide.
But to make a good faith effort: what is the job you think line managers are supposed to be doing, if not listening to devs, going to meetings you would prefer not to sit through, and writing up carefully documented feedback for the under-performers you seem convinced surround you at every turn?
> It's possible Opus or GPT-5.5 could have done this too, I've not tried the exact same sequence. The Fable vibes are good here, though.
And that's the thing. These comparisons are all gut feelings. I'm missing objective unbiased measurements to actually have real comparisons between different models, their different generations, or even just the convention that everybody adds "you are an expert software engineer" and "don't make mistakes" to their prompts because they think it improves anything. Nobody knows if it actually does.
There are a lot of bad CEOs, though. It's a lot like a politician -- it's quite difficult to become a CEO, and the skills to make it to that position don't always intersect nicely with the skills necessary to actually do the job well.
I was curious how this thing works and asked Claude to visualize it -- mostly to see how good Fable is and I have to say, what it made was good enough for me to get a gist of it. Posted it here