
No thanks. Already cancelled my sub.

Would be karma for all the unnecessary flights we have taken as a species.

In particular anyone who does 'mileage runs' and emits huge amounts of CO2 just so they have the 'privilege' to sit in a slightly nicer chair in a dull airport lounge.


>In particular anyone who does 'mileage runs' and emits huge amounts of CO2 just so they have the 'privilege' to sit in a slightly nicer chair in a dull airport lounge.

I doubt anyone is doing this? At best they're grinding out flights so they can get free first/business class seats later.


People do this to meet minimum requirements for mileage tiers, e.g. I know someone who was close to Diamond status on Delta and went to Miami and back without leaving the airport area just for the miles.

Look at the Flyertalk BA forum from 2005-2020. It was a huge thing, and not always for upgrades, because BA have been stingy with upgrades for a long, long time. Lounge access, baggage, priority boarding, etc. were a huge part of it.

Popular mileage runs were London to Honolulu with lots of sectors on the way iirc !!


Yep, that's why I refuse to give WhatsApp (and, when I still used them, Instagram and Messenger) full access to my camera roll.

For some reason they keep asking aggressively for permission for the whole thing. I wonder why...


I never worked in a corporate that didn't use Lenovo

There are MacBook companies, but Lenovo is basically the only other alternative.

I've had Dell in the past, but haven't seen one in years.


You won’t see any Lenovo in the defense industry.

Defense industry is a small fraction of the notebook market, meaning Lenovo is still no. 1 globally even when missing the global defense market

You are aware of the existence of the Chinese defence industry?

This explains why China's defense capabilities are outpacing the west in 2026. The defense behemoth who castrates users by denying them the all-powerful TrackPoint will be doomed to irrelevance very soon.

Scaleway is so close to being a great product but they need to hire a really visionary Product leader


More like half of Google's AI team is hanging out on HN, and they can optimise for that outcome to get a good rep among the dev community.


Hello.

(I'm not aware of anyone doing this, but GDM is quite info-siloed these days, so my lack of knowledge is not evidence it's not happening)


Hello.

Please push internally for more reliable tool use across Gemini models. Intelligence is useless if it can't be applied :)


See: fish in bike front basket


100% agreed. I wish someone would make a test for how reliably the LLMs follow tool use instructions etc. The pelicans are nice but not useful for me to judge how well a model will slot into a production stack.


At first when I got started with using LLMs I read/analyzed benchmarks, looked at what example prompts people used and so on, but many times, a new model does best at the benchmark, and you think it'll be better, but then in real work, it completely drops the ball. Since then I've stopped even reading benchmarks, I don't care an iota about them, they always seem more misdirected than helpful.

Today I have my own private benchmarks, with tests I run myself and private test cases I refuse to share publicly. These have been built up over the last 1 to 1.5 years: whenever I find something that my current model struggles with, it becomes a new test case in the benchmark.

Nowadays it's as easy as `just bench $provider $model` and it runs my benchmarks against it, and I get a score that actually reflects what I use the models for, and it feels like it more or less matches with actually using the models. I recommend people who use LLMs for serious work to try the same approach, and stop relying on public benchmarks that (seemingly) are all gamed by now.
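For anyone wondering what such a harness might look like, here is a minimal sketch in Python. It is a guess at the general shape, not the actual harness described above: `Case`, `run_bench`, `fake_model`, and the two sample cases are all hypothetical names and placeholders, and the model is just a callable so that no particular provider API is assumed.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Case:
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True when the reply is acceptable


def run_bench(model: Callable[[str], str], cases: list[Case]) -> float:
    """Run every case against the model and return the fraction that pass."""
    passed = 0
    for case in cases:
        try:
            ok = bool(case.check(model(case.prompt)))
        except Exception:
            ok = False  # a crash or an unparseable reply counts as a failure
        passed += ok
        print(f"{'PASS' if ok else 'FAIL'}  {case.name}")
    return passed / len(cases) if cases else 0.0


# Placeholder cases for illustration only; real ones would come from
# failures you have actually hit and would stay private.
CASES = [
    Case(
        name="json-only",
        prompt='Reply with only the JSON object {"ok": true}.',
        check=lambda reply: json.loads(reply) == {"ok": True},
    ),
    Case(
        name="exact-word",
        prompt="Reply with the single word: pong",
        check=lambda reply: reply.strip() == "pong",
    ),
]


def fake_model(prompt: str) -> str:
    """Stand-in for a real provider call (assumption: swap in your own client)."""
    return "pong"


if __name__ == "__main__":
    print(f"score: {run_bench(fake_model, CASES):.2f}")
```

Each time a model drops the ball in real work, that failure becomes one more `Case`, and a command such as `just bench $provider $model` would only need to pick the right callable and run it through `run_bench`.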


share


The harness? Trivial to build yourself, ask your LLM for help, it's ~1000 LOC you could hack together in 10-15 minutes.

As for the test cases themselves, that would obviously defeat the purpose, so no :)


Would you be willing to give a rough outline of one or a few test cases? I am having a bit of a hard time imagining what and how you are testing. Is it like "change the signature of function X in file @Y to take parameter Z" and then comparing the result with what you expect?


the purpose of what? i'm not an LLM

Then they can offer it cheaper as they don’t pay the ‘Apple tax’


So why is Claude not cheaper than ChatGPT? Why won't they let me remove my payment info afterwards? Most other platforms like Steam let you do that. I don't want my shit sitting there waiting for the inevitable breach.


Everything is perception though. You are looking at this with your own perception, biases, and heuristics just like everyone else. There is no 'right' way to hire.


You’re right, but on the other hand, once you have a basic understanding of security, architecture, etc., you can prompt around these issues. You need a couple of years of experience, but that’s far less than the 10-15 years of experience you needed in the past.

If you spend a couple of years with an LLM really watching and understanding what it’s doing and learning from mistakes, then you can get up the ladder very quickly.


I find that security, architecture, etc. are exactly the kind of skills that take 10-15 years to hone. Every boot camp, training provider, educational foundation, etc. has an incentive to find a shortcut, and we've yet to see one.

A "basic" understanding in critical domains is extremely dangerous and an LLM will often give you a false sense of security that things are going fine while overlooking potential massive security issues.


Somewhere on an HN thread I saw someone claiming that they "solved" security problems in their vibe-coded app by adding a "security expert" agent to their workflow.

All I could think was, "good luck" and I certainly hope their app never processes anything important...


Found a problem? Slap another agent on top to fix it. It’s hilarious to see how the pendulum’s swung away from “thinking from first principles as a buzzword”. Just engineer, dammit…


But if you are not saving "privileged" information who cares? I mean think of all the WordPress sites out there. Surely vibecoding is not SO much worse than some plugin monstrosity.... At the end of the day if you are not saving user info, or special sauce for your company, it's no issue. And I bet a huge portion of apps fall into this category...


> If you spend a couple of years with an LLM really watching and understanding what it’s doing and learning from mistakes, then you can get up the ladder very quickly.

I don't feel like most providers keep a model for more than 2 years. GPT-4o got deprecated in 1.5 years. Are we expecting coding models to stay stable for longer time horizons?


This is the funniest thing I've read all week.

