Would be karma for all the unnecessary flights we have taken as a species.
In particular anyone who does 'mileage runs' and emits huge amounts of CO2 just so they have the 'privilege' to sit in a slightly nicer chair in a dull airport lounge.
>In particular anyone who does 'mileage runs' and emits huge amounts of CO2 just so they have the 'privilege' to sit in a slightly nicer chair in a dull airport lounge.
I doubt anyone is doing this? At best they're grinding out flights so they can get free first/business class seats later.
People do this to meet minimum requirements for mileage tiers, e.g. I know someone who was close to Diamond status on Delta and went to Miami and back without leaving the airport area just for the miles.
Look at the Flyertalk BA forum from 2005-2020. It was a huge thing, and not always for upgrades, because BA have been stingy with upgrades for a long, long time. Lounge access, baggage allowance, priority boarding, etc. were a huge part of it.
Popular mileage runs were London to Honolulu with lots of sectors along the way, iirc!
This explains why China's defense capabilities are outpacing the west in 2026. The defense behemoth who castrates users by denying them the all-powerful TrackPoint will be doomed to irrelevance very soon.
100% agreed. I wish someone would make a test for how reliably the LLMs follow tool use instructions etc. The pelicans are nice but not useful for me to judge how well a model will slot into a production stack.
At first, when I got started with LLMs, I read and analyzed benchmarks, looked at the example prompts people used, and so on. But many times a new model does best on the benchmark, you expect it to be better, and then in real work it completely drops the ball. Since then I've stopped even reading benchmarks; I don't care an iota about them, they always seem more misleading than helpful.
Today I have my own private benchmarks, with tests I run myself and private test cases I refuse to share publicly. I've built these up over the last year to year and a half: whenever I find something my current model struggles with, it becomes a new test case in the benchmark.
Nowadays it's as easy as `just bench $provider $model`: it runs my benchmarks against that model, and I get a score that actually reflects what I use the models for; it more or less matches my experience of actually using them. I recommend that people who use LLMs for serious work try the same approach and stop relying on public benchmarks, which (seemingly) are all gamed by now.
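The skeleton of such a private harness is simple. Here's a minimal sketch in Python; the names (`TestCase`, `run_benchmark`), the pass/fail scoring, and the stub model are my assumptions, not the commenter's actual setup, and a real version would call a provider's API instead of the stub:

```python
# Hypothetical sketch of a private LLM benchmark harness.
# A real run would replace stub_model with an API call to $provider/$model.
from dataclasses import dataclass
from typing import Callable

@dataclass
class TestCase:
    prompt: str                    # task given to the model
    check: Callable[[str], bool]   # private pass/fail check on the output

def run_benchmark(model: Callable[[str], str], cases: list[TestCase]) -> float:
    """Run every private test case against a model; return the pass rate."""
    passed = sum(1 for c in cases if c.check(model(c.prompt)))
    return passed / len(cases)

# Example: a stub "model" and two trivial private cases.
cases = [
    TestCase("Return only the word OK", lambda out: out.strip() == "OK"),
    TestCase("Add 2 and 2, digits only", lambda out: out.strip() == "4"),
]
stub_model = lambda prompt: "OK" if "OK" in prompt else "4"
print(run_benchmark(stub_model, cases))  # 1.0
```

The `just bench` recipe would just wrap a call like this, passing the provider and model name through to whatever client library you use.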
Would you be willing to give a rough outline of one or a few test cases? I am having a bit of a hard time imagining what and how you are testing. Is it like "change the signature of function X in file @Y to take parameter Z" and then comparing the result with what you expect?
So why is Claude not cheaper than ChatGPT? Why won't they let me remove my payment info afterwards? Most other platforms like Steam let you do that. I don't want my shit sitting there waiting for the inevitable breach.
Everything is perception though. You are looking at this with your own perception, biases, and heuristics just like everyone else. There is no 'right' way to hire.
You’re right, but on the other hand, once you have a basic understanding of security, architecture, etc., you can prompt around these issues. You need a couple of years of experience, but that’s far less than the 10-15 years of experience you needed in the past.
If you spend a couple of years with an LLM really watching and understanding what it’s doing and learning from mistakes, then you can get up the ladder very quickly.
I find that security, architecture, etc is exactly the kind of skill that takes 10-15 years to hone. Every boot camp, training provider, educational foundation, etc has an incentive to find a shortcut and we're yet to see one.
A "basic" understanding in critical domains is extremely dangerous and an LLM will often give you a false sense of security that things are going fine while overlooking potential massive security issues.
Somewhere on an HN thread I saw someone claiming that they "solved" security problems in their vibe-coded app by adding a "security expert" agent to their workflow.
All I could think was, "good luck" and I certainly hope their app never processes anything important...
Found a problem? Slap another agent on top to fix it. It’s hilarious to see how the pendulum’s swung away from “thinking from first principles as a buzzword”. Just engineer, dammit…
But if you are not saving "privileged" information, who cares? I mean, think of all the WordPress sites out there. Surely vibecoding is not SO much worse than some plugin monstrosity... At the end of the day, if you are not saving user info or special sauce for your company, it's no issue. And I bet a huge portion of apps fall into this category...
> If you spend a couple of years with an LLM really watching and understanding what it’s doing and learning from mistakes, then you can get up the ladder very quickly.
I don't feel like most providers keep a model for more than 2 years. GPT-4o got deprecated in 1.5 years. Are we expecting coding models to stay stable for longer time horizons?