There is a reason why they won't do it. They are selling a narrative. There is a lot of money to be made here with this narrative and proving that artificial intelligence is NOT intelligent won't help sell that narrative.
The goal is to make it intelligent, by which OpenAI in particular explicitly mean "economically useful", not simply to be shiny.
Passing tests is well known to be much easier than having deep understanding, even in humans. They openly ask for tests like this, not that they could possibly prevent them if they wanted to.
There's scammers trying what you say of course, and I'm sure we've all seen some management initiatives or job advertisements for something like that, but I don't get that impression from OpenAI or Anthropic, definitely not from Apple or Facebook (LeCun in particular seems to deny models will ever do what they actually do a few months later). Overstated claims from Microsoft perhaps (I'm unimpressed with the Phi models I can run locally, GitHub's Copilot has a reputation problem but I've not tried it myself), and Musk definitely (I have yet to see someone who takes Musk at face value about Optimus).
> The goal is to make it intelligent, by which OpenAI in particular explicitly mean "economically useful", not simply to be shiny
I never understood why this definition isn't a huge red flag for most people. The idea of boiling what intelligence is down to economic value is terrible, and inaccurate, in my opinion.
Everyone has a very different idea of what the word "intelligence" means; this definition has got the advantage that, unlike when various different AI became superhuman at arithmetic, symbolic logic, chess, Jeopardy, Go, poker, number of languages it could communicate in fluently, etc., it's tied to tasks people will continuously pay literally tens of trillions of dollars each year for because they want those tasks done.
This definition alone might be fine enough if the word "intelligence" wasn't already widely used outside of AI research. It is though, and the idea that intelligence is measured solely through economic value is a very, very strange approach.
Try applying that definition to humans and you pretty quickly run into issues, both moral and practical. It also invalidates basically anything we've done over centuries considering what intelligence is and how to measure it.
I don't see any problem at all using economic value as a metric for LLMs or possible AIs, it just needs a different term than intelligence. It pretty clearly feels like for-profit businesses shoehorning potentially valuable ML tools into science fiction AI.
> This definition alone might be fine enough if the word "intelligence" wasn't already widely used outside of AI research. It is though, and the idea that intelligence is measured solely through economic value is a very, very strange approach.
The response from @s1mplicissimus to my previous comment is asking about "common usage" definitions of intelligence, and this is (IMO unfortunately) one of the many "common usage" definitions: smart people generally earn more.
I don't like "common sense" anything (or even similar phrases), because I keep seeing the phrase used as a thought-terminating cliché, but one thing it does do is make it not "a very, very strange approach".
Wrong, yes; that happens a lot with common language, but it can't really be strange.
> Try applying that definition to humans and you pretty quickly run into issues, both moral and practical.
Yes. But one also runs into issues with all definitions of it that I've encountered.
> It also invalidates basically anything we've done over centuries considering what intelligence is and how to measure it.
Sadly, not so. Even before we had IQ tests (for all their flaws), there's been a widespread belief that being wealthy is the proof of superiority. In theory, in a meritocracy, it might have been, but in practice not only do we not live in a meritocracy (to claim we do would deny both inheritance and luck), but also the measures of intelligence that society has are… well, I was thinking about Paul Merton and Boris Johnson the other day, so I'll link to the blog post: https://benwheatley.github.io/blog/2024/04/07-12.47.14.html
> there's been a widespread belief that being wealthy is the proof of superiority.
Both of these are assumptions though, and working in the reverse order. It's one thing to expect that intelligence will lead to higher value outcomes and entirely different to expect that higher value outcomes prove intelligence.
It seems reasonable that higher intelligence, combined with the incentives of a capitalist system, will lead to higher intelligence people getting more wealthy. They learn to play the game and find ways to "win."
It seems unreasonable to assume that anyone or anything that "wins" in that system must be more intelligent. Said differently, intelligence may lead to wealth but wealth doesn't imply intelligence.
I think we're in agreement? I'm saying their measure in this case is no worse than any other, but not that it's a fundamental truth.
All the other things — chess, Jeopardy, composing music, painting, maths, languages, passing medical or law degrees — they're also all things which were considered signs of intelligence until AI got good at them.
Goodhart's law keeps tripping us up on the concept of intelligence.
> I think we're in agreement? I'm saying their measure in this case is no worse than any other, but not that it's a fundamental truth.
Maybe we are? I think I lost the thread a bit here.
> chess, Jeopardy, composing music, painting, maths, languages, passing medical or law degrees
That's interesting, I would have still chalked up skill in those areas as a sign of intelligence and didn't realize most people wouldn't once AI (or ML) could do it. To me an AI/LLM/ML being good at those is at least a sign that they have gotten good at mimicking intelligence if nothing else, and a sign that we really are getting out over our skis risking these tools without knowing how they really work.
Maybe by the time it’s doing a trillion dollars a year of useful work (less than 10 years out) people will call it intelligent… but still probably not.
I haven't seen "intelligent" used as "economically useful" anywhere outside the AI hype bubble. The most charitable interpretation I can think of is a lack of understanding of the common usage of the word; the most realistic one is intentionally muddying terminology so one cannot be called a liar.
Are LLMs helpful tools for some tasks like rough translations, voice2text etc? Sure. Do they resemble what humans call intelligence? I have yet to see an example of that.
The suggested experiment is a great idea and would sway my opinion drastically (given all the training data, model config, prompts & answers are public and reproducible of course, we don't want any chance of marketing BS to taint the results, do we). I'll be honest though, I'm not going to hold my breath for that experiment to succeed with the LLM technology...
edit: lol downvoted for calling out shilling i guess
They don't have to do it themselves. The super-GPU cluster used to train GPT-6 will eventually shrink down to garage size, and then some YouTuber will.