Microsoft CTO defies critics: AI progress not slowing down, it’s just warming up

Microsoft CTO Kevin Scott doubled down on his belief that so-called large language model (LLM) "scaling laws" will continue to drive AI progress, despite some skepticism in the field that progress has leveled out. Scott played a key role in forging a $13 billion technology-sharing deal between Microsoft and OpenAI.

He has such a conflict of interest here that I find it hard to put any weight behind his opinion.
 
Upvote
218 (221 / -3)

ibad

Ars Praefectus
3,717
Subscriptor
AI will keep improving, but pure LLM-based systems will always be prone to hallucinations and basic failures of reasoning and generalisation.

I think systems that are natively multimodal (like VLMs) may eventually do better, and systems that can engage in multi-step trains or trees of thought, or that try to capture an abstract world model (like Yann LeCun's I-JEPA), will move the needle forward as well. Metacognition and a system of motivations and objectives will help too, among other things.

But expecting a feed-forward neural network trained on the text-autocomplete task (aka an LLM) to become AGI by just feeding it more data and parameters/cores is folly. It will always be an unreliable, hallucinating mess, and feeding it enough high-quality data to overcome that will be infeasible, as will running a model large enough.

The MS CTO either doesn't know better, or he's trying to keep investors sweet.
 
Upvote
81 (83 / -2)

peterford

Ars Praefectus
4,010
Subscriptor++
LLMs may be highly useful in some meaningful areas, but I think they need an internal world model to take the next step - and I think you need multimodality as well.

And my pet theory:

Self-awareness is bootstrapped when your world model contains representations of other agents' world models that contain you. "If I take this action, they will think this of me..."
 
Upvote
26 (28 / -2)

lewax00

Ars Legatus Legionis
17,402
I've been playing around with running LLMs locally, and from my minor bit of experience, he's probably right that more compute power can get us better models...I mean, there's a stark difference between a 7 billion parameter model and an 80 billion parameter one.

But that order-of-magnitude increase in parameters comes with a similarly large increase in some of the hardware requirements, and the increase in hardware costs is even greater (e.g. getting 10x as much VRAM to load the larger model into memory would cost more like 40x, and that's before considering other changes that might be needed, like power and cooling; renting high-end compute instances is similarly expensive...alternatively, I can sit around watching it spit out a word every couple of seconds and pay the cost in time). So I don't know that we can say there aren't diminishing returns already, once you take everything into account.
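To put rough numbers on that, here's a minimal sketch. The parameter counts and bytes-per-parameter figures are ballpark assumptions, counting weights only and ignoring KV cache, activations, and framework overhead:

```python
# Back-of-the-envelope VRAM needed just to hold an LLM's weights.
# Assumption (ballpark): memory ~= parameter count * bytes per parameter;
# KV cache, activations, and framework overhead are ignored.

def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    # 1e9 params * N bytes/param = N gigabytes per billion parameters
    return params_billions * bytes_per_param

for params in (7, 80):
    for precision, nbytes in (("fp16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)):
        gb = weight_memory_gb(params, nbytes)
        print(f"{params}B model @ {precision}: ~{gb:.0f} GB of VRAM")

# ~14 GB for a 7B model at fp16 fits on one consumer GPU; ~160 GB for an
# 80B model needs several datacenter GPUs, hence the nonlinear jump in cost.
```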
 
Upvote
36 (38 / -2)

star-strewn

Ars Scholae Palatinae
769
Subscriptor++
We haven't seen the absolute end of LLM improvements, certainly. The technique hasn't been integrated yet into many tools.

But is this wave of shocking improvement over? It's at least starting to crest, I'd estimate. Every hype wave comes with a trailing crash, and amusingly the history of AI is one of the best examples of this.
 
Upvote
18 (21 / -3)

fredrum

Ars Scholae Palatinae
817
When did the 'Tech Industry' become the tech industry? I mean, we've had computers for more than 50 years; there's been a constant stream of new gadgets, from color TVs to flying cars to iPods and video game consoles. All kinds of software, good and bad, have been created by programmers all over the world during the same time.

But it's only quite recently that we've had to suffer the endless overhyping of quite niche and very specific technologies. (No need to repeat them; you know which I mean.) It's like there's a separate business which is just this incessant overhyping of 'the new thing'.

My theory is that there's just too much money being collected by the top of the 'investor class'. Money is trickling upwards and not getting taxed enough, and the result is a bunch of people with huge amounts of money burning holes in their pockets that they then need to invest. And they have huge amounts of money with which to pump/promote their own investments.

I guess 'tech' is the new 'financial products'?
 
Upvote
61 (62 / -1)
ibad said:
AI will keep improving, but pure LLM based systems will always be prone to hallucinations and basic failures of reasoning and generalisation. … The MS CTO either doesn't know better, or he's trying to keep investors sweet.
I'd be hallucinating too if I saw all that stuff they are feeding LLMs from the internet!
 
Upvote
39 (39 / 0)
Not being an ML expert, I don't know how much further scaling will go, but I'm pretty sure progress isn't going to stop when that limit is hit. In terms of model architecture, new modalities, interplay between models, connection with other systems, novel data filtration and labeling, optimization, etc., I feel we're just getting started.
 
Upvote
-3 (12 / -15)

42Kodiak42

Ars Scholae Palatinae
807
LLM scaling laws refer to patterns explored by OpenAI researchers in 2020 showing that the performance of language models tends to improve predictably as the models get larger (more parameters), are trained on more data, and have access to more computational power (compute). The laws suggest that simply scaling up model size and training data can lead to significant improvements in AI capabilities without necessarily requiring fundamental algorithmic breakthroughs.
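The shape of those laws is simple to sketch. Here's a minimal illustration of the power-law form from that 2020 work; the constants are roughly the paper's fitted values for the parameter-count term, but treat them as illustrative rather than authoritative:

```python
# Kaplan et al. (2020) style scaling law: test loss falls as a power law
# in parameter count N. Constants below are roughly the paper's reported
# fits, used here for illustration only.

def predicted_loss(n_params: float, n_c: float = 8.8e13, alpha: float = 0.076) -> float:
    return (n_c / n_params) ** alpha

for n in (1e9, 1e10, 1e11, 1e12):
    print(f"{n:.0e} params -> predicted loss ~{predicted_loss(n):.2f}")

# Each 10x in parameters multiplies the loss by the same constant factor
# (~0.84 here): steady, predictable gains, but with diminishing returns
# per order of magnitude of spend.
```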

LLMs have "improved" in the last few years in ways that have given them more capabilities? I've seen the benchmarks about how every new AI trounces its competitors, only to go on to solve no new tasks that the old model was unsuitable for. I've seen articles about how you can coax ChatGTP to solve some math problems if you really put in the work. I've seen articles about how LLMs have achieved passing grades on the bar exam, only to get lawyers in serious trouble as ChatGTP cites entirely made up court cases. I've seen weather prediction AI achieve some higher fidelity and accuracy forecasts, with the caveat that "We wouldn't trust this thing to predict novel extremes and unusual weather patterns." I've seen news about professors failing classes after simply asking ChatGTP if the AI wrote the papers for his class. And I've seen Google's Bard tell people to put glue in their pizza cheese.

Has any of this changed? Am I missing some novel capability that AI has enabled beyond making propaganda spam easier?
 
Upvote
72 (78 / -6)

Demosthenes642

Ars Scholae Palatinae
1,268
Subscriptor
I'm not really sure how the idea that "if we keep shoveling data into the box it will magically morph into an AGI" keeps finding legs. Layers of meta-cognition don't just coalesce out of nowhere. At a certain level these folks are effectively arguing that taking the sum total of human text output and throwing all the compute at it will distill out the knowledge and behavior and produce something that can perform like a human. It's an odd bit of philosophical wrangling that I'm not convinced adds up.
 
Upvote
58 (62 / -4)
peterford said:
Self awareness is bootstrapped when your world model contains representations of other agents' world models that contain you. "If I take this action, they will think this of me..."
If you haven't, you should read I Am a Strange Loop, Douglas Hofstadter's exposition of his theory of consciousness. It's quite similar, and based around the idea of recursive mental models.

On the topic of the article, the problem is the seemingly exponential resources required for these models to achieve more-or-less linear improvements. Some back-of-the-napkin math (consider it more of a Fermi estimate) was done here:

From https://www.astralcodexten.com/p/sam-altman-wants-7-trillion

GPT-5 might need about 1% of the world's computers, a small power plant's worth of energy, and a lot of training data.

GPT-6 might need about 10% of the world’s computers, a large power plant’s worth of energy, and more training data than exists. Probably this looks like a town-sized data center attached to a lot of solar panels or a nuclear reactor.

GPT-7 might need all of the world’s computers, a gargantuan power plant beyond any that currently exist, and way more training data than exists. Probably this looks like a city-sized data center attached to a fusion plant.

Building GPT-8 is currently impossible. Even if you solve synthetic data and fusion power, and you take over the whole semiconductor industry, you wouldn’t come close. Your only hope is that GPT-7 is superintelligent and helps you with this, either by telling you how to build AIs for cheap, or by growing the global economy so much that it can fund currently-impossible things.
Which suggests that without a fundamental change in the algorithms/approach, this technique is approaching fairly hard boundaries in the next 10-20 years.
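You can reproduce the arithmetic behind that quote in a few lines, assuming (as the estimate implies) that each generation needs roughly 10x its predecessor's share of the world's total compute:

```python
# Toy version of the quoted Fermi estimate. Assumption (from the quote,
# not a measurement): each GPT generation needs ~10x the previous one's
# share of world compute, starting at ~1% for "GPT-5".

share = 0.01  # GPT-5 ~ 1% of the world's computers, per the quote
for gen in range(5, 9):
    if share <= 1.0:
        print(f"GPT-{gen}: ~{share:.0%} of world compute")
    else:
        print(f"GPT-{gen}: ~{share:.0f}x more compute than exists")
    share *= 10

# Under these assumptions GPT-7 already saturates world capacity, which is
# the "hard boundary" the comment above is pointing at.
```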
 
Upvote
37 (38 / -1)
I wonder how colossal of a bad investment Microsoft would need to make to sink the company's finances. 🤔

Microsoft has its fingers in so many defense, federal government, healthcare, etc. pies that it will absolutely get the too-big-to-fail treatment if the coming AI crash wipes it out.
 
Upvote
33 (35 / -2)

Northbynorth

Ars Praetorian
494
Subscriptor++
LLMs may still have room to improve, but getting significantly more high-quality training data may be a challenge: many high-quality sources seem to be restricting access. Of course, you could "prune" current training data sets more carefully to increase their quality, but then you end up with smaller training sets.
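That tradeoff is easy to picture in code. A minimal sketch, where the quality scores are made up for illustration and stand in for whatever real filter (a classifier, perplexity cutoff, dedup heuristic) an actual pipeline would use:

```python
# Sketch of the quality-vs-quantity tradeoff in training-data pruning.
# The (text, quality_score) pairs and thresholds are invented for
# illustration; a real pipeline would score documents with a learned
# classifier or heuristic.

corpus = [("doc_a", 0.9), ("doc_b", 0.4), ("doc_c", 0.7), ("doc_d", 0.2)]

def prune(docs, min_quality):
    """Keep only documents at or above the quality threshold."""
    return [text for text, score in docs if score >= min_quality]

for threshold in (0.0, 0.5, 0.8):
    kept = prune(corpus, threshold)
    print(f"threshold {threshold}: kept {len(kept)}/{len(corpus)} documents")

# Raising the bar yields a cleaner corpus -- and a smaller one, which is
# exactly the bind once the easy high-quality sources are exhausted.
```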
 
Upvote
9 (10 / -1)

lewax00

Ars Legatus Legionis
17,402
LLMs have "improved" in the last few years in ways that have given them more capabilities? I've seen the benchmarks about how every new AI trounces its competitors, only to go on to solve no new tasks that the old model was unsuitable for. I've seen articles about how you can coax ChatGTP to solve some math problems if you really put in the work. I've seen articles about how LLMs have achieved passing grades on the bar exam, only to get lawyers in serious trouble as ChatGTP cites entirely made up court cases. I've seen weather prediction AI achieve some higher fidelity and accuracy forecasts, with the caveat that "We wouldn't trust this thing to predict novel extremes and unusual weather patterns." I've seen news about professors failing classes after simply asking ChatGTP if the AI wrote the papers for his class. And I've seen Google's Bard tell people to put glue in their pizza cheese.

Has any of this changed? Am I missing some novel capability that AI has enabled beyond making propaganda spam easier?
I don't think there's been anything paradigm-shifting, it's more of a difference in subjective quality. It's easier to see when you compare two models side-by-side with the same inputs.

I think it's so vague because it's really hard to quantify. Like, if you know two people, and one is significantly smarter than the other, it's probably apparent to you, but hard to explain the exact difference to someone who doesn't know them, beyond maybe relying on anecdotes about when either one did something particularly smart or dumb.
 
Upvote
13 (13 / 0)

Fatesrider

Ars Legatus Legionis
22,893
Subscriptor

Will LLMs keep improving if we throw more money at them? OpenAI dealmaker has too much sunk cost to say no.

I wonder what he was saying or hearing in that picture, because those are not happy eyes.

That's one of those "His smile didn't extend to his eyes" photos.

The sunk costs in the project are enormous, and the likelihood that it will not scale much more without some major technological breakthroughs in both hardware and coding is exceptionally high, considering that no amount of hardware or new kinds of coding (that I'm aware of, anyhow) thrown at it thus far has improved reliability by any discernible amount.

It'll be interesting to see if this is just an episode of "Dot Com Bubble Bursting: The Sequel".
 
Upvote
13 (14 / -1)
I wonder how colossal of a bad investment Microsoft would need to make to sink the company's finances. 🤔

You could ask ChatGPT:

"In its last accounting year, Microsoft reported a revenue of $211.9 billion for the fiscal year ending June 30, 2023."

It will be higher for 2024.

So to sink the company's finances, I would say their losses would need to be on the scale of hundreds of billions of dollars.
 
Upvote
-12 (4 / -16)