In this insightful discussion, experts explore the future of AI agents, the challenges of improving their reasoning capabilities, and the role of planning alongside large AI models. They delve into the progress made in reasoning through advancements in computing, algorithmic developments, and software engineering. The video highlights the role of next-generation hardware in enabling sophisticated AI systems, mentioning a new paper that showcases improvements in AI agents' reasoning through reinforcement learning and other techniques.

The video further examines the differing responses of creative and technical communities to the influx of generative AI tools. While some creators see AI as a threat to their craft, others in technical fields like software development view it as a productivity booster. The conversation also touches on the balance between creativity and automation, noting that AI can take on tedious tasks, allowing professionals to focus on more meaningful work and innovation.

Main takeaways from the video:

💡 AI advancements are being driven by algorithmic progress, hardware innovations, and enhanced software engineering.
💡 There is a significant disparity in how creative and technical communities perceive the impact of AI, which could influence the future direction of AI tools and systems.
💡 The evolving landscape of AI requires a balance between leveraging new technologies for productivity while preserving creativity and craft in various professional domains.

Key Vocabulary and Common Phrases:

1. algorithmic [ˌælɡəˈrɪðmɪk] - (adjective) - Relating to a set of rules followed in problem-solving operations, especially by a computer. - Synonyms: (computational, procedural, systematic)

Very clear algorithmic progress.

2. heuristic [hjʊˈrɪstɪk] - (adjective) - Involving or serving as an aid to learning, discovery, or problem-solving by experimental and self-learning methods. - Synonyms: (exploratory, investigative, experimental)

We always came up with some heuristic huge data corpus, tried something out...

3. monotonically [ˌmɒnəˈtɒnɪkli] - (adverb) - In a way that moves in only one direction, always increasing or always decreasing without ever reversing. - Synonyms: (steadily, consistently, unidirectionally)

I think that definitely is monotonically increasing.

4. esoteric [ˌɛsəˈtɛrɪk] - (adjective) - Intended for or likely to be understood by only a small number of people with a specialized knowledge or interest. - Synonyms: (obscure, abstruse, specialized)

The training system market is a traditionally very esoteric market...

5. reinforcement learning [ˌriːɪnˈfɔːrsmənt ˈlɜrnɪŋ] - (noun) - A machine learning training method based on rewarding desired behaviors and/or punishing undesired ones. - Synonyms: (reward-based learning, trial and error, AI learning)

That was actually a question I was thinking a lot about, because they talk about reinforcement learning as part of that

6. generative AI [ˈdʒɛnərəˌtɪv eɪˈaɪ] - (noun) - Artificial intelligence technology that can generate text, images, or other media in response to prompts. - Synonyms: (creative AI, production AI, synthesis AI)

What are we expecting next? How do we put planning and reasoning alongside this large representation of the worlds we have now? Are we going to have products that truly never incorporate generative AI? I think never is such a strong word

7. hyperscalers [ˈhaɪpərˌskeɪlərz] - (noun) - Companies that offer platforms for handling large-scale computing tasks, especially those that provide scalable cloud computing environments. - Synonyms: (cloud providers, tech giants, large-scale data processors)

So for them, it's a way to get into the hyperscalers with a solution where they say...

8. prompt engineering [prɒmpt ˌɛndʒɪˈnɪərɪŋ] - (noun) - The process of designing prompts to efficiently communicate with AI models and achieve desired results. - Synonyms: (prompt design, AI query formulation, input structuring)

Even like the prompt engineering side.

9. universal basic compute [ˌjuːnɪˈvɜrsəl ˈbeɪsɪk kəmˈpjuːt] - (noun) - A concept suggesting that computing power should be freely available to all, akin to universal basic income. - Synonyms: (free computing resources, public computing access, open compute)

So we're moving to a world of universal basic compute, it sounds like.

10. Silicon Valley [ˈsɪlɪkən ˈvæli] - (noun) - A region in the southern part of the San Francisco Bay Area in Northern California that serves as a global center for high technology and innovation. - Synonyms: (tech hub, innovation center, tech region)

And there's a plethora of startups in Silicon Valley who are trying to make super low power, et cetera.

Agent Q, no AI in art, and AMD acquires ZT Systems

AI agents. What are we expecting next? How do we put planning and reasoning alongside this large representation of the worlds we have now? Are we going to have products that truly never incorporate generative AI? I think never is such a strong word.

What's the most exciting thing happening in hardware today? It's nice to see that finally we built big computers again. I'm Brian Casey, and welcome to this week's episode of Mixture of Experts. We let Tim go on vacation this week, so you're stuck with me. And I'm joined by a distinguished panel of experts across product and research and engineering: Volkmar Uhlig, who is the VP of AI infrastructure; Chris Hay, who is the CTO of customer transformation; and Skylar Speakman, senior research scientist.

There's been a lot of discussion in the market around reasoning and agents over the last six months or so. And so the question to the panel is, do we think we're going to get more progress in building reasoning capabilities, and this is just over the next year or so, through scaling compute, algorithmic progress, or good old-fashioned software engineering? So, Volkmar, over to you. Very clear algorithmic progress. Chris, software engineering. All right, Skylar, algorithmic, that's the next step.

All right, I like it. We got some different opinions on this, and this actually leads us into our first segment that we're going to be covering today, which is a company called MultiOn released a new paper around Agent Q. And this paper is demonstrating improvements in reasoning and planning. And the scenario they defined in the paper, which was using an agent to actually book restaurant reservations, was using LLMs combined with other techniques like search, self-critique, and reinforcement learning. And they demonstrated some, like, order of magnitude improvements in just the success rates of LLMs.

And so maybe, Skylar, as a way of just kicking us off, I'd love to hear a little bit about, just, like, why do LLMs struggle so much today with reasoning? And, like, why is some of the work going on in this space, exploring other ways, like, so important to making progress?

So LLMs have this amazing ability to build a world model. I think I've seen that phrase popping up more and more. Sometimes it will get criticized and say, oh, all they're doing is predicting the next word. But in order to predict the next word as well as they do, they actually do have this. I'm not going to say understanding might be too long of a stretch, but they have this model of the world.

Up until these new, recent advancements, they had no real reason, motivation, agency, whatever you want to call it, to really go out and explore that world. But they had created that model of the world and they could answer questions about it. So I think this idea of LLMs being limited to creating the model of the world, they did a very good job of that.

I think some of these next steps now are: all right, now that we've got a representation of the world, which is pretty good at the next token prediction problem, how do we actually execute actions or make decisions based on that representation? And so I think that's kind of this next step we're seeing, not just from Agent Q, but lots of research labs here are really trying to figure out how do we put planning and reasoning alongside this large representation of the worlds we have now. So I think these guys are off to a good start. One of the first ones to kind of put something out there, put the paper down, you know, available for people to read. Lots of other companies are working on it as well.

So I wouldn't necessarily say these guys are ahead of the pack.

Yeah, maybe, Chris. I know we were talking a little bit about this, which is, like, how indicative do you think some of the work that the team did here is of just, like, where everybody's going in this space? Like, is this paper just like another piece of data in, like, what is a continuation of everybody sort of exploring the same sort of problems? And do we think this is pretty dialed in on where the problem space is going to be around agents over the next year or so?

I think it is actually pretty dialed in. When I read the paper, it's similar to some of the stuff that we're doing with agents ourselves. So that's always goodness there. But if you really look at what's going on there, they're not really using the LLM for the hard bits. They're using a Monte Carlo tree search to actually work that out. So one of the major things that they're doing is they're using a web browser as a tool. So if they're trying to book a restaurant, for example, then what they're actually doing is doing a Monte Carlo tree search, and they're navigating using that tool to different spaces. They're using the LLM to self-reflect. They're using the LLM to create a plan in the first place of how they're going to book that restaurant. But they are relying on outside tools or relying on outside pieces like the tree search, to be able to work out where they're going. And the fact is that is because LLMs are not great at that.

Right. So it's like it's more of a kind of hybrid architecture in that sense. And everybody's doing the same thing with agents as well. Right. You're bringing in tools, you're bringing in outside memory, you're bringing in things like graph searches, for example. So GraphRAG is becoming really popular in these spaces. Everybody's sort of bringing in planning and reasoning as well. I think they're doing some really interesting stuff there with the self-reflection and the fine-tuning, so that it's more of a kind of virtuous circle in there within the paper.

So I think they're probably further ahead than a lot of people in those spaces. But even if you look at the open source tools, the open source agent frameworks, we started with things like LangChain, but now you will see things like LangGraph becoming really popular, and then you're moving into other multi-agent collaborations such as CrewAI. So everybody's on a slightly different slant on where they are in this journey, but they're definitely on the right track, I would say, at this point in time. And by the way, back to my earlier argument, that is software engineering, my friend. That is not doing anything different with the LLM, it is engineering and putting stacks and frameworks around your toolset.
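The hybrid pattern described here, where a tree search does the exploring and the model only proposes and judges steps, can be sketched in a few lines. This is a toy illustration, not Agent Q's implementation: the "booking" task, the action names, and the reward are all hypothetical, and the parts an LLM would play (drafting the plan, self-critique) are replaced by random rollouts and a fixed reward function.

```python
import math
import random

# Hypothetical toy "booking" task: the agent must take these steps in order.
ACTIONS = ["open_site", "pick_time", "confirm", "go_back"]
GOAL = ("open_site", "pick_time", "confirm")


def reward(state):
    """1.0 if the finished action sequence is the goal, else 0.0."""
    return 1.0 if state == GOAL else 0.0


def is_terminal(state):
    return len(state) == len(GOAL)


class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = {}  # action -> Node
        self.visits = 0
        self.value = 0.0    # sum of rollout rewards seen through this node

    def untried(self):
        return [a for a in ACTIONS if a not in self.children]


def uct_child(node, c=1.4):
    """Pick the child balancing average reward against how rarely it was tried."""
    return max(
        node.children.values(),
        key=lambda ch: ch.value / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def rollout(state, rng):
    """Finish the episode with random actions and return the final reward."""
    while not is_terminal(state):
        state = state + (rng.choice(ACTIONS),)
    return reward(state)


def mcts(iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(())
    for _ in range(iterations):
        node = root
        # 1. Selection: walk down while the node is fully expanded.
        while not is_terminal(node.state) and not node.untried():
            node = uct_child(node)
        # 2. Expansion: add one new child if the node is not terminal.
        if not is_terminal(node.state):
            action = rng.choice(node.untried())
            child = Node(node.state + (action,), parent=node)
            node.children[action] = child
            node = child
        # 3. Simulation: random rollout stands in for trying things in a browser.
        r = rollout(node.state, rng)
        # 4. Backpropagation: credit every node on the path.
        while node is not None:
            node.visits += 1
            node.value += r
            node = node.parent
    # Read out the plan by following the most visited child at each level.
    plan, node = [], root
    while node.children:
        action, node = max(node.children.items(), key=lambda kv: kv[1].visits)
        plan.append(action)
    return plan


print(mcts())
```

In a real agent the rollout and reward would come from actually driving the tool and having a model critique the outcome, which is where most of the engineering effort goes.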

To that point, Brian, I do want to hear Volkmar's take on why algorithmic was his pick. So you have to hold us to our answers and who's going to go next.

So my background is I built self-driving cars for seven years, and this was always this decision between how much software engineering can we do and how much can we train into a model. And then in many cases, what Chris just said is it's oftentimes the packaging of different technologies together. And I think where we are right now is we have, as you mentioned, this really powerful tool, which is LLMs, that we have some basic form of understanding and we have the world model, and now we are trying to make something do stuff which we haven't seen.

It's not, oh, just predict the next thing you do on OpenTable, right? And so now you're in an unknown open world where you need to explore different choices. And then I think what the next step will be: you run this brute force and then once you have those choices, you actually will train a model. That's my expectation, because that's the path I've been on with driving. So we always came up with some heuristic huge data corpus, tried something out, and then in the end, it was always like, oh, yeah. Now that we figured out what the underlying problem is, let's train a model to make this more efficient in execution.

In the end, the model is just an approximation of an extensive search. And so I think that's why, algorithmically, I believe that the algorithms we will build are effectively those graph searches, tree searches, et cetera, which ultimately then will feed into a simpler representation, which is easier and in real time to compute.

I was kind of disappointed by the paper, if I'm honest, and I'll tell you why, and Brian's dreading what I'm about to say now, but I'll tell you why. I was disappointed because the whole example was the OpenTable example.

Now, unless I am wrong, and I don't think I am, isn't MultiOn the company that claimed that they were the agents behind the strawberry man, the iruletheworldmo Twitter account? So, you know, that would have been the agent example I would have wanted to see in the paper.

It is. That was actually a question I was thinking a lot about, because they talk about reinforcement learning as part of that. And one of the interesting things that I've just seen in the market the last, I don't know, a few months or so, is there's this light backlash happening to LLMs within the ML community, even a little bit.

Particularly, I think, the people who have worked a lot in reinforcement learning, and you even heard folks, like, people talking about LLMs being a detour on the path to AGI. And I'm seeing, like, as we've slowed down a little bit in terms of progress, I've seen, like, the folks who love, who operate in those kind of reinforcement learning spaces, like, starting to pop their heads up more and being like, hey, it's back. The only way we're going to make progress around here is some of these other techniques. And I'm curious, like, maybe two questions is maybe I'll start with this one is like, do you all think if we fast forward to a world where agents are a much more significant part of just the software that we're all using every day, do we think LLMs are, like, the most important part of that? Or, Chris, to your point, around this paper, that the extensive use of lots of other techniques, do we think, like, a bunch of other techniques are going to come and rise back to prominence as we actually try to make these things do stuff? Maybe I'll stop there and just see if anybody has a take on that.

Yeah, I definitely think RL is going to come back into this. I know they were using RL in that paper and they were also using things like DPO and stuff. But I think it's going to come back into this. So I keep thinking back to AlphaGo and the DeepMind team, you know, winning at Go there. And again, they were using similar techniques, as you could see in that paper there. But if you take a deep learning algorithm today on your machine and you get it to play the simple game of Snake, or play the Atari games like DeepMind did, very, very simple architectures like CNN, DNN type things absolutely rock that game.

If you get an LLM to play, and it doesn't matter whether it's an agent or not, that is the worst playing of Snake I've ever seen from frontier models. Right? And GPT-4o is terrible at it. You know, Claude is terrible at it. They're all terrible at playing these games. But really simple RL, deep learning, you know, CNN style architectures actually rock at those games. And therefore, I think that as we try and solve and try and generalize, I think some of those techniques that were really successful in the past are going to come back in the future, and I'm pretty sure that's where a lot of people are going at the moment.
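To make the point about simple RL concrete, here is a minimal sketch of tabular Q-learning on a toy task. It is far simpler than Snake or Atari and is not the CNN-based setup mentioned above; the six-cell corridor, the reward of 1 at the last cell, and all hyperparameters are assumptions chosen for illustration.

```python
import random

N_STATES = 6          # cells 0..5 in a one-dimensional corridor (hypothetical task)
GOAL = N_STATES - 1   # reaching the last cell ends the episode with reward 1
ACTIONS = (-1, +1)    # step left or step right


def step(state, action):
    """Move within the corridor; return (next_state, reward, done)."""
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit, sometimes explore; ties are random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, r, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward reward plus
            # the discounted best value of the next state.
            target = r if done else r + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


def greedy_policy(q):
    """Best action (as -1 or +1) for every non-goal cell."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:GOAL]]


print(greedy_policy(train()))
```

The agent is never told the rules; it learns to walk right purely from the reward signal, which is the kind of trial-and-error behavior the speakers contrast with next-token prediction.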

So we're going to see software engineering, we're going to see improvements in architecture, we're going to see improvements in algorithms, it's going to stack, stack, stack. And hopefully all of these techniques will come together into hybrid architecture. But when you take LLMs and put them into an old sort of gaming style environment, they absolutely fail today.

Do we think there will be general purpose agentic systems over the next short term, let's say like next couple of years, or is everything going to be task specific? Because one of the nice things, Chris, to the point about this thing being OpenTable, like go book a reservation, it's a very easily definable objective. And that means that you can pull in a bunch of these other techniques in ways that are harder to make kind of fully generalizable.

And so it's like, when we look at agents, do we think we're going to make a lot of progress on kind of generalizable agents over the next year or two, or is everything going to be just in this task specific land? Skylar, maybe it looks like you got some thoughts on that.

No, I don't think we'll have general within two years. I think there will be some areas, and this might even lead to our next topic, areas around language, creativity. I think that will surpass some humans' abilities, but the world works on much more boring, mundane business processes. And I think there's still a lot more ground to make on that to get those systems to a level of trust that people will use.

It's one thing to have these methods create a funny picture, write a funny story, but to have LLMs execute financial transactions on your behalf, different ball game, and we're not going to be there within two years. I'll be proven wrong. You can timestamp this, that's okay.

But yeah, no, we're always accountable for our predictions on this show.

So Brian, I think where we may go is we will probably get... Now we are going through examples, OpenTable, and we try another 20. I think we will get into a tooling phase where you can actually explore a domain with some human intervention and some human guidance. You will have tools which can explore, let's say a webpage, how to interact with it. You may go through some pruning process which may be manual, but I think we will get to more automation that it will be ten times or 100 times faster to build this.

But I think as Chris said, there will be a software engineering component to it which until we are fully autonomous, you just point at something and say learn. That will take a while. And then the question is, where does the information come from? Is it through trial and error? Or we could even just read the source code of the webpage. We have source code and protocol on business processes. I can just give you. Here's my billion lines of code of SAP adoption.

For the second story, there was the CEO of this company, Procreate. They are a company that builds and designs illustration tools. And I think it was on Sunday night their CEO came out and released a video in which he actually said he hates gen AI. I think he actually used the word hates to describe it. And he said that they were never going to include gen AI capabilities inside of their product. And the reaction from their community and the design community broadly was super excited and supportive of this statement.

I think as of the time of recording, that video has got almost 10 million views on Twitter. I have a bunch of different reactions to that that hopefully we can pick apart here a little bit. But one of the things that was like most striking to me is the way two different sets of creator communities have reacted to the arrival of LLMs. I have friends and colleagues who are software engineers and LLMs for code, people are generally pretty enthusiastic about that. Look at it as a great productivity tool. They can get more work done than they were ever able to do before.

I also have friends and colleagues who are writers who work in Hollywood, who are creatives and who, like, look at the arrival of some of this technology like the Grim Reaper, basically. And so it's just like wildly different responses from these two communities. And I'm just curious, like, maybe, Chris, throw it over to you to, you know, maybe get some initial thoughts and reactions to it, is like, have you made sense of, like, why these communities are responding so differently to this technology?

I think never is such a strong...

That was gonna be one of my other reactions to it.

Never? So far? Really? No feature at all? Yeah, I am never, ever gonna stream video content because I believe physical is more important. Well, you know what, you're out of business, Blockbuster. So I don't know. I think there is a general wave. I applaud them. Right. I think they make tools for their particular audience, and their audience doesn't want that. And I think that's going to be a unique differentiator. I'm not sure how that stands the test of time. I think never is such a strong word there. The industry is moving fast and different audiences have different needs.

Right. I mean, I'm pretty sure that if I use Procreate, there's no chance ever I'm going to produce anything that is of any artistic quality. And that is because I have no artistic talent. But you know what? For that, I am not the target audience. But I am grateful for AI-generated art because it allows me to produce something that I would never be able to produce otherwise. So things like PowerPoint slides, et cetera. So they are focused on the creative professionals, and creative professionals don't always want to have gen AI within that. And I understand that. That's great. You've got your audience, you've got your target, and that's fine. And I think there will always be an audience for that.

But I think that tide of time will push against them there. And I think that's really going to be a very strong artisan statement to me.

Before we move on, Chris, what sort of PowerPoint art are you doing?

I mean, generally, if I'm honest, it's almost always of unicorns with rainbow colored hair. That is my pretty CEO presentations. Every CEO loves a picture of a rainbow colored unicorn.

All the other ones do, you know, that resonates with me. But Skylar, Volkmar, I'm curious if either of you have takes on just the communities' reaction to these two different sets of tools.

So I think we are in a world where we have artists and craftsmanship, and we are going through a phase of automation of this artistry and craftsmanship. And so the bar will be really, really high and there will be always unique art. We still today, I can buy a photograph, I can buy a copy of a Monet, some of the greatest artists in the world, and can hang it on my wall. But there is still a need and a demand by people to have unique art, which is theirs. And I think that will stay. And we've seen this across the progression of time.

Horses used to be forms of transportation, and now they are a hobby. And old cars are going the same way. Hopefully at some point, that's with airplanes. And I think these unique pieces of art, if I can automate the creation and I can industrialize it, the industrialization wins. It always wins. But it doesn't mean that those tools and those artists and that craftsmanship shouldn't be supported. It will just shrink dramatically because the capabilities become more accessible to everybody. If you used to have typists, now everybody can type, all the typists are gone, and it will be the same thing.

One of the things I thought was interesting is that you made this point about craft. I think a lot of people choose their life's work because they like the craft of that. They chose to be an artist or a developer because they like doing that work. And so having a tool come in and do all of it for them is robbing some degree of value from the things that they do day in and day out. And one of the things that I was also thinking about, and I'm just curious if within your teams, within your own body of work you're doing with clients, do you also see, like, one of the other places where I was thinking about tension around this sort of dynamic is in the relationship between management and practitioners, where, like, one of my observations is that, like, management is oftentimes particularly enthusiastic about adopting these tools because of the productivity benefits.

Like, I can get more things done, I can reduce my costs, I can drive more revenue, whatever it might be, because those are the things that they're running their entire organization to deliver those results. And in some cases, they've become, as they've gotten more senior, maybe one step removed from actually doing the craft. So the loss of the craft maybe feels like less of a consequence to management sometimes, but to practitioners, it's like, this is my thing and this tool is coming around and just like doing it for me in some cases. So I'm curious if you all have also observed any sort of like when it comes to adoption of some of this stuff, any tension between management and practitioners in terms of their level of enthusiasm for this technology.

I'm not sure about tension of management and practitioners. There might be some I've witnessed of which flavor or which version. So they're going to say, no, we're going to use this one. But actually, behind the scenes, somebody's using a different tool and some tension back and forth on that one. So it's not necessarily the adoption, but maybe the channel or the tool has had a bit of that one or this one. So yeah, that would be what I've observed.

I think it's also the question when you look at craftsmen, there's 20% of work you love and 80% of work you hate. Oftentimes it's like the majority. I mean, actually data scientists, like, 80% data cleaning. Do you think they like data cleaning? No.

Right. So I think the tools, like, if they support the toiling, the useless work and make people more productive, then, you know, you shift more into the work which you actually like and appreciate. So I think from the engineering, I mean, mostly talking software engineers here, from the engineering perspective, I think it's actually an improvement. Nobody likes Jira ticket reviews and writing comments and all that stuff. If that can be automated away, then that's an improvement in the life of people. Or I don't need to go to Stack Overflow and try to find that algorithm. I can just ask the model to write it and I'm done. And so I'm more at the architectural level and I think from a management perspective, they want to get productivity out of it, but productivity in an engineering process in many cases is that you need to convince all the people to do these pieces of work because they're necessary for the product, but everybody hates them. And I think to a certain extent it's an improvement on both sides.

That's a great point. I always, well, it's probably not a safe-for-podcast description of it, but I always like to tell the team we share those things amongst the team, so everyone should mentally come to terms with some percentage of your job is the work that none of us want to do in this team, but we're at least going to spread it around the group a little bit. But that description, like actually. So a lot of the teams that I work with operate a lot on just like IBM.com, do a lot of things around content, and the .com property has tens, hundreds of thousands, millions of pages as part of it. And we're trying to do way more with automation and how we connect content together and stuff like that.

It turns out in order to do that, all your tagging has to be really good across the entire property, across tens of thousands of pages. And it's like, oh my God, the amount of time that we are going to spend cleaning up the metadata on this chunk of the website, it's like just kill your calendar for three days for the whole chunk of the organization to go through this stuff. And if we can instead build just a really good classifier and ways of doing that, that type of stuff actually lands like a huge relief and like, lets us focus on doing the work that we actually signed up to do.

So, like, at least within my team, like, that's a lot of what we're doing is we're looking at this type of tedious work that is really, it's important and it has to get done, to your point. But like, nobody really wants to spend their day doing that. Can we do as much of that so we can actually like focus on doing the work we want to do? But like, when it comes to using LLMs for like the core, core thing that we're doing, everybody's still a little skittish, honestly, at least in some of these now it's not on the software engineering side of our teams, but on like some of like the, you know, more creator side of it.

So it's like some of this, some of these announcements, like, kind of resonating with me because I see it with some of the folks that I work with a lot. I think one of the other things is I don't think it's just tedious stuff. I think for kind of prototyping type stuff, you know, and ideating, it's really good, like, so. And I don't think it matters whether you're producing content or you're producing code or you're producing images. Sometimes you're like, I have an idea. Is this going to work? It's going to take me quite a lot of time to sort of build that out. Let's just get the LLM to do something or the image generator to go through this a little bit.

I get an idea what it looks like and then I'm going to start pruning it and then I'm going to start building the idea a little bit more. And I personally, again, more from a software development side of things. That's kind of how I work. So at the moment, I'm sort of trying to create a distributed, a distributed parameter service for training LLMs. There is no chance that I would be able to just sit and code that straight up myself, right. I need an LLM to help me out, figure this out a little bit, and then I will engineer through where I need to be with that, right? And I think that is true.

And it's the same with image generation, right? It's like, you know, if you're doing a concept and you need that unicorn with rainbow colored hair, get the board presentation image model. Exactly. Get it. Get it out there. And then you go, okay, you know, that, that doesn't quite work in this context. You know, I need this. And then you can go and draw your pretty unicorn at that point, right. But I think prototyping is a really important use case. And I think, Chris, like, when you're doing that prototyping, right, it's like you can have a dialogue, you know, with a machine, and you get major refactorings done in seconds, right?

So you can just, like, this other thing, let me split this into four classes or let me collapse them. The amount of work you would have to do. And that's all the tedious stuff, you know, refactoring of code, and we have IDEs to do that, but they kind of suck. So if you can actually get an LLM to do that, it's just amazing. And like, the time it saves, you can do it in an hour and, you know, somewhere on a plane, and you can actually write massive amounts of code and experiment with it.

Brian, before we leave this topic, I think we just need to remind ourselves that you asked kind of an art question to three nerds. I'm safe in saying that, right? I mean, just put a disclaimer here. I think it would be a fascinating conversation to have artists' representation on this question. So all of this just taken, you know, we're talking about inevitability and tools and all of that. And I think that's where our brains go. But really fascinating to have this conversation with the artists, with the craftsmen.

One of the reasons why is because I do have, like I said, friends who do both of these things. And I have just, like, observed how different the reaction is from them and from the communities that they operate in. There's a bunch of interesting economic factors here that play into this. I think there's less concern in some cases about real industry disruption happening with the software engineering community than there is on the creative side.

So I think there is a little bit of that core underlying economic anxiety that is not quite the same in those two places, even though you're really just dealing with just different types of models that are helping improve productivity in different types of domains. But it'll end up landing, I think, pretty differently, potentially. So I think it's a great point. We did not totally represent that other side of this, but it is just a super interesting topic, I think.

And I think one of the things that will be interesting is just to the point about "never." I feel like there's so many tools that you use as part of a workflow and you don't even know what the underlying technology is. It's like, if you want to take a background out of an image, do I know that's gen AI or something else or what? Like, do I even care in some cases? So, you know, in some of those places, I'm like, man, "never," really? But I think it'll be interesting to see, like, how this space evolves over the next couple years.

Earlier this week, AMD announced the acquisition of ZT Systems. And so I think, as everybody knows, like, the hardware space has been, like, one of the biggest winners, if not the biggest winner so far, in terms of the early days at least, of the gen AI and LLM sort of cycle. Obviously, we've talked and everybody's talked a ton about Nvidia, but AMD is obviously making a big play in this space as well.

Their CEO, Lisa Su, was on CNBC earlier this week, and she was talking about the acquisition. And one of the things is that AMD historically has invested a lot in silicon. They've invested a lot, and they're even doing more on the software side of it. And the way that they talked about this acquisition is that they were starting to bring together a stronger set of capabilities from, like, a systems perspective. And so maybe Volkmar, as just like a way of kicking things off, like, why is it so important? Like, why is this market moving from just, like, silicon to systems? And, like, why are systems, and, like, these almost vertically integrated systems within this space, so uniquely important?

So if you look at AMD and their offerings, AMD acquired ATI a decade or two decades back, and that's the heritage of their AI accelerators. And they have been kind of head to head with Nvidia over the years, and they own some spaces and Nvidia owns some spaces. I think what Nvidia did very well over the last couple of years is to look not only at the GPU itself, but at many GPUs in a box. And then when you go into training, you go multi-box. So you need many, many machines.

And the integration: if you look at the acquisitions Nvidia did, they acquired a company which is providing the software stack to run very large scale clusters, which is their Base Command product. And then they also acquired Mellanox, which is the leader in, like, reliable network communication. And so AMD is sitting there and it's like, okay, so what do we do? And they don't have a consolidated story for how they can put a 10,000 GPU training system on the floor. So they are kind of locked in the box and they are not yet at the scale where they could actually compete on the training side. And I think that is also the reason why Nvidia owns like 96% of the market.

When you're trying to train, you can pretty much only use Nvidia. And then you already did all the coding on Nvidia systems, and all the operators are implemented for CUDA and performance optimized, because otherwise you couldn't train the model. Then running it is kind of trivial, right? And so switching an ecosystem is really hard. Nvidia went down this route of having the DGX system, so they built full servers with all the network communication, et cetera. And AMD, I think, is just now catching up.

So they're catching up on the network side, on Mellanox: they announced Ultra Ethernet. And now they are catching up on how you get these big systems at scale into the industry, and they need to get into cloud providers. I think ZT Systems, being a boutique shop which makes very large scale infrastructure deployments happen, is a logical conclusion.

That makes... I think you mentioned training a lot. Maybe as a follow-up question to that, one of the observations I have just about the GPU market in particular is that it feels more vertically integrated than the world of CPUs does, at least somewhat.

One, I guess, would you agree with that characterization? And two, if you do, is building out the unique set of requirements, maybe around the training stack, like the underlying core force around why this market is, like, behaving the way it is and why it's behaving differently, or do you kind of see that story, like, differently than the way I just kind of laid it out?

I think the training system market is a traditionally very esoteric market, which is the high performance computing market.

And at IBM, we built, like, number one and number two TOP500 supercomputers with Blue Gene, you know, L, P, Q and the follow-on systems. And suddenly we are in a world where that is not anymore a domain of the labs which drop $100 million and get a computer. Suddenly every company which wants to train a network at scale needs similar technology. What we are seeing is, after 20 years or 40 years almost of HPC being a very esoteric field of, let's say, 50 supercomputers in the world, suddenly it's a, you know, it's a commodity, and you'll start up with two.

We should all have a supercomputer. Exactly. You know, say, oh, yeah, I need a supercomputer. You don't have one? And, you know, I got an unfinished basement, like, you know, the joke goes on, but I was like, I'm GPU poor, right? So I only have like 100. And if you want to play in that market, you need to actually offer a solution. And I think AMD has been traditionally in the desktop market with a GPU, or like the enterprise market with the GPU. And they sell servers, but they never built these systems.

Nvidia, being an actual GPU vendor, amazingly, has captured like 85% of the dollars spent in the data center. So it's like, yeah, your Intel chip, good luck. And a little bit of memory. And everything else, we take. We take the switches and we take the Ethernet cards and we take the GPU, and that's the other 85%. And so for AMD to get something deployed at scale, I think they need to have an offering which is on par. I think Intel with Gaudi is in a little bit better shape because they have partnerships over 50 years with Dell and Lenovo, et cetera. For them, it will be easier to get into that market because they already have an ecosystem, and that's not the case for AMD.

This is what I don't get, Volkmar. I actually don't get the acquisition, because let's say I was an apple company, not Apple, but an apple company. And in my market, everybody bought Red Delicious apples because they're great apples. But my company sold Granny Smiths. Nobody ate Granny Smith apples. Why would I buy a company that makes better packing boxes for my apples? That's my problem with it. I'm kind of like, if I'm spending $5 billion, spend $5 billion on getting better GPUs and go compete with Nvidia. That's where I don't quite understand it in my mind.

I think Nvidia figured out a way of actually delivering it, deploying it to partners, and to a certain extent, AMD got locked out in that space. So they need to find a way to market and protocol that way to market. If you look in the training space, a huge percentage of the training is actually happening with the hyperscalers. Companies want to put Nvidia cards on their premises, but in many cases, in the early beginnings, they go into the cloud, and ZT is delivering to the hyperscalers.

So for them, it's a way to get into the hyperscalers with a solution where they say, okay, we give you the whole thing, so you take down the risk on the hyperscalers.

I'm not sure people do want to use Nvidia. I think that Nvidia has got this market lock, and Nvidia is awesome. They make great GPUs, but at the same time, Apple seems to be doing well on the desktop market or the laptop market with their chips. And with MLX as a framework, custom Apple silicon seems to be working out well.

You're seeing companies like Google investing in their own ASIC-based chips with TPUs, and you see other people move into ASICs as well. I think there is a space for a low cost alternative to Nvidia chips, and I think there is a market for that, because otherwise other companies, hyperscalers, et cetera, wouldn't be investing in that. And that's why I'm saying I don't get it. Nvidia by far makes the best GPUs across the board. They're an incredible company. I just think if I was a competitor, I would try and find a niche in that space which isn't the packing boxes.

Yeah, I think really, in the training market right now, Nvidia is just the only choice you have. And I think this is primarily where AMD is trying to break in. I think in the inference market, there will be, like you said, Apple, and there's Qualcomm, there's a ton of chip vendors, and there's a plethora of startups in Silicon Valley who are trying to make super low power chips, et cetera. But in the training market, if you look where AMD is going and the wattages they are putting down, where it even goes above 1,000 watts on a GPU in the next generations, Nvidia is effectively the only game in town.

And I think they want to put something up against that. You only have that for pre-training, maybe, but not necessarily fine-tuning. Fine-tuning, I think, in many cases you can do in a box. Like, you do not need a huge system. Yes, but in the pre-training market you do. And this is where right now you buy Nvidia or you buy Nvidia, and Gaudi isn't there yet. AMD isn't there yet. And so I think this is effectively an attempt. And who knows? Let's see how it plays out. Thank God I didn't have to make the decision.

But I think this is an attempt at breaking into that large scale training market and delivering very, very large HPC systems. The company has run 100,000 GPU training clusters. Building that takes a year. It's a massive investment. It's in the billions of dollars. And so if you want to capture some of those revenues, then you need to have that. It's not, we collect three engineers and they put up a supercomputer. It's like, no, this is a construction process.

And this is where AMD, with this acquisition, finally has a chance of bringing the guys with the hard hats in as well, because you need to put power in and cooling and all this stuff. I think they don't right now because that's all vendored out. They do not have the experience. I think they are buying the competence.

That point about competence was actually something I saw come out a lot in the discussion post the acquisition, where this is a company that does have a lot of capability around doing exactly that, building out large scale clusters, some of the biggest in the world, essentially.

It's an interesting theme that I've heard at every level of the whole gen AI stack at different points over the last year or so. And it's really, like, to the point about being, like, oh, you're almost rate limited by the amount of expertise that's in the market right now. It's like, on the hardware side, I heard it; on the training side, you heard it for a while; even on the prompt engineering side.

Like, you know, people referred to them as, like, magic incantations for a little while, and there was, like, only a certain group of people who even really knew how to prompt a model correctly. And a little bit of what I've observed over the course of the last, I guess, almost two years at this point is that, as this thing has blown up, it feels like some of those skill shortages are getting less acute. More people know how to train models. More people are getting competent working with models. More people obviously are attracted to the hardware side of the equation because of some of what's happened over the last couple of years.

I'm curious, like, across the board, like, how much do you feel like our progress in AI is still rate limited by just, like, raw expertise across the world in this space? And, like, how much has that improved or not over the course of the last year or two? And so maybe, Skylar, I'll just kick it over to you.

Sure. I have this conversation pretty regularly with our director, and I would say it's not necessarily the overall amount of skills. I think that definitely is monotonically increasing, but how it's distributed across the globe, that's becoming more extreme.

And so I think that's something that we are experiencing. We're IBM Research Africa. We represent a billion people. But the talent that's here is probably going to emigrate. And what does it look like to have that talent here and bring that culture here? So, yes, it is increasing, but I think at very different rates across the globe. That would probably be my short summary of that. And it is something that we do talk about on a regular basis: what does capacity in generative AI look like on a really global scale? So that's probably another session entirely in itself.

I was not expecting that. That was a fascinating perspective on that. So, yeah. Chris, Volkmar, thoughts?

Yeah, I think there is such a gold rush, and it's a new technology, and so it's a lot about even trying it out. And every day there's something new. So you need people who are really passionate about it and who spend their waking and half their sleeping hours on it. And so the skillset, I think, will develop over time. I feel like we are repeating the gold rush of the web era, where it was like, oh, my God, you can write a web service. Isn't that amazing? And now it's like, yeah, everybody can do it.

And so I think we are just in this uptick with a very extreme supply shortage. And because it's so deep. Like, when you just plugged the computer into a network, it was relatively easy. I mean, it's like, okay, here's a computer on a network. Go. Now, it's like, the training is different. You need to even understand what the math is. And most engineers hate math. That's why they like computers. And so there's this set of skills which need to be built up, you know, until it actually rolls to the universities and we get people who are truly practitioners.

So you first need to get the education, and then you need to become a practitioner and you need to toy around with it for five years. So I think for the next ten years, plus the speed of change, we will probably be in this world of supply shortage everywhere. I think on the flip side, coming from the systems corner, it's nice to see that finally we build big computers again. So I really like this, and that we are actually going away from "the cloud providers do everything for us," and we need to actually look at system design with a fresh angle.

I think that's a goodness for the industry. It was kind of locked in, and there are five companies in the world who still know how to plug a computer into a network and a power socket. And I think it's good that we are actually going through more of a renaissance of computer architecture and design, at least.

Yeah, I'm the total opposite. I think people are learning the skills and they're doing a great job of that across the globe. But at the end of the day, if you want to train a large language model, you need an awful lot of GPU and you need access to an awful lot of data, and that is outside of the access of the average human being.

So there is a lot of really great skill, talent, and they are not going to be able to practice their craft because access to the GPUs to be able to learn what the effect of this data is just isn't there. Now you can learn from doing things like fine-tuning and training very, very, very small models, et cetera. But at the end of the day, we know that for the larger models, it emerges at the higher scale. And the scale now is tens of thousands of GPUs to be able to do that. And I think that is what's locking out the average practitioner.

So me personally, I want to see more distributed compute, I want to see more access to GPUs and skills. And therefore, I think, to kind of Skylar's point, I think that will open up a really talented set of people that are distributed across the globe to be able to make great contributions in that area. But at the moment, it's going to be concentrated in the big tech companies because they're the ones with the GPUs.

Chris, I want to fight back on your fighting back. That's why we do this, right? If I have a researcher that comes to me and says the only way they can make their case is that they need 10,000 GPUs, that's not a good argument.

That researcher needs to be able to make their case off of two GPUs. So, where, you know, where's that conversation start about making the case off of this two GPU example? Show that, then we can talk about the 100, the 2,000, the 100,000. I don't think it's fair to say I can't make progress unless I have 10,000.

I agree, Skylar. But again, we're sitting in a company who has tens of thousands of GPUs, right?

So they can make the argument with two GPUs, and then you can give them access to scale, right? But the average person, they might get so far with two GPUs, and then they're like, huh, I don't have the money now. Well, I'm gonna go and do something else.

So we're moving to a world of universal basic compute, it sounds like. I feel like that's been memeing a little bit recently, so we will call it a day there. Volkmar, Chris, Skylar, thank you all for joining. Great discussion today.

And for those of you out there who are listening to the show, you can grab Mixture of Experts on Apple Podcasts, Spotify and podcast platforms everywhere. So until next week, thank you all for joining. We'll see you next time.

Artificial Intelligence, Technology, Innovation, AI Agents, Reasoning Capabilities, Generative AI, IBM Technology