This video explores the rapidly shifting landscape of artificial intelligence, with a particular focus on open source models, the increasing infrastructure demands fueled by AI innovation, and the adoption of advanced AI tools in scientific research. The episode begins with a discussion of Kimi K2, an open source model from Moonshot AI, and its potential to challenge industry leaders like Claude and GPT-4. Experts weigh in on the debate around benchmark performance versus real-world utility, emphasizing ongoing skepticism and the need for practical validation beyond public testing. The panel highlights a broader strategic shift as open source AI matures, not only providing alternatives to proprietary models but exerting pressure on pricing and business strategies across the sector.
The conversation then moves to the increasing industrialization of AI infrastructure, spotlighting Google's landmark $25 billion investment in energy infrastructure for data centers. Panelists examine how AI is inverting traditional notions of abundance and scarcity—shifting bottlenecks from hardware to energy—and the potential socioeconomic, environmental, and regional impacts. The discussion raises critical questions about the long-term consequences of these investments, from community effects to global energy competition, and what it could mean if AI agents increasingly run on-device with smaller, more efficient models.
Main takeaways from the video:
Please remember to turn on the CC button to view the subtitles.
Key Vocabularies and Common Phrases:
1. holistic [hoʊˈlɪstɪk] - (adjective) - Considering the whole system rather than just individual parts; comprehensive. - Synonyms: (comprehensive, integrated, all-encompassing)
If anything, I kind of, you know, clap for Google to actually take more of a holistic approach in terms of being able to create, you know, compute or create data centers.
2. retrospective [ˌrɛtrəˈspɛktɪv] - (noun / adjective) - Looking back on or dealing with past events or situations. - Synonyms: (review, reflection, look-back)
We're going to talk about a little bit of a retrospective for R1.
3. phenomenal [fəˈnɑːmɪnəl] - (adjective) - Remarkable or exceptional, especially exceptionally good. - Synonyms: (extraordinary, outstanding, exceptional)
They have done a phenomenal job. I mean it's a 1 trillion parameter model.
4. proliferation [prəˌlɪfəˈreɪʃən] - (noun) - A rapid increase or spread, often of something undesirable or uncontrolled. - Synonyms: (spread, expansion, multiplication)
It was always going to be kind of dismantled as you had a proliferation of developers; it was going to kind of race to the bottom, but this really just kind of expedites it.
5. paradigm shift [ˈpærədaɪm ʃɪft] - (noun) - A fundamental change in approach or underlying assumptions. - Synonyms: (transformation, revolution, fundamental change)
Rather than a fundamental paradigm shift. So but that was, I think the efficiency was very important because you know, showing that you can get, you know, to these state of the art models with less cost, that was I think a very important shift, you know, that they have showcased here.
6. agentic [eɪˈdʒɛntɪk] - (adjective) - Related to acting as an agent, showing agency or intentionality; often used to describe AI models capable of autonomous actions. - Synonyms: (autonomous, self-directed, proactive)
So this is a model that is definitely being designed for agentic behavior. Right. They've really focused on the plan and they've really focused on the use of tools.
7. interdisciplinary [ˌɪntərˈdɪsəplɪˌnɛri] - (adjective) - Involving two or more academic, scientific, or artistic disciplines. - Synonyms: (cross-disciplinary, multidisciplinary, integrative)
So now a biologist can ask Claude to explain a complex, you know, physics concept in simple terms or material scientists can quickly understand, like a new machine learning technique. So this is kind of also going to foster a lot of interdisciplinary breakthroughs, which are really important to, I think, that push the boundaries of science.
8. compliance [kəmˈplaɪəns] - (noun) - Acting according to certain accepted standards, rules, or laws; obedience to requirements (often legal or regulatory). - Synonyms: (adherence, conformity, observance)
You know, the enterprise uptake for, you know, DeepSeek, I feel it still remains limited, and I think it's mostly due to regulatory and compliance and some of the tooling blockers.
9. uptake [ˈʌpˌteɪk] - (noun) - The adoption or acceptance of a new idea, process, or product. - Synonyms: (adoption, acceptance, assimilation)
But then if you look at the West and the US, the enterprise uptake is still limited, and it's mostly, I think, in the academic and the specialized domains that there is a lot of traction here.
10. maturation [ˌmætʃəˈreɪʃən] - (noun) - The process of becoming mature; the process of developing fully. - Synonyms: (development, evolution, growth)
So but I also feel with this launch we're kind of getting into this maturation of the open source AI movement.
11. vertical integration [ˈvɜːrtɪkəl ˌɪntɪˈɡreɪʃən] - (noun) - A company's ownership and control of multiple stages of production or supply chain, typically from raw materials to final product. - Synonyms: (consolidation, unification, amalgamation)
And at some point, it kind of feels like, okay, where this all goes is vertical integration. You can subscribe to have your energy bill sent to you from Google.
12. downward pressure [ˈdaʊnwərd ˈprɛʃər] - (noun phrase) - Factors that cause a decrease in prices, profits, or values. - Synonyms: (depressive force, suppressive factors, declining trend)
And then two, I think there's also developer centric pricing is going to continue to force a downward pressure. I think over the last six months you've seen an actual explosion in cost per input and output tokens.
Kimi K2, DeepSeek-R1 vibe check and Google’s data center investments
It's all great in theory, but then, you know, what happens when it comes to "I've got to power Google AI Overviews," while Mr. and Mrs. Jones down the road need to watch the television this evening or need to keep warm in the winter? And you're like, "I'm paying for the data center. Sorry, Grandma, we have a pre-training run." All that and more on today's Mixture of Experts.
I'm Tim Hwang and welcome to Mixture of Experts. Each week MoE brings together a crack team of the most brilliant and entertaining researchers, product leaders and more to distill down and chart a path through the ever more complex landscape of artificial intelligence. Today I'm joined by Abraham Daniels, Senior Technical Product Manager for Granite; Kaoutar El Maghraoui, Principal Research Scientist and Manager for Hybrid AI Cloud; and Chris Hay, Distinguished Engineer. We have a packed episode today. We're going to talk about a little bit of a retrospective for R1. We'll talk about a huge data center investment by Google. We'll talk about the adoption of Claude by Lawrence Livermore National Laboratory. But today I actually want to start first with Kimi K2, and I think for our around-the-horn question we'll do a really simple one, which is: Kimi K2, is it overhyped or underhyped?
Abraham, curious if you've got any thoughts on that. Honestly, I don't know. From a benchmark perspective, it looks amazing, but I think we have to wait and see, from a generalization perspective, if it's actually as good as they say. All right, Chris, what do you think? It is actually really good, but it's not better than Claude, no matter what the benchmarks say. All right, and finally, last but not least, Kaoutar, what do you think? Yeah, I think it's a little overhyped, but yes, it's a very good model. Okay, a lot to get into here. I love these opinions. They're like, eh, maybe good, maybe bad. So just to give quick background for folks who may have not been watching this: Kimi K2 is a new model that dropped from the Alibaba-backed startup Moonshot AI, and it's an open source model, notably, and it's been kind of really storming the charts. There's been a lot of chatter about it online. People are saying it's the best thing since sliced bread. And I think the most interesting thing about the launch is that the Moonshot company has basically claimed that, against benchmarks, it is surpassing the latest state of the art for Claude and GPT-4, particularly on coding benchmarks, which is a big deal, right? The idea that on this specialist task of coding, this open source model is now challenging the biggest players in the game.
Abraham, maybe I'll start with you, because I thought your response was maybe a good way into this discussion. You were saying, well, hey, it looks great, but we actually don't know yet if it's any better. What do you mean by that? Tell us more. Well, a couple of things. One, public benchmarks, as we've discussed in a number of these Mixture of Experts episodes, can be gamed, and they don't always tell the full story. So although they may have published that they're better than Claude and GPT, until we can actually get some independent or third-party validation, or see what the community actually thinks, I think maybe the claim is a little bigger than it really is. Also, it's my opinion that there's a lot of craze at the beginning, and then things kind of settle down and we figure out where it really stands. So I'm cautiously optimistic about its performance, but I'd like to just see some real world applications, whether that's integrating into certain stacks or actually demonstrating side-by-side comparisons, whether this is actually as good as they say it is. Yeah, for sure. And Chris, I think maybe I'll turn to you next. I think the caution is well warranted, and at this point I barely look at the benchmarks in the blog post when they announce models, because I'm like, ah, it's all gameable, it's all trash. But you seem to be convinced just from playing around with it. It's a good model, but it is definitely not as good as Claude and GPT-4. What leads you to say that?
Putting my hands on the keyboard and typing stuff in and seeing what comes out. Yeah, give me more than that though. Of course. But this is more than just a vibe check, right? You actually think that, against certain tasks, this is still not surpassing the state of the art here. No, I don't think so. So the first thing I would say: it is by far, in my humble opinion, the best open source model out there at the moment, or open weight model. They have done a phenomenal job. I mean, it's a 1 trillion parameter model, so this thing is big. Okay, it is a mixture of experts model with a lot of experts, but it's still a big model, and you need a lot of disk space to get that running on your machine. So it is the best open source model, but it doesn't beat Claude. And there are a lot of things that I think are really good about this model. I mean, when I was playing with it, I really liked its planning capability, I really liked its tool use. So this is a model that is definitely being designed for agentic behavior. Right. They've really focused on the plan and they've really focused on the use of tools. And I think that is going to be exciting when we run a smaller model, because, to be honest, when you want to run agents, you want your models to be small and fast and lean, and I think it's going to do a phenomenal job there as well. The other thing is, it's not a reasoning model, so it doesn't have that thinking capability yet. They've just provided a base model and an instruct model. But it is fabulous for chat. So code-wise, to sort of come back to what I said there, Tim: code-wise, I think as an open source, open weight model, it is the best coding model out there. I've used pretty much every single one of these models, whether it's the Qwen models, whether it's DeepSeek, et cetera.
It really is the best coding model out there for an open weight model. But it doesn't beat Claude. Right. It may beat it on the benchmarks, and back to Abraham's point, a lot of these things are gamed towards the benchmarks to try and get that sort of edge. But when you put it in real coding scenarios, right, I want to code up this particular program, change this, do this, whatever, it does a good job. But Claude is better, right? I mean, Claude is giving me better results than I'm seeing from my vibe checks. But fair play to them, I don't take anything away from that. It is an incredible model, and for the budget, the compute, the time that they've had, again, spectacular.
Kaoutar, I know you came in basically saying that you felt like it was a little bit of an overhyped launch. So do you kind of buy Chris's take, basically: very good, but still, as compared to the proprietaries, lagging a little bit behind? Yeah. But I think there are also other angles that this release or this launch is kind of getting us to start thinking about, which is more on this evolving war on cost, the open source versus the proprietary APIs. So if you look at companies like OpenAI or Anthropic or Google, they're charging per token for API access. But with these open source models, models like Kimi K2, Llama or Mistral, the cost here is shifting from these API fees to a fixed, or at least predictable, infrastructure cost. So you're paying more for the compute. So with models like this, it is a great model, I played with it a little bit, we're kind of getting to the good-enough tipping point. For many business tasks, summarization, classification, et cetera, these open models are doing a pretty good job, or are even superior to the closed ones. So now I think we're getting into this phase where companies can adopt a hybrid strategy: use maybe expensive proprietary models for complex frontier tasks, but then offload the bulk of their workload to really cheaper or self-hosted open source models. But I also feel with this launch we're kind of getting into this maturation of the open source AI movement. I mean, it didn't just happen with Kimi K2, but also with the other open source models. So it's no longer about providing a free alternative, but also about competing directly on performance and features with these other closed source models.
So I think this release is also kind of pushing towards putting more pressure on the pricing models of these proprietary giants like OpenAI and Google. So the future of enterprise AI is not just a single vendor solution. I think we're leaning toward more cost-optimized portfolios, hybrid models. And I think Kimi K2's success, or really great performance, signals that the primary battleground in AI here is shifting from this pure performance race to kind of a war of economic efficiency and also strategic control.
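The cost tradeoff Kaoutar describes, per-token API fees versus a fixed self-hosting bill, comes down to a simple break-even calculation. Here is a minimal sketch; all prices are hypothetical placeholders and do not reflect any actual vendor's pricing:

```python
def breakeven_tokens(api_price_per_million: float, monthly_hosting_cost: float) -> float:
    """Monthly token volume at which self-hosting matches the API bill.

    Below this volume the pay-per-token API is cheaper; above it, the
    fixed self-hosted infrastructure wins. Inputs are illustrative only.
    """
    return monthly_hosting_cost / api_price_per_million * 1_000_000

# Hypothetical numbers: $15 per million tokens vs. $20,000/month of GPU capacity.
volume = breakeven_tokens(15.0, 20_000)  # roughly 1.33 billion tokens per month
```

A team well above that break-even volume has an economic reason to offload bulk workloads to a self-hosted open model, which is exactly the hybrid strategy described above.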
Yeah, and I did want to pick up on the strategic control point. There's an interesting observation that some people are making, which is: okay, I know this group is maybe a little skeptical about K2's ultimate capabilities on coding, but assume for a moment that it is actually better than what Claude and, say, OpenAI can provide. A lot of people were pointing out that it's actually now a little bit difficult for Kimi to compete in that universe, because a lot of people are on platforms and endpoints that are using all the existing leading proprietary models. And I guess, Abraham, maybe I'll throw it to you, because I know you're working with Granite day in, day out. Do you think that there's this really interesting dynamic emerging where the pre-existing install base, effectively, if you will, for these models, particularly in coding, means that it's actually really difficult for a new model, even if it's better, to get in and compete with these proprietaries? Do you buy that at all? Not really. I think it's less about whether it's better or not. And to Kaoutar's point, it's really: what are the economics of using this model versus vendor lock-in, or locking into a particular stack or infrastructure? I think the question really is, is it good enough where the price tag aligns with our business case or our use case? And I think you're consistently seeing that open source is now a strategic weapon, as opposed to just a mandate by an organization, where you're starting to disrupt a lot of these closed source models. And when you can actually brush up against their performance, whether that's R1 on reasoning or Kimi K2 on coding, you're really signaling to the market that, one, vendor lock-in, in my opinion, was always going to be dismantled as you had a proliferation of developers; it was going to race to the bottom, but this really just expedites it.
And then two, I think developer-centric pricing is also going to continue to force downward pressure. I think over the last six months you've seen an actual explosion in cost per input and output tokens. So personally, I think this is amazing. With Granite, we are huge proponents of open source licensing and open sourcing, so I think this is the right direction for the field. And then I also think this is signaling to Llama and OpenAI that they have to start to take this very seriously in terms of how this bakes into their roadmaps too. And OpenAI is hinting at another open source model, the first since GPT-2. So personally, back to your question, I don't think this is necessarily an issue. I think this is really just an economics question more so than a technology question.
The second topic of today that I really wanted to get into is zooming out from Kimi K2. Right. Someone pointed out to me recently that we're six months on from the R1 launch, which is amazing, because R1 launched January 20, 2025, and it already feels like it was six years ago, not just six months ago. But I think it might be good for us to just talk for a few minutes, zooming back a little bit, on what has changed since R1 launched. And I think, Abraham, you're picking up on one thing that I did want to bring up, which is, in the midst of all this, OpenAI announced that it would be delaying indefinitely the launch of its open source model, which was kind of way hyped, and which originally read as a response to this new generation of Chinese open source models, but now appears to be on the back burner. Well, back burner is maybe the wrong word, but delayed for an unknown amount of time. I guess, Chris, maybe to throw it to you: do you feel like the US companies in some ways have not been able to answer this open source challenge at all?
I think in some ways Meta is still competing, but OpenAI is not really open sourcing. It feels like there hasn't been another kind of marquee model that says, oh, okay, actually a lot of these dominant US companies can keep up in this race. I think there are different economics and power shifts in play in this sense. I don't think there's any reason why OpenAI or Anthropic can't release an open weight model. Right. And they're obviously choosing to do other things there. I stick by my statement from earlier: for their size, I think the best open weight models out there are the DeepSeek models, now surpassed by the Kimi K2 model. The Mistral models are incredible. They're open weight models, especially their 24 billion parameter one and Mistral Medium. They're really great models. And I love what we are doing with Granite, with the 7B models or the 8B models, and the 1B models. I think everybody's forgetting about these really small models, right? And actually they become super important, especially for things like agents. So I think they're missing a trick. I mean, the only American company that's really producing good open weight models is Google at the moment, and IBM obviously, but I'm meaning on the kind of higher number of parameters. So I just think there is more to do in that effort, and I'd like to see that position change, because the reality is there is a risk for all these companies, which is that once you start to get competitive models, you're not going to compete with a trillion parameter model.
But if you can get a really great coding model down to the 8 billion parameter range, and I don't think that's far off when you think about some of the things Mistral is doing with 24 billion parameters, then, back to Kaoutar's point about cost economics: if I can run something on my laptop and I can get good code from it, or I can run good agents from it, that starts to affect their business model. So I'm a big fan of open weight, a big fan of open source. I'd really like to see all the closed source providers open up their models and open up their weights. I'd like to see them just get it done.
Yeah, for sure. Well, and I think that's one thing I did want to get to. And Kaoutar, I guess you've been name-checked, so I'll bring the conversation back to you. There are obviously different economics, and the leading US companies are trying a couple of different things in the space. But it is kind of interesting to me that, when I think about open source, I think, oh well, there are going to be tons and tons of different players putting out lots and lots of different models, and we're going to see this space really open up. I mean, to Chris's point, even though the Chinese market has really invested in open source, it still feels like, after six months, DeepSeek is really still kind of in the lead here, right? We actually haven't seen an explosion of new companies offering open source models in the space that are at least as competitive. I guess the question I want to get you to respond to is whether or not you think there's a special discipline to doing open source models that's maybe different from closed source. Is there a different style of what's going on here that actually is almost as difficult as doing a closed source model?
Well, yeah, that's a very good question. I think what really helped DeepSeek is the efficiency aspect of it. I think the key innovation was mostly behind their architectural efficiency, where they employed, you know, a bag of techniques: mixture of experts, reinforcement learning, optimizations all the way down to the PTX level, et cetera. So I didn't think that was a breakthrough thing, but more efficient implementations, clever ways of using existing techniques. And of course, there is an ongoing debate about the nature of DeepSeek's achievements. While some view their methods as revolutionary breakthroughs, I'm more along the side of those who think it's a clever and effective implementation of existing techniques rather than a fundamental paradigm shift. But I think the efficiency was very important, because showing that you can get to these state of the art models with less cost, that was, I think, a very important shift that they showcased here. And since then, we've seen many releases where they kept improving their models. So they have a steady flow of releases where they kept improving. That was really great to see. So going to your question, what's maybe the recipe here? I think, of course, being able to be state of the art, beating these benchmarks, but also having the capability to do these things efficiently. But if you look six months on from their launch, have they shaken the markets, especially the closed source ones? Probably not that much.
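For context on the mixture of experts technique Kaoutar mentions, a minimal sketch of top-k expert routing is shown below. This is purely illustrative, not DeepSeek's actual implementation; all names, shapes, and the toy experts are made up for the example:

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route input x to the top-k experts by gate score.

    Only k of the experts run per token, which is where the efficiency
    comes from: total parameter count grows with the number of experts,
    but compute per token does not.
    """
    scores = x @ gate_w                           # one gate score per expert
    top = np.argsort(scores)[-k:]                 # indices of the k best experts
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                      # softmax over selected experts
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Tiny demo: 4 toy experts, each a simple linear map on an 8-dim input.
rng = np.random.default_rng(0)
experts = [lambda x, W=rng.normal(size=(8, 8)): x @ W for _ in range(4)]
gate_w = rng.normal(size=(8, 4))
out = moe_forward(rng.normal(size=8), gate_w, experts)
```

This is the sense in which a 1 trillion parameter MoE model like Kimi K2 can still be "big but affordable" at inference: most experts sit idle for any given token.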
You know, the enterprise uptake for DeepSeek, I feel, still remains limited, and I think it's mostly due to regulatory and compliance and some tooling blockers. Adoption is of course mostly concentrated in Chinese-based startups and hobbyist communities. But then if you look at the West and the US, the enterprise uptake is still limited, and it's mostly, I think, in the academic and specialized domains that there is a lot of traction. A lot of researchers are leveraging R1 for problem solving, code generation, and especially, for the Chinese language, medical diagnostics, for example. But in the enterprise, I feel it's still limited. And maybe that's also part of this geopolitical AI race, which we've seen is getting intensified, because DeepSeek's open source strategy is encouraging rivals, for example Moonshot AI, as we're seeing with Kimi K2, to follow here, and especially to try to partially bypass the US chip controls. So that is really something that is so important for them. But what we also see from the Western governments: they're really trying to double down on these trustworthy AI frameworks, which are becoming very important.
Yeah, there's a lot to unpack there, and I think you're getting at something really interesting. I think the narrative when R1 launched was: oh man, all of these American companies are suddenly in trouble, because you have this incredibly powerful model and it's available for free. Right. And six months on, my reflection is, Kaoutar, the same as yours, which is that enterprise adoption has actually been less than I would have thought. And that's pretty interesting, right, that in some ways the market dynamics we originally expected with R1, and particularly around open source, don't necessarily seem to be playing out the way we thought. I guess, Abraham, do you have any responses to that? It's kind of odd to me that you have this incredibly great model that's available for free and we just haven't seen mass adoption in a six month period. If anything, the proprietary players, your OpenAIs, your Anthropics of the world, are changing their strategy, but they're not completely demolished as a result of this change.
Yeah, and I think that's exactly it. I think it was less about competition with respect to another model that's in the queue in terms of what your enterprise is going to use. I think it was more that the strategy, the status quo pre-R1, shifted so as to differentiate from R1. So where closed models were clearly ahead, open weight was able to give you parity on key reasoning tasks, so the goal shifted to smarter, cheaper inference. Agentic orchestration was already bubbling up, but everybody doubled down on developing LLMs that were key supporters of agentic workflows. Safety was doubled down on too, in terms of: our models are safe from a red teaming and AI governance perspective, both on the model and the data side. So I think it was really just a shift in strategy, from a model capability and PR perspective, if you will, in order to differentiate from R1 and to showcase that we are moving forward as US-based or Western model developer companies, and less that R1 was now considered a viable option as part of an enterprise use case. It's really interesting. Chris, maybe a final comment, again pulling out Kaoutar's theme. Kaoutar, you've pointed out, I think, this really interesting thing, which is: maybe part of R1's genius is its dedication to efficiency. They were able to assemble all these hacks together to really squeeze a lot of results without having a whole lot of resources. And I think a little bit about what it means to be efficiency-minded, and how it can be really hard to think in that style if you're used to having the most compute and the most money in the entire world.
And I guess, Chris, there's almost kind of a thesis here that I want to run by you, which is: could it be hard for American companies to pivot into this? Which is a big deal if you think that small open source models are going to be the future of agents. Is it hard for these companies to pivot into this kind of efficiency mindset? Because in some ways, technically, I think they're maybe so used to an environment where it's like: we never have to think scrappily about how to assemble all these things to squeeze the most results out of limited resources. I'm curious whether you think that's almost a barrier in some ways to these companies pivoting towards open source.
I think that when you are limited by your resources, you become super creative. And actually, if we think about the Kimi K2 scenario, they got super creative, right? One of the biggest things they did is they came up with their new optimizer, right, the Muon optimizer, which was really about them being able to train very, very large models in a consistent way and not have their training losses blow up during that process. That is a huge moment. Now, we don't know all the details behind that, but the innovation there is great. They've moved away from the optimizers others are using. Right. When I think about the DeepSeek moment and their efficiency, it was similar. But nobody really cared about DeepSeek when they first launched anyway. DeepSeek-V3 came out in December, but it wasn't until they released R1 that we got excited, and it's because they had the reasoning model and it was pretty much close to the O series of models, right? And then they were open about how they published it. They went through their RL flow on how they trained it, the GRPO stuff, et cetera. And we all learned stuff and it was all great. But they were innovative, and the great thing is they were open about it. And everybody's been running around copying their techniques and learning from them. Kimi K2 wouldn't exist if DeepSeek-V3 wasn't open about how they trained the V3 model. So I think that in itself is going to boost that creativity. But to your point, if you're just sitting there with hundreds of thousands of H100s, you've got all the compute you need, right? You're just going to get your job done, as opposed to going, oh, I can't do this because I don't have this, and I need to figure my way out of it. So I think that is helping them.
But why is DeepSeek, maybe, six months on, to your point, where it is? I'm going to call it the Patrick Mahomes effect, right? There are great quarterbacks kicking around. Tom Brady is the greatest. And then great quarterbacks come along and you go, oh, there's Jared Goff. You're like, okay. Even Justin Herbert, people will shoot me for that, but they'll go, ah, okay. Do you know what I mean? Because you're not seeing anything amazing; over time you get used to them. But then you look at Patrick Mahomes play and you're like, how did he do that? No human on earth is able to make that throw. How did he do it? He wasn't even looking. And I don't think those models are quite doing that yet, right? Because the models that have come out are equivalent, or thereabouts, to the existing models, and nobody really cares about the same, right? Think of the Super Bowl: nobody remembers who lost the Super Bowl, even though they're close enough to the team that won, right? People care about the winners, the greatest. So I think for one of these to take hold and really upset OpenAI, Anthropic, et cetera, they're going to have to do something no model has ever done before. It's like: oh, I press a button and it's created an entire billion dollar company overnight. Wow. And it's done it on a chip that runs on my laptop. Then we'll be like, I mean, that would be impressive. I would be impressed. Do you think you're going to keep typing into ChatGPT at that point? No, I'm running over to the new thing, I've got to see that. Whereas if it's just the same as it was before, you're like, well, it's just the same, I'll stick with what I've got. That's what needs to change. And also, I think the first mover advantage always has a big effect.
I think OpenAI, with ChatGPT, kind of gained a lot of mass adoption. And once you get used to that, sometimes to switch from that environment to something else, you really need to have, like Chris says, something completely different, kind of a wow effect, not just incremental. And Tim, going back to your resources-and-compute question: R1 kind of shook, you know, the GPU dominance, the Nvidia GPUs, so the stock dipped significantly, like 17%. But then the demand for Nvidia hardware rebounded, because large-scale inference still relies a lot on GPUs. So, I mean, we had the panic moment, but the efficiency gains really haven't negated the massive compute needs that are still there. Yeah, I think that's right. Well, we'll be checking in again in another six months. I think using R1 as a peg and moving out from there is really useful, just because the space moves so quickly. I'm going to move us on to our next topic. An announcement coming out of Pittsburgh, a really big event this week. The president was there, all the major companies were there. But there's one announcement in particular I want to zoom in on, which is that Google announced it'd be making a $25 billion, that's with a B, investment in energy infrastructure. So, for one part, hydropower in Pennsylvania, and then also something known as the PJM Interconnection, right, which is a network grid that stretches across New Jersey, Pennsylvania, West Virginia, Virginia, a really large area of the country. And, taking a step back, I think this is in some ways wild, both in terms of the dollar amount being committed, but also just to remind ourselves that Google is a company that started out doing search. Right.
And so it's not intuitively obvious that you would eventually say, years later, we're going to be investing billions of dollars in going all the way upstream to literally change the energy grid of a whole part of the country. And so I guess, Abraham, the question for you is just, how far do you think this all goes, right? Like, at some point does Google just say, we're going to be owning and operating a nuclear power plant? It feels like in some ways AI is generating such demand on the grid that these companies really need to assure energy access. And at some point it kind of feels like, okay, where this all goes is vertical integration. You could subscribe to have your energy bill sent to you from Google. Is that where this is all going?
I mean, that's a great question. I mean, Microsoft and Meta have both committed massive amounts of money to build their own data centers. I think Google's taken a different approach, in terms of not only building a data center but also, what I think was missing with the prior ones, investing in the actual grid itself as well as in the community around it. So to your question, you know, maybe it kind of makes sense when you talk about the actual cost of power to be able to run these data centers. If anything, I kind of, you know, clap for Google to actually take more of a holistic approach in terms of being able to create compute, or create data centers. Because I feel one thing that's typically missing is getting a better understanding of what the impact of these data centers is on the surroundings, whether it's the grid, the ecosystem, the water, because it takes a ton of water to cool these things, so the runoff. So I think from Google's perspective, they took more of a holistic approach, which I applaud. I think this can only continue to happen. And you mentioned nuclear energy. I think the next step is really better understanding where all this energy is actually going to come from, whether it's, you know, hydro, solar, nuclear. Because, depending on what you read, by 2030 data centers are going to represent 1% to 3% of all power on the grid. And right now the grid just can't support that, let alone manage it. So it's really focusing on how we support today, and then how these hyperscalers are going to invest in the grid if they're going to be the primary users of the energy coming off of it. Because there are some downstream impacts. And I mentioned environmental.
But when you have all these data centers, all these players, integrating into the grid, that drives electricity costs up for your everyday consumer. And some of the areas where these grids, these data centers, are being built are in middle America. These aren't areas where you typically have, you know, access to as much as you would in, say, a New York or Boston or San Francisco. So I think it's just important to take a little bit more of a long-tail view in terms of building out the grid and building out these data centers, and really focusing on what the impacts are above and beyond the business side of things, on the surrounding community and environment.
Yeah. Kaoutar, you think about hardware a lot, and I think that, you know, one of the things I love about AI is how it just kind of inverts our sense of what's abundant and what's scarce. Like, a few years ago you would have said, oh, there's just so much data, we're never going to run out of data. And then in AI land we routinely have conversations where we're like, how do we get the next most valuable tokens? And it feels like for a long time, at least in what we're talking about here, hardware felt like the real bottleneck, which is, can you get access to Jensen's chips? That really was the big thing. Over the longer run, though, the midterm, let's say five to ten years, do you think energy becomes the new bottleneck? At some point there will be more chips, there will be more GPUs, there'll be more suppliers of those GPUs, there'll be changes in models that maybe make the specific hardware less necessary. But it kind of feels like maybe where this is going is that, whatever hardware platform you use, the energy demand is just going to be enormous. And so should the world of AI start to think about energy becoming a bottleneck?
Yeah, I totally agree. I think it's interesting to see this shift from chip shortage to power shortage. Like you said, for the last few years the main AI bottleneck was securing enough GPUs, enough Nvidia GPUs. But now it seems like the new bottleneck is physical: securing land, permits, and most importantly, access to massive amounts of stable electricity. Because a data center, of course, is useless if you can't power and cool it. And even utility companies are reporting that requests for new data center connections are really overwhelming their capacity and their forecasting capabilities. Wait times for large-scale power connections can be years long. This is pushing us toward the sustainability challenge we're going to be facing, and I think we've already started seeing it. This massive increase in energy demand puts enormous pressure on climate goals. So how do we power this AI revolution without relying on fossil fuels? And that's what Google is doing here. I think this is forcing big tech companies to become energy players too. They are now among, I think, the largest purchasers of renewable energy, through power purchase agreements, the PPAs. And I think Google's investment here is likely tied to new solar, wind, and potentially next-generation geothermal or even nuclear projects, to meet its carbon-free energy goals. So of course, what Google is doing is a massive investment. It's just confirming that the AI race right now is officially an industrial-scale energy and infrastructure race. Like you said, the new bottleneck is going to be energy.
Chris, one of the things I'm wondering if you can opine on is the downstream effects of all this, which is: you're just building a lot more energy capacity. But the nice thing about energy is you can use it for all sorts of things. You could use it for industrial manufacturing; all sorts of things happen when energy becomes more available. And I'm curious how you think about that. Maybe I'll put it in the most dramatic way: if you're a cynic, you might say, ah, all of this AI stuff is a huge bubble, and at some point it's all going to fall apart. Even if that's the case, at that point we would have built this huge electrical grid, which is a really interesting outcome. It almost feels like AI is now making things happen that are going to have all these downstream effects that have nothing to do with AI at all. So, yeah, I don't know. Maybe to put a question on it: are there particular effects that you think are the most interesting here? I don't know, if I'm honest, and it's not often I say I don't know. But imagine if we went back 150 years and Google made steam trains. I'm like, do I need 100,000 steam trains? Do I need millions of tracks of clackety wood railways? And I'm like, I don't know. Do you know what I mean? America cannot fall behind building railroads. Yeah, yeah. And then it would be like, we need more kettles to fill up the engine with water. You know what I mean? I'm not sure. And the downstream effect is, it's all great in theory, but then what happens when it comes to, I've got to power Google AI Overviews, versus Mr. and Mrs. Jones down the road need to watch the television this evening or need to keep warm in the winter? And you're like, I'm paying for the data center. Sorry, grandma, we have a pre-training run. Exactly.
And so I don't really know how that works out, logistics-wise. And I worry about these big massive dams filled with water for the cooling, and then the poor person at the other end of that dam going, I've got no water. I think there are a lot of effects, and I'm just not sure how this works. What I would like to see is people figuring out how to get more energy-efficient, you know, how to bring down the cost of compute, how to have more efficient models. I mean, in theory it all sounds great: if you can have the infrastructure and the energy, and then regular people, as opposed to AI people, get the benefit of that, then I think it's wonderful. But I don't know if we're going to have some big wasteland at the end of this. But maybe they're going about it all wrong, right? Who says the data centers need to be on planet Earth? Why not just load it in a big rocket ship and push it towards the sun? You know what I mean? You get all the energy you want in space, and then you just send the model weights down. So maybe, maybe they're doing it all wrong. I don't know.
Yeah, for sure. I think that's getting at what I was interested in, which was basically: how much of this is really required for the future of AI, and what are all the alternative structures we could imagine building? But there's a lot to talk about there. Maybe we'll get lucky. Maybe one of these compute-constrained labs, the Alibabas, the Moonshots, et cetera, maybe because they're so GPU-constrained, they'll come up with a model that runs really small, and then we won't need it. Totally. Yeah. There's, I think, an alternative world where, say we buy, Chris, your theory that in agent world you're mostly going to need smaller models that can run locally and on devices. If that ends up being the major commercial use for this technology, what is all this huge investment in energy infrastructure for? And I think that's a very real outcome, potentially. But maybe there's just going to be more and more usage of these things, which is going to drive more demand on the electricity. It's like phones right now: they're relatively low power, but the massive usage of phones still increases the energy demand. So if we have AI in all devices, in all embedded devices everywhere, it's still going to be, I think, a big energy footprint that's needed to sustain all of these things. So I think the energy problem is still going to be there. Whether we go towards smaller models or a hybrid approach with big and smaller models, energy is still going to be an issue.
And I'm worried, like Chris said, about the imbalance this is going to create, the new wars. I mean, are we going to increase the divide between the poor and the wealthy, and access to the basic things needed to live, in favor of powering these models? That, I think, is a bit scary. Yeah, the movie becomes a documentary as opposed to a movie, and everybody's going to go and Google that now and go, what is Chris talking about?
All right, last segment, which we're going to do really quickly. As usual, way more to talk about than we have time for. A fun small announcement that Anthropic made on its blog recently. They announced that one of their customers, Lawrence Livermore National Laboratory, one of the big national labs in the US, has decided to expand their installation of Claude across the entire laboratory. So this is a license of their core product that goes to 10,000 scientists. On some level, this is just, hey, you got a new customer, you got a bigger customer. That's great. But I think what's really interesting is they went into a little bit of detail on what scientists at Lawrence Livermore are using Claude for. And I'll just read it, quote: basically, they're saying the scientists are using Claude for processing and analyzing complex data sets, generating hypotheses, and exploring new research directions with an AI assistant that understands scientific context. And the idea here is to literally use agentic, or at the very least AI, assistance to accelerate scientific discovery. And, Abraham, maybe to throw it to you: this feels like a pretty big deal, right? I know in the past we've talked a little bit about whether AI is going to accelerate science. This seems to be a big lab saying, we're going to make a bet on this technology. Do you feel we're now entering an era where AI is really going to be accelerating science?
I mean, I think it already has, to be honest. I think this is just more of a publicly facing PR piece demonstrating one of the biggest research institutions in the US, if not the world, using AI to accelerate science. What I think is really cool here is that it kind of validates the agentic framework in a high-stakes environment, if you will. And I think this is an early indication of what we can do. What's also notable is that these are really highly secure spaces, given the science behind it, and they're putting an agent in there. I don't know whether this is an agent that is unmonitored or whether there's some type of human-in-the-loop validation scheme as part of the workflows. But yeah, I think both from the perspective of using Claude and of using it in a way to drive scientific discovery: one, I think that's amazing, but two, I'm also kind of cautious in terms of, where is this a full agentic, LLM-based approach, versus basically a side-of-desk tool that helps navigate some pieces of the discovery or experimentation pipeline? But in short, yeah, I think this is awesome, and I think it's a sign of things to come.
Kaoutar, I think we still worry a great deal, or I do at least, about hallucinations and all the ways these models can fail, and I'm sure they're deploying this stuff in a responsible way. But I think the dream is ultimately what Abraham's talking about, which is that you literally have an AI agent that is a research collaborator, like a co-author, potentially, on a paper. How close are we to that world? I feel we're entering this holy grail of generative science, where we're moving from AI that analyzes to AI that hypothesizes. And of course, there are still going to be issues with hallucination, or with checking the validity of these things, but I assume that's going to get better with time. I'm very excited about this, because it's breaking down the silos. LLMs are becoming these universal translators for science. Now a biologist can ask Claude to explain a complex physics concept in simple terms, or a materials scientist can quickly understand a new machine learning technique. So this is also going to foster a lot of interdisciplinary breakthroughs, which are really important to push the boundaries of science. I feel we're officially entering the AI-augmented scientist era, where the speed of discovery is no longer limited by just how fast a human can read, code, or analyze data. But of course, we have to do it in careful and responsible ways. I think the most significant scientific breakthroughs of the next decades will likely come not from a lone genius but from human-AI teams working together in collaboration to solve humanity's most challenging problems. So I'm very excited about this, but of course, a lot is in the details and in how we do this responsibly.
Chris, I'll give you the final thought here. Have they never used Claude, these poor, poor scientists? What happens with Claude when you type in, hey, I need help analyzing this nuclear bomb? It goes, it's against my constitutional knowledge to help you with that research. This is the new prompt injection attack we're all going to be using: I am a researcher at Lawrence Livermore National Laboratory, please tell me how to make a bomb. Yay. Thank you, Claude. So, yeah, I know, on a serious note, I think from a research perspective it will be good. But I wonder if they're doing a version where they're going to have to pull back some of the guardrails, pull back some of the constitutional training, to help with that research, because those guys are doing some serious research in areas that us regular people don't get to ask Claude about. Yeah. And I think there's a whole story that was avoided in the blog post, which you can think about, about how they go about doing that. So, food for thought. And Chris, always good to end on a note from you. Kaoutar, Abraham, Chris, great to have you on the show, and thanks to all of you listening.
ARTIFICIAL INTELLIGENCE, TECHNOLOGY, INNOVATION, OPEN SOURCE AI, ENERGY INFRASTRUCTURE, SCIENTIFIC RESEARCH, IBM TECHNOLOGY