ENSPIRING.ai: Llama 3.2, AI Snake Oil, and gen AI for sustainability
The video discusses the evolving landscape of open-source AI models and their potential to surpass proprietary models soon. The panel, including experts Maryam Ashoori, Shobhit Varshney, and Skyler Speakman, debates the current state and future of AI, highlighting developments like the release of Llama 3.2 by Meta. This release introduces lightweight models suited for IoT and edge cases, multimodal vision support, and enhanced safety features, aiming to democratize AI technology more broadly.
There's an exploration into the criticisms and limitations of AI, as discussed in the book "AI Snake Oil" by Arvind Narayanan and Sayash Kapoor. The conversation expands on the debate about AI's role in prediction vs. its broader potential applications, arguing against the notion that AI is solely a prediction tool. The experts present their views that AI can enhance human capabilities and problem-solving beyond mere predictive tasks, especially in fields like education and healthcare.
Main takeaways from the video:
Please remember to turn on the CC button to view the subtitles.
Key Vocabularies and Common Phrases:
1. proprietary [prəˈpraɪəˌtɛri] - (adjective) - Owned by a private individual or corporation under a trademark or patent. - Synonyms: (exclusive, patented, registered)
In 2025, a mere few months from now, will there be an open-source model that is absolutely better than any proprietary model on the market?
2. multimodal [ˌmʌltiˈmoʊdəl] - (adjective) - Involving several different modes or forms. - Synonyms: (diverse, multiple, various)
The second thing was the multimodal vision support.
3. IoT [ˌaɪ.oʊˈtiː] - (noun) - Internet of Things, the network of physical objects embedded with sensors and software to connect and exchange data. - Synonyms: (smart devices, connected devices, internet integration)
The first one is lightweight, unlocking all the IoT and edge use cases.
4. distillation [ˌdɪstɪˈleɪʃən] - (noun) - A process of refining or concentrating on essentials; in AI, it refers to transferring knowledge from a large model to a smaller one. - Synonyms: (purification, refinement, processing)
They used the very large general-purpose model, the 405B, that they had as a teacher model for distillation.
5. latency [ˈleɪtənsi] - (noun) - The delay before a transfer of data or a response begins. - Synonyms: (delay, lag, response time)
But the smaller the model, the faster the latency, the lower the energy consumption and carbon footprint, and the lower the cost.
6. generative [ˈdʒɛnərəˌtɪv] - (adjective) - Relating to the capability of producing something new. - Synonyms: (creative, inventive, productive)
But then when it comes to generative AI, the prominent use cases are productivity unlocks.
7. causal [ˈkɔːzəl] - (adjective) - Relating to or acting as a cause. - Synonyms: (causative, determinative, attributive)
That causal modeling requires years of experience.
8. sustainability [səˌsteɪnəˈbɪləti] - (noun) - The ability to maintain or support a process over the long term. - Synonyms: (endurance, viability, resilience)
And the topic specifically is the relationship between generative AI and sustainability.
9. neural [ˈnʊrəl] - (adjective) - Related to nerves or the nervous system; in AI, refers to neural networks, a series of algorithms that aim to recognize patterns. - Synonyms: (neurological, cerebral, brain-related)
The beauty of that is the way they did it: they separated the image encoder from the large language encoder and trained the adapter in a way that the model is not changed compared to 3.1.
10. algorithm [ˈælɡəˌrɪðəm] - (noun) - A set of rules or a step-by-step process for completing a task; in computing, a series of instructions for a computer to perform. - Synonyms: (procedure, formula, code)
And their argument is that the kinds of critiques they're pointing out about AI systems don't have to do with technological capabilities; they have to do more with what we can actually predict in the world.
Llama 3.2, AI Snake Oil, and gen AI for sustainability
What comes next in open source? If you just combine this recipe and map it to other models, I'm expecting a lot of very powerful models. Because AI is prediction, it's just pretty limited, right? I guess I might take a bit of issue with "AI is fundamentally about prediction." Why exactly are people so excited about the use of AI in sustainable development? So you can see how people are trying to wrangle: how do I balance the compute that's needed versus the energy consumption?
All that and more on today's episode of Mixture of Experts. I'm Tim Hwang, and I'm exhausted. It's been another crazy week of news in artificial intelligence, but we are joined today, as we are every Friday, by a world-class panel of people to help us all sort it out. Maryam Ashoori is Director of Product Management at watsonx.ai. Shobhit Varshney is Senior Partner, Consulting, on AI for US, Canada, and Latin America. And Skyler Speakman is a senior research scientist.
So the way we're going to begin is what we've been doing for the last few episodes. I think it's just a fun way to get started: ask each of you a simple round-the-horn question for all the listeners. The guests have not been prepped as to what this question will be, so you'll be hearing their unvarnished, instinctual response to a really difficult question. So here's the question. In 2025, a mere few months from now, will there be an open-source model that is absolutely better than any proprietary model on the market? Shobhit, yes or no? It'll get close. Okay. Skyler. I'm sorry, no. Yes, there will be. Great. And, Maryam, what do you think? A big yes. Okay. Whoa. All right, nice. Very exciting.
Well, that's actually the lead-in for our first segment today. One of the big announcements, of course, is the release of Llama 3.2. If you've been following the news, or been living under a rock, Llama is the sort of best-in-class open-source model family that Meta has been helping to advance in the marketplace. And their release just earlier this week featured a large range of different models, small ones and big ones. And, Maryam, I understand you were actually involved in the release. Do you want to tell us a little bit about your experiences and how that was? Yes.
It's just so exciting to be part of that market moment on the first day when the models are released to the market. It's available on the platform. It's like a holiday, like excitement. It's just amazing. Yeah, yeah, I think from the outside, one thing it'd be helpful for our listeners to learn a little more about is: what's different with the 3.2 release? You know, is it just more open-source? What should we be paying attention to?
Well, there are really three things that they released with 3.2. The first one is lightweight, unlocking all the IoT and edge use cases with the release of the Llama 3-billion and 1-billion models. The second thing was the multimodal vision support. It's image in, text out. You can think of unlocking use cases like image captioning, chart interpretation, and visual Q&A on images. And the beauty of that is the way they did it: they separated the image encoder from the large language encoder and trained the adapter in a way that the model is not changed compared to 3.1.
So the 11-billion and 90-billion vision variants can be used as drop-in replacements for the corresponding 3.1 models. But the image encoder that is added now enables the model to process image in and text out. So that's the second thing. And the third thing they released on the model side is Llama Guard for vision. The safety of these multimodal models matters, and they released Llama Guard, which is also available on our platform for customers. Yeah, that's awesome.
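To make the adapter idea concrete: the pattern Maryam describes, a frozen image encoder feeding a frozen language model through a small trained connector, can be sketched in a few lines of PyTorch. This is a generic illustration of the approach, not Meta's actual implementation; all dimensions and names are made up for the example.

```python
import torch
import torch.nn as nn

class VisionAdapter(nn.Module):
    """Projects frozen image-encoder features into the LLM's embedding space.
    Only this module is trained; the vision encoder and LLM stay frozen,
    which is why the base model's text behavior is unchanged."""
    def __init__(self, vision_dim: int, llm_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        # image_features: (batch, num_patches, vision_dim)
        return self.proj(image_features)  # (batch, num_patches, llm_dim)

# Illustrative wiring: projected image tokens would be fed to the frozen LLM
# alongside its text embeddings.
adapter = VisionAdapter(vision_dim=1024, llm_dim=4096)
image_features = torch.randn(1, 256, 1024)  # stand-in for frozen encoder output
image_tokens = adapter(image_features)      # now in the LLM's embedding space
```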
So there's a lot to go through here. Maybe to pick up on that first theme: Shobhit, I know the drum you always beat when you come on Mixture of Experts is that models are going to get smaller, and that's a good thing. Do you want to talk a little bit about how this matters for people who are implementing this kind of stuff in the enterprise? Yes.
For a lot of my clients, we are deploying these small language models on-device quite often, just because they don't have good Internet access on the factory floor, or people are running around in the field, things of that nature. So we have to do a lot of that computation on-device, especially if you're looking at our federal clients or manufacturing, and so on.
In those cases, for the last few months, I've been super impressed by the momentum we have had in this AI space going towards much smaller, more efficient models. In the 1 billion to 2.5-3 billion parameter space, we've seen an influx of a lot of models. I have been running Google's Gemma and Apple's OpenELM, and we've had Microsoft's Phi-3.5. There have been some amazing models that deliver quite a bit of value.
We now have from Meta the 1 billion parameter model. I was able to download it just before I took a flight, so I got to experiment for the next three hours with these small models. And by the way, I was watching Meta Connect through the Oculus headset; it was a completely different experience being there live. So I got a chance to go experiment with these models.
There are certain things we do for our clients where we add another layer of fine-tuning to these models, and the fact that they are small and open means I can fine-tune them. I'm able to deliver much higher accuracy with a much, much smaller footprint. I think that's where you get gold: the return on investment from these small models that you can fine-tune and then run on-device. That opens up a whole lot of use cases for our clients that you couldn't address if you had to call an API back and forth. Yeah, definitely.
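For readers who want to try the workflow Shobhit describes, one common recipe for fine-tuning a small open model is parameter-efficient fine-tuning with LoRA. A minimal sketch using the Hugging Face transformers and peft libraries follows; the checkpoint name is just an example, and any small open model you have access to would work.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Example small open checkpoint; swap in whichever 1B-3B model you have access to.
model_id = "meta-llama/Llama-3.2-1B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# LoRA trains small low-rank adapter matrices instead of all base weights,
# keeping the fine-tuning footprint small enough for modest hardware.
lora_config = LoraConfig(
    r=8,                                   # rank of the update matrices
    lora_alpha=16,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights
```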
And Skyler, I guess this kind of response puts your answer to the round-the-horn question into context. The question was whether there will be an open-source model better than the best model in the world, but that's not what you think is exciting about this release, right? I feel like you're champing at the bit to talk about how great it would have been if they had come out with a 500 billion parameter model. Yeah, for me.
But if they're emphasizing the 3 billion and 1 billion parameter space, that gets me so excited, because it's away from the bigger-is-better idea, and that bigger-is-better idea has crowded out other really cool research problems that probably should have been worked on while people were scaling larger and larger. So to see a major player like Meta come out and make some noise about a 3 billion and 1 billion parameter model, I think that's just some really outstanding work. And in the larger context, it also really shifts things so decision makers aren't gated behind the ones that have access to run a 400 billion parameter model.
So if open source keeps delivering at these smaller scales, that shift in the power dynamic is just a really good direction. So, yeah, kudos to the Llama team for coming out and showing skill in the 1 billion and 3 billion parameter space. And again, being able to download it right before you said you hopped on a plane, that type of thing is a really great direction to see these foundation models going.
So there are a couple of other things in this release as well. The 128K context window, that is pretty surprising to me for such a small model. Why is it surprising? Yeah, I think some folks might not have familiarity there; it's worth it for them to hear that subtlety. Yeah, yeah.
So the fact is you can put more context into the prompt that you're asking: it's 128,000 tokens I can pass in as context. So if I'm looking at a whole email thread chain on-device, I can pass that in and get that kind of response. And eventually we'll start to see more models that can handle images and such too, at this small size.
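As a rough illustration of what that context budget means in practice, here is a minimal Python sketch that counts the tokens in a long document before handing it to a small model; the checkpoint name and file path are placeholders, not anything from the episode.

```python
from transformers import AutoTokenizer

CONTEXT_WINDOW = 128_000  # Llama 3.2's advertised context length, in tokens

# Placeholder checkpoint; substitute whatever small model you are running.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")

# Stand-in for a long on-device email thread.
email_thread = open("thread.txt").read()

n_tokens = len(tokenizer.encode(email_thread))
print(f"{n_tokens} tokens; fits in context: {n_tokens <= CONTEXT_WINDOW}")
```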
Currently, the Pixtral model, at 12 billion parameters, or Meta's, at 11 billion, those are the ones doing images. But I'm very hopeful that soon we'll see more image capabilities come down to these 2 or 3 billion parameter models as well. So doing that on-device, when you're walking around taking a picture of equipment and asking what's wrong with this, or what's the meter reading, things of that nature, I'm super excited.
As the capabilities increase, there are a few things still lacking that I would like to see come out in the future: things like function calling, being able to create a plan, and more agentic flows between these smaller models. I'm very excited about the future iterations of these models as well.
Maryam, for comparison: we have been working on Granite models for a while, and we've always been focused on small models. Can you give your perspective on small model size? What are you seeing as the good size, like 7 billion down to 2 billion? Where do you see the right threshold of performance and size?
Well, it depends on the use case, right? If you have an IoT or edge use case, the smaller the better. But the smaller the model, the faster the latency, the lower the energy consumption and carbon footprint, and the lower the cost. So if we can get the performance that we need from a smaller model, that's well suited for that use case.
But Skyler, to your point, what excites me about this release and the lightweight models is the way they achieved it. If you look into the paper on how they did it, they took the Llama 8B and structurally pruned it, cutting the network and making it smaller. But then they used the very large general-purpose model, the 405B, that they had as a teacher model for distillation, to bridge that gap. If you just combine this recipe and map it to other models, I'm expecting a lot of very powerful models coming to the market moving forward, just with a combination of distillation and pruning.
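For the curious, the distillation half of that recipe is typically implemented as a loss that pushes the pruned student's output distribution toward the teacher's. A minimal sketch, assuming standard temperature-scaled knowledge distillation rather than Meta's exact training setup:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    """Blend of a soft-target KL term (teacher -> student) and a hard-label
    cross-entropy term. Logits are (num_tokens, vocab); labels are (num_tokens,)."""
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # The KL term is scaled by T^2 to keep gradient magnitudes comparable.
    kl = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature**2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kl + (1 - alpha) * ce
```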
Yeah, for sure. And I think one of the most interesting things is that as this gets cheaper and more available, we'll also see lots of new use cases, right?
Like, so far we've been gated by how much investment you need to put into these models and how expensive they are to run. But as it becomes more accessible, we'll also just see, well, why not just plug a model in, right? It'll end up being something you can apply to all sorts of applications that we would have thought ridiculous a few years ago, because it would have been too expensive to even consider.
Hey, Maryam, just on the latency part, I was stunned. I'm on the flight, I have a 1 billion parameter model running, and it's giving me 2,000 tokens a second in response. That's like 1,500 words generated per second. That's the experience I want when I'm looking at a model on my phone responding.
I became a believer when I saw that speed of response, that latency. Yeah, the vision of you on the plane with the goggles using a model, your seat neighbor being like, who's this guy playing with LLMs? Exactly. I'm waiting for the new airline announcement that comes out saying, please do not run LLMs on devices while the plane is in flight, you know?
So, Maryam, before we move on to our next topic: what comes next, do you think? Are we going to see more releases of this kind, or is this going to be the big release for a while? What should we expect? I'm expecting to see a lot of movement in open source and the open community.
Listen, the future of AI is open. This openness really drives innovation, and it gives you three things. One, it makes the technology accessible to a wider audience. And when you open it up to a wider audience, it gives you a chance to stress test your technology, so we can advance the safety of these models.
Together with the power of community, it gives you an acceleration of innovation and contribution back to building better models for different use cases. So a combination of accessibility, safety enhancement, and accelerated innovation is what I'm expecting to see in the open community. And because of that, we are going to see a lot more powerful, smaller models emerging in the next six months.
Two researchers, Arvind Narayanan and his collaborator Sayash Kapoor, came out with a book called AI Snake Oil. It's basically the book adaptation of a wildly successful Substack they've been running for a while, where they point out all the places where AI is being oversold, overhyped, or deployed in ways that are not necessarily the best use of the technology.
What's so fun is Arvind took to the Internet to basically say: we're so confident of our arguments that we want to put a bounty out. If you think we're wrong on anything we're arguing in this book, tell us, and we can put a bet on it over two to five years. And their argument is that the kinds of critiques they're pointing out about AI systems don't have to do with technological capabilities; they have to do more with what we can actually predict in the world.
So one of the things they say is AI really can't predict individual life outcomes, or the success of cultural products like books and movies, or things like pandemics. They're arguing that prediction can only go so far, and AI is ultimately a prediction machine, so there's only so far this technology can go.
I just wanted to start there: I'm curious whether the group buys that argument. Do we think that this prediction thing is limited in a certain way, and that actually caps what AI can be used for, or should be used for? Skyler, maybe I'll throw it to you if you've got any responses there.
I guess I might take a bit of issue with the claim that AI is fundamentally about prediction. The gains we have seen recently come from this idea of the transformer being used to do next-token prediction; in that sense, yes. But because it's able to do that next-token prediction, there are so many other use cases that are not prediction-focused.
Yes, we have to understand what this context of data is, and underlying it, that transformer model does rely on prediction. But it is so much bigger than just prediction, so I would take issue with that. Prediction is very difficult, but the other downstream tasks you can do after that prediction task are really what has moved this space forward.
So don't get too hung up on the prediction capabilities of a model. Yeah, I'm with Skyler on that. If you look into traditional ML, prediction was key, and the majority of the enterprise use cases we were using traditional ML for were really a reflection of prediction.
But when it comes to generative AI, the prominent use cases are productivity unlocks, which are a function of content generation and code generation. It can be prediction in a sense, as Skyler said, the next token, but I don't think that's prediction at the level of the use case. For that reason, I don't 100% agree that prediction is the primary use case AI is designed to deliver.
Yeah, that's actually very interesting. I hadn't really thought about it like that. This has come up in some of the episodes we've done before, but one of the debates I find most interesting is: at some point, machine learning diverged from computer science, because the way you program a computer is quite different from the way you test, evaluate, and fine-tune a model.
You're almost saying there's another distinction to be made, which is that traditional machine learning, if you will, will diverge a little from the kinds of concerns we have in generative AI, or whatever you want to call it. This current generation is so different in kind that there's almost a different set of problems.
I don't know if that's what you both are chasing after. I do think there is a divergence away from classical machine learning. You know, take all of your decision trees, your regressions, those phases; generative AI has diverged from those, and I'm trying to keep up with it.
My previous background was in the classical machine learning space, and then, man, we're in for a wild ride on generative AI. So, Tim, this being a podcast, let me just quickly recap the book. I had the pleasure of listening to the audiobook on the flight while I was hacking.
Oh, you did? Okay, you did the homework. I was in a very meta phase, because I was trying to hack something while listening to this book on AI. The two authors are brilliant; they are two of the hundred most influential people in AI, according to Time magazine. There are five points they make in the book.
The first is that AI predicts but doesn't truly understand context. The second point is that AI will reinforce our biases in areas like policing, hiring, and things of that nature. The third is that you've got to be skeptical about black-box AI solutions.
That connects to the point Maryam just made about openness being the future direction. Fourth, there should be stricter regulations and accountability, especially when an AI is producing an outcome that could have an adverse impact elsewhere. And fifth, ethics in AI has to be a focus beyond just the technical capabilities we are building.
So none of these are groundbreaking statements we haven't heard before. But the very first one, I think that's where Skyler started: AI is making predictions. And in a lot of cases, we expect an intern or a junior person to make a prediction, look at a pattern, and raise their hand once they see something that's not working.
My wife is a physician. She spent 14 years in medicine becoming a doctor. She does critical care, lungs, and sleep medicine. She has a set of medical assistants (MAs) and nurse practitioners who are helping patients as well. She expects them to raise their hand when they see a pattern break.
They have the stats from all the tests a patient comes in with, and they say, hey, something looks different here. So all she's asking is for them to recognize a pattern and call her as an expert. I think that's where we should be with AI: AI is augmenting us.
We should be very precise in saying pattern recognition is a good thing; I want AI to do patterns. But there's too much of a gap between pattern recognition and root cause analysis of what caused this. That causal modeling requires years of experience. And I think that's the relationship I would like to have with our AI.
Be able to find patterns and raise your hand; come to me for expert advice. So I think we're heading in a good direction. The name of the book is very catchy, but I think the points they're making are pretty grounded in what we see in reality today. Yeah, for sure.
And to pick up on that point, I agree, Shobhit. I think that's the dream of how this technology should be deployed. But part of their worry is that the market's not going to provide that, right? There will be a tendency to say, let's just implement the AI, and it will do everything for us.
And I guess a question to pose back to the group is: how do we do a good job fighting that? Because I want to live in the world you're describing. But a lot of people who are getting used to the technology, or are new to it, almost have a tendency to apply it to that causal stuff, which is actually where we want to preserve the human role.
And so I'm curious, in people's conversations with friends and family and others, are there things they've done to help level-set with the technology properly? An example that has come up in our conversations recently: my parents were both public school teachers, and we were talking about whether AI is going to replace teaching.
And similar to the healthcare ideas, I would really like to see AI be very measured in education, because there's got to be a human connection that comes through. So to back off a little bit in that space, similar to Shobhit's analogy with the medical situation, it's about where we really see these specific roles. And I think an AI instructor would actually be terrible.
I wouldn't want that world. But having AI assist students, and assist the interaction between a human teacher and the students, I think that would be a really cool example, where we'd want to pull back a little bit and not go full automation in education, and probably in healthcare as well.
I will push back a bit, Skyler, on the whole education piece. If you follow Sal Khan and what Khan Academy is doing with Khanmigo, I think the impact he is having surgically with AI shows he has figured out a good blend between teachers and students, where AI becomes a copilot for them.
Right. So to your point about creating the human connection, 100%. My mom was a teacher as well growing up. Unfortunately, she was also the principal of my school, so that did not go well for me. Wait, while you were at the school? While I was at the school, too.
Oh my God. But the fact is that you can understand the nuances. Today a teacher is addressing 60 kids in a room, and she has to talk at the same level to each one of them. You can't adapt the teaching to people who come from different language backgrounds, as an example, right?
Or there are certain sections in the book that some people will take longer to understand, and some will take a shorter time. AI can do a great job adapting the teaching curriculum to that student. You can take great PhDs and professors from MIT and translate that coursework into Kannada for a person in a village in India.
I think AI can play a very positive role there. And back to what Tim was saying: we need your parents, Skyler, to tell us where AI should be augmenting, taking the same lesson and creating multiple flashcards, adapting that lesson, and things of that nature. There are lots of things you can do with AI in that space of teaching, too.
So next week my parents will be on the podcast. We should definitely do a parents' episode where it's everybody's parents and none of the usual guests; that would be so much fun. From this I've learned that I need to check back in with Khan Academy. I think the last time I was there, it was YouTube videos.
So maybe that space has really expanded; I need to go check back into that. Yeah, for sure. It's cool. They're doing a lot of interesting experiments.
I want to make sure we get time for the last topic. It's a really broad one, but it connects a bunch of stories that have played out over the last few weeks, and it isn't anything we've covered in much detail on Mixture of Experts in the past.
And the topic specifically is the relationship between generative AI and sustainability. This week was the UN General Assembly, and it was very interesting to me that the US State Department said, we're going to bring a bunch of people together, all the CEOs of these companies, to talk about how AI is going to be used for the sustainable development goals.
And then similarly, IBM just released a paper fairly recently talking about some collaborations they've been doing with NASA, specifically around predicting climate and building climate models that are available. I want to turn to you because my understanding is actually you gave a talk or were on a panel recently specifically on this topic.
I'm wondering if you can give our listeners a sense of how this connection is evolving, using this technology for these really, really big problems. As someone who hasn't been as deep in the space, I'm like, how does ChatGPT help save the world? And I know that's not the case, but can you give us a little more color on how people are using this tech in this space?
Absolutely, Tim. So IBM does a lot of work in this space. We have our own commitment to being carbon neutral by 2030, and we're doing a great job against that. This week I spent a lot of time in New York with a lot of global leaders and celebrities in the space, and got very humbled by the kinds of problems everybody's dealing with.
So the entire conversation focused on how AI can help solve some sustainability goals for us, and we need that compute power to solve these gnarly problems, right? Making predictions on what happens to the climate all over the world at a very granular level, forecasting what events may happen, and things of that nature. There's a lot that happens in that space.
How do you optimize the cost envelope of running businesses, things of that nature? On the flip side, there is a climate and environmental cost that comes with running these models. To give you a few data points: if you ask ChatGPT, or a massive model like that, a question to go create something, it consumes a 500 ml bottle of water to answer that question. That's just the water consumption that goes into cooling the data centers and whatnot.
Bloomberg came out with a study: if all the data centers together were a country, it would be the 17th largest in energy consumption. Countries like Italy use less energy than the data centers do today. And then there are countries like Ireland, which has become a hub where all these international tech firms have their data centers.
The data centers in Ireland use 12% of the national energy consumption; that's more than all the households combined. So you're starting to get to these numbers where, if you look at any of these graphs of energy consumption and see where we are today, you get to a stage where companies like Microsoft are now partnering with nuclear plants, reactors that had melted down, trying to resurrect them so that they can power data centers.
That was Three Mile Island, right, which famously had some trouble a little while back. So you can see how people are trying to wrangle: how do I balance the compute that's needed versus the energy consumption? My talk was about how we have to be computationally responsible; that was the title of the talk.
And we were talking about how you figure out the right balance, from the chip level all the way up to how you end up using the models. I was suggesting something like how cars come with an MPG (miles per gallon) sticker: one number somebody can look at and say, yes, this is what I'm doing.
When you're booking a flight, I know the carbon emissions. So as part of that, we need to be very conscious: if I'm using ChatGPT as a calculator to add two numbers versus using an actual calculator, there's a huge delta in the energy consumed, and it might even get the answer wrong. Exactly right.
Yeah, I think there are some really good use cases where AI has been helping augment. We do a lot of work with forestation; we look at how land use has changed. We are predicting catastrophic events with governments all across the world; we're trying to help them with wildfires and things like that.
So I'm overall very impressed with how IBM has taken a position on sustainability, using AI for good. And we are super focused on smaller models and energy efficiency, all the way down to how we optimize our compute. This is also part of our AI Alliance with Meta and the other companies, where we are collectively trying to reduce the threshold required to implement AI across the world, especially in Africa and parts of Europe and Asia.
I like that bottle of water analogy. There was a paper that came out from Signal and Hugging Face just this last week on sustainability and the energy being used here. One of the units of analysis they used is how many cell phone charges a query would use, and the highest was image generation.
And we're approaching the point where a query to an image-generating model is getting close to a cell phone's overnight charge. I just really liked that unit of analysis, because it brings it home so much more: okay, I put in that query for an image generation, and now I have to think about that being the power of a cell phone for a day or two.
So I think it's really cool to think about more creative metrics for presenting to the world just how power-hungry or water-thirsty these models are. Otherwise, I see milliwatt hours; I'm not an electrical engineer, and I don't really appreciate it. But you tell me how many bottles of water it is, or how many cell phone charges, and it clicks.
Yeah, that's interesting. Would you want it to be metered? So as you're using Claude or something, it shows you, here's how much power you've used. Yeah, that would be really useful. Maryam?
We've done a lot of work with Granite models and with Prithvi, and we open-sourced them. Do you want to share with the audience what we're doing with our Granite models? With Granite, we are focusing on smaller models for the exact same reason you mentioned. Let me just share some data points.
If you look into hosting a 500 billion parameter large language model on A100s, you roughly need 16 A100s. If you look into a 20 billion parameter model, it's just one single A100. So the API call that you send to a 20 billion model versus a 500 billion model is 16x more energy efficient, just because it's 16 times fewer GPUs, ignoring all the cost and latency and other concerns, looking at sustainability alone.
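Those numbers are easy to sanity-check with back-of-the-envelope arithmetic: fp16 weights take two bytes per parameter, and an A100 has 80 GB of memory. A quick sketch (ignoring KV cache, activations, and other serving overhead, which is what pushes the 500B figure from roughly 13 toward 16 GPUs):

```python
BYTES_PER_PARAM = 2    # fp16 weights
A100_MEMORY_GB = 80    # per-GPU memory

def a100s_for_weights(num_params: float) -> float:
    """GPUs needed just to hold the model weights, before any overhead."""
    weight_gb = num_params * BYTES_PER_PARAM / 1e9
    return weight_gb / A100_MEMORY_GB

print(a100s_for_weights(500e9))  # 12.5 -> ~16 once overhead and sharding are included
print(a100s_for_weights(20e9))   # 0.5  -> fits comfortably on a single A100
```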
Because of this, what we see emerging in the market is companies looking into the smallest model that makes sense and customizing it on their proprietary data: the data about their users, the domain-specific data, to create something differentiated that delivers the performance they need on a target use case for a fraction of the cost.
And by cost, I mean cost in terms of energy, carbon footprint, and everything together. Those are the guiding principles for Granite: we've been focusing on smaller, enterprise-ready models that are rooted in value and trust, and that allow companies to use their own data on Granite to make a custom model.
If you look into our Granite open-source models, they are released under the Apache 2.0 license. What that gives enterprises is the freedom and flexibility to customize those models for their own commercial purposes with no restriction, which is really the power of Granite. I love that.
And Maryam, this week we also released our next-generation Prithvi models. Just to share with the audience: we at IBM have been partnering with NASA, and the problem we're trying to solve is that, traditionally, we have machine learning models that make predictions forecasting weather patterns and things of that nature.
This is the first time it has ever been done where we have created a foundation model in which a pixel, a square inch of the earth, is used as a token; we're trying to predict what will happen next, instead of using text. So we have built a foundation model that combines weather data and climate data together in one model.
That model can then be adapted for various use cases. In the current state, if you want to do rainfall forecasting in Florida, that's a completely different model; if you're trying to track deforestation somewhere else, that's a completely different model.
So this is the first time we've built one model that can be easily adapted, just like the foundation models we've built for language. And as a mic drop, it's completely open-sourced to the community. So now you can go take these Prithvi models from Hugging Face and deploy the same model for multiple things.
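To make the "pixels as tokens" idea concrete, here is a minimal ViT-style patch embedding in PyTorch, the standard way image or raster data gets turned into a token sequence. The band count, patch size, and dimensions are illustrative; this is a generic sketch, not Prithvi's actual code.

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Turns a multi-band satellite raster into a sequence of 'tokens',
    the geospatial analogue of word tokens in a language model."""
    def __init__(self, bands: int = 6, patch: int = 16, dim: int = 768):
        super().__init__()
        # A strided conv slices the raster into non-overlapping patches
        # and linearly embeds each one in a single step.
        self.proj = nn.Conv2d(bands, dim, kernel_size=patch, stride=patch)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, bands, height, width)
        tokens = self.proj(x)                     # (batch, dim, H/patch, W/patch)
        return tokens.flatten(2).transpose(1, 2)  # (batch, num_tokens, dim)

embed = PatchEmbed()
scene = torch.randn(1, 6, 224, 224)  # stand-in for a 6-band satellite tile
print(embed(scene).shape)            # torch.Size([1, 196, 768])
```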
The next iteration, where I think we will hopefully go, is doing what multimodal models did. You used to have one model that did text and one model that did images, and then, just like Meta's 3.2, they were combined so the same model can do both.
I'm hoping we'll come to that point with foundation models for weather and climate; we can then start to connect what's happening in two different places. The climate patterns are changing, the forestation is changing, and it will be able to think through and combine those two.
So we've made the first step towards a new future where foundation models will be able to combine all of this data together, and the same model can answer all of these questions. Exactly. I was super excited about these models. And also, think about it:
40 years of NASA satellite images are at our fingertips now, with these models, to use for weather forecasting, climate prediction, and seasonal prediction, and to inform decisions for planning climate mitigations. That's exciting. That's super exciting.
It's a great note to end on, because it's an open-source model, listeners, so you can go download and play with it if you want. And, Shobhit, I think it's a great application of what I was talking about earlier.
I think it's so useful to get beyond simply "how does a chatbot help sustainability?" There are all these other aspects and applications that people don't think about when this topic comes up. Well, great, everybody, that's all the time we have for today.
Thanks for joining us. If you enjoyed what you heard, you can get us on Apple Podcasts, Spotify, and podcast platforms everywhere. Shobhit, Skyler, Maryam, thanks for joining us, and we hope to have you on again sometime in the future.
Artificial Intelligence, Sustainability, Innovation, Open Source, Technology, Environmental Impact, IBM Technology