ENSPIRING.ai: The Power of Granite in Business

The podcast episode features a conversation between Jacob Goldstein and Maryam Ashuri regarding the use of AI technology in enterprises, specifically focusing on IBM's Watsonx AI and the concept of openness. Maryam, with her extensive experience, discusses the potential of AI to increase efficiency and transparency across various domains, especially in customer care, by utilizing models like Granite which allow for quick and accurate responses using internal company data.

Maryam addresses how businesses have transitioned from experimenting with generative AI to implementing it in production, facing challenges related to latency, cost, and energy consumption. The episode highlights a trend toward using smaller, customized AI models instead of larger, generalized ones. This approach is seen as cost-effective and efficient, offering excellent performance on specific use cases.

Main takeaways from the episode:

💡
Enterprises are shifting towards smaller, specialized AI models for improved efficiency.
💡
Openness in AI models is becoming increasingly important for safe and effective implementation.
💡
New AI developments focus on reasoning and acting with enhanced transparency and trust, while addressing industry-specific challenges.

Key Vocabularies and Common Phrases:

1. spearhead [ˈspɪrˌhɛd] - (verb) - To lead a project or initiative, especially with authority or enthusiasm. - Synonyms: (lead, head, direct)

...where she spearheads the product strategy and delivery...

2. latency [ˈleɪtnsi] - (noun) - The delay before a transfer of data begins following an instruction for its transfer. - Synonyms: (delay, lag, holdup)

...navigating challenges such as increased latency, cost, and energy consumption.

3. paradigm shift [ˈpærəˌdaɪm ˈʃɪft] - (noun) - A fundamental change in approach or underlying assumptions. - Synonyms: (transformational change, radical change, fundamental shift)

...has been perhaps one of the largest paradigm shifts when we think about productivity...

4. indemnification [ɪnˌdɛmnəfɪˈkeɪʃən] - (noun) - Compensation for harm or loss. - Synonyms: (compensation, reimbursement, repayment)

...we offer indemnification, and we stand behind them.

5. proprietary [prəˈpraɪəˌtɛri] - (adjective) - Relating to an owner or ownership, often suggestive of something privately owned and subject to restrictions on use. - Synonyms: (exclusive, private, patented)

...to very small, trustworthy models that they can customize on their own proprietary data...

6. inevitable [ɪnˈɛvɪtəbəl] - (adjective) - Certain to happen; unavoidable. - Synonyms: (unavoidable, inescapable, certain)

...AI is inevitable, but should not be feared.

7. transparency [trænˈspɛrənsi] - (noun) - Openness, communication, and accountability; the essential quality that allows people to trust information and processes. - Synonyms: (clarity, openness, candor)

The conversation focused on how enterprises can use technology to build and deliver greater transparency in AI

8. function calling [ˈfʌŋkʃən ˈkɔːlɪŋ] - (noun) - The computational process where a program or model requests that a task be executed by a function or subroutine. - Synonyms: (method invocation, function execution, procedure call)

...we expanded the granite capabilities to be able to do function calling.

9. governance [ˈɡʌvərnəns] - (noun) - The action or manner of governing, directing processes and accountability in organizations and systems. - Synonyms: (control, administration, management)

But also, one thing that I like to highlight in particular is the AI governance

10. generative AI [ˈdʒɛnəˌreɪtɪv eɪ aɪ] - (noun) - A type of artificial intelligence capable of generating new content, such as text, images, or music, often based on existing data. - Synonyms: (creative AI, AI art, AI generation)

...have moved from mere experimentation with generative ai to actual production...

The Power of Granite in Business

Pushkin hello, hello. Welcome to Smart Talks with IBM, a podcast from Pushkin Industries, iHeartRadio, and IBM. I'm Malcolm Gladwell. This season, we're diving back into the world of artificial intelligence, but with a focus on the powerful concept of open: its possibilities, implications, and misconceptions. We'll look at openness from a variety of angles and explore how the concept is already reshaping industries, ways of doing business, and our very notion of what's possible.

On today's episode, Jacob Goldstein sat down with Maryam Ashuri, the director of product management and head of product for IBM's Watsonx AI, where she spearheads the product strategy and delivery of IBM's Watsonx foundation models. She is a technologist with more than 15 years of experience developing data-driven technologies. The conversation focused on how enterprises can use technology to build and deliver greater transparency in AI. Maryam explained how Granite can be utilized to improve efficiency across various domains.

She discussed how these models are being used in real-world business applications, particularly in areas like customer care, where AI can help enable quick, accurate responses based on internal company data. Maryam provided a fascinating look into how enterprises have moved from mere experimentation with generative AI to actual production, navigating challenges such as increased latency, cost, and energy consumption. She highlighted how the emerging trend of smaller models, customized with proprietary data, can potentially deliver high performance at a fraction of the cost, marking a significant shift in how enterprises leverage AI.

Whether you're an AI enthusiast or a business leader looking to harness the power of artificial intelligence, this episode is packed with valuable insights and forward-thinking strategies. Let's just start with your background. How did you come to work at IBM? I joined IBM right after I graduated. I have an AI background, and throughout the years I've held many roles in design, engineering, development, and research, mostly focused on AI, application development, and design. In my current job, I'm the product owner for Watsonx AI, which is the IBM platform for enterprise AI.

What excites me about this job, I would say, is the technology advancements. Over the last 18 months, we've been witnessing how generative AI has been changing the market. The way that I see it, gen AI has been perhaps one of the largest paradigm shifts when we think about productivity, the same way that the Internet and personal computers impacted the productivity of the workforce. Now we are witnessing another wave of all the opportunities it can unlock, especially for enterprise AI, when it comes to enhancing the productivity of the workforce and freeing up time that can potentially be put into higher-value work for the enterprise.

So that's the major reason I picked this team: to have an impact on the market and the community, but also, of course, to use the skills I gained through all these years at IBM to help establish IBM as the market leader for enterprise AI. So you talked about gen AI as this sort of generational, transformational technological force. And I'm curious, just in terms of how it's going to come into the world, how do you see market adoption of gen AI evolving from here?

Well, last year was the year of excitement about generative AI. Most companies were experimenting and exploring with gen AI. Now we see that energy has shifted toward how to best monetize that technology. Almost half of the market has moved from investigation to pilots. Ten percent has moved to production. When you are exploring with this technology, you're looking for a wow factor. You're looking for an aha moment.

That's why very large general purpose models shine. But as companies move toward production and scale, they soon realize the path to success is not that straightforward. For example, the larger the model, the larger the compute resources it requires. That translates to increased latency; that's your response time. That translates to increased cost. That translates to increased carbon footprint and energy consumption. So think about that at the scale of an enterprise in production.

Some of them can be a showstopper. For this reason, what we actually see emerging in the market is, instead of focusing on very large, general purpose models, a return to very small, trustworthy models that companies can customize on their own proprietary data: the data about their customers, the data about their specific domains, to create something differentiated that is much smaller and delivers the performance they want on a target use case for a fraction of the cost. So let's talk a little bit more specifically about what you're working on. Let's talk about Granite. First of all, tell me, what is Granite?

Granite is our industry-leading family of models, the flagship IBM models. These are the models that we train from scratch. When offered through our platform, we offer indemnification, and we stand behind them. Today, it comes in four families: language, code, time series, and geospatial models. The Granite language series covers English, Spanish, German, Portuguese, and Japanese. We have a combination of commercial and open source language models in Granite. For example, we recently released the Granite 7B language model, a small, powerful English model.

On the code front, our models are state-of-the-art models, ranging from 3 billion to 34 billion parameters. These are very powerful models that match, or in some cases outperform, the popular open source models in their weight class. So I get the big picture about these models, but it would be helpful to get a sense of what, specifically, they're doing. Can you give me any specific examples of how these models are being used in businesses in the real world right now? Well, the top use cases for generative AI are really content generation, summarization, and information extraction.

Perhaps the most popular use case that we are seeing in enterprise is content-grounded question and answering. So, using these models as a base, we connect them to a body of information, let's say the enterprise's internal policies and documents, and get the model to provide answers based on that content. One example of that is for customer agents in customer care. Previously, when a customer asked a question, the agent who responded had to answer it, and if they didn't know the answer, escalate it to the product specialist, keeping people on hold on the line while they figured out the answer and came back. You can think of the time it takes to resolve an issue.

But now with LLMs, we have an opportunity to automatically retrieve the information based on the internal documents of the company, formulate an answer, show it to the human agent, and then, if they verify it against the sources it's coming from, they can relay it directly to the customer. This is a very simple example of how it's impacting customer care. So one big theme of this season is this idea of open. And one of the things that's interesting to me about the work you're doing is you are using not only Granite, this model IBM developed, but you're also using third-party models from other places. So tell me about that work and how that fits into your real-world, typically enterprise gen AI work. When it comes to model strategy, our strategy is really focused on two pillars: multi-model and multi-deployment.
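The content-grounded question-and-answer flow described here can be sketched in a few lines of Python. This is a minimal illustration with a naive keyword retriever and made-up documents, not IBM's actual retrieval pipeline; in practice the retriever would use embeddings and the prompt would go to an LLM:

```python
# Hypothetical sketch of content-grounded Q&A (retrieval-augmented generation).
# Function names and documents are illustrative, not a real IBM API.

def retrieve(question, documents, top_k=2):
    """Rank internal documents by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(question, passages):
    """Ground the model's answer in retrieved company documents."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. Cite the source lines.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )

docs = [
    "Refunds are processed within 5 business days of approval.",
    "Premium support is available 24/7 for enterprise customers.",
]
question = "How long do refunds take?"
prompt = build_prompt(question, retrieve(question, docs))
print(prompt)
```

The key idea is the grounding step: the model is asked to answer only from retrieved internal content, which is what lets the human agent verify the answer against its sources before relaying it.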

It means that we don't believe one single model rules all the use cases. And I think at this point, the market has also realized that. Enterprise customers, on average, are using five to ten different models for different use cases today. Oh, interesting. So, in our portfolio, if you look into Watsonx AI today, we are offering a large set of high-performing, state-of-the-art models: open source models, commercial models that we are bringing through our partners, and also IBM-developed models.

In addition to all of these, we also have an option to bring your own model from outside the platform. Let's say you have a custom model that you made yourself; you can bring it to the platform. We're really helping customers navigate a wide range of models and pick the right model for their target use case. Throughout that, we've been heavily working with our partners, and this is a market that is evolving rapidly. We've been at the forefront of speed to delivery. One example that I like to highlight: recently Meta released Llama 3.1 405B, such a powerful model, on the same day that it was released to the market.

We made it available on our platform to our customers that same day. And not only did we deliver it on the same day and offer competitive pricing, but also flexibility in where to deploy. So we are giving enterprises an option to deploy these models on the platform of their choice, either multi-cloud, which can be GCP, AWS, Azure, or IBM Cloud, or on premises. The same for Mistral AI. Mistral AI recently released the model Mistral Large 2; on the same day, we delivered that through the platform. That's an example of a commercial model.

Llama was open source, but Mistral Large 2 is a commercial model that we made available through the platform. Great. So I want to talk about enterprise-grade foundation models. Just to get into it briefly, what's a foundation model? People associate foundation models with large language models, but large language models are really a subset of foundation models. Large language models are focused on language, but foundation models can be code generators, or can be focused on time series, like the models we talked about.

They can be image models; they can be geospatial models. So foundation models, as the term suggests, are foundations for creating a series of subsequent models that can be customized for a downstream use case. And that's why they are called foundation models. An LLM is a good example of that, as a subset for language that you can further customize on your specific data to get the model to do other work. At the core, these foundation models are trained on an enormous amount of data, large datasets that most institutions today are sourcing from the Internet.

So you can imagine what can potentially go into those models, and then it comes to the enterprise and they start using it. For us, looking into this was triggered in particular by customers asking us to provide client protections on these models. And we started thinking: let's look into how the models are trained and whether we are comfortable offering client protections on the models that are available in the market. And guess what? For a majority of these models, there is absolutely no visibility into what data went into them, not much transparency into how the model was trained, and the responsibility lies on you as a customer when you start using those models.

So just to be clear, that is presenting, like, potential risk, real potential risk, to a company that is using these models. It is. It is a potential risk, in particular for customers in highly regulated industries. So what we did for Granite, when we started training these models from scratch, was go to the corpus of data that was available to us. For example, the very first version of Granite had 20% of its data come from finance and legal, because we have a lot of financial institutions as our clients. We worked directly with IBM Research to identify detectors for harmful information, like hate, abuse, and profanity detectors.

Okay, so we're talking about Granite, this set of models IBM has developed. Let's talk about using Granite on Watsonx compared to downloading open source models. How do those differ? By using Granite on Watsonx, you get two things. The first one is the client protection that we talked about. You get that if the model is consumed through our platform.

And the second one is really the ecosystem of platform capabilities that we are offering to help you create value on top of that data. So, for example, bringing your data to customize Granite for your own specific use case. But also, one thing that I like to highlight in particular is the AI governance. So when you get one of these pre-trained models, you put it in front of your own users. Through the input and instructions that users provide to that model, they can nudge the model to potentially create undesired behavior and change the behavior of the model.

And because of this, it's extremely important to automatically document the lineage of who touched the model, and at what point, so that if something happens, you can trace it back and see where it's coming from. And that's what Watsonx.governance is offering: automatically documenting the lineage. When you use Granite within the platform, you get all of those. You can have end-to-end governance and access to all the scalable deployment options available to you, allowing you to deploy on the platform of your choice, as we talked about, either multi-cloud or on-prem. And it also gives you access to a wide range of model customization approaches: prompt tuning, fine tuning, retrieval-augmented generation, and agents.

There is a series of them available to use and apply to your model. This distinction between large language models and foundation models is eye-opening. Maryam emphasized that foundation models can be tailored to specific tasks, but with that versatility comes a significant challenge: the lack of transparency in how these models are trained. This can pose a real risk, especially in highly regulated industries like finance. Essentially, by using Granite and Watsonx together, enterprises get powerful and customizable tools.
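The lineage documentation Maryam describes, recording who touched a model and when so that undesired behavior can be traced back, can be illustrated with a toy audit log. This is a hypothetical sketch, not Watsonx.governance's actual schema or API:

```python
# Illustrative model-lineage log: every change to a model gets an
# auditable, timestamped record of who did what.
import datetime
import json

lineage = []

def record_event(model_id, actor, action, detail):
    """Append an auditable record of who touched the model and when."""
    lineage.append({
        "model_id": model_id,
        "actor": actor,
        "action": action,
        "detail": detail,
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    })

# Hypothetical history of one model's customization.
record_event("granite-7b", "alice", "fine_tune", "customer-care dataset v3")
record_event("granite-7b", "bob", "prompt_update", "tightened refusal policy")

print(json.dumps(lineage, indent=2))
```

With a log like this, if the model starts misbehaving, you can walk the records backwards to find which change introduced the problem, which is the tracing capability the conversation highlights.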

So let's talk about the future a little bit. What do you think are some of the big developments we're likely to see in the realm of AI models? Very good question. I feel like the generative AI of the past was powered by large language models. The generative AI of the future is going to reason, plan, act, and reflect. And so, in the context of Granite in particular, what are we likely to see, both in the near term and in the medium to long term? There are multiple elements to implementing an agentic workflow like the one I just mentioned.

One element of that is the LLM itself, being able to do the planning, reasoning, and acting, and doing something that we call tool calling. So basically, a series of tools are available to the model, and you ask the model to call them. For example, we can say, hey Granite, what is the weather like where Jacob lives? It's going to connect to a web search API, look up your location, then it's going to connect to a weather API, get the weather, and come back and formulate an answer and respond.

So during this process, it first has to plan the task of how to answer that question, look into what tools are available to it, and call them. That's the model's ability to do that. What we did with Granite was expand its capabilities to be able to do function calling. So, for example, today we have an open source Granite 20B function calling model available on Hugging Face to try out; you can grab the model, and it has the capability to do tool calling. I'm anticipating that in the near future, the planning, reasoning, acting, and reflecting capabilities of large language models are going to continue to evolve. So thinking now from the point of view of buyers and users of AI, really people who are listening from that perspective: as people are evaluating AI tools and solutions, what is the most important thing they should be thinking about?
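The weather example, where the model plans which tools to call and chains their results, can be sketched as a tiny tool-calling runtime. The tool functions are stubs and the plan format is invented for illustration; a real function-calling model emits structured calls like these for a runtime to execute:

```python
# Minimal sketch of a tool-calling loop. The "APIs" are stubs;
# in practice, the plan would come from a function-calling LLM.

def web_search(query):          # stub for a web search API
    return {"city": "Chicago"}  # pretend we looked up where Jacob lives

def get_weather(city):          # stub for a weather API
    return {"city": city, "temp_f": 68, "conditions": "sunny"}

TOOLS = {"web_search": web_search, "get_weather": get_weather}

def run_tool_calls(plan):
    """Execute a model-produced plan: a list of (tool_name, kwargs) steps.
    A step may reference the previous step's result via '$prev.<key>'."""
    result = {}
    for name, kwargs in plan:
        resolved = {
            k: (result[v[6:]] if isinstance(v, str) and v.startswith("$prev.") else v)
            for k, v in kwargs.items()
        }
        result = TOOLS[name](**resolved)
    return result

# A plan the model might produce for "What is the weather where Jacob lives?"
plan = [("web_search", {"query": "where does Jacob live"}),
        ("get_weather", {"city": "$prev.city"})]
print(run_tool_calls(plan))
```

The point of the sketch is the chaining: the second tool call consumes the first call's output, which is exactly the plan-then-act sequencing described in the conversation.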

How do you think about that process? I think they should always start with the area where they think they would benefit from AI, and then within that area, look into what data they have available to potentially feed into those AI services. Do they have access to quality data? The second question they have to ask themselves is: do I have a trusted partner that can supply what I need to implement AI? That can be a collection of the foundation models you're going to need, or a collection of the platform capabilities the trusted partner can offer you to implement such a thing.

The third thing is to go and evaluate the regulations. Does regulation allow you to apply AI to that specific area you are investigating and targeting for AI? And the last part, but not least, is back to the principles of design thinking. What is the problem in that area I'm solving with AI, and is AI even appropriate? Because we want to make sure that you use AI not just because it's a cool, hot toy in the market, but because you are convinced that it can significantly enhance the user experience of your customers in that area. And once you have answers to all four of those questions, then maybe you have a good candidate to start applying AI to.

And what about from the side of project managers who are just trying to keep up with how fast things are changing, how fast innovation is happening? What advice would you give those people? My advice would be: focus on agility. This is a market that is evolving rapidly, and the winners will be those who are able to take advantage of the best the market can offer at any point in time. In order to do that, they need to be open to experimentation, continuous learning, and rapidly adopting new ideas.

And when you think about the future and gen AI, is there a particular, say, problem that you are most excited to solve? I think that would be productivity. If you look into the stats that are out there, there are surveys suggesting that 60% to 70% of our employees' time can potentially be enhanced through the productivity gains of generative AI. For example, I personally use my product for content generation a lot. The time that it frees up can potentially be put into higher-value work. And because of that, I'm super excited about all the opportunities it represents for enterprises to dedicate their employees' time to higher-value items.

Great. Okay, a couple of Granite-specific questions. So what are the key things you want the world to know about Granite? Granite is open, trusted, and targeted. There are two ways to think about openness. One, open as in open weights: it's available for the public to download. And the second one is open as in there are fewer restrictions on how customers can legally use these models.

For a range of use cases, we have released Granite open source models under the Apache license, which enables a large range of use cases. The second one was trusted. We talked about that: it's rooted in the trustworthy governance process we established around how we train these models and the responsibility that we take for them. And the third one is targeted, targeted for enterprise. We talked about exposing Granite to enterprise data, or the domain-specific Granite models, some of them, like COBOL-to-Java translation, targeting specific enterprise needs.

And that's Granite: open, trusted, and targeted. So there are a lot of models out in the world all of a sudden, right? It's a crowded market. Where does Granite fit in that universe? What is the market for Granite? We talked about the enterprise market shifting away from very large general purpose models to targeted, smaller models. And Granite is a small model that enterprises can pick up and customize on their proprietary data to create something differentiated for a target use case.

So Granite is well suited as a small, domain-specific model: business-ready, tailored for business, and trained on enterprise data to solve enterprise questions. You mentioned small as one of the things that Granite is. Why is that useful in some contexts for enterprise, for businesses? The larger the model, the larger the compute resources it requires. It translates to increased latency; that's your response time. It translates to increased cost. And it translates to increased carbon footprint and energy consumption.

So at the scale of enterprise transactions, when you move to production and want to scale, some of these challenges become many times stronger. Costs can add up, the energy consumption can be a serious thing, and the latency, depending on the application, can be a showstopper and a blocker, because for larger, more powerful models, it just takes way longer to process and calculate the output for you. We're going to finish up with a speed round, and I want you to just answer with the first thing that comes to mind. Don't overthink these. Okay, complete this sentence: in five years, AI will be invisible.

Ah, I like that. What do you mean by that? Today, AI is everywhere. If you ask my kids at home, they know AI. But if you say, where is AI? How do you use AI? They don't know the answer, because it's so blended into their life that they don't feel like it's something they are using. They are getting used to it. So when I think of the next generation and the years to come, that generation is so used to AI being part of their life that they feel like it's just there.

That's one. And the second one is the simplicity of interaction with AI, where you don't feel like you're interacting with a system. It's just there. You talk to AI, everything is automated. So I would say the simplicity, and being blended in to solve the right problems, is what I'm referring to as invisible.

Like the Internet: it's everywhere, and it's invisible. But we used to dial in. You remember the dialing sound to connect to the Internet? It's gone. The Internet is completely invisible today. Right. Like we used to talk about logging on.

Right. And you don't log on anymore because you're always logged on. Yeah. You're always connected. Yeah.

What's the number one thing people misunderstand about AI? AI is inevitable, but should not be feared. What advice would you have given yourself ten years ago to better prepare for today? I would say develop a broad range of skills. Even if you think they will not help you today, they may be valuable in the future. So on the consumer side, right now we hear a lot about chatbots and image generators.

But on the business side, what do you think is the next big business application? AI influencers generating content. How do you use AI in your day-to-day life today? One simple example is LinkedIn posts. I love to just go to my product. I'll give you an example, which is my favorite one.

Llama 3.1 405B. The post that I announced on LinkedIn, hey, IBM is releasing the model on the same day, was itself generated by Llama 3.1 405B. So we used the same model to generate the announcement post. Very elegant. Is there anything else I should ask you? Oh, we didn't talk about InstructLab.

So when you grab a model, you start from the model, but you then need to customize it on your proprietary data to create value on top of that. So InstructLab gives you a method, based on open source contributions, to collectively contribute to improving the base model. So if you're an enterprise, you can leverage your internal employees to all collectively contribute to improving the model. And I'll give you an example of why it matters. If you go to Hugging Face today and look for Llama, there are about 50,000 different Llamas that come up.

And the reason is that there is no way to contribute to the base model. If you're a developer, you have to make a copy of the model and fine-tune it for your own purpose. We figured out a method, which we call InstructLab, to collect all those contributions and feed them back into the base model to enhance it. That's InstructLab. I also wanted to highlight the value of being open, because that's another topic that has been emerging in the market over the past 18 months.

In particular, I believe the future of AI is open, and we've been seeing how the open source market has been changing, how the models are accessible to a wider audience. And good things typically happen when you make pieces of technology accessible to a broader community to stress test them. That's the direction we've been adopting with Granite, and I feel like that's really where the market is going to converge moving forward.

Yeah, there's this interesting, maybe naively unintuitive thing, but it makes sense once you think about it: open source things are safer. You might naively think, oh no, put it in a box so nobody can see it and that'll be safer. But it turns out in the world, if you let everybody poke at it, the world will find the vulnerabilities for you and you can fix them. Right? That's exactly what's going to happen. Yeah. Great. It was lovely to talk with you. Thank you so much for your time. The same here. Thanks, Jacob. And that wraps up this episode. A huge thanks to Maryam and Jacob.

Today's conversation opened my eyes to how open technology and AI are intersecting to create more transparent and efficient systems for enterprises. From the power of smaller, more targeted models like Granite, to the importance of trust and governance in AI, these developments are reshaping how businesses operate at their core. As we continue to unpack the complexities of artificial intelligence, it's clear that openness, whether in data, technology, or collaboration, is not just a concept, but a driving force that can unlock new possibilities.

Artificial Intelligence, Technology, Innovation, IBM Watsonx, Granite Models, Model Strategy, IBM Technology