The video explores the future of prompt engineering in the context of large language models and the potential impact of automation in the field. It delves into how advancements in AI such as Anthropic's metaprompt, Cohere's prompt tuning, and Google's acquisition of Prompt Poet are leading towards reducing human involvement in crafting prompts. Panelists discuss whether prompt engineering as a profession will continue or evolve into something new, ultimately suggesting a shift towards a more supervisory role for engineers overseeing automated systems.


Key Vocabulary and Common Phrases:

1. dexterity [dɛkˈstɛrɪti] - (noun) - Skill and grace in physical movement, especially in the use of hands. - Synonyms: (agility, skillfulness, proficiency)

Building a robot with this human-level dexterity or mobility has proven very difficult.

2. metaprompt [ˈmɛtəprɒmpt] - (noun) - A higher-order prompt system that helps in generating or refining prompts automatically for language models. - Synonyms: (prompt system, automated prompts, prompt generator)

Anthropic announced a metaprompt system that helps generate prompts for you.

3. specialization [spɛʃəlaɪˈzeɪʃən] - (noun) - The process of focusing one's occupational concentration on a specific area of expertise. - Synonyms: (focus, concentration, expertise)

For the prompt engineers, they will need a broader set of skills, including model training, data set curation, the integration of the LLMs into broader AI pipelines, and also some niche specializations

4. ethical [ˈɛθɪkəl] - (adjective) - Relating to moral principles or the branch of knowledge dealing with these. - Synonyms: (moral, principled, virtuous)

And I think another thing is also the broader ethical and social implications for automating scientific research

5. democratize [dɪˈmɒkrətaɪz] - (verb) - To make something accessible to everyone; to make something universal or equal. - Synonyms: (universalize, equalize, liberalize)

So this kind of democratizes prompt creation, and this could reduce some of these technical barriers to entry, pushing prompt engineers to focus more on complex or high-impact tasks where deep expertise is still required, such as designing industry-specific models or optimizations at scale.

6. ubiquitous [juˈbɪkwɪtəs] - (adjective) - Present, appearing, or found everywhere. - Synonyms: (omnipresent, pervasive, widespread)

Existing prototypes, they are far from ubiquitous, but they are really nice demos and it shows a lot of promise.

7. bottleneck [ˈbɒtlˌnɛk] - (noun) - A point of congestion or blockage, often in a process or system. - Synonyms: (obstacle, impediment, constraint)

One way of thinking about it is that we've got this kind of bottleneck for the researchers, the brilliant minds that we have.

8. augment [ɔːɡˈmɛnt] - (verb) - To make something greater by adding to it; to increase or enhance. - Synonyms: (enhance, enlarge, amplify)

More specialized tools helping us augment what humans aren't good at

9. interdisciplinary [ˌɪntədɪˈsɪplɪnəri] - (adjective) - Relating to more than one branch of knowledge. - Synonyms: (multidisciplinary, cross-functional, integrative)

One thing I'm skeptical of is whether the AI could really fully replace human intuition in scientific discovery, especially when you're dealing with more abstract or interdisciplinary fields.

10. scarcity [ˈskɛrsəti] - (noun) - The state of being in short supply; shortage. - Synonyms: (insufficiency, paucity, deficiency)

And he publicly even said, he complained actually, about the scarcity of these AI chips.

NEO 1X Robot, OpenAI chips, The AI Scientist, and the future of prompt engineering

My opinion is that prompt engineering is never gonna die. It's a forever thing. Anyone who's worked with large language models has experienced some of the pain, dark art, black magic of: if I shout loudly enough at my model, maybe, like, literally, if I type in all caps, maybe this time it will do what I'm asking it to do. The creepy factor is big, but these robots are also pretty cool, if you can get them to work. I would love to have one actually in my home, cleaning dishes and cooking. How many scientists are going to be out of a job in the next ten to 15 years? I'm just looking forward to a world where we start using the word "we" when AI is actually starting to do something meaningful for us.

All that and more on today's episode of Mixture of Experts. I'm Tim Hwang, and I'm joined today, as I am every Friday, by a world class panel of engineers, researchers, product leaders, and more, to hash out the week's news in AI. On the panel today: Kate Soule, program director of Generative AI research; Shobhit Varshney, senior partner consulting on AI for U.S., Canada, and Latin America; and Kaoutar El Maghraoui, principal research scientist, AI engineering, AI hardware center.

So, as always, on Mixture of Experts, we're gonna start with a round the horn question. And that question is, will prompt engineers even exist in five years? Kate, yes or no? No. Shobhit, yes or no? Not at all, man. Okay. All right. And how about you, Kaoutar? I think it's gonna evolve to a different role. Okay. All right, well, let's get right into it. The prompt for this first story that we wanna cover today is that we've just had kind of a slew of sort of subplot, you know, B-plot kind of announcements coming out from all the companies. They haven't been the most kind of prominent things they've been announcing, but it has really kind of created a little bit of a pattern, I think.

Kate, you flagged this for us, which is that a lot of the companies have all been working on prompt automation, right? So Anthropic announced a metaprompt system that helps generate prompts for you. Cohere is launching a prompt tuning feature, which takes a prompt that you have and improves it automatically. And then Google recently acquired a company called Prompt Poet, which is very much in the same functionality. And so this is a big deal, right? If you're familiar with LLMs, in the past a lot of the work has gone into making a good prompt, and I think the big thing about this is the future of just taking the human out of the loop, the idea that you won't need prompting anymore.

And I guess, Kate, as someone who kind of threw this topic to us, do you want to just explain for our listeners why is that important? What changes when that happens? Yeah, and I like what you did there, Tim, the prompt for today. So, look, I think anyone who's worked with large language models has experienced some of the pain, dark art, black magic of: if I shout loudly enough at my model, maybe, like, literally, if I type in all caps, maybe this time it will do what I'm asking it to do, which can be a really frustrating process and doesn't make logical sense.

I think we're all rational beings, and ideally, there would be a really rational and structured way to try and prompt these models. So I'm really excited to see a lot of work come out, which is trying to not take a human entirely out of the loop, but take a human out of the loop of finding these phrases and tokens and words and patterns that seem to be more effective for one given model to perform a task that's in question. So being able to, for example, search a broader space of natural language and try and identify, okay, if I frame my question this way now, I can get an improved level of accuracy. I think that is going to be really powerful overall just to improve productivity and reduce some of the stress when working with models.
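As an illustration of the "search a broader space of natural language" idea described above, here is a minimal sketch of scoring several candidate phrasings of one task against a small labeled set and keeping the most accurate one. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical stand-in for a real LLM call, and the templates and examples are invented. It is not how any of the vendor tools mentioned in the episode work.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical labeled examples: (input text, expected label).
EXAMPLES: List[Tuple[str, str]] = [
    ("I loved this movie", "positive"),
    ("Terrible service, never again", "negative"),
    ("What a great day", "positive"),
    ("This was awful", "negative"),
]

# Hypothetical candidate phrasings of the same task.
TEMPLATES: List[str] = [
    "Classify: {text}",
    "PLEASE CLASSIFY THIS NOW: {text}",
    "Is the sentiment of the following text positive or negative? Text: {text}",
]


def toy_model(prompt: str) -> str:
    """Stand-in for an LLM call (assumption): it only responds sensibly
    when the prompt explicitly mentions 'sentiment'."""
    lowered = prompt.lower()
    if "sentiment" not in lowered:
        return "unknown"
    negative_words = ("terrible", "awful", "never again")
    return "negative" if any(w in lowered for w in negative_words) else "positive"


def accuracy(template: str,
             model: Callable[[str], str],
             examples: List[Tuple[str, str]]) -> float:
    """Fraction of examples the model labels correctly under this template."""
    correct = sum(model(template.format(text=text)) == label
                  for text, label in examples)
    return correct / len(examples)


def best_template(templates: List[str],
                  model: Callable[[str], str],
                  examples: List[Tuple[str, str]]) -> Tuple[str, Dict[str, float]]:
    """Score every candidate template and return the winner plus all scores."""
    scores = {t: accuracy(t, model, examples) for t in templates}
    winner = max(scores, key=lambda t: scores[t])
    return winner, scores


if __name__ == "__main__":
    winner, scores = best_template(TEMPLATES, toy_model, EXAMPLES)
    for template, score in scores.items():
        print(f"{score:.2f}  {template}")
    print("best:", winner)
```

With a real model in place of `toy_model`, the same loop becomes a crude prompt search: the human supplies the task and the evaluation set, and the system tries the phrasings.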

Yeah, for sure. And now, Kaoutar, you said actually in your response, you agreed with everybody that kind of, well, maybe prompt engineering is kind of not long for this world, but you did say that you feel like the role will shift. Do you want to tell us a little bit more about what you're thinking there? Yeah, sure. So there have been a lot of recent developments in prompt engineering that are leading to significant changes, particularly in how prompt engineers interact with large language models.

Like Kate mentioned, things like, for example, the metaprompt from Anthropic: the development here shifts the focus of the prompt engineers from crafting these individual prompts to designing systems that guide the AI to adjust its own behavior. So prompt engineers may increasingly here focus on creating frameworks for metaprompting or refining the logic that underpins it. And this creates a more robust role where engineers manage how prompts evolve in real time. And if you look, for example, at prompt tuning from Cohere, for example, the prompt tuner.

So here, the prompt tuner from Cohere enables users to fine-tune and optimize prompts specifically for different applications. And here the implication is that prompt engineers may transition from manually crafting prompts to overseeing or curating automated tuning systems. So this kind of democratizes prompt creation, and this could reduce some of these technical barriers to entry, pushing prompt engineers to focus more on complex or high-impact tasks where deep expertise is still required, such as designing industry-specific models or optimizations at scale.

And also, if you look at the Prompt Poet acquisition by Google. So here, this acquisition emphasizes automation in the generation and the optimization of prompts. And the implication here: this kind of further blurs the line between AI systems and prompt engineers. So as AI systems like Prompt Poet evolve, the role of the engineer here may shift towards more of a supervising role, where you're supervising these AI systems that continuously optimize themselves.

So human prompt engineers might focus more on edge cases or creative tasks or model-specific customizations. So I think the implication overall here is kind of a shift from a manual to kind of a supervisory role. I don't like to say that, you know, we're going to completely remove the human from the loop here, but it's more an increased focus on optimizations, an expansion of the skill sets here. For the prompt engineers, they will need a broader set of skills, including model training, data set curation, the integration of the LLMs into broader AI pipelines, and also some niche specializations.

I think, to sum up, prompt engineering is likely evolving from a hands-on manual role into a more supervisory role where engineers focus on higher level design, optimization and supervision of these automated systems. Yeah, that makes a lot of sense, and it's sort of interesting that kind of like the process that's happening in the movement to AI agents will also sort of happen in the prompt space, which is rather than kind of like doing everything, you're just sort of like monitoring the system as it goes and keeping it together. Yes, I think the prompts will get more and more personalized to that particular person, and over time, there'll be a lot more context that it will automatically pull in.

So the center of gravity is going to keep moving towards more hyper-personalization to Shobhit as an individual. So when I say something to a model, the way it expands it out and makes a metaprompt out of it, that'll be super hyper-personalized to the context, the memory of everything that I've done in the past. Right. Like, I feel like being a good prompter to these LLMs at work has made me a much better parent. Talking to my eight year old daughter, I just explain it clearly: think through it step by step, you know? Yes, I have to talk to my daughter saying that, Anya, you just turned nine, you are a big girl now.

And then I walk into a chain of thought reasoning and I get the answer. I'm expecting her to say that, no, I should not have ice cream before I sleep. Got it. Right. Exactly. That's the desired outcome. Absolutely. There's a lot, and that's a revert to its feedback training. Right. And now we're at a point where, say, it's 8:00 p.m. at night, and if I say, Anya, her response is going to be, papa, I'm almost done eating. Cause she understands that there's a pattern that when she's eating and she's taking more time, I'm gonna probably be checking in and seeing, are you eating properly or not? Right. So she has a lot more context on how to respond to Shobhit himself.

Right. Versus if my wife is calling her name, her response is gonna be slightly different. So I think the hyper-personalization of these metaprompts, that's the direction that we will be looking at going forward. Yeah, for sure. And I guess, Kate, maybe to turn it to you before we move to the next topic. I think this exact point was one thing that I did want to bring up, which is when we think about prompting with humans, we encode in language. Right. What's sort of interesting is that the prompting that we've done is both to kind of help us understand how we're interfacing with the system and then also direct the system. I think, I don't know if you buy this, which is, like, many of the optimizations may use tokens that don't even look like normal grammar.

It could just be a random string of numbers and letters that actually gets the best results out of the system. And so I guess I'm kind of curious: do you feel like prompts, over time, will become more and more obscure to us? Because it turns out the optimal encoding for the language model may actually not be something that's particularly human readable or easily understandable at all. And so there's almost this very interesting trade-off of optimization and readability, and I just want to get your thoughts on that.

Yeah, I think to answer that question, it's important to recognize that there's really kind of two different sides of innovation that are happening around this area. So one is improving our ability to prompt the models, but the other is improving the model's ability to take structured and more reasonable prompts. So instead of talking to Shobhit's eight year old daughter, can I talk to a software developer that understands structured inputs and can provide very structured responses? So if we only innovated on the prompt optimization side, where we're trying to create new tokens and keep the model frozen, then yes, I think we could get to a point where we're starting to see non-human-readable prompts.

But I think we're also seeing, with OpenAI's structured outputs, more and more structure being baked into these models to make it more standardized and systematic in how we work with these models. And ultimately, I think that's where the real value would get unlocked and where a lot of really exciting workflows could develop, especially in agentic patterns. If we can really start to focus more on having very structured, formulaic, maybe not perfectly human readable, in that it's not like storytelling when I read what the model is doing, but a very formulaic way to work with these models, I think that is ultimately where we're going to end up.
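To make the "structured, formulaic" interaction concrete, here is a minimal sketch of the consumer side: treating a model reply as JSON and checking it against a small schema before acting on it. The schema, field names, and `parse_structured_reply` helper are all hypothetical and use only the Python standard library; this is not OpenAI's structured outputs API, just an illustration of the general pattern.

```python
import json
from typing import Any, Dict

# Hypothetical schema (assumption): field name -> required Python type.
SCHEMA: Dict[str, type] = {"tool": str, "arguments": dict}


def parse_structured_reply(raw: str,
                           schema: Dict[str, type] = SCHEMA) -> Dict[str, Any]:
    """Parse a model reply as JSON and check it against a minimal schema.

    Raises ValueError if the reply is not JSON, is not an object,
    or has missing, extra, or wrongly typed fields.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    missing = set(schema) - set(data)
    extra = set(data) - set(schema)
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for field, expected in schema.items():
        if not isinstance(data[field], expected):
            raise ValueError(f"field {field!r} should be {expected.__name__}")
    return data


if __name__ == "__main__":
    # A well-formed reply that an agent loop could dispatch on.
    reply = '{"tool": "search", "arguments": {"query": "humanoid robots"}}'
    print(parse_structured_reply(reply))

    # A free-text reply is rejected instead of being guessed at.
    try:
        parse_structured_reply("Sure! I will search for that.")
    except ValueError as err:
        print("rejected:", err)
```

The design choice is to fail loudly on anything off-schema, which is what makes downstream agentic steps predictable.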

Yeah, it'll be so funny, because what you're describing is we're reconverging towards code, right? Like structured language as a way of getting systems to do what we want them to do. Yeah, we started structured, created a bunch of unstructured, and now we're like, wait, there were actually some good things there that we should maybe bring back.

So I'm going to move us on to our next topic. We spend a lot of time on Mixture of Experts talking about software, we talk a lot about enterprise, but I think one of the most kind of viral, if you will, AI moments of the last few weeks was the launch of a humanoid robot called NEO from a company called 1X Technologies. And specifically, the idea is to work on humanoid robots that are designed to be at-home assistants.

So this demo, basically, if you've seen it, and if you haven't, it's worth looking up on YouTube or whatever, is a humanoid robot helping out around the home, cleaning dishes, helping to clean up, and otherwise kind of assisting on tasks. And again, I kind of wanted to ask the question, and I think it's always an important question to ask in the world of AI, which is how much of this is going to be a reality? How much of this is a really cool demo? Maybe most importantly, would you buy one for your own home? But we can address that at a certain point. Kaoutar, I'm kind of curious about your thoughts.

If you saw the demo, what you thought about it, and if you think something like this is really going to be a reality. And I think in part, I think the question is whether or not this is a real affordable thing from a hardware standpoint. There's a bunch of really practical bits and atoms questions here that I would love to get your take on. Sure. I would love to have one actually in my home, cleaning dishes and cooking. Yeah. Someone who spends like an hour after folding clothes. Task I hate the most.

Of course, the demo was very impressive from 1X, and I think 1X is among the most prominent companies in the emerging field of humanoid robots. But will humanoid robots become a reality or still a pipe dream? So I think humanoid robots have been the focus of science fiction for a long time, and transitioning from dream to reality comes with significant challenges. So the argument for humanoid robots is that they can fit into environments designed for humans, use existing tools, and interact more naturally with people.

However, I think there are still several challenges that need to be fixed. You know, first, I think there is the mobility aspect. Building a robot with this human-level dexterity or mobility has proven very difficult. While there is some progress, I think there is still a lot, you know, that needs to be done. Technologies like soft robotics and advanced actuators are making strides here, but we are far from a robot that can perform all human tasks autonomously.

The other, you know, challenge is the energy efficiency. I think these robots require significant power to function, which limits, you know, their practical use. NEO, for example, and other similar projects are working to make these robots more energy efficient. But the issues around battery life, energy consumption, these are still bottlenecks. The other thing is the cognitive and social interactions here, beyond just the physical tasks.

You know, these robots must navigate the complexities of all the human life interactions, perceptions, and developing an AI robot that is capable of interpreting these social cues, responding appropriately, making decisions in real time, is still an ongoing research area, and there is still a lot of work around AI and reasoning. So I think it's going to take time for us to get there. And another challenge, I think, is the economics of this: building something that is affordable, versatile, reliable is still a major hurdle.

And for many industrial and service applications, simpler robots or specialized machines are more efficient and cost effective than having this general purpose humanoid robot. So the complexity and the cost of these humanoid robots, I think, especially in their design, still limits the adoption to especially niche markets. So there are challenges. What's the reality versus the long-term vision? At present, it is a transitional phase.

Existing prototypes, they are far from ubiquitous, but they are really nice demos and it shows a lot of promise. But I think we are still not there in terms of the mass market tools and adoption. But it's not just a technological pipe dream. So it's gonna happen. That's my thinking. But, you know, for the full realization, it's gonna take years, if not maybe decades, before they really become a reality.

Yeah. That functionality gap is very interesting to think about. Like, I love the idea that for a period of time people are purchasing these, but it turns out there's, like, not a whole lot you can do around the home with them so that they end up just like, being, like all the lonely Pelotons you see in people's houses, where it's like this really expensive piece of hardware that just kind of sits around. But it's just funny because it's like a humanoid guy, basically. I guess, I don't know, Kate, Shobhit, if you've got a kind of view on this, if you're a little bit more skeptical, or if you kind of agree that, yeah, maybe, I don't know, Kaoutar, you didn't put a date on it, but in our lifetime, we'll see this become a practical reality.

Yeah. So I'm a big geek, and I will go and buy stuff that I think is awesome. Right? You're gonna have the Peloton robot in your house. So I feel that it's the same argument about one massive model that's just absolutely stunning and can do everything, like the GPT-4 model or Claude models. Right. Versus the argument that a smaller set of models have a niche for specific use cases, a lot more efficient and targeted for a particular use case. Right. I'm in the camp of: I would rather have a device that is helping me for a particular task, and it's doing an incredibly good job at that task.

As an example, I use the Roborock S8 MaxV Ultra, whatever, the highest end of their robot that does vacuuming and mopping and goes back and cleans itself up. It dries itself out, comes back again, and finishes off that last little bit of scrubbing that it missed somewhere. More specialized tools helping us augment what humans aren't good at. I think that's the future direction. In the short run, it'll take a while for us to get to something that solves for all the constraints that we just discussed before you get to a point where a humanoid replica of you can actually start doing things.

So I think in the short run, the next five years, specialized tools that do a particular task incredibly well are cost optimized. It's repetitive. They nail that particular use case. I'm more in that camp. Kate, do you think the same? I completely agree. If you think about how model specialization has progressed, we see the same exact trends as you pulled out. So I'm 100% in the same camp.

It also reminds me of the common story that you hear where if you asked someone back in the horse and buggy days what they wanted, they always said they wanted a faster horse, and then Ford came along and released the first cars. And I think we're in a bit of that scenario right now where it's like, I just want more human time to do the things that I don't want to do as a human. So create some humanoid robot. But really, can we rethink what the right way is to make humans more superpowered, not just create more humans that we don't have to worry about feeding or other potential labor issues?

Okay, that sounds more like, say, how we solve the dishwasher paradigm, right? Yeah. It figured out that there's an optimal way of washing dishes, and it does an incredibly good job at a very low price point, and it nails it. Right. So we have changed the way human workflow used to work. Right. Earlier, as a human, I would take a dish, rinse it, keep it somewhere else. We did not try to optimize that particular workflow. We said, there's a better way of solving this particular niche use case. It's very custom optimized, and we'll nail it.

So I'm in that camp with you that I think we'll get to a point where smaller machines do a particular task really well. Like, for example, in our pool, we have a skimmer that just skims and removes all the dirt from the top. Now, a human will take a net and try to clean up each one of them one by one. That's not the optimal way of solving for that problem.

So I'm with you that the workflow, the human workflow, has got to change, and then we optimize by the time we get to a point where you get a humanoid that can then solve for all the problems that we discussed around cost and flexibility, dexterity, and things of that nature. Yeah. And I think, for what it's worth, I think also, just, like, you can't discount, like, the creep factor, right? Like, I do feel like it's, like, a little bit spooky to have, like a, you know, a large human in my house. And I do think that will be part of the adoption, almost like, it leans in favor of these more specialized applications because they kind of don't raise that fear. I don't know.

We'll have to see in practice whether or not X1 is able to pull this off, or 1X is able to pull this off. Yeah, I think it's an interesting development, and it all comes down to what people are also able to consume and the capabilities. Of course, specialization versus generalization is always going to be a concern, but of course, if we can combine both, that would be great. So it's like what these LLMs are doing, but we still need special models. But, you know, the evolution of LLMs is still important. Having these large models that can do a variety of things, but then specializing them for certain tasks.

Can we have the same argument for these humanoid robots that, you know, can do a variety of tasks, but maybe you can press a button and tell it, now I want you just to be focused on cleaning the dishwasher or the pool or something, so maybe take a subset of that model that is specialized within that humanoid. I think that would be cool to have. Yeah, I mean, ultimately, you're gonna have, like, you know, the humanoid robot's gonna be the one that does the maintenance for all the other smaller robots. This is gonna be robots all the way down. It's like a hierarchy over here.

Yeah. I think, Kaoutar, the way you framed it, I think you're looking at a Transformer robot that's gonna do that one job really, really well. That would be the job to live in. That would be cool. Yeah. So I'm going to move us on to our next topic.

So there's a fascinating paper that was shared by a friend of the pod, Kush Varshney, who, if you're a listener, has been a recurring guest on this show. And what I love about some of these papers in machine learning is that they, like, pick the most dramatic name for their paper. And so the name of the paper is The AI Scientist, and it has a long title about kind of towards, you know, effectively, like, using AI to automate end-to-end science. And it's a proposed system that tries to really see and kind of push the limits of whether or not large language models can really help out with scientific discovery in a fully kind of automated way.

And this is a big deal. I mean, you think about how societal progress happens, right? Like these technological breakthroughs are really critical. And so one way of thinking about it is that we've got this kind of bottleneck for the researchers, the brilliant minds that we have. And so the hope is basically, can we augment that process? Can we accelerate that process? With AI, that has been kind of a real focus. You know, what I always worry about with these papers is that the results look almost too good and the ambition is too great.

But I mean, Kaoutar, I know you looked at this paper in some detail. I'm curious if you're coming away with this feeling like, yeah, they've really kind of hit upon something here that really could be the kernel of something new. Or if you feel like ultimately the way AI fits in science is going to look a little bit different from the way they're proposing here. Yeah, I enjoyed reading the paper. I think it really put forward a very nice way of, kind of thinking of this automated AI scientist, which made me also worry, you know, what's going to happen to the scientists in the future.

So it presents, you know, this very nice framework where large language models generate research ideas, write code, run experiments, visualize results, and even write papers. And they also showed some very interesting papers that were, you know, generated by this AI scientist. One thing. Yeah, you just needed to do the paper session at the conference, the poster session at the conference. It makes you even worry, you know, what's going to happen to the conferences in the future. And some of the papers, are they really generated by real scientists or is this all, you know, LLM generated?

So these advancements could significantly impact scientific discovery, reducing the cost and also increasing the speed of research. So there could be some benefits to this, especially if you look at it as an augmentation for human research. The thing is, the controversy surrounding this paper is largely coming from the methodological concerns, especially when you look at the reliance on automated review systems to evaluate the scientific quality. And that kind of raised some concerns for me: the question here is whether such reviews can truly assess the novelty, creativity, and rigor of the work.

And also I think one thing I'm skeptical of is whether the AI could really fully replace human intuition in scientific discovery, especially when you're dealing with more abstract or interdisciplinary fields. So I think AI is still not there yet when you're really looking across multiple fields and kind of mimicking that human intuition. And I think another thing is also the broader ethical and social implications for automating scientific research. So there are a lot of concerns here.

But I think from a scientific perspective, it's a very nice piece of work, but it has a lot of implications, of course, ethical, and also the automated review process that they have. That's right. Yeah. I'm curious, Kate, as a researcher yourself, how do you feel about all this? I feel like it's very interesting, for example, seeing engineers be like, well, they're never going to learn to code as good as I am, so I know there's a tendency to push back on it. But I'm curious about how you think about these types of experiments. Are they like fun toys? Like, would you use these? Would you read the papers produced by these AIs? Yeah. Well, I'm honored you call me a researcher, but I certainly work with a lot of amazing researchers here at IBM Research, even if I'm not one directly.

But I actually question whether, as a non-researcher, this might be a naive opinion, whether there isn't something that LLMs can do well in terms of understanding what's been done in the past with related literature on a much broader scale than what's humanly possible to go through and analyze and read and try and find similar methods or approaches to apply to a new problem that's related. I don't know if, Kaoutar, you have any thoughts on that, if that's maybe a jump too far? No, I think I agree. You have a point there.

So there might be stuff that they're discovering that scientists are not able to discover because they're pulling from a wide variety of sources. But I think we still need a human in the loop here to validate, verify these experiments and then take them to the real world and try them and see the results. So we cannot just take the results out from these LLMs and then just apply them directly. So I think there still needs to be some verification, but probably these systems will get better and better as we use them more for scientific discovery.

Yeah, I think one of the interesting things here is that some of the people I know who research this space think a little bit about the burden of knowledge, which is like, there's just like more and more knowledge and more and more papers. And part of the hope with some of these systems is simply that there's a lot of findings that could exist purely in finding connections between papers that just people are not making the connection between. And so that ends up kind of reducing it more to a search problem. Right. I think what's kind of interesting here is the idea that then you want them to run the experiment, then you want the AI to do the empirical stuff.

I think there's a question about how far beyond just the question of search you need to go. Yes, I think just like any workflow from an enterprise perspective, we help a lot of clients with their R&D research and things of that nature, right? Coming up with a new formulation for a new food item or perfume or like, product research for the next car, and so forth, battery research, whatnot. So across all of them, just like any other workflow in an organization, you figure out that here are all the steps that are needed.

When you are hiring somebody brilliant from MIT to come join your team as an intern, you're giving them a specific task to augment what a senior researcher in the field for a decade has been doing. You will plan it out, saying, hey, here's a task, and I'm going to have you go research this particular topic. I think we'll start to incrementally see more and more AI helping out on specific tasks in the research spectrum, end to end. Just like any other workflow, I don't think it will be completely taken over by AI.

It's augmenting intelligence rather than replacing it. So I think that's the good tandem between humans and AI, and we'll also start getting better at what to request help with. So, for example, you just mentioned a knowledge graph across a whole bunch of different research papers to figure out if somebody overseas in a different country had some novel idea that you just didn't think about, right? So I think we'll get to a point with this research.

What I'm really interested in is a conference that we get to where each one of us would have our representatives as AI going to each other, right? Just imagine if you have a collaboration between a team of researchers with their AI counterparts in Israel talking to their counterparts in the US, and they're exchanging ideas, and you come up with a new theorem and say, hey, I think we came up with this new idea that we should do x. I'm just looking forward to a world where we start using the word "we" when AI is actually starting to do something for us.

And like, one of the big dramas in academia, of course, is like, who's the first author? Like, I wonder if in the future it'll be like, you'll get into this big struggle with some LLM collaborator that you have, who is trying to take all the credit from you. Now, you know, we'll have that drama play out, but it'll just be funny because it'll be humans and AIs.

So I think it'll be competition between models over who's writing the best paper, and an AI conference completely generated by AI and reviewed by AI.

That's right. Yeah, exactly. Angry that you're unjustly turned down for your paper. Reviewer number two.

You know, I would say that there are certain things that we don't think about quite yet in the whole research spectrum when we are so focused on doing our actual novel research, when it comes to, say, peer reviewing. I'll give you an example of what we're doing with some of our utility companies. Utilities, when they want to increase the price of electricity in a particular state, have to go file a case, and they have to make a case and say, here's why I think I should increase it by x cents, right? Five cents. We are helping these utilities create that whole submission package. So we're looking at everything that they've submitted, and all the competition; it's all openly available online. So you research and help create the first package itself.

Then once you know who's going to be on the panel, who's going to be assessing it, we can then go look at every question that they've ever asked. So in this case, in a peer review, we know that when Shobhit gets to be the reviewer, I typically ask more about ethical concerns about a particular paper, and so on and so forth, right? Each one of us has a pattern in how we ask questions, right? So now we reverse engineer what the judges would ask on the panel, and then we change the documentation so that the submission itself is going to address those proactively.

Then when you actually go and have to present your case in person, that's an interview that's happening. So then we are preparing the witness based on the kind of questions that the person has asked everywhere else, and what's the right chain of thought to follow on that. So I think there are aspects of research that researchers don't want to do that I think AI will be really helpful in augmenting.

Do you think that'll be helpful, Kaoutar?

I think so, definitely.

Yeah. Of course, as humans, we're limited, and if we're augmented by AI, we're going to be superhumans. And hopefully in the right direction.

Well, and I think it gets back to what we were just talking about, right? Like, are we going to have AI, like, literally try and become its own researcher and just replicate what a human can do? Or are we going to have AI specialize in parts of the process and run that process faster and better and support humans in new, more efficient workflows? It's just, you know, now without the robots, focused on scientific methods.

The news story of the week was that it was finally kind of rumored.

A news story kind of came out that OpenAI is going to be investing in trying to produce its own in-house chips to support its work. And part of this is its integration and collaboration with Apple. But more generally, this has been something that's been rumored about for some time, and it now looks like it's more in the realm of certainty that they really are kind of investing in this in a really, really big way. Kaoutar, you're the most natural person to ask about this, but why would OpenAI want to do this?

Like, semiconductors are like wildly expensive, very hard to pull off. My understanding is basically that, like, China, the whole country, has been trying to reproduce the Taiwanese semiconductor industry and is only moderately successful at it. Why is OpenAI kind of making such a big bet on hardware?

I think the CEO of OpenAI, Sam Altman, has made the acquisition of more AI chips a top priority of his company. And he even publicly complained, actually, about the scarcity of these AI chips.

So given, I think, all the rising costs, chip costs, the supply chain challenges, and the need for specialized hardware, especially specialized hardware that's optimized for OpenAI models, it seems to me that this is a strategic move. So designing their own chips could enable OpenAI to tailor hardware for their specific workloads, improving performance, efficiency and scaling potential. However, of course, there are challenges here, including financial challenges, given the complexity of semiconductor design and manufacturing especially.

So by creating these in-house chips, OpenAI can reduce its reliance on third-party manufacturers like Nvidia, which control a significant portion of the AI hardware market, almost 80%. So it's going to give them more control over the supply chains and allow them to specialize and optimize for their unique workloads, potentially improving their efficiency, performance and scalability. While semiconductor development is a challenging and costly endeavor, I think this move could enable OpenAI to differentiate its hardware and scale its operations effectively. I think they've thought a lot about this, and I think it's a strategic move for them, but also a way to diversify.

Totally. I mean, what's wild is what you're saying is basically like, what's cheaper than trying to get H100s? It's like literally building your own semiconductor supply chain, which is a really crazy thing to say. I guess, Kate, if you've got kind of thoughts on this, I mean, one big question is, like, do we think it's going to be successful? Like, I can almost see the argument for it, but man, if it isn't a high risk sort of thing, right?

I mean, certainly high risk. But I really want to emphasize one point that Kaoutar brought up, which is there's tremendous opportunity as we look at kind of this next generation of AI and what's going to come next on AI and hardware co-design. So making sure that we're developing these models and the hardware that runs them in tandem to really unlock kind of new performance levels, new efficiencies and cost, there's tremendous opportunity there. So I think it makes a lot of sense to start to put some skin in the game, so to speak, given that there's just a ton of ways that they could continue to innovate once they have better control over hardware design.

Yeah, for sure. And Shobhit, I guess maybe you're kind of ideal to wrap up this section and close us out for the episode, since you think a little bit about what this all means for business, what this all means for enterprise. Can you paint a picture a little bit more? Right, because I think the semiconductor stuff is often very abstract, but as Kate is saying, there are some very practical implications for our experience of these kinds of technologies in the systems. But I'm kind of curious, what does the everyday look like if OpenAI is really successful here?

I think Nvidia is a great partner of ours. We do a lot of work together, we have joint clients and whatnot, so we do see a lot of their work. Yesterday I spent the entire day with Nvidia. We're doing a lot of work around where they can go and work with enterprises beyond the hyperscalers themselves. So they got into quite a bit of detail behind the covers, explaining to us the intellectual property they've built, the differentiation. They have a significant moat today, not just at the chip level, but in the way you do the architecture, the entire end-to-end flow, the total cost of ownership. You're going down from a massive data center to one box.

Just the wiring in the existing data centers is more expensive than that one box from Nvidia. So that's the total cost of ownership. And Jensen made this famous statement saying that even if their competitors, who are their customers as well, made free chips, the total cost would still be lower on Nvidia. So they've done an incredibly good job of driving higher efficiencies, more throughput, 5x, 10x on the same kind of footprint. So I think it will take a while for a company like OpenAI to do everything that's around it.

It'll take them a while. Just like when Tesla came to market, it took them a while to figure out how to actually productionalize this end to end. Creating a car, the actual core of it, that piece was great. Their researchers could solve for that. But the whole manufacturing and the supply chain and the total cost, and how to get a car to actually be a $30,000 car that people want to buy, that took time, and it'll take a while for OpenAI to get there. And I think that, in my view, is going to distract them a little bit from their core business.

They, in my view, should be focusing more on how we get to adding more intelligence. What Ilya just did with SSI raising a billion dollars, what the Claude models are doing with more responsible AI and stuff. I think there's still a lot more focus that's needed on solving that side of the problem for enterprises. The cost will come down over time, just the way the economics work. The cost of computing on Nvidia has plummeted in the last decade. So I think the focus of OpenAI should still be on problems that need to be resolved before they start vertically integrating end to end.

Yeah, it'll be fascinating to see. And as I said, I think this will not be the last time that we talk about this issue. So I'm not overly sad that we ran out of time on it today, but we will pick it up in the future. So that's what we have time for today. So, Shobhit, Kate, Kaoutar, thanks for joining us on the show. And for all you listeners out there, if you enjoyed what you heard, as always, you can get Mixture of Experts on Apple Podcasts, Spotify and podcast platforms everywhere, and we'll see you next week.
