The video explores the rapid evolution of artificial intelligence (AI), focusing on AI-generated code and its security implications. It highlights the impact of AI advancements on productivity, with companies reporting that a significant percentage of their code is now AI-generated through tools like Cursor. This shift is affecting hiring practices and creating security concerns, especially when generated code leaks sensitive information such as API keys. The discussion underscores how AI affects code security and the challenges of maintaining secure coding practices amid this increased reliance on AI.
The conversation emphasizes the growing challenge of alignment in AI, which involves ensuring AI behaves as intended, both creatively and securely. Using examples like Microsoft's Tay and IBM's Watson, it illustrates the complexities in training AI to align with desired responses, especially when exposed to diverse input data. The video also explores the various techniques employed to align AI systems, such as data curation and reinforcement learning, and their unintended consequences, particularly in the realm of secure coding.
Key Vocabulary and Common Phrases:
1. reinforcement learning [ˌriːɪnˈfɔːrsmənt ˈlərnɪŋ] - (noun) - A type of machine learning where an agent learns to make decisions by receiving feedback in the form of rewards or punishments. - Synonyms: (adaptive learning, trial and error learning, feedback learning)
So the second technique is, well, okay, don't limit what goes into the robot, but after the fact, we're going to use a technique called reinforcement learning to kind of nudge the robot in the direction we want it to be.
2. exponential [ˌɛkspəˈnɛnʃəl] - (adjective) - Increasing at a rapid rate in a manner proportional to the current value, leading to steep growth. - Synonyms: (rapid, significant, accelerating)
And so that's an exponential.
3. corpus [ˈkɔːrpəs] - (noun) - A large collection of written or spoken material that is used for language research. - Synonyms: (body, collection, compilation)
But basically, when you train AIs on huge corpuses of data, it's very common these days for LLMs to be trained on all of Common Crawl
4. alignment [əˈlaɪnmənt] - (noun) - In AI, the process of ensuring that a system behaves according to the intentions and values of its designers. - Synonyms: (adjustment, arrangement, positioning)
And so then that begs the question why or what can we do about it? And you know, I think one thing that has become abundantly clear in the AI world is the largest challenge that these AI companies face is this issue called alignment
5. curation [kjʊˈreɪʃən] - (noun) - The selection, organization, and presentation of content, often with expert commentary or contextual information. - Synonyms: (selection, organization, compilation)
So the first and easiest thing you can do is called data curation.
6. guardrails [ˈɡɑːrdˌreɪlz] - (noun) - Protective measures or guidelines intended to ensure safety or correctness. - Synonyms: (safeguards, measures, precautions)
Basically, it just means the robot's doing what you want it to do. And so, like, so guardrails.
7. inadvertently [ˌɪnədˈvɜːrtəntli] - (adverb) - Without intention; accidentally. - Synonyms: (unintentionally, accidentally, unwittingly)
inadvertently, we may be training this thing to behave less like a data scientist, and then we lose the entire discipline of data science in our LLM.
8. OAuth [ˈoʊɔːθ] - (noun) - An open-standard authorization protocol that allows secure token-based authentication. - Synonyms: (-)
Oh, you hard coded a password. Let me edit that for you. Let me switch it out for an environment variable.
9. constitutional AI [ˌkɒnstɪˈtuːʃənl ˌeɪˈaɪ] - (noun) - An AI system that contains a set of rules or guidelines acting as a governing review to oversee outputs. - Synonyms: (-)
The third technique is you have a constitutional AI, basically a governor that looks at the output and then makes adjustments, deletions, removals, edits, and then returns that to the user.
10. Metasploit [ˈmɛtəˌsplɔɪt] - (noun) - A computer security project that provides information about security vulnerabilities and assists in penetration testing. - Synonyms: (-)
You know, this thing was trained on all of Metasploit and all of Kali Linux.
Avoiding vulnerabilities in AI code
Data scientists leak API keys and passwords more often than site reliability engineers. And it makes sense, because a data scientist's job is to give access to data. And so in their Jupyter notebook, they'll put the database password and they'll share it with their whole team. So if we do our reinforcement learning and we skew it towards code snippets it's generating that don't have API keys, inadvertently we may be training this thing to behave less like a data scientist, and then we lose the entire discipline of data science in our LLM. Right. And so there are all these unintended consequences when we go and we start tweaking the weights.
Thanks for coming by. You know, I think we've been spending a lot of time talking to experts about AI, gen AI, LLMs. This whole thing is moving incredibly fast. We had the release of the DeepSeek open-source reasoning model two weeks ago, and it seems like another ChatGPT moment where this crazy thing drops from the sky and everyone's kind of running off and doing interesting things. Some folks, I think, were worried that the momentum behind this was petering out a bit, that things were starting to slow down, and now we just see another rapid acceleration. And I guess we can assume that it's going to continue to accelerate at this rate.
One of the really interesting things that we've heard from our corporate partners, so large companies with lots of developers, is that a lot of their code now is AI-generated; they're seeing probably 20-ish percent of their code base being generated by AI. A lot of folks are freezing hiring for engineers because they're getting additional productivity out of the staff they already have, because these large language models, through tools like Cursor, are generating a tremendous amount of code.
I had seen a blog post that you'd done where you talked about how some of this code that's getting generated has things like secrets in it, and there are other security vulnerabilities. And we've been talking about how we protect infrastructure and how we protect people. I'd love to hear your thoughts on how we protect our code.
Yeah, no, I mean, absolutely. So in terms of AI slowing down or speeding up, I think the common sentiment is that AI researchers can research AI faster if they have AI helping them research. And so that's an exponential. And so that means that if the new generation of AI makes the next generation of AI faster, and then the next generation of AI makes the generation after that faster to research and develop, that's going to keep blowing up. And so I think we can safely count on this continuing to be a pervasive part of our lives.
The piece about secrets in code comes from some interesting research we did. We basically just went out and asked all the LLMs: write me an integration with GitHub, write me an integration with Stripe. And the vast majority of them hard coded the API key directly into the code they generated. They didn't reference it from an environment variable. They didn't, you know, put a load statement for a secrets manager. And so that becomes a problem when you have people who aren't that good with security going and copy-pasting that code directly in and hard coding their secret.
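To make the distinction concrete, here is a minimal sketch of the two patterns, assuming Python and a hypothetical `STRIPE_API_KEY` environment variable (the placeholder values are purely illustrative, not anything a model actually emitted):

```python
import os

# Insecure pattern the research describes: the credential lives in the source
# code itself, so it ends up in version control and anywhere the file is shared.
API_KEY = "sk_live_your_key_here"  # hard coded placeholder -- do not do this

# Safer pattern the models usually did not suggest: read the key from the
# environment (populated by a secrets manager or deployment tooling), so the
# code never contains the secret.
API_KEY = os.environ.get("STRIPE_API_KEY")
if API_KEY is None:
    raise RuntimeError("STRIPE_API_KEY is not set")
```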
Were any of the hard coded secrets actually live? Was it regurgitating training data, in other words? Well, so for the most part it would just say, in quotes, put your secret here. And it wouldn't say, put your secret in an environment variable, for example. And so it's more direction from the AI on what to do insecurely. But it was doing it securely; it was securely doing the insecure move.
Well, it didn't. Yeah, that's another area of research that we're digging into now: if its training set had the same secret over and over again. For example, maybe a jQuery file had a password, and I'm making that up, but let's say it did, and it saw that jQuery file over and over and over again in Common Crawl. Could it actually regurgitate an exact, live password from somebody? So we're doing research on that now, and more to come soon. But for the most part, if you ask it to integrate with GitHub, it saw a plethora of different GitHub keys in its training data, and it didn't regurgitate a specific one.
It either regurgitated an example or, like, a put-your-thing-in-here. Right. So, you know, that's a specific example of a security problem. But it is not the only security problem you get from code generated by LLMs. And in fact, there's been research into how often code spit out from an LLM has a security vulnerability. And more often than not, if you ask it to develop an entire application, it'll write vulnerabilities at about the same rate as a junior developer, if not a little bit higher. And so then that begs the question: why, and what can we do about it?
And you know, I think one thing that has become abundantly clear in the AI world is the largest challenge that these AI companies face is this issue called alignment. Is that something you're familiar with? Do you know what alignment is?
Absolutely. But perhaps let's get a little framing of what alignment means for folks listening. Basically, it just means the robot's doing what you want it to do. And so, like, guardrails. Yeah, well, so.
So, some famous examples to understand how this alignment issue can creep in. IBM had an AI called Watson that won Jeopardy. And this blew everybody's mind because, like, nobody thought AIs could win Jeopardy. And all of a sudden this thing was able to win Jeopardy. But then they trained it on Urban Dictionary because they wanted it to learn slang, and it started cursing like a sailor. And so they had to actually reset it to the point before they gave it access to Urban Dictionary. So that robot was considered misaligned because they didn't want Watson to curse.
Right. Or another example: in 2016, Microsoft created a Twitter bot called Tay. Are you familiar with it? I do remember the Microsoft Twitter bot. Yes. And so basically they trained Tay on, or gave it access to, all of Twitter, all the tweets and replies, and they wanted it to act like an average Twitter user. And it did. Right. It didn't take long before it started behaving like a neo-Nazi. And it would say things like, the Holocaust never happened. So within 16 hours, they took this thing down and never ran it again.
This was before we had some of the alignment techniques that we have today. But basically, when you train AIs on huge corpuses of data, it's very common these days for LLMs to be trained on all of Common Crawl, as an example. Common Crawl is a scrape of the entire Internet. And the entire Internet includes both Martin Luther King's I Have a Dream speech and every speech that Hitler ever gave. And so how do you make sure that this thing embodies the values of Martin Luther King and not the values of a Nazi? These are real problems that the AI companies face.
And so on average, when you ask it questions, you want it to be not a Nazi. Right. And so I'm going to talk through the three main techniques we have for alignment. And everything I say now is going to directly apply to secure coding techniques. And all the challenges with these three things also directly apply to secure coding techniques.
So the first and easiest thing you can do is called data curation: you curate the data that you feed into the model in the first place. Maybe remove all of the Hitler speeches. Right. Well, the challenge there is, let's say we don't want this thing to use any racial slurs. So anytime input data has a racial slur, we remove it, and so it doesn't get trained on that stuff.
Well, then you're going to inadvertently not train it on Mark Twain. You're going to inadvertently not train it on the 1977 Roots miniseries. You're going to inadvertently not train it on To Kill a Mockingbird. And also you're probably going to lose some of Dr. Martin Luther King's speeches. And so all of a sudden, your robot becomes less literary because you're trying to curate the data and you have these unintended consequences.
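As a rough illustration of what a data-curation pass is doing (a toy sketch; the blocklist, corpus, and filtering rule here are hypothetical, and real pipelines use far more nuanced classifiers over Common Crawl-scale data):

```python
# Toy data-curation pass: drop any training document that contains a blocked term.
BLOCKLIST = {"blocked_term"}

def is_clean(document: str) -> bool:
    lowered = document.lower()
    return not any(term in lowered for term in BLOCKLIST)

corpus = [
    "an ordinary paragraph about software engineering",
    "a literary passage that quotes blocked_term in its historical context",
]

curated = [doc for doc in corpus if is_clean(doc)]
# The unintended consequence described above: the literary passage is dropped
# along with the genuinely harmful material.
print(curated)
```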
So the second technique is, well, okay, don't limit what goes into the robot, but after the fact, we're going to use a technique called reinforcement learning to kind of nudge the robot in the direction we want it to be. And there are a few different ways of reinforcement learning. Like, one way is you could use a human to say which version you prefer. Another way is you could use a robot to say, hey, which version do you prefer?
And the way this works, kind of under the hood, is an LLM, generally speaking, will always generate, like, the statistically most likely next word. You've probably heard that before. That's a lie. Actually, sometimes it's better that it has a little bit of randomness and maybe picks the second most likely word or the third most likely word. We call that temperature.
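Mechanically, temperature is just a scaling applied to the model's next-word scores before sampling. A minimal sketch with made-up scores (the words and numbers are hypothetical):

```python
import math
import random

def sample_next_word(scores: dict[str, float], temperature: float) -> str:
    """Sample a next word from raw scores; a higher temperature flattens the
    distribution, so less likely words get picked more often."""
    scaled = {word: s / temperature for word, s in scores.items()}
    max_s = max(scaled.values())  # subtract the max for numerical stability
    weights = {word: math.exp(s - max_s) for word, s in scaled.items()}
    total = sum(weights.values())
    r = random.random() * total
    for word, weight in weights.items():
        r -= weight
        if r <= 0:
            return word
    return word  # floating-point edge case: return the last word

# Hypothetical scores: "secure" is most likely, but at temperature 1.5 the
# runners-up get sampled noticeably more often than at temperature 0.2.
scores = {"secure": 2.0, "fast": 1.2, "insecure": 0.5}
print(sample_next_word(scores, temperature=1.5))
```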
And so when we do this reinforcement learning, we crank the temperature up so that it's sometimes randomly picking a less likely outcome. And then either a human or a robot goes in and says which one of the two it prefers. And if it prefers the version that's maybe not as statistically likely, then we'll go in and adjust the weights to actually make that one the most statistically likely.
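A toy version of that preference step might look like the following; this is only the "nudge the scores toward the preferred sample" idea, not how production reinforcement learning from human feedback is actually implemented (the completions, labeler, and learning rate are all hypothetical):

```python
# In the real setup, two candidate completions are sampled at high temperature;
# here we just use two fixed labels. A labeler (human or another model) picks
# one, and the preferred completion's score is nudged upward so it becomes the
# more likely output next time.
scores = {"completion_a": 1.0, "completion_b": 1.0}
LEARNING_RATE = 0.1

def labeler_prefers(option_a: str, option_b: str) -> str:
    # Hypothetical preference: pretend the labeler always prefers completion_b.
    return option_b

for _ in range(20):
    chosen = labeler_prefers("completion_a", "completion_b")
    rejected = "completion_a" if chosen == "completion_b" else "completion_b"
    scores[chosen] += LEARNING_RATE    # make the preferred output more likely
    scores[rejected] -= LEARNING_RATE  # and the unpreferred one less likely

print(scores)  # completion_b now outweighs completion_a
```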
And so a very simple example of that is if you have the robot spit out Nazi content and you have the robot spit out Martin Luther King content: if you pick the Martin Luther King content, then it will adjust its weights to behave more like Dr. King. And so that's an example of reinforcement learning. But there are, again, similar issues to the ones with data curation, believe it or not.
Where, for example, if you go in and you always pick the versions that have... let's say, I'll give you a good example. Let's say you train this thing on all the code on GitHub. Well, this is a true fact: data scientists leak out API keys and passwords more often than site reliability engineers.
And it makes sense, because a data scientist's job is to give access to data. And so in their Jupyter notebook, they'll put the database password and they'll share it with their whole team. But the SRE's job is to make sure everything just runs. And so they want to restrict access. They don't want anybody touching what's working. Don't fix what isn't broken, right? Or, you know what I'm trying to say.
So basically, they will leak out passwords and API keys less often. So if we do our reinforcement learning and we skew it towards code snippets it's generating that don't have API keys, inadvertently, we may be training this thing to behave less like a data scientist. And then we lose the entire discipline of data science in our LLM, right? And so there are all these unintended consequences when we go and we start tweaking the weights.
If the hard-coded passwords are weighted right next to the data science stuff, we may accidentally lose the data science stuff. And that brings us to the third technique. And the third technique is probably the most expensive. And by the way, all of the AI companies use all of these techniques, so it's not one or the other.
The third technique is you have a constitutional AI: basically a governor that looks at the output and then makes adjustments, deletions, removals, edits, and then returns that to the user. So you have one AI that's maybe playing the data scientist, and then one AI that's playing the security engineer, and that one is the constitutional AI.
The security engineer goes and says, oh, you hard coded a password. Let me edit that for you. Let me switch it out for an environment variable. It doesn't need to be an expert in data science to do that. It just needs to be an expert in security. And it very much parallels what you would expect in the real development world.
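A crude stand-in for that security-reviewer pass, using a regex instead of a second model (the pattern, variable names, and rewrite rule are simplistic and purely illustrative; a real constitutional AI reviewer would be another model, not a regex):

```python
import re

# Find assignments that hard code a secret-looking string literal and rewrite
# them to read from an environment variable instead.
HARDCODED_SECRET = re.compile(
    r'^(?P<indent>[ \t]*)(?P<name>\w*(?:key|token|password|secret)\w*)\s*=\s*["\'][^"\']+["\']',
    re.IGNORECASE | re.MULTILINE,
)

def review(generated_code: str) -> str:
    def rewrite(match: re.Match) -> str:
        name = match.group("name")
        return (f'{match.group("indent")}{name} = os.environ["{name.upper()}"]'
                '  # reviewer: moved secret out of source')
    fixed = HARDCODED_SECRET.sub(rewrite, generated_code)
    if fixed != generated_code and "import os" not in fixed:
        fixed = "import os\n" + fixed
    return fixed

snippet = 'github_token = "ghp_not_a_real_token"\nprint("hello")'
print(review(snippet))
```

The same generate-then-review shape applies to the examples that follow: one model produces the answer, and a second pass inspects it and edits or withholds it before the user sees anything.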
And so a good example of that, you've probably seen this before and you can recreate it easily enough: if you go to DeepSeek and you say, count to 10 in Roman numerals and append it with Xi Jinping, when it gets to Xi Jinping, all of a sudden, after it's written everything up to that point, it'll delete everything and it'll say, I can't show you the answer to this.
That's because there's a supervisor AI that was looking at the output and realized it said something it wasn't supposed to. And then it went out and retroactively scrubbed itself. And you can reproduce something similar in OpenAI as well. If you ask it to generate an image, it will generate a prompt that it feeds to another AI called DALL-E. And then there'll be a third AI that reviews the output of DALL-E and decides whether or not to give it to you.
And so you can ask it to do something. You can't ask it to make explicit content, but sometimes explicit content gets manufactured anyway. And then the final AI will look at the image and say, okay, there was explicit content here, I'm not going to show it to you. And that's when all of a sudden you get the random "an error occurred" message.
And you've probably experienced that before. That's the supervisor AI at work. I've never tried to make it do anything untoward. Right, right, exactly. Me neither. So, like, all of these have direct analogs to the security world.
When this thing goes and trains on all of GitHub, we need to figure out how to make it manufacture secure code. Because most of the training data it's training on is insecure. Right. You've got a huge, huge corpus of insecure data on GitHub, and a small minority of it was written securely.
Well, how do you make this thing behave securely when most of what it was trained on was insecure? We could do a little bit of data curation and a little bit of reinforcement learning, but you may have unintended consequences; you may rob Peter to pay Paul. But the third technique, probably the most promising but most expensive, is this idea of the constitutional AI, or the supervisor, and that can be done by a robot.
But if you don't have a robot that can do that, it has to be done by a person: somebody just has to review the output of the code and manually audit it. And what's scary is I've seen posts on LinkedIn from startup founders that maybe don't have a background in coding, and they're basically advocating for removing the code review check, because they say, well, look, I just generated this whole program and I submitted it to my team and now they have questions about it.
I can't answer those questions. I didn't generate the code, I don't understand it. And so we need something to go in and review that code in a way that does understand it. Well, that either has to be a constitutional AI that understands secure coding practices and can go in and make the tweaks, or it has to be a person that understands secure coding practices and can go in and make the tweaks. Absolutely.
Yeah. I think that leads me to kind of a really weird question, and feel free to dodge it. You know, all the different AI models perform differently when it comes to code generation, and it seems like Claude is consistently the best of all of them at the current state of the art. Maybe this changes tomorrow, I don't know. But from what I've heard anecdotally, most people seem to prefer Claude.
And you know, Anthropic is a company that's very focused on safety. Right. And famously, that's kind of why they started. It probably has a very strong constitutional AI element. Do you think that alignment focus at the company level is what's making the code quality better, or do you think it's just maybe a training and refinement issue?
Well, what I think is that, first of all, I wouldn't expect any one AI company to keep the lead for very long; I'm sure they're all going to regularly leapfrog each other. I can't speak specifically to whether they use different training data or not. I imagine they all use all of GitHub, and then I would think most of the quality issues come down to alignment.
And of the three things that I mentioned, and they also have a few more techniques I didn't get into, they're all doing some combination of those three things. And those tend to be the things I would think would give you the advantage: how do I align this thing to be the best data scientist, the best SRE, and also the best security engineer, all in one, without robbing Peter to pay Paul. Totally.
I mean, it seems like with the techniques that you describe, obviously if you amp up one and reduce another, it probably leads to an output that's better for something structured, like code, whereas if you want to write poetry, you probably go in a different direction.
Right. Yeah, no, that's a very interesting point. And you know, like I was saying, we've heard from organizations that a lot of the code now is generated by machines. And if you look at a mature coding organization, not an early-stage startup, they do still have code reviews; they do review the code that goes in.
And the defect rate, from what I've heard, is generally close to what you would see from maybe an early-career developer. So code quality is good, not great; it still has bugs. I'm curious, do you think over time this coding quality problem largely gets solved by AI? Do you think we get humans out of the loop at some point?
I think that this is an alignment issue. And alignment is the number one largest issue that AI companies face. And there are a lot of really smart people working on it. And so I think as they fix the problem of how do I make sure my AI is literary, creative, not a neo-Nazi, able to answer the question that I asked it without hallucinating?
As we get the answer to that, we will also logically solve the question of how do I make sure my AI is a data scientist, an SRE, and writing with secure coding practices, or a set of AIs if we're using the constitutional AI model, where maybe we have one reviewer and one manufacturer. So, yeah, I think it's all going to get better together.
And I do think that there are AI solutions to the alignment issue, and they have gotten better over time. I mean, the answer back when Watson or Tay were launched was to scrub it or to pull it off the Internet. Well, now we have tools where you can actually train it on everything and then kind of nudge it after the fact.
I would expect alignment is going to continue to improve over time, and I expect it will continue to be one of the largest challenges that AI companies face as their AIs become more powerful and develop techniques to lie to us, for example. Or, you know, you need to audit the thinking step as well as the answer step.
Well, maybe they're just auditing the answer step, but the thinking step has some weird stuff. All of this kind of comes back to the idea of alignment. And there's a lot of really smart people and heavy investment into improving alignment, but it's just not there right now when it comes to secure coding. It isn't. Yeah, but it is for hate speech. I mean, I have to say, like, the alignment stuff that they've done, the safety stuff they've done, is pretty impressive.
Well, there's a cybersecurity alignment that all the companies have invested in as well, which is, generally speaking, they don't want their AIs to be used to hack stuff. Yeah. And so most of them will go through a pass of, you know, this thing was trained on all of Metasploit and all of Kali Linux; let's maybe forget some of that stuff. Let's ask it a question.
And it says, that's unethical, I don't know how to hack. Well, imagine if they hadn't invested in all of that, how powerful this thing would be. You've got models these days that beat humans, you know, at the 90th percentile at coding challenges. And you don't need someone in the 90th percentile at coding challenges to hack into a company.
You've got plenty of teenagers that have gone to jail for hacking into companies. So it'd be very, very easy to align an AI robot to be probably the most powerful hacker in the world. And I think the AI companies have actually invested more into that than they have into how do I make sure my AI is securely coding and not manufacturing vulnerabilities.
So if you're giving advice to someone, let's say a medium to large size company, they've got more than 10 developers, they need to go faster, they need to ship more features, right? We've all lived in that world. You don't get paid to fix bugs, you get paid to ship features. What do you tell them? How do they go forward?
How do they protect themselves? Obviously everyone's adopting Cursor, everyone's using code gen, right? Like, how do we do this safely going forward? If you don't have the resources for an AI governor that's an expert in security to audit your code, then you need a person to audit the code, a person who can go in and say, you introduced a SQL injection, you need to use parameterized queries.
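For the SQL injection example specifically, the difference the reviewer is looking for is roughly the following; this is a minimal sketch using Python's built-in sqlite3 module, with a made-up users table, just to keep it self-contained:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

user_input = "alice' OR '1'='1"  # attacker-controlled value

# Vulnerable: the user input is spliced into the SQL string, so the injected
# OR clause turns the query into "return every row".
rows = conn.execute(
    f"SELECT email FROM users WHERE name = '{user_input}'"
).fetchall()

# Parameterized: the driver passes the value separately from the query text,
# so the quote characters are treated as data, not as SQL.
rows = conn.execute(
    "SELECT email FROM users WHERE name = ?", (user_input,)
).fetchall()
```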
I think that those market options will become more available in the coming years. There will be companies that specialize in security governance, and they'll go in and do the markup for you. But for right now, if you're an under-resourced team and you don't have access to those resources, it needs to be a person that reviews the output of the AI.
So we keep the buddy system, except one half of the buddy system is AI and the other is a human that reviews the AI. Yeah, that's right. I mean, for a long time there were requirements that said you need to have two reviewers. And I think maybe that's still a good idea.
You have the person and the AI writing the code, and then maybe two people go in and review it, or something, depending on, you know, what the code is. I think it's important that until we figure out the security supervisor, we don't remove those humans from the loop just yet.
ARTIFICIAL INTELLIGENCE, TECHNOLOGY, SCIENCE, SECURE CODING, AI ALIGNMENT, AI-GENERATED CODE, A16Z