ENSPIRING.ai: AI News: A TON Happened This Week - Here's What You Missed
The video provides a comprehensive overview of notable developments in the AI sector, highlighted by significant announcements from major tech companies such as OpenAI, Google, Microsoft, and Meta. OpenAI's Dev Day revealed upcoming AI agents slated for 2025, potentially transforming AI capabilities. Furthermore, features like the new canvas UI in ChatGPT were discussed, showing advancements in user interface and functionality.
Microsoft's innovations, including Copilot and generative search updates, indicate the company's drive to enhance user interaction and streamline technical processes. Meanwhile, Meta's introduction of Llama 3.2, a significant AI model that works offline and supports diverse applications, suggests advancements in AI's accessibility and flexibility, supporting both developers and general users.
Key Vocabulary and Common Phrases:
1. overhaul [ˌoʊvərˈhɔːl] - (verb / noun) - To thoroughly examine and make necessary repairs or improvements. - Synonyms: (revamp, refurbish, renovate)
And canvas is kind of a complete overhaul of the UI inside of ChatGPT.
2. fireside chat [ˈfaɪərsaɪd tʃæt] - (noun) - An informal conversation or discussion, often addressing a broad audience. - Synonyms: (discussion, informal talk, dialogue)
Sam Altman did a little fireside chat at the end of Dev Day and allowed the audience to ask questions.
3. distillation [ˌdɪstɪˈleɪʃən] - (noun) - The extraction of the essential meaning or most important aspects of something. - Synonyms: (extraction, purification, refinement)
They also rolled out model distillation in the API.
4. valuation [ˌvæljuˈeɪʃən] - (noun) - An approximate calculation or judgment of the value of something. - Synonyms: (appraisal, assessment, estimation)
They managed to raise $6.6 billion in funding at a $157 billion post-money valuation.
5. iterate [ˈɪtəˌreɪt] - (verb) - To repeat a process or set of instructions to achieve a desired result. - Synonyms: (repeat, redo, replicate)
It's available for everybody to use and improve upon and iterate on.
6. distributors [dɪˈstrɪbjətərz] - (noun) - Entities that supply goods to stores and other businesses that sell to consumers. - Synonyms: (suppliers, vendors, wholesalers)
Targets the distributors of AI deepfakes on social media, specifically, if their post resembles a political candidate.
7. Neural Processing Unit (Npu) [ˈnʊrəl ˈprɑsɛsɪŋ ˈjuːnɪt] - (noun) - A specialized circuit designed to accelerate machine learning algorithms, typically in AI applications. - Synonyms: (AI chip, processing chip, machine learning accelerator)
Especially if you have one of the new copilot plus PCs with the NPU neural processing unit built into them.
8. mnemonic [nɪˈmɒnɪk] - (noun / adjective) - Aiding or designed to aid the memory. - Synonyms: (memory aid, memorization tool, aide-mémoire)
Recall feature is essentially like your Internet browsing history, but for everything you do on your computer.
9. contextualize [kənˈtɛkstjʊəlaɪz] - (verb) - Place or study in context. - Synonyms: (frame, situate, relate)
If I ask it to explain some sort of complex concept to me and I'm not quite understanding it, you can essentially tell it to dumb it down for you and keep on dumbing it down until you finally get it.
10. empower [ɪmˈpaʊər] - (verb) - Give someone the power or authority to do something. - Synonyms: (authorize, enable, permit)
Llama has seen ten x the growth and has become one of the preferred large language models for AI development.
AI News: A TON Happened This Week - Here's What You Missed
So it's been an absolutely insane week in the world of AI. We've got announcements out of OpenAI, out of Google, out of Microsoft, out of Meta, out of all of the AI art generators and all of the video generators. Everybody decided to drop new announcements and roll out new features this week. So there is a ton in this video. I'm going to try my best to rapid fire it and not ramble too much and share all the crazy AI news that happened with you this week.
Let's just get right into it, starting with OpenAI. This week was OpenAI's Dev Day. For the most part, the announcements were geared towards developers and not really end users of ChatGPT, but there was still some interesting stuff that came out of Dev Day. Sam Altman did a little fireside chat at the end of Dev Day and allowed the audience to ask questions. And a lot of the questions were about when are we gonna get AI agents? And according to this article over on Tom's Guide (tomsguide.com), it says OpenAI confirms AI agents are coming next year.
OpenAI is on target to launch agents next year. These are independent artificial intelligence models capable of performing a range of tasks without human input, and they could be available in ChatGPT soon. Now, as far as I know, OpenAI hasn't actually released the recordings from Dev Day yet. However, I did find this YouTube channel called Kyle Cabosaris, I'm sorry if I mispronounced your name, where he actually filmed the entire fireside chat with Sam Altman. And the closest thing I can find to them saying that agents will be here by 2025 was this little clip here.
Maybe talk to us a bit more about how you see agents fitting into OpenAI's long-term plans. Are they a huge part of it? I think the exciting thing is this set of models, o1 in particular, and all of its successors are going to be what makes this possible, because you finally have the ability to reason, to take hard problems, break them into simpler problems and act on them. I mean, I think 2025 is going to be the year that this really goes big.
Yeah, so they did mention from stage that they think 2025 is going to be the year, and Sam Altman kind of sort of confirmed it. That was the only clip that I can find that seemed to kind of confirm that they think agents by 2025. Again, I don't think the full video is online, but if you do want to watch this entire fireside chat, do check out Kyle's channel here. I'll make sure it's linked up in the description as well.
I'm going to talk more about the Dev Day announcements in a second, but before I do, let's talk about canvas, which is actually a ChatGPT feature that they just rolled out this week as well. And canvas is kind of a complete overhaul of the UI inside of ChatGPT. We can see that ChatGPT will start suggesting edits. It'll adjust the length to be shorter or longer, change the reading level, add final polish, and check for grammar and clarity. And you can even ask it to add emojis.
They've also added some coding features with this new canvas, like review code, add logs, add comments, fix bugs, or port to a different language, so you can go from like JavaScript to Python or whatever. And according to Sam Altman here over on X, he said that the new canvas feature is now live for 100% of ChatGPT Plus subscribers. So if you are a paid subscriber to ChatGPT, you should have this feature now. Now when I pop into my ChatGPT account, it still defaults to GPT-4o. But if I click this dropdown, you can see we now have an option to switch to GPT-4o with canvas.
We don't get canvas with the new o1-preview model that thinks through things, but we do have it with the GPT-4o model. So if we click into here, we now have the ability to call on the canvas. If I just give it a prompt like write a short story about wolves learning to use a computer and hit enter, you can see that it completely changes the whole interface: it put my chat over here on the left sidebar, and it put the story that it just wrote over here in the right window.
Now if you use Claude and you've seen Claude's Artifacts, this feels very similar. The biggest difference is that in Claude's Artifacts, you can actually have it generate code, and then you can preview what the code does without going to another screen. This will just output the code, but you still have to copy and paste the code over somewhere else to actually execute it.
Now what this did when it created this new right side of ChatGPT is it made it so I can select text here, and when I select text, I can ask ChatGPT about just that paragraph. I can tell it to rewrite just this paragraph. And if I submit it, you'll see that it will actually automatically fix just that paragraph and leave the rest of the document alone. In the past, you would have basically had to tell it to rewrite the whole thing but just fix the first paragraph, and it would have completely done the whole thing over again.
If I select this text, I can also format it, you know, make it bold, italic, change it to headings, things like that. And then over on the right sidebar, you can see this little pencil icon down here. If I hover over it, it opens up a new menu here with suggest edits, adjust the length, reading level, add final polish, and add emojis. So if I click suggest edits, it actually reads through the story that it wrote and then suggests its own edits to its own story.
And if you like the edits that it's suggesting, just like in a Google Doc or something like that, you click apply and it actually tweaks it and edits it with the new version that it suggested to itself. I can click on adjust length and we've got a little slider here to make it super long or super short. So if I go to shortest and just let go, you'll see it'll just rewrite the whole thing, but in a much shorter version. If I click adjust the length and bring it all the way up to longest, it reworks the entire story, this time making it quite a bit longer.
We can change the reading level with a slider as well, all the way down to a kindergarten level, up to a graduate school level. So if I bring it down to a kindergarten level here, you can actually see that the writing style is more like a kid friendly writing style. Deep in the forest under the tall trees, the wolf pack found something strange, et cetera, et cetera.
If I change the reading level all the way up to a graduate school level, you can see it's much more descriptive. Deep within the dense forest ecosystem, beneath the expansive canopy of towering conifers, a wolfpack encountered an anomalous object. So it completely changes the reading level.
So if you're trying to get it to explain some sort of complex concept to you and you're not quite understanding it, you could essentially tell it to dumb it down for you and keep on dumbing it down until you finally get it. You can add some final polish, you can see it goes through and formats it. We got like a headline and some sub headlines and it just sort of broke it up and cleaned it up so it's a little bit easier to read. And then if we click add emojis, it will do exactly what you expect it to do and just fling some emojis in there as well. And yeah, make it look like that.
I'm going to go ahead and clear this and go to a new chat window because there's something else that they've recently added that's pretty cool. If I come down to the chat window here, they actually have like quick shortcuts now. If I type slash, you can see I've got reason, search and picture. So if I select picture, anything I put after this, it will call on DALL-E 3 to generate.
If I do search, it will make sure that it searches the web before it finishes the prompt. And if I click on reason, it's going to make sure it uses the new o1 model that really thinks things through. So some pretty handy new updates to ChatGPT this week.
All right, jumping back to some of the other stuff they talked about at Dev Day, I'm going to go through it kind of quickly because this is more designed for the developers that are using the API. But they did introduce vision fine-tuning in the API. So developers can now fine-tune GPT-4o with images and text to improve its vision capabilities. So if you're using the OpenAI API and you want to build a tool that uses their vision models, you can actually upload some of your own images and give it some additional context, and it will actually get better on the specific type of images that you train into it.
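If you're curious what that workflow roughly looks like, here's a minimal sketch using the OpenAI Python SDK. The chat-style JSONL format with image_url content parts, the example file, and the exact base model string are assumptions based on OpenAI's fine-tuning docs at the time, so double-check them before running anything.

```python
# Minimal sketch of vision fine-tuning with the OpenAI Python SDK.
# Assumptions: chat-style JSONL with image_url content parts, and the
# "gpt-4o-2024-08-06" base model string; verify both against current docs.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One training example: an image plus the answer we want the model to learn.
example = {
    "messages": [
        {"role": "system", "content": "You identify defects in product photos."},
        {"role": "user", "content": [
            {"type": "text", "text": "Is this unit damaged?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/unit_042.jpg"}},
        ]},
        {"role": "assistant", "content": "Yes - there is a crack along the left seam."},
    ]
}

# Write the examples to a JSONL file and upload it for fine-tuning.
with open("vision_train.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")

training_file = client.files.create(file=open("vision_train.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=training_file.id, model="gpt-4o-2024-08-06")
print(job.id, job.status)
```

In practice you'd want many more examples than one, but the overall shape of the job stays the same.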
Again, something more focused towards developers. Another thing focused towards developers is that they introduced the Realtime API. They recently rolled out the advanced voice mode where you can have a much more conversational chat with ChatGPT. Well, they rolled out the ability to use those conversational bots inside the API. So now other apps can use that same technology within their app.
And if you want, you can even test this yourself outside of the ChatGPT app. You can actually go to the OpenAI playground here, platform.openai.com/playground. And over on the left they added a realtime box here, and I can actually start a session and have a conversation with the GPT-4o model here using these new advanced voice features, but outside of ChatGPT. Hey, how are you doing today? I'm fantastic, thanks for asking.
How about you? I'm doing great. Just recording a video breaking down all the news for the week in the AI world. That sounds exciting. There's always so much happening in the AI world. Any particularly big stories you're covering this week? Oh, wouldn't you like to know? Fair enough. I'll just have to wait for the video then.
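If you'd rather poke at the Realtime API from code instead of the playground, here's a rough sketch of what a minimal text-only session could look like. The WebSocket URL, the beta header, the model name, and the event names below are assumptions drawn from the beta announcement and may have changed, so treat this as a starting point rather than a reference.

```python
# Rough sketch of a text-only Realtime API session over WebSockets.
# Assumptions: the beta endpoint, the "OpenAI-Beta: realtime=v1" header, and the
# "response.create" / "response.done" event names from the beta announcement.
import asyncio
import json
import os

import websockets  # pip install websockets (older versions use extra_headers; newer ones use additional_headers)

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"

async def main():
    headers = {
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        "OpenAI-Beta": "realtime=v1",
    }
    async with websockets.connect(URL, extra_headers=headers) as ws:
        # Ask the model for a single text response.
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {"modalities": ["text"], "instructions": "Say hello in one short sentence."},
        }))
        # Print server events until the response finishes.
        async for message in ws:
            event = json.loads(message)
            print(event.get("type"))
            if event.get("type") == "response.done":
                break

asyncio.run(main())
```

Audio in and out works over the same event stream, which is what makes the playground voice demo above possible.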
They also rolled out model distillation in the API. This lets developers easily use the outputs of frontier models like o1-preview and GPT-4o to fine-tune and improve the performance of more cost-efficient models like GPT-4o mini. They also added prompt caching into the API. This is something that significantly reduces the cost of using the API if you're a developer. This is something that Claude has had for a little bit now, but we're finally getting it in the OpenAI API.
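As a rough picture of how those two pieces fit into a developer's code, here's a sketch: you flag frontier-model completions to be stored so they can later serve as distillation training data, and you keep your long, reused prompt prefix at the front so prompt caching (which is automatic, with no flag to set) can kick in on repeated calls. The store and metadata parameters, the AcmeCo prompt, and the project tag are assumptions for illustration.

```python
# Sketch of the stored-completions side of distillation: flag outputs from a
# frontier model so they can later be used to fine-tune a cheaper model.
# The `store` and `metadata` parameters are assumptions based on the Dev Day notes.
from openai import OpenAI

client = OpenAI()

# A long, static instruction block. Prompt caching is automatic, and keeping the
# reused prefix at the start of the prompt is what lets it apply on repeat calls.
STATIC_INSTRUCTIONS = "You are a support assistant for AcmeCo (a hypothetical company). Answer briefly."

def answer(question: str) -> str:
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": STATIC_INSTRUCTIONS},
            {"role": "user", "content": question},
        ],
        store=True,                               # keep this completion for later distillation
        metadata={"project": "support-distill"},  # tag it so it's easy to filter later
    )
    return completion.choices[0].message.content

print(answer("How do I reset my password?"))
```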
And those were pretty much the big announcements that came out of OpenAI's Dev Day. But there's other OpenAI news this week: they also got new funding to scale OpenAI. Now it's looking more and more likely that OpenAI is going to convert from a nonprofit to a for-profit company. And in this past week they managed to raise $6.6 billion in funding at a $157 billion post-money valuation. This makes them the third largest startup on the planet, I believe.
Last week we got Meta Connect, and they showed off some of the new Meta Ray-Ban sunglasses. During Meta Connect, they announced a new feature that these glasses were going to be getting: memory. So you can look at something and say, like, remember where I parked? Or hey, remind me in ten minutes to call my mom or whatever. Well, those features are rolling out right now into the sunglasses and glasses that you have if you've got a pair.
The new update will also allow them to recognize QR codes and open them on your phone and make phone calls based on a phone number that is seen in front of the camera. I actually tested these features. They work great. You look at a QR code and just by telling your glasses to scan this QR code, you can pull out your phone and it just opens the app on your phone real quick. Pretty handy.
The memory feature is really cool too. I actually haven't tried it with an image yet, where I take a picture of like a parking spot number, but I have tried it where I told it to remind me to do something in five minutes, and then that reminder came through. Pretty handy feature. And while we're on the topic of Meta, for today's video I partnered with Meta because they just released Llama 3.2, and it's a significant leap forward in AI technology.
Whether you're a developer or you're just simply interested in AI, this update is worth your attention. One question I get asked constantly is how can I use an AI model without actually sending my data to, like, the big corporations? Well, Llama 3.2 has a pretty solid answer for you. You can run it directly on your device. In fact, you can use it without even being connected to the Internet if you want.
So what's new in Llama 3.2? Well, first, the larger models, 11B and 90B, both now have vision capabilities. This means that not only can they understand text, but they can understand images now as well. You can ask the AI about a chart in your report or to describe a photo, and it will understand the visual context.
But Meta actually put out some lighter weight models as well, with 1B and 3B text-only models. These are perfect for on-device AI applications and even mobile phones. Imagine a personal assistant that can summarize your messages or manage your schedule, all while keeping your data on your device. One of the really cool features about these models is that they support a 128,000-token context window. It's like being able to put a whole book's worth of information inside of a single conversation.
Llama 3.2 is actually optimized for Qualcomm and MediaTek hardware right out of the gate. Now this is super important for anybody that wants to actually develop AI-powered applications for mobile phones. And one of the most important aspects of these Llama models is that they're open source. So you can download these models from llama.com or from Hugging Face and start building immediately. They're compatible with platforms like AWS and Google Cloud and Microsoft Azure and a ton of others.
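If you want a feel for what running one of the lightweight models looks like, here's a minimal sketch using Hugging Face Transformers. It assumes you've been granted access to the gated meta-llama repo, are logged in with the Hugging Face CLI, and have a recent transformers/torch install; the model ID is the one listed on Hugging Face at launch.

```python
# Minimal sketch: run the 1B instruct model locally with Hugging Face Transformers.
# Assumes access to the gated meta-llama repo (huggingface-cli login) and a
# recent transformers/torch install; falls back to CPU if no GPU is available.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize why on-device models matter in two sentences."},
]

# With recent transformers versions, chat-format input is templated automatically
# and the returned conversation includes the assistant's reply as the last message.
output = generator(messages, max_new_tokens=120)
print(output[0]["generated_text"][-1]["content"])
```

Once the weights are downloaded, nothing in that loop needs an internet connection, which is the whole point of the on-device pitch.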
Now personally, I believe open sourcing these models is really, really important. It encourages innovation and it allows for more diverse applications. This year alone, Llama has seen 10x growth and has become one of the preferred large language models for AI development. If you're a developer, this means that you have access to the most cutting edge models that you can modify and adapt to your specific needs. And if you're not a developer, it means that the apps and services that you're going to use could be significantly more intelligent and more helpful.
I've also got to mention that Meta really prioritized safety with this new release. They've introduced new safeguards including Llama Guard 3, which is designed to ensure that these powerful models are actually used responsibly. So whether you're aiming to create innovative AI applications or you're simply excited about the future of technology, Llama 3.2 is actually really worth exploring. It's powerful, it's really flexible, and it's available for everybody to use and improve upon and iterate on.
So if you're ready to get started, you can go to llama.com or you can go to Hugging Face and download the models to begin your journey with Llama 3.2. The future of AI is open source, and it's here now.
Moving on to Microsoft now. There was a ton of announcements to come out of the world of Microsoft, especially if you have one of the new Copilot+ PCs with the NPU (neural processing unit) built into them. Pretty much all of the new laptops and computers coming out from Microsoft these days are this new version of the Copilot+ PCs. One of the new features you're going to get is the Recall feature. This was a feature that was supposed to roll out when the new Copilot+ PCs rolled out, but a lot of security and privacy concerns popped up and they sort of put it on the back burner for a little bit to fix it and improve some things.
And now they're finally rolling out this Recall feature, which is essentially like your Internet browsing history, but for everything you do on your computer. It remembers you editing videos or writing documents in Word or browsing through your photos, anything that you did throughout the day. It sort of saves it as a history so you can go back to that moment and remember what you were doing on your computer. And yes, you have the option to turn it on and off. And no, they don't actually send the information that they're collecting back to Microsoft. This is all just on device.
They're also adding this Click to Do feature where, if you have like an image open on your computer, you can click on it and it gives the option to visual search with Bing, blur the background with Photos, erase objects with Photos, or remove the background with Paint. So just by clicking on the image you get a whole bunch of new sort of AI related options. It says here it also assists with text related actions such as rewrite, summarize, or explain text in line, opening in a text editor, sending an email, web searches, and opening websites.
Click to Do is context-aware and accessible from any Copilot+ PC screen. They're also improving the Windows search with some AI. You can see they did a search up in the top here where they searched barbecue party. Notice that all of these images are just titled like image 1123, image 1111. Windows figured out what the context of these images was and pulled up all the images related to a barbecue party that were on this computer. And it says here it works even when you're not connected to the Internet.
So this isn't like an online feature. This is just going to use your laptop's NPU that's built into it. I'm not sure if this works with videos or if it's really only for pictures. I really want this for video. That would help so much with organizing B-roll, but getting it with pictures, I imagine it's only a matter of time before we're getting it with videos as well.
They're adding a feature called Super Resolution inside of Photos, so you can open an image inside of Photos on Windows and actually upscale the image. They're adding generative fill and erase inside of Microsoft Paint. So you can erase things in the background and generative fill inside the image just like you can in Adobe Photoshop, but now you can do it in Microsoft Paint as well. So a lot of cool new features that are going to be available on these Copilot+ PCs.
Microsoft also introduced Copilot Labs and Copilot Vision. The first feature available in Copilot Labs is Think Deeper, which gives Copilot the ability to reason through more complex problems. It sounds to me like this Think Deeper is essentially going to use the new OpenAI o1 model that uses that chain of thought prompting where it really thinks things through, but it looks like we're going to get that inside of Copilot Labs. There's also Copilot Vision.
It says if you want it to, it can understand the page you're viewing and answer questions about its content. It can suggest next steps, answer questions, help navigate whatever it is you want to do, and assist with tasks, all the while you simply speak to it in natural language. And they say it's an entirely opt-in feature, so it only works if you turn it on. But here's a little demo that they put out of what that looks like.
They say, hey, Copilot, I'm looking for a place to stay there. On this website, stayness.com, Copilot starts making recommendations. What do you think of this loft house? Now in the video, they're speaking back and forth, but there's music on the video. I don't know the copyright status of that music, so I'm not playing the video for that reason.
But this is an audio conversation that's happening. The user says, hmm, it's a bit pricey. The AI calls that person bougie. And they say, I'm not, I'm just looking for something nice, a little color on the walls, you know? And then the AI says, this one definitely has some color.
The user says, wow, it's giving me a headache. Aha. We don't want that. Wait, this one looks perfect. Minimal, modern. Eww. The user says, you're right, I love it. We're booking it. So that's their little demo of what this Microsoft copilot vision looks like.
Microsoft also updated the Bing generative search feature. They say today we're rolling out an expansion of generative search to cover informational queries, such as how to effectively run a one-on-one and how can I remove background noise from my podcast recordings. Whether you're looking for a detailed explanation, solving a complex problem, or doing deep research, generative AI helps deliver a more profound level of answers that goes beyond surface level results. To use it, you simply type Bing generative search into the search bar and you're met with some queries that you can use. There's also a deep search button on the results page, and they do say it might be a bit slow right now.
Let's just try Bing generative search. And sure enough, when we test it out here, we get a whole bunch of different potential prompts that we can use. I click on reduce podcasting noise, and you can see we get an AI generated response here along with a table of contents. Another thing Microsoft is doing is they're starting to pay publishers if their content is surfaced in some of these generative search results.
Right now it looks like it's just big companies like Reuters, Axel Springer, Hearst Magazines, USA Today, and the Financial Times. I'm not clear if this is going to be something that rolls out for smaller content creators, because that'd be kind of cool. If you write blog posts or make YouTube videos or something like that, and it responds with information that it pulled from smaller creators, it'd be cool for them to get compensated as well. I don't know if that's on the roadmap or not, though.
Right now it looks like all of the big news media outlets are trying to work with them so that they can show results from their websites and also pay them when they do. And in the last bit of Microsoft news for the week, the head of Microsoft AI, Mustafa Suleyman, wrote a letter sharing his thoughts on where he thinks all of this is going. And what he describes is essentially Copilot turning into more and more of an agent for you. He says Copilot will be there for you, in your corner, by your side, and always strongly aligned with your interests. It understands the context of your life while safeguarding your privacy, data and security, remembering the details that are most helpful in any situation. It gives you access to a universe of knowledge, simplifying and decluttering the daily barrage of information, and offering support and encouragement when you want it.
Over time, it will adapt to your mannerisms and develop capabilities built around your preferences and needs. We are not creating a static tool so much as establishing a dynamic, emergent, and evolving interaction. It will provide you with unwavering support to help you show up the way you really want in your everyday life, a new means of facilitating human connections and accomplishments alike. Copilot will ultimately be able to act on your behalf, smoothing life's complexities and giving you more time to focus on what matters to you.
So what he's describing essentially sounds like an AI agent that's trained on you and what you want to use it for most, which I think is something that most people can probably get behind, as long as it's done in a safe and ethical way without impeding too much on your privacy or sharing too much personal data with the big companies.
All right, moving on to Google, because Google had a handful of announcements this week as well. They've been making some updates to the Google Lens tool, a tool where you can upload images and have it search for those images around the web and give extra information about them, things like that. Well, now it can actually understand videos as well. We can see in this demo here some fish schooling, and they talk to it and say, why are they swimming together? And it looks at the video, understands what's in the video, and then actually gives an AI response based on what it saw in the video.
They're also adding that voice questions feature where you can talk to Google Lens. So they take a picture of the sky here, and then they say, what kind of clouds are these? And then it gives them an AI response. But they asked that question vocally; it wasn't them typing the question. They're also adding this feature to shop what you see. So you see a backpack, you take a picture of it, and then it finds where you can actually purchase that backpack online. They're adding the ability to identify songs in their Circle to Search, kind of like the Shazam app, where you just hold it open, listen to a song, and then it tells you what the song is. It sounds like that exact feature is going to be rolled out into Android devices.
They're also going to organize your search results using AI. And if you're still using Google to do a lot of your searches, you're probably going to start to see some of these changes take effect pretty dang soon. But Google makes most of its money through advertising, and so in this new world of AI, they have to figure out how to make money off of the AI responses as well. So we're going to actually start seeing ads inside of the AI overviews. We can see right here that it showed some sponsored messages.
So somebody searches how do I get a grass stain out of jeans, and then it gives an AI response of how to do it. Then when they scroll down a little bit, you can see right below the response there's some sponsored results for things like Tide pens and OxiClean related to their search. And in large language model news this week, we also got a new version of Gemini.
Gemini 1.5 Flash-8B. This is a new small language model that's 50% cheaper, with two times higher rate limits and lower latency on small prompts. This is really for developers that use the API here, and on benchmark tests it looks like it performs pretty well compared to other models of a similar size.
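For anyone who wants to kick the tires on it, here's a minimal sketch using the google-generativeai Python SDK. The "gemini-1.5-flash-8b" model string is the one Google listed at launch, but verify it against the current model list, and the prompt here is just a placeholder.

```python
# Minimal sketch: call Gemini 1.5 Flash-8B through the google-generativeai SDK.
# Assumes GOOGLE_API_KEY is set and that "gemini-1.5-flash-8b" is still the
# current model alias; check Google's model list if this errors.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

model = genai.GenerativeModel("gemini-1.5-flash-8b")
response = model.generate_content("Give me three blog title ideas about small language models.")
print(response.text)
```

The appeal is mostly economics: the cheaper rate and higher limits make it a candidate for high-volume, latency-sensitive tasks rather than deep reasoning.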
Since we're talking about large language models, Nvidia announced a new large language model this week called NVLM-D-72B, and this is an open source large language model that is also capable of vision tasks. And according to this article, it rivals the leading proprietary models like GPT-4o. And if we look at the benchmarks here, we can see that this NVLM-D 1.0 72B is actually pretty on par with the GPT-4o vision model, and in one benchmark it even outperforms GPT-4o and Claude 3.5 Sonnet, which is pretty impressive given the fact that this is an open source model and not a closed model like the ones from Anthropic, OpenAI, or Google's Gemini.
Pinterest is rolling out generative AI tools for product imagery to advertisers. Pretty much the same thing we've seen in tools like Shopify and Amazon: you can upload an image of your product and it can remove the background or put it in a different scene, things like that. We've seen this roll out in all sorts of ecommerce platforms at this point. Well, now you're going to get it directly inside of Pinterest. There was some huge news in the world of AI imagery this week.
Black Forest Labs released a new model called Flux 1.1 Pro, and they also made their API available. So Flux 1.1 Pro, you can use it right now over on Together AI, Replicate, fal.ai, and Freepik, and it's quite a bit improved. If you've seen things on Twitter or X where people are referring to blueberry, blueberry was sort of the code name for Flux 1.1. Here's a little comparison that my buddy Angry Penguin put together.
You should definitely be following him over on Twitter if you're not already. He shares all sorts of cool AI announcements. But we can see the difference here with Flux Pro: the old model renders the text as find me the stars align, while the new model gets find me where the stars aligned, so it seems to be much better with text. You can actually see his prompt here if you want to duplicate that.
Here's another one: Sky's the limit. The old model renders it as Sky's schlint, while the new one gets Sky's the limit. So it's much more understanding of what you're looking for, at least from the text side of things. Here's another example. Feel free to pause on any of these if you want to look at them more closely and grab the prompt. But we can see that the text and even the image is quite a bit better.
In this image, we can see the barbell is kind of going into the cat's head, or maybe behind it, I can't really tell. In this one, you've got the whole thing on the screen, and the text is exactly what was asked for. Here's some more examples. Here's another example of a Ghibli-style old Japanese city: blue sky, sunny background, Japanese temple, Japanese traditional. Here's the original one, and here's the new one. It looks quite a bit better, with a lot better color palette, in my opinion. Obviously, what's aesthetically pleasing is very subjective, but I find this to be a little bit more aesthetically pleasing.
Here's another example. We can really see how prompt-adherent it is, because look at how big this prompt is: a vector illustration, a group of adorable, smiling ghosts wearing different color witch hats, where each ghost has a unique expression, cheerful pumpkins with carved faces, and the background should be dark purple. It pretty much nailed every single element from this long prompt. Here's another example. In the first one, it didn't even get the I lift to eat text right. In the second one, it kind of nailed it. Maybe one of these straws is meant to be the I, I don't know. A handwritten letter written in Old English and signed Flux Pro at the bottom. Look at that. Here's another example. And another example. And one final example.
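If you'd rather hit Flux 1.1 Pro through one of those API hosts instead of a web UI, here's a minimal sketch against Replicate. The model slug and the input fields are assumptions based on Replicate's listing at the time, so check the model page for the current parameters.

```python
# Minimal sketch: generate an image with Flux 1.1 Pro via Replicate's Python client.
# Assumes REPLICATE_API_TOKEN is set; the model slug and input fields are taken
# from Replicate's listing and may change, so check the model page.
import replicate  # pip install replicate

output = replicate.run(
    "black-forest-labs/flux-1.1-pro",
    input={
        "prompt": "a monkey holding a sign that says 'Subscribe to Matt Wolfe'",
        "aspect_ratio": "16:9",
    },
)
print(output)  # typically a URL (or file-like object) pointing at the generated image
```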
So thanks again to Angry Penguin for sharing all of those comparisons with me and giving me permission to share them in this video. Feel free to go back and pause on any of those if you want to see the specific prompt that I didn't cover. Angry Penguin also gave me a quick tip. He said that you can actually use Flux Pro 1.1 right now for free if you use it over on the Glif app website. Some of the other sites that are allowing you to use it for free are only allowing you to use it for free for like a day.
This one, it's free for now, and from what I understand it'll be free for a few weeks, but I don't know how long it's going to be free for. But at the time this video is going live, you can actually play around with Flux Pro 1.1 for free on Glif. So let's go ahead and sign in real quick. Let's just try a monkey holding a sign that says subscribe to Matt Wolfe. It got the to Matt Wolfe part, but kind of missed the subscribe. Let's run it one more time.
This time it got it right with no issues. I actually love making images over here on Glif because they show up in this feed down here, and now anybody that goes and looks at this feed is going to have a monkey telling them to subscribe to Matt Wolfe. But that's Flux Pro 1.1, and I will link up where you can use that below. You can go use Angry Penguin's integration of it, or you can go build your own glif with it in the workflow. We also got some updates out of Leonardo AI this week.
Now this is a company that I am an advisor for, so just keep that in mind when I talk about them. But when they do really good stuff, I talk about the good stuff. When they do stuff that I don't really like, I point out the stuff that I don't really like. So I try to remain fairly unbiased. But this is just the news. This week they rolled out a new style reference feature.
You can upload up to four reference images to direct the aesthetics of your image output. You can also adjust the strength of the reference image. They also rolled out a new image to image feature using the Phoenix preset. They've had image to image in Leonardo for a while, but the feature wasn't available to use with the Phoenix model, which is the model that's probably the best inside of Leonardo. But now image to image is available using the Phoenix model.
So if I jump in here, I go to image creation. Up here in my prompt box you can see a new little image icon. If I click on this, it gives me the option for style reference or image to image or a content reference, which is coming soon. But even if you don't have a style reference already, what's really cool is I can click on style reference here, go to the community feed, and if there's an aesthetic of an image that I really like, I can pull in that same aesthetic for the images that I'm about to generate.
So let's say I really like this painterly look here. Let's go ahead and pull that in, confirm it, and it's going to use that as a style reference. And I'll just put a simple prompt, a robot looking into the camera, and then hopefully it will do it in a similar style. And there we go. It gave me four generations.
This one's probably the best looking, but you can see it looks like a painted image that models the style that we have up here, but with a robot looking into the camera. Also, a sort of newer feature they rolled out under the generation mode is this Ultra mode. What that's actually doing is upscaling all of these images right as they generate. So if I actually look at this image at full size, you can see it's a fairly large image. It's actually been upscaled right within the pipeline of generating the image. That's pretty cool as well.
There's also lots of new features rolling out for Leonardo soon, but as they roll out, I'll show them off. I'm not quite allowed to talk about them yet, but there is some exciting stuff. I'll share that as it rolls out. Adobe rolled out some new AI features inside of their Photoshop elements and premier elements products.
These are sort of stripped down versions of Photoshop and Premiere that don't have all of the features, but they're for more like casual users. You've got like object removal, new AI color correction features, depth of field simulation, and a handful of other smaller AI related features that were in the bigger platforms. But now the sort of more casual elements version of these platforms are getting these AI features as well.
Luma's Dream Machine, which is one of the more popular AI video generation models, got an upgrade this week. They now have hyper fast video generation with 10x faster inference, so you can now generate a full quality Dream Machine clip in under 20 seconds. Pika made a bunch of waves this week with their new Pika 1.5 model. But most of what we've seen from this 1.5 model has been more of these types of videos where there's an object and you can see the object getting squished, or here's some of my generations where it shows me sitting there and I get blown up and then float away like a balloon, or get crushed by a hydraulic press, or exploded. We've been seeing a lot of these types of videos, but it also seems like it should be able to do text to video, because all of these are like text to video generations, or possibly image to video generations. But for me, for whatever reason, I have not gotten text to video to work.
I actually tried starting to generate like a monkey on roller skates and a wolf howling at the moon, and these have been trying to generate for about 36 hours now. At this point, I'm not confident they're ever gonna actually generate. But these, like, meme type videos where you can squish yourself or cut something open like a piece of cake, those all work perfectly. I just gotta figure out how to get the text to video to actually work, because that's not working for me anymore for some reason.
But the videos they did show off, they look pretty dang impressive. Possibly cherry-picked. I mean, most of the time when you're gonna see stuff on social media, it's gonna be cherry-picked stuff. It looks really, really cool. So I'm excited to actually be able to generate with text to video. I just still haven't really quite gotten it to work yet.
ByteDance, the company behind TikTok, also revealed a new AI video generator this week that is said to rival Sora. That seems to be the benchmark that everybody compares the video generator models against: a model that none of us have actually gotten our hands on yet. But here's some examples of what it can do. Here's a woman taking off her sunglasses, standing up and then walking away from the camera. It looks pretty good.
I mean, you can tell it's AI generated, but it looks pretty good, and it's generating 10-second clips as well. Here's another one of a man, like, bowing down to a woman here and then looking back up at her, and then she's crying. That one looks pretty good, too, but it sort of feels like it's in slow motion still. Here's another example of, like, a black and white video zooming in on a woman's face who's wearing sunglasses. I mean, they're looking pretty good.
My buddy Tim over here at Theoretically Media actually did a breakdown video all about this new model. That's about nine minutes. I'll link that up below if you want to take a deeper dive into this model.
Here's something that's pretty cool that's coming to Steam. So if you're a gamer and you have Steam on your computer, there's this new thing called Dream World coming out where you can create any 3D asset and just drop it into the world that you're playing in. Here's the demo they put out around that. They type in giant King Kong, and then a giant 3D King Kong is now just like in that world. Black and gold Anubis statue.
I think I mispronounced that, but you can see it put that big statue there. Anything that they can imagine, they can just drop into the world. Now, this looks like it's a sort of bigger game with, like, open world and challenges and things that you actually do. But then one of the sort of cool, interesting things that makes this game novel is that you can just think of anything and then drop it into your world.
Whether or not you can actually use those things, I'm not sure. Like, if you make a boat on the ocean, can I then jump into that boat and sail across the ocean? If I, you know, generate a car in this world, can I jump in the car and drive it around? I don't know. It seems like you just kind of can create the stuff and it's just there and added to your world, and you've got, like, all the crafting and open world elements you get out of, like, a Valheim or a Minecraft or something like that, but with also the ability to just sort of drop 3D objects anywhere in the world. I don't know. I'll probably grab it and play around with it once it's available, though.
We also got the news this week that the governor of California, Gavin Newsom, vetoed SB 1047. I've sort of talked about that one enough, but it was the bill that would hold responsible the AI companies that made the model if somebody else took that model and did something that caused catastrophic harm. So if somebody took the Llama model, tweaked it, and then figured out how to make, like, a chemical weapon that had catastrophic impact, Meta, the maker of Llama, would be held responsible, as well as the person who made the actual chemical weapons.
And all of the AI companies were fighting against this bill because they were basically saying, we just want to make better and better models. We don't know what people are going to use these models for in the future. And Gavin Newsom vetoed the bill. Most likely some regulation is going to come around this stuff, just not that bill specifically. It's only a matter of time before another one gets drafted up and goes through Congress.
And hopefully it's something that more people can agree upon, I guess. But while we're speaking of AI legislation, a judge actually blocked another AI bill that was related to deepfakes. The AB 2839 bill, which was signed by Governor Newsom, was slapped down by the courts. AB 2839 targets the distributors of AI deepfakes on social media, specifically if their post resembles a political candidate and the poster knows it's a fake that may confuse voters.
The law is unique because it does not go after the platforms on which AI deepfakes appear, but rather those who spread them. And the judge basically said that this goes against the freedom of speech and that the only thing that's going to stick from this bill is that if you are going to spread a deep fake message with, like, political figures, you have to say that it was generated with AI. You could still spread them and share them.
You just have to disclose that they were made with AI rather than trying to pass them off as real. That's the only part of the bill that stuck. Everything else from the bill, they basically said, no, that's against freedom of speech.
Amazon's rolling out some new fire tablets that are going to have AI tools built into them. Things like writing assistance, getting webpage summaries, and creating wallpapers from a prompt. I mean, I'm pretty sure at this point, like, every tablet that comes out, no matter who makes it, is going to start rolling out with AI features. It's kind of become like a necessity or like an expectation of these devices these days. But we're getting it inside the fire tablets.
And then finally, I want to end with this one because I thought this was sort of one of the cooler things I saw this week, which is a robust ladder-climbing quadrupedal robot that we can see in this video. They actually created one of these four legged robots and designed it so that it can now climb ladders. Right now it can only climb up ladders; it can't climb down ladders. But if we look at the robot itself, what really makes it unique is sort of the claw-like hand that it's got so that it can grip over ladders and climb them.
And the idea being we can send robots into very high, dangerous places up ladders where we would normally send humans. And for safety reasons, wherever it seems smarter to send a robot, they would do that instead of putting a human's life at stake. So I think that's pretty cool. We can see here this sort of digital twin world where they're all being trained on these ladders, and I just love robots.
Robots are really, really fun. Whenever I come across new robots doing novel things that I haven't ever seen robots do, I'm probably gonna talk about it because yeah, robots are just cool. Hopefully they don't rise up and destroy us all once they get smarter and smarter AI. But let's not think about that right now.
Oh, and one more thing before I wrap up here. I am going to be helping to judge an AI hackathon up in Santa Monica on October 12th and 13th. It should be pretty cool because you don't actually have to be a developer yourself to participate in this hackathon. You can actually use AI to help you code, or you can be someone that actually knows how to code, and we'll just see who comes up with the coolest product at the end of it. It should be pretty fun. It's happening, again, October 12th and 13th.
You can go to hack.cerebralbeach.com to learn more. Again, I'll be one of the judges. I'll be there, so it'll be fun to meet some people in person there. And that's what I got for you today. If you haven't already, check out futuretools.io. This is where I share all of the coolest AI tools I come across. I share all the AI news that I come across, and I have a free newsletter where I'll share just the coolest tools and coolest news that I think you need to know about directly to your inbox.
It's all free. You can find it at futuretools.io. Thank you so much for tuning in. If you want to stay on the cutting edge and stay looped in with AI and the latest AI tutorials and how they're doing this stuff and the latest AI news and all that kind of stuff, like this video.
Subscribe to this channel and I will make sure more of that stuff keeps showing up in your YouTube feed. I'll try to help you stay on the cutting edge of everything that's happening in this world. Thank you once again for tuning in and nerding out with me. I know this video is a little bit long. There was a lot that happened this week, but I appreciate you sticking with it and hanging out with me.