The Evolution of AI Learning: From Zero-Shot to Multimodal Magic

Imagine watching a toddler learn about the world. They point at a dog and call it a “doggy,” then see a cat and—after a puzzled pause—maybe also call it a “doggy” until someone corrects them. Bit by bit, the child figures out new things, sometimes with examples and sometimes by guessing from what they already know. In many ways, AI models learn in a similar staged process. In fact, the journey of modern AI Large Language Models (LLMs) can be compared to a child growing up: starting from making educated guesses with no guidance (zero-shot), to learning from one example (one-shot), then a few examples (few-shot), and finally learning to use multiple “senses” like vision or hearing (multimodal).
In this journey, we’ll explore how LLMs evolved through these stages. Don’t worry: you won’t need a programming background to follow along. We’ll use fun analogies (think students, games, and even cooking) to explain each step, plus a few optional code sketches for curious readers (feel free to skip them). By the end, you’ll see how AI went from being a good guesser to a multi-talented assistant that can see and hear the world like we do. So, let’s dive in!
Zero-Shot Learning: Guessing Without Examples
What is zero-shot learning? It’s when an AI model is asked to perform a task without being given any examples of that task at the moment you ask. The model has to rely on what it already learned during its training (kind of like relying on intuition or prior knowledge). It’s called “zero-shot” because you give zero examples in the prompt or query.
Analogy – A student facing an unfamiliar test question: Imagine you’re a student walking into a test, and one question completely blindsides you – it’s something you were never explicitly taught. Still, you don’t just leave it blank; you rack your brain for related things you do know and make an educated guess. Maybe you recall a similar concept from another class or use common sense to reason out an answer. That’s zero-shot thinking in action! The student uses general knowledge and logic to tackle a new problem on the spot.

LLMs in their early days had to do the same. They were trained on huge amounts of text (like reading literally all the books and websites they can get). From that training, they picked up grammar, facts, and some reasoning abilities. So if you asked an early language model a question or gave it a task it hadn’t seen before, it would try to generalize from what it did know. It’s a bit like how you might figure out a riddle by connecting dots from your existing knowledge.
Example – Identifying a “zebra” without having seen one: One classic illustration of zero-shot learning is with images: say the AI has seen lots of horses but never a zebra. If you tell the AI (or a person) “a zebra is like a horse with black-and-white stripes,” then show an image of a strange striped animal, it can guess “hey, that must be a zebra!” even though it’s never seen a zebra before. It’s making an educated guess based on the description and its knowledge of horses. In technical terms, the AI is transferring knowledge about horses (known) to identify a zebra (unknown) using just a simple description as a bridge. Pretty clever, right?
In text terms, an LLM doing zero-shot learning might be asked, “Translate this sentence into Spanish,” without any example of a translated sentence. A well-trained model (like today’s GPT-based AIs) can often do it, because somewhere in its training it learned a bit of Spanish and the concept of translation. Just like a student pulling from memory, the AI is pulling from its vast training data to produce a reasonable answer with no new examples given.
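For the curious, here is what a zero-shot prompt can look like when sent to a model through an API. This is a minimal sketch, assuming the OpenAI Python SDK (v1.x) and an illustrative model name like “gpt-4o”; the details will vary, but the key point is that the prompt contains only the instruction, with no worked examples.

```python
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

# Zero-shot: just the instruction, no examples of the task.
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[
        {"role": "user", "content": "Translate this sentence into Spanish: 'The museum opens at nine.'"}
    ],
)
print(response.choices[0].message.content)
```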
Zero-shot capabilities were a big deal because they meant AI could be useful in tasks it wasn’t explicitly trained for. It’s as if you hired an employee who had read every book in the library – even if they never did your specific task before, they might still figure it out from related knowledge. Early breakthroughs in LLMs showed surprising zero-shot skills. They could answer questions, summarize text, or even do simple math problems without any task-specific training, simply by interpreting your request and winging it with what they knew. This was the first step in making AI more flexible and “smart” in a human-like way, using context and intuition much like we do when encountering something new.
Of course, guessing has its limits. The student facing that unknown test question might do a decent job if they’re clever, or they might completely miss the mark. Likewise, an AI’s zero-shot response might be impressive or amusingly off-target. How to improve it? That brings us to giving the model a bit of help – in the form of examples.
One-Shot Learning: Learning from a Single Example
What if instead of no help, we give our AI model one example of what we want? This is called one-shot learning (one-shot prompting in the case of LLMs). With just a single example or demonstration, the model often does a much better job on the next similar task. It’s like saying, “Here’s one prototype or solved problem – now do it yourself for a new case.”
Analogy – A teacher shows one example problem: Think back to school again. Suppose you’re about to learn a new type of math problem. The teacher writes one example on the board, works through the solution, and then says, “Now you try it with this similar problem.” With that one example to guide you, you have a template to follow. Even if you’ve never seen that exact kind of question before, having one solved example makes it a lot easier to get the idea. You go, “Oh, I see how it’s done from that one case!” and then you apply the same method.
AI models can use the same strategy. If we give an LLM a single example in the prompt (often literally providing an input and the desired output for that input), the model sees the pattern and imitates it for the next input. For instance, if you want the model to format an address in a certain way, you might say:
“Example:
Input: John Doe, 123 Apple St. -> Output: Doe, J. – 123 Apple St.
Now your turn:
Input: Jane Smith, 456 Orange Ave. -> Output: …?”
With that one-shot example, the AI will likely follow the pattern and output “Smith, J. – 456 Orange Ave.” The single demonstration guides it.
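In code, a one-shot prompt is the same idea spelled out: one solved example, then the new input. Here is a rough sketch under the same assumptions as before (OpenAI Python SDK, illustrative model name), reusing the address-formatting example above.

```python
from openai import OpenAI

client = OpenAI()

# One-shot: a single worked example, then the new case for the model to complete.
prompt = (
    "Example:\n"
    "Input: John Doe, 123 Apple St. -> Output: Doe, J. – 123 Apple St.\n"
    "Now your turn:\n"
    "Input: Jane Smith, 456 Orange Ave. -> Output:"
)
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # expected: "Smith, J. – 456 Orange Ave."
```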
Analogy – Learning a new gesture with one demo: Humans are actually pretty good at one-shot learning. Consider a simple gesture or game: If someone teaches you a cool new handshake or a dance move just once, chances are you can remember it and do it yourself. Or if a friend shows you one example of how to toss a paper airplane in a special way, you might catch on immediately. Research in AI often points out that children can learn a new action from just one or two demonstrations. For example, a child might learn the concept of giving a “high-five” after seeing it done just one time! They observe, internalize the pattern (“Ah, we slap palms as a greeting, got it”), and then replicate it.
LLMs with one-shot prompting are like that observant child. Once GPT-3 (an influential large language model released in 2020) came around, people noticed it could learn from just one example in the prompt. It was a striking improvement – the answers got more accurate and in line with what was wanted. If zero-shot was the model guessing in the dark, one-shot was like giving it a flashlight: even one light helps it see the path.
A real-life illustration: Suppose you ask an AI, “Convert the following sentence to a polite request: ‘Give me water.’” Without examples (zero-shot), the AI might try something reasonable like “Could you give me water, please?” Now if you instead first show it one example (“Convert: ‘Open the door.’ -> ‘Could you please open the door?’”) and then ask it to convert “Give me water,” it will almost certainly respond with the polite phrasing you expect. The single example sets the style and lets the model know exactly what you mean by “convert to a polite request.”
In summary, one-shot learning for LLMs means show-one, do-one. It taps into the model’s mimicry ability. Large language models love to imitate patterns – after all, they learned language by seeing tons of examples. Give them one more example for a new task, and they’ll happily follow suit. It’s a bit magical that you don’t need to re-train the whole model; just instructing it with one example on the fly is enough to teach it a new trick. This was a significant step in AI evolution: it showed that these models could be quite adaptive and quick to learn, almost like a person who only needs to see something done once to understand it.
Few-Shot Learning: When a Few Examples Do the Trick

If one example helps, what about a few more? Few-shot learning is the idea of providing a handful of examples (say anywhere from two up to maybe five or ten) before asking the model to perform the task on a new input. With each additional example, the model gets an even clearer idea of what you want. It’s like training wheels for a task, but you only use a few before the AI rides off on its own.
Analogy – Learning a game by watching a few rounds: Imagine you’re about to play a new card game with friends. You have no clue what the rules are. In the zero-shot scenario, you’d just jump in and play randomly (not ideal!). In a one-shot scenario, maybe someone shows you one example round and then expects you to play. But with few-shot, you might sit out and watch, say, three rounds of the game. By the end of those few rounds, you’ve seen enough patterns to get how the game works – what good moves look like, what the goal is, etc. Now when it’s your turn to play, you’re not just guessing; you’re imitating the strategies you observed and you have a much better chance to succeed.
Similarly, if you were learning to cook a new dish, you might want to see a recipe example or two, or taste a few samples, before improvising your own. Or think about learning new vocabulary in a language: seeing one example sentence is good, but seeing it used in three or four different sentences really locks in the meaning and usage.
For AI, few-shot prompting means we give the model several example input-output pairs in the prompt. It’s basically saying, “Here are a few Q&As (or translations, or math problems, etc.) as examples. Now here’s a new question: please apply the same pattern.” The great thing is, modern LLMs are capable of picking up the common pattern from those examples and carrying it over to the new query. With a few shots, the AI’s performance often becomes much more reliable than with zero-shot. The model has been shown the way multiple times, so it’s far less likely to stray off course.
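Concretely, a few-shot prompt simply stacks several worked examples before the new input. The sketch below (same assumptions: OpenAI Python SDK, illustrative model name) shows three English-to-Spanish pairs followed by a fresh sentence for the model to translate in the same format.

```python
from openai import OpenAI

client = OpenAI()

# Few-shot: several input/output pairs establish the pattern before the new query.
prompt = (
    "English: Good morning. -> Spanish: Buenos días.\n"
    "English: Where is the train station? -> Spanish: ¿Dónde está la estación de tren?\n"
    "English: Thank you very much. -> Spanish: Muchas gracias.\n"
    "English: The food was delicious. -> Spanish:"
)
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # expected: something like "La comida estaba deliciosa."
```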
Why is few-shot so powerful? Because it leverages the model’s strength: recognizing and reproducing patterns. During training, these AI models essentially did a giant fill-in-the-blank on millions of sentences and examples. So giving a few examples at query time triggers that ability – it’s like, “Oh, I see what pattern you want! Let me continue it for the next one.” Few-shot learning demonstrated that even without explicit re-training, an AI could adapt to lots of different tasks on the fly. Researchers were stunned – it’s as if the model had a bunch of latent skills that just needed a couple of examples to activate. Indeed, one research paper pointed out that humans can learn broad concepts from little data, like picking up the basics of a game like Pac-Man after just a few tries or observations, and our AIs are inching closer to that kind of flexibility.
In practical terms, if you have an AI writing poetry and you want it in a certain style, you might feed it 3 sample poems in that style (few-shot) before asking it to compose. The result will be a poem much closer to what you envisioned than if you just said “write a poem” with no examples. With a handful of demonstrations, the AI “gets the vibe.” Few-shot learning was a turning point for LLMs like GPT-3 because it meant one model could perform many tasks—writing code, translating text, solving riddles, you name it—just by being prompted with the right few examples, rather than needing separate training for each task. It’s like having a Swiss Army knife that sharpens itself a bit each time you show it an example of the cut you need.
By now, our AI models have graduated from being pure guessers to being quick studies. But all of this was still in the realm of text. The AI was reading text and writing text. Humans, however, learn using multiple senses – we look, listen, maybe touch and smell – to understand the world. Wouldn’t it be cool if AI could do more than just read a page? Enter the era of multimodal learning.
The Rise of Multimodal Learning: Teaching AI to See and Hear

Up to this point, when we talked about LLMs, we implicitly meant they deal with language (text) only. You type a prompt, they type back an answer. But the world is so much richer than text – we have images, sounds, videos, all sorts of data. Multimodal learning refers to AI models that can handle and combine multiple types of input and output, not just text. In other words, an AI that can not only read but also see images, listen to audio, or even take in video – much like how we humans use our eyes and ears together.
Analogy – Combining senses like a human: Consider how you experience life every day. If you’re cooking, you’re reading a recipe (text) and looking at the ingredients (vision) and listening to the sizzle in the pan (sound) and maybe smelling the aroma. Your brain integrates all that information to make sure the dish turns out right. If one sense is missing, you can still cook, but having all senses gives a fuller understanding (you’d know if something’s burning because you can smell it or hear the crackle!). Similarly, when someone talks to you face-to-face, you’re not only hearing their words; you’re also perhaps reading their lips or noticing their facial expression. The combined input helps you understand the message better.
For a long time, AI models were single-modal – early image recognition AI could only see images, but not read text; conversely, LLMs could read text but ignored images. Each was like a specialist with one sense. Multimodal AI is the merging of these senses. A modern multimodal LLM might take an image and a prompt together, and produce an answer that involves both. For example, you could show the AI a photograph of a weird gadget and ask, “What is this used for?” The AI would analyze the image (vision) and your question (text) and then answer in text. This is something humans do effortlessly (“Oh, that photo shows a can opener, used for opening cans”), and now AI is getting pretty good at it too.
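If you like seeing the mechanics, here is roughly what that “photo plus question” interaction looks like through an API. Again a hedged sketch: it assumes the OpenAI Python SDK and a vision-capable chat model such as “gpt-4o,” and the image URL is just a placeholder you would swap for a real photo.

```python
from openai import OpenAI

client = OpenAI()

# Multimodal prompt: one user message carrying both text and an image.
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative vision-capable model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is this gadget used for?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/mystery-gadget.jpg"}},  # placeholder URL
            ],
        }
    ],
)
print(response.choices[0].message.content)
```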
A big milestone in this was when OpenAI released GPT-4 in 2023. Unlike its predecessors, GPT-4 is multimodal – it can accept both text and images as inputs. That was a game-changer. Suddenly you could prompt the AI with pictures! During the live demo, they showed how GPT-4 could describe an image and even explain a joke in a cartoon. For instance, they fed GPT-4 an image of a humorous scene: a squirrel holding a camera taking a photo of a nut. When asked why it was funny, the model responded with a perfectly logical explanation of the joke (basically recognizing that a squirrel taking a photo of a nut mimics a human photographer, which is absurd and cute). This demonstrated visual understanding combined with language – the AI “saw” the image and then “spoke” the explanation.
They didn’t stop there. In another demo, the presenter drew a quick sketch of a website layout on a napkin – literally a hand-drawn mockup with a header, some text, and a goofy button – and gave that image to GPT-4 with a prompt to make a website out of it. Lo and behold, GPT-4 generated functional HTML/CSS code for a webpage that matched the sketch! It was as if the AI took a design drawn by a human and translated it into a working site, all by understanding the drawing (image) and the instruction to produce code (text). Imagine showing a child a drawing of a website and that child writes the code for it – that’s the level of multimodal understanding we’re talking about.
Another example of multimodal prowess is in content creation. You might have heard of DALL·E, which is an AI model that works in the opposite direction: you give it text, and it produces an image. For example, if you say “a flying car shaped like a banana,” DALL·E will paint you exactly that image in surprising detail. In simple terms, you provide a description, and DALL·E will generate an image, even for something that doesn’t exist in reality. This is also multimodal magic – the model understands language and visual art together. While DALL·E itself isn’t an LLM (it’s mainly an image generator), it’s a cousin in the multimodal family, showing how AI can connect text and vision in creative ways.
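Going the other direction (text in, image out) looks something like this. A minimal sketch assuming the OpenAI Python SDK’s image endpoint and the “dall-e-3” model name; check the current documentation for exact parameters.

```python
from openai import OpenAI

client = OpenAI()

# Text-to-image: describe the picture you want and get back a link to the generated image.
result = client.images.generate(
    model="dall-e-3",  # illustrative model name
    prompt="a flying car shaped like a banana",
    n=1,
    size="1024x1024",
)
print(result.data[0].url)  # URL of the generated image
```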
Multimodal learning isn’t limited to vision and text. Researchers are adding audio into the mix as well. There are AI models now that can listen to audio and respond with text or even with voice. For instance, imagine an AI you could talk to: it hears your question (audio), understands it, looks at a picture you send it (image), and finally answers you in speech (audio output) or text. We’re basically describing a future Jarvis from Iron Man, an assistant that can see and hear and talk. In fact, in 2024 OpenAI released a model called GPT-4o (the “o” stands for “omni”), designed to handle text, images, and audio together. They describe it as a model that “accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs.” In other words, it aims to be an AI that can take in all sorts of sensory information and respond intelligently, almost like a person would. This is the cutting edge of AI research: giving our AI friends more senses and the ability to integrate them.
Why does multimodality matter? Because real-world problems often involve multiple kinds of data. If you want an AI assistant to help you around the house, it’s not enough for it to understand your words; it might need to see the house to navigate or hear a smoke alarm go off to alert you. If a doctor is diagnosing a patient, they use visual data (X-rays, charts) and textual data (medical history, patient explanations) together. Multimodal AI aims to mirror that: it can take an X-ray image and a doctor’s notes and consider both when suggesting a diagnosis. In fact, researchers at Google and others are actively working on medical AIs that do exactly this – for example, a system where a language model (text-based) can consult a radiology model (image-based) when you ask it about an MRI scan, effectively combining their skills.
Now that our AI models have learned to see, read, and maybe even hear, what are they doing with these abilities? Let’s look at some real-world applications that this evolution from zero-shot to multimodal has enabled.
Real-World Applications: AI in Action Across Domains
The evolution of LLMs from text-only brains to multimodal wizards has unlocked a ton of exciting applications. Here are some real-world examples of how these AI systems are being used (or soon could be) in different areas:
Creative Tools & Design: AI is boosting creativity by working with different media. For instance, as demonstrated with GPT-4, you can feed a hand-drawn app interface to an AI and have it generate the code for you. This means a designer can sketch an idea on paper, and the AI builds a prototype, speeding up app and web development dramatically. Similarly, writers and artists use multimodal AIs for inspiration: you can ask an AI to “draw” an image and then write a story about it, or vice versa. Text-to-image generators like OpenAI’s DALL·E allow anyone to create artwork from a simple sentence, which is amazing for illustrators or educators who need quick visuals. Musicians are exploring AIs that can take a melody (audio) and suggest lyrics or continuations (text), a multimodal collaboration between your voice and the AI’s language ability. The bottom line: whether it’s coding, drawing, or writing, AI is acting like a creative assistant that can switch mediums on the fly.
Healthcare and Medicine: Multimodal LLMs are making their way into healthcare to help doctors and patients. A great example is in medical imaging and diagnosis. Researchers have built systems where you can show the AI a medical image (like an X-ray, MRI, or a pathology slide) and ask it questions in plain language. One such system, developed by a team including Microsoft Research and University of Washington, is capable of analyzing nine different types of medical images and answering questions about them for the doctor. Imagine uploading a chest X-ray and asking, “Do you see any signs of pneumonia?” – the AI can highlight relevant areas and explain its findings in simple terms. It’s like having a super well-read medical assistant that can cross-reference visuals and texts. Another scenario: a patient’s electronic health record contains written notes, lab results (maybe as charts), and even scans. A multimodal AI could theoretically synthesize all that information to help the doctor see patterns across different data sources. While AI won’t replace doctors, it’s being used to support decision-making, catch things humans might miss, and save time on analysis by handling some of the heavy lifting of reading and looking through medical data.
Accessibility for People with Disabilities: This is one of the most heartwarming applications of multimodal AI. For people who are blind or have low vision, AI that can describe the visual world is a game changer. There are smartphone apps (like Microsoft’s Seeing AI and others) that use the phone camera to see and then narrate the scene to the user. For example, point your phone at a street intersection and the AI might say, “There’s a red stop sign on the corner and a person with a dog crossing the street.” Recently, the app Be My Eyes started integrating OpenAI’s GPT-4 Vision technology as a “Virtual Volunteer.” A user can send a photo (say, of the contents of their fridge or a bus schedule) and ask the AI questions about it. GPT-4 not only recognizes what’s in the fridge, but can even suggest recipes with those ingredients! Now that’s helpful. Essentially, it’s giving someone who can’t see the ability to query images as if they had a human helper describing and reasoning about the scene. Similarly, for deaf individuals, AI can transcribe spoken words to text instantly (a mode of audio-to-text). For people with speech impairments, AI vision can interpret sign language into text or speech. All these are in the works, and some are already usable. It’s about using multimodal AI to break communication barriers and grant greater independence to those who need it.
Everyday Assistants and Beyond: Think about personal assistants like Siri or Alexa – they’ve been primarily voice-and-text based (you talk, they respond with words). Now imagine a next-gen assistant that you can also show things to. You might say, “Hey AI, what’s the name of this plant?” while snapping a photo of it – and the assistant can identify the plant from the picture and tell you. Or you could have a home robot with a camera and microphone: you ask it, “Did I leave my keys on the kitchen table?” and it can go check visually and answer you. This convergence of seeing + hearing + speaking is making its way into consumer tech. Companies are certainly looking at combining smart speakers with cameras (some already have screens) to create a more holistic helper. On the fun side, we’re seeing toys and educational tools that use multimodal AI – for example, an interactive globe where kids can touch a country (input), and the AI will talk about that country (output), or vice versa ask the kid to find something on the map. The possibilities are endless now that AI can engage with multiple forms of input.
These examples just scratch the surface. What ties them all together is the fact that modern AI can juggle different modes of information almost like a human can, enabling applications that were pure science fiction not long ago. From helping doctors see patterns in scans, to empowering a blind user to “see” through descriptions, to letting you create with words and pictures hand-in-hand – multimodal LLMs are broadening what AI can do for us in daily life.
Challenges and Future Prospects

With great power comes great responsibility – and also great challenges. As LLMs have grown more capable, we’ve also discovered a host of hurdles and questions on the road to truly human-like AI understanding.
Challenges today:
- Understanding vs. Imitation: Even with zero-shot, one-shot, few-shot learning, and multimodal inputs, AI models don’t truly “understand” things the way humans do – at least not yet. They are fantastic at pattern matching and probability crunching. This means sometimes they get things confidently wrong. You might have heard of AI “hallucinations,” where a model just makes up facts or sees things in an image that aren’t there. For instance, a multimodal AI might describe a nonexistent object in a blurry photo because it guessed wrong. OpenAI noted that GPT-4 still has many limitations, including social biases and hallucinations, despite its advancements. Ensuring the AI’s guesses don’t turn into misinformation is an ongoing battle.
- Data and Training: To learn all these skills, especially multimodal ones, AI models need to be trained on large and diverse datasets – text, images, etc. Getting high-quality, inclusive data is tough. If the data has biases or gaps, the model’s performance will reflect that. Think of how a child’s worldview is shaped by their upbringing; similarly, an AI’s “worldview” (or the range of things it can do) is shaped by what it was trained on. If we want an AI to understand medical images and also cook recipes and also converse about philosophy, it needs exposure to all those domains. Training such a comprehensive model is a massive undertaking, not to mention super expensive in terms of computational power. Not every company or researcher has the resources to do it, which is why only a few big players have developed these giant models so far.
- Multimodal Fusion: Getting a model to handle one modality is hard; getting it to synchronize multiple modalities is even harder. The AI has to decide how to pay attention to an image versus the text of a question, for example. In a conversation, if there’s an image and a long dialogue history, the model has to juggle both. It’s a bit like patting your head and rubbing your tummy at the same time – doable with practice, but not straightforward. Sometimes multimodal models can be tricked or confused if one modality says something contradictory to another. For instance, if an image shows a cat but the text description with it says “a dog,” the AI might get puzzled on what to trust.
- Ethical and Privacy Concerns: A model that can see and hear raises additional ethical issues. With text-only AI, you worry about it saying something offensive or wrong. With vision, what if it recognizes people’s faces or reveals someone’s private information from a photo? In fact, AI models are often deliberately trained not to identify real people in images for privacy reasons. Developers have to put guardrails so that a multimodal AI doesn’t become a creepy surveillance tool. There’s also the risk of misuse – for example, generating deepfake images or audio that sound very real, which could spread misinformation. Society will need to set norms and perhaps regulations for how far this should go, balancing innovation with protection against abuse.
Despite these challenges, the trajectory of AI is incredibly exciting. So, what’s next?
Future Prospects:
- Even More Modalities: We have text, images, audio, and a hint of video. The future could bring AI that handles sensory data like touch or smell (for robotics applications, say). Researchers are already talking about “embodied AI” – models that can power robots, allowing them to not just chat, but also move around and interact with the physical world. Imagine an AI that could take in temperature sensor data or the feel of an object and include that in its reasoning (e.g., understanding something is fragile by the way it feels). While that’s a bit further out, it’s a logical extension: humans use all five senses; maybe advanced AI could too, one day.
- Real-Time Multimodal Interaction: The goal is for AIs to handle multiple inputs in real time. There’s an early glimpse of this with things like GPT-4’s vision feature or GPT-4o (“omni”), which aims to handle live audio and video streams. Future AI assistants might watch a live video feed and talk you through what’s happening (“the drone’s camera shows your roof gutter is clogged with leaves, you should clean it before it rains”), almost like having an expert pair of eyes always available. Real-time language translation with vision is another idea: you point your phone at a street sign in a foreign country and the AI not only translates the text but also speaks it to you and overlays it visually on your screen. We’re partway there with current tech, but LLM-powered versions could be even more fluent and context-aware.
- Better Understanding and Reasoning: As these models integrate more data types, developers are also working to make them reason more reliably. Techniques like “chain-of-thought prompting” (basically getting the AI to explain its reasoning step by step; see the short sketch after this list) might reduce silly mistakes. In the future, an AI might not just blurt out an answer; it might internally double-check: look at an image, describe it to itself, verify against knowledge, and then answer. This could mitigate issues like hallucinations or misidentifying an image. The holy grail would be an AI that can learn new concepts on the fly in a multimodal way. For example, you teach your personal AI what your pets look like and what their names are, and thereafter it recognizes them in images and talks about them by name. Some small-scale versions of this personalization exist, but it could become much more sophisticated.
- Collaboration between AIs and Humans: The future will likely have humans and AI working in tandem, each doing what they do best. Multimodal AIs will be like super assistants in creative projects (helping filmmakers edit videos by understanding both the script and the footage), scientific research (analyzing experimental data and reading research papers to suggest insights), and education (tutoring a student by listening to their reasoning, looking at their work, and guiding them accordingly). The AI might even have a “sense” of when a human is confused (maybe by analyzing a video feed of the student’s facial expression) and adjust its teaching approach – a truly multi-sensorial tutoring system.
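As promised above, here is a tiny taste of chain-of-thought prompting: you simply ask the model to reason through the steps before giving its answer. Same assumptions as the earlier sketches (OpenAI Python SDK, illustrative model name); the only change is the wording of the prompt.

```python
from openai import OpenAI

client = OpenAI()

# Chain-of-thought prompting: ask for the reasoning steps before the final answer.
prompt = (
    "A train leaves at 3:40 pm and the trip takes 2 hours and 35 minutes. "
    "Work through the calculation step by step, then state the arrival time."
)
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # expected to reason it out and arrive at 6:15 pm
```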
Conclusion
In conclusion, the journey from zero-shot to one-shot to few-shot to multimodal has dramatically expanded what AI can do. Not long ago, an AI that could hold a decent conversation was itself a marvel. Now we have AI that can converse, draw, see, and listen in varying degrees. It’s as if the AI went from being a brilliant but single-minded bookworm to a well-rounded polymath with eyes and ears. There are certainly challenges to overcome and lessons to learn (for both the AI and us humans guiding it). But the trajectory is clear: AI models are becoming more adaptable, more context-aware, and more human-like in how they learn and perceive.
Just like a child growing into a capable adult, our AI “children” are growing up. They started by learning to make do with no examples, then with a bit of guidance, and now they’re venturing into the world with multiple senses. It’s a fascinating evolution – one that we’re all witnessing and participating in. Who knows what the next chapter holds? Perhaps one day we’ll be talking about AI that can smell and taste (an AI chef, anyone?). For now, one thing is certain: the era of multimodal AI is here, and it’s turning yesterday’s science fiction into today’s reality.
Congratulations! You made it through this exploration of AI’s learning journey. From a zero-shot guesser to a multimodal understander, LLMs have come a long way – and they’re just getting started. Whether you’re simply curious or thinking of ways to use these AI advancements in your life or business, there’s plenty of reason to be excited about the road ahead. After all, we’ve taught the machines to see and hear; maybe next we’ll teach them to dream. 😊