The CEO of Zoom wants AI clones in meetings

Today, I’m talking with Zoom CEO Eric Yuan — and let me tell you: this conversation is nothing like what I expected. Eric started Zoom after working at Cisco and realizing there was an opportunity to make videoconferencing simpler and easier to use. And he was right: Zoom is now a household name — especially after usage exploded during the pandemic.

But usage has since come down, and Zoom faces a number of business challenges he and I talked about. Yet, it turns out, Eric wants Zoom to be much, much more than just a video chat platform. He wants to take on Microsoft and Google in the enterprise software market by making docs and email and other productivity tools like chat. And like virtually every other company, Zoom now has a big investment in AI — and Eric’s visions for what that AI will do are pretty wild.

See, Eric really wants you to stop having to attend Zoom meetings yourself. You’ll hear him describe how he thinks one of the big benefits of AI at work will be letting us all create something he calls a “digital twin” — essentially a deepfake avatar of yourself that can go to Zoom meetings on your behalf and even make decisions for you while you spend your time on more important things, like your family.

I’ll just warn you: I tried to ask a bunch of the usual Decoder questions during this conversation, but once we got to digital twins going to Zoom meetings for people, I had a lot of follow-up questions. How many digital twins might you have? How will they all stay in sync? Can you trust them? What work will be left if everyone is sending their digital twins to all the meetings?

Eric was more than game to talk about these ideas with me, and this became a very different kind of CEO interview on Decoder. I haven’t stopped thinking about it since we recorded it. I think you’re going to like it.

Okay, Eric Yuan, founder and CEO of Zoom. Here we go.

This transcript has been lightly edited for length and clarity.

Eric Yuan, you are the founder and CEO of Zoom. Welcome to Decoder.

Thank you for having me. Appreciate it.

I’m very excited to talk to you. It feels like everything in software is a little topsy-turvy because of AI, and you’re making a big investment in AI. Zoom was one of the pandemic winners it felt like. It became a household name, and now you’re changing the product in real ways. There’s a lot to talk about.

Let’s start at the very beginning. Everyone knows Zoom as a videoconferencing app. You’ve just released a bunch of new features. You have workplace features. You have AI features. How do you think about Zoom right now?

I think, for now, we are embarking on a 2.0 journey. You are right on. Looking back at 1.0, it was more about building some applications; videoconferencing is one of them. Our slogan was “Work Happy.” Right now, [when] you look at a 2.0, it is different. It’s “Work Happy with the Zoom AI Companion” and everything really about Workplace, the entire collaboration platform as well as AI.

I’ve heard a version of this story from a variety of different kinds of enterprise software providers recently. My favorite is Squarespace. You think of Squarespace as a web company, and they’re saying, “Actually, we’re going to run your back office for you, and the website is just an expression of scheduling.” Is there a reason this seems to be a theme for enterprise companies, that you’ve got one core feature and then you’re going to get bigger and take over more of what happens in an office?

You’re so right. That’s probably the same for almost every enterprise software-as-a-service company. Remember, they all offer a service, right? However, given the AI era, everyone is thinking about how to leverage AI more and more, to make everything fully automated, to reduce manual work, and to reduce human involvement.

When you think about the elevator pitch for Zoom, you had the founder, and you had to go raise money once upon a time. In the beginning, it was very simple, right? Videoconferencing is very hard. It requires some dedicated hardware and expensive connections, and Zoom is going to be as simple to use as a consumer app. It’s videoconferencing, but simple. What’s the elevator pitch now?

With conferencing, again, this is one app. If you look at your calendar, it is not only to join your video meeting but also a lot of other things. You read emails, send a chat message, make a phone call, have a whiteboard session, schedule something with external third parties. What we are doing now, it’s really looking at your entire schedule, how to leverage Zoom Workplace to help you out. Essentially, you can leave Zoom Workplace, and Zoom Workplace can help you get most of your work done, right? That’s our pitch. We are not there yet.

Today for this session, ideally, I do not need to join. I can send a digital version of myself to join so I can go to the beach. Or I do not need to check my emails; the digital version of myself can read most of the emails. Maybe one or two emails will tell me, “Eric, it’s hard for the digital version to reply. Can you do that?” Again, today we all spend a lot of time either making phone calls, joining meetings, sending emails, deleting some spam emails and replying to some text messages, still very busy. How [do we] leverage AI, how do we leverage Zoom Workplace, to fully automate that kind of work? That’s something that is very important for us.

We have a big audience of product managers, engineers, and designers. I think what you’re saying is they’re going to send AI avatars to their stand-ups every morning.

More than that. It’s not only for meetings. Even for my emails. I truly hate reading email every morning, and ideally, my AI version for myself reads most of the emails. We are not there yet.

This is a huge vision that a lot of what you do in the workplace is busy work or status check-ins or non-decision-oriented conversations, and you can automate that in some way or send a version of yourself to quickly communicate whatever you need to communicate. You can go through your email in an automated way. This is a big dream. I’ve heard this dream a lot. I want to talk about how you get there and how you’re building Zoom to get there, but just at a very foundational level, what work is left, do you think, for the average person if the AI is doing all of that stuff?

I think for now, the number one thing is AI is not there yet, and that still will take some time. Let’s assume, fast-forward five or six years, that AI is ready. AI probably can help for maybe 90 percent of the work, but in terms of real-time interaction, today, you and I are talking online. So, I can send my digital version — you can send your digital version. Again, not like an in-person meeting. If I stop by your office, let’s say I give you a hug, you shake my hand, right? I think AI cannot replace that. We still need to have in-person interaction. That is very important. Say you and I are sitting together in a local Starbucks, and we are having a very intimate conversation — AI cannot do that, either.

The heart of Zoom is still the videoconferencing product. That’s how I think of it; that’s how most people think of it. Are you saying that Zoom, the videoconferencing product, will have AI avatars in it mostly, and it’ll push us into more in-person meetings?

I think two things. First of all, we are way more than just a videoconferencing business. We have a lot of other new capabilities, and essentially, that’s the entire Workplace platform. It’s a collaboration platform. That’s one thing. But if you look at just videoconferencing itself, I think we can leverage AI more and more. You do not need to spend so much time [in meetings]. You do not have to have five or six Zoom calls every day. You can leverage the AI to do that.

You and I can have more time to have more in-person interactions, but maybe not for work. Maybe for something else. Why do we need to work five days a week? Down the road, four days or three days. Why not spend more time with your family? Why not focus on some more creative things, giving you back your time, giving back to the community and society to help others, right? Today, the reason why we cannot do that is because every day is busy, five days a week. It’s boring.

So, just to be clear, this is a great conversation already. You’re saying Zoom is going to get us closer to the four-day workweek because AI will be taking Zoom calls?

Absolutely. Not only Zoom calls but also all other work we’re doing every day. Chat and messaging, phone calls, emails, whiteboard, coding, creative tasks, manager tasks, project management — all of those things together with AI help and new applications. That’s the direction. That’s part of our Workplace [platform]. It’s our 2.0 journey.

I have a million questions. Let me ask this one first. Just thinking about your own calendar: You’re the CEO. You have a lot to do. I’m sure you’re the busiest. What would you hand off to AI today if the AI was capable?

I think many things. First of all, I can tell you I hate my calendar. Every morning, when I look at my calendar, oh my god, there’s so many things. Even before I start, I know today, I have maybe nine or 10 meetings. In between, I need to check emails, read messages, and make phone calls. I’m not happy when I look at that.

Second is how did I get here? Because most of the time, my admin, she had scheduled some meetings. I occasionally also schedule something in my calendar as well. Again, every time either myself or my admin, when you schedule a meeting, it’s not just 30 seconds of work. You need to coordinate to do so many things. That’s the second thing — how we got here. The third thing is you look at all those meetings [and decide] which meeting you have to join, which meeting [is] optional. You also do not—

I’m asking you which meetings do you look at and think you would hand off?

I started with the problem first, right? And last but not least, after the meeting is over, let’s say I’m very busy and missed the meeting. I really don’t understand what happened. That’s one thing. Another thing for a very important meeting I missed, given I’m the CEO, they’re probably going to postpone the meeting. The reason why is I probably need to make a decision. Given that I’m not there, they cannot move forward, so they have to reschedule. You look at all those problems. Let’s assume AI is there. AI can understand my entire calendar, understand the context. Say you and I have a meeting — just one click, and within five seconds, AI has already scheduled a meeting.

At the same time, every morning I wake up, an AI will tell me, “Eric, you have five meetings scheduled today. You do not need to join four of the five. You only need to join one. You can send a digital version of yourself.” For the one meeting I join, after the meeting is over, I can get all the summary and send it to the people who couldn’t make it. I can make a better decision. Again, I can leverage the AI as my assistant and give me all kinds of input, just more than myself. That’s the vision.

I’m assuming when you looked at your calendar today and saw a Decoder session, you were going to come to that on your own. What would you have sent an AI avatar to instead?

I think an AI avatar is essentially just an AI version of myself, right?

Essentially, in order to listen to the call but also to interact with a participant in a meaningful way. Let’s say the team is waiting for the CEO to make a decision or maybe some meaningful conversation, my digital twin really can represent me and also can be part of the decision making process. We’re not there yet, but that’s a reason why there’s limitations in today’s LLMs. Everyone shares the same LLM. It doesn’t make any sense. I should have my own LLM — Eric’s LLM, Nilay’s LLM. All of us, we will have our own LLM. Essentially, that’s the foundation for the digital twin. Then I can count on my digital twin. Sometimes I want to join, so I join. If I do not want to join, I can send a digital twin to join. That’s the future.

How far away from that future do you think we are?

I think in a few years, we’ll get there, but we’re just at the beginning. The reason why is because of two problems. The first problem is today, look at the large language model itself — it just started. A lot of potential opportunities, but it’s not there yet. Another thing is we have to make sure you have a customized version. Essentially, [for] every human being, you have to have your own version of LLM based on all the data, based on all the context around you. So you have your LLM; I have my LLM. I might have multiple versions of LLM. Sometimes I know I’m not good at negotiations. Sometimes I don’t join a sales call with customers. I know my weakness before sending a digital version of myself. I know that weakness. I can modify the parameter a little bit.

You think you would have a dial be like “be a better salesperson”?

Exactly. For that meeting I say, “Hey, tune that parameter to have better negotiation skills, send that version, and join.”

When you think about this as expressed in Zoom, the videoconferencing app, do you think there would be a 3D avatar of you, like the Vision Pro faces that Apple is doing, or do you think it would just be a voice?

To start, it’ll probably be voice, but for sure, down the road, the experience would be immersive, like with Vision Pro and Meta Quest 3. I think again, this is also the beginning, but the experience down the road, that’s a 3D version of yourself that can mimic you very well, so you can’t know if it’s a real person or just a 3D version.

This is a lot of stacked-up technology problems to solve, right? There’s a realistic 3D avatar. There’s an LLM that you might be able to tune with different parameters that you can trust. I think a lot of people don’t trust LLMs today. They hallucinate a lot. There’s everybody in the world being culturally [comfortable] with talking to a digital avatar. That’s a lot of problems. How is Zoom organized to solve those problems and get to this vision today?

Even a few years ago, we talked about the vision at Zoomtopia, which is our user conference. Imagine a world where you and I live in Silicon Valley. I live in San Jose; you are in San Francisco. We may not be in the same place. Whenever you and I have a call down the road, it’ll feel like you and I are sitting together. I shake your hand, and you feel my hand. I give you a hug, and you feel my intimacy as well. Plus, even two people who speak a different language, the real-time translation will also work extremely well. And if you and I don’t want to meet, I send a digital version for myself, and you’ll have exactly the same conversation. I think that’s the vision we painted a few years ago.

But how to get there? I think two things. First of all, luckily, I think we’ve already started. Look at the industry. I think there are two technologies that are going to help us to start that. One is AI — another is AR. Vision Pro, the Meta Quest 3 — it’s just starting. Look at today and all the generative AI [products]; it’s just started. I do not think those technologies are ready yet, but they will help us get there.

When you think about Zoom itself as the company, you’ve had some changes recently. You laid off about 2 percent of the workforce, or 150 people. You’ve restructured. How is the company structured today to help you achieve this vision?

First of all, given we’re a public company, we’re very disciplined. You cannot just invest in more R&D. How do you make your company more efficient? That’s the reason why we look at all positions and how to make the company more efficient. With that, we can shift some of the budget toward some new technology like AI. Almost every company today, they’re trying to save money because you invest more, you buy more GPUs, and you hire more AI engineers and more qualified product managers, designers, and engineers to understand the AI era and understand Vision Pro, the Quest 3 world.

How do you make sure you learn the new technology and then apply that to today’s product? You need new skills, and if you do not make your company more efficient, where do you get the budget to help? That’s the reason why, every quarter, we look at our positions and make sure we shift some of the budget toward new technology like AI.

So, that’s driving the change: “We’re going to stop doing some things. We’re going to unfortunately have to let some people go. We’re going to invest in H100s and AI engineers.” Where have you landed? What is the structure of Zoom now?

Structure-wise, it’s similar to before but more efficient. We already started investing in GenAI even before ChatGPT was born last year. A lot of other companies did something similar. However, given that ChatGPT was born last year, we realized, “Wow, we have to triple down on that, right?” Today, look at the number of GPUs we have compared to two years ago — way more. Look at our AI engineers. Every day just to focus on AI technology, the foundational AI technology, and also essentially our AI platform team. [When] you look at a number of our engineers, [we have] way more than we had two years ago.

So, you’ve got a dedicated AI team. What are your other teams?

We have a dedicated AI team led by our CTO, and we hired from other companies. Other teams are similar. We have an R&D team still focused on incremental innovation, to introduce new services. We also have a sales team, a marketing team, and an IT team — a lot of teams. [When] you look at the evolution of AI, everyone is looking at how to leverage AI to make their products better and also at the same time to leverage AI to improve their entire business workflow.

How do we make sure to leverage AI on the marketing materials sales team level? The sales managers can leverage the AI to monitor and help the sales rep when they talk with the customers after the Zoom meeting is over. Leverage AI to tell the sales managers, “Hey, this is a very productive sales session,” and help sales managers to generate some actionable insights. The rest of the team [is] also going to embrace AI as well.

How are you rolling that out? Are you saying to everyone, “Look, we’ve got to do this.” Are you saying, “Hey, the market is demanding this”? Where is that pressure coming from — to embrace AI through the product?

I think two things. Internally, we have to closely monitor technology. When we played around with ChatGPT early last year, I said, “Wow, that’s huge,” like in 1995 when the internet was born and everyone realized, “Wow, there’s huge potential.” You have to embrace that. That’s one thing. Another thing is, every day, I personally spend a lot of time on talking with our customers [and] prospects. Guess what? First question they all always ask me now is “What’s your AI strategy? What do you do to embrace AI? Are there any other features you are working on that are AI-powered?” That’s the reason we understand, wow, this is the AI era. Not only ourselves but also our customers — they all ask about our AI strategy.

What’s the single most effective thing you say in response to that question that gets you a sale? [Imagine] I’m a customer and I say, “What’s your AI strategy?” What gets you to the revenue?

We are going to become an AI-first company. We look at our core meeting and say, “Hey, meeting summaries have already been there for almost a year now. Have you ever tested that, and do you like it? What’s your feedback?” So, we start from there. We can talk about our Contact Center. We embrace AI, and in almost every service today, we already have some AI features, so especially for those customers who already tried our AI Companion features, they really like it. Like a meeting summary every day — I’m using that. It’s amazingly accurate.

The reason I ask that — the reason I’m glad you brought up accuracy — is I feel like we went through the ChatGPT moment. Everyone lost their minds, and we said, “This is amazing. The computers can talk to us,” and now we’re here on the other side of it. You’ve rolled out a bunch of features. OpenAI has rolled out a bunch of features. Google has put AI Overviews in Search, and we’re like, “Oh, a bunch of this doesn’t work as well as it should.”

LLMs as a core technology are very convincing, but they are basically hallucinating all the time, and then I see the doubt. I personally have the doubt: “Okay, maybe this underlying technology, the large language model, isn’t as stable a foundation to build the vision that you’re describing.” How do you overcome that? Because that seems like the problem today with the big AI visions. LLMs let people imagine a lot of things, but maybe they’re not the thing you can actually build the vision on.

You are right on that. For any new technology, you cannot get it there overnight. On the way, you’ll see some hurdles like hallucination. It is one of the problems, a very well-known problem. I think given the progress, down the road, that problem will be fixed for sure, right? However, because of that problem, if you do not embrace that technology… I look at my meeting summary: it’s amazingly accurate, and I think I understand, in theory, there’s a hallucination problem, but so far, with our meeting summaries, I do not see that yet. So, maybe in other contexts, there will be that problem.

But meeting summary — it’s like assisted LLMs, right? You feed it a bunch of data. Ideally, your LLM is multimodal; it can just hear the audio. It can generate a summary. I understand how it can solve that problem. You want to get all the way to “I’m sending a digital twin into a meeting that is making decisions on my behalf.” Boy, we are going to have to solve hallucinations to ever get there, and I don’t know that that’s on the roadmap. Even as fast as model capabilities are increasing, I don’t know that hallucinations going down is a metric that anyone can see a Moore’s law-type approach to, where we know it will hit an acceptable point.

You are right on. In that context, that’s the reason why, today, I cannot send a digital version for myself during this call. I think that’s more like the future. The technology is ready. Maybe that might need some architecture change, maybe transformer 2.0, maybe the new algorithm to have that. Again, it is very similar to 1995, 1996, when the internet was born. A lot of limitations. I can use my phone. It goes so slow. It essentially does not work. But look at it today. This is the reason why I think hallucinations, those problems, I truly believe will be fixed. One example: Look at it today. GenAI is already very powerful, but guess what? At this moment, we still do not understand how the human brain works. Imagine we already truly understand how our brain works and how to apply that to the GenAI. I think that’s the future.

When you say the hallucination problem will be solved, I am thinking about this literally in terms of Moore’s law because it’s the closest parallel that I can think of to how other CEOs talk to me about AI. I don’t think you spend a lot of time thinking about transistor density on chips. I’m guessing you don’t just assume that Intel and Nvidia and TSMC and all the rest will figure out how to increase transistor density on chips, and Moore’s law will come along, and the chips will be more powerful and you can build more applications. Just correct me if I’m wrong. I’m guessing you—

No, you’re so right. Absolutely. This is a technology stack. You have to count on so many others.

So is the AI model hallucination problem down there in the stack, or are you investing in making sure that the rate of hallucinations goes down?

I think solving the AI hallucination problem — I think that’ll be fixed.

But I guess my question is by who? Is it by you, or is it somewhere down the stack?

It’s someone down the stack.

I think either from the chip level or from the LLM itself. However, we are more at the application level: how to make sure to leverage the AI to improve the application experience, create some innovative feature set, and also, at the same time, how to make sure to support customized personalized LLM as well. That is our work.

You mentioned earlier that everyone’s using the same LLM. That’s true, right? There’s only a handful of them that are big enough to do the sorts of things you’re describing. A lot of applications just use the ChatGPT API, or they now might use Gemini. Do you think there needs to be meaningful differentiation in these LLMs such that you would have your own model at Zoom?

I think at this moment, look at all those LLMs, but in the [end], it’s similar and no big difference. One may have more parameters, another one will catch up, so on and so forth. I think this is just the beginning. I do not think that’s the future. The future is really about personalized LLMs. I will have multiple variants of my LLMs. Every enterprise will also have their own LLM as well. That’s the future. I do not think all of us will share the exact same LLM. [That] doesn’t make any sense. We can, but it doesn’t make any sense because the reason why my LLM truly understands me is I believe that LLM can represent me anytime. But your LLM and my LLM will be very different in the future.

You started Zoom with an insight about a pretty real consumer problem, which is videoconferencing was very hard. You built an application to solve that problem. You’ve now grown it into this company. You have big ambitions about where to go next. The insight now is much bigger. You’re describing a fundamental shift in computing that’s coming and that there’s a fundamental shift in our culture that will accept this AI in the world. That’s very different from “it’s pretty hard to make a video call.” How has your process of making decisions changed? How do you make decisions now?

It’s very well said. You’re so right. At the same time, it’s also the opportunity ahead of us. The reason why, first of all, we have to embrace the change. We’ve got to change the culture to see this is a huge opportunity ahead of all of us. We have to embrace the change, make sure culture-wise, we embrace that. We cannot slow down our innovation. That’s the number one thing, right? Number two, we have to work together with our customers and really prioritize our tasks. We cannot say, “Just do this, do that.” That doesn’t make any sense.

If we do not listen to our customers, even if we have cool technology and embrace that, guess what? Whatever you deliver may not be liked by customers. That’s the second thing. The third thing, you have to have a bold vision, meaning you have to invest for the future. Some of the tasks we’re working on, we’re investing in, may not be released in the next 12 to 18 months, but we have to invest for the future. I think those several things can position us very well and to be the innovator in the AI era.

When you talk about investing for the future and not showing results right away, Zoom is a public company now. You do face that quarterly pressure. How do you insulate that part of the business from the quarterly pressure of being a public company?

Luckily, we are profitable. We generate positive cash flow and, at the same time, are also very disciplined. You are also right, given we’re [a] public company. As I said earlier, we have to shift the budget. Look at Wall Street analysts’ perspective and look at our profit margin. On one hand, we still need to improve that. On the other hand, we can shift more and more resources to let them work on AI or AI-related tasks, right? I think it is still manageable. Again, it’s not that easy, but you have to embrace the changes.

You’re cash flow positive — you’re profitable. At the same time, it is not the height of the pandemic anymore. People have gone back outside. You are talking quite a lot about the value of in-person connection, and then you’ve got huge competitors that have rolled out feature parity to Zoom, right? Google Meet exists. Microsoft Teams exists. They’re bundled into their office suites. That’s a hard thing to compete against. Do you see the videoconferencing business declining or growing?

Looking at a high level, the videoconferencing business is still growing. More and more people [are] going to use videoconferencing more and more. You mentioned the competitors, but guess what? Talk with any users who use a Zoom product, try the Zoom experience, or try others’ product — we’re still much better. That’s one thing.

The second thing is there are a lot of things that are not done yet. We still can innovate a lot. We just launched Zoom 6.0. Some of the smaller incremental innovations, customers like them — like multi-speaker view and, when you send an interaction, how to make the interaction animated. A lot of things can be done. How to further improve the meeting summary, right? Generate actionable insights for the next meetings. I think a lot of things can be further improved. I do not think it is, “Oh, you already hit the wall. There’s no innovation over there.” That’s not the case.

You came from Cisco before you started Zoom. Every now and again, I have to use Webex. Do you ever have to use Webex? Does it just fill you with absolute anger? Because that’s how I feel about it.

So I think over the past probably 12 months, I never joined the Webex meeting. But a year ago, I joined once. Someone sent me a Webex link. Interestingly enough, I did join once, but recently, I did not join.

Don’t worry, they haven’t changed or improved it in a year. What’d you think when you joined the Webex meeting?

I feel like, “Ah, I already have my iPhone, right? I use an iPhone. You give me either iPhone 1.0 or maybe give me a BlackBerry, so I don’t want to use that anymore.”

That’s pretty good. You’re saying the overall videoconferencing is growing — more people are using it. Certainly out of the pandemic, more people got comfortable with it. The cultural norms about videoconferencing at work changed. More people started using it for more things, but the big players really have leaned into it in their bundles. If you want to go up to some IT person in a company, some CIO, and say, “Look, your team likes Zoom better. Pay us additional money over the bundled version of Google Meet or the bundled versions of Microsoft Teams.” How effective is that in this time of more efficient budgets across the industry?

Great question. First of all, I do not think that a bundle with the pricing strategy is fair. Hopefully, they can compete fairly, and that’s one thing that’s out of my control. However, we always look at everything from an end user perspective. I put those customers in two buckets. One is cost, number one. I do not care about employee experience. I do not care about the best video service. Cost number one, whatever the solution, I deploy regardless whether employees like it or not — just deploy it. That’s one category. Another category: look at the Silicon Valley companies. The employee experience is really important. It’s the number one important thing. You have happy employees, you have happy customers. For those companies, they always look at deploying the best service, to make sure employees like it. We just double down on those customers. That’s okay, too.

Not every company just looks at the cost, but interestingly enough, look at the cost today, look at the total cost of ownership for those companies, the CIO, [the] IT team, the deep dive to analyze the total cost of ownership. First of all, look at Zoom. The support cost is much lower. Second thing, look at AI. We offer the Zoom AI Companion to our paid customers at no additional cost. Look at our competitors — $30 per user per month. What do you do about it? That’s why the total cost of ownership of Zoom is much better.

When you think about bundle pricing, obviously Microsoft is in trouble in Europe. They’re going to unbundle Teams. There’s a bunch of antitrust action happening in this country. Is that something you’re paying attention to? Is that something you’re actively advocating for? Is it something that you’re just watching?

I do pay attention to that because we all want to compete but also in a fair way. Ultimately, we all want to work together, even with our competitors, to build the best product for customers. Who is going to benefit from that? Customers. If you have unfair competition, it’s not good for the entire industry. It’s not good for the end user, either. Imagine many years ago, we all used Netscape. Later on, I said probably the worst software in history was Internet Explorer. It was not good for the end user.

Do you think that Microsoft unbundling Teams in the European Union will have a positive impact on Zoom?

I think that’s the right direction, but it’s not good enough, meaning they will do that for new customers but not for existing customers. [When] you look at it, even the unbundled app, the price is still very unfair. Again, hopefully they fix that. It’s out of my control, but again, it is not good for the entire industry.

The flip side of this is the Zoom Workplace features are a bunch of features that exist in other software suites that you’re bundling into the price of Zoom. I’ll give you an example. There’s chat now that looks a lot like Teams or a lot like Slack. You are offering some of these meeting scheduling features, the AI summaries. You’re even edging into email, which is interesting. How do you think about growing that bundle when you think the other bundle is unfair?

Great question. Look at team chat: we had that on day one. Unfortunately, we did not market it well. It’s a shame, but it’s not something like, “Hey, later on, we added a services bundle.” This was a feature from day one. Another thing is to look at our meeting scheduler. For sure, there are some small competitors. But two things: First of all, customers can buy their meeting scheduler separately. Second, the bundle price is fair. It is not like we offer a very, very low price.

Look at any of our services: they’re not like our competitors’. There are only one or two players, and they are already dominating in that space. Look at email and calendar — you have no choice. That’s not the case if you look at our services. Even look at videoconferencing: you have so many players. No one has more than 60 percent market share. But look at other services — they’re very different, so I do not think we have unfair bundling compared to our competitors.

Whenever I talk to enterprise software CEOs like this, the big tension is always between whether the CIOs, who are really the customers here, are going to go with the big bundle from Microsoft or Google or [if] they’re going to build a best of breed bundle on their own. Ideally, you want Slack and Zoom and all this other stuff, and then each of those companies is going to try to eat everything else. Slack has huddles, which compete with audio calls. You can see all these products are in tension with each other, even in the other model where you’re picking the best version of each feature. How do you think about that? Because that has to be real for you — that you’re in that best of breed category, you’re not quite in the big bundle category, and you’re trying to get from one to the other.

Zoom Workplace is an open platform. We never want to tell customers, “You have to use everything from Zoom.” It’s not like that, not like our competitors. The competitors, they push customers. You have to use everything from their solution. [With] Zoom Workplace, that’s not the case. You can deploy Zoom team chat, or you can use Slack or Teams. That’s okay, too. You can use our whiteboard; you can use others’ whiteboards. We’re also integrated very well. Even for another service called Zoom Clips, [if] you want to use a Loom or any other one, that’s okay. It’s an open platform. We are not forcing customers to use everything from our workplace. It’s very different compared to our competitors. They force the customer [and say] you have to use that.

When you think about growing that business, adding all these AI features into it feels like there’s just a lot of reticence, particularly on privacy and security. We’re going to put a lot of our data into these products. Somewhere, there will be some training — who knows on what, who knows if it’s appropriate — and then there’ll be an AI assistant. There have already been some scandals, I would say, and controversy around Zoom AI features where the data is coming from. What’s your approach today?

When it comes to AI, you have to be responsible and accountable. That’s why we are the first AI company in the videoconferencing industry. Last year, we made a commitment. We do not use any of our customers’ data to train our own large language model or third-party large language models. We already made that commitment. However, AI is a new area. You also need to educate the customer on what that means in case customers do not understand. They thought, “Oh, we might be using [the data].” For some customers, they want to opt in to help us to tune the data. But no, that’s not the case. Even [to] customers that want to opt in, we say, “No, we do not need that.”

That’s a very responsible approach, and we already made that commitment. That’s the reason why customers today look at our AI Companion feature and more than 700,000 accounts already enabled it. More and more customers are going to enable that. For sure, especially for very large enterprise customers, they’re going through the internal audit to make sure we never turn on the AI feature from any of the vendors and make sure you pass their internal security and privacy check. But they look at our AI stack, our policy, our commitment, our terms of service, [and] they feel very comfortable.

When you think about the big vision — which still my mind is blown that this is your big vision — of “I’m going to send a digital twin into a meeting, and it’s going to make decisions on my behalf that everyone trusts, that everyone agrees on, and everyone acts upon,” the privacy risk there is even higher. The security surface there becomes even more ripe for attack. If you can hack into my Zoom and get my digital twin to go do stuff on my behalf, woah, that’s a big problem. How do you think about managing that over time as you build toward that vision?

That’s a good question. So, I think again, back to privacy and security, I think of two things. First of all, it’s how to make sure somebody else will not hack into your meeting. This is Eric; it’s not somebody else. Another thing: during the call, make sure your conversation is very secure. Literally just last week, we announced the industry’s first post-quantum encryption. That’s the first one, and at the same time, look at deepfake technology — we’re also working on that as well to make sure that deepfakes will not create problems down the road. It is not like today’s two-factor authentication. It’s more than that, right? And because deepfake technology is real, now with AI, this is something we’re also working on — how to improve that experience as well.

So you’re working on it in both directions. If the vision is “I have a digital twin that goes to a Zoom meeting and makes a decision,” you need to deepfake me. You need to make a realistic render of me that can go act in those situations, and then on the other side, you need to detect it. Where is your investment bigger? On creating the digital twin or detecting the digital twin?

I think those are very related. You cannot focus on one, ignore another one. Otherwise, it’ll not work. We have to offer the end-to-end experience and make it right. I think equally important, we have to invest in both.

Do you think it will be possible to reliably detect digital twins or deepfakes?

Of course. Because, again, it’s more like: Let’s say I send a digital twin of myself. It will be authenticated. You will say this is the real digital twin of Eric or the digital twin of somebody else, given the critical technology and a lot of new stuff. I think it’s very feasible to detect. Otherwise, you send a digital twin, and it may not be myself — you’re meeting somebody else.

Let me ask you this: what do you think is the limit? If I have a digital twin of me that can go to Zoom meetings or appear at conferences, should I be able to send a hundred digital twins out in the world? A thousand? Is there a limit? Do you think there should be a limit?

I don’t think there’s a limit. It really depends on yourself. More like how today, [if] you join a meeting, you want to wear a black jacket or you want to wear a white jacket. It depends on the day. Again, they all belong to you, though, right? You have multiple versions of digital twin. Some versions, I just want one—

Will all the digital twins be connected to each other? So if I have a hundred digital twins of me out in the world and one is being asked if the next car I want should be red or blue and another one is asking me if the next car I want should be white or black and they answer different colors, how will they know?

So, again, first of all, you control all your digital twins.

That’s one thing. The second thing: your digital twins, multiple digital twins, are different based on your training. One digital twin is really more like a sales expert; another digital twin of yourself is more like an engineering expert. Again, you manage that. Whenever you send a digital twin of yourself to join any other meetings, any other digital context, we know that they’ll be authentic given AI-based authentication. They know that it’s one of the digital twins of Nilay or one of the digital twins of Eric. They know that, too.

Do you imagine this is all happening in Zoom’s data center? So I log in to my Zoom account, I’ve got my engineer digital twin and my designer digital twin and whoever else, and I’m saying, “Alright, go off, go do stuff,” in a Zoom interface, or do I own these and I’m connecting to Zoom?

I think the interface is Zoom’s interface. However, how to manage that is very different. That’s the reason why I like crypto technology. It’s more like fully distributed. I do not think you can store the digital twin of yourself to our server. You will store somewhere you feel very safe, likely maybe on the edge, on your phone, desktop, or maybe somewhere you trust more, like where you store your Bitcoin. Something like that. I do not think you give your digital twin to each of the vendors. You use Zoom, use other services. I do not think that’s our architecture.

I’m just so curious because you have to build a lot of this to enable this. This is the vision. And some of these questions just seem fundamental. What is the rate at which you can deploy digital twins? That seems like a big decision we all need to make together. Have you thought about that? That someone might want to send a thousand digital twins out into the world? That might be a weird outcome.

You are so right. That’s the reason why AI is full of uncertainties, but in reality, it will happen. Whenever I train using my consumer LLM and have multiple digital twins, my friend also trusts that. That works for sure. There are some side effects, such as how to leverage all the distributed computing technology, AI technology, AR technology, crypto technology. That’s the reason why the next 10 or 20 years will be more exciting than the past 20 years.

Let me bring this back to Zoom. I wanted to get way in the clouds of what the future might look like, but Zoom as a company has to execute and do these things. One of the things that I think is interesting is that you have brought a lot of your people back to the office. I’ve heard you in interviews say, “I have Zoom fatigue.” Explain that to me. It seems like it’s hard to sell distributed office software, remote meeting software, and then to say, “I have fatigue using the software and my employees should come back.”

Let me set the record straight here, for two reasons. One reason is, again, our company culture is to deliver happiness, make sure we build a better product to deliver happiness to our customers. We have to eat our own dog food. A lot of our customers do hybrid work — one or two days in the office, the rest of time at home. When our customers embrace hybrid work, we build a lot of features to support hybrid work. Guess what? If we do not support hybrid work, we are not eating our own dog food. In particular, when customers visit us, they say, “Eric, show me all those features and how you support the hybrid work.” One feature, like reserving a desk, when we show those features, the customer also asks us, “Hey, how do you use that internally?” If all of our employees work from home all five days, I cannot do that. That’s one reason.

The second reason is very unique to Zoom. Because during the covid-19 crisis, we hired more than 6,000 people. If you already have employees that already know each other for a while, I think you do not need to go back to the office. But here, we had far more newly hired employees than existing employees: 6,000 versus 2,000. When you have so many new employees, how do you accelerate building trust? If you and I have a one-time in-person meeting with an intimate interaction, it’d be much faster to build trust. That’s the second reason. That’s only probably unique to Zoom because of the two reasons — we have to support hybrid work. We never told our employees to come back to the office for five days [a week]. Only two days.

I agree with you. We hired a bunch of people. I think being in person is really important, and then the tools are designed to get your work done digitally. How do you see that playing out in your own workplace? Are people getting together more often physically? Are they wanting to do that more often? Are the tools making that easier or creating more space for that?

I do not think that people want to get together more often in person, but they want to occasionally get together, particularly new employees. They want to start with in-person interaction, and afterward, you and I already know each other very well. For future interactions, we can go online, right? Also, the first time we meet in person, the future conversation will go online, and we also want to meet in person maybe once or twice a year because we have so many employees who live in LA and also in Florida and Texas. They also want to get together once or twice a year. I think that’s good enough.

You’re talking about dogfooding your own product, right? We can’t build tools for hybrid workplaces unless we have a hybrid workplace. There’s risk with AI. No one knows how it’s going to work. We’ve talked a lot about those risks. We talked a lot about some of the wild opportunities that might be presented. As you roll out more and more AI features, part of dogfooding them will be accepting the risk in Zoom’s own business. Have you thought about that balance? How much risk of an unproven AI tool you yourself will be willing to accept?

Great question. That’s why, internally, we also have a process. Our security and privacy team also has a product team — a separate team within the same team. That’s one of the key reasons. Because, otherwise, we would announce some AI features and just immediately roll them out to our customers. That’s not very responsible. Internally, we let all the Zoomies, our internal employees, try that first, share the feedback, and play around. When we release to customers, we also take a very responsible approach, meaning some AI features by default will be off. Only until enterprise IT tests those features and feels comfortable can they enable them.

For sure, that will slow down the adoption of those AI features, but we might take a different approach for other buyers. For an account, we might turn some features on by default. We look at it from different segments and make sure you have all the controls — enterprise, account IT. [As] the user, you also have control as well. The group also has control, plus our internal test. We are taking a very responsible approach.

Give me an example of an AI feature that you tested and thought, “This is great. We got to roll this out to the entire company.”

I think last year, almost a year now. The first feature we announced in the AI Companion feature is a meeting summary. We tested it, and when we found some potential issues, we said, “Let’s hold on releasing that to customers,” and also when we release to customers, by default, it’s still off. Enterprise account and IT, they need to turn on this feature — they feel comfortable and then give it to customers. That’s one thing.

I’ll give another example, just a recent example. Suppose in our meeting client’s workplace six-star release, we have a very cool feature I play around [with], which is to leverage AI to generate the virtual background. Today, for different meetings, I can use a different virtual background. I need to pick it up. You just type “Happy Monday” or “Busy Friday,” or something like that. You type some keywords to generate a meaningful AI virtual background. Before the release, we found something. As you said, some of the potential issues will not make our customers comfortable. Guess what? We held that, and we tested more. We fixed that until we felt very, very comfortable — until we feel the customer is not going to have any issues in terms of privacy, security, or hallucination, so on and so forth. Then, we are going to release it.

Alright, so here’s the flip side of that question. What’s an AI feature you’ve seen maybe you developed or someone else developed that you would not feel comfortable using at Zoom to do Zoom’s work?

Let’s say I want to record a video. Every time I know I recorded by myself and some technology, [I’ll] say, “Hey, you just send me a few minutes of video,” and down the road, I just send a script and automatically generate a video. For now, I think we’re also working on that feature as well, but I do not feel comfortable yet because it still sometimes creates some potential privacy issues. We have to fix that.

Do you think that’s really connected to the idea of these digital twins — that you’re going to be able to make realistic videos based on someone’s face? What’s the ramp before you’re comfortable using it yourself?

We need to make sure we have more than 8,000 people, all the internal Zoomies use that and feel comfortable, and we pick up some of the early adopters and roll out those features to them. They also feel comfortable, and then we have a global rollout, go through a release process but more conservative than before. For any AI features here, our release principle is always be conservative, be responsible.

One of the things that I’ve been thinking about as we cover AI here at The Verge is that there’s a gap in enthusiasm for AI between the companies, which are all investing in it like crazy and building it and rolling out the features, and people. Our audience is very skeptical and, in some cases, very angry about AI. I don’t know that the Valley has perceived that gap.

I’ll give you an example. I think the Scarlett Johansson story with OpenAI is a clear example of that gap. The Apple ad where they crushed everything in the iPad, which had really nothing to do with AI — people were mad at it because they’re just mad at AI, right? That’s a clear example of that gap. How do you think about that? Because it feels like we’re at the point now where it’s really time to wrestle with how regular people feel about the technology and how the industry feels about the technology.

It’s a great question. The way I look at this is that, for any new technology, in particular for revolutionary technology, it’s human nature. You have to focus on those early adopters — improve the technology to be better and better every day until it becomes a mainstream service. On day one, if you’re a new technology, you cannot assume every ordinary user or families or companies [are] going to embrace that. If that’s your assumption, I think that’s not right. You have to focus on early adopters. That’s it.

For any new technology, new features, that’s it. But if gradually you make the technology better and let the entire user base embrace that, I think that’s reality for every new technology. AI is no exception. That’s the reason why we feel very comfortable. I think especially for all those individuals, I think they will also feel comfortable. We cannot always say, “Hey, I have this very cool feature. Now every user, you have to embrace that.” That’s not going to happen.

Eric, this is a huge vision for the future of Zoom. What’s the next step people will be able to see? What’s next for Zoom?

I want the end user to realize that Zoom is a workplace collaboration platform. It’s more than just videoconferencing. That’s one thing. Another thing is the Zoom AI Companion is there, has great features, and is very accurate at no additional cost. I think end users have got to embrace more AI features. We have a lot of features already rolled out, but we are going to roll out more AI features there. I think more customers, if they are embracing those features, will realize, wow, they never thought about this feature, never thought about that feature. Guess what? That’s innovation.

Well, this is great, Eric. Thank you so much for being on Decoder. You’re going to have to come back soon, maybe as a digital twin someday.

Yes. Thank you, my friend. I really appreciate it.

Source: The Verge
