Google I/O 2022: Advancing knowledge and computing

Mike Letterman

May 11, 2022, 8:52 PM

[TL;DR]

Our mission has always been to organize the world's information and make it universally accessible and useful. We have been developing technology to deliver on that mission for decades.

The progress we have made is the result of our years of investment in advanced technologies, from artificial intelligence to the technical infrastructure that powers it all. Once a year, on my favorite day of the year, we share an update on how it is going at Google I/O.

Today, I talked about how we are advancing two fundamental aspects of our mission, knowledge and computing, to create products that are built to help. It is exciting to build these products, and even more exciting to see what people do with them.

We would like to thank everyone who helps us do this work. We are thankful for the opportunity.

- Sundar

Below is an edited transcript of the keynote address given by Sundar Pichai at the opening of the I/O Developers Conference.

Hi, everyone, and welcome. Actually, let's make that welcome back! It's great to be back at Shoreline Amphitheatre after three years. To everyone here with us, it is great to see all of you. And to everyone joining us from around the world, we're so happy you're here, too.

Last year, we shared how new discoveries in some of the most technically challenging areas of computer science are making products more helpful in the moments that matter. Our timeless mission is to organize the world's information and make it universally accessible and useful.

I am excited to show you how we are driving that mission forward by furthering our understanding of information so that we can turn it into knowledge and by furthering the state of computing so that knowledge is easier to access.

Progress on these two parts of our mission ensures that our products are built to help. I will start with a few examples. Throughout the pandemic, we have focused on delivering accurate information to help people stay healthy. Over the last year, people used Search and Maps to find where they could get a COVID vaccine.

A visualization of Google’s flood forecasting system, with three 3D maps stacked on top of one another, showing landscapes and weather patterns in green and brown colors. The maps are floating against a gray background.

In India and Bangladesh, 23 million people received flood warnings last year.

We have also expanded our flood forecasting technology to help people stay safe in the face of natural disasters. More than 23 million people in India and Bangladesh were notified of floods during last year's monsoon season. We estimate that this supported the timely evacuation of hundreds of thousands of people.

In Ukraine, we worked with the government to quickly deploy air raid alerts. Hundreds of millions of alerts have been delivered to help people get to safety. Millions of Ukrainians have sought refuge in Poland. Warsaw's population has increased by 20% as families host refugees in their homes and schools welcome thousands of new students. Nearly every Google employee I spoke with there was hosting someone.

In many countries around the world, newcomers and residents use Google Translate to communicate. We are proud of how it helps Ukrainians find hope and connection until they are able to return home.

Two boxes, one showing a question in English — “What’s the weather like today?” — the other showing its translation in Quechua. There is a microphone symbol below the English question and a loudspeaker symbol below the Quechua answer.

We can now add languages like Quechua to Google Translate.

Real-time translation is a testament to how knowledge and computing come together. More people are using Google Translate than ever before, but we still have work to do to make it universally accessible. There is a long tail of languages that are underrepresented on the web, and translating them is a hard technical problem. That is because translation models are usually trained on bilingual text: for example, the same phrase in both English and Spanish. There isn't enough publicly available bilingual text for every language.

With advances in machine learning, we have developed a monolingual approach in which the model learns to translate a new language without ever seeing a direct translation of it. By collaborating with native speakers and institutions, we found that these translations were of sufficient quality to be useful.

A list of the 24 new languages Google Translate now has available.

We are adding 24 new languages to Google Translate.

We are adding 24 new languages to Google Translate, including the first indigenous languages of the Americas. Together, these languages are spoken by more than 300 million people. Breakthroughs like this are powering a radical shift in how we access knowledge and use computers.

Much of what we know about our world is found in the physical and geospatial information around us. For more than 15 years, Google Maps has worked to create representations of this information to help us navigate. Advances in artificial intelligence are taking this work to the next level, whether it's expanding our coverage to remote areas or rethinking how to explore the world in more intuitive ways.

An overhead image of a map of a dense urban area, showing gray roads cutting through clusters of buildings outlined in blue.

Artificial intelligence is helping to map remote and rural areas.

Over 60 million kilometers of roads and 1.6 billion buildings have been mapped around the world. Some remote and rural areas have previously been difficult to map due to scarcity of high-quality imagery and distinct building types. We are using computer vision and neural networks to detect buildings at scale from satellite images. Since July 2020, the number of buildings on the map in Africa has increased by 5X.

We doubled the number of buildings mapped in India and Indonesia this year. Over 20% of the buildings on the map have been detected using these new techniques. The dataset of buildings in Africa has been made public. The World Bank and the United Nations are using it to provide support and emergency assistance.

Immersive view in Maps combines aerial and street level images.

New capabilities are being brought into Maps. We are using advances in 3D mapping and machine learning to create a high-fidelity representation of a place. Immersive view is a new experience in Maps that allows you to explore a place like never before.

Let's take a look at London. Say you are planning to visit Westminster with your family. You can open this immersive view on your phone and pan around the sights, such as Westminster Abbey. If you are thinking of heading to Big Ben, you can check the traffic, the weather, and how busy the area is. If you want to get a bite during your visit, you can check out restaurants nearby.

We use neural rendering to create the experience from images alone. The experience can run on just about any smartphone. This feature will be added to Maps later this year.

Eco-friendly routing is another big improvement to Maps. Launched last year, it shows you the most fuel-efficient route, which can save you money on gas and reduce carbon emissions. In the U.S. and Canada, people have used eco-friendly routes to travel 86 billion miles, saving an estimated half million metric tons of carbon emissions.

Still image of eco-friendly routing on Google Maps — a 53-minute driving route in Berlin is pictured, with text below the map showing it will add three minutes but save 18% more fuel.

Eco-friendly routing will expand to Europe later this year.

We're going to expand this feature to more places later this year, including Europe. In this Berlin example, you could reduce your fuel consumption by 18% by taking a route that is just three minutes slower. Small decisions like these have a big impact at scale. With the expansion into Europe and beyond, we estimate that carbon emission savings will double by the end of the year.
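The trade-off in the Berlin example can be sketched in a few lines of Python. This is only an illustration, not how Maps actually chooses routes: the `Route` class, the `pick_eco_route` and `fuel_saving_percent` helpers, the five-minute cap on extra travel time, and the fuel figures are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str
    minutes: float      # estimated travel time
    fuel_liters: float  # estimated fuel use (hypothetical figures)


def pick_eco_route(routes, max_extra_minutes=5.0):
    """Return the lowest-fuel route that is at most max_extra_minutes slower than the fastest."""
    fastest = min(routes, key=lambda r: r.minutes)
    allowed = [r for r in routes if r.minutes - fastest.minutes <= max_extra_minutes]
    return min(allowed, key=lambda r: r.fuel_liters)


def fuel_saving_percent(baseline, alternative):
    """Percentage of fuel saved by taking alternative instead of baseline."""
    return 100.0 * (baseline.fuel_liters - alternative.fuel_liters) / baseline.fuel_liters


# Made-up numbers shaped like the Berlin example: 3 minutes slower, about 18% less fuel.
fastest = Route("fastest", minutes=50, fuel_liters=5.0)
eco = Route("eco-friendly", minutes=53, fuel_liters=4.1)

choice = pick_eco_route([fastest, eco])
print(choice.name, round(fuel_saving_percent(fastest, choice), 1))
```

Capping the extra travel time is one simple way to keep a fuel-saving suggestion reasonable; a real routing system would weigh many more signals.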

We have added a similar feature to Google Flights. When you search for flights between two cities, we show you carbon emission estimates, making it easy to choose a greener option. Our goal is to empower 1 billion people to make more sustainable choices through our products, and we are excited about the progress here.

Video is becoming an even more fundamental part of how we communicate and learn. Often you are looking for a specific moment in a video, and we want to help you get there faster.

We launched auto-generated chapters last year to make it easier to jump to the part you're most interested in.

This is great for viewers, and also for creators because it saves them time making chapters. We are now applying multimodal technology from DeepMind, which uses text, audio and video together to generate chapters with greater accuracy and speed. With this, we have a goal to 10X the number of videos with auto-generated chapters, from eight million today to 80 million over the next year.

Often the fastest way to get a sense of a video's content is to read its transcript, so we are also using speech recognition models to transcribe videos. Video transcripts are now available to all users.

Animation showing a video being automatically translated. Then text reads "Now available in sixteen languages."

Auto-translated captions on YouTube.

Next up, we are bringing auto-translated captions to mobile. This means that viewers can now auto-translate video captions in 16 languages. Next month, we will be expanding auto-translated captions to Ukrainian YouTube content, part of our larger effort to increase access to accurate information about the war.

We are also building AI into our products to help people be more efficient. Whether you work for a small business or a large institution, chances are you spend a lot of time reading documents. Maybe you have felt a wave of panic when you realized you have a 25-page document to read before a meeting that starts in five minutes.

Whenever I get a long document or email, I look for a TL;DR at the top, which is short for Too Long, Didn't Read.

That's why we've introduced automated summarization for Google Docs. Using one of our machine learning models for text summarization, Docs will automatically pull out the main points.

This is a big step forward for natural language processing. Summarization requires understanding long passages, compressing information and generating language, which used to be outside the capabilities of even the best machine learning models.

Docs is only the beginning. We are launching summarization for other products. In the next few months, you will be able to get a helpful digest of chat conversations, so you can jump right into a group chat or look back at the key highlights.

Animation showing summary in Google Chat

In the coming months, we will bring summarization to Google Chat.

We are also working to bring transcription and summarization to Google Meet so you can catch up on meetings you missed.

Visual improvements on Google Meet

Of course, there are many times when you really want to be in a virtual room with someone. That is why we continue to improve audio and video quality, inspired by Project Starline. We introduced Project Starline at I/O last year, and we have been testing it across our offices to get feedback and improve the technology for the future. Along the way, we have learned some things that we can apply right now to Google Meet.

Starline inspired machine learning-powered image processing that automatically improves your image quality in Google Meet. It works on all types of devices, so you can look your best wherever you are.

An animation of a man looking directly at the camera then waving and smiling. A white line sweeps across the screen, adjusting the image quality to make it brighter and clearer.

Machine learning-powered image processing improves image quality.

We are also bringing studio-quality virtual lighting to Meet. You can adjust the light position and brightness so that you are still visible in a dark room or when sitting in front of a window. We are testing this feature to make sure everyone looks like their true self, continuing the work we have done with Real Tone and the Monk Scale.

We are introducing a next step in our commitment to image equity.

These are just some of the ways that artificial intelligence is improving our products: making them more helpful, more accessible, and delivering innovative new features for everyone.

Gif shows a phone camera pointed towards a rack of shelves, generating helpful information about food items. Text on the screen shows the words ‘dark’, ‘nut-free’ and ‘highly-rated’.

We are helping people find helpful information in more intuitive ways on Search.

We've talked about how we're trying to advance access to knowledge, from better language translation to improved Search experiences across images and video, to richer explorations of the world using Maps.

Now we are going to focus on how we make that knowledge even more accessible through computing. The journey we have been on with computing is an exciting one. Every shift, from desktop to the web to mobile to wearables, has made knowledge more useful in our daily lives.

As helpful as our devices are, we have had to work pretty hard to adapt to them. I have always believed that computers should be adapting to people, not the other way around. We continue to push ourselves to make progress here.

We're making computing more natural and intuitive with the Google Assistant.

There are more ways to interact with devices with the help of the Google Assistant.

Introducing LaMDA 2 and AI Test Kitchen

Animation shows demos of how LaMDA can converse on any topic and how AI Test Kitchen can help create lists.

A demo of LaMDA, our generative language model for dialogue applications.

We are continually working to advance our conversational capabilities. Conversation and natural language processing are powerful ways to make computers more accessible to everyone, and large language models are key to this.

Last year, we introduced LaMDA, a generative language model for dialogue applications that can converse on any topic. Today, we are announcing LaMDA 2, our most advanced conversational AI yet.

We feel a deep responsibility to get it right because we are at the beginning of a journey to make models like these useful to people. We need people to experience the technology and give feedback. Thousands of people enjoyed testing LaMDA and seeing its capabilities. This resulted in a reduction in offensive or inaccurate responses.

That is why we have made AI Test Kitchen. It is a new way to explore AI features with a broader audience. There are several different experiences inside AI Test Kitchen. Each is intended to give you a sense of what it would be like to have LaMDA in your hands and use it for things you care about.

The first demo tests whether the model can take a creative idea you give it and generate imaginative and relevant descriptions. These are not products; they are quick sketches that allow us to explore what LaMDA can do with you. The user interface is very simple.

Say you are writing a story and need some inspiration. Maybe one of your characters is exploring the deep ocean. You can ask what that might feel like. Here LaMDA describes a scene in the Mariana Trench. It even generates follow-up questions on the fly. You can ask LaMDA to imagine what kinds of creatures might live there. We didn't hand-program the model for specific topics. It synthesized these concepts from its training data. That is why you can ask about almost anything, from the rings of Saturn to being on a planet made of ice cream.

Staying on topic is a challenge for language models. Say you are building a learning experience: you want it to be open-ended enough that people can explore where curiosity takes them, but still stay safely on topic. The second demo tests how LaMDA does with that.

We primed the model to focus on the topic of dogs. If you ask a follow-up question, you get an answer with some relevant details.

The conversation can be taken anywhere you want. Maybe you are curious about how smell works and you want to dive deeper. You will get a unique response to that too. It will try to keep the conversation going on the topic of dogs. The model brings the topic back to dogs when I start asking about cricket.

This challenge of staying on topic is a difficult one, and it is an important area of research for building useful applications with language models.

These experiences show the potential of language models to help us with things like planning, learning about the world, and more.

There are significant challenges that need to be solved before these models can be useful. We have improved safety, but the model might still generate offensive responses. People can help report problems by giving feedback in the app.

All of this work will be done in accordance with our AI Principles. Our process will be iterative, opening up access over the coming months, and carefully assessing feedback with a broad range of stakeholders. We will incorporate this feedback into future versions of LaMDA.

Over time, we intend to continue adding other emerging areas of artificial intelligence to AI Test Kitchen. You can learn more at the website.

LaMDA 2 has incredible conversational capabilities. To explore other aspects of natural language processing, we recently announced a new model. It is called the Pathways Language Model, or PaLM for short. It is our largest model to date, with 540 billion parameters.

PaLM demonstrates breakthrough performance on many natural language processing tasks, such as generating code from text, answering a math word problem, or even explaining a joke.

It achieves this through greater scale. When we combine that scale with a new technique called chain-of-thought prompting, the results are promising. Chain-of-thought prompting allows us to describe multi-step problems as a series of intermediate steps.

Let's take an example of a math word problem that requires reasoning. Normally, you prompt a model with an example question and answer, and then you start asking questions. In this case: how many hours are in the month of May? As you can see, the model didn't quite get it right.

In chain-of-thought prompting, we again give the model a question-answer pair, but this time we include an explanation of how the answer was derived. It's like when your teacher gives you a step-by-step example to help you understand how to solve a problem. Now, if we ask the model again how many hours are in the month of May, it answers correctly and even shows its work.

There are two boxes below a heading saying ‘chain-of-thought prompting’. A box headed ‘input’ guides the model through answering a question about how many tennis balls a person called Roger has. The output box shows the model correctly reasoning through and answering a separate question (‘how many hours are in the month of May?’)

Chain-of-thought prompting leads to better reasoning.

Chain-of-thought prompting increases accuracy. State-of-the-art performance is achieved across several reasoning benchmarks. We can do it all without changing how the model is trained.
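The difference between the two prompting styles can be sketched as plain string construction. This is a minimal illustration under stated assumptions, not PaLM's actual prompt or API: the function names are hypothetical, the exemplar is a tennis-ball word problem written in the style of the one in the figure, and no model is called. The `hours_in_month` helper only computes the reference answer for the May question.

```python
def standard_prompt(exemplar_q, exemplar_answer, question):
    """Few-shot prompt whose exemplar shows only the final answer."""
    return f"Q: {exemplar_q}\nA: The answer is {exemplar_answer}.\n\nQ: {question}\nA:"


def chain_of_thought_prompt(exemplar_q, exemplar_steps, exemplar_answer, question):
    """Few-shot prompt whose exemplar also spells out the intermediate steps."""
    reasoning = " ".join(exemplar_steps)
    return (
        f"Q: {exemplar_q}\n"
        f"A: {reasoning} The answer is {exemplar_answer}.\n\n"
        f"Q: {question}\nA:"
    )


def hours_in_month(days):
    """Reference answer that a correct chain of thought should reach."""
    return days * 24


# Illustrative exemplar (the wording is an assumption, not the slide's exact text).
exemplar_q = (
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?"
)
exemplar_steps = [
    "Roger started with 5 balls.",
    "2 cans of 3 tennis balls each is 6 tennis balls.",
    "5 + 6 = 11.",
]
question = "How many hours are in the month of May?"

print(chain_of_thought_prompt(exemplar_q, exemplar_steps, 11, question))
print(hours_in_month(31))  # May has 31 days: 31 * 24 = 744
```

Only the prompt text changes between the two functions, which matches the point above that the technique needs no change to how the model is trained.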

PaLM can do so much more. For example, you might speak a language that is not well-represented on the web today, which makes it hard to find information. That is even more frustrating because the answer you are looking for is probably out there. PaLM offers a new approach that holds enormous promise for making knowledge more accessible for everyone.

Here is an example in which we can help answer questions in a language like Bengali, which is spoken by a quarter billion people. Just like before, we prompt the model with two example questions in Bengali, with answers in both Bengali and English.

Now we can start asking questions in Bengali, such as what the national song of Bangladesh is. A correct answer here is not that surprising, because you would expect that content to exist in Bengali.

You can also try a question that is less likely to have related information in Bengali, such as one about New York City. The model again answers correctly in Bengali, though it probably just stirred up a debate amongst New Yorkers about how correct that answer really is.

What is so impressive is that PaLM has never seen parallel sentences between Bengali and English. It was never explicitly taught to answer questions or translate. The model brought all of its capabilities together to answer questions correctly in Bengali. We can extend the techniques to more languages.

We are so optimistic about the potential of language models. One day, we hope we can answer questions on more topics in any language you speak, making knowledge even more accessible in Search and across all of Google.

The advances we have shared today are possible only because of our continued innovation in our infrastructure. We recently announced plans to invest $9.5 billion in data centers and offices across the U.S.

Mayes County, Oklahoma is home to one of our state-of-the-art data centers. The world's largest machine learning hub is going to be launched there.

Still image of a data center with Oklahoma map pin on bottom left corner.

One of our data centers in Mayes County, Oklahoma.

This machine learning hub is built on the same networking infrastructure that powers our largest neural models. It provides nearly nine exaflops of computing power, giving our customers an unprecedented ability to run complex models and workloads. We hope this will encourage innovation in many fields.

The machine learning hub is powered by 90% carbon-free energy. We want to become the first major company to operate all of our data centers and campuses on carbon-free energy by the year 2030.

As we invest in our data centers, we are also working to improve our mobile platforms so more processing can happen locally on the device. Google Tensor, our custom system on a chip, was an important step in this direction. It is already running on Pixel 6 and Pixel 6 Pro, and it brings the best speech recognition we've ever deployed right to your phone. It is also a big step forward in making those devices more secure. It can run data-powered features directly on the device so that they stay private to you.

Every day, people turn to our products for help. Core to making this possible is protecting your private information every step of the way. Even as technology grows more complex, we keep more people safe online than anyone else in the world, with products that are secure by default, private by design and that put you in control.

An update on how we keep our users safe.

We also spent time today sharing updates to our other platforms, which deliver access, connection, and information to billions of people through connected devices like TVs, cars, and watches.

At I/O, we announced updates for phones, watches and tablets to help all your devices work better together.

We also shared our new Pixel portfolio of products, all built with ambient computing in mind. We are excited to share a family of devices that work better together.

Our work in ambient computing is furthered by the new Pixel portfolio.

We talked about the technologies that are changing how we use computers. It's easier to get things done with devices that work together, when and where you need them, and with conversational interfaces.

There is a new frontier of computing that has the potential to extend all of this even further: augmented reality. We have invested heavily in this area, and we have been building augmented reality into many Google products.

The magic of augmented reality will come alive when you can use it in the real world without the technology getting in the way.

The ability to spend time focusing on what matters in the real world, in our real lives, is what gets us most excited about augmented reality. The real world is amazing.

We need to design in a way that is built for the real world and doesn't take you away from it. Augmented reality gives us new ways to accomplish this.

Let's use language as an example. Language is fundamental to connecting with one another. Yet understanding someone who speaks a different language, or trying to follow a conversation if you are hard of hearing, can be a real challenge. Let's see what happens when we take our improvements in translation and transcription and deliver them in one of the early prototypes we have been testing.

You can see it in their faces: the joy that comes with speaking naturally to someone. That moment of connection. To understand and be understood. That is what our focus on knowledge and computing is all about. It is what we strive for every day, with products that are built to help.

Each year we get a little closer to delivering on our mission, and we still have further to go. We feel a genuine sense of excitement about that, and we are optimistic that the breakthroughs you just saw will help us get there. Thank you to everyone who joined us today. We look forward to building the future with you.
