Natural Sciences – Google AI Blog

(This is part 7 of our series of posts covering different topic areas of Google research. Other posts in the series can be found here.)

It’s an incredibly exciting time to be a scientist. Thanks to amazing advances in machine learning (ML) and quantum computing, we now have powerful new tools that enable us to act on our curiosity, collaborate in new ways, and radically accelerate progress toward breakthrough scientific discoveries.

Since joining Google Research eight years ago, I’ve been honored to be part of a community of talented researchers who are passionate about using modern computing to push the boundaries of what’s possible in applied science. Our teams explore topics across the physical and natural sciences. So for this year’s blog post, I want to focus on the high-impact advances we’ve made recently in biology and physics, from helping to organize the world’s protein and genomics information to benefit people’s lives to improving our understanding of the nature of the universe with quantum computers. We are inspired by the great potential of this work.

Using machine learning to unlock the mysteries of biology

Many of our researchers are fascinated by the extraordinary complexity of biology, from the mysteries of the brain, to the potential of proteins, to the genome that encodes the language of life. We work with scientists from other leading organizations around the world to solve important challenges in the fields of connectomics, protein function prediction, and genomics, and to make our innovations accessible and useful to the larger scientific community.


One interesting application of our Google-developed ML methods has been studying how information travels through neural pathways in zebrafish brains, which provides insight into how the fish engage in social behaviors such as courtship. In collaboration with researchers at the Max Planck Institute for Biological Intelligence, we were able to computationally reconstruct part of a zebrafish brain imaged by 3D electron microscopy. This is an exciting advance in the use of imaging and computational pipelines to map neural circuits in small brains, and another step forward in our long-standing investment in connectomics.

Reconstruction of the neural circuitry of the larval zebrafish brain, courtesy of the Max Planck Institute for Biological Intelligence.

The technical advances needed for this work will have applications even beyond neuroscience. For example, to address the challenge of working with such large connectomics data sets, we developed and released TensorStore, an open source C++ and Python software library for storing and manipulating n-dimensional data. We look forward to seeing the ways it is used in other fields to store large data sets.

We also use ML to shed light on how the human brain performs remarkable feats such as language, by comparing human language processing with autoregressive deep language models (DLMs). For this study, a collaboration with colleagues at Princeton University and New York University Grossman School of Medicine, participants listened to a 30-minute podcast while their brain activity was recorded by electrocorticography. The recordings suggest that the human brain and DLMs share computational principles of language processing, including continuous prediction of the next word, dependence on contextual embeddings, and computation of post-onset surprise based on how well the word matched the prediction (we can measure how surprised the human brain is by a word and correlate that surprise signal with how well the word is predicted by the DLM). These results provide new insights into language processing in the human brain and suggest that DLMs can be used to reveal valuable insights into the neural basis of language.
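To make the idea of correlating a surprise signal with model predictions concrete, here is a minimal sketch in Python. It is not the method used in the study. It assumes surprise is quantified as surprisal (the negative log probability the model assigned to the word that was actually heard) and compares it with a neural response using a Pearson correlation. The probabilities, the response amplitudes, and the function names are all hypothetical and chosen only for illustration.

```python
import math


def surprisal(prob):
    """Surprise, in bits, for a word the model predicted with probability `prob`."""
    return -math.log2(prob)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical probabilities a language model assigned to each word actually heard.
word_probs = [0.5, 0.25, 0.125, 0.0625]

# Hypothetical post-onset neural response amplitudes (arbitrary units), one per word.
neural_response = [1.1, 1.9, 3.2, 3.8]

surprisals = [surprisal(p) for p in word_probs]
r = pearson(surprisals, neural_response)

print("surprisal (bits):", surprisals)
print("correlation with neural response:", round(r, 3))
```

In this toy data, less predictable words get both higher surprisal and a larger response, so the correlation is strongly positive. Real analyses would need many more words, careful alignment of the recordings to word onsets, and appropriate statistical controls.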


ML has also allowed us to make significant progress in understanding biological sequences. In 2022, we used recent advances in deep learning to accurately predict protein function from raw amino acid sequences. We also worked closely with the European Bioinformatics Institute of the European Molecular Biology Laboratory (EMBL-EBI) to thoroughly evaluate model performance and add hundreds of millions of functional annotations to the public protein databases UniProt, Pfam/InterPro, and MGnify. Human annotation of protein databases can be a laborious and slow process, and our ML methods have allowed us to make giant leaps, such as adding more Pfam annotations than all other efforts over the past decade combined. The millions of scientists around the world who access these databases every year can now use our annotations in their research.

Google Research’s contribution to Pfam exceeds in size all expansion efforts made to the database over the past decade.

Although the first draft of the human genome was released in 2003, it was incomplete and had many gaps due to technical limitations of sequencing technologies. In 2022, we celebrated the remarkable achievements of the Telomere-2-Telomere (T2T) consortium in resolving these previously inaccessible regions, including five complete chromosome arms and nearly 200 million base pairs of new DNA sequence that are interesting and important for questions of human biology, evolution, and disease. Our open source genomics variant caller, DeepVariant, was one of the tools used by the T2T consortium to prepare the complete 3.055 billion base pair sequence of the human genome. The T2T consortium is also using our new open source method, DeepConsensus, which provides on-device error correction for Pacific Biosciences’ long-read sequencing instruments, in its latest research toward comprehensive genomic resources that can represent the breadth of human genetic diversity.

Using quantum computing to make new discoveries in physics

As a tool for scientific discovery, quantum computing is still in its infancy, but it has great potential. We are exploring ways to advance the capabilities of quantum computing so that it can become a tool for scientific discovery. In collaboration with physicists around the world, we are also beginning to use our existing quantum computers to create exciting new experiments in physics.

As an example of such experiments, consider the problem where a sensor measures something and a computer then processes the data from the sensor. Traditionally, this means that sensor data is processed as classical information on our computers. Instead, one idea in quantum computing is to directly process quantum data from sensors. Passing data from quantum sensors directly to quantum algorithms, without going through classical measurements, can provide a large advantage. In a recent scientific paper written in collaboration with researchers from multiple universities, we show that quantum computing can extract information from exponentially fewer experiments than classical computing, as long as the quantum computer is directly connected to quantum sensors and runs a learning algorithm. This “quantum machine learning” can provide an exponential advantage in data set size, even with today’s noisy intermediate-scale quantum computers. Because experimental data is often the limiting factor in scientific discovery, quantum ML has the potential to unlock the enormous power of quantum computers for scientists. Even better, the insights from this work are also applicable to learning about the output of quantum computations, such as the output of quantum simulations, which can otherwise be difficult to extract.

Even without quantum ML, a powerful application of quantum computers is the experimental study of quantum systems that would otherwise be impossible to observe or model. In 2022, the Quantum AI team used this approach to observe the first experimental evidence of multiple microwave photons in a bound state using superconducting qubits. Photons do not normally interact with each other, and they require an additional element of nonlinearity to cause them to interact. The results of our quantum computer simulations of these interactions surprised us: we thought that the existence of these bound states relied on fragile conditions, but instead we found that they are robust even to the relatively strong perturbations that we applied.

Occupancy probability versus discrete time step for n-photon bound states. We observe that most photons (darker colors) remain bound together.

Given the initial success we’ve had in using quantum computing to make advances in physics, we hope this technology will enable future breakthroughs that could have as significant an impact on society as the invention of transistors or GPS. The future of quantum computing as a scientific tool is exciting.


I’d like to thank everyone who worked hard on the advances described in this post, including the Google Applied Sciences, Quantum AI, Genomics, and Brain teams and their collaborators at Google Research and externally. Finally, I’d like to thank the many Googlers who provided feedback on this post, including Lizzie Dorfman, Erica Brand, Elise Kleeman, Abe Asfau, Viren Jain, Lucy Colwell, Andrew Carroll, Ariel Goldstein, and Charina Chow.


Google Research, 2022 and beyond

This was the seventh blog post in the Google Research, 2022 & Beyond series.
