Life science and biomedicine are entering the digital 3.0 era, and AI is helping the field become faster, more accurate, safer, more economical and more inclusive.
On the afternoon of September 26, the 2021 World Internet Conference was held in Wuzhen. At the Data and Algorithm Forum, Academician Zhang Yaqin, president of the Institute for AI Industry Research (AIR) at Tsinghua University, spoke on the theme "Artificial intelligence enables life science". He introduced the new wave of digitization and intelligence in the biological world and shared AIR's new plans for the interdisciplinary field of artificial intelligence and life and health. The report was prepared jointly by President Zhang Yaqin and team members Ma Weiying, Lan Yanyan and Huang Tingting.
With advances in gene sequencing, high-throughput biological experiments, sensors and related technologies, life science and biomedicine are entering the digital 3.0 era, and digitalization and automation are accelerating. Health computing, a new intelligent scientific computing model, is the fourth research paradigm, with artificial intelligence and data-driven methods at its core. It will greatly help humanity explore and solve problems of life and health.
Since the 1950s, artificial intelligence has produced many different algorithms, notably the early deep learning techniques represented by RNN, LSTM and CNN and, more recently, GANs and Transformer-based pre-trained models such as BERT and GPT-3. It can be said that in perception tasks such as speech recognition, face recognition and object classification, AI has reached human-level performance. But large gaps remain in natural language understanding, knowledge reasoning, video semantics and generalization. In addition, there are still major challenges in algorithmic transparency, interpretability, causality, security, privacy and ethics.
There have been many recent advances in trusted AI computing. One example is federated learning, which is also an important research topic at AIR. There are two main schemes. Horizontal federated learning is aimed at scenarios in which different sources share the same features and model, and it protects the privacy of same-modality data held by those sources. Vertical federated learning handles sources that hold different features and models, and it can protect the privacy of multi-modal data.
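The horizontal scheme can be illustrated with a minimal sketch of federated averaging, a common algorithm for this setting. It is an illustration only, not AIR's implementation: the function names, the linear model, the learning rate and the two example sites are all assumptions made for this sketch. Each site trains on its own samples, and only model weights, never raw data, are sent for aggregation.

```python
# Minimal sketch of horizontal federated learning via federated averaging.
# All names and values are hypothetical and chosen only for illustration.
from typing import List, Tuple

Sample = Tuple[List[float], float]  # (features, target)


def local_update(weights: List[float], data: List[Sample],
                 lr: float = 0.05, epochs: int = 20) -> List[float]:
    """Run a few epochs of gradient descent on one site's private data."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / len(data)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def federated_average(client_weights: List[List[float]],
                      client_sizes: List[int]) -> List[float]:
    """Average the site models, weighted by each site's sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


def train(clients: List[List[Sample]], dim: int,
          rounds: int = 50) -> List[float]:
    """Alternate local training and server-side averaging."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        # Only weights leave each site; the samples stay local.
        updates = [local_update(global_w, data) for data in clients]
        global_w = federated_average(updates, [len(d) for d in clients])
    return global_w


# Two hypothetical sites hold different samples of the same two features
# (x and a constant bias term), generated by y = 2*x + 1.
site_a = [([x, 1.0], 2.0 * x + 1.0) for x in (0.0, 0.5, 1.0)]
site_b = [([x, 1.0], 2.0 * x + 1.0) for x in (1.5, 2.0)]
model = train([site_a, site_b], dim=2)
print(model)  # weights should approach the slope 2 and intercept 1
```

In the vertical scheme the sites would instead hold different feature columns for overlapping records, so they could not simply average whole models in this way; that case typically needs additional machinery such as encrypted exchange of intermediate results.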
AI is already moving life and health research and biomedicine in a faster, more accurate, safer, more economical and more inclusive direction. Specifically, AI research in protein structure prediction, CRISPR gene editing, antibody, TCR and personalized vaccine development, precision medicine, and AI-assisted drug design has become a strategic research frontier internationally.
Given these disciplinary trends and the industrial background, AIR has set out four research directions under "AI + life and health": "AI-enhanced personal health management and public health", "AI + medical and life sciences", "AI-assisted drug research and development" and "AI + gene analysis and editing".
Because this work cuts across fields, AIR recognizes that there is a large knowledge gap between artificial intelligence and the life sciences and biomedicine. Datasets, AI platforms, core algorithms and computing engines for biological computing are lacking, and interdisciplinary talent is scarce. In response, AIR proposed the "AI + Life Science Breaking the Wall Plan". Its goal is to define the core frontier research tasks in AI + life science, bridge the gap between life and health research and artificial intelligence, break down barriers, promote the deep integration of AI and life science, and accelerate scientific discovery.
To this end, we need to build artificial intelligence infrastructure, data platforms and core algorithm engines for life science to support cutting-edge research tasks. At the same time, we need to create flagship open datasets, organize algorithm challenges, build a crowd-intelligence platform for AI + life science, cultivate interdisciplinary talent, and build an industrial ecosystem.
AlphaFold2 is a classic success story for AI + life science. Its success rests on two factors. The first is the nature of the task. Protein structure prediction can be regarded as a one-to-one mapping from sequence to three-dimensional structure, so it is a well-defined AI problem. Finding significant research tasks in the life sciences that can be abstracted into problems suitable for AI is exactly the goal of the Breaking the Wall Plan. The second is the strength of the model. Long-term research in the life sciences has accumulated large-scale protein structure data, and the AlphaFold2 architecture makes full use of it through data-driven, end-to-end deep learning. This combination of big data and deep models is the typical characteristic of the fourth paradigm. The lesson of AlphaFold2, therefore, is that research in AI + life science should pay attention both to breaking the wall and to the fourth paradigm.