Clearly, AlphaFold2 is just the beginning, and its success is ushering in a new paradigm. Accurate protein structure prediction gives life scientists an efficient computational tool and opens the door to major AI-driven discoveries in the life sciences. In the future, epitope prediction for antibodies and antigens, precision therapy for tumors, and the design and optimization of TCR/personalized vaccines will become important research hotspots. We expect breakthroughs in these areas under the new AI-driven computing model, marking the arrival of a golden age of AI + macromolecular pharmaceuticals.
Many new scientific challenges will arise along the way, but they also herald new computing paradigms, such as an integrated dry-wet closed-loop computing framework. On the one hand, AI models will improve through closed-loop verification and the additional data supplied by high-throughput, multi-round wet experiments. On the other hand, through active learning or reinforcement learning, AI will actively plan automated wet experiments, so that dry computation and wet experiment verify each other in a closed loop and iteratively accelerate life science discovery and industrial application. We foresee that once this dry-wet closed loop is fully connected, life science research and the biomedical industry will enter a new research paradigm and industrial model.
The Institute for AI Industry Research (AIR) at Tsinghua University has already made some initial advances in the representation and prediction of genetic data. Recently, the GeneBert team led by Professor Lan Yanyan of AIR designed a novel gene pre-training model. By constructing a two-dimensional matrix between sequences and transcription factors, the team built a multi-modal gene pre-training model and obtained an effective representation of genetic data. In particular, the model mines the data value of non-coding regions, which substantially improved performance on downstream tasks: promoter prediction, transcription factor binding site prediction, and gene screening for Hirschsprung's disease. We believe that continued, in-depth application of cutting-edge AI techniques such as pre-training to genetic data will further unlock its value, help us crack the human genetic code, and contribute to important problems such as the precision treatment of cancer.
In summary, we believe that biology is undergoing a new revolution of digitalization, automation, and intelligent scientific computing. Using computational methods, namely artificial intelligence and the data-driven fourth paradigm of research, to help people explore and solve problems in life and health has become an important research direction. In the future, academia and industry will need to work together to move life sciences, biomedicine, genetic engineering, and personal health from isolated, open-loop development to collaborative, closed-loop development, and to achieve faster, more accurate, safer, more economical, and more inclusive innovation in life sciences and biomedicine. This represents a huge new opportunity for scientific development and industrial innovation over the next decade.