The group focuses on high-performance computing for biomedicine. By developing high-performance computing, big data, and artificial intelligence algorithms, it addresses major problems in biomedical science and supports the biomedical industry. It develops accurate, ultra-fast algorithms for disease diagnosis and treatment and for new drug research and development, and it is building a unified cloud supercomputing platform for big data analysis and computation on Tianhe-2 that provides a one-stop service for industrial applications.
With the rapid growth of health big data, healthcare has become a hot spot for industrial AI applications. Tencent, Alibaba, Huawei, and other companies have successively moved into the field. This year's fight against the epidemic has fully demonstrated the important role of big data analysis in human health. During the epidemic, the laboratory made full use of the computing power of Tianhe-2 to carry out a series of projects, including CT-based intelligent diagnosis, intelligent drug recommendation algorithms, and a medical knowledge graph, and achieved many important results. The projects span computer vision (CV), natural language processing (NLP), machine learning (ML), knowledge graphs, and other fields, and draw on a wide range of cutting-edge AI techniques. Students interested in any of these technologies can find a place to apply them here.
At present, the main research areas are intelligent drug design, protein structure and function prediction, knowledge-driven multi-omics big data analysis, and the development of a biomedical high-performance computing platform based on Tianhe-2.
Covering the whole drug design process, the group has developed AI-based algorithms for each stage, including drug screening, molecular optimization, ADMET property prediction, and chemical synthesis route prediction. Relevant representative work includes:
Proteins are among the most important macromolecules in organisms and participate in almost all life activities. Accurately predicting the three-dimensional structure and folding process of proteins is regarded as one of the major scientific challenges of the 21st century. Since 2013, the PI has developed the SPIDER series of protein secondary structure predictors using deep learning, among the earliest work in the world to apply deep learning to protein secondary structure prediction. Since then, the series has successively introduced multi-task learning, iterative model training, and other strategies, and has moved structure prediction from the discrete secondary-structure states used in the past to continuous numerical prediction.
As omics data grow in diversity and scale, the state of an individual organism can be characterized comprehensively across multiple spatiotemporal scales and from different perspectives, so multi-omics data analysis plays an increasingly important role. However, multi-omics data are noisy and high-dimensional, with complex relationships among variables, so accurate multi-omics analysis requires the help of prior knowledge.