|
University of Edinburgh - (2023-2024)
Project: Skill-aware Mutual Information Optimisation for Generalisation in Reinforcement Learning
I was a visiting student in the Autonomous Agents Research Group at the University of Edinburgh, supervised by Prof. Stefano V. Albrecht.
|
|
Harbin Institute of Technology - (2019-)
Thesis: Causal Reinforcement Learning Generalisation for Healthcare Agents
I began my doctoral studies directly following my undergraduate degree, thanks to the postgraduate recommendation scheme. I am currently pursuing a PhD at the Faculty of Computing at Harbin Institute of Technology.
GPA: 92.53/100
|
|
Harbin Engineering University - (2015-2019)
I earned my bachelor’s degree in Internet of Things Engineering from the College of Computer Science and Technology, Harbin Engineering University, in 2019. I received the Outstanding Graduate and Outstanding Graduation Thesis awards in 2019.
GPA: 88.95/100
|
|
[1] Skill-aware Mutual Information Optimisation for Generalisation in Reinforcement Learning
Xuehui Yu, Mhairi Dunion, Xin Li, Stefano V Albrecht
NeurIPS 2024 (poster)
Keywords: contrastive learning, RL, meta RL, zero-shot generalisation.
❓ How to build general robots capable of seamlessly operating in any environment, with any object, and utilising various skills?
With our SaMI learning objective, RL agents are incentivised to become versatile and zero-shot generalise across infinite tasks 😉
💡 Generalisation starts with corrective behaviours. The ability to correct and try again is likely a key ingredient.
Code | Paper | Our benchmark: Sa-Panda-gym | SlidesLive demo video
|
|
[2] ARLPE: A Meta Reinforcement Learning Framework for Glucose Regulation in Type 1 Diabetics
Xuehui Yu, Yi Guan, Lian Yan, Shulang Li, Xuelian Fu, Jingchi Jiang*
Expert Systems With Applications, IF: 8.665.
Keywords: meta-RL, active learning, fast online adaptation, RL generalisation.
❓ How can rapid adaptation be achieved with extremely limited data in an online deployment? Employ “optimistic exploration” through active RL!
💉 An RL-based closed-loop control method for artificial pancreas systems, enabling automatic medication infusion via pump control for diverse, previously unseen clinical patients.
Code | Paper | WI Healthcare APP
|
|
[3] Causal Coupled Mechanisms: A Control Method with Cooperation and Competition for Complex System
Xuehui Yu, Jingchi Jiang, Xinmiao Yu, Yi Guan*, Xue Li
2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM).
Keywords: hierarchical RL, transfer learning, skill composition, RL generalisation.
💡 Use inductive biases to encourage or ensure that the model does not rely on features we expect to change: the policy should rely only on features that behave similarly in both the training and testing environments.
Paper
|
|
[4] PercolationDF: A percolation-based medical diagnosis framework
Jingchi Jiang, Xuehui Yu, Yi Lin, Yi Guan
Mathematical Biosciences and Engineering, 2022, 19(6): 5832-5849.
Keywords: medical diagnosis, knowledge representation.
A dynamic model based on cascading theory, modelling the physiological domino effect in environment dynamics and increasing the similarity between training and testing for generalisation on class-imbalanced datasets.
Paper
|
|
[5] DECAF: An Interpretable Deep Cascading Framework for ICU Mortality Prediction
Jingchi Jiang, Xuehui Yu, Boran Wang, Linjiang Ma, Yi Guan
Artificial Intelligence in Medicine (2022): 102437.
Keywords: graph attention networks, spatio-temporal forecasting, interpretability, mortality prediction.
A dynamic model based on cascading theory, modelling the physiological domino effect in environment dynamics and increasing the similarity between training and testing for generalisation on class-imbalanced datasets.
Paper
|
|
[6] Contextual Policy Transfer in Meta-Reinforcement Learning via Active Learning
Jingchi Jiang, Lian Yan, Xuehui Yu and Yi Guan
19th International Conference on Web Information Systems and Applications.
Keywords: meta-RL, active learning, fast online adaptation, RL generalisation.
Paper
|
|
[7] Unified Fine-Grained Biomedical Entity Recognition as a Combination of Boundary Detection and Sequence Generation
Xue Li, Yang Yang, Mingchen Ye, Yi Guan, Xuehui Yu, and Jingchi Jiang
2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM).
Paper
|
|
[8] An interactive food recommendation system using reinforcement learning
Liangliang Liu, Yi Guan, Zi Wang, Rujia Shen, Guowei Zheng, Xuelian Fu, Xuehui Yu, Jingchi Jiang
Expert Systems With Applications, IF: 8.665.
Keywords: food recommender systems, RL, collaborative filtering, cross attention, state representation.
Paper | WI Healthcare APP
|
|
[a] Causal Prompting Model-based Offline Reinforcement Learning
Xuehui Yu, Yi Guan, Rujia Shen, Chen Tang, Jingchi Jiang*
Keywords: differentiable and scalable simulators, model-based offline RL, causal RL, RL generalisation.
💡 A simulation good enough to give a useful evaluation signal may be much easier to build than a full digital clone for training.
📐 Building better simulators doesn’t necessarily mean they must be entirely realistic; the key is to focus on the sim-to-real gap, and differentials have a smaller sim-to-real gap.
💥 VirtualPatient is designed to be fully compatible with differentiable simulation.
Paper
|
|
[b] KaDGT: How to Survive in Online Personalisation with Highly Low-quality Offline Datasets
Xuehui Yu, Rujia Shen, Yanming Li, Chen Tang, Yi Guan*
Keywords: real-to-sim, offline RL, decision transformer, trajectory stitching, RL generalisation.
❓ Although Decision Transformer (DT) purports to generate an optimal trajectory, empirical evidence suggests it struggles with trajectory stitching.
💡 A general principle of DT-based algorithms is that decisions should be based on as much information as possible.
|
|