Our Work

This page presents our work on AI for Health and AI for Science.

We collaborate with biologists, chemists, and clinicians on highly interdisciplinary research.

Active Research Directions:

AI for Surgical Data Science

AI for Robotic Surgery



Automatic Microsurgical Skill Assessment Based on
Cross-Domain Transfer Learning


The assessment of microsurgical skills for Robot-Assisted Microsurgery (RAMS) still relies primarily on subjective observations and expert opinions, so a general and automated evaluation method is desirable. Deep neural networks can assess skill from raw kinematic data, with the advantages of being objective and efficient. However, a major issue with deep learning for the analysis of surgical skills is that it requires a large database to train the desired model, and the training process can be time-consuming. This paper presents a transfer learning scheme for training a model with limited RAMS datasets for microsurgical skill assessment.

An in-house Microsurgical Robot Research Platform Database (MRRPD) is built with data collected from a microsurgical robot research platform (MRRP) and is used to verify the proposed cross-domain transfer learning for RAMS skill-level assessment. The model is fine-tuned after being trained with data obtained from the MRRP. Moreover, microsurgical tool tracking is developed to provide visual feedback, while task-specific metrics and other general evaluation metrics are provided to the operator as a reference. The proposed method has been shown to offer the potential to guide the operator towards a higher level of skill in microsurgical operation.
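The core idea of the transfer scheme above — pre-train where data is plentiful, then adapt with a small target dataset — can be sketched as follows. This is a minimal, hypothetical illustration in NumPy, not the paper's network: the "feature extractor" is a fixed random projection standing in for layers learned on the larger source data, and all shapes, class counts, and data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_head(features, labels, n_classes, epochs=200, lr=0.1):
    """Train a softmax classifier head on frozen features (gradient descent)."""
    n, d = features.shape
    W = np.zeros((d, n_classes))
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = features @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        W -= lr * features.T @ (p - onehot) / n     # cross-entropy gradient
    return W

# Frozen "feature extractor": a stand-in for layers pre-trained on the
# large source-domain (MRRP-like) dataset.
F = rng.normal(size=(16, 8))

def extract(X):
    return np.tanh(X @ F)

# Small target-domain dataset (RAMS-like), here 30 samples, 3 skill levels.
Xt = rng.normal(size=(30, 16))
yt = rng.integers(0, 3, size=30)

# Transfer step: keep the extractor frozen, fine-tune only the head.
W_head = train_head(extract(Xt), yt, n_classes=3)
acc = ((extract(Xt) @ W_head).argmax(axis=1) == yt).mean()
print(f"training accuracy on the small target set: {acc:.2f}")
```

Freezing the extractor is what makes training feasible with limited data: only the small head matrix is learned on the target set.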

Full paper link:

Dandan Zhang; Zicong Wu; Junhong Chen; Anzhu Gao; Xu Chen; Peichao Li;
Zhaoyang Wang; Guitao Yang; Benny Lo; Guang-Zhong Yang


Surgical Gesture Recognition Based on Bidirectional
Multi-Layer Independently RNN with Explainable Spatial Feature Extraction

Dandan Zhang; Ruoxi Wang; Benny Lo


In this work, we aim to develop an effective surgical gesture recognition approach with an explainable feature extraction process. A Bidirectional Multi-Layer Independently RNN (BML-indRNN) model is proposed in this paper, while spatial feature extraction is implemented via fine-tuning of a Deep Convolutional Neural Network (DCNN) model constructed based on the VGG architecture. To eliminate the black-box effects of the DCNN, Gradient-weighted Class Activation Mapping (Grad-CAM) is employed: it provides explainable results by highlighting the regions of the surgical images that have a strong relationship with the surgical gesture classification results.
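The Grad-CAM step referred to above has a simple closed form: pool the gradients of the class score over each feature-map channel to get channel weights, take the weighted sum of the feature maps, and apply a ReLU. The sketch below illustrates just that computation on toy arrays; the activation and gradient tensors are random stand-ins, not outputs of the actual VGG-based model.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap from conv feature maps and class-score gradients.

    feature_maps: (K, H, W) activations of the chosen conv layer.
    gradients:    (K, H, W) d(class score)/d(activations).
    """
    # Channel weights: global-average-pool the gradients.
    alpha = gradients.mean(axis=(1, 2))                      # shape (K,)
    # Weighted sum over channels, then ReLU to keep positive evidence.
    cam = np.maximum(np.tensordot(alpha, feature_maps, axes=1), 0.0)
    # Normalise to [0, 1] for overlaying on the input image.
    if cam.max() > 0:
        cam /= cam.max()
    return cam

rng = np.random.default_rng(1)
A = rng.random((64, 14, 14))         # toy conv activations (e.g. a VGG block)
dA = rng.normal(size=(64, 14, 14))   # toy gradients of one class score
heatmap = grad_cam(A, dA)
print(heatmap.shape)
```

In practice the heatmap is upsampled to the input resolution and overlaid on the surgical image, so the regions driving a gesture prediction become visible.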

The proposed method was evaluated on the suturing task with data obtained from the publicly available JIGSAWS database. Comparative studies were conducted to verify the proposed framework. Results indicated that the testing accuracy for the suturing task based on our proposed method is 87.13%, which outperforms most state-of-the-art algorithms.

Full paper link:


Real-time Surgical Environment Enhancement for
Robot-Assisted Minimally Invasive Surgery Based on  Super-Resolution


In Robot-Assisted Minimally Invasive Surgery (RAMIS), a camera assistant is normally required to control the position and the zooming ratio of the laparoscope, following the surgeon's instructions. However, moving the laparoscope frequently may lead to unstable and suboptimal views, while adjusting the zooming ratio may interrupt the workflow of the surgical operation. To this end, we propose a multi-scale Generative Adversarial Network (GAN)-based video super-resolution method to construct a framework for automatic zooming ratio adjustment. It can provide automatic real-time zooming for high-quality visualization of the Region of Interest (ROI) during the surgical operation. In the pipeline of the framework, the Kernel Correlation Filter (KCF) tracker is used for tracking the tips of the surgical tools, while Semi-Global Block Matching (SGBM)-based depth estimation and Recurrent Neural Network (RNN)-based context-awareness are employed to determine the upscaling ratio for zooming. The framework is validated with the JIGSAWS dataset and the Hamlyn Centre Laparoscopic/Endoscopic Video Datasets, with results demonstrating its practicality.
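To make the pipeline concrete, the sketch below shows only the zoom-decision step: from tracked tool-tip positions and an estimated depth, pick an ROI and an upscaling ratio. It is a hypothetical illustration — the KCF tracker, SGBM depth estimation, RNN context-awareness, and GAN super-resolution are all outside its scope, and every threshold and constant here is invented for the example.

```python
import numpy as np

def zoom_decision(tool_tips, depth_mm, frame_hw, base_roi=96,
                  min_ratio=1.0, max_ratio=4.0):
    """Pick an ROI around the tracked tool tips and an upscaling ratio.

    tool_tips: list of (x, y) pixel positions from the tracker.
    depth_mm:  estimated tip-to-camera depth; farther tools get more zoom.
    frame_hw:  (height, width) of the laparoscope frame.
    """
    tips = np.asarray(tool_tips, dtype=float)
    cx, cy = tips.mean(axis=0)                 # centre the ROI between tips
    spread = np.ptp(tips, axis=0).max()        # how far apart the tips are
    roi = int(max(base_roi, spread * 1.5))     # ROI must cover both tips
    # Illustrative heuristic: ratio grows with depth, shrinks as the ROI
    # grows, and is clamped to what the super-resolution model supports.
    ratio = np.clip(depth_mm / 50.0 * frame_hw[0] / roi, min_ratio, max_ratio)
    return (int(cx), int(cy), roi), float(ratio)

roi, ratio = zoom_decision([(300, 220), (360, 260)],
                           depth_mm=80, frame_hw=(480, 640))
print(roi, ratio)   # -> (330, 240, 96) 4.0
```

The chosen ratio would then select which scale of the multi-scale super-resolution model upsamples the ROI for display.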

Full paper link:

Ruoxi Wang; Dandan Zhang; Qingbiao Li; Xiao-Yun Zhou; Benny Lo
