UCL Discovery Stage

Task-aware asynchronous multi-task model with class incremental contrastive learning for surgical scene understanding

Seenivasan, L; Islam, M; Xu, M; Lim, CM; Ren, H; (2023) Task-aware asynchronous multi-task model with class incremental contrastive learning for surgical scene understanding. International Journal of Computer Assisted Radiology and Surgery, 18, pp. 921-928. 10.1007/s11548-022-02800-2. Green open access

2211.15327.pdf - Accepted Version (645kB)

Abstract

PURPOSE: Surgical scene understanding, with tool-tissue interaction recognition and automatic report generation, can play an important role in intra-operative guidance, decision-making, and postoperative analysis in robotic surgery. However, domain shifts between different surgeries, arising from inter- and intra-patient variation and the appearance of novel instruments, degrade model prediction performance. Moreover, these tasks require output from multiple models, which can be computationally expensive and affect real-time performance.

METHODOLOGY: A multi-task learning (MTL) model is proposed for surgical report generation and tool-tissue interaction prediction that deals with domain shift. The model consists of a shared feature extractor, a mesh-transformer branch for captioning, and a graph attention branch for tool-tissue interaction prediction. The shared feature extractor employs class incremental contrastive learning to tackle intensity shift and the appearance of novel classes in the target domain. We design Laplacian of Gaussian-based curriculum learning into both the shared and task-specific branches to enhance model learning. We incorporate a task-aware asynchronous MTL optimization technique to fine-tune the shared weights and converge both tasks optimally.

RESULTS: The proposed MTL model, trained with the task-aware optimization and fine-tuning techniques, reported a balanced performance on both tasks in the target domain (BLEU score of 0.4049 for scene captioning and accuracy of 0.3508 for interaction detection) and performed on par with single-task models in domain adaptation.

CONCLUSION: The proposed multi-task model was able to adapt to domain shifts, incorporate novel instruments in the target domain, and perform tool-tissue interaction detection and report generation on par with single-task models.
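The abstract does not detail how the Laplacian of Gaussian (LoG) is used to drive curriculum learning, but the general idea can be sketched. Below is a minimal, illustrative example (not the paper's implementation) that scores each image by the magnitude of its LoG response and orders training samples by that score; the assumption that higher response (more visual structure) means "easier" is purely for illustration, and the names `log_kernel`, `log_difficulty`, and `curriculum_order` are hypothetical.

```python
import numpy as np

def log_kernel(size: int = 9, sigma: float = 1.4) -> np.ndarray:
    """Discrete Laplacian-of-Gaussian kernel of shape (size, size)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    k = -(1.0 / (np.pi * sigma**4)) * (1 - r2 / (2 * sigma**2)) \
        * np.exp(-r2 / (2 * sigma**2))
    # Zero-mean so a perfectly flat region yields zero response.
    return k - k.mean()

def log_difficulty(image: np.ndarray, size: int = 9, sigma: float = 1.4) -> float:
    """Mean absolute LoG response: a rough proxy for edge/detail density."""
    k = log_kernel(size, sigma)
    h, w = image.shape
    out = np.zeros((h - size + 1, w - size + 1))
    for i in range(out.shape[0]):          # naive valid-mode convolution
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + size, j:j + size] * k)
    return float(np.abs(out).mean())

def curriculum_order(images) -> list:
    """Indices sorted from high LoG response to low (assumed easy-to-hard)."""
    scores = [log_difficulty(img) for img in images]
    return sorted(range(len(images)), key=lambda i: -scores[i])
```

For example, a flat grey frame scores (near) zero while a high-contrast checkerboard scores high, so the checkerboard would be scheduled first under this assumed easy-to-hard ordering. In practice one would replace the naive convolution loop with `scipy.ndimage.gaussian_laplace` and tune `sigma` to the scale of the surgical instruments.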

Type: Article
Title: Task-aware asynchronous multi-task model with class incremental contrastive learning for surgical scene understanding
Location: Germany
Open access status: An open access version is available from UCL Discovery
DOI: 10.1007/s11548-022-02800-2
Publisher version: https://doi.org/10.1007/s11548-022-02800-2
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
Keywords: Curriculum learning, Domain generalization, Scene graph, Surgical scene understanding
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Med Phys and Biomedical Eng
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10164003
