by Alexandre Lädermann (Hôpital de La Tour, Meyrin, Switzerland), Philippe Collin (American Hospital of Paris, France) and Patrick J. Denard (Oregon Shoulder Institute, Medford, Oregon, USA)

Surgery is generally indicated when conservative treatment fails. Previous studies have reported that around 20% of patients do not improve sufficiently after surgery, causing frustration, high societal costs, and an overload of healthcare systems. The consortium members investigated how well machine learning methods can predict outcomes. They obtained a promising model that, with specificity held above 95%, identified 32% of the patients who were inappropriate candidates for an operation (a recall of 0.32).

Healthcare systems face several challenges linked to the ageing of the population and an increase in the prevalence of chronic conditions. Among these conditions, debilitating musculoskeletal pathologies are widespread. They severely restrict daily activities and work capabilities, and have led to a dramatic increase in the number of patients admitted to hospitals for surgery over recent decades. These musculoskeletal pathologies carry a high societal cost and an occupational burden for workers; however, there is little overall evidence behind surgical indications. Consequently, around one-tenth to one-fifth of these patients will not be treated in the setting of the healthcare continuum that is appropriate to their medical condition.[1] Some patients could have avoided surgery and been referred directly to outpatient care services such as physiotherapy, which would improve care pathways rather than compromise the use of resources.

To address these problems, our consortium developed an innovative decision-making tool that could improve clinical practice guidelines and help communicate the expected result of a proposed surgical treatment as an essential component of informed consent. Machine learning (ML) is a field that focuses on the learning aspect of artificial intelligence (AI) by developing algorithms that best represent a set of data.[2] Within our consortium, the collaborating partners are experienced in applying value-based health care (VBHC) strategies and skills.[3] Evaluating the value of health care is of paramount importance for continuing to improve patients' quality of life and optimising the associated costs. This study aimed to compare ML algorithms and determine the accuracy of AI in predicting clinical outcomes after rotator cuff repair. The hypothesis was that preoperative clinical and intraoperative data alone would be insufficient to provide proper guidelines.

We analysed prospectively collected data from patients undergoing rotator cuff repair between March 2013 and July 2020. In total, 9,030 cases were enrolled. After refinement of this selection, 4,683 subjects remained and were divided into training (80%) and testing (20%) sets. Different ML models were trained and tested using the 236 valid pre-treatment features. The models were also tested on a reduced set of only ten features, identified through Shapley Additive Explanations (SHAP) (Figure 1). The Single Assessment Numeric Evaluation (SANE) score at one year was the output variable. The minimum specificity was set at 95%.
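
A minimum specificity can be read as a constraint on the decision threshold applied to a model's predicted scores. The following Python sketch shows one way such a threshold could be chosen; it is an illustration under stated assumptions, not the consortium's implementation. The function name, the convention that label 1 marks a poor outcome, and the rule "flag a case when its score is at or above the threshold" are all assumptions introduced here.

```python
from typing import Sequence


def threshold_for_min_specificity(
    scores: Sequence[float],
    labels: Sequence[int],
    min_specificity: float = 0.95,
) -> float:
    """Return the lowest threshold whose specificity is >= min_specificity.

    Assumptions (not from the article): label 1 = poor outcome (positive
    class), label 0 = good outcome, and a case is flagged as positive when
    its score is >= the threshold.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not negatives:
        raise ValueError("at least one negative case is required")

    # Candidate thresholds: every observed score, plus +inf (flags nobody).
    candidates = sorted(set(scores)) + [float("inf")]

    # Specificity never decreases as the threshold rises, so the first
    # candidate that satisfies the constraint is the lowest admissible one,
    # and therefore the one that flags the most cases.
    for threshold in candidates:
        true_negatives = sum(1 for s in negatives if s < threshold)
        if true_negatives / len(negatives) >= min_specificity:
            return threshold

    return float("inf")  # unreachable: +inf always yields specificity 1.0
```

In a setting like this, the threshold would normally be chosen on training or validation data and then applied unchanged to the held-out test set, so that the reported test metrics are not biased by the choice.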

Figure 1: Precision-recall curves (left) and receiver operating characteristic (ROC) curves (right) for the three machine learning models: XGBoost, multi-layer perceptron (MLP) and support vector machine (SVM). In the ROC plot, the line shows the mean ROC curve and the shaded band shows the standard deviation across cross-validation.

The performance of the XGBShap10 model revealed a specificity of 0.951 (0.934–0.965), a precision of 0.587 (0.485–0.685), a recall of 0.32 (0.252–0.392), an accuracy of 0.838 (0.814–0.861), an F1 score of 0.413 (0.338–0.487) and an area under the receiver operating characteristic curve (AUC) of 0.667 (0.618–0.715).
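
For readers less familiar with these metrics, the sketch below shows how specificity, precision, recall, accuracy and F1 follow from the four cells of a confusion matrix. It is a minimal Python illustration, not the study's evaluation code: the function names are hypothetical, and the counts in the example are invented round numbers, not the study's confusion matrix.

```python
from typing import Dict


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard binary classification metrics from confusion counts.

    tp/fn: positive cases that were flagged / missed.
    tn/fp: negative cases that were left alone / wrongly flagged.
    """
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "specificity": _ratio(tn, tn + fp),
        "precision": precision,
        "recall": recall,
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts for illustration only (NOT the study's data):
# 100 positive cases, 400 negative cases.
example = classification_metrics(tp=32, fp=20, tn=380, fn=68)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

When positive cases are a minority, as in this toy example, accuracy is dominated by the correctly classified negatives, which is why a model can combine high accuracy and specificity with a modest recall.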

With a mean accuracy of 84% and a specificity set above 95%, our pilot study showed that the model is not a simple heuristic. It could be integrated into existing healthcare information systems to help clinicians develop better and more reasonable treatment programmes and inform patients more adequately about expected results (empowerment), and it could save billions per year at a European level. We now aim to improve the accuracy and fairness of the tool, obtain certification, develop better guidelines and transfer this platform technology to other disciplines, such as knee (anterior cruciate ligament) and foot (Achilles tendon) surgery.

The authors would like to acknowledge the consortium partners Med4Cast (Martigny, Switzerland), Idiap Research Institute (Martigny, Switzerland), the insurer Groupe Mutuel (Martigny, Switzerland), the IDE4 foundation (Geneva, Switzerland), the Image Analysis research unit of the Université Libre de Bruxelles (LISA-IA, Belgium), Katalysen (Stockholm, Sweden), SCIPROM (Lausanne, Switzerland), and Nexialist (La Ciotat, France) for their valuable feedback in the design, implementation, visual-data acquisition, certification and annotation involved in this project. This project has received funding from the Fondation de Bienfaisance Pierre & Andrée Haas (Geneva, Switzerland), the insurer Groupe Mutuel (Martigny, Switzerland), and FORE (Foundation for Research and Teaching in Orthopedics, Sports Medicine, Trauma, and Imaging in the Musculoskeletal System), Grant #2023-13.

References:
[1] P. Collin et al., “Prospective evaluation of clinical and radiologic factors predicting return to activity within 6 months after arthroscopic rotator cuff repair,” Journal of Shoulder and Elbow Surgery, vol. 24, no. 3, pp. 439–445, Mar. 2015, doi: 10.1016/j.jse.2014.08.014.
[2] L. J. H. Allaart et al., “Developing a machine learning algorithm to predict probability of retear and functional outcomes in patients undergoing rotator cuff repair surgery: protocol for a retrospective, multicentre study,” BMJ Open, vol. 13, no. 2, p. e063673, Feb. 2023, doi: 10.1136/bmjopen-2022-063673.
[3] A. Lädermann et al., “Measuring patient value after total shoulder arthroplasty,” Journal of Clinical Medicine, vol. 10, no. 23, Dec. 2021, doi: 10.3390/jcm10235700.

Please contact:
Alexandre Lädermann, Division of Orthopaedics and Trauma Surgery, Hôpital de La Tour, Meyrin, Switzerland
