Cancer is a major health burden and the cause of ~50,000 deaths per year in Australia. One in four Australians will receive a cancer diagnosis during their lifetime. Immunotherapy has emerged as a promising treatment for some patients but not all. An area of critical need is the ability to predict who will benefit from immunotherapy
Cancer is a major health burden and the cause of ~50,000 deaths per year in Australia. One in four Australians will receive a cancer diagnosis during their lifetime. Immunotherapy has emerged as a promising treatment for some patients but not all. An area of critical need is the ability to predict who will benefit from immunotherapy prior to treatment. To date, one of the genomic features used to identify therapy response is tumour mutational burden (TMB). However, its predictive accuracy (~60%-80%) is inadequate and TMB often fails to translate across different cancer types.
In this project, we utilised publicly available genomic and transcriptomic datasets (limited to melanoma data) to develop a machine learning model predicting immunotherapy response and to test generalisability of this model to lung cancer we teamed up with our partners, Metro North Hospital Health Service, QIMR Berghofer and BGI. There, we prospectively recruited lung cancer patients from 8 hospitals within Australia. We generated genomic and transcriptomic data from these patients to tested our model on that dataset.
The final machine learning (ML) model, integrated information from the genome (somatic mutations and TMB) and the transcriptome data (differential expression) to classify patients into responders and non-responders to immunotherapy. Within melanoma we achieved accuracy above 92% on a validation dataset and above 70% on a totally independent/unseen test set.
Key Findings:
DISCLAIMER: This work is unpublished and was supported by CRCP grant and led by Dr Maciej Trzaskowski when he acted as Head of Research at Max Kelsen.
Cancer of Unknown Primary is a heterogenous group of unrelated cancers whose primary site escapes detection. By definition, CUP is a diagnosis of exclusion after performing a minimum standardised diagnostic work-up.
CUP accounts for 3-5% of all malignancies worldwide3 and prognosis remains poor. Patients with CUP have a median overall su
Cancer of Unknown Primary is a heterogenous group of unrelated cancers whose primary site escapes detection. By definition, CUP is a diagnosis of exclusion after performing a minimum standardised diagnostic work-up.
CUP accounts for 3-5% of all malignancies worldwide3 and prognosis remains poor. Patients with CUP have a median overall survival (OS) of 8-11 months. In Australia, CUP constitutes 1.6% of all cancers and its incidence has significantly decreased between 1982 and 2019 (from 16 to 8.1 per 100,000 persons), however, like the global statistics, the prognosis is poor. The 5-year survival of a patient with CUP remains low at 13%.
Based on our previous work with transparency and numerical safety of neural networks [1,2], we have devised and tested AI-based differential expression algorithms that have the potential to deliver fast, reliable and accurate predictors of the primary site. The aim was to help pathologist and clinical oncologist identify where in the body the tumour originated as early and as fast as possible. Knowing this could initiate early and/or better treatment.
DISCLAIMER: This work is unpublished and was led by Dr Maciej Trzaskowski when he acted as Head of Research at Max Kelsen.
Uncertainty estimation is crucial for understanding the reliability of deep learning (DL) predictions, and critical for deploying DL in the clinic. Differences between training and production datasets can lead to incorrect predictions with underestimated uncertainty. To investigate this pitfall, we benchmarked one pointwise and three appr
Uncertainty estimation is crucial for understanding the reliability of deep learning (DL) predictions, and critical for deploying DL in the clinic. Differences between training and production datasets can lead to incorrect predictions with underestimated uncertainty. To investigate this pitfall, we benchmarked one pointwise and three approximate Bayesian DL models for predicting cancer of unknown primary, using three RNA-seq datasets with 10,968 samples across 57 cancer types. Our results highlight that simple and scalable Bayesian DL significantly improves the generalisation of uncertainty estimation. Moreover, we designed a prototypical metric—the area between development and production curve (ADP), which evaluates the accuracy loss when deploying models from development to production. Using ADP, we demonstrate that Bayesian DL improves accuracy under data distributional shifts when utilising ‘uncertainty thresholding’. In summary, Bayesian DL is a promising approach for generalising uncertainty, improving performance, transparency, and safety of DL models for deployment in the real world.
Our study highlighted approaches for quantifying and improving robustness to shift-induced overconfidence with simple and accessible DL methods in the context of oncology. We justified our approach with mathematical and empirical evidence, biological interpretation, and a new metric, the ADP designed to encapsulate shift-induced overconfidence—a crucial aspect that needs to be considered when deploying DL in real-world production. Moreover, the ADP is directly interpretable as a proxy to expected accuracy loss when deploying DL models from development to production. Although we have addressed the shift-induced overconfidence by utilising first-line solutions, work remains to bridge DL from theory to practice. We must account for data distributions, evaluation metrics, and modelling assumptions as all are equally important and necessary considerations to see safe translation of DL into clinical practice.
DISCLAIMER: This work was published in Scientific Reports and was led by Dr Maciej Trzaskowski when he acted as Head of Research at Max Kelsen.
We use cookies to analyze website traffic and optimize your website experience. By accepting our use of cookies, your data will be aggregated with all other user data.