Projects

Immunotherapy Outcome Prediction

Generalising Uncertainty in Machine Learning

Immunotherapy Outcome Prediction

Cancer is a major health burden and the cause of ~50,000 deaths per year in Australia. One in four Australians will receive a cancer diagnosis during their lifetime. Immunotherapy has emerged as a promising treatment for some patients but not all. An area of critical need is the ability to predict who will benefit from immunotherapy prior to treatment. To date, one of the genomic features used to identify therapy response is tumour mutational burden (TMB). However, its predictive accuracy (~60%-80%) is inadequate and TMB often fails to translate across different cancer types.

In this project, we utilised publicly available genomic and transcriptomic datasets (limited to melanoma data) to develop a machine learning model predicting immunotherapy response and to test generalisability of this model to lung cancer we teamed up with our partners, Metro North Hospital Health Service, QIMR Berghofer and BGI. There, we prospectively recruited lung cancer patients from 8 hospitals within Australia. We generated genomic and transcriptomic data from these patients to tested our model on that dataset.

The final machine learning (ML) model, integrated information from the genome (somatic mutations and TMB) and the transcriptome data (differential expression) to classify patients into responders and non-responders to immunotherapy. Within melanoma we achieved accuracy above 92% on a validation dataset and above 70% on a totally independent/unseen test set.

Key Findings:

Analysing different types of genetic data together, such as DNA and RNA, can provide more reliable results than when using DNA or RNA alone
We observed issues with predicting immunotherapy response, using a single value as a total aggregation of cancer mutations (otherwise known as Tumour Mutational Burden). It is argued that cancers with a high number of such mutations respond to immunotherapy better, because cancer is less able to "hide" itself from our immune system. However, we observed that in two separate populations, diagnosed with the same cancer, non-responders had a higher total number of such mutations compared to responders in another, which poses a serious problem to the TMB hypothesis. Specifically, because we show that it could be mutations in specific families of genes that matter rather than just the overall number of mutations.
We found that genes whose expression levels change in immunotherapy non-responders are different from the genes that are characterised by mutations in coding regions
Expression level changes seemed to be related to biological processes that dealt with cell’s distress signals, which directly impacted activation or inactivation of the immune system. Whereas, the changes caused by coding mutations were mostly linked to cell wall structures, cell ability to migrate and the external environment surrounding the cells.

DISCLAIMER: This work is unpublished and was supported by CRCP grant and led by Dr Maciej Trzaskowski when he acted as Head of Research at Max Kelsen.

Show Less

Cancer of Unknown Primary

Generalising Uncertainty in Machine Learning

Immunotherapy Outcome Prediction

Cancer of Unknown Primary is a heterogenous group of unrelated cancers whose primary site escapes detection. By definition, CUP is a diagnosis of exclusion after performing a minimum standardised diagnostic work-up.

CUP accounts for 3-5% of all malignancies worldwide3 and prognosis remains poor. Patients with CUP have a median overall su

CUP accounts for 3-5% of all malignancies worldwide3 and prognosis remains poor. Patients with CUP have a median overall survival (OS) of 8-11 months. In Australia, CUP constitutes 1.6% of all cancers and its incidence has significantly decreased between 1982 and 2019 (from 16 to 8.1 per 100,000 persons), however, like the global statistics, the prognosis is poor. The 5-year survival of a patient with CUP remains low at 13%.

Based on our previous work with transparency and numerical safety of neural networks [1,2], we have devised and tested AI-based differential expression algorithms that have the potential to deliver fast, reliable and accurate predictors of the primary site. The aim was to help pathologist and clinical oncologist identify where in the body the tumour originated as early and as fast as possible. Knowing this could initiate early and/or better treatment.

DISCLAIMER: This work is unpublished and was led by Dr Maciej Trzaskowski when he acted as Head of Research at Max Kelsen.

Show Less

Generalising Uncertainty in Machine Learning

Uncertainty estimation is crucial for understanding the reliability of deep learning (DL) predictions, and critical for deploying DL in the clinic. Differences between training and production datasets can lead to incorrect predictions with underestimated uncertainty. To investigate this pitfall, we benchmarked one pointwise and three approximate Bayesian DL models for predicting cancer of unknown primary, using three RNA-seq datasets with 10,968 samples across 57 cancer types. Our results highlight that simple and scalable Bayesian DL significantly improves the generalisation of uncertainty estimation. Moreover, we designed a prototypical metric—the area between development and production curve (ADP), which evaluates the accuracy loss when deploying models from development to production. Using ADP, we demonstrate that Bayesian DL improves accuracy under data distributional shifts when utilising ‘uncertainty thresholding’. In summary, Bayesian DL is a promising approach for generalising uncertainty, improving performance, transparency, and safety of DL models for deployment in the real world.

Our study highlighted approaches for quantifying and improving robustness to shift-induced overconfidence with simple and accessible DL methods in the context of oncology. We justified our approach with mathematical and empirical evidence, biological interpretation, and a new metric, the ADP designed to encapsulate shift-induced overconfidence—a crucial aspect that needs to be considered when deploying DL in real-world production. Moreover, the ADP is directly interpretable as a proxy to expected accuracy loss when deploying DL models from development to production. Although we have addressed the shift-induced overconfidence by utilising first-line solutions, work remains to bridge DL from theory to practice. We must account for data distributions, evaluation metrics, and modelling assumptions as all are equally important and necessary considerations to see safe translation of DL into clinical practice.

DISCLAIMER: This work was published in Scientific Reports and was led by Dr Maciej Trzaskowski when he acted as Head of Research at Max Kelsen.

Show Less

See Publication

Projects

Immunotherapy Outcome Prediction

Generalising Uncertainty in Machine Learning

Immunotherapy Outcome Prediction

Cancer of Unknown Primary

Generalising Uncertainty in Machine Learning

Immunotherapy Outcome Prediction

Generalising Uncertainty in Machine Learning

Generalising Uncertainty in Machine Learning

Generalising Uncertainty in Machine Learning

This website uses cookies.