PUBH5006: Assessment 3: ML deployment in Shiny
Introduction:
In this assessment you are required to develop a risk prediction model for AIE_outcome and deploy it in a Shiny app. You will submit your work to me by copying and pasting two code files into the space provided on the following pages. The files are:
Model development in R: all data formatting and pre-processing, the model development and validation code, and the code in which you save your model object (full details are given on p. 2). Remember to set a seed!
The Shiny app code (the full app code so I can run it myself)
This assessment is less structured than previous assessments. You are free to develop any of the three models from assessment 2 (regularised regression, XGBoost or neural networks), and you can even use the model you selected in your assessment 2 (but you still need to give me all the code again, and you may want to improve upon your original model).
However, you must follow some instructions, including:
You must provide me with ALL your code so that I can reproduce your model and Shiny app. This includes ALL code to reformat your data, all code to develop the model, and all Shiny app code.
The model must be developed and tested in R, then saved as an .rds file and loaded into Shiny.
You must include code to remove 10 patients from your data after pre-processing but before splitting into train/test, so that I can load these patients into your Shiny app (exactly as done in week 10).
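As a rough illustration of the hold-out step described above (not a required approach), the sketch below removes 10 patients before the train/test split. The simulated data frame `dat`, its columns other than AIE_outcome, the 70/30 split, the file name `deploy_patients.csv`, and the choice to drop the outcome column from the saved file are all assumptions made for the example; substitute your own pre-processed data.

```r
# Sketch only: `dat` stands in for your fully pre-processed data frame.
set.seed(5006)  # fix the seed so the same patients are drawn on every run

# Hypothetical stand-in data; replace with your own pre-processed data.
dat <- data.frame(
  age         = round(rnorm(200, mean = 60, sd = 12)),
  sex         = factor(sample(c("F", "M"), 200, replace = TRUE)),
  AIE_outcome = rbinom(200, size = 1, prob = 0.3)
)

# 1. Remove 10 patients for deployment BEFORE any train/test split.
holdout_idx <- sample(nrow(dat), size = 10)
deploy_pts  <- dat[holdout_idx, ]
model_dat   <- dat[-holdout_idx, ]

# 2. Split the remaining patients into train and test sets (70/30 here).
train_idx <- sample(nrow(model_dat), size = floor(0.7 * nrow(model_dat)))
train     <- model_dat[train_idx, ]
test      <- model_dat[-train_idx, ]

# 3. Save the held-out patients for loading into the Shiny app.
#    Dropping the outcome column is an assumption; keep it if you need it.
write.csv(deploy_pts[, setdiff(names(deploy_pts), "AIE_outcome")],
          "deploy_patients.csv", row.names = FALSE)
```

Because the seed is set before any sampling, rerunning the script reproduces the same 10 held-out patients and the same train/test split.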
The assessment is worth 40% of your overall mark and includes 40 marks in total, with marks allocated as shown above the answer box in each of the two sections.
Important note on model explainability: Your Shiny app needs to offer an explanation for why the patient is at high/low risk. If using XGBoost, this is easy and you can use the SHAP values as demonstrated in the week 10 exercise. However, if using a neural network or regularised regression, you will need to take a different approach. Some ideas include:
Regularised regression: you could show the top 5 or 10 strongest coefficients alongside the current patient's values (this could show, for example, that older age increases the risk of the outcome and that this patient is older).
Neural networks: derive the permutation feature importance of all variables (this could be a lot of work) and simply list the 5 or 10 most important.
Or for any of the models you could explore alternative approaches available in other packages not shown in this unit.
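The two ideas above might be sketched in base R as follows. This is an illustration under stated assumptions, not a prescribed method: the data are simulated, the variable names are hypothetical, and a plain `glm()` stands in for a regularised regression or neural network. The helper names `explain_patient`, `auc` and `perm_importance` are made up for the example.

```r
# Sketch only: simulated data and a plain glm() stand in for your real model.
set.seed(5006)
n <- 300
# Hypothetical, already-standardised predictors (mean 0, SD 1).
X <- data.frame(age = rnorm(n), bmi = rnorm(n), sbp = rnorm(n),
                hb = rnorm(n), creat = rnorm(n), wcc = rnorm(n))
y   <- rbinom(n, 1, plogis(-1 + 0.9 * X$age - 0.6 * X$hb + 0.3 * X$creat))
fit <- glm(y ~ ., data = cbind(X, y = y), family = binomial)

# Idea 1 (regression): strongest coefficients next to this patient's values.
# With a cv.glmnet fit, the coefficients would come from
# coef(fit, s = "lambda.min") rather than coef(fit).
betas <- coef(fit)[-1]                                   # drop the intercept
top   <- names(sort(abs(betas), decreasing = TRUE))[1:5]

explain_patient <- function(patient, betas, top) {
  vals <- unlist(patient[1, top])
  out <- data.frame(
    variable      = top,
    coefficient   = unname(betas[top]),
    patient_value = unname(vals),
    contribution  = unname(betas[top] * vals)   # on the log-odds scale
  )
  out$direction <- ifelse(out$contribution > 0, "raises risk", "lowers risk")
  out
}
explain_patient(X[1, ], betas, top)

# Idea 2 (any model): permutation importance, measured as the drop in AUC
# after shuffling one variable at a time. `predict_fun` is any function
# that takes a data frame and returns predicted probabilities.
auc <- function(y, p) {
  n1 <- sum(y == 1); n0 <- sum(y == 0)
  (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}
perm_importance <- function(predict_fun, X, y, n_rep = 5) {
  base_auc <- auc(y, predict_fun(X))
  drops <- sapply(names(X), function(v) {
    mean(replicate(n_rep, {
      Xp <- X
      Xp[[v]] <- sample(Xp[[v]])          # break the link between v and y
      base_auc - auc(y, predict_fun(Xp))
    }))
  })
  sort(drops, decreasing = TRUE)
}
imp <- perm_importance(function(d) predict(fit, d, type = "response"), X, y)
head(imp, 5)
```

The contribution column is only directly comparable across variables when the predictors are on a common scale, which is why the example assumes standardised inputs. Permutation importance is usually better estimated on held-out data than on the data used for fitting.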
I designed this assessment so that you have some control over how challenging you make it for yourself. If you're not feeling it, you can pretty much replicate the week 11 app with the new outcome; however, there are some marks for novelty, which you won't get if you do this.
This assessment was made available at 4 pm Friday 13th October and is due in two weeks, by 4 pm Friday 27th October.
Type student name and date here:______________________________________________
Section 1: Model development (total of 18 marks)
Mark allocation:
3 marks: Proper pre-processing steps.
2 marks: Removing 10 patients to be used during deployment prior to the train/test split.
3 marks: Developing a model that is not over-fit (roughly, this means the train and test set AUCs differ by no more than 5 percentage points, i.e. 0.05).
3 marks: Checking the model calibration using a method shown in class/lecture and commenting on the calibration over the range of predicted probabilities.
3 marks: Applying a re-calibration method, deciding whether the original or the re-calibrated probabilities are better, and then using the ones you believe are better.
2 marks: Identifying a cut-off value for the predicted probability so that you can base a recommendation on it.
2 marks: Saving your model as an .rds file for loading into Shiny.
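One possible shape for the validation steps listed above is sketched below in base R. Everything specific here is an assumption for illustration: the simulated data, the `glm()` stand-in model, the separate calibration set, the decile calibration table, logistic (Platt-style) re-calibration, the Brier score comparison, Youden's J for the cut-off, and the file name `final_model.rds`. The methods shown in class may differ, and those should take priority.

```r
# Sketch only: simulated data and glm() stand in for your real data and model.
set.seed(5006)
n   <- 1000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))   # hypothetical predictors
dat$AIE_outcome <- rbinom(n, 1, plogis(-1 + dat$x1 + 0.5 * dat$x2))

# Three-way split; the separate calibration set is an assumption.
grp   <- sample(rep(c("train", "calib", "test"), times = c(600, 200, 200)))
train <- dat[grp == "train", ]
calib <- dat[grp == "calib", ]
test  <- dat[grp == "test", ]

fit  <- glm(AIE_outcome ~ x1 + x2, data = train, family = binomial)
pred <- function(d) as.numeric(predict(fit, newdata = d, type = "response"))

# Over-fitting check: gap between train and test AUC (rank-based AUC).
auc <- function(y, p) {
  n1 <- sum(y == 1); n0 <- sum(y == 0)
  (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}
auc_gap <- auc(train$AIE_outcome, pred(train)) - auc(test$AIE_outcome, pred(test))

# Calibration check: mean predicted vs observed risk within deciles.
calib_table <- function(y, p, groups = 10) {
  brks <- unique(quantile(p, probs = seq(0, 1, length.out = groups + 1)))
  bin  <- cut(p, breaks = brks, include.lowest = TRUE, labels = FALSE)
  data.frame(mean_predicted = as.numeric(tapply(p, bin, mean)),
             observed_rate  = as.numeric(tapply(y, bin, mean)))
}
calib_table(test$AIE_outcome, pred(test))

# Re-calibration (logistic, Platt-style): regress the outcome on the logit
# of the original prediction, using the calibration set.
recal <- glm(y ~ lp, family = binomial,
             data = data.frame(y = calib$AIE_outcome, lp = qlogis(pred(calib))))
recal_pred <- function(d) {
  as.numeric(predict(recal, data.frame(lp = qlogis(pred(d))), type = "response"))
}

# Compare with the Brier score (lower is better) and keep the better version.
brier      <- function(y, p) mean((p - y)^2)
use_recal  <- brier(test$AIE_outcome, recal_pred(test)) <
              brier(test$AIE_outcome, pred(test))
final_pred <- function(d) if (use_recal) recal_pred(d) else pred(d)

# Cut-off: maximise Youden's J (sensitivity + specificity - 1).
p_cut   <- final_pred(calib)
cutoffs <- seq(0.05, 0.95, by = 0.01)
youden  <- sapply(cutoffs, function(k) {
  mean(p_cut[calib$AIE_outcome == 1] >= k) +
    mean(p_cut[calib$AIE_outcome == 0] < k) - 1
})
best_cut <- cutoffs[which.max(youden)]

# Save the model for Shiny; the re-calibration model and cut-off can be
# saved the same way if the app needs them.
saveRDS(fit, "final_model.rds")
```

Youden's J weights sensitivity and specificity equally, which may not suit every clinical setting; a cut-off justified by the consequences of false positives and false negatives is an equally reasonable choice.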
Section 2: Shiny app (total of 22 marks)
Mark allocation:
5 marks: The basic app working (loads the ML model, reads in the patient .csv, produces predicted probabilities, and displays the predicted probabilities to the user).
5 marks: Including an explanation of why the patient is at increased/decreased risk (if using a neural network or regularised regression, part of this will be developed in your R code, and I'll give these marks here for what you do there; read the note on model explainability in the introduction for further details).
2 marks: Including a recommendation of whether the patient is at high or low risk.
5 marks: Overall presentation and information (e.g., title, explanation of the app and outputs, no error messages shown to the user, and other aesthetic elements).
5 marks: Creativity and difficulty (trying and succeeding in doing something different or difficult, and not simply replicating the app made in week 10).
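For orientation only, a bare-bones version of the "basic app" and recommendation items above might look like the sketch below. It is deliberately minimal and omits the explanation component. It assumes the saved model has a `predict()` method that accepts a data frame and `type = "response"` (as `glm` does; XGBoost and neural network objects need different prediction calls). The file name `final_model.rds`, the cut-off of 0.30 and the helper `score_patients` are hypothetical.

```r
library(shiny)

# Hypothetical values; match these to your own model development script.
model_path <- "final_model.rds"
cutoff     <- 0.30   # replace with the cut-off chosen in Section 1

# Plain helper so the prediction logic can be checked outside Shiny.
# Assumes predict(model, newdata, type = "response") returns probabilities.
score_patients <- function(model, patients, cutoff) {
  prob <- as.numeric(predict(model, newdata = patients, type = "response"))
  data.frame(
    patient               = seq_len(nrow(patients)),
    predicted_probability = round(prob, 3),
    recommendation        = ifelse(prob >= cutoff, "High risk", "Low risk")
  )
}

ui <- fluidPage(
  titlePanel("AIE_outcome risk prediction"),
  sidebarLayout(
    sidebarPanel(
      fileInput("patient_file", "Upload patient .csv", accept = ".csv")
    ),
    mainPanel(
      tableOutput("risk_table")
    )
  )
)

server <- function(input, output, session) {
  model <- readRDS(model_path)   # read when a session starts

  scored <- reactive({
    req(input$patient_file)      # wait quietly until a file is uploaded
    patients <- read.csv(input$patient_file$datapath)
    # Show a readable message instead of a raw error for an empty file.
    validate(need(nrow(patients) > 0, "The uploaded file has no rows."))
    score_patients(model, patients, cutoff)
  })

  output$risk_table <- renderTable(scored())
}

app <- shinyApp(ui = ui, server = server)
# In an interactive session, launch with: runApp(app)
```

Using `req()` and `validate(need(...))` keeps raw error messages off the screen before a file is uploaded or when an upload is unusable, which matters for the presentation marks.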