MA5832 Capstone Report
- Subject Code: MA5832
- Country: Australia
1 Capstone scenario
Dataset
The data, “AUS data 2023.xlsx”, used in the capstone is aggregated and collected from the Australian Bureau of Statistics (ABS). The data is available quarterly from Dec 1982 to March 2023. The data includes the response variable (unemployment rate) and 7 predictors:
- Y: Unemployment rate, measured as a percentage;
- X1: Percentage change in gross domestic product;
- X2: Percentage change in government final consumption expenditure;
- X3: Percentage change in final consumption expenditure of all industry sectors;
- X4: Terms of trade index (percentage);
- X5: Consumer Price Index of all groups (CPI);
- X6: Number of job vacancies, measured in thousands;
- X7: Estimated resident population, in thousands.
Further explanations of the variables can be found on the ABS website.
2 Assessment Tasks
- Use the data from Dec 1982 to Dec 2020 as the training set
- Provide an overview of the Australian unemployment rate over the training data period, and some insights on factors driving the unemployment rate (provide relevant references when needed; maximum one A4 page). (10 marks)
Data
Prepare data appropriate for the proposed supervised machine learning methodologies, for example by: (10 marks)
- implementing appropriate data wrangling procedures, e.g. missing-value treatment or transformation of variables;
- providing and commenting on descriptive statistics of the variables.
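A minimal sketch of the preparation steps above, written in Python for illustration only (the assessment itself asks for R). It assumes the observations are ordered quarterly from Dec 1982 to March 2023. The helper names `quarter_labels`, `forward_fill` and `standardise`, the forward-fill treatment of missing values, and the toy series are all assumptions, not requirements of the task or values from the ABS file.

```python
from datetime import date
from statistics import mean, stdev

def quarter_labels(start, end):
    """Return one date per quarter (first day of the quarter's last month), inclusive."""
    labels = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        labels.append(date(year, month, 1))
        month += 3
        if month > 12:
            month -= 12
            year += 1
    return labels

def forward_fill(values):
    """Replace None with the most recent observed value (one possible treatment)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def standardise(values, n_train):
    """Scale with the mean and standard deviation of the training part only."""
    mu = mean(values[:n_train])
    sd = stdev(values[:n_train])
    return [(v - mu) / sd for v in values]

# Chronological split: Dec 1982 to Dec 2020 for training, the rest for testing.
quarters = quarter_labels(date(1982, 12, 1), date(2023, 3, 1))
split_point = date(2020, 12, 1)
train_idx = [i for i, q in enumerate(quarters) if q <= split_point]
test_idx = [i for i, q in enumerate(quarters) if q > split_point]

# Toy series (hypothetical values, not ABS data) with one missing observation.
toy = forward_fill([1.0, None, 3.0, 4.0])
toy_scaled = standardise(toy, n_train=3)
```

Taking the scaling statistics from the training quarters only keeps information from the test period out of the prepared predictors.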
Machine Learning
Select the most effective supervised machine learning (ML) algorithm discussed in this course and apply it to the dataset prepared in Question 3 to predict the Australian unemployment rate from March 2021 to March 2023. (25 marks)
- Justify your choice over the other supervised machine learning algorithms.
- Justify the choice of the hyper-parameter(s) that must be specified in R to estimate the selected model.
- Report the performance(s) and interpretation(s) of the obtained ML model(s) on the training dataset.
- Discuss the predictive performance of the model on the test dataset (March 2021 to March 2023).
Neural Network
Apply a neural network (NN) to the data prepared in Question 3 to predict the Australian unemployment rate from March 2021 to March 2023. (35 marks)
- Describe the structure of the selected neural network model.
- Report the performance(s) and interpretation(s) of the produced NN models on the training dataset.
- Discuss the predictive performance of the model on the test dataset (March 2021 to March 2023).
- Vary the number of hidden layers of the model in 5(a). Explore the impact of the change on the predictive performance of the model.
- Vary the number of neurons in each layer of the model in 5(a). Explore the impact of the change on the predictive performance of the model.
Comparison and Suggestion
Compare the chosen ML model in Question 4 with the NN model in Question 5, and then provide a recommended model. At a minimum, include: (10 marks)
- Cross-validated accuracy
- Computational time to train models
- Interpretability
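One way the first two criteria could be measured together is sketched below in Python, for illustration only (the assessment itself asks for R). It assumes a rolling-origin (expanding window) scheme, which respects the time order of quarterly data, and uses RMSE as the accuracy measure. The function names, the `naive_forecast` placeholder model and the synthetic series are hypothetical; any fitted ML or NN model could be passed in place of the placeholder.

```python
import time
from math import sqrt

def rolling_origin_splits(n_obs, initial_train, horizon):
    """Yield (train_indices, validation_indices) with an expanding training window."""
    start = initial_train
    while start + horizon <= n_obs:
        yield list(range(start)), list(range(start, start + horizon))
        start += horizon

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def naive_forecast(train_y, horizon):
    """Placeholder model: repeat the last observed value over the horizon."""
    return [train_y[-1]] * horizon

def cross_validate(y, fit_predict, initial_train, horizon):
    """Return (mean RMSE across folds, elapsed seconds) for a fit-and-predict function."""
    scores = []
    t0 = time.perf_counter()
    for train_idx, val_idx in rolling_origin_splits(len(y), initial_train, horizon):
        preds = fit_predict([y[i] for i in train_idx], len(val_idx))
        scores.append(rmse([y[i] for i in val_idx], preds))
    elapsed = time.perf_counter() - t0
    return sum(scores) / len(scores), elapsed

# Synthetic series (hypothetical values, not ABS data): 20 quarters, 4-quarter folds.
y = [5.0 + 0.1 * i for i in range(20)]
mean_rmse, seconds = cross_validate(y, naive_forecast, initial_train=12, horizon=4)
```

Running the same `cross_validate` call for each candidate model gives accuracy and training time on identical folds, which keeps the comparison like for like.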
Provide some suggestions regarding the methodologies/data to further improve the prediction of the unemployment rate of Australia. (10 marks)