FIT3152 Data Analytics - Early Stages of COVID-19 Assignment
- Subject Code: FIT3152
- University: Monash University
- Country: Australia
Questions
During the early stages of the COVID-19 pandemic, researchers surveyed participants around the globe. A baseline study was conducted with the aim of identifying the most important predictors of pro-social COVID-19 behaviours, that is, actions that would reduce the spread of the virus. You can read a more detailed description of the research and results in Van Lissa (2022); see references. The aim of this assignment is to understand country-level differences in predictors of pro-social behaviours, which participants reported by rating the following statements. I am willing to:
help others who suffer from coronavirus. (c19ProSo01)
make donations to help others that suffer from coronavirus. (c19ProSo02)
protect vulnerable groups from coronavirus even at my own expense. (c19ProSo03)
make personal sacrifices to prevent the spread of coronavirus. (c19ProSo04)
Your task is to analyse the baseline survey data overall, with a focus on the country you have been assigned. You may make use of any additional data you require to answer the following questions.
1. Descriptive analysis and pre-processing. (6 Marks)
(a) Describe the data overall, including things such as dimension, data types, distribution of numerical attributes, variety of non-numerical (text) attributes, missing values, and anything else of interest or relevance.
(b) Comment on any pre-processing or data manipulation required for the following analysis.
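As a starting point for the overview asked for in Question 1, the checks could be sketched in R as below. This is only a minimal sketch on a small synthetic data frame: apart from c19ProSo01, the column names (`hypo_item`, `hypo_country`) and all values are placeholders assumed for illustration, not taken from the real extract.

```r
# Synthetic stand-in for cvbase; column names and values are illustrative only.
set.seed(1)
toy <- data.frame(
  c19ProSo01   = sample(c(-3:3, NA), 200, replace = TRUE),
  hypo_item    = sample(c(1:5, NA), 200, replace = TRUE),
  hypo_country = sample(c("Country A", "Country B", "Country C"), 200, replace = TRUE),
  stringsAsFactors = FALSE
)

dims        <- dim(toy)                            # number of rows and columns
col_types   <- sapply(toy, class)                  # data type of each column
na_counts   <- colSums(is.na(toy))                 # missing values per column
num_cols    <- names(toy)[sapply(toy, is.numeric)] # names of numeric columns
num_summary <- summary(toy[num_cols])              # distribution of numeric columns
n_levels    <- sapply(toy[!names(toy) %in% num_cols],
                      function(x) length(unique(x))) # distinct values per text column

print(dims); print(col_types); print(na_counts)
print(num_summary); print(n_levels)
```

On the real data, the same calls would be applied to `cvbase` instead of `toy`.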
2. Focus country vs all other countries as a group. (12 Marks)
(a) Identify your focus country from the accompanying list (FocusCountryByID.pdf). How do participant responses for your focus country differ from the other countries in the survey as a group?
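(b) How well do participant responses predict pro-social attitudes (c19ProSo01, 2, 3 and 4) for your focus country? Which attributes are the strongest predictors?
(c) Repeat Question 2(b) for the other countries as a group. Which attributes are the strongest predictors? How do these attributes compare to those of your focus country?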
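For the predictor questions in Section 2, one possible approach (an assumption here, since the task does not prescribe a method) is to fit a linear model to each group and compare which attributes have the most significant coefficients. The sketch below uses synthetic data; `attr1` to `attr3`, `country` and the helper `rank_predictors` are hypothetical names introduced for illustration.

```r
# Synthetic data: attr1 is built to drive the response, attr2 weakly, attr3 not at all.
set.seed(42)
n <- 500
toy <- data.frame(
  country = sample(c("Focus", "Other1", "Other2"), n, replace = TRUE),
  attr1 = rnorm(n), attr2 = rnorm(n), attr3 = rnorm(n)
)
toy$c19ProSo01 <- 1.0 * toy$attr1 + 0.2 * toy$attr2 + rnorm(n)

# Fit one model on a subset; return R-squared and coefficients ordered by p-value.
rank_predictors <- function(df) {
  fit <- lm(c19ProSo01 ~ attr1 + attr2 + attr3, data = df)
  tab <- summary(fit)$coefficients[-1, , drop = FALSE]  # drop the intercept row
  list(r2 = summary(fit)$r.squared,
       table = tab[order(tab[, "Pr(>|t|)"]), , drop = FALSE])
}

focus  <- rank_predictors(toy[toy$country == "Focus", ])
others <- rank_predictors(toy[toy$country != "Focus", ])

print(focus$r2);  print(focus$table)
print(others$r2); print(others$table)
```

Repeating the fit for each of the four c19ProSo responses, and comparing the ordered tables for the focus country and the other group, gives a direct view of where the strongest predictors agree or differ.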
3. Focus country vs cluster of similar countries. (10 Marks)
(a) Using several social, economic, health, political or other indicators, identify between 3 and 7 countries (in the baseline data) that are similar to your focus country using clustering. Van Lissa (2022) refers to several indicators you might consider, among others. Some of these are listed in the references, but these are not exhaustive. State the indicators used and describe how you calculated/identified similar countries. Copy and paste the table of values you used for your clustering into your report as an Appendix.
(b) How well do participant responses predict pro-social attitudes (c19ProSo01, 2, 3 and 4) for this cluster of similar countries? Which attributes are the strongest predictors? How do these attributes compare to those of your focus country? Comment on the similarity and/or difference between your results for this question and Question 2(c). That is, does the group of all other countries, 2(c), or the cluster of similar countries, 3(b), give a better match to the important attributes for predicting pro-social attitudes in your focus country? Discuss.
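The clustering step in 3(a) could be sketched as follows, assuming hierarchical clustering on standardised indicators (k-means would be an equally reasonable choice). Country names, indicator names and values are all synthetic placeholders, not real statistics.

```r
# Synthetic country-level indicators (placeholder names and values).
set.seed(7)
countries <- paste("Country", LETTERS[1:12])
ind <- data.frame(
  indicator1 = rnorm(12, 50, 10),
  indicator2 = rnorm(12, 0.7, 0.1),
  indicator3 = rnorm(12, 20000, 8000),
  row.names = countries
)

scaled <- scale(ind)                          # put indicators on a common scale
hc     <- hclust(dist(scaled), method = "complete")
groups <- cutree(hc, k = 3)                   # k = 3 is an arbitrary choice here

focus   <- "Country A"                        # hypothetical focus country
similar <- setdiff(names(groups)[groups == groups[focus]], focus)
print(similar)
```

If the resulting cluster holds fewer than 3 or more than 7 other countries, `k` or the set of indicators would need adjusting. The `ind` table is the kind of table of values the question asks to be included as an Appendix.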
4. Video Presentation: (Submission Hurdle and 4 Marks)
Record a short presentation using your smartphone, Zoom, or similar method. Your presentation should be approximately 5 minutes in length and summarise your main findings for Sections 1-3, as well as describing how you conducted your research and any assumptions made. Place particular emphasis on your results in Questions 2(c) and 3(b).
5. Overall considerations. (8 Marks)
This includes: the quality and clarity of your reasoning and assumptions; the strength of support for your findings; the quality of your writing in general and communication of results; the quality of your graphics throughout, including at least one high-quality multivariate graphic; the quality of your R coding.
Data
The data for this assignment is a reduced version of that collected for the PsyCorona baseline study, Van Lissa et al. (2022). The filename is PsyCoronaBaselineExtract.csv. The data includes ordinal data coded on a numerical scale. For this assignment assume it is reasonable to treat these responses as numerical.
Create your individual data as follows:
rm(list = ls()) # clear the workspace
set.seed(12345678) # replace 12345678 with your student ID
cvbase <- read.csv("PsyCoronaBaselineExtract.csv")
cvbase <- cvbase[sample(nrow(cvbase), 40000), ] # random sample of 40000 rows
Locate your focus country using the accompanying document FocusCountryByID.pdf.