ICT583 Data Science Applications

Subject Code: ICT583
University: Murdoch University
Country: Australia


Murdoch University

Mid-term assignment & Data science application project

In this unit, you will complete two consecutive assignments that focus on a specific topic in real-world data science applications. These assignments are designed to help you develop a good understanding of the latest data-driven modeling techniques used in real-world applications, and to guide you through the implementation of the entire data science pipeline on a real dataset using R. By completing these assignments, you will gain hands-on experience and knowledge that will prepare you for new real-world data science projects.


Topic background: Dementia is a debilitating disease that affects millions of people worldwide. Early detection and risk prediction are crucial for effective treatment and care. Data-driven models are increasingly important in the field of dementia research, as they can identify patterns and relationships in complex datasets that can be used to predict an individual's risk of developing the disease.

Assignment 1 Mid-term assignment (group assignment)

For this assignment, your group will conduct a concise literature review on the latest data-driven models for dementia risk analysis and prediction. The purpose of this review is to help you gain knowledge and ideas about the most up-to-date data-driven approaches used for dementia risk analysis and prediction, so that you can develop your own models to analyze the dementia data provided in Assignment 2.

Group assignment guidelines:

You will be working on this assignment in a group of 3 to 4 students.

Please note that you are only allowed to form a group with students who are enrolled in the same tutorial as yours.

Each group is required to submit one literature review in a Word document and one signed group contribution sheet. Only one group member, designated as the liaison person, should submit the required documents on behalf of the group.

Collaborate with your group members to complete the assignment and submit it before the deadline. Make sure to communicate effectively and contribute to the group's work EQUALLY. A group contribution sheet must be submitted along with this assignment. Each group member's individual mark will be based on their contribution to the group work.

Literature review guidelines:

You are required to review at least five computing JOURNAL articles published after 2020 that focus on dementia risk analysis and prediction modeling using data-driven models and analytics tools, such as statistical, machine learning and other data mining techniques.

Word limit: 1,500 words (can be within a +/- 10% range of this word limit), excluding references.

The document should be formatted in 12-point Times New Roman, with single line spacing and Normal margins selected (from the Word 'Layout' menu, choose 'Normal').

Your review should be well-structured, clearly written, and appropriately referenced. The following outline should be followed:

1. Introduction:

The introduction is used to set the context of your review. In this opening paragraph, you need to:

a. Define the topic of your study and provide any relevant background information that helps your reader to understand the topic.

b. Explain your reason or perspective for reviewing the literature on this topic.

By doing so, you will give your readers an idea of what to expect in your review and why data-driven models for dementia risk analysis and prediction are significant.

2. Body:

This section begins with an explanation of how you have organized your small-scale literature review.

Before you begin this section, be sure that you have sorted your reviewed articles into different themes which can be based on different analyzed data types, data-driven techniques, or the purposes of data modelling. After you sort your articles, it is important to give your sorted groups a descriptive name. The names of the sorted articles will become your headings for each of the paragraphs that you write in the body of your review.

To write the body of your small-scale literature review, it is important to include the following:

a. Write an introduction paragraph for the body of your review. This paragraph tells the reader specific information on how many articles you reviewed and how you sorted the articles into common themes.

b. Write a separate paragraph for each theme that describes the theme and summarizes each article under it, including the data resources used, the data-driven models adopted, the findings, and the advantages and weaknesses. You can also compare, contrast and/or connect the articles you've selected under each theme.

3. Summary

This is the last paragraph of your small-scale literature review. In this paragraph, it is important to summarize the main findings and insights from the review. You should also identify any gaps or limitations in the studies reviewed, as well as any opportunities for further research and development in this field.

4. References

This is the last page of your review. It serves as a listing of all references that you mentioned in your paper. Please use IEEE reference style when completing this list. Please refer to the useful links below.

Useful links:

Where to find literature review

https://scholar.google.com.au/

https://librarysearch.murdoch.edu.au/discovery/search?vid=61MUN_INST:61MU?=en

Search for literature Guide

https://libguides.murdoch.edu.au/LitReview/search

IEEE Referencing Guide

https://libguides.murdoch.edu.au/IEEE

https://medium.com/academicianhelp/ieee-referencing-using-microsoft-word-66c855181d64

Assignment 2 Data science application project (individual assignment)

Students will work independently to perform the entire data science pipeline on a given real-world dementia dataset using R. You will be required to describe the entire project in a detailed report and submit the code.

The data set used in this study was obtained from a mobile health care service offered in collaboration with non-governmental organizations that run elderly care centers. This service was provided free of charge to elderly people residing in various districts of Hong Kong from 2008 to 2018. The data set consists of 2299 cases, each of which includes eleven variables. These variables include age, body height, body weight, education level, financial support, geriatric depression scale score, out-of-pocket financial source (whether they were independent or dependent on family), marital status, Mini Nutritional Assessment part A score, and Mini Nutritional Assessment part B score. The outcome labels were based on the categories of the Mini Mental State Exam.

Assignment guidelines:

Each student is required to submit one project report in a Word document, together with R files that reproduce all the results in the report.

R is the only accepted programming language for this assignment. You must use R to complete all tasks and analyses.

Project report guidelines:

Do not include any form of code snippets directly into the report. All code should be included solely in the R files submitted.

Word limit: 800 words (can be within a +/- 10% range of this word limit), excluding references, figures, and tables. The report should be formatted in Times New Roman 12 font with normal margins selected (from the Word 'Layout' menu, choose 'Normal').

Note that 800 words is a relatively short length for a project report, so it is important to write clearly and concisely and to make maximum use of well-designed visualizations, which convey information more efficiently and with greater impact. The following outline should be followed:

Introduction: Introduce the topic of the data science project, including the problem statement and the goals that the project aims to achieve.

Dataset description: Provide background information on the dataset used in the project, including its source and any relevant characteristics. Include summary statistics to give readers an overview of the data.

Data pre-processing: Explain any pre-processing steps that were necessary for the dataset and justify why they were performed. This section should consider steps such as cleaning, transforming or encoding the data.

Exploratory data analysis: Perform preliminary investigations on the dataset using summary statistics and visualizations. This section should provide insights into the dataset and help identify any potential patterns or trends.

Prediction modelling: Select two prediction models and apply them to the given dataset. This section should include brief information on the selected models and explain why they are appropriate for the dataset. Also evaluate the performance of the two models and compare their results using appropriate performance metrics.

Results and discussion: Analyze the results and discuss the findings in a clear and engaging manner. This section should include visualizations and any insights gleaned from the data.

Conclusion: Summarize the project, giving a concise overview together with its useful insights and conclusions.
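As an illustration of comparing two models with performance metrics, the following is a minimal sketch in base R. The helper `classification_metrics`, the binary "yes"/"no" coding, and the label vectors are all hypothetical and made up for this example; the real outcome categories come from the Mini Mental State Exam and may need different metrics.

```r
# Hypothetical helper (not part of the assignment materials): computes
# accuracy, sensitivity and specificity for a binary classifier.
classification_metrics <- function(predicted, actual, positive = "yes") {
  tp <- sum(predicted == positive & actual == positive)  # true positives
  tn <- sum(predicted != positive & actual != positive)  # true negatives
  fp <- sum(predicted == positive & actual != positive)  # false positives
  fn <- sum(predicted != positive & actual == positive)  # false negatives
  c(
    accuracy    = (tp + tn) / length(actual),
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp)
  )
}

# Made-up labels, used only to exercise the function (ASSUMPTION: toy data)
actual  <- c("yes", "yes", "yes", "no", "no", "no", "no",  "no")
model_a <- c("yes", "yes", "no",  "no", "no", "no", "yes", "no")
model_b <- c("yes", "no",  "no",  "no", "no", "no", "no",  "no")

# One row per model, one column per metric
comparison <- rbind(
  model_a = classification_metrics(model_a, actual),
  model_b = classification_metrics(model_b, actual)
)
print(round(comparison, 3))
```

In this toy case both models reach the same accuracy (0.75) but differ in sensitivity and specificity, which is why reporting more than one metric can be useful when classes are unbalanced.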

In addition to the project report, we also require the submission of an R file that includes the complete code performed from data loading to prediction modeling. The code should be well-organized, easy to follow, and produce the same outcomes as presented in the project report.

R file guidelines:

In your submitted code file, include comments to explain the purpose and functionality of each section of code.

Organize the code into clear sections, such as data cleaning, exploratory data analysis and prediction model implementation.

Use white space and indentation to enhance readability.

Avoid using overly complicated code, and instead focus on writing clear, concise code.
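One possible way to lay out a script along these lines is sketched below. It is only an illustration: the data are SYNTHETIC placeholders generated inside the script, the column names (`age`, `gds_score`, `education`, `impaired`) and the binary outcome are assumptions rather than the real dataset's structure, and logistic regression plus a classification tree (from the `rpart` package, assumed to be installed as it is with standard R distributions) are just two example models.

```r
# ---- 0. Setup --------------------------------------------------------------
library(rpart)   # classification trees
set.seed(583)    # fixed seed so the results are reproducible

# ---- 1. Data loading (ASSUMPTION: synthetic stand-in for the real file) ----
n <- 300
toy <- data.frame(
  age       = round(runif(n, 60, 95)),
  gds_score = sample(0:15, n, replace = TRUE),
  education = factor(sample(c("none", "primary", "secondary"), n, replace = TRUE))
)
# Synthetic outcome loosely tied to age and depression score
risk <- plogis(-8 + 0.08 * toy$age + 0.15 * toy$gds_score)
toy$impaired <- factor(ifelse(runif(n) < risk, "yes", "no"),
                       levels = c("no", "yes"))

# ---- 2. Data cleaning --------------------------------------------------------
toy <- toy[complete.cases(toy), ]   # drop rows with missing values

# ---- 3. Exploratory data analysis -------------------------------------------
print(summary(toy))

# ---- 4. Train/test split (70% / 30%) ----------------------------------------
test_idx <- sample(nrow(toy), size = round(0.3 * nrow(toy)))
train <- toy[-test_idx, ]
test  <- toy[test_idx, ]

# ---- 5. Prediction models ----------------------------------------------------
# Model 1: logistic regression; predicts the probability of "yes"
fit_glm  <- glm(impaired ~ age + gds_score + education,
                data = train, family = binomial)
prob_glm <- predict(fit_glm, newdata = test, type = "response")
pred_glm <- factor(ifelse(prob_glm > 0.5, "yes", "no"),
                   levels = levels(toy$impaired))

# Model 2: classification tree
fit_tree  <- rpart(impaired ~ age + gds_score + education,
                   data = train, method = "class")
pred_tree <- predict(fit_tree, newdata = test, type = "class")

# ---- 6. Evaluation -----------------------------------------------------------
accuracy <- function(pred, truth) mean(pred == truth)
results <- data.frame(
  model    = c("logistic regression", "classification tree"),
  accuracy = c(accuracy(pred_glm, test$impaired),
               accuracy(pred_tree, test$impaired))
)
print(results)
print(table(predicted = pred_glm, actual = test$impaired))
```

The section banners and the fixed seed are the parts that carry over to a real submission; the data-loading step would be replaced by reading the provided dataset, and the models and metrics by whatever the report justifies.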

Bonus task:

Create an R Shiny app that allows users to interact with the data science pipeline you developed in the project.

Note that

1) This task is a bonus, which means you will not lose any marks if you do not complete it. However, if you complete it, you can earn extra marks (up to 15 extra points on the total mark of the assignment, capped at 100).

2) The bonus task will not be supervised by the teaching staff. Some useful online links are provided to guide creating the R Shiny app. Therefore, students who are interested need to rely on their self-learning and exploration to complete the task.

Specification: The R Shiny app should 1) be user-friendly, with clear instructions and intuitive navigation, and 2) allow users to upload the dataset, perform exploratory data analysis by generating different visualizations, select prediction models, and view performance metrics. To develop the app, the student will need to integrate the code used in the previous tasks into the Shiny framework. Additional features, such as interactive visualizations, can also be added to enhance the user experience.
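A minimal sketch of how the upload and visualization parts of such an app might be wired together is shown below. It assumes the `shiny` package is installed and that the uploaded file is a CSV; the input and output IDs (`file`, `var`, `hist`, `summary`) are hypothetical names chosen for this example. Model selection and performance metrics are deliberately left out.

```r
library(shiny)

# User interface: upload control, variable picker, and two outputs
ui <- fluidPage(
  titlePanel("Dementia data explorer (sketch)"),
  sidebarLayout(
    sidebarPanel(
      fileInput("file", "Upload dataset (CSV)", accept = ".csv"),
      selectInput("var", "Numeric variable to plot", choices = character(0))
    ),
    mainPanel(
      plotOutput("hist"),
      verbatimTextOutput("summary")
    )
  )
)

server <- function(input, output, session) {
  # Read the uploaded file; req() pauses until a file has been chosen
  dataset <- reactive({
    req(input$file)
    read.csv(input$file$datapath, stringsAsFactors = TRUE)
  })

  # Once data are loaded, offer the numeric columns in the picker
  observeEvent(dataset(), {
    numeric_cols <- names(dataset())[sapply(dataset(), is.numeric)]
    updateSelectInput(session, "var", choices = numeric_cols)
  })

  # Histogram of the selected variable
  output$hist <- renderPlot({
    req(input$var, input$var %in% names(dataset()))
    hist(dataset()[[input$var]], main = input$var, xlab = input$var)
  })

  # Summary statistics for every column
  output$summary <- renderPrint({
    summary(dataset())
  })
}

app <- shinyApp(ui, server)

# Launch only in an interactive session, so sourcing the file is side-effect free
if (interactive()) runApp(app)
```

Keeping the data in a single `reactive()` means every output re-renders automatically when a new file is uploaded; a model-selection control could follow the same pattern.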

Submission for the bonus task requires the Shiny app R scripts and a separate simple user guide Word document (1-2 pages) that explains the app's functionality and provides instructions on how to use it. Students can include screenshots and code snippets to showcase the app's features and functionality.

Useful links for the bonus R Shiny task:

How to Build a Data Analysis App in R Shiny

https://towardsdatascience.com/how-to-build-a-data-analysis-app-in-r-shiny-143bee9338f7

R Shiny quick tutorial

https://shiny.rstudio.com/tutorial/written-tutorial/lesson7/


Uploaded By : Akshita
Posted on : November 26th, 2024