ITEC325 Applied Data Mining Assessment

Added on: 2023-06-08 09:20:20
  • Subject Code: ITEC325
  • Country: Australia

Context

Heart disease is one of the leading causes of death for people of most races in the world. According to the CDC, about half of all Americans (47%) have at least 1 of 3 key risk factors for heart disease: high blood pressure, high cholesterol, and smoking. Other key indicators include diabetic status, obesity (high BMI), insufficient physical activity, and excessive alcohol consumption. Identifying the factors that have the greatest impact on heart disease, and acting on them early, is very important in healthcare.

Instructions

Task 0

Download the data set from LEO.

Task 1

Conduct an exploratory data analysis of the data set using RapidMiner to understand the characteristics of each variable and its relationship to the other variables in the data set. Summarize your findings in a table that describes the key characteristics of each variable, such as minimum and maximum values, average, standard deviation, most frequent value (mode), missing values, and invalid values, along with relationships to other variables where relevant.
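The per-variable summary described in Task 1 can be pictured with a small Python sketch. This is for illustration only: the assessment itself requires RapidMiner, the `summarize` helper is a hypothetical name, and the BMI readings are invented rather than taken from the LEO data set.

```python
import statistics

def summarize(values):
    """Summarize one numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "stdev": statistics.stdev(present),  # sample standard deviation
        "mode": statistics.mode(present),    # most frequent value
        "missing": len(values) - len(present),
    }

# Hypothetical BMI readings, invented for illustration only.
bmi = [22.5, 31.0, None, 27.5, 31.0, 24.0]
summary = summarize(bmi)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Each key in the returned dictionary corresponds to one column of the Task 1 summary table. Invalid values (for example, a negative BMI) would need a separate, variable-specific check that this sketch does not attempt.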

Hint: The Statistics tab and Chart tab in RapidMiner provide a lot of descriptive statistical information and useful charts such as bar charts and scatterplots. You may also want to run some correlations and chi-square tests. In the Task 1 table, indicate which variables contribute the most to determining the risk rating of heart disease.
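To make the chi-square hint concrete, the following Python sketch computes the Pearson chi-square statistic for a contingency table of two categorical variables. It is an illustration under stated assumptions, not the RapidMiner workflow: `chi_square` is a hypothetical helper, the counts are invented, and no continuity correction is applied.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts, invented for illustration only.
# Rows: smoker yes / no. Columns: heart disease yes / no.
counts = [[30, 70],
          [20, 180]]
stat = chi_square(counts)
print(f"chi-square = {stat:.2f}")
```

For a 2x2 table there is one degree of freedom, so a statistic above 3.841 indicates an association at the 5% significance level. Variables with a strong association to the heart disease label are natural candidates for the top five.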

Briefly discuss the key results of your exploratory data analysis and justify your selection of the top five variables for predicting the risk of heart disease. Base the justification on the results of your exploratory data analysis and on a review of the relevant literature about assessing the risk of heart disease (about 250 words).

Task 2

Build and evaluate two predictive models in RapidMiner for determining the risk rating of heart disease, using two appropriate data mining methods you learned in this unit.

Briefly explain your predictive modelling process, justify your choice of data mining methods, and discuss the results of each predictive model, drawing on the key outputs. Base this discussion on the contribution of each of the top five variables to the Final Decision Tree Model and on relevant supporting literature (at least 3 credible sources) on the interpretation of the selected data mining models (about 250 words).
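To give a feel for how a variable contributes to a decision tree, here is a minimal Python sketch of a one-level tree (a decision stump). It is not the RapidMiner Decision Tree operator: `best_stump` is a hypothetical helper, the records are invented, and the split is chosen by raw training error, whereas tree learners normally use an impurity criterion such as information gain or the Gini index.

```python
def best_stump(rows, labels):
    """Return (feature index, threshold, training errors) for the best split.

    The stump predicts 1 when rows[i][feature] > threshold, else 0.
    """
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            # Count records whose prediction disagrees with the label.
            errors = sum((r[f] > t) != bool(y) for r, y in zip(rows, labels))
            if best is None or errors < best[2]:
                best = (f, t, errors)
    return best

# Hypothetical records (BMI, age); label 1 = heart disease.
# Invented for illustration only.
rows = [(22, 35), (31, 62), (27, 58), (24, 41), (33, 70), (29, 45)]
labels = [0, 1, 1, 0, 1, 0]
feature, threshold, errors = best_stump(rows, labels)
print(f"split on feature {feature} at > {threshold} ({errors} training errors)")
```

In this toy data the stump selects the variable that separates the two classes most cleanly. A full tree repeats the same search inside each branch, which is why variables near the root are usually read as the strongest contributors.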

Task 3

Discuss and compare the accuracy of the two data mining models (methods). Use a table to compare the key results of the two confusion matrices (about 250 words).
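The comparison table in Task 3 is built from the four cells of each confusion matrix. The Python sketch below shows how accuracy, precision, and recall follow from those cells. The `metrics` helper and the counts for "Model A" and "Model B" are hypothetical and invented for illustration; real values come from the RapidMiner performance output.

```python
def metrics(tp, fp, fn, tn):
    """Derive headline measures from the four confusion-matrix cells."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are correct
        "precision": tp / (tp + fp),     # share of predicted positives that are real
        "recall": tp / (tp + fn),        # share of real positives that are found
    }

# Hypothetical confusion-matrix counts, invented for illustration only.
models = {
    "Model A": dict(tp=40, fp=10, fn=20, tn=130),
    "Model B": dict(tp=45, fp=25, fn=15, tn=115),
}
results = {name: metrics(**cells) for name, cells in models.items()}
for name, m in results.items():
    print(f"{name}: accuracy={m['accuracy']:.3f} "
          f"precision={m['precision']:.3f} recall={m['recall']:.3f}")
```

In this invented example Model A has the higher accuracy while Model B has the higher recall, which is why accuracy alone can be a misleading basis for comparison when missing a true case of heart disease is costly.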

Note: Include the important outputs from your data mining analyses conducted in RapidMiner in your Assignment 3 report to support your conclusions regarding each analysis. Export these outputs from RapidMiner as JPG image files and insert the screenshots in the relevant parts of your Assignment 3 report.

Task 4

Based on relevant supporting literature (at least 3 credible sources), briefly discuss the ethical perspectives in data mining and identify the possible ethical issues in the context of this case study (250 words).

Task 5

Use Zoom to record a short video presentation (4-5 minutes). In your presentation, turn on your webcam (so that your face is included) and share your screen to show your predictive processes and models in RapidMiner. Briefly explain the steps you followed to create the RapidMiner processes, run them, and present the results.

Important Notice:

The purpose of this recording is to safeguard academic integrity. The assignment will receive a FAIL grade if you do not submit the recorded presentation, or if your presentation does not provide adequate evidence that the submitted materials are original and the result of your own work.

Structure (Report)

  • Cover page (including unit title, your full name, and student ID)
  • Exploratory Data Analysis
  • Predictive Models
  • Model evaluation
  • Model Comparison
  • Ethical issues
  • References

Uploaded By: Katthy Wills
