Use KNIME or RAPIDMINER for this assessment
Added on: 2024-11-22 11:30:21
Order Code: SA Student Fahad IT Computer Science Assignment(9_23_36506_300)
Question Task Id: 494946
Overview

A data analytics project starts with collecting data and ends with communicating the results. In between there are several required steps, and data preprocessing is one of the most important. Preprocessing itself consists of multiple steps, which depend on the nature, type, and values of the data.

Data visualisation, in turn, uses visual representations such as charts, graphs, and illustrations to explore, make sense of, and communicate data. Many large companies are now moving towards visualisation.

Assessment Details

Case Study 1: Students are required to select a data set for a regression task and define a question based on a business requirement. This should include: (i) selecting the data set; (ii) exploring, summarising, and preparing the data; (iii) defining the problem and requirements; (iv) defining an experiment setup; (v) implementing your approach; and (vi) evaluating and analysing the approach. (Marks: 20)

Problem: Describe the problem and highlight the business need.

Approach: Describe your approach. It should cover, for example, learning techniques, features, model tuning, parameter selection, and analysis (e.g., how the analysis will answer your questions).

Results: Summarise the results and critically analyse them, e.g., limitations of the data, setup, or approach; characteristic errors; and possible improvements.

Conclusion: Conclude with what you have learned from this study that would help you improve as a data analyst. Would you recommend this as a solution to your problem? Provide reasons.
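
As a rough illustration of steps (iv) to (vi) of Case Study 1, the sketch below shows one possible experiment setup: a hold-out split, a one-feature least-squares fit, and error metrics on the held-out rows. It is only a sketch under stated assumptions. The assessment itself is to be done in KNIME or RapidMiner, the data here is synthetic rather than taken from any of the sample data sets, and the function names (`train_test_split`, `fit_simple_linear_regression`, `evaluate`) are hypothetical helpers written for this example.

```python
import math
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_simple_linear_regression(rows):
    """Ordinary least squares for one feature: y = intercept + slope * x."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def evaluate(rows, intercept, slope):
    """Return (RMSE, MAE) of the fitted line on the given rows."""
    errors = [y - (intercept + slope * x) for x, y in rows]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    return rmse, mae


# Synthetic stand-in data (an assumption, NOT one of the sample data sets):
# y = 3 + 2x plus Gaussian noise with standard deviation 1.
noise = random.Random(0)
data = [(float(x), 3.0 + 2.0 * x + noise.gauss(0, 1.0)) for x in range(100)]

train, test = train_test_split(data)
intercept, slope = fit_simple_linear_regression(train)
rmse, mae = evaluate(test, intercept, slope)
print(f"intercept={intercept:.2f} slope={slope:.2f} RMSE={rmse:.2f} MAE={mae:.2f}")
```

Evaluating on rows the model never saw during fitting is what makes the reported error usable in the Results section. In KNIME or RapidMiner the same idea is typically expressed with a partitioning step placed before the learner.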

Case Study 2: Suppose that you have built a classifier that can identify whether an email is spam or not spam. After applying the classifier to the training data, you get the following confusion matrix. (Marks: 20)

Calculate the accuracy, true positive rate, true negative rate, precision, and recall.

Based on the accuracy value, do you think the classifier is doing a good job of identifying spam emails? Justify your answer.

What is the class imbalance problem? How is it affecting the accuracy in the given scenario?

Note: Students are allowed to include other sections as they deem necessary based on their case study.
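
The metrics named in Case Study 2 follow directly from the four cells of a binary confusion matrix. The sketch below computes them for counts that are purely hypothetical, since the assignment's own confusion matrix is not reproduced in this text. The function name `classification_metrics` and the counts are assumptions made for illustration, with spam treated as the positive class.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def classification_metrics(tp, fn, fp, tn):
    """Standard metrics from a binary confusion matrix (spam = positive)."""
    return {
        "accuracy": _ratio(tp + tn, tp + fn + fp + tn),
        "true_positive_rate": _ratio(tp, tp + fn),  # identical to recall
        "true_negative_rate": _ratio(tn, tn + fp),
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),
    }


# Hypothetical counts (NOT the assignment's matrix): 50 spam, 950 non-spam.
metrics = classification_metrics(tp=5, fn=45, fp=5, tn=945)

# A trivial baseline that labels every email "not spam" on the same data.
baseline = classification_metrics(tp=0, fn=50, fp=0, tn=950)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
print(f"baseline accuracy: {baseline['accuracy']:.4f}")
```

With these made-up counts the classifier reaches the same accuracy as the baseline that never flags spam, while catching only 5 of the 50 spam emails. This is the kind of effect the class imbalance question asks about: when one class dominates, accuracy is driven almost entirely by the majority class.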

Sample data set for case study 1:

  • Absenteeism at work Data Set
  • Bank Marketing Data Set
  • Iranian Churn Dataset Data Set
  • Productivity Prediction of Garment Employees Data Set
  • Real estate valuation data set Data Set
  • Apartment for rent classified Data Set
  • Chronic_Kidney_Disease Data Set

Marking Rubric for Case Study 1

Score levels: Very Good, Good, Satisfactory, Unsatisfactory

Presentation/Layout (/02 marks)
  • Very Good: Information is well organised, well written, and proper grammar and punctuation are used throughout. Correct layout used.
  • Good: Information is organised, well written, with proper grammar and punctuation. Correct layout used.
  • Satisfactory: Information is somewhat organised, proper grammar and punctuation mostly used. Correct layout used.
  • Unsatisfactory: Information is somewhat organised, but proper grammar and punctuation not always used. Some elements of layout incorrect.

Structure (/02 marks)
  • Very Good: Structure guidelines enhanced.
  • Good: Structure guidelines followed exactly.
  • Satisfactory: Structure guidelines mostly followed.
  • Unsatisfactory: Some elements of structure omitted.

Introduction (/02 marks)
  • Very Good: Introduces the topic of the report in an extremely engaging manner which arouses the reader's interest. Gives a detailed general background and indicates the overall "plan" of the paper.
  • Good: Introduces the topic of the report in an engaging manner which arouses the reader's interest. Gives some general background and indicates the overall "plan" of the paper.
  • Satisfactory: Satisfactorily introduces the topic of the report. Gives a general background. Indicates the overall "plan" of the paper.
  • Unsatisfactory: Introduces the topic of the report, but omits a general background of the topic and/or the overall "plan" of the paper.

Design and Analysis (/10 marks)
  • Very Good: All topics are discussed in depth coherently. Significant evidence of critical analysis and reflection.
  • Good: Consistently detailed discussion. Displays sound understanding with some analysis of topics.
  • Satisfactory: A topic has been adequately discussed. Displays some understanding and analysis of most issues.
  • Unsatisfactory: Inadequate discussion of issues. Little/no demonstrated understanding or analysis of issues and/or some irrelevant information.

Summary & Conclusion (/02 marks)
  • Very Good: An interesting, well written summary of the main points. An excellent final comment on the subject, based on the information provided.
  • Good: A good summary of the main points. A good final comment on the subject, based on the information provided.
  • Satisfactory: Satisfactory summary of the main points. A final comment on the subject, but introduced new material.
  • Unsatisfactory: Poor/no summary of the main points. A poor final comment on the subject and/or new material introduced.

Referencing (/02 marks)
  • Very Good: Correct referencing (APA7 Style). All quoted material in quotes and acknowledged. All paraphrased material acknowledged. Correctly set out reference list.
  • Good: Mostly correct referencing (APA7 Style). All quoted material in quotes and acknowledged. All paraphrased material acknowledged. Mostly correct setting out of reference list.
  • Satisfactory: Mostly correct referencing (APA7 Style). Some problems with quoted material and paraphrased material. Some problems with the reference list.
  • Unsatisfactory: Not all material correctly acknowledged. Some problems with the reference list.

Total out of 20

  • Uploaded By : Pooja Dhaka
  • Posted on : November 22nd, 2024