INFO411: Data Mining and Knowledge Discovery Assignment

Added on: 2022-08-20 00:00:00
  • Subject Code : INFO411
  • Country : Australia

Instructions:

This task is a real-world data mining problem. You are required to prepare a set of presentation slides that must include: (1) the full name and student number of each student in the group, and the contribution (in percent) of each group member; (2) your proposed data mining approach and methodology; (3) the strengths and weaknesses of your proposed approach; (4) the performance measures used to evaluate your data mining results; and (5) the results and a brief discussion.

Below is the recommended structure of your slides:

  • Introduction (define the problem and the goal)
  • Methods (propose approaches, and discuss their strengths and weaknesses)
  • Results (Figures and tables of data analysis)
  • Discussion (discovered knowledge from data mining)

Task: Air pollution prediction in the United States


Background: The US records daily ozone, SO2, CO, and NO2 levels in several counties of every state. The data set for this task contains the yearly summary data for these readings and associated meteorological data such as air quality index (AQI) and particulate matter (PM) index.

Download the AQI by County annual summary data for 2020. The Days with AQI column gives the number of days in the year on which the AQI was recorded in the county, and the following six columns give how many of those days fell into each index level.
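As a rough illustration of how the day-count columns can be turned into a single "bad AQI" measure, the sketch below computes the share of recorded days that fall into the four worst index levels for each county. The column names are an assumption about the downloaded file's header, the choice of which levels count as "bad" is only one possible definition, and the sample rows are made up for illustration; they are not real readings.

```python
import csv
import io

# Hypothetical sample rows; the values are invented for illustration only.
# Column names are assumed to match the header of the annual summary file.
SAMPLE = """State,County,Days with AQI,Good Days,Moderate Days,Unhealthy for Sensitive Groups Days,Unhealthy Days,Very Unhealthy Days,Hazardous Days
StateA,County1,360,300,50,6,3,1,0
StateA,County2,200,190,10,0,0,0,0
StateB,County3,366,200,120,30,10,4,2
"""

# One possible definition of a "bad" day (an assumption, not part of the brief).
BAD_LEVELS = [
    "Unhealthy for Sensitive Groups Days",
    "Unhealthy Days",
    "Very Unhealthy Days",
    "Hazardous Days",
]


def bad_day_share(row):
    """Fraction of recorded days that fall into the levels in BAD_LEVELS."""
    total = int(row["Days with AQI"])
    bad = sum(int(row[col]) for col in BAD_LEVELS)
    return bad / total if total else 0.0


def load_shares(text):
    """Map (state, county) to its bad-day share, reading CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return {(r["State"], r["County"]): bad_day_share(r) for r in reader}


shares = load_shares(SAMPLE)
# List counties from the highest bad-day share to the lowest.
for key, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(key, round(share, 3))
```

Dividing by Days with AQI rather than by the length of the year keeps counties with incomplete monitoring comparable to counties that recorded every day.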

Requirements:

  1. Explore the relationships between air pollution (this could be what you judge to be bad AQI days, or high median/high max AQI, or another criterion of your own definition), the meteorological variables, and the states.
  2. Present relevant visualizations of the data, which help to illustrate the relationships, trends, and differences found in the previous items.
  3. Develop models to predict the number of days of PM2.5
  4. Provide the performance evaluation of any fitted models, including details of cross-validation or splitting into training, validation, and/or testing sets.
  5. Present your interpretations and conclusions.
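
The modelling and evaluation requirements above could be approached in many ways. The following is a minimal sketch of k-fold cross-validation around a one-predictor least-squares line, reporting the root mean squared error (RMSE) on each held-out fold. The predictor (`median_aqi`), the target (`target_days`), the helper names, and the synthetic data are all assumptions made for illustration; a real analysis would read the 2020 summary file and would likely use more predictors and a richer model.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def k_fold_rmse(xs, ys, k=5, seed=0):
    """Return the held-out RMSE of the line fit for each of k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index groups
    scores = []
    for held_out in folds:
        test = set(held_out)
        train = [i for i in idx if i not in test]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_err = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        scores.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return scores


# Synthetic data standing in for county rows (not real measurements).
rng = random.Random(42)
median_aqi = [rng.uniform(20, 80) for _ in range(100)]
target_days = [2.0 * x + 10 + rng.gauss(0, 5) for x in median_aqi]

scores = k_fold_rmse(median_aqi, target_days, k=5)
print("RMSE per fold:", [round(s, 2) for s in scores])
print("Mean RMSE:", round(sum(scores) / len(scores), 2))
```

Reporting the per-fold scores alongside their mean shows how much the error varies with the split, which is useful detail when describing the evaluation procedure in the slides.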
  • Uploaded By : Katthy Wills
  • Posted on : July 30th, 2022