
Data Management: SAS Studio Assignment

Added on: 2023-02-24 05:01:24
  • Country: Malaysia

LEARNING OUTCOMES

  1. Evaluate the various data types, data storage systems and associated techniques for indexing and retrieving data.
  2. Design feature engineering techniques to transform transactional data into meaningful inputs in order to create a predictive model.
  3. Propose a suitable approach to designing a data warehouse to store and process large datasets.

DATA MANAGEMENT

The machine learning pipeline involves several tasks before the development of a predictive or descriptive model. Preparing and understanding the data is an inevitable and vital part of this process. Moreover, the performance of the model depends on the choice of pre-processing techniques.

For the assignment, you are required to prepare and explore the given dataset. It is imperative to explain and justify the pre-processing, transformation, and feature engineering techniques that you choose. Your analysis should be deep and detailed, and it must go further than what has already been covered in this course.

The assignment should involve a number of experiments, and a detailed exploration and analysis of the results using SAS Studio.

You need to do the following tasks:

PART 1

Feature Engineering

Several Data Mining/Machine Learning algorithms are designed to work with either qualitative or quantitative data, and very few support mixed data. Hence, this task requires you to transform the data with appropriate method(s) and to justify your choices. In addition, metadata should be created for each dataset. Feature engineering itself can be divided into two steps:

  • Variable transformation.
  • Variable / Feature creation.
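
As a rough illustration of the two steps listed above, the following minimal Python sketch applies a log transformation and one-hot encoding (variable transformation) and derives a ratio feature (feature creation). It only illustrates the logic; the assignment itself must be carried out in SAS Studio. The records, the field names (`channel`, `amount`, `items`), and the `engineer` helper are hypothetical and are not taken from the assignment dataset.

```python
import math

# Hypothetical transactional records; the field names are illustrative only.
records = [
    {"customer": "A", "channel": "online", "amount": 120.0, "items": 3},
    {"customer": "B", "channel": "store", "amount": 15.0, "items": 1},
    {"customer": "C", "channel": "online", "amount": 980.0, "items": 7},
]


def engineer(rows):
    """Return copies of the rows with transformed and newly created features."""
    channels = sorted({r["channel"] for r in rows})
    out = []
    for r in rows:
        new = dict(r)
        # Variable transformation: log1p compresses a right-skewed amount.
        new["log_amount"] = math.log1p(r["amount"])
        # Variable transformation: one-hot encode the nominal channel.
        for c in channels:
            new[f"channel_{c}"] = 1 if r["channel"] == c else 0
        # Feature creation: average spend per item.
        new["amount_per_item"] = r["amount"] / r["items"]
        out.append(new)
    return out


features = engineer(records)
for row in features:
    print(row)
```

Whichever transformations you choose for the real dataset, the justification matters as much as the code: for example, a log transformation is only sensible for a skewed, non-negative variable.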

In this section, you need to summarize the feature engineering task and interpret the feature engineering work that you have done in SAS Studio. [1000 words]

PART 2

1. Related Works

In this section, you are supposed to research and present the other works related to the application domain.

Initial Data Exploration

Data exploration is an approach similar to initial data analysis, whereby a data analyst uses visual exploration to understand what is in a dataset and the characteristics of the data, rather than through traditional data management systems.

This section should cover the following tasks:

  • Indicate the type of each attribute (nominal, ordinal, interval or ratio).
  • Identify the values of the summarising properties for each attribute, including frequency and spread, e.g. value ranges of the attributes, frequency of values, distributions, medians, means, variances, and percentiles. Report these as summary (descriptive) statistics and, wherever necessary, use proper visualisations for the corresponding statistics.
  • Using SAS, explore your dataset to identify any outliers and missing values in the variables, and describe how the outliers are treated.
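
The summarising properties named above can be sketched in a few lines. The example below is a minimal Python illustration of the calculations only, since the assignment requires SAS Studio for the actual work. The `ages` and `segment` columns are hypothetical, and the 1.5 × IQR rule for flagging outliers is one common convention, not a method prescribed by this brief.

```python
import statistics
from collections import Counter

# Hypothetical column values; None marks a missing observation.
ages = [23, 25, 25, 27, 29, 31, None, 33, 35, 95]
segment = ["A", "B", "A", "C", "A", "B", "B", None, "A", "C"]


def describe(values):
    """Summarise a numeric column, ignoring missing values."""
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # Tukey fences
    return {
        "n": len(present),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "variance": statistics.variance(present),  # sample variance
        "q1": q1,
        "q3": q3,
        "outliers": [v for v in present if v < low or v > high],
    }


summary = describe(ages)
frequencies = Counter(v for v in segment if v is not None)
print(summary)
print(frequencies.most_common())
```

Frequency counts suit nominal and ordinal attributes, while the mean, variance, and percentiles are only meaningful for interval and ratio attributes, which is why the attribute type is identified first.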

2. Data Pre-processing

Investigate the required method(s) to handle the incomplete, noisy and inconsistent data.

Report each of the applied techniques with detailed explanations. Show your results and justify your approach. 
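
To make the three problem types concrete, here is a minimal Python sketch that handles one incomplete value (median imputation), one noisy value (an impossible negative amount treated as missing), and inconsistent labels (mapping spelling variants to one canonical form). It is an illustration under stated assumptions, not a required method, and the real work belongs in SAS Studio. The rows, the `GENDER_MAP` lookup, and the `clean` helper are hypothetical.

```python
import statistics

# Hypothetical raw rows showing incomplete, noisy, and inconsistent values.
raw = [
    {"gender": "Male", "income": 4200.0},
    {"gender": "M", "income": None},         # incomplete: income missing
    {"gender": " female ", "income": 3900.0},
    {"gender": "F", "income": -1.0},         # noisy: impossible negative income
    {"gender": "male", "income": 5100.0},
]

# Assumed mapping of label variants to one canonical spelling.
GENDER_MAP = {"m": "Male", "male": "Male", "f": "Female", "female": "Female"}


def clean(rows):
    """Return cleaned copies of the rows; the input is left unchanged."""
    # Treat impossible values as missing before estimating the median.
    valid = [r["income"] for r in rows
             if r["income"] is not None and r["income"] >= 0]
    fill = statistics.median(valid)
    cleaned = []
    for r in rows:
        income = r["income"]
        imputed = income is None or income < 0
        cleaned.append({
            "gender": GENDER_MAP.get(r["gender"].strip().lower(), "Unknown"),
            "income": fill if imputed else income,
            "income_imputed": imputed,  # flag keeps the change auditable
        })
    return cleaned


cleaned = clean(raw)
for row in cleaned:
    print(row)
```

Keeping a flag such as `income_imputed` makes it easier to show results before and after each technique, which supports the justification asked for above.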

3. Exploratory Data Analysis (EDA) – graph

This task requires you to perform an analysis on the datasets generated during your feature engineering. Exploratory Data Analysis (EDA) can be defined as the numerical and graphical examination of data characteristics and relationships before formal, rigorous statistical analyses are applied. You are evaluated based on the approaches undertaken to get familiar with the dataset.

4. Hypothesis

Formulate a minimum of FIVE (5) hypotheses based on the dataset (cleaned dataset or transformed dataset) with the required analytical variable(s). Interpret each hypothesis using the results of queries and visualisations produced in SAS Studio.
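
One way a hypothesis can be stated and checked is sketched below in Python, purely as an illustration of the reasoning; the assignment itself expects queries and visualisations in SAS Studio. The example assumes a hypothetical hypothesis ("mean spend differs between online and store customers") and computes Welch's t statistic with the Welch-Satterthwaite degrees of freedom. The two value lists and the `welch_t` helper are invented for the example.

```python
import math
import statistics

# Hypothetical cleaned values for two groups of one analytical variable.
online = [52.0, 61.0, 58.0, 70.0, 64.0]
store = [41.0, 47.0, 39.0, 50.0, 43.0]


def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    # Squared standard error of each group mean.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


t_stat, df = welch_t(online, store)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

The sketch stops at the test statistic because the Python standard library has no t distribution for the p-value. Each of your five hypotheses should name the variables involved and be paired with the query result or chart that supports or contradicts it.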

Deliverables

The deliverables include:

  • A report whose structure follows the tasks of the assignment.
  • SAS programs (Initial Data Exploration, Data Pre-processing, and Dataset Transformation) and queries, with a separate file for each task.


  • Uploaded by: Katthy Wills
  • Posted on: February 24th, 2023