Do not use pie charts or 3D Graphs

Added on: 2024-11-12 14:00:19
Order Code: SA Student Prafull IT Computer Science Assignment(5_24_42294_312)
Question Task Id: 507058

PROJECT OUTLINE:

Create a data science-related software product based on a chosen real-world problem domain/dataset which generates insights.

You are strongly recommended to develop machine learning predictive or forecasting models which are deployed through some software artefact. My recommendation is that you use Streamlit as a front-end for your model outputs. You can of course use some other web app framework that you are familiar with too. At a bare minimum, though, you should have a Jupyter notebook which encapsulates and communicates the main aspects of your work.

Topics to be covered:

KNN

Regression

Web scraping, Web API, Static Data

Data mining algorithms

Clustering and the K-means algorithm

Naive Bayes

Time series
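
To make the first topic concrete, here is a minimal sketch of k-nearest-neighbours (KNN) classification using only the Python standard library. The function name `knn_predict` and the toy exercise/sleep data are illustrative assumptions, not part of the brief; in a real project you would more likely use a library implementation on your own dataset.

```python
from collections import Counter
from math import dist  # Euclidean distance, available from Python 3.8


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's distance to the query with its label,
    # then sort so the closest points come first.
    distances = sorted(
        (dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up toy data: (hours of exercise per week, hours of sleep per night).
points = [(1.0, 5.0), (1.5, 5.5), (2.0, 6.0), (6.0, 8.0), (6.5, 7.5), (7.0, 8.0)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction = knn_predict(points, labels, query=(6.2, 7.8), k=3)
print(prediction)
```

The choice of `k` and of the distance measure both affect the result, so they are worth treating as parameters to tune rather than fixed values.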

Some ideas for possible projects:

Data from current events in the news cycle: something that is topical and interesting

Time-series analysis, time-series forecasting.

Recommender engine: create application for making recommendations based on user preferences.

Fitness data: analysis of your personal or some group's Fitbit data.

Twitter: sentiment analysis, text classification, semantic analysis, network visualisation, geospatial visualisation, data storage etc.

Data journalism: data visualisation, implementation of interactive graphs (web-enabled), infographics.

A live Kaggle competition problem dataset https://www.kaggle.com/competitions (see notes below)

Web app that performs some data-related service.

...or something entirely different.

Topics NOT to cover:

Currency markets, BitCoin, share market stock prices

Closed Kaggle competition datasets

Previously researched topics for which there are existing notebooks

Definitely NO to the TITANIC dataset

More COVID-related topics

DATA SOURCES

This is a recommendation, not a requirement: be as original as you can with your data sources. Some datasets are very popular and have come up repeatedly in assignments over the years. Unfortunately, because they are popular, there are a lot of online sources that have scripts published for those datasets. In many cases, related assignment submissions involve some form of plagiarism. While the internet is a big place, we have seen a lot of these scripts before and copying is easy to catch. Unless you are going to do something genuinely novel with a well-used data source (you will know it is well-used if you can easily find Python kernels for it), avoid these data sources. The safest bet is a dataset that is integrated from multiple disparate sources.
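
As a rough illustration of what integrating disparate sources can look like, the sketch below joins a static CSV extract with a JSON payload of the kind a web API might return. The data, the column names (`region`, `population`, `incidents`), and the function `integrate` are all hypothetical and exist only for this example.

```python
import csv
import io
import json

# Hypothetical static source: a CSV export of regional populations.
STATIC_CSV = """region,population
North,120000
South,95000
East,40000
"""

# Hypothetical API response: incident counts per region as JSON.
API_JSON = '[{"region": "North", "incidents": 36}, {"region": "South", "incidents": 19}]'


def integrate(csv_text, json_text):
    """Inner-join the two sources on `region` and derive a per-capita rate."""
    incidents = {row["region"]: row["incidents"] for row in json.loads(json_text)}
    merged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        region = row["region"]
        if region in incidents:  # keep only regions present in both sources
            population = int(row["population"])
            merged.append({
                "region": region,
                "population": population,
                "incidents": incidents[region],
                "incidents_per_10k": round(incidents[region] / population * 10_000, 2),
            })
    return merged


rows = integrate(STATIC_CSV, API_JSON)
for row in rows:
    print(row)
```

Note that the region "East" is dropped because it appears in only one source; deciding how to handle such unmatched records is part of the integration work.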

WARNING ABOUT CHOOSING A KAGGLE DATASET

Discuss this with the lecturer first. A high standard is set when marking Kaggle-related submissions. If you use a Kaggle dataset, we recommend you do not look at related Kaggle kernels as there can be a temptation to copy what you see. Copying without attribution is plagiarism which could lead to zero marks for this assignment. Be aware that markers are familiar with Kaggle kernels, in part due to marking assignments for other papers and cohorts. We will also be looking through related kernels prior to marking.

TECHNOLOGY

You are encouraged to use Streamlit as it is Python-based. Use ChatGPT to help you develop a simple Streamlit front-end. This does not need to be fancy. In previous years, other web app frameworks have been used depending on each group's familiarity with various technologies. Some students have created web-based applications with both front-end and back-end components that serve webpages and perform data science related tasks. You can make this as simple or as complex as you like, but the main point is to focus on the machine learning aspect. It is sufficient that your application runs on localhost. The technology itself is not important.
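
For a sense of how little front-end code is needed, here is a minimal sketch of a single-page Streamlit app wrapped around a forecasting function. The moving-average forecast is only a stand-in for a real model, and the function names and the step-count data are assumptions made up for this example. The guarded import is there so the forecasting logic can be run and tested even where Streamlit is not installed.

```python
from statistics import mean

try:
    import streamlit as st  # third-party package: pip install streamlit
except ImportError:
    st = None  # the forecasting logic below still works without Streamlit


def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window < 1 or window > len(history):
        raise ValueError("window must be between 1 and len(history)")
    return mean(history[-window:])


# Made-up weekly step counts, standing in for a real dataset.
HISTORY = [4200, 5100, 4800, 5300, 6100, 5900]


def render_app():
    """Draw a one-page front-end around the forecast."""
    st.title("Step-count forecast (demo)")
    window = st.slider(
        "Moving-average window", min_value=1, max_value=len(HISTORY), value=3
    )
    forecast = moving_average_forecast(HISTORY, window)
    st.metric("Forecast for next week", f"{forecast:.0f}")
    st.write("History:", HISTORY)


if st is not None:
    render_app()
```

Saved as `app.py`, this can be started on localhost with `streamlit run app.py`. Keeping the model logic in a plain function, separate from the page layout, makes it straightforward to reuse the same code in a Jupyter notebook.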

If you choose to build a GUI-based application, Python does possess libraries that facilitate this; however, you can also use Qt or technologies like .NET, which allow you to call the Python methods that implement the logic in your application.

PRESENTATIONS

Make your presentation interesting. Don't focus on technical details. Consider your audience to be tech-savvy executives. Focus instead on the story that you are trying to tell and sell to the audience/decision makers. The presentations will be marked in part by your peers.

MARKING CRITERIA:

Marks will be awarded for different components of the project using the following rubric:

Component Requirements Marks

  • Uploaded By : Pooja Dhaka
  • Posted on : November 12th, 2024
