Unlocking Insights: Data Analysis with Random Data Sets

Added on: 2024-04-29 11:46:38
Country: United Kingdom

Data analysis using random data

Generate a synthetic dataset that simulates monthly retail sales data for the period from January 2020 to April 2024. This dataset will be used for analysis and forecasting purposes. Simulate three main components of the data:

  • A trend component that represents increasing sales, modelled as a polynomial function $at^2 + bt + c$, where t is the time index and a, b, c are constants. You may choose the constants freely, provided the function is increasing over the period.
  • A seasonal component capturing quarterly recurring patterns in sales over each year.
  • Random fluctuations representing noise or unexplained variance in the data.

Then, combine these components to create the retail sales data.
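The three components above can be sketched as follows. This is a minimal illustration, not a reference solution: the constants a, b, c, the per-quarter offsets, the noise scale, and the column names are all assumptions chosen for the example, and "quarterly pattern" is interpreted here as one fixed offset per calendar quarter.

```python
import numpy as np
import pandas as pd

# Monthly timestamps from January 2020 to April 2024 (52 months).
dates = pd.date_range(start="2020-01-01", end="2024-04-01", freq="MS")
t = np.arange(len(dates))  # time index 0, 1, ..., 51

# Trend: a*t^2 + b*t + c. Assumed constants; a, b > 0 keeps it increasing for t >= 0.
a, b, c = 0.5, 10.0, 1000.0
trend = a * t**2 + b * t + c

# Seasonality: one assumed offset per calendar quarter, repeated every year.
quarter_effect = {1: -50.0, 2: 20.0, 3: -10.0, 4: 80.0}
seasonal = np.array([quarter_effect[q] for q in dates.quarter])

# Noise: Gaussian fluctuations with an assumed standard deviation, seeded for reproducibility.
rng = np.random.default_rng(seed=42)
noise = rng.normal(loc=0.0, scale=25.0, size=len(dates))

# Combine the components additively and store everything in a DataFrame.
sales = pd.DataFrame(
    {
        "date": dates,
        "trend": trend,
        "seasonal": seasonal,
        "noise": noise,
        "sales": trend + seasonal + noise,
    }
)

print(sales.head(10))
```

Keeping the individual components as columns makes it easy to check later whether a fitted model recovers the trend and seasonal pattern that were put in.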

  1. Store the generated synthetic dataset in a pandas DataFrame and display the first 10 rows of data. (12 points)
  2. Perform visualizations to display the following: (8 points)
    • the change of sales over time.
    • the data points corresponding to the ten lowest values of sales over time.
    • the data points corresponding to the ten highest values of sales over time.
    • a comparison of the highest sales in any quarter (Jan-Mar, Apr-Jun, Jul-Sep, Oct-Dec) per year over 2020-2023.
  3. Fit an ETS model from statsmodels and provide a forecast for the next two quarters of 2024. Explain how the ETS algorithm produces a forecast. Provide visualizations and interpret the quality of the results in your own words in the report. (10 points)
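To make the "algorithmic working" in task 3 concrete, the sketch below hand-rolls the additive level, trend, and seasonal recursions (often called additive Holt-Winters) that underlie an ETS model with additive components. It is an illustration only and does not call statsmodels: the function name, the fixed smoothing parameters alpha, beta, gamma, and the crude initialisation from the first two seasons are all assumptions, whereas a library fit would normally estimate these values from the data.

```python
import numpy as np


def holt_winters_additive(y, m, alpha=0.3, beta=0.1, gamma=0.1, horizon=6):
    """Forecast `horizon` steps ahead with additive level, trend and seasonality.

    y: observed series, m: season length (12 for monthly data with a yearly cycle).
    """
    y = np.asarray(y, dtype=float)
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Crude initial states taken from the first two seasons (an assumption).
    level = y[:m].mean()
    trend = (y[m:2 * m].mean() - y[:m].mean()) / m
    season = list(y[:m] - level)  # one offset per position in the cycle

    for i in range(m, len(y)):
        s_prev = season[i % m]
        prev_level, prev_trend = level, trend
        # Level: blend the deseasonalised observation with the previous one-step projection.
        level = alpha * (y[i] - s_prev) + (1 - alpha) * (prev_level + prev_trend)
        # Trend: blend the latest change in level with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        # Season: blend the detrended observation with the previous offset for this position.
        season[i % m] = gamma * (y[i] - prev_level - prev_trend) + (1 - gamma) * s_prev

    n = len(y)
    # h-step forecast: last level + h * last trend + the matching seasonal offset.
    return np.array(
        [level + h * trend + season[(n + h - 1) % m] for h in range(1, horizon + 1)]
    )
```

For a monthly series with a yearly cycle, calling the function with m=12 and horizon=6 gives six monthly forecasts, which covers two quarters. Comparing these hand-computed values with a fitted library model is one way to sanity-check the explanation in the report.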

Data analysis using actual stock data

Import the yfinance Python library for this assignment. Make sure to install it before you import it. The library provides a convenient way to fetch historical market data, including stock prices, from Yahoo Finance. For more information, refer to the documentation at https://pypi.org/project/yfinance/. An example that retrieves stock data for the ticker AAPL:

data = yf.download("AAPL", start="2021-01-01", end="2022-12-31")

2.1 Analysis of the pre-COVID era (before March 2020)

For three top tech companies and three top healthcare companies of your choice, do the following:

[a] Write a Python function stock_retrieval() to retrieve the historical stock data for the two years before March 2020 for all of your chosen companies from Yahoo Finance. Decide which arguments best fit your function. Use a dictionary to store the retrieved stock data, where the date is the key and a nested dictionary containing the stock's attributes (e.g., Open, High, Low, Close, Volume, Adj Close) is the value. (10 points)
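One possible shape for stock_retrieval() is sketched below. It is not the required solution: the helper frame_to_nested_dict, the argument names, the default date range, and the choice of yf.Ticker(...).history(...) with auto_adjust=False (so that an Adj Close column is returned) are assumptions made for illustration.

```python
import pandas as pd

# Attributes named in the task; only those present in the download are kept.
ATTRIBUTES = ["Open", "High", "Low", "Close", "Adj Close", "Volume"]


def frame_to_nested_dict(frame):
    """Convert a date-indexed price DataFrame into {"YYYY-MM-DD": {attribute: value}}."""
    records = frame.to_dict(orient="index")
    return {
        pd.Timestamp(date).strftime("%Y-%m-%d"): attrs
        for date, attrs in records.items()
    }


def stock_retrieval(tickers, start="2018-03-01", end="2020-03-01"):
    """Return {ticker: {date: {attribute: value}}} for each requested ticker.

    The default dates are an assumed reading of "two years before March 2020".
    """
    import yfinance as yf  # imported lazily so the helper above works without it

    data = {}
    for ticker in tickers:
        history = yf.Ticker(ticker).history(start=start, end=end, auto_adjust=False)
        keep = [col for col in ATTRIBUTES if col in history.columns]
        data[ticker] = frame_to_nested_dict(history[keep])
    return data
```

A call such as stock_retrieval(["AAPL"]) would then return one nested dictionary per ticker, with date strings as keys. Storing dates as strings is a design choice that keeps the dictionary easy to print and serialise.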

[b] Using the retrieved stock data, perform statistical analyses such as the average, maximum, and minimum opening and closing prices in that period in another Python function, stock_stats(). In this function, compare the opening and closing prices for each trading day and identify any significant differences. You can iterate over each trading day, calculate the absolute difference between the opening and closing prices, and compare it to a predefined threshold (e.g., 5% of the closing price). If the difference exceeds the threshold, you can add the date and the price difference to a list of dates with significant differences. (10 points)
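The steps described for stock_stats() can be sketched as follows, assuming the input is a single company's dictionary in the form {date: {"Open": ..., "Close": ...}}. The return structure and the threshold argument name are assumptions for this example.

```python
from statistics import mean


def stock_stats(prices, threshold=0.05):
    """Summarise open/close prices and flag days with a large open-close gap.

    prices: {date: {"Open": float, "Close": float, ...}} for one company.
    threshold: fraction of the closing price above which a gap is "significant".
    """
    opens = [day["Open"] for day in prices.values()]
    closes = [day["Close"] for day in prices.values()]

    significant = []
    for date, day in prices.items():
        diff = abs(day["Open"] - day["Close"])
        # Compare the absolute gap with the threshold share of that day's close.
        if diff > threshold * day["Close"]:
            significant.append((date, diff))

    return {
        "open": {"mean": mean(opens), "max": max(opens), "min": min(opens)},
        "close": {"mean": mean(closes), "max": max(closes), "min": min(closes)},
        "significant_differences": significant,
    }
```

Returning the flagged days as (date, difference) pairs keeps them easy to plot or tabulate alongside the summary statistics.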

[c] Provide suitable and intuitive visualizations (e.g., line plots, histograms) to display the change in the statistical parameters over time for all the companies. Compare the trends between the healthcare and tech companies during this period using plots. (15 points)

2.2 Analysis of the COVID and post-COVID era (after March 2020)

[a] Perform regression analyses using a rolling window of 4 months to identify any significant trend changes during COVID (the year following March 2020) and post-COVID (the period beyond March 2021 until the present). Justify, with explanations and visualizations, the differences and similarities between tech and healthcare companies. Explain the rationale behind the algorithmic structure of your analysis. The analysis should be based on the top three companies in each category that you chose for the previous analyses. (15 points)
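One way to read "regression with a rolling window" is to fit an ordinary least squares line to each trailing window of closing prices and track its slope over time. The sketch below does that with NumPy and pandas. The function name and the window of 84 observations (roughly 4 months at about 21 trading days per month) are assumptions.

```python
import numpy as np
import pandas as pd


def rolling_trend(close, window=84):
    """Slope of a least-squares line fitted to each trailing window of prices.

    close: pandas Series of closing prices ordered by date.
    window: number of observations per window (assumed ~4 months of trading days).
    """
    x = np.arange(window)

    def slope(values):
        # np.polyfit returns coefficients from highest degree down; index 0 is the slope.
        return np.polyfit(x, values, deg=1)[0]

    # The first window - 1 entries are NaN because the window is not yet full.
    return close.rolling(window).apply(slope, raw=True)
```

A change in the sign of the slope, or a sharp change in its size, marks a candidate trend change. Plotting the slope series for each company on a shared time axis makes the tech and healthcare groups directly comparable.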

[b] Consider relevant features that may influence stock prices, such as market indices, trading volume, and volatility, from the dataset, and apply PCA to reduce the dimensionality of the feature space and extract principal components capturing the variability in the data. Remember to standardize the features to ensure they have comparable scales. Provide explanations and visualizations to explain your findings. (10 points)
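The standardize-then-PCA step can be illustrated with plain NumPy, using a singular value decomposition in place of a library PCA class. The function name and return values are assumptions for this sketch, and it assumes every feature column has non-zero variance.

```python
import numpy as np


def standardized_pca(features, n_components=2):
    """Standardize columns, then project onto the leading principal components.

    features: 2-D array-like, rows are observations and columns are features.
    Returns (scores, components, explained_variance_ratio).
    """
    X = np.asarray(features, dtype=float)

    # Standardize: zero mean and unit sample standard deviation per feature.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # SVD of the standardized data; the rows of Vt are the principal directions.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)

    # Variance captured by each component, as a share of the total.
    explained_variance = S**2 / (len(Z) - 1)
    ratio = explained_variance / explained_variance.sum()

    components = Vt[:n_components]
    scores = Z @ components.T
    return scores, components, ratio[:n_components]
```

The explained variance ratios indicate how many components are worth keeping, and the entries of each component show which original features dominate it.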

2.3 Any foresights?

[a] Forecast the stock trends as if COVID had not occurred between March 2020 and March 2021. Explain, with visualizations, the differences between the actual data and the forecasted data for each month during that period. (10 points)
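A simple baseline for this counterfactual is to fit a trend on pre-COVID data only and extrapolate it across the COVID year. The sketch below assumes a log-linear trend on a contiguous monthly closing-price series. The function name, the cutoff dates, and the choice of model are assumptions, and a seasonal or ETS model could be substituted.

```python
import numpy as np
import pandas as pd


def counterfactual_gap(monthly_close, cutoff="2020-03-01", end="2021-03-01"):
    """Compare actual monthly closes with a trend extrapolated from pre-cutoff data.

    monthly_close: pandas Series with a monthly DatetimeIndex and positive prices.
    Returns a DataFrame with actual, forecast and gap (actual - forecast) per month.
    """
    cutoff, end = pd.Timestamp(cutoff), pd.Timestamp(end)
    pre = monthly_close[monthly_close.index < cutoff]
    actual = monthly_close[(monthly_close.index >= cutoff) & (monthly_close.index <= end)]

    # Fit log(price) = intercept + slope * t on the pre-COVID months only.
    t_pre = np.arange(len(pre))
    slope, intercept = np.polyfit(t_pre, np.log(pre.to_numpy()), deg=1)

    # Extrapolate the fitted trend over the months that follow the cutoff.
    t_future = np.arange(len(pre), len(pre) + len(actual))
    forecast = np.exp(intercept + slope * t_future)

    return pd.DataFrame(
        {
            "actual": actual.to_numpy(),
            "forecast": forecast,
            "gap": actual.to_numpy() - forecast,
        },
        index=actual.index,
    )
```

Plotting the actual and forecast columns together, with the gap shown as bars, gives the month-by-month comparison the task asks for.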

  • Uploaded By : Mohit
  • Posted on : April 29th, 2024