BUSS6002 Solar Power Generation Report
- Subject Code: BUSS6002
- Country: Australia
Overview
The Australian federal government is building a website to provide households with a tool to determine if installing a rooftop solar panel system is right for them. On the website, users will be able to enter information about their house and the website will provide an estimate of the possible solar power generation.
The government collected a random sample of existing households with solar panels, including information about the households, the solar panel installation and the associated solar power generation. The generation from the solar panels was collected from 1/1/2022 to 31/12/2022. The data has been assembled from multiple sources, including customer energy retailers, energy distributors and solar installers. Sampling is limited to installations with:
- a single solar panel array or multiple arrays that are oriented identically,
- rooftop installation only.
Following the EDA you presented in Assignment 1, you have been given a new task: determine whether a model can be built to predict Generation that outperforms simple baselines.
Data Files
The following files are available on Canvas.
- SolarSurvey.csv: data file with 3000 observations.
- DataDictionary.txt: data dictionary containing a description of each variable.
1 Introduction
In this section of your report, you should:
- provide a brief project background so that the reader of your report can understand the general problem that you are solving;
- state the aim of your project;
- briefly describe the dataset;
- briefly summarise your key results.
2 Candidate model
Propose at least three candidate models for predicting the response variable Generation. For i ∈ {1, 2, 3}, each candidate model should take the form y = f_i(x_iᵀβ_i) + ε_i, where y is the Generation, and x_i, β_i, and ε_i are the predictor vector, parameter vector, and error term of the i-th model, respectively. The set of variables chosen for the feature vector x_i should be a subset (or constructed from a subset) of the predictors in the provided dataset. You may label your models M1, M2, and M3. The proposed models should differ in terms of model complexity (i.e., number of parameters) and/or feature engineering. For each proposed model, you should:
- clearly define the function f_i, which can be either linear or nonlinear with respect to x_i;
- clearly define the feature vector x_i;
- justify your choices of f_i and x_i;
- state any assumptions on the error term ε_i;
- discuss how the model parameters β_i can be estimated.
Hints:
- one effective way to motivate/justify your choices of fi and xi is to present the relevant evidence in the data.
- carefully consider how the predictors are related to the target.
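As an illustration of the last bullet point in Section 2, the parameters β_i of a linear candidate model can be estimated by ordinary least squares. The sketch below uses synthetic data as a stand-in for SolarSurvey.csv, and the single feature (panel capacity in kW) is an assumption; the real feature names are in DataDictionary.txt.

```python
import numpy as np

# Synthetic stand-in for the survey data; the Capacity feature and its
# relationship to Generation are assumptions for illustration only.
rng = np.random.default_rng(0)
capacity = rng.uniform(1, 10, size=100)                  # hypothetical predictor (kW)
generation = 450 * capacity + rng.normal(0, 50, 100)     # hypothetical Generation

# OLS estimate of beta for a linear candidate model y = x^T beta + eps.
X_design = np.column_stack([np.ones(100), capacity])     # add intercept column
beta_hat, *_ = np.linalg.lstsq(X_design, generation, rcond=None)
# beta_hat[0] is the intercept estimate, beta_hat[1] the slope estimate.
```

Nonlinear choices of f_i (e.g. a log or polynomial transform of x_iᵀβ_i) can still often be estimated by least squares after transforming the features or the response.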
3 Model estimation and selection
Select the best model from the set of candidate models proposed in Section 2 using the validation set approach. In this section of your report, you should:
- include a description of the model selection procedure that you adopted;
- report and discuss the estimation results (based on the training set) of each candidate model;
- discuss whether each candidate model is correctly specified, based on the residuals obtained from fitting each model to the training set;
- report the validation performance (MSE) of each candidate model;
- identify the best model;
- discuss the complexity of the selected model in terms of bias-variance tradeoff.
The description of the model selection procedure (first point above) should provide enough details so that the reader is able to implement exactly what you have done by following your description.
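The validation set approach described above can be sketched as follows. The split proportions, the synthetic data, and the use of polynomial degree as the complexity axis are all illustrative assumptions, not requirements of the brief.

```python
import numpy as np

# Synthetic stand-in for the dataset (the real data comes from SolarSurvey.csv).
rng = np.random.default_rng(1)
n = 300
x = rng.uniform(1, 10, n)
y = 450 * x + rng.normal(0, 50, n)

# Randomly split into training and validation sets (an up-front, fixed split).
idx = rng.permutation(n)
train, val = idx[:200], idx[200:]

def fit_candidate(deg):
    # Each polynomial degree plays the role of one candidate model (M1, M2, M3).
    return np.polyfit(x[train], y[train], deg)

def validation_mse(coef):
    resid = y[val] - np.polyval(coef, x[val])
    return np.mean(resid ** 2)

# Estimate each candidate on the training set, score it on the validation set,
# and select the candidate with the lowest validation MSE.
mses = {deg: validation_mse(fit_candidate(deg)) for deg in (1, 2, 3)}
best = min(mses, key=mses.get)
```

Reporting the full dictionary of validation MSEs (not just the winner) supports the bias-variance discussion asked for in the last bullet point.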
4 Model evaluation
Evaluate the generalisation performance of the model selected in Section 3 against two benchmark models. The generalisation performance should be measured by the observed MSE calculated using the test set. In this section of your report, you should:
- combine the training and validation sets and re-estimate the selected model on the combined set;
- describe the model evaluation procedure;
- describe the two benchmark models;
- report and discuss the generalisation (i.e., test set) performance of the selected model against the two benchmark models.
The two benchmark models are specified in the following subsections.
4.1 Benchmark Model 1
Benchmark Model 1 (BM1) predicts Generation by averaging the observed Generation values for modern systems within each city. Modern systems are those installed in 2019, 2020 and 2021.
Let D be the set constructed by combining (or concatenating) the observed Generation in the training and validation sets. Let C(x) be the subset of D that contains only the Generation from the city of x installed between 2019 and 2021. For example, C(Sydney) contains the Generation in D from Sydney only. Then BM1 is given by: ŷ = (1/m(x)) Σ_{y ∈ C(x)} y, where m(x) is the size of the set C(x).
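The BM1 formula amounts to a grouped mean, which can be sketched with pandas. The column names (City, YearInstalled, Generation) and the toy values are assumptions; check DataDictionary.txt for the real names.

```python
import pandas as pd

# Toy stand-in for D, the combined training + validation data.
D = pd.DataFrame({
    "City": ["Sydney", "Sydney", "Melbourne", "Sydney", "Melbourne"],
    "YearInstalled": [2019, 2020, 2018, 2021, 2020],
    "Generation": [4000.0, 4400.0, 3600.0, 4200.0, 3800.0],
})

# Keep only modern systems (installed 2019-2021), then average per city.
modern = D[D["YearInstalled"].between(2019, 2021)]
bm1 = modern.groupby("City")["Generation"].mean()

print(bm1["Sydney"])   # (4000 + 4400 + 4200) / 3 = 4200.0
```

BM1's prediction for a new household is then simply the stored mean for its city.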
4.2 Benchmark Model 2
Benchmark Model 2 (BM2) extends BM1 by further grouping on panel capacity, i.e., let C(x1, x2) be the subset of D that contains only the Generation:
- from the city of x1,
- with the panel capacity of x2,
- installed between 2019 and 2021.
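Since BM2 only adds a second grouping variable, the BM1 sketch extends directly by grouping on both city and capacity. As before, the column names and values are illustrative assumptions.

```python
import pandas as pd

# Toy stand-in for D, the combined training + validation data.
D = pd.DataFrame({
    "City": ["Sydney", "Sydney", "Sydney", "Melbourne"],
    "Capacity": [5.0, 5.0, 6.6, 5.0],
    "YearInstalled": [2019, 2021, 2020, 2020],
    "Generation": [4000.0, 4200.0, 5500.0, 3800.0],
})

# BM2: mean Generation of modern systems within each (City, Capacity) group.
modern = D[D["YearInstalled"].between(2019, 2021)]
bm2 = modern.groupby(["City", "Capacity"])["Generation"].mean()

print(bm2[("Sydney", 5.0)])   # (4000 + 4200) / 2 = 4100.0
```

Note that BM2 can fail to produce a prediction for a (city, capacity) combination absent from D; how to handle such empty groups is worth addressing in the report.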