PGP-DS Capstone Project Industry Review Writing

Added on: 2023-06-21 05:28:26

Guidelines for PGP-DS Capstone Project

Industry Review

  • Industry review: current practices, background research
  • Literature survey: publications, applications, past and ongoing research

Dataset and Domain

  • Data dictionary
  • Variable categorization (count of numeric and categorical)
  • Pre-processing data analysis (count of missing/null values, redundant columns)
  • Alternate sources of data that can supplement the core dataset (at least 2-3 columns)
  • Project justification: project statement, complexity involved, project outcome (commercial, academic, or social value)
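The variable categorization and pre-processing counts listed above could be gathered along these lines. This is a minimal sketch using pandas; the toy dataset, its column names, and the `profile` helper are hypothetical and only stand in for the project's real data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; column names are illustrative only.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 41, 38],
    "income": [52000.0, 61000.0, 58000.0, np.nan, 75000.0],
    "city": ["Pune", "Delhi", "Pune", None, "Mumbai"],
    "country": ["IN", "IN", "IN", "IN", "IN"],  # constant, so redundant
})


def profile(frame: pd.DataFrame) -> dict:
    """Summarize variable types, missing values, and constant columns."""
    numeric_cols = frame.select_dtypes(include="number").columns.tolist()
    categorical_cols = frame.select_dtypes(exclude="number").columns.tolist()
    missing = frame.isna().sum().to_dict()
    # A column with at most one distinct non-null value carries no information.
    redundant = [c for c in frame.columns if frame[c].nunique(dropna=True) <= 1]
    return {
        "numeric": numeric_cols,
        "categorical": categorical_cols,
        "missing": missing,
        "redundant": redundant,
    }


summary = profile(df)
print(summary)
```

A constant column is only one kind of redundancy; duplicated or perfectly correlated columns would need a separate check.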

Data Exploration (EDA)

  • Relationship between variables
  • Check for
    • Multicollinearity
    • Distribution of variables
    • Presence of outliers and their treatment
    • Statistical significance of variables
    • Class imbalance and its treatment
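Three of the checks above (multicollinearity, outliers, class imbalance) can be sketched with NumPy and pandas alone. The helper names `vif`, `iqr_outliers`, and `balanced_class_weights` are hypothetical, the data is synthetic, and the 1.5 × IQR fence is a common convention rather than a requirement of these guidelines.

```python
import numpy as np
import pandas as pd


def vif(frame: pd.DataFrame) -> pd.Series:
    """Variance inflation factor per column: 1 / (1 - R^2), where R^2 comes
    from regressing that column on all the others (with an intercept)."""
    values = {}
    for col in frame.columns:
        y = frame[col].to_numpy(dtype=float)
        others = frame.drop(columns=col).to_numpy(dtype=float)
        design = np.column_stack([np.ones(len(others)), others])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        residuals = y - design @ coef
        r2 = 1.0 - residuals.var() / y.var()
        values[col] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return pd.Series(values)


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Boolean mask of points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


def balanced_class_weights(target: pd.Series) -> dict:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = target.value_counts()
    return {cls: len(target) / (len(counts) * n) for cls, n in counts.items()}


# Synthetic features: x3 is nearly a copy of x1, so both should show a high VIF.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
features = pd.DataFrame({
    "x1": x1,
    "x2": x2,
    "x3": x1 + 0.05 * rng.normal(size=200),
})
vifs = vif(features)
print(vifs.round(1))

values = pd.Series([10, 12, 11, 13, 12, 11, 95])
outlier_mask = iqr_outliers(values)
print(values[outlier_mask].tolist())

target = pd.Series([0] * 90 + [1] * 10)
weights = balanced_class_weights(target)
print(weights)
```

A VIF above roughly 5 to 10 is often read as a sign of problematic multicollinearity, though the cutoff is a rule of thumb. Class weights are only one possible treatment for imbalance; resampling is another.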

Feature Engineering

  • Whether any transformations are required
  • Scaling the data
  • Feature selection
  • Dimensionality reduction
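The four steps above might be chained as follows. This is a sketch with scikit-learn on synthetic data; the choice of `log1p` for a skewed variable, `SelectKBest` with `f_classif`, `k=6`, and three principal components are all illustrative assumptions, not prescribed settings.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Transformation: a log transform often reduces right skew.
rng = np.random.default_rng(0)
skewed = pd.Series(rng.lognormal(mean=0.0, sigma=1.0, size=1000))
skew_before = skewed.skew()
skew_after = np.log1p(skewed).skew()
print(f"skew before: {skew_before:.2f}, after log1p: {skew_after:.2f}")

# Scaling, feature selection, and dimensionality reduction in one pipeline.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, random_state=0
)
pipe = Pipeline([
    ("scale", StandardScaler()),                         # zero mean, unit variance
    ("select", SelectKBest(score_func=f_classif, k=6)),  # keep the 6 best-scoring features
    ("reduce", PCA(n_components=3)),                     # project onto 3 components
])
X_reduced = pipe.fit_transform(X, y)
explained = pipe.named_steps["reduce"].explained_variance_ratio_
print(X_reduced.shape)
print(explained.round(3))
```

Keeping the steps inside a `Pipeline` means they are fitted on training data only when the pipeline is later used with cross-validation, which avoids leaking test information into the scaler or selector.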

Assumptions

  • Check the assumptions to be satisfied for each of the models in
    • Regression: SLR, Multiple Linear Regression, Logistic Regression
    • Classification: Decision Tree, Random Forest, SVM, bagged and boosted models
    • Clustering: PCA (multicollinearity), K-Means (presence of outliers, scaling, conversion to numerical)
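As one illustration of an assumption check, the residuals of a simple linear regression can be examined for normality and constant spread. The sketch below uses synthetic data, and the two tests chosen (Shapiro-Wilk for normality, a Spearman correlation between absolute residuals and fitted values as a rough heteroscedasticity signal) are assumptions made for this example; other models in the list need their own checks.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

# Synthetic data: y = 3x + 2 plus Gaussian noise.
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=1.0, size=200)

model = LinearRegression().fit(X, y)
fitted = model.predict(X)
residuals = y - fitted

# Normality of residuals: a small p-value suggests the residuals are not normal.
shapiro_stat, shapiro_p = stats.shapiro(residuals)

# Rough constant-variance check: if the spread of residuals grows or shrinks
# with the fitted values, this rank correlation moves away from zero.
spread_corr, spread_p = stats.spearmanr(np.abs(residuals), fitted)

print(f"slope={model.coef_[0]:.3f}, intercept={model.intercept_:.3f}")
print(f"Shapiro-Wilk p-value: {shapiro_p:.3f}")
print(f"|residual| vs fitted Spearman r: {spread_corr:.3f} (p={spread_p:.3f})")
```

Residual plots are usually inspected alongside such tests, since a p-value alone does not show what kind of violation is present.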

Interim Presentation Checkpoint

Model Building

  • Split the data into train and test sets.
  • Start with a simple model which satisfies all the above assumptions based on your
  • Check for bias and variance
  • To improve the performance, try cross-validation, ensemble models, hyperparameter tuning, grid search
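The workflow above can be sketched with scikit-learn as follows. The synthetic dataset, the logistic-regression baseline, the random-forest ensemble, and the small parameter grid are assumptions chosen to keep the example short; a real project would substitute its own data and candidate models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=42
)

# 1. Split the data, stratifying so both sets keep the class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# 2. Start with a simple baseline model.
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# 3. Bias/variance check: low scores on both sets hint at high bias,
#    a large train-test gap hints at high variance.
train_acc = baseline.score(X_train, y_train)
test_acc = baseline.score(X_test, y_test)
print(f"baseline train={train_acc:.3f} test={test_acc:.3f}")

# 4a. Cross-validation gives a more stable estimate than a single split.
cv_scores = cross_val_score(
    LogisticRegression(max_iter=1000), X_train, y_train, cv=5
)
print(f"5-fold CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")

# 4b. Ensemble model with hyperparameters tuned by grid search.
grid = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    cv=3,
)
grid.fit(X_train, y_train)
tuned_test_acc = grid.score(X_test, y_test)
print(f"best params: {grid.best_params_}, tuned test={tuned_test_acc:.3f}")
```

The test set is touched only for the final scores; model selection happens inside the cross-validation folds of the training set.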

Evaluation of Model

  • Regression: RMSE, R-squared value
  • Classification: classification report with precision, recall, F1-score, support, AUC
  • Clustering: inertia value
  • Comparison of the different models built and discussion of the same
  • Time taken for the inferences/predictions
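The metrics named above could be computed roughly as shown here. Everything in this sketch is illustrative: the datasets are synthetic, the models are arbitrary simple choices, and timing a single `predict` call with `time.perf_counter` is only a coarse measure of inference cost.

```python
import time

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs, make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import (
    classification_report,
    mean_squared_error,
    r2_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Regression: RMSE and R-squared, plus time taken for predictions.
X_reg, y_reg = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    X_reg, y_reg, test_size=0.25, random_state=0
)
reg = LinearRegression().fit(Xr_train, yr_train)
start = time.perf_counter()
yr_pred = reg.predict(Xr_test)
inference_seconds = time.perf_counter() - start
rmse = float(np.sqrt(mean_squared_error(yr_test, yr_pred)))
r2 = r2_score(yr_test, yr_pred)
print(f"RMSE={rmse:.2f}  R^2={r2:.3f}  predict time={inference_seconds:.6f}s")

# Classification: precision, recall, F1-score, support, and AUC.
X_clf, y_clf = make_classification(n_samples=300, n_features=8, random_state=0)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(
    X_clf, y_clf, test_size=0.25, stratify=y_clf, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(Xc_train, yc_train)
report = classification_report(yc_test, clf.predict(Xc_test))
auc = roc_auc_score(yc_test, clf.predict_proba(Xc_test)[:, 1])
print(report)
print(f"AUC={auc:.3f}")

# Clustering: inertia (within-cluster sum of squared distances).
X_blobs, _ = make_blobs(n_samples=300, centers=3, random_state=0)
inertia = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_blobs).inertia_
print(f"inertia={inertia:.1f}")
```

Inertia always decreases as the number of clusters grows, so it is more useful for comparing runs with the same `n_clusters` or for an elbow plot than as a standalone score.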

Business Recommendations & Future Enhancements

  • How to improve data collection, processing, and model accuracy?
  • Commercial value / social value / research value
  • Recommendations based on insights

Final Presentation Checkpoint

Dashboard

  • EDA: correlation matrix, pair plots, box plots, distribution plots
  • Model
    • Model parameters
    • Visualization of performance of the model with varying parameters
    • Visualization of model metrics
  • Testing outcome
    • Failure cases and explanation for the same
    • Most successful and obvious cases
    • Border cases
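For the testing-outcome panel, one possible way to sort predictions of a binary classifier into failure, clearly successful, and border cases is by how far the predicted probability sits from the decision threshold. The `bucket_cases` helper, the 0.5 threshold, and the two margins below are hypothetical choices for illustration.

```python
import numpy as np


def bucket_cases(y_true, proba, threshold=0.5, border_margin=0.1, confident_margin=0.4):
    """Return row indices of failure, confident-correct, and border cases."""
    y_true = np.asarray(y_true)
    proba = np.asarray(proba, dtype=float)
    pred = (proba >= threshold).astype(int)
    distance = np.abs(proba - threshold)  # distance from the decision boundary
    correct = pred == y_true
    return {
        "failures": np.flatnonzero(~correct).tolist(),
        "confident_correct": np.flatnonzero(correct & (distance >= confident_margin)).tolist(),
        "border": np.flatnonzero(distance <= border_margin).tolist(),
    }


# Hypothetical labels and predicted probabilities of the positive class.
y_true = [0, 0, 1, 1, 1, 0]
proba = [0.05, 0.45, 0.95, 0.55, 0.20, 0.85]

cases = bucket_cases(y_true, proba)
print(cases)
# {'failures': [4, 5], 'confident_correct': [0, 2], 'border': [1, 3]}
```

The returned indices can then be used to pull the corresponding rows from the test set for display and for writing up the explanation of each failure.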

Final Submission Checkpoint

  • Uploaded By : Katthy Wills
  • Posted on : June 21st, 2023