Added on: 2025-05-29 11:00:42
MA3022/MA4022/MA7022 Data Mining and Neural Networks

Computational Task 2

Due by 10.03.2025. 100 marks available.

Balloons Data Set (https://archive.ics.uci.edu/ml/datasets/Balloons)

The data were previously used in a cognitive psychology experiment; the 4 data sets represent different conditions of the experiment.

A description of the database and the original experiments can be found in the paper [Pazzani, M. (1991). The influence of prior knowledge on concept acquisition: Experimental and computational results. Journal of Experimental Psychology: Learning, Memory & Cognition, 17, 3, 416-432.]. For this assignment the four databases were joined and duplicates were removed. The resulting data set is provided in the file Baloon.csv on BB and in the table below.

The database consists of 13 cases of Inflated (T) and 13 cases of not Inflated (F).

Table 1: Balloon database

Color   Size   Act      Age    Inflated
YELLOW  SMALL  STRETCH  ADULT  T
YELLOW  SMALL  STRETCH  CHILD  T
YELLOW  SMALL  DIP      ADULT  T
YELLOW  SMALL  DIP      CHILD  F
YELLOW  LARGE  STRETCH  ADULT  T
YELLOW  LARGE  STRETCH  CHILD  T
YELLOW  LARGE  DIP      ADULT  T
YELLOW  LARGE  DIP      CHILD  F
PURPLE  SMALL  STRETCH  ADULT  T
PURPLE  SMALL  STRETCH  CHILD  T
PURPLE  SMALL  DIP      ADULT  T
PURPLE  SMALL  DIP      CHILD  F
PURPLE  LARGE  STRETCH  ADULT  T
PURPLE  LARGE  STRETCH  CHILD  T
PURPLE  LARGE  DIP      ADULT  T
PURPLE  LARGE  DIP      CHILD  F
YELLOW  SMALL  STRETCH  CHILD  F
YELLOW  SMALL  DIP      ADULT  F
YELLOW  LARGE  STRETCH  CHILD  F
YELLOW  LARGE  DIP      ADULT  F
PURPLE  SMALL  STRETCH  CHILD  F
PURPLE  SMALL  DIP      ADULT  F
PURPLE  LARGE  STRETCH  CHILD  F
PURPLE  LARGE  DIP      ADULT  F
YELLOW  SMALL  DIP      CHILD  T
YELLOW  LARGE  STRETCH  ADULT  F
PURPLE  SMALL  STRETCH  ADULT  F
PURPLE  LARGE  STRETCH  ADULT  F

Task 1. Decision trees predictions (20 marks)

Randomly select 3 Inflated (T) and 3 not Inflated (F) cases for the test set (and exclude them from the training set, of course).

Create a decision tree that predicts Inflated (using the training set). Test its predictions on the test set (6 examples).

Repeat this procedure 7 times for 7 different choices of the test examples.

Is the structure of the decision tree different for different training sets? Describe the differences (if any).
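One possible way to organise Task 1 is sketched below, using only the Python standard library. It is an illustration under stated assumptions, not the required method: the tree is grown ID3-style with information gain, the rows of Table 1 are stored as five-letter codes, and the names `ROWS`, `build`, `predict` and `stratified_split` are hypothetical helpers introduced here. The fixed seed only makes the 7 splits reproducible.

```python
import random
from collections import Counter
from math import log2

# Table 1 in compact form, one token per row:
# Color (Y/P), Size (S/L), Act (S=STRETCH, D=DIP), Age (A/C), Inflated (T/F).
ROWS = ("YSSAT YSSCT YSDAT YSDCF YLSAT YLSCT YLDAT YLDCF "
        "PSSAT PSSCT PSDAT PSDCF PLSAT PLSCT PLDAT PLDCF "
        "YSSCF YSDAF YLSCF YLDAF PSSCF PSDAF PLSCF PLDAF "
        "YSDCT YLSAF PSSAF PLSAF").split()
FEATURES = ("Color", "Size", "Act", "Age")
DATA = [(tuple(r[:4]), r[4]) for r in ROWS]  # (feature codes, label)


def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())


def build(rows, feats):
    """Grow an ID3-style tree: leaves are 'T'/'F', inner nodes are tuples."""
    labels = [y for _, y in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not feats:
        return majority

    def remainder(f):  # weighted entropy left after splitting on feature f
        groups = Counter(x[f] for x, _ in rows)
        return sum(cnt / len(rows) * entropy([y for x, y in rows if x[f] == v])
                   for v, cnt in groups.items())

    best = min(feats, key=remainder)  # lowest remainder = highest gain
    rest = [f for f in feats if f != best]
    children = {v: build([(x, y) for x, y in rows if x[best] == v], rest)
                for v in sorted({x[best] for x, _ in rows})}
    return (best, children, majority)


def predict(tree, x):
    while isinstance(tree, tuple):
        feat, children, majority = tree
        if x[feat] not in children:  # value not seen in this branch
            return majority
        tree = children[x[feat]]
    return tree


def stratified_split(data, rng, per_class=3):
    """Draw per_class test cases from each class; the rest is the training set."""
    test_idx = set()
    for cls in "TF":
        idx = [i for i, (_, y) in enumerate(data) if y == cls]
        test_idx.update(rng.sample(idx, per_class))
    train = [d for i, d in enumerate(data) if i not in test_idx]
    test = [d for i, d in enumerate(data) if i in test_idx]
    return train, test


rng = random.Random(0)  # fixed seed (an arbitrary choice) for reproducibility
results = []
for rep in range(7):
    train, test = stratified_split(DATA, rng)
    tree = build(train, list(range(4)))
    err = sum(predict(tree, x) != y for x, y in test) / len(test)
    root = FEATURES[tree[0]] if isinstance(tree, tuple) else "leaf"
    results.append((root, err))
    print(f"split {rep + 1}: root feature = {root}, test error = {err:.2f}")
```

Comparing the printed root feature (and, if needed, the full nested tuples) across the 7 runs gives a starting point for describing how the tree structure changes with the training set.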

Task 2. Pruning (20 marks)

Do not split cells with fewer than m examples: in the previously created trees, remove every split of a cell that contains fewer than m examples.

Test the pruned trees on the corresponding test sets. For which m is the test result best (try m = 2, 3, and 4)?

Present learning curves (average test error rate as a function of m, and test error versus training set error for different m). Select the best m.
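The pruning rule can be sketched as a small recursive function. The nested-dictionary layout (keys `n`, `majority`, `feature`, `children`) and the toy tree `TOY` are hypothetical and are not fitted from Table 1; they only show how a split is removed once its cell holds fewer than m training examples.

```python
def prune(node, m):
    """Copy the tree, turning every node with fewer than m examples into a leaf."""
    if "children" not in node or node["n"] < m:
        return {"n": node["n"], "majority": node["majority"]}
    kids = {v: prune(c, m) for v, c in node["children"].items()}
    return {"n": node["n"], "majority": node["majority"],
            "feature": node["feature"], "children": kids}


def predict(node, x):
    """Follow the splits; answer with the majority label of the last node reached."""
    while "children" in node and x[node["feature"]] in node["children"]:
        node = node["children"][x[node["feature"]]]
    return node["majority"]


def count_splits(node):
    if "children" not in node:
        return 0
    return 1 + sum(count_splits(c) for c in node["children"].values())


def leaf(n, majority):
    return {"n": n, "majority": majority}


# Hypothetical toy tree: n is the number of training examples in the cell.
TOY = {"n": 11, "majority": "T", "feature": "Age", "children": {
    "ADULT": leaf(6, "T"),
    "CHILD": {"n": 5, "majority": "F", "feature": "Act", "children": {
        "DIP": leaf(2, "F"),
        "STRETCH": {"n": 3, "majority": "T", "feature": "Color", "children": {
            "YELLOW": leaf(2, "T"),
            "PURPLE": leaf(1, "F")}}}}}}

x = {"Color": "PURPLE", "Size": "SMALL", "Act": "STRETCH", "Age": "CHILD"}
for m in (2, 3, 4):
    pruned = prune(TOY, m)
    print(f"m = {m}: splits left = {count_splits(pruned)}, "
          f"prediction for x = {predict(pruned, x)}")
```

Applying `prune` to each of the 7 trees for every m, then averaging the test and training errors per m, gives the points needed for the learning curves.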

Task 3. Comparison with kNN (15 marks)

Evaluate the kNN error rate (k = 1 and k = 3) for the complete data set. Compare the decision tree error rate (for the best m) with the kNN error rate. Comment.
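A minimal sketch of the kNN evaluation follows. The task does not fix the distance or the evaluation protocol, so three choices here are assumptions: Hamming distance on the four categorical attributes, leave-one-out testing over the complete data set, and ties broken by the row order of Table 1 (a consequence of Python's stable sort). `ROWS` and `knn_loo_error` are hypothetical names.

```python
from collections import Counter

# Table 1 in compact form, one token per row:
# Color (Y/P), Size (S/L), Act (S=STRETCH, D=DIP), Age (A/C), Inflated (T/F).
ROWS = ("YSSAT YSSCT YSDAT YSDCF YLSAT YLSCT YLDAT YLDCF "
        "PSSAT PSSCT PSDAT PSDCF PLSAT PLSCT PLDAT PLDCF "
        "YSSCF YSDAF YLSCF YLDAF PSSCF PSDAF PLSCF PLDAF "
        "YSDCT YLSAF PSSAF PLSAF").split()
DATA = [(r[:4], r[4]) for r in ROWS]  # (feature codes, label)


def hamming(a, b):
    """Number of attributes on which two cases differ."""
    return sum(u != v for u, v in zip(a, b))


def knn_loo_error(data, k):
    """Leave-one-out error rate of a k-nearest-neighbour majority vote."""
    errors = 0
    for i, (x, y) in enumerate(data):
        others = data[:i] + data[i + 1:]
        # sorted() is stable, so equally distant cases keep their table order
        nearest = sorted(others, key=lambda d: hamming(x, d[0]))[:k]
        vote = Counter(label for _, label in nearest).most_common(1)[0][0]
        errors += vote != y
    return errors / len(data)


for k in (1, 3):
    print(f"k = {k}: leave-one-out error rate = {knn_loo_error(DATA, k):.3f}")
```

Because several attribute combinations occur in Table 1 with both labels, many neighbours are at distance 0 or tied, so the reported rate depends on the tie-breaking rule; this is worth mentioning in the comparison with the decision tree.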

Task 4. Linear separability (20 marks)

Solve the problem using the linear Fisher discriminant for the complete data set (select the threshold that minimizes the number of errors).
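The Fisher direction is w = S_w^{-1}(mu_T - mu_F), where S_w is the sum of the two within-class scatter matrices. A minimal standard-library sketch is given below; the 0/1 encoding of the attributes and the helper names (`scatter`, `solve`, `errors_at`) are assumptions made for illustration. The threshold is chosen by trying every midpoint between neighbouring projection values.

```python
# Table 1 in compact form, one token per row:
# Color (Y/P), Size (S/L), Act (S=STRETCH, D=DIP), Age (A/C), Inflated (T/F).
ROWS = ("YSSAT YSSCT YSDAT YSDCF YLSAT YLSCT YLDAT YLDCF "
        "PSSAT PSSCT PSDAT PSDCF PLSAT PLSCT PLDAT PLDCF "
        "YSSCF YSDAF YLSCF YLDAF PSSCF PSDAF PLSCF PLDAF "
        "YSDCT YLSAF PSSAF PLSAF").split()
# Assumed binary encoding: YELLOW, SMALL, STRETCH, ADULT -> 1, otherwise 0.
X = [[int(r[0] == "Y"), int(r[1] == "S"), int(r[2] == "S"), int(r[3] == "A")]
     for r in ROWS]
y = [int(r[4] == "T") for r in ROWS]


def mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def scatter(rows, mu):
    """Within-class scatter matrix: sum of (x - mu)(x - mu)^T."""
    d = len(mu)
    S = [[0.0] * d for _ in range(d)]
    for x in rows:
        diff = [a - b for a, b in zip(x, mu)]
        for i in range(d):
            for j in range(d):
                S[i][j] += diff[i] * diff[j]
    return S


def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


X1 = [x for x, t in zip(X, y) if t == 1]
X0 = [x for x, t in zip(X, y) if t == 0]
mu1, mu0 = mean(X1), mean(X0)
Sw = [[a + b for a, b in zip(r1, r0)]
      for r1, r0 in zip(scatter(X1, mu1), scatter(X0, mu0))]
w = solve(Sw, [a - b for a, b in zip(mu1, mu0)])
proj = [sum(wi * xi for wi, xi in zip(w, x)) for x in X]


def errors_at(t):
    """Errors when a case is predicted Inflated iff its projection exceeds t."""
    return sum((p > t) != bool(label) for p, label in zip(proj, y))


levels = sorted(set(proj))
candidates = ([levels[0] - 1.0]
              + [(a + b) / 2 for a, b in zip(levels, levels[1:])]
              + [levels[-1] + 1.0])
best_t = min(candidates, key=errors_at)
print("w =", [round(v, 3) for v in w])
print("threshold =", round(best_t, 3),
      "errors =", errors_at(best_t), "of", len(y))
```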

Train the Rosenblatt perceptron to solve this problem. Describe the result. Compare it with Fisher's discriminant.
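A sketch of the classical perceptron update rule is shown below. Table 1 contains identical attribute combinations with opposite labels (for example, YELLOW SMALL STRETCH CHILD appears as both T and F), so the classes are not linearly separable and the loop is expected to stop at the epoch limit rather than converge. The 0/1 encoding, the leading bias input, the limit of 100 epochs and the function names are assumptions made for illustration.

```python
# Table 1 in compact form, one token per row:
# Color (Y/P), Size (S/L), Act (S=STRETCH, D=DIP), Age (A/C), Inflated (T/F).
ROWS = ("YSSAT YSSCT YSDAT YSDCF YLSAT YLSCT YLDAT YLDCF "
        "PSSAT PSSCT PSDAT PSDCF PLSAT PLSCT PLDAT PLDCF "
        "YSSCF YSDAF YLSCF YLDAF PSSCF PSDAF PLSCF PLDAF "
        "YSDCT YLSAF PSSAF PLSAF").split()
# Assumed encoding: leading 1 is the bias input, then 0/1 attribute codes.
X = [[1, int(r[0] == "Y"), int(r[1] == "S"), int(r[2] == "S"), int(r[3] == "A")]
     for r in ROWS]
y = [1 if r[4] == "T" else -1 for r in ROWS]


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_perceptron(X, y, max_epochs=100):
    """Rosenblatt rule: on each misclassified case, add t * x to the weights."""
    w = [0.0] * len(X[0])
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x, t in zip(X, y):
            if t * dot(w, x) <= 0:  # wrong side of (or on) the boundary
                w = [wi + t * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # a full pass without updates: converged
            return w, epoch, True
    return w, max_epochs, False


def count_errors(w, X, y):
    return sum(t * dot(w, x) <= 0 for x, t in zip(X, y))


w, epochs, converged = train_perceptron(X, y)
print("converged:", converged, "after", epochs, "epochs")
print("weights (bias first):", w)
print("training errors:", count_errors(w, X, y), "of", len(y))
```

Since the final weights depend on where the loop stops, the error count of the last weight vector can be compared with the Fisher result, but it is not guaranteed to be the best count seen during training.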

Task 5. Clustering. (15 marks)

Find clusters using k-means (k=2,3).

Evaluate the quality of clustering using one of the standard indexes.

Analyse the distribution of Inflated and not Inflated cases across the clusters. Are there clusters that contain only one class?

Report (with diagrams and plots) (10 marks for quality)
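The three clustering steps can be sketched together as follows. This is an illustration under assumptions, not a prescribed solution: the attributes are encoded as 0/1, k-means is started from k distinct data points drawn with a fixed seed, and the silhouette coefficient is used as one possible standard index. All function names are hypothetical.

```python
import random
from collections import Counter

# Table 1 in compact form, one token per row:
# Color (Y/P), Size (S/L), Act (S=STRETCH, D=DIP), Age (A/C), Inflated (T/F).
ROWS = ("YSSAT YSSCT YSDAT YSDCF YLSAT YLSCT YLDAT YLDCF "
        "PSSAT PSSCT PSDAT PSDCF PLSAT PLSCT PLDAT PLDCF "
        "YSSCF YSDAF YLSCF YLDAF PSSCF PSDAF PLSCF PLDAF "
        "YSDCT YLSAF PSSAF PLSAF").split()
# Assumed binary encoding: YELLOW, SMALL, STRETCH, ADULT -> 1, otherwise 0.
X = [(int(r[0] == "Y"), int(r[1] == "S"), int(r[2] == "S"), int(r[3] == "A"))
     for r in ROWS]
y = [r[4] for r in ROWS]


def dist2(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def kmeans(X, k, rng, n_iter=100):
    """Lloyd's algorithm, started from k distinct data points."""
    centers = [list(map(float, c)) for c in rng.sample(sorted(set(X)), k)]
    assign = None
    for _ in range(n_iter):
        new = [min(range(k), key=lambda j: dist2(x, centers[j])) for x in X]
        if new == assign:  # assignments stable: converged
            break
        assign = new
        for j in range(k):
            members = [x for x, a in zip(X, assign) if a == j]
            if members:  # an empty cluster keeps its old centre
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centers


def silhouette(X, assign):
    """Mean silhouette coefficient with Euclidean distance."""
    clusters = sorted(set(assign))
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, (x, a) in enumerate(zip(X, assign)):
        mean_d = {}
        for c in clusters:
            d = [dist2(x, z) ** 0.5
                 for j, (z, b) in enumerate(zip(X, assign)) if b == c and j != i]
            mean_d[c] = sum(d) / len(d) if d else None
        own = mean_d[a]
        other = min(v for c, v in mean_d.items() if c != a)
        if own is None or max(own, other) == 0:
            continue  # singleton cluster or all-zero distances: s(i) = 0
        total += (other - own) / max(own, other)
    return total / len(X)


for k in (2, 3):
    assign, _ = kmeans(X, k, random.Random(0))
    table = Counter(zip(assign, y))  # (cluster, class) -> count
    print(f"k = {k}: silhouette = {silhouette(X, assign):.3f}")
    for c in sorted(set(assign)):
        print(f"  cluster {c}: T = {table[(c, 'T')]}, F = {table[(c, 'F')]}")
```

A cluster contains a single class exactly when one of its two counts is zero. Because k-means depends on the starting points, repeating the run with several seeds is advisable before drawing conclusions.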

