Web Tracking Assessment

Added on: 2023-05-29 09:45:51
TASK 1: Merits of Entropy in Attack Detection/Diagnostics (marks 4)

Please answer the following questions based on a server-log dataset that is available on Google Drive at this link: https://drive.google.com/file/d/1JLMpm6aQ5FJWtBW0VhFBP-BFQ9lXCS4W/view?usp=sharingF.

The dataset contains information about two attacks that occurred sometime between 8:00 am and noon on a single day:

  • Identify the precise date and time of the attacks, as indicated in the columns of the dataset. Describe the attack methodology used by the attackers. (marks 2)
  • There is a significant body of literature [1, 2] that discusses the use of entropy to detect network attacks. Approximation schemes are typically used to make this process more effective. You do not need to implement these approximation techniques, but you should analyze how useful entropy is and which combinations of fields should be tried, such as source IP, destination IP, source port, and destination port. During the two attacks in the dataset, did any of these combinations reveal anomalies? (marks 2)
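As a starting point for the entropy analysis, the sketch below computes the Shannon entropy H = -Σ p·log2(p) of a chosen field combination inside fixed time windows. It is only an illustration: the record layout (`ts`, `src_ip`, `dst_port`), the 60-second window, and the toy records are assumptions and do not reflect the actual columns of the server-log dataset.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p) is the same as -p * log2(p) and avoids a negative zero.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def windowed_entropy(records, fields, window_seconds=60):
    """Entropy of the tuple of `fields`, computed per fixed time window."""
    buckets = defaultdict(list)
    for rec in records:
        window_start = int(rec["ts"] // window_seconds) * window_seconds
        buckets[window_start].append(tuple(rec[f] for f in fields))
    return {start: shannon_entropy(keys) for start, keys in sorted(buckets.items())}


# Hypothetical toy records (field names are assumptions, not the dataset's columns).
records = [
    # Window 0: ordinary traffic from four different clients.
    {"ts": 0, "src_ip": "10.0.0.1", "dst_port": 80},
    {"ts": 10, "src_ip": "10.0.0.2", "dst_port": 80},
    {"ts": 20, "src_ip": "10.0.0.3", "dst_port": 443},
    {"ts": 30, "src_ip": "10.0.0.4", "dst_port": 443},
    # Window 60: one client probing four different ports (scan-like pattern).
    {"ts": 60, "src_ip": "10.0.0.9", "dst_port": 21},
    {"ts": 70, "src_ip": "10.0.0.9", "dst_port": 22},
    {"ts": 80, "src_ip": "10.0.0.9", "dst_port": 23},
    {"ts": 90, "src_ip": "10.0.0.9", "dst_port": 25},
]

for combo in (["src_ip"], ["dst_port"], ["src_ip", "dst_port"]):
    print(combo, windowed_entropy(records, combo))
```

In this toy data the second window mimics a port scan: the source IP entropy drops to 0 bits while the destination port entropy rises to 2 bits. A sudden change of this kind in one field, relative to the neighbouring windows, is the sort of anomaly the question asks about; which combinations are informative on the real dataset has to be checked empirically.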

TASK 2: Web Tracking (marks 8)

When a user accesses a webpage in their web browser, the webpage typically contains multiple web components, such as images, JavaScript codes, Flash content, and CSS. These components are often downloaded through additional HTTP(S) connections from either the first-party domain (the website the user is visiting) or from third-party domains. This article focuses specifically on JavaScript codes, which are commonly used by ad networks, content distribution networks (CDNs), tracking services, analytics platforms, and online social networks (for example, Facebook’s plugins).

Figure 1. Overview of the web page rendering process and web tracking. Websites (in this case cnn.com) use third-party domains for content provision and analytics services.

The scenario of web tracking via JavaScript codes is depicted in Figure 1. When a user accesses a web page from a first-party domain (steps 1 and 2), the web browser interprets the HTML tags and executes any JavaScript programs within the HTML script tags. These programs may trigger the browser to send additional requests to retrieve content from third-party domains (step 3). Depending on their intended functionality, JavaScript programs can be considered either useful (functional), such as fetching content from a CDN, or used for tracking purposes. In the latter case, once the web page has fully loaded (step 4), the JavaScript codes track the user’s activities on the web page, read from or write to the cookie database (steps 5 and 6), and potentially reconstruct user identifiers. Tracking JavaScript programs may also be employed to "fingerprint" the user’s browser and system, and to transfer sensitive information to third-party domains (step 7).

Suppose you are tasked with developing a machine learning model based on a single class (e.g., One-Class SVM (OCSVM) or Positive Unlabeled (PU) Learning, see ref. [3]) to distinguish between functional and tracking JavaScript codes. You will be provided with a labeled dataset that contains functional and tracking JavaScript codes, which can be found on [...]. You may use the code provided on iLearn to perform the following tasks.

  • Use Term Frequency - Inverse Document Frequency (TF-IDF) to extract features from functional and tracking JavaScript codes. (marks 2)
  • Develop either a One-Class SVM or a PU Learning model, and a baseline SVM for comparison, to classify the JavaScript codes. (marks 3)
  • Design and conduct experiments to validate and test the efficacy of your developed model (marks 3):
    • To report any over- or under-fitting of the models, you may use 60% of the data for training, 20% for validation, and 20% for testing.
    • Report and discuss the parameters of the OCSVM or PU Learning model that give your best results.
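The Task 2 steps above (TF-IDF features, a 60/20/20 split, a One-Class SVM compared against a baseline SVM, and a small parameter search on the validation set) can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions, not the code provided on iLearn: the toy corpus, the token pattern, the choice to train the one-class model on the tracking class only, the `nu` grid, and the F1 metric are all illustrative choices.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, OneClassSVM

# Hypothetical toy corpus; replace with the labeled scripts from the dataset.
tracking = [f"document.cookie = 'uid={i}'; navigator.userAgent; sendBeacon('/collect', uid{i});"
            for i in range(10)]
functional = [f"function render{i}(el) {{ el.classList.add('visible'); el.style.width = '{i}px'; }}"
              for i in range(10)]
texts = tracking + functional
labels = [1] * len(tracking) + [0] * len(functional)  # 1 = tracking, 0 = functional

# 60% training, 20% validation, 20% testing (stratified by label).
train_txt, tmp_txt, y_train, y_tmp = train_test_split(
    texts, labels, test_size=0.4, stratify=labels, random_state=0)
val_txt, test_txt, y_val, y_test = train_test_split(
    tmp_txt, y_tmp, test_size=0.5, stratify=y_tmp, random_state=0)

# Fit TF-IDF on the training split only, so no vocabulary leaks from val/test.
vectorizer = TfidfVectorizer(token_pattern=r"[A-Za-z_$][\w$]*")
X_train = vectorizer.fit_transform(train_txt)
X_val = vectorizer.transform(val_txt)
X_test = vectorizer.transform(test_txt)

# Assumption: the one-class model sees only tracking scripts during training.
pos_rows = np.flatnonzero(np.array(y_train) == 1)
X_train_pos = X_train[pos_rows]


def to_label(pred):
    """Map OneClassSVM output (+1 inlier, -1 outlier) to 1 = tracking, 0 = functional."""
    return (pred == 1).astype(int)


# Small illustrative search over nu, selected on the validation split.
best = None
for nu in (0.1, 0.3, 0.5):
    model = OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(X_train_pos)
    score = f1_score(y_val, to_label(model.predict(X_val)), zero_division=0)
    if best is None or score > best[0]:
        best = (score, nu, model)

# Baseline: a standard two-class SVM trained on both classes.
baseline = SVC(kernel="linear").fit(X_train, y_train)

ocsvm_f1 = f1_score(y_test, to_label(best[2].predict(X_test)), zero_division=0)
svm_f1 = f1_score(y_test, baseline.predict(X_test), zero_division=0)
print(f"best nu={best[1]}, OCSVM test F1={ocsvm_f1:.2f}, baseline SVM test F1={svm_f1:.2f}")
```

Comparing training, validation, and test scores for each `nu` (and, for the RBF kernel, each `gamma`) is one way to report over- or under-fitting; a large gap between training and validation scores suggests over-fitting, while uniformly low scores suggest under-fitting.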

Uploaded by: Katthy Wills
Posted on: May 29th, 2023