
Distributed Data Processing Framework - Apache Spark Report Writing

Added on: 2023-01-16 06:33:15
  • Country: Australia

Tasks: 

  1. Using Spark, write a program to count how many times each word occurs in “book.txt”.

Example: 

Input: “A distributed system is a collection of autonomous computing elements that appears to its users as a single coherent system.”

Output: system: 2, distributed: 1, …
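
The word-count task above can be sketched in PySpark roughly as follows. This is a minimal sketch, not a reference solution: it assumes that words are compared case-insensitively and that punctuation is dropped, which the task does not state. The names `tokenize`, `count_words`, and `run` are illustrative only.

```python
import re
from operator import add

# Assumption: a "word" is a run of letters, digits, or apostrophes.
WORD_RE = re.compile(r"[a-z0-9']+")


def tokenize(line):
    """Lowercase a line and split it into words, dropping punctuation."""
    return WORD_RE.findall(line.lower())


def count_words(lines):
    """Turn an RDD of text lines into an RDD of (word, count) pairs."""
    return (lines.flatMap(tokenize)          # one record per word
                 .map(lambda word: (word, 1))
                 .reduceByKey(add))           # add up the 1s for each word


def run(path="book.txt"):
    """Count the words in `path` and return (word, count) pairs, most frequent first."""
    # Imported here so that the helpers above can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("WordCount").getOrCreate()
    try:
        lines = spark.sparkContext.textFile(path)
        return count_words(lines).sortBy(lambda pair: -pair[1]).collect()
    finally:
        spark.stop()
```

Calling `run("book.txt")` returns a list of `(word, count)` tuples. For a large book, `take(n)` or `saveAsTextFile` may be a better fit than `collect()`, which pulls every pair back to the driver.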

  2. Using Spark, write a program to count how many times each letter appears in “book.txt”.
  3. Using Spark, write a program to convert the words in “book.txt” to lowercase and write the result to the file “words_lower.txt”.
  4. Using Spark, write a program to replace spaces with “-” in “book.txt” and write the result to “words-.txt”.
  5. Using Spark, compute the sum of the numbers given in “numbers.txt” in the numbers.zip file.
  • Additionally, you are given the files Numbers2.txt, Numbers4.txt, Numbers8.txt, Numbers16.txt, and Numbers32.txt.
  • Compute the sum of the numbers in each individual file and plot a bar graph. Plot the size of the file on the x-axis and the time taken by Spark to compute the result on the y-axis.
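
The letter-count, lowercase, hyphen, sum, and timing tasks above could be approached along these lines. This is a hedged sketch under several assumptions that the task does not confirm: letters are counted case-insensitively, the number files hold whitespace-separated numeric values, numbers.zip has already been extracted (Spark's `textFile` does not read `.zip` archives), and all files sit on the local filesystem so that `os.path.getsize` works. All function names and the output file `timings.png` are illustrative.

```python
import os
import time
from operator import add


def letters_in(line):
    """Return the alphabetic characters of a line, lowercased."""
    return [ch for ch in line.lower() if ch.isalpha()]


def hyphenate(line):
    """Replace every space in a line with a hyphen."""
    return line.replace(" ", "-")


def parse_numbers(line):
    """Parse whitespace-separated numbers from one line (assumed file layout)."""
    return [float(token) for token in line.split()]


def letter_counts(lines):
    """Turn an RDD of lines into an RDD of (letter, count) pairs."""
    return lines.flatMap(letters_in).map(lambda ch: (ch, 1)).reduceByKey(add)


def sum_file(sc, path):
    """Sum every number in one text file."""
    return sc.textFile(path).flatMap(parse_numbers).sum()


def timed_sum(sc, path):
    """Return (file size in bytes, seconds taken to sum the file)."""
    start = time.perf_counter()
    sum_file(sc, path)  # sum() is an action, so the Spark job runs here
    return os.path.getsize(path), time.perf_counter() - start


def run(book="book.txt",
        number_files=("Numbers2.txt", "Numbers4.txt", "Numbers8.txt",
                      "Numbers16.txt", "Numbers32.txt")):
    # Imported here so that the helpers above can be used without Spark installed.
    from pyspark.sql import SparkSession
    import matplotlib.pyplot as plt

    spark = SparkSession.builder.appName("BookTasks").getOrCreate()
    sc = spark.sparkContext
    try:
        lines = sc.textFile(book)
        print(letter_counts(lines).sortByKey().collect())
        # saveAsTextFile creates a directory of part files and fails if it already exists.
        lines.map(lambda line: line.lower()).saveAsTextFile("words_lower.txt")
        lines.map(hyphenate).saveAsTextFile("words-.txt")
        print(sum_file(sc, "numbers.txt"))
        results = [timed_sum(sc, path) for path in number_files]
    finally:
        spark.stop()

    labels = [f"{size / 1024:.0f} KiB" for size, _ in results]
    seconds = [elapsed for _, elapsed in results]
    plt.bar(labels, seconds)
    plt.xlabel("File size")
    plt.ylabel("Time taken by Spark (s)")
    plt.savefig("timings.png")
```

Because `saveAsTextFile` writes one part file per partition, “words_lower.txt” and “words-.txt” end up as directories; calling `coalesce(1)` before saving is one way to get a single part file. The first timed job may also include start-up overhead, so running a small warm-up job before timing may give a fairer comparison.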

Report 

Write a one-page report on Spark covering its main features and use cases. For instance, what kinds of data can it process, and what are RDDs?

  • Uploaded By : Katthy Wills
  • Posted on : January 16th, 2023
