
CITS2003-Open Source Tool and Scripting Report Writing - Computer Science Assignment Help

Added on: 2022-08-20 00:00:00
Order Code: 5_22_25744_241
Question Task Id: 433027
  • Subject Code : CITS2003
  • Country : Australia

Assignment Task


1. Exploring Malaria Incidence Data:

Kaggle is a remarkable web-based data science resource which contains a huge number of different data sets and tutorials on tools. (Highly recommended.) One particular data set is the World Health Organisation statistics for 2020. We have downloaded data on the incidence of malaria for a range of countries and done a little pre-processing for you, mainly to convert the real-valued incidence per 1,000 population into integers (because Shell can only handle integer data).

Your program should be called malaria_incidence. It should explicitly read from incedence Of Malaria.csv, which will be placed in the same directory as your program. The program will be given a single argument. If the argument is a year, the program should report the country with the highest incidence and the corresponding incidence value for that year. If, on the other hand, the argument is a country, the program should report the year with the highest incidence and the corresponding incidence value.

In the case of several hits, do something that would make sense to the user. (Outputs will be assessed by a human, so we can deal with differences.) Finally, as you scan through the list of countries you will notice some that have further information in brackets; the bracketed information can be ignored. Sudan, where there is genuine ambiguity, will not be tested.
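The year/country dispatch described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the column layout "Country,Year,Incidence" is hypothetical (the brief does not show the file's columns), the function name max_incidence and its output format are invented here, and the file is passed as a parameter rather than hard-coded so the sketch can be tried on sample data. Ties are resolved by keeping the first matching row, which is just one of several sensible choices.

```shell
#!/bin/sh
# Sketch only. Assumes every data row is "Country,Year,Incidence" with no
# commas inside country names; the real CSV layout may differ.

# max_incidence FILE ARG
#   ARG all digits -> treat as a year, print "Country Incidence".
#   otherwise      -> treat as a country, print "Year Incidence".
# Ties are resolved by keeping the first matching row.
max_incidence() {
    case $2 in
        ''|*[!0-9]*)
            # Country lookup: compare names case-insensitively after
            # dropping any trailing bracketed information.
            awk -F, -v want="$2" '
                { name = $1; sub(/ *\(.*\)$/, "", name) }
                tolower(name) == tolower(want) && (!seen || $3 + 0 > max) {
                    max = $3 + 0; year = $2; seen = 1
                }
                END { if (!seen) exit 1; print year, max }
            ' "$1" ;;
        *)
            # Year lookup: keep the row with the largest incidence.
            awk -F, -v want="$2" '
                $2 == want && (!seen || $3 + 0 > max) {
                    max = $3 + 0; name = $1; seen = 1
                    sub(/ *\(.*\)$/, "", name)
                }
                END { if (!seen) exit 1; print name, max }
            ' "$1" ;;
    esac
}
```

The function returns a non-zero status when nothing matches, so a calling script can print its own "not found" message.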


1.1. Examples

Here is an example session:

[Image: example session screenshot]


1.2 Perhaps of Use:

You will have noticed that the names of countries have an initial capital letter, but common words such as "and" do not. Thus a country in Africa that is in the database is Sao Tome and Principe; although it is part of the name, the "and" is not capitalized. However, the first word is capitalized regardless, e.g. The. This is called title case, i.e. the form of capitalisation used in the titles of books, articles, etc. Implementing this is a bit painful in Shell.
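One possible way to approximate title case is to hand the work to awk, as in the sketch below. The function name title_case and the list of "small" words are assumptions made for illustration; the brief does not say which words stay in lower case. Hyphenated or accented names would need extra handling that this sketch does not attempt.

```shell
#!/bin/sh
# title_case STRING: lower-case everything, then capitalise the first letter
# of each word, except for a (hypothetical) list of small words that are
# not in first position.
title_case() {
    printf '%s\n' "$1" | awk '
        BEGIN {
            n = split("a an and of the in for", w, " ")
            for (k = 1; k <= n; k++) small[w[k]] = 1
        }
        {
            out = ""
            for (i = 1; i <= NF; i++) {
                word = tolower($i)
                # The first word is always capitalised; small words are not.
                if (i == 1 || !(word in small))
                    word = toupper(substr(word, 1, 1)) substr(word, 2)
                out = out (i > 1 ? " " : "") word
            }
            print out
        }'
}
```

For instance, title_case "sao tome and principe" would normalise a user's argument before it is compared against the country column.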

 

2. Common Words:

This task is a development of the example that motivated several of the lectures in the unit: finding all the words in text, and from that, the most common word, etc. The program you are to write should be called common_words. With no arguments, the usage summary should be:

The program will implement two related functions, indicated by the optional arguments -w and -nth, which are mutually exclusive. In both cases, the program also expects a mandatory argument, which is a directory containing text files. If the optional argument -w, followed by a word, is provided, you must determine the frequency rank for that word in each text file and report the highest rank. Frequency ranks are numbered from 1 for the most common word in a file, 2 for second most common word in that file, etc. The program should report the text file for which the frequency rank of the specified word is highest along with the rank of the word in that file. If there are equal highest ranks, just return the first that your program finds.
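The -w behaviour might be sketched as below. The function names word_rank and best_rank are hypothetical, as is the output format. The sketch assumes that "highest rank" means the numerically smallest rank (rank 1 being the most common word), and it breaks ties between equally frequent words alphabetically, which the brief does not specify.

```shell
#!/bin/sh
# word_rank FILE WORD: print WORD's frequency rank in FILE (1 = most common).
# Words are runs of letters; letter case is preserved.
word_rank() {
    tr -cs 'A-Za-z' '\n' < "$1" | grep . |
        LC_ALL=C sort | uniq -c | LC_ALL=C sort -k1,1nr -k2,2 |
        awk -v w="$2" '
            ($2 "") == w { rank = NR }          # force a string comparison
            END { if (!rank) exit 1; print rank }'
}

# best_rank DIR WORD: print the .txt file in DIR where WORD ranks highest
# (numerically smallest rank), followed by that rank. The first file found
# wins when ranks are equal. Note: plain sh variables here are global.
best_rank() {
    best= best_file=
    for f in "$1"/*.txt; do
        [ -f "$f" ] || continue
        r=$(word_rank "$f" "$2") || continue    # word absent from this file
        if [ -z "$best" ] || [ "$r" -lt "$best" ]; then
            best=$r best_file=$f
        fi
    done
    [ -n "$best" ] && printf '%s %s\n' "$best_file" "$best"
}
```

The tr | sort | uniq -c | sort pipeline is the usual word-frequency idiom; only the final awk step, which turns a line number into a rank, is specific to this task.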

If the optional argument -nth, followed by an integer N, is provided, your program is to report the word that is the Nth most common for the largest number of files in the specified directory. Your program must also report the number of files for which that word is Nth most common. If there are no options, then assume you are being asked for the word that is the most common across the largest number of files, i.e. -nth 1. You can assume that all the text files in the text-file directory have the suffix .txt. As in the examples seen in lectures, for these purposes, a word is a sequence of one or more letters, so even though the word pair "common-garden" is linguistically one word, please treat it as two words. Secondly, unlike the examples in lectures, please preserve letter case. That is, please don't convert every letter to lower case.
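A rough sketch of the -nth behaviour follows. The names nth_word and nth_most_common and the "word count" output format are assumptions for illustration only. Ties, both within a file and across files, are broken alphabetically here, which is a choice the brief leaves open.

```shell
#!/bin/sh
# nth_word FILE N: print the Nth most common word in FILE (case preserved).
# Prints nothing if FILE has fewer than N distinct words.
nth_word() {
    tr -cs 'A-Za-z' '\n' < "$1" | grep . |
        LC_ALL=C sort | uniq -c | LC_ALL=C sort -k1,1nr -k2,2 |
        awk -v n="$2" 'NR == n { print $2 }'
}

# nth_most_common DIR [N]: print "word count" for the word that is Nth most
# common in the largest number of DIR/*.txt files. N defaults to 1.
nth_most_common() {
    for f in "$1"/*.txt; do
        [ -f "$f" ] && nth_word "$f" "${2:-1}"
    done |
        LC_ALL=C sort | uniq -c | LC_ALL=C sort -k1,1nr -k2,2 |
        awk 'NR == 1 { print $2, $1 }'
}
```

The same counting idiom is applied twice: once per file to find each file's Nth word, and once over the collected words to find which of them occurs for the most files.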

 

2.1 Examples

Here is an example session based on the set of 10 texts in the Gutenberg sample linked above:

[Image: example session screenshot]

 


  • Uploaded By : Katthy Wills
  • Posted on : May 24th, 2021