Implement a Small-Scale Distributed System
Subject Code: DDE602
Task Summary
In this assessment, you will implement a small-scale distributed system and accomplish the following:
- Demonstrate an understanding of multi-threading concepts and constructs;
- Practise synchronising multi-threaded applications;
- Gain an understanding of the advantages and disadvantages of multi-threaded applications;
- Gain knowledge of the client-server distributed model and application protocols; and
- Gain experience in writing multi-threaded
Please refer to the Task Instructions (below) for details on how to complete this task.
Context
Multi-threaded applications promise better performance and concurrency. Notably, all of us use such applications in our everyday life. For example, a web browser can download multiple files while you continue browsing. If a particular web page does not respond, this will not prevent the web browser from downloading other files. You may be surprised to learn that many of the applications that you use are probably built around multi-threaded principles.
Your task is to write a multi-threaded (i.e., concurrent) client-server application that spell checks text files using a standard operating system dictionary. You can use any programming language of your choice; however, Python is recommended for this project. The code must be well formatted and conform to Python naming conventions. You also need to provide sufficient comments in the code.
After the file has been processed by the server, the client should display all misspelt words and any near words as suggested replacements. An example of such output can be seen below.
Line No. | Misspelt word | Suggestion
7 | enogh | enough
12 | nothng | nothing
230 | happenned | happened
234 | glob | globe
563 | poeple | people
1034 | lathgter | laughter
1455 | kowing | knowing
1788 | wuold | would
1951 | quate | quite
2145 | javascript | no suggestion
2344 | winkle | wrinkle
Total processing time = 3.723 seconds
File processing should be undertaken by the server on a line-by-line basis. Each time a line of text is read, it should be pre-processed by converting all uppercase letters to lowercase letters and removing leading or trailing punctuation characters. A word is defined as a sequence of alphabetic characters together with the characters ' (apostrophe) and - (hyphen). Thus, words like co-operate and it's will be processed as whole words. The total processing time encompasses the time from the commencement of the program (or client) until its termination. To search for words in the dictionary, just use a linear search.
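The pre-processing rule above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a required implementation: the names `WORD_PATTERN` and `preprocess_line` are hypothetical, and it assumes that any character other than a letter, apostrophe or hyphen separates words.

```python
import re

# Assumption: a word is a run of lowercase letters, apostrophes and hyphens;
# every other character (spaces, digits, punctuation) acts as a separator.
WORD_PATTERN = re.compile(r"[a-z'-]+")


def preprocess_line(line):
    """Return the lowercase words of a line with edge punctuation removed."""
    words = []
    for token in WORD_PATTERN.findall(line.lower()):
        # Apostrophes or hyphens at the edges are treated as punctuation
        # (for example quotation marks), so they are stripped.
        word = token.strip("'-")
        if word:
            words.append(word)
    return words


if __name__ == "__main__":
    print(preprocess_line("He said: 'Co-operate', it's FINE!"))
```

Internal apostrophes and hyphens survive, so `co-operate` and `it's` each remain a single word, while tokens made only of punctuation are dropped.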
To determine if two words are the same (or different) use the int WordCmp(char*,char*) function below. It will return 0 if the words are the same or a number > 0 if the words are different. (Note: The greater the return value, the greater the difference between the words). To obtain a suggestion word for a misspelt word, use the word with the minimum difference. (Note: If there is more than one word with the minimum difference for a misspelt word, then use the one found first as the suggestion word).
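#include <string.h>

int WordCmp(char *Word1, char *Word2) {
    /* Identical words have no difference. */
    if (!strcmp(Word1, Word2)) return 0;
    /* Words that start with different letters are maximally different. */
    if (Word1[0] != Word2[0]) return 100;
    float AveWordLen = (strlen(Word1) + strlen(Word2)) / 2.0;
    /* Scale the count of unshared characters by the average word length. */
    return (int)(NumUniqueChars(Word1, Word2) / AveWordLen * 100);
}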
where the NumUniqueChars() is the number of chars in Word1 that are not in Word2 plus the number of chars in Word2 that are not in Word1.
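Since Python is the recommended language, the comparison and the suggestion search might be translated along these lines. This is a sketch under assumptions, not the official solution: `num_unique_chars` counts every character occurrence that is missing from the other word (the brief does not say whether repeats count once or several times), and `find_suggestion` with its `max_difference` cut-off is a hypothetical helper, because the brief does not state when "no suggestion" should be reported.

```python
def num_unique_chars(word1, word2):
    """Count characters of each word that do not occur in the other word."""
    # Assumption: every occurrence counts, so repeated letters count repeatedly.
    only_in_first = sum(1 for ch in word1 if ch not in word2)
    only_in_second = sum(1 for ch in word2 if ch not in word1)
    return only_in_first + only_in_second


def word_cmp(word1, word2):
    """Python rendering of WordCmp: 0 for equal words, up to 100 otherwise."""
    if word1 == word2:
        return 0
    # Slicing (rather than indexing) keeps empty strings from raising an error.
    if word1[:1] != word2[:1]:
        return 100
    average_length = (len(word1) + len(word2)) / 2.0
    return int(num_unique_chars(word1, word2) / average_length * 100)


def find_suggestion(word, dictionary_words, max_difference=100):
    """Linear search returning (found, suggestion) for one word.

    max_difference is a hypothetical threshold: only candidates scoring
    below it are accepted as suggestions.
    """
    best_word, best_score = None, max_difference
    for candidate in dictionary_words:
        if candidate == word:
            return True, None
        score = word_cmp(word, candidate)
        # Strict comparison keeps the first word found with the minimum score.
        if score < best_score:
            best_word, best_score = candidate, score
    return False, best_word


if __name__ == "__main__":
    words = ["enough", "english", "nothing", "people"]
    print(find_suggestion("enogh", words))
```

One design point worth noting: the formula gives a score of 0 to different words that share a first letter and use the same set of letters (for example `abb` and `aab`), so the sketch decides "found" by direct string equality and not by a zero score.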
A sample text file (Sample_File_For_Assessment_4.txt available for download from the Assessment 4 section in Blackboard) with spelling mistakes is provided for you to test your system. However, you do not have to use the sample file. Further, you are encouraged to test your code on multiple files that you have saved to your computer.
Task Instructions
You should complete this assignment by performing the following steps:
Step 1: Implement an iterative single-threaded process program without sockets. (See the algorithm below).
Step 2: Convert the program in Step 1 into a client-server application using TCP sockets. A multi-threaded server is appropriate.
Step 3: Convert the server program in Step 2 into a program capable of processing lines of input file data in parallel from an input queue and recording all misspelt words, their suggestion words and line numbers in a global linked list. When processing is complete, the contents of the linked list are sent to the client to display to the user.
Step 4: Implement mutexes and conditions in the server program in Step 3 to provide mutual exclusion to the shared global data to prevent the threaded processes updating it simultaneously.
Step 5: Add descriptive comments to your code that not only explain the code but also share your research findings, demonstrating your ability to develop large-scale distributed applications. Through the comments, exhibit your critical thinking for each of the criteria in the rubric.
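The shared results structure from Steps 3 and 4 could look something like the sketch below. It is only an illustration under assumptions: the class name `MisspeltWordList` is hypothetical, and a plain Python list stands in for the global linked list, with a `threading.Lock` providing the mutual exclusion.

```python
import threading


class MisspeltWordList:
    """Shared record of misspelt words, guarded by a mutex."""

    def __init__(self):
        self._lock = threading.Lock()
        self._records = []

    def add(self, line_number, word, suggestion):
        # Only one thread at a time may modify the shared list.
        with self._lock:
            self._records.append((line_number, word, suggestion))

    def sorted_records(self):
        # Worker threads finish in an unpredictable order, so the records
        # are sorted by line number before being sent to the client.
        with self._lock:
            return sorted(self._records, key=lambda record: record[0])


if __name__ == "__main__":
    results = MisspeltWordList()
    results.add(12, "nothng", "nothing")
    results.add(7, "enogh", "enough")
    print(results.sorted_records())
```

Keeping the lock inside the class means callers cannot forget to acquire it, which is one way to make the mutual exclusion in Step 4 hard to bypass.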
Sample Algorithms
SINGLE THREADED
open input file
while !eof
    read line from file
    preprocess line
    for each word in line
        open dictionary
        for each word in dictionary
            compare words
            if same
                found=true; break;
            if word difference is minimum
                keep dictionary word as a suggestion word
        end for
        close dictionary
        if !found
            print line no. word and suggestion on screen
close input file
print processing time
CLIENT
- get filename and number of slaves from command line
- gethostbyname (to obtain IP_Address of the Server)
- contact Server (using connect + IP_Address from above)
- send filename and number of threads to server & wait for misspelt words to return
- take misspelt words from Server and print them
- close socket
SERVER
- set up socket
- server socket object initialisation
- accept client connection (2)
- read filename and number of threaded slaves
- open file
- create threaded slaves
- keep filling input buffer with preprocessed lines (see Producer below)
- wait for threads to complete
- send misspelt words to client
- close socket & file, goto 2
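The client and server outlines above leave the wire format open. One possible approach, sketched here under assumptions, is to send each message as JSON preceded by a 4-byte length so that the receiver knows where a message ends on the TCP stream. The function names, the JSON encoding and the `filename`/`threads` keys are all hypothetical choices, not part of the brief.

```python
import json
import socket
import struct


def send_message(sock, payload):
    """Send one JSON message prefixed with its length (4 bytes, big-endian)."""
    data = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack("!I", len(data)) + data)


def _recv_exactly(sock, count):
    """Read exactly count bytes; recv() may return fewer than requested."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("socket closed before message was complete")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack("!I", _recv_exactly(sock, 4))
    return json.loads(_recv_exactly(sock, length).decode("utf-8"))


def request_spell_check(host, port, filename, thread_count):
    """Client side: send the request, then wait for the misspelt words."""
    # create_connection resolves the host name itself, which covers the
    # gethostbyname and connect steps of the outline.
    with socket.create_connection((host, port)) as sock:
        send_message(sock, {"filename": filename, "threads": thread_count})
        return recv_message(sock)


if __name__ == "__main__":
    left, right = socket.socketpair()
    send_message(left, {"filename": "sample.txt", "threads": 4})
    print(recv_message(right))
    left.close()
    right.close()
```

On the server side, the same `recv_message` and `send_message` pair would be called on the socket returned by `accept()`, once to read the request and once to return the list of misspelt words.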
PRODUCER
while(1)
    read line from file
    if no data left
        set data_finished & break;
    if queue full
        pthread_cond_wait
    else
        put item in input queue
CONSUMER THREAD
while(1)
    if data_finished
        break;
    if input queue empty
        pthread_cond_wait
    else
        take item from queue
        process item
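In Python, the producer and consumer outlines could be approximated with `threading.Condition` in place of `pthread_cond_wait`. The sketch below is one possible arrangement under assumptions: `BoundedLineQueue` and `run_workers` are hypothetical names, and, unlike the outline, consumers here stop only once the data is finished and the queue has been drained, so that no queued lines are left unprocessed.

```python
import threading
from collections import deque


class BoundedLineQueue:
    """Fixed-capacity input queue shared by one producer and many consumers."""

    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self._condition = threading.Condition()
        self._finished = False

    def put(self, item):
        with self._condition:
            # Producer waits while the queue is full.
            while len(self._items) >= self._capacity:
                self._condition.wait()
            self._items.append(item)
            self._condition.notify_all()

    def finish(self):
        # Equivalent of setting data_finished: wake every waiting consumer.
        with self._condition:
            self._finished = True
            self._condition.notify_all()

    def get(self):
        """Return the next item, or None once finished and empty."""
        with self._condition:
            # Consumer waits while the queue is empty and more data may come.
            while not self._items and not self._finished:
                self._condition.wait()
            if not self._items:
                return None
            item = self._items.popleft()
            self._condition.notify_all()
            return item


def run_workers(lines, worker_count, process_item, capacity=8):
    """Feed numbered lines to worker threads and wait for them to finish."""
    queue = BoundedLineQueue(capacity)

    def consume():
        while True:
            item = queue.get()
            if item is None:
                break
            process_item(item)

    workers = [threading.Thread(target=consume) for _ in range(worker_count)]
    for worker in workers:
        worker.start()
    for line_number, line in enumerate(lines, start=1):
        queue.put((line_number, line))
    queue.finish()
    for worker in workers:
        worker.join()
```

The waits sit inside `while` loops so that a woken thread rechecks its condition before proceeding. The standard library's `queue.Queue` offers similar blocking `put` and `get` behaviour if writing the condition logic by hand is not required.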