
Data Mining And Text Analysis - IT Assignment Help

Added on: 2022-08-20 00:00:00
Question Task Id: 402202
  • Country: United Kingdom

Assignment Task
 
 


Department of Computer Science
Assignment Brief - Data Mining And Text Analysis
Module Learning Outcomes

  • Analyse different data mining and text processing tasks and the algorithms most appropriate for addressing them.
  • Critically evaluate and select the appropriate open-source or commercial data mining and text processing toolkits and implement the algorithms.
  • Critically evaluate the algorithms with respect to the accuracy of their results.
  • Develop and communicate a data mining and text processing solution to a real-world problem.
  • Identify and discuss the challenging research issues in data mining and text processing.

Assessment Background/Scenario
You have been approached by a credit card company that wants to explore credit card fraud patterns and profile potential targets. The company has identified four problem areas for which it would like potential solutions identified and tested, using data mining and/or text analysis techniques (such as data cleaning and preparation, tracking patterns, classification, association, outlier detection, clustering, regression, prediction and decision trees). Using only the given data sets (the training and testing data sets are provided separately), the company would like you to explore and present possible solutions for only two of the following:

  • Top profiles: Being able to profile potential targets effectively may help improve fraud prevention in the future. Examine the data and identify three distinct profiles (differing sets of personal attributes; there may be some overlap) that are linked to high levels of fraudulent actions. You will need to define and clearly state what you have identified as ‘high level’ as part of your assumptions for this problem.
  • Location: Determine whether a ‘transaction’s location’ is a good predictor of the likelihood of fraud and clearly demonstrate this against 2-3 of the other attributes in the data set. Make sure you clearly state which other attributes you selected as comparators and justify why you have selected them for the role.
  • Recommender: Consider how you could use the data to recommend to credit card users safe places to perform their transactions on a daily/weekly basis. As part of this problem, also consider how this information could be best communicated to credit card users in a visual way and put forward or demonstrate one option. You may need to consider the data sparsity problem here and research possible solutions.
  • Transaction Search: The company is looking to provide an extra service/level of security for high-value targets. Ascertain if there is a strong relationship between the transaction amounts and the time of day of the transactions. Consider what other attributes (within the data set) could be included to help protect high-value targets.

The credit card company has provided you with a simulated sample set of data as two CSV files – one suitable for training and the other for testing. How you use these is up to you, and you may not need both depending on your approach. You should clearly state in your submission which of the data sets you have used at appropriate points.
These data samples are provided under a Creative Commons CC0 1.0 Universal (CC0 1.0) Public Domain Dedication license and are available from the Kaggle data repository.
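
For the Top profiles, Location and Recommender problems, a common first look is the fraud rate per attribute value. The sketch below is one possible starting point, not a prescribed method. The column names `state`, `category` and `is_fraud`, the function `fraud_rate_by` and its `prior_weight` parameter are all assumptions made for illustration, and the sample rows are made up. Setting `prior_weight` above zero pulls each group's rate toward the overall rate, which is one simple way to soften the data sparsity problem for locations with very few transactions.

```python
from collections import defaultdict


def fraud_rate_by(rows, attribute, label="is_fraud", prior_weight=0.0):
    """Return {attribute value: (transaction count, fraud rate)}.

    rows is an iterable of dicts (for example from csv.DictReader).
    prior_weight > 0 shrinks each group's rate toward the overall rate,
    so groups with few transactions are not over-interpreted.
    """
    totals = defaultdict(int)
    frauds = defaultdict(int)
    for row in rows:
        key = row[attribute]
        totals[key] += 1
        frauds[key] += int(row[label])  # accepts 0/1 or "0"/"1"

    if not totals:
        return {}

    overall = sum(frauds.values()) / sum(totals.values())
    return {
        key: (
            totals[key],
            (frauds[key] + prior_weight * overall) / (totals[key] + prior_weight),
        )
        for key in totals
    }


# Made-up rows for illustration only; the real column names may differ.
sample = [
    {"state": "A", "category": "online", "is_fraud": 1},
    {"state": "A", "category": "online", "is_fraud": 0},
    {"state": "A", "category": "grocery", "is_fraud": 0},
    {"state": "A", "category": "grocery", "is_fraud": 0},
    {"state": "B", "category": "online", "is_fraud": 1},
    {"state": "B", "category": "grocery", "is_fraud": 0},
]

raw = fraud_rate_by(sample, "state")                       # location
comparator = fraud_rate_by(sample, "category")             # a comparator attribute
smoothed = fraud_rate_by(sample, "state", prior_weight=2)  # shrunk toward overall rate
```

Running the same function over location and over each comparator attribute gives rates that can be compared side by side. With the real files, check the actual column names in the provided CSVs before adapting the sketch.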
Assessment Task/s
Given the scenario above, your task is to write a report detailing possible solutions to your two selected options. Your report should provide an initial executive summary and consist of three clear sections – one for each task. Your response in one section will not contribute to grades in another, so you should consider this assignment in the same way you would an examination. Further formatting details are given below.
Overview/summary of the report which should at least contain the following points:

  • Which options you have chosen to present solutions for
  • What was achieved/undertaken
  • What processes were applied
  • What the results demonstrated
  • What should be reconsidered in future.

Task 1: Discussion of techniques used in your two solutions 
Given the scenario above, design and discuss the potential solutions to the problems you have selected. You will need to write small programs and/or use tools to run simulations as supporting evidence on the given test data (using WEKA and/or Python). In this task you should make it clear which problem you are presenting a solution for.
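
As an illustration of the kind of small program meant here, the sketch below looks at the Transaction Search question of whether transaction amounts depend on the time of day. Because the hour of day is cyclic, it groups amounts by hour and reports the correlation ratio (eta squared: the share of the variance in amounts explained by the hour) rather than a linear correlation. The function name `amount_by_hour`, the timestamp format and the sample values are assumptions for illustration only, not part of the provided data.

```python
from collections import defaultdict
from datetime import datetime


def amount_by_hour(transactions, fmt="%Y-%m-%d %H:%M:%S"):
    """Return ({hour: mean amount}, eta_squared) for (timestamp, amount) pairs.

    eta_squared is between-hour variation divided by total variation:
    values near 0 mean the hour explains little, values near 1 a lot.
    """
    groups = defaultdict(list)
    for timestamp, amount in transactions:
        hour = datetime.strptime(timestamp, fmt).hour
        groups[hour].append(float(amount))

    values = [a for amounts in groups.values() for a in amounts]
    if not values:
        return {}, 0.0

    grand_mean = sum(values) / len(values)
    means = {hour: sum(a) / len(a) for hour, a in groups.items()}

    ss_total = sum((a - grand_mean) ** 2 for a in values)
    ss_between = sum(
        len(groups[hour]) * (means[hour] - grand_mean) ** 2 for hour in groups
    )
    eta_squared = ss_between / ss_total if ss_total else 0.0
    return means, eta_squared


# Made-up transactions for illustration only.
sample = [
    ("2020-01-01 02:15:00", 900.0),
    ("2020-01-02 02:40:00", 1100.0),
    ("2020-01-01 14:05:00", 40.0),
    ("2020-01-03 14:30:00", 60.0),
]
hourly_means, eta_squared = amount_by_hour(sample)
```

What counts as a ‘strong’ relationship is itself an assumption that should be stated and justified in the report.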
Your report should clearly cover the following:

  • Any assumptions you are making about the scenario or selected problem
  • Any pre-processing you would undertake to make the data fit for purpose
  • Which data mining/text analysis techniques you have employed in your solutions.
  • Justification for the selection of those techniques, given the nature of the data and the requirements of the problem you are attempting to solve.
  • An evaluation of the techniques you have applied in terms of the accuracy of their results. You will need to clearly define and state the measures/methods by which you are evaluating the techniques. It is perfectly acceptable for your techniques to have been unsuccessful. Whether successful or not, it should be clear how your evaluation has informed your conclusion.
  • All code examples and results (output) should be presented in the appendices as screenshots, not as typed (handwritten) code. All supporting evidence in the appendices must be referred to and discussed in the body of the report. You will need to present evidence of your prototype programs and tests to pass this assessment. Your discussion should be supported by reference to relevant literature in this section (using IEEE referencing format).
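
When defining the measures used for evaluation, note that if fraud cases are rare in the data, plain accuracy can look high even when no fraud is detected. The sketch below computes accuracy alongside precision, recall and F1 from a confusion matrix. It is a minimal illustration under stated assumptions: labels are coded 1 for fraud and 0 otherwise, the function name `binary_metrics` is hypothetical, and the label lists are made up.

```python
def binary_metrics(y_true, y_pred):
    """Return confusion counts and common metrics for 0/1 labels (1 = fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": accuracy, "precision": precision,
        "recall": recall, "f1": f1,
    }


# Made-up labels: 8 legitimate transactions followed by 2 fraudulent ones.
y_true = [0] * 8 + [1] * 2

# A model that never flags fraud still reaches 80% accuracy here.
never_flags = binary_metrics(y_true, [0] * 10)

# A model that flags one legitimate and one fraudulent transaction.
some_flags = binary_metrics(y_true, [0] * 7 + [1] + [1, 0])
```

Both made-up models score the same accuracy, but only the second detects any fraud, which is why the chosen measures should be stated and justified.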

 

Task 2: Evaluation of the tools/languages

Given the languages/tools you have selected and used, provide a critical evaluation of their effectiveness in the context of the given scenario. You should clearly make comparisons to other options available and draw on the specific requirements of the scenario when presenting your argument. Your discussion should be supported by reference to relevant literature in this section. Consider the question: If you undertook this assignment a second time, would you use the same languages/tools and why?

Task 3: Discussion of the current literature (Learning objective 5 - 30%)
(Suggested word count for this section: 1,000 words)
Given the scenario above and the nature of the problems you have selected, research and identify the main areas of investigation the research community is currently tackling. Consider the following questions:

  • What are the current ‘problem’ areas?
  • What solutions have been put forward and how are they being evaluated?
  • Given your experience, would you consider these potentially successful solutions?
  • Justify why you consider them successful or not.

Present a discussion around these questions and consider how current research could potentially change or improve your solutions to the given scenario. To attain a pass, your discussion must be supported by reference to relevant literature in this section.

 


    

  • Uploaded By : Katthy Wills
  • Posted on : February 28th, 2020