Assignment 8
Using Dictionaries for Text Analysis
Submit before 11:30 PM Saturday June 1
Overview
You will modify the text analysis code so that it does the following:
- Processes the words for better analysis
- Provides recommendations based on two literary works that you select
- Uses the proper title when referring to the literary works
Text Processing
Modify the script so that it does at least two of the following for better comparisons:
- Strip punctuation from each word
- Remove words with little meaning (e.g. "the", "a")
- Make all words lower case
- Employ a stemmer
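As one possible illustration, three of the options above (stripping punctuation, lower-casing, and removing low-meaning words) could be combined in a single helper. This is a minimal sketch, not part of the provided script: the function name clean_words and the short STOP_WORDS set are hypothetical, and a real stop-word list would be longer.

```python
import string

# Hypothetical stop-word list for illustration; a real one would be longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in"}

def clean_words(text):
    """Return the words in text, lower-cased, with surrounding
    punctuation stripped and stop words removed."""
    cleaned = []
    for raw in text.split():
        # strip() only removes punctuation at the ends of the word,
        # so internal apostrophes and hyphens are kept.
        word = raw.strip(string.punctuation).lower()
        if word and word not in STOP_WORDS:
            cleaned.append(word)
    return cleaned
```

Lower-casing before the stop-word check means "The" and "the" are treated as the same word, which is why the order of the steps matters.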
Recommendation
Write a function called recommend. It should have one parameter: the name of a file without the extension (e.g. 'dracula'). It should print a message recommending the literary work most similar to the work named in the parameter. Note: you may optionally add a second and third parameter that allow you to pass in the docs dictionary and the word set returned by the analyze function. If you don't, you will need to call analyze inside your recommend function to obtain the docs dictionary and word set needed for the recommendation (see the analyze_demo function inside the text_compare folder).
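The general shape of such a function might look like the sketch below. It rests on several assumptions: that docs maps each file name to a dictionary (or other collection) of that work's words, that docs is passed in rather than obtained by calling analyze, and that similarity is measured by Jaccard overlap of the distinct words. The provided text_compare code may use a different data layout and a different similarity measure; the helpers jaccard and most_similar are hypothetical names.

```python
def jaccard(words_a, words_b):
    """Fraction of distinct words shared by the two collections (0.0 to 1.0)."""
    a, b = set(words_a), set(words_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def most_similar(name, docs):
    """Return the key in docs, other than name, whose words overlap most
    with those of name. Assumes docs holds at least two works."""
    others = [other for other in docs if other != name]
    return max(others, key=lambda other: jaccard(docs[name], docs[other]))

def recommend(name, docs):
    """Print a recommendation for the work most similar to name.
    Assumption: docs is supplied by the caller."""
    print(f"If you liked {name}, try {most_similar(name, docs)}.")
```

Keeping the comparison in most_similar and the printing in recommend makes the comparison easy to check on a small hand-built dictionary before running it on full texts.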
Report by Titles
Modify your recommend function so that it reports the titles of the works rather than their file names. To do this, write a function that reads the titles.txt file and creates a dictionary that maps each file name to its title. Use this dictionary to report the works by title instead of by file name.
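A lookup dictionary of this kind could be built roughly as follows. The layout of titles.txt is not described here, so this sketch assumes one work per line in the form file name, comma, title (for example "dracula,Dracula"); check the actual file and adjust the separator if it differs. The names parse_titles and load_titles are hypothetical.

```python
def parse_titles(lines, sep=","):
    """Build a {file_name: title} dictionary from lines of text.
    Assumption: each line looks like 'file_name<sep>Title'."""
    titles = {}
    for line in lines:
        line = line.strip()
        if not line or sep not in line:
            continue  # skip blank or malformed lines
        # Split on the first separator only, so titles may contain commas.
        name, title = line.split(sep, 1)
        titles[name.strip()] = title.strip()
    return titles

def load_titles(path="titles.txt"):
    """Read the titles file at path and return the lookup dictionary."""
    with open(path, encoding="utf-8") as f:
        return parse_titles(f)
```

When reporting, titles.get(name, name) falls back to the file name if a work has no entry in the dictionary.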
Deliverables
Create a text or pdf file called assn8 that contains the following:
- A statement that describes your experience. Indicate how your work addresses the requirements. Include any information on how you got help or collaborated with someone.
- Listings of your code.
- Output from running your program, showing that it processes the words correctly and produces the expected recommendation.
Submit your assn8 file (text or pdf) to D2L.