CSC 221: Introduction to Programming
Fall 2011

HW5: Files, Lists and Readability

You may work with one (and only one) partner on this assignment.
Your team will submit a single solution, with both names listed, and both team members will share the same grade.

A variety of measures have been developed to characterize the readability of text. Usually, these measures describe readability in terms of grade level, e.g., "this sentence is at a seventh-grade reading level." For this assignment, you will write Python functions that calculate the readability grade level of a text file using two different measures: the Flesch-Kincaid (F-K) grade level and the Gunning Fog (G-F) grade level.

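For example, suppose a text file contained 10 sentences with a total of 50 words. Those 50 words contained a total of 100 syllables, with 10 of the words having three or more syllables in them. That works out to 5 words per sentence, 2 syllables per word, and 20% of the words having three or more syllables. Then,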

F-K grade level = 0.39*5 + 11.8*2 - 15.59 = 9.96
G-F grade level = 0.4*(5 + 20) = 10.0
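The arithmetic in this example can be sketched in Python as follows. The general formulas are not spelled out in the text here, so this sketch assumes the standard published forms of the two measures, which reproduce the numbers above. The function names fleschKincaid and gunningFog and their parameter names are hypothetical, chosen only for illustration, and the code assumes Python 3 (where / is always floating-point division).

```python
def fleschKincaid(numSentences, numWords, numSyllables):
    """Flesch-Kincaid grade level from raw counts (standard form, assumed)."""
    wordsPerSentence = numWords / numSentences
    syllablesPerWord = numSyllables / numWords
    return 0.39 * wordsPerSentence + 11.8 * syllablesPerWord - 15.59


def gunningFog(numSentences, numWords, numComplexWords):
    """Gunning Fog grade level from raw counts (standard form, assumed).

    numComplexWords is the number of words with three or more syllables.
    """
    wordsPerSentence = numWords / numSentences
    percentComplex = 100 * numComplexWords / numWords
    return 0.4 * (wordsPerSentence + percentComplex)


# The worked example: 10 sentences, 50 words, 100 syllables, 10 complex words.
print(round(fleschKincaid(10, 50, 100), 2))
print(round(gunningFog(10, 50, 10), 2))
```

Keeping the formulas in small functions that take counts as inputs makes them easy to check against hand calculations before any file processing is written.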

PART 1: Helper Functions (40%)

Due to the complexity of the English language, identifying the ends of sentences and the number of syllables in a word can be tricky. To make these tasks manageable, we will make the following simplifications:

Define a function named isEndOfSentence that has a single word as input. The function should return True if the word ends in a period, exclamation point, or question mark (ignoring trailing quotation marks). For example, isEndOfSentence("What?") should return True, while isEndOfSentence("So,") should return False. Hint: to ignore trailing quotation marks, use the string rstrip method. For example, the following assignment will strip trailing quotation marks off of a word and save the resulting string in stripped:

stripped = word.rstrip("\"\'")
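One possible way to combine the hint with the sentence-ending test is sketched below. It is a minimal sketch, not the only acceptable approach. It relies on the fact that the string endswith method accepts a tuple of suffixes, and it assumes that only straight single and double quotation marks need to be stripped.

```python
def isEndOfSentence(word):
    """Return True if word ends in '.', '!' or '?', ignoring trailing quotes."""
    # Drop any trailing double or single quotation marks first.
    stripped = word.rstrip("\"'")
    # endswith accepts a tuple and returns True if any suffix matches.
    return stripped.endswith((".", "!", "?"))


print(isEndOfSentence("What?"))
print(isEndOfSentence("So,"))
print(isEndOfSentence('done."'))
```

Note that rstrip treats its argument as a set of characters to remove, so any mix of trailing single and double quotes is stripped in one call.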

Define a function named countSyllables that has a single word as input. The function should return the number of consecutive vowel sequences in the word. For example, countSyllables("people") should return 2, while countSyllables("Italian") should return 3.
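Counting consecutive vowel sequences amounts to counting the places where a vowel follows a non-vowel (or starts the word). The sketch below shows one way to do that. It assumes the vowels are a, e, i, o, and u only (so 'y' is not counted) and that upper and lower case are treated alike, which is what the "Italian" example suggests; the text does not state either point explicitly.

```python
VOWELS = "aeiou"  # assumption: 'y' is not treated as a vowel


def countSyllables(word):
    """Return the number of runs of consecutive vowels in word."""
    count = 0
    inVowelRun = False
    for ch in word.lower():
        if ch in VOWELS:
            # Only the first vowel of a run starts a new syllable.
            if not inVowelRun:
                count += 1
            inVowelRun = True
        else:
            inVowelRun = False
    return count


print(countSyllables("people"))   # runs: "eo", "e"
print(countSyllables("Italian"))  # runs: "I", "a", "ia"
```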

Be sure to test your functions thoroughly before moving on to the next part.

PART 2: Calculating Grade Levels (60%)

Define a function named gradeLevel that processes a text file, which is selected by the user using a dialog box, and displays its readability grade level using each of the formulas listed above. Since the file may be large, your function should read its contents one line at a time, breaking each line into individual words (using the string split method). It should collect statistics on the words in the file (using the helper functions written in Part 1) and use those statistics to calculate the Flesch-Kincaid grade level and Gunning Fog grade level. In addition to printing the two grade levels, the function should also display the name of the file, and the total number of syllables, words, and sentences in the file. For example:

/Users/davereed/Documents/Classes/CSC221/melville.txt
Number of syllables = 22959
Number of words = 14343
Number of sentences = 817
Flesch-Kincaid grade level = 10.1451112561
Gunning Fog grade level = 13.2692263534

One special case to watch for when processing a text file is "words" that contain no syllables. These include numbers (e.g., "2011") and punctuation sequences (e.g., "--"). For this assignment, any "word" that contains no syllables (i.e., no vowels) should not contribute to the word count for the file. For example, the sentence "The year is 2011." would be considered to have only 3 words in it.
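The statistics-gathering loop, including the no-syllable special case, might look something like the sketch below. It is only an illustration under several assumptions: the name collectStats is hypothetical, the two helpers are compact versions written just to keep the example self-contained, a word with three or more syllables is what counts toward the Gunning Fog measure (as in the earlier worked example), and a token such as "2011." still ends a sentence even though it is not counted as a word. Selecting the file with a dialog box and printing the results are left out.

```python
def isEndOfSentence(word):
    # Compact helper: ignore trailing quotes, then test the last character.
    return word.rstrip("\"'").endswith((".", "!", "?"))


def countSyllables(word):
    # Compact helper: count runs of consecutive vowels (a, e, i, o, u).
    count = 0
    previousWasVowel = False
    for ch in word.lower():
        isVowel = ch in "aeiou"
        if isVowel and not previousWasVowel:
            count += 1
        previousWasVowel = isVowel
    return count


def collectStats(lines):
    """Return (sentences, words, syllables, complexWords) for an iterable of lines."""
    numSentences = numWords = numSyllables = numComplexWords = 0
    for line in lines:               # one line at a time
        for word in line.split():    # split on whitespace
            # Assumption: sentence ends are counted before the syllable check,
            # so "2011." ends a sentence without counting as a word.
            if isEndOfSentence(word):
                numSentences += 1
            syllables = countSyllables(word)
            if syllables == 0:
                continue             # numbers, "--", etc. are not words
            numWords += 1
            numSyllables += syllables
            if syllables >= 3:
                numComplexWords += 1
    return numSentences, numWords, numSyllables, numComplexWords


print(collectStats(["The year is 2011.\n"]))
```

Because an open file object is itself an iterable of lines, passing it directly to a function like this reads the file one line at a time rather than loading it all at once.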