### CSC 221: Introduction to Programming Fall 2011 HW5: Files, Lists and Readability

You may work with one (and only one) partner on this assignment.
Your team will submit a single solution, with both names listed, and both team members will share the same grade.

A variety of measures have been developed to characterize the readability of text. Usually, these measures describe readability in terms of grade level, e.g., "This sentence is at a seventh-grade reading level." For this assignment, you will write Python functions that calculate the readability grade level of a file using two different measures.

• The Flesch-Kincaid Grade Level Formula estimates grade level using the average number of words per sentence and the average number of syllables per word:
F-K grade level = 0.39*avgWordsPerSentence + 11.8*avgSyllablesPerWord - 15.59
• The Gunning Fog Formula estimates grade level using the average number of words per sentence and the percentage of complex words (i.e., words with three or more syllables):
G-F grade level = 0.4*(avgWordsPerSentence + percentageComplexWords)

For example, suppose a text file contained 10 sentences, consisting of 50 words. Those 50 words contained a total of 100 syllables, with 10 of the words having three or more syllables in them. Then,

F-K grade level = 0.39*5 + 11.8*2 - 15.59 = 9.96

G-F grade level = 0.4*(5 + 20) = 10.0
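The arithmetic in this example can be checked with a short sketch. The function names `fleschKincaid` and `gunningFog` and their parameters are illustrative, not part of the assignment, and the code assumes Python 3, where `/` always performs real-number division.

```python
def fleschKincaid(numWords, numSentences, numSyllables):
    """Flesch-Kincaid grade level computed from raw counts."""
    avgWordsPerSentence = numWords / numSentences
    avgSyllablesPerWord = numSyllables / numWords
    return 0.39 * avgWordsPerSentence + 11.8 * avgSyllablesPerWord - 15.59


def gunningFog(numWords, numSentences, numComplexWords):
    """Gunning Fog grade level computed from raw counts."""
    avgWordsPerSentence = numWords / numSentences
    # Expressed as a percentage (0-100), not a fraction (0-1).
    percentageComplexWords = 100 * numComplexWords / numWords
    return 0.4 * (avgWordsPerSentence + percentageComplexWords)


# The worked example: 10 sentences, 50 words, 100 syllables, 10 complex words.
fk = fleschKincaid(50, 10, 100)
gf = gunningFog(50, 10, 10)
print(round(fk, 2), round(gf, 2))
```

Note that the percentage of complex words is 20 here, not 0.2; using the fraction instead would give a much lower Gunning Fog level.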

### PART 1: Helper Functions (40%)

Due to the complexity of the English language, identifying the ends of sentences and the number of syllables in a word can be tricky. To make these tasks manageable, we will make the following simplifications:

• We will assume that any word that ends in a period, exclamation point, or question mark (ignoring trailing quotation marks) is the end of a sentence. For example, the following paragraph contains three sentences: What? He told me to "Go away." So, I left as soon as I could.
• We will assume that any sequence of consecutive vowels (including 'y') corresponds to a syllable. For example, `"heavy"` has two syllables while `"Italian"` has three syllables.

Define a function named `isEndOfSentence` that has a single word as input. The function should return `True` if the word ends in a period, exclamation point, or question mark (ignoring trailing quotation marks). For example, `isEndOfSentence("What?")` should return `True`, while `isEndOfSentence("So,")` should return `False`. Hint: to ignore trailing quotation marks, use the string `rstrip` method. For example, the following assignment will strip trailing quotation marks off of a `word` and save the resulting string in `stripped`:

stripped = word.rstrip("\"\'")
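One possible way to build on this hint is sketched below. It is only one approach, written for Python 3; the use of `endswith` with a tuple of endings is a choice made here, not something the assignment requires.

```python
def isEndOfSentence(word):
    """Return True if word ends a sentence under the assignment's simplified rule."""
    # rstrip removes every trailing character that appears in its argument,
    # so any mix of trailing double and single quotation marks is dropped.
    stripped = word.rstrip("\"'")
    # endswith accepts a tuple of endings and simply returns False when
    # stripped is empty, so no separate length check is needed.
    return stripped.endswith((".", "!", "?"))


print(isEndOfSentence("What?"))    # True
print(isEndOfSentence('away."'))   # True
print(isEndOfSentence("So,"))      # False
```

Indexing the last character directly (`stripped[-1]`) would also work for ordinary words, but it raises an error if the word consists only of quotation marks.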

Define a function named `countSyllables` that has a single word as input. The function should return the number of consecutive vowel sequences in the word. For example, `countSyllables("people")` should return `2`, while `countSyllables("Italian")` should return `3`.
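A minimal sketch of one way to count vowel sequences follows. It assumes that uppercase vowels count the same as lowercase ones (so the `I` in `"Italian"` starts a syllable); the constant `VOWELS` and the flag `inVowelRun` are names chosen for illustration.

```python
VOWELS = "aeiouy"


def countSyllables(word):
    """Count the runs of consecutive vowels (including 'y') in word."""
    count = 0
    inVowelRun = False
    for ch in word.lower():
        if ch in VOWELS:
            # Count only the first vowel of each run.
            if not inVowelRun:
                count += 1
            inVowelRun = True
        else:
            inVowelRun = False
    return count


print(countSyllables("people"))   # 2  ("eo", "e")
print(countSyllables("Italian"))  # 3  ("I", "a", "ia")
```

Under this rule a word with no vowels at all, such as `"2011"`, produces a count of `0`.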

Be sure to test your functions thoroughly before moving on to the next part.

### PART 2: Calculating Grade Levels (60%)

Define a function named `gradeLevel` that processes a text file, which is selected by the user using a dialog box, and displays its readability grade level using each of the formulas listed above. Since the file may be large, your function should read its contents one line at a time, breaking each line into individual words (using the string `split` method). It should collect statistics on the words in the file (using the helper functions written in Part 1) and use those statistics to calculate the Flesch-Kincaid grade level and Gunning Fog grade level. In addition to printing the two grade levels, the function should also display the name of the file, and the total number of syllables, words, and sentences in the file. For example:
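The line-at-a-time pattern described above might look roughly like the following sketch. It is deliberately incomplete: it tallies only words and sentence ends, leaves out syllable statistics and the file-selection dialog, and uses an in-memory `io.StringIO` object as a stand-in for an open file. The name `tallyLines` is hypothetical.

```python
import io


def tallyLines(lines):
    """Return (numWords, numSentences) for an iterable of text lines."""
    numWords = 0
    numSentences = 0
    # Iterating over a file object yields one line at a time, so the
    # whole file never has to be held in memory at once.
    for line in lines:
        # split() with no argument breaks on any run of whitespace.
        for word in line.split():
            numWords += 1
            if word.rstrip("\"'").endswith((".", "!", "?")):
                numSentences += 1
    return numWords, numSentences


# Stand-in for an open text file (an assumption for this demo).
sample = io.StringIO('What? He told me to "Go away."\nSo, I left as soon as I could.\n')
result = tallyLines(sample)
print(result)  # (15, 3)
```

Because `tallyLines` accepts any iterable of lines, the same loop works unchanged on a file opened with `open(filename)`.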

    /Users/davereed/Documents/Classes/CSC221/melville.txt
    Number of syllables = 22959
    Number of words = 14343
    Number of sentences = 817
    Flesch-Kincaid grade level = 10.1451112561
    Gunning Fog grade level = 13.2692263534

One special case you will need to watch out for when processing a text file is "words" that contain no syllables. These include numbers (e.g., "2011") and punctuation sequences (e.g., "--"). For this assignment, any "word" that contains no syllables (i.e., no vowels) should not contribute to the word count for the file. For example, the sentence `"The year is 2011."` would be considered to have only 3 words in it.
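To illustrate the rule, the sketch below filters out vowel-free "words" using a regular expression. This is an alternative technique chosen for brevity, not the approach the assignment asks for (your own solution should rely on `countSyllables`), and the names `vowelGroups` and `countableWords` are hypothetical.

```python
import re


def vowelGroups(word):
    """Return the list of consecutive-vowel runs (including 'y') in word."""
    return re.findall(r"[aeiouy]+", word.lower())


def countableWords(line):
    """Return the words on a line that contain at least one vowel run."""
    return [w for w in line.split() if vowelGroups(w)]


words = countableWords("The year is 2011.")
print(len(words))  # 3
```

In this example `"2011."` is not counted as a word, yet it still ends in a period. One reasonable reading, offered here as an assumption, is that the end-of-sentence check should be applied to every token, including those excluded from the word count.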