CSC 321: Data Structures
Spring 2024

HW6: TreeSets & Text Processing


For this assignment, you will implement a simple class that stores a set of words using a TreeSet. Your WordSet class should have the following methods:

public boolean add(String word)
This method should attempt to add a clean version of the word to the stored set of words. The word should be cleaned by making it lowercase and removing any characters that are not letters or digits from the ends of the word. For example, "--Abc-123!" should be cleaned to produce "abc-123" (the interior hyphen is kept because only the ends are trimmed). If the cleaned word has no remaining characters or it is already stored in the set, the method should return false; otherwise, it should add the word and return true.
public int size()
This method should return the number of words stored.
public String getLongest()
This method should return the longest word stored. If there is more than one word with the same maximum length, any such word may be returned.
public String toString()
This method should return a String containing all the stored words, with 5 words per line. The words should be in alphabetical order and left-aligned in columns whose width is one character more than the length of the longest stored word.
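
The method descriptions above can be sketched as follows. This is one possible outline, not a required design: the helper name clean is an assumption (the assignment does not name one), and returning an empty string from getLongest when no words are stored is also an assumption, since the assignment leaves that case unspecified.

```java
import java.util.TreeSet;

// Sketch of a WordSet backed by a TreeSet, which keeps its words in sorted order.
class WordSet {
    private final TreeSet<String> words = new TreeSet<>();

    // Hypothetical helper (not named in the assignment): lowercases the word,
    // then trims characters that are not letters or digits from both ends.
    static String clean(String word) {
        String lower = word.toLowerCase();
        int start = 0;
        int end = lower.length();
        while (start < end && !Character.isLetterOrDigit(lower.charAt(start))) {
            start++;
        }
        while (end > start && !Character.isLetterOrDigit(lower.charAt(end - 1))) {
            end--;
        }
        return lower.substring(start, end);
    }

    public boolean add(String word) {
        String cleaned = clean(word);
        if (cleaned.isEmpty()) {
            return false;
        }
        // TreeSet.add returns false when the element is already present.
        return words.add(cleaned);
    }

    public int size() {
        return words.size();
    }

    // Assumption: returns "" when the set is empty.
    public String getLongest() {
        String longest = "";
        for (String w : words) {
            if (w.length() > longest.length()) {
                longest = w;
            }
        }
        return longest;
    }

    @Override
    public String toString() {
        int width = getLongest().length() + 1;
        StringBuilder table = new StringBuilder();
        int count = 0;
        for (String w : words) {
            // "%-Ns" left-justifies the word in a field of N characters.
            table.append(String.format("%-" + width + "s", w));
            count++;
            if (count % 5 == 0) {
                table.append("\n");
            }
        }
        return table.toString();
    }
}
```

Because the TreeSet iterates in sorted order, toString needs no separate sorting step; all words are lowercase after cleaning, so the natural String ordering matches alphabetical order.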

In addition, you should implement a WordSetDriver class that prompts the user for the name of a text file, then reads words from that file and stores them in a WordSet. After processing the file, the driver should display the number of unique words in the file and the table of words. For example, processing the file lincoln.txt should produce the following output:

Enter the file name: lincoln.txt
lincoln.txt contains 125 unique words.

a           above       add         ago         all         
and         any         are         as          battlefield 
be          before      birth       brave       brought     
but         by          can         cannot      cause       
civil       come        conceived   consecrate  consecrated 
continent   created     dead        dedicate    dedicated   
detract     devotion    did         died        do          
earth       endure      engaged     equal       far         
fathers     final       for         forget      forth       
fourscore   freedom     from        full        gave        
government  great       ground      hallow      have        
here        highly      honored     in          increased   
is          it          larger      last        liberty     
little      live        living      long        may         
measure     men         met         might       nation      
never       new         nor         not         note        
now         of          on          or          our         
people      perish      place       poor        portion     
power       proposition propriety   rather      remaining   
remember    resolve     resting     say         sense       
seven       shall       so          struggled   take        
task        testing     that        the         these       
they        this        those       to          us          
vain        war         we          what        whether     
which       who         will        world       years
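
The driver's prompt-and-read steps might look roughly like the sketch below. So that the block compiles on its own, a plain TreeSet<String> stands in for the WordSet; this is an assumption made only for illustration, and the actual driver would store words in a WordSet and print its toString() table. The method name readWords is likewise hypothetical.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.TreeSet;

class WordSetDriver {
    // Hypothetical helper: reads whitespace-separated tokens from the scanner.
    // The TreeSet parameter is a stand-in; the assignment's driver would
    // pass each token to WordSet.add, which also cleans the word.
    static void readWords(Scanner input, TreeSet<String> words) {
        while (input.hasNext()) {
            words.add(input.next());
        }
    }

    public static void main(String[] args) throws FileNotFoundException {
        Scanner keyboard = new Scanner(System.in);
        System.out.print("Enter the file name: ");
        String fileName = keyboard.nextLine().trim();

        TreeSet<String> words = new TreeSet<>(); // stand-in for a WordSet
        // try-with-resources closes the file scanner when reading is done.
        try (Scanner fileInput = new Scanner(new File(fileName))) {
            readWords(fileInput, words);
        }

        System.out.println(fileName + " contains " + words.size() + " unique words.");
        System.out.println();
        // With a WordSet here, this line would print the 5-column table.
        System.out.println(words);
    }
}
```

With the stand-in set, the count will not match the sample output, since tokens such as "nation," and "nation" are not cleaned and therefore count as different words.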