CSC 221: Introduction to Programming
Fall 2018

HW4: Python Strings and Encryption


The use of codes (or ciphers) as a means of hiding the meaning of messages traces its roots to ancient history. The first known military use of codes was by Julius Caesar in 50 - 60 B.C. The Caesar cipher specified that each letter in the alphabet would be encoded using the letter three positions later in the alphabet. For example, 'A' would be encoded as 'D', 'B' would be encoded as 'E', 'C' would be encoded as 'F', and so on. The code wraps around at the end of the alphabet, so 'X', 'Y' and 'Z' would be encoded as 'A', 'B', and 'C', respectively. The complete mapping of letters is shown below.

Caesar cipher: ABCDEFGHIJKLMNOPQRSTUVWXYZ (alphabet) | | | | | | | | | | | | | V V V V V V V V V V V V V DEFGHIJKLMNOPQRSTUVWXYZABC (cipher key)

The Caesar cipher is an example of a substitution cipher, a code in which one letter of the alphabet is substituted for another. They key "DEFGHIJKLMNOPQRSTUVWXYZABC" defines the letter mapping used by the Caesar cipher. Using a different key, a completely different substitution cipher is defined. For example:

Qwerty cipher: ABCDEFGHIJKLMNOPQRSTUVWXYZ (alphabet) | | | | | | | | | | | | | V V V V V V V V V V V V V QWERTYUIOPASDFGHJKLZXCVBNM (cipher key)

Since there are 26! (or roughly 4 x 1026) different arrangements of the 26 letters in the alphabet, there are 26! different keys and so 26! different substitution ciphers that can be defined.


PART 1: Encoding with a Cipher (50%)

You are to define a function named encode that encodes a word or phrase using a substitution cipher. Your function should have two inputs, they key (a string containing the 26 letters in some order) and the word or phrase to be encoded (a string). It should return the string encoded using the substitution cipher with that key.

Your function should be able to transform both lowercase and uppercase letters, preserving the case. Any non-letter characters should be left unchanged when encoding. The capitalization of letters in the key should also be irrelevant. For example, the call

    encode("DEFGHIJKLMNOPQRSTUVWXYZABC", "ABCXYZ")
should return the encoded string "DEFABC", whereas the call
    encode("qwertyuiopasdfghjklzxcvbnm", "Et tu, Brute?")
should return the encoded string "Tz zx, Wkxzt?".

Hint: Your function should be similar to the caesar and rotate functions from class, in that it must traverse the word letter-by-letter, building up an encoded copy of that word. The steps for encoding each letter are different, however. To encode a letter using a substitution cipher, you must find the position of that letter in the alphabet (using the find method) and then get the letter in that same position in the key (using indexing).

Be sure to test your function thoroughly before moving on to the next part.


PART 2: Encoding & Decoding (30%)

Now suppose we want to write code for decoding messages that have been encoded using a substitution cipher. We could define a separate decode function for this task, but that function would be almost identical to your encode function. The only difference would be that the roles of the alphabet and the cipher key are reversed - to decode a letter in a message, you find its index in the key and replace it with the corresponding letter in the alphabet.

Instead of defining such a redundant function, let us instead rename your encode function with the more generic name cipher and add another input to the function. If the first input to the cipher function is the word "encode", then it performs the same encoding task as your encode function. If the first input is "decode", however, it performs the decoding operation. For example, the call

    cipher("encode", "qwertyuiopasdfghjklzxcvbnm", "Et tu, Brute?")
should return the encoded string "Tz zx, Wkxzt?", whereas the call
    cipher("decode", "qwertyuiopasdfghjklzxcvbnm", "Tz zx, Wkxzt?")
should return the decoded string "Et tu, Brute?". If the first input is anything other than "encode" or "decode", the function should simply return "UNKNOWN OPERATION".


PART 3: Rotating Ciphers (10%)

While substitution ciphers are reasonably effective at encoding/decoding messages, they are susceptible to various attacks that make them unworthy for serious encryption. Not all letters are equally likely in text, so the relative frequency of characters in a coded message can provide clues for decoding. For example, 'e' is the most frequently used letter in English text. If the letter 'w' appeared most frequently in a coded message, then one might guess that the substitution cipher maps 'e' to 'w'.

One approach to counter this type of analysis is to add rotation to the cipher. After each character is encoded, the key is rotated so that the first character is moved to the end. For example, using the cipher key "defghijklmnopqrstuvwxyzabc", the letter 'a' would be encoded as 'd'. But, after this encoding the key would be rotated to be "efghijklmnopqrstuvwxyzabcd". Thus, if the next letter to be encoded was another 'a', it would get encoded as an 'e' this time. Note: the rotation should take place even if the character was not a letter (and so remained unchanged).

Modify your cipher function so that it performs a rotation after each character is encoded/decoded. Once you are done, test your modified cipher function by decoding the following secret message, using the key "qwertyuiopasdfghjklzxcvbnm":

    Ehhpcymwzprzqqu, hufs fskmzphp oapvdge peuqcc!


PART 4: Encoding/Decoding Files (10%)

Finally, define the function cipher_file that encodes or decodes entire files of text. Your function should have four inputs: the type of operation (either "encode" or "decode"), the cipher key, the name of the file to be encoded/decoded, and the name of the file where the encoded/decoded text is to be saved. It should call your cipher function to perform the encoding/decoding. For example, the call

    cipher_file("encode", "qwertyuiopasdfghjklzxcvbnm", "speech.txt", "secret.txt")
would read in the contents of the file "speech.txt", encode it using the specified cipher key, and then save the encoded text in the file "secret.txt". Subsequently, the call
    cipher_file("decode", "qwertyuiopasdfghjklzxcvbnm", "secret.txt", "recover.txt")
would decode the contents of "secret.txt" and save the decode text in "recover.txt". As a result, the contents of "speech.txt" and "recover.txt" should be identical.