CSC 221: Introduction to Programming
Fall 2011

HW4: Python Strings and Encryption


The use of codes (or ciphers) as a means of hiding the meaning of messages traces its roots to ancient history. The first known military use of codes was by Julius Caesar in 50 - 60 B.C. The Caesar cipher specified that each letter in the alphabet would be encoded using the letter three positions later in the alphabet. For example, 'a' would be encoded as 'd', 'b' would be encoded as 'e', 'c' would be encoded as 'f', and so on. The code wraps around at the end of the alphabet, so 'x', 'y' and 'z' would be encoded as 'a', 'b', and 'c', respectively. The complete mapping of letters is shown below.

Caesar cipher: abcdefghijklmnopqrstuvwxyz | | | | | | | | | | | | | V V V V V V V V V V V V V defghijklmnopqrstuvwxyzabc

As we discussed in class, the following Python function (along with the assignment to ALPHABET) encodes a word using the Caesar cipher:

PART 1: Substitution Ciphers (50%)

Although the Caesar cipher was effective in its time (when very few people could read at all), its simple pattern of encoding letters seems pretty obvious today. However, it can be generalized to create more effective ciphers. The Caesar cipher is an example of a substitution cipher, a code in which one letter of the alphabet is substituted for another. They key "defghijklmnopqrstuvwxyzabc" defines the letter mapping used by the Caesar cipher. Using a different key, a completely different substitution cipher is defined. For example:

Qwerty cipher: abcdefghijklmnopqrstuvwxyz | | | | | | | | | | | | | V V V V V V V V V V V V V qwertyuiopasdfghjklzxcvbnm

Since there are 26! (or roughly 4 x 1026) different arrangements of the 26 letters in the alphabet, there are 26! different keys and so 26! different substitution ciphers that can be defined.

You are to define two functions, encode and decode, that encode and decode (respectively) a word using a substitution cipher. Each function should have two inputs, they key (a string containing the 26 letters in some order) and the word to be encoded/decoded (a string of letters). Each should return the string encoded/decoded using a the substitution cipher with that key. For example, the call

    encode("qwertyuiopasdfghjklzxcvbnm", "abcxyz")
should return the encoded string "qwebnm", while conversely
    decode("qwertyuiopasdfghjklzxcvbnm", "qwebnm")
should return the decoded string "abcxyz".

Hint: Your functions should both be similar to the caesar function, in that they must traverse the word letter-by-letter, building up an encoded/decoded copy of that word. The steps for encoding/decoding each letter are different, however. To encode a letter using a substitution cipher, you must find the position of that letter in the alphabet (using the find method) and then get the letter in that same position in the key (using indexing). Conversely, to decode a letter, you must find the position of that letter in the key and then get the letter in that same position in the alphabet.

Be sure to test your functions thoroughly before moving on to the next part.

PART 2: Encoding Entire Messages (30%)

As they are descibed in PART 1, your encode and decode functions are good for encoding/decoding words only. If you tried to encode/decode a message containing spaces, other non-letters, or even capital letters, an error would occur. Modify your encode and decode functions so that

For example, the call
    encode("qwertyuiopasdfghjklzxcvbnm", "Abc, xyz!")
should return the encoded string "Qwe, bnm!".

PART 3: Rotating Ciphers (20%)

While substitution ciphers are reasonably effective at encoding/decoding messages, they are susceptible to various attacks that make them unworthy for serious encryption. Not all letters are equally likely in text, so the relative frequency of characters in a coded message can provide clues for decoding. For example, 'e' is the most frequently used letter in English text. If the letter 'w' appeared most frequently in a coded message, then one might guess that the substitution cipher maps 'e' to 'w'.

One approach to counter this type of analysis is to add rotation to the cipher. After each letter is encoded, the key is rotated so that the first character is moved to the end. For example, using the cipher key "defghijklmnopqrstuvwxyzabc", the letter 'a' would be encoded as 'd'. But, after this encoding the key would be rotated to be "efghijklmnopqrstuvwxyzabcd". Thus, if the next letter to be encoded was another 'a', it would get encoded as an 'e' this time.

Modify your encode and decode functions so that they perform a rotation after each letter is encoded/decoded. Once you are done, test your decode function by decoding the following secret message, using the key "qwertyuiopasdfghjklzxcvbnm":

    Ehhpcymwzprzqqu, ftsp vnbaqe txgtvdzz ibngsry ir dypr!

Note: this work must be entirely your own, with no outside assistance other than the instructor.