
Determining Letter Frequency Of Cipher Text

I am trying to make a tool that finds the frequencies of letters in some type of cipher text. Let's suppose it is all lowercase a-z, with no numbers. The encoded message is in a txt file.

Solution 1:

import collections

d = collections.defaultdict(int)
for c in 'test':
    d[c] += 1

print(d)  # defaultdict(<class 'int'>, {'t': 2, 'e': 1, 's': 1})

From a file:

with open('test.txt') as myfile:  # the file is closed automatically
    for line in myfile:
        line = line.rstrip('\n')  # do not count the newline
        for c in line:
            d[c] += 1
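Since the question says the message is all lowercase a-z, a variant of the loop above can skip everything else (newlines, spaces, stray digits). This is only a minimal Python 3 sketch: the helper name `count_letters_in_file` is hypothetical and not part of the original answer.

```python
import collections
from string import ascii_lowercase


def count_letters_in_file(path):
    """Count occurrences of a-z in a text file (hypothetical helper)."""
    counts = collections.defaultdict(int)
    with open(path) as f:
        for line in f:
            for c in line:
                # Ignore newlines, spaces, digits and anything outside a-z.
                if c in ascii_lowercase:
                    counts[c] += 1
    return counts
```

Calling `count_letters_in_file('test.txt')` would then return the same kind of mapping as `d` above, restricted to letters.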

For the genius that is the defaultdict container, we must give thanks and praise. Otherwise we'd all be doing something silly like this:

s = "andnowforsomethingcompletelydifferent"
d = {}
for letter in s:
    if letter not in d:
        d[letter] = 1
    else:
        d[letter] += 1

Solution 2:

The modern way:

from collections import Counter

string = "ihavesometextbutidontmindsharing"
Counter(string)
#>>> Counter({'i': 4, 't': 4, 'e': 3, 'n': 3, 's': 2, 'h': 2, 'm': 2, 'o': 2, 'a': 2, 'd': 2, 'x': 1, 'r': 1, 'u': 1, 'b': 1, 'v': 1, 'g': 1})
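For cipher text the ranking of letters is often what matters most. `Counter.most_common()` returns `(letter, count)` pairs ordered from most to least frequent. A short Python 3 sketch reusing the string above (the variable names `counts` and `top_three` are only illustrative):

```python
from collections import Counter

string = "ihavesometextbutidontmindsharing"
counts = Counter(string)

# The three most frequent letters, as (letter, count) pairs.
top_three = counts.most_common(3)
print(top_three)

# Full ranking, one letter per line.
for letter, count in counts.most_common():
    print(letter, count)
```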

Solution 3:

If you want to know the relative frequency of a letter c, you have to divide the number of occurrences of c by the length of the input.

For instance, taking Adam's example:

s = "andnowforsomethingcompletelydifferent"
n = len(s)  # n = 37

and storing the absolute frequency of each letter in

counts[letter]

we obtain the relative frequencies by:

from string import ascii_lowercase  # this is "a...z"

for c in ascii_lowercase:
    print(c, counts[c] / n)

Putting it all together, we get something like this:

import collections
from string import ascii_lowercase  # this is "a...z"

# get input
s = "andnowforsomethingcompletelydifferent"
n = len(s)  # n = 37

# get absolute frequencies of letters
counts = collections.defaultdict(int)
for c in s:
    counts[c] += 1

# print relative frequencies
for c in ascii_lowercase:
    print(c, counts[c] / n)
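The same calculation can also be wrapped in a function that returns the relative frequencies instead of printing them, so the result can be sorted or compared later. The following is a Python 3 sketch under assumptions: the name `relative_frequencies` is hypothetical, `Counter` is swapped in for `defaultdict`, and characters outside a-z are ignored when computing the total.

```python
from collections import Counter
from string import ascii_lowercase


def relative_frequencies(text):
    """Map each letter a-z to its share of the letters in text (hypothetical helper)."""
    counts = Counter(c for c in text if c in ascii_lowercase)
    total = sum(counts.values())
    if total == 0:
        # No letters at all: avoid dividing by zero.
        return {c: 0.0 for c in ascii_lowercase}
    return {c: counts[c] / total for c in ascii_lowercase}


freqs = relative_frequencies("andnowforsomethingcompletelydifferent")

# Print letters from most to least frequent.
for letter, share in sorted(freqs.items(), key=lambda item: item[1], reverse=True):
    print(letter, round(share, 3))
```

Returning a plain dictionary with an entry for every letter keeps letters that never occur visible as 0.0, which can itself be a useful clue when looking at cipher text.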
