
Python Counter From Txt File

I would like to init a collections.Counter object from a text file of word frequency counts. That is, I have a file 'counts.txt': rank wordform abs r mod 1

Solution 1:

Here are two versions. The first takes your counts.txt as a regular text file. The second treats it as a csv file (which is what it kind of looks like).

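from collections import Counter

with open('counts.txt') as f:
    # Split each line on whitespace into its columns
    lines = [line.strip().split() for line in f]
    # Skip the header row; column 1 is the wordform, column 2 the absolute count
    wordCounts = Counter({line[1]: int(line[2]) for line in lines[1:]})
    print(wordCounts.most_common(3))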
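If you want to try the plain-text approach without having counts.txt on disk, here is a minimal sketch that reads the same layout from an in-memory string. The helper name `counter_from_lines` and the sample rows are my own assumptions for illustration, not part of the original file.

```python
import io
from collections import Counter

# Illustrative stand-in for counts.txt; these rows are assumed sample data.
SAMPLE = """\
rank wordform abs r mod
1 the 225300 29 223066.9
2 and 157486 29 156214.4
3 to 134478 29 134044.8
"""

def counter_from_lines(lines):
    """Build a Counter mapping wordform -> abs from whitespace-delimited lines."""
    rows = (line.split() for line in lines)
    next(rows, None)  # skip the header row
    # Blank lines split to an empty list and are filtered out
    return Counter({fields[1]: int(fields[2]) for fields in rows if fields})

word_counts = counter_from_lines(io.StringIO(SAMPLE))
print(word_counts.most_common(2))  # [('the', 225300), ('and', 157486)]
```

Because the helper only needs an iterable of lines, an open file object can be passed in place of the `io.StringIO` wrapper.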

If your data file somehow turned out to be delimited by a consistent character or string, you could use a csv.DictReader object to parse the file.

Shown below is how it could be done IF your file were TAB delimited.

The data file (as edited by me to be TAB delimited)

rank	wordform	abs	r	mod
1	the	225300	29	223066.9
2	and	157486	29	156214.4
3	to	134478	29	134044.8
999	fallen	345	29	326.6
1000	supper	368	27	325.8

The code

from csv import DictReader
from collections import Counter

with open('counts.txt') as f:
    # DictReader takes the column names from the header row
    reader = DictReader(f, delimiter='\t')
    wordCounts = Counter({row['wordform']: int(row['abs']) for row in reader})
    print(wordCounts.most_common(3))
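The DictReader version can be checked the same way against an in-memory string. This is only a sketch: the `TSV_SAMPLE` rows are assumed sample data with literal tab separators, and `tsv_counts` is a name I made up.

```python
import io
from csv import DictReader
from collections import Counter

# Assumed tab-delimited sample standing in for counts.txt.
TSV_SAMPLE = (
    "rank\twordform\tabs\tr\tmod\n"
    "999\tfallen\t345\t29\t326.6\n"
    "1000\tsupper\t368\t27\t325.8\n"
)

# The header row supplies the keys, so columns are looked up by name
reader = DictReader(io.StringIO(TSV_SAMPLE), delimiter="\t")
tsv_counts = Counter({row["wordform"]: int(row["abs"]) for row in reader})
print(tsv_counts["supper"])  # 368
```

Looking columns up by name means the code keeps working if the column order changes, as long as the header row still names `wordform` and `abs`.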

Solution 2:

import collections

words = dict()

with open('counts.txt') as fp:
    next(fp)  # skip the header row, which has no integer count
    for line in fp:
        items = line.split()
        # items[1] is the wordform, items[2] its absolute count
        words[items[1]] = int(items[2])

wordCounts = collections.Counter(words)
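One thing to be aware of: building a plain dict first means a wordform that appears on two lines keeps only the last count. If your file might contain repeats, a possible variation is to add into the Counter directly. This is a sketch under that assumption; `accumulate_counts` and the sample lines are hypothetical.

```python
from collections import Counter

def accumulate_counts(lines):
    """Sum the abs column per wordform, skipping the header and malformed rows."""
    counts = Counter()
    for line_number, line in enumerate(lines):
        fields = line.split()
        if line_number == 0 or len(fields) < 3:
            continue  # header row, blank line, or too few columns
        try:
            counts[fields[1]] += int(fields[2])
        except ValueError:
            continue  # the abs column was not an integer
    return counts

# Assumed sample with a repeated wordform to show the accumulation.
sample_lines = [
    "rank wordform abs r mod",
    "1 the 10 1 10.0",
    "2 and 7 1 7.0",
    "3 the 5 1 5.0",
]
merged = accumulate_counts(sample_lines)
print(merged["the"])  # 15
```

Missing keys in a Counter default to 0, which is why `+=` works without checking whether the word has been seen before.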
