Python - File Parsing

Write a program which reads a text file called input.txt which contains an arbitrary number of lines of the form ', ' then records this information using a dictionary, and

Solution 1:

I don't understand what you are trying to do, so I can't really explain how to fix it. In particular, why are you execing the lines of the file? And why write exec "foo" instead of just foo? I think you should go back to a basic Python tutorial...

Anyway, what you need to do is:

  • open the file using its full path
  • for line in file: process the line and store it in a dictionary
  • return the dictionary

That's it, no exec involved.
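
The steps above can be sketched roughly as follows. This is only an illustration: the function name `count_countries` is made up here, and since the question's exact line format is cut off, the code assumes each line looks like `city, country` and that the goal is a count per country.

```python
def count_countries(path):
    """Return a dict mapping each country to the number of lines naming it."""
    counts = {}
    with open(path) as fh:                 # step 1: open the file
        for line in fh:                    # step 2: process each line
            parts = line.split(',')
            if len(parts) != 2:            # skip blank or malformed lines
                continue
            country = parts[1].strip()     # assumed layout: "city, country"
            counts[country] = counts.get(country, 0) + 1
    return counts                          # step 3: return the dictionary
```

Calling `count_countries("input.txt")` would then hand back the dictionary, and printing it is left to the caller.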

Solution 2:

Yup, that's a whole lot of crap you either don't need or shouldn't do. Here's how I'd do it prior to Python 2.7 (after that, use collections.Counter as shown in the other answers). Mind you, this'll return the dictionary containing the counts rather than print it, so you'd have to do the printing externally. I'd also prefer not to give a complete solution for homework, but it's already been done, so I suppose there's no real damage in explaining a bit about it.

def parseFile(filename):
  with open(filename, 'r') as fh:
    lines = fh.readlines()
    d={}
    for country in [line.split(',')[1].strip() for line in lines]:
      d[country] = d.get(country,0) + 1
  return d

Let's break that down a bit, shall we?

with open(filename, 'r') as fh:
    lines = fh.readlines()

This is how you'd normally open a text file for reading. It will raise an IOError exception if the file doesn't exist, you don't have permissions, or the like, so you'll want to catch that. readlines() reads the entire file and splits it into lines; each line becomes an element in a list, with its trailing newline still attached.
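
Catching that exception could look something like the sketch below. The helper name `read_lines` and the choice to fall back to an empty list are assumptions made for illustration, not part of the original answer; you might prefer to let the error propagate instead.

```python
def read_lines(filename):
    """Return the file's lines, or an empty list if it cannot be opened."""
    try:
        with open(filename, 'r') as fh:
            return fh.readlines()
    except IOError as err:  # on Python 3, IOError is an alias of OSError
        print("Could not read {0}: {1}".format(filename, err))
        return []
```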

d={}

This simply initializes an empty dictionary.

    for country in [line.split(',')[1].strip() for line in lines]:

Here is where the fun starts. The bracket-enclosed part to the right is called a list comprehension, and it basically generates a list for you. What it pretty much says, in plain English, is "for each element 'line' in the list 'lines', take that element/line, split it on each comma, take the second element (index 1) of the list you get from the split, strip off any whitespace from it, and use the result as an element in the new list". Then the left part just iterates over the generated list, giving the name 'country' to the current element within the loop body.
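
If the comprehension is hard to read, it may help to see it next to an explicit loop that does the same thing. The sample `lines` list is made up for the demonstration and assumes the `city, country` layout.

```python
# Made-up sample data, shaped like what readlines() would return
lines = ["Paris, France\n", "Lyon, France\n", "Berlin, Germany\n"]

# The list comprehension from the answer
countries = [line.split(',')[1].strip() for line in lines]

# The same thing written as an explicit loop
countries_loop = []
for line in lines:
    fields = line.split(',')       # e.g. ['Paris', ' France\n']
    country = fields[1].strip()    # e.g. 'France'
    countries_loop.append(country)

print(countries)       # ['France', 'France', 'Germany']
print(countries_loop)  # ['France', 'France', 'Germany']
```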

      d[country] = d.get(country,0) + 1

Ok, ponder for a second what would happen if instead of the above line, we'd used the following:

      d[country] = d[country] + 1

It'd crash, right? You'd get a KeyError exception, because d[country] doesn't have a value the first time around. So we use the get() method, which all dictionaries have. Here's the nifty part: get() takes an optional second argument, which is what we want to get back if the key we're looking for doesn't exist. So instead of crashing, it returns 0, which (unlike None, the default) we can add 1 to, and we update the dictionary with the new count. Then we just return the lot of it.
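
Here's a small standalone demonstration of the difference, using a made-up key rather than real file data:

```python
d = {}

try:
    d["France"] = d["France"] + 1      # the key does not exist yet
except KeyError:
    print("KeyError: 'France' is not in the dictionary yet")

d["France"] = d.get("France", 0) + 1   # missing key: get() returns 0, count becomes 1
d["France"] = d.get("France", 0) + 1   # existing key: get() returns 1, count becomes 2

print(d)               # {'France': 2}
print(d.get("Spain"))  # None, because no default was given
```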

Hope it helps.

Solution 3:

I would use a defaultdict plus a list to maintain the structure of the information, so that additional statistics can be derived.

import collections

def parse_cities(filepath):
    countries_cities_map = collections.defaultdict(list)
    with open(filepath) as fd:
        for line in fd:
            # strip each field so "city, country" does not leave a leading space
            values = [value.strip() for value in line.split(',')]
            if len(values) == 2:
                city, country = values
                countries_cities_map[country].append(city)
    return countries_cities_map

def format_cities_per_country(countries_cities_map):
    # items() and print() with one argument work on both Python 2 and Python 3
    for country, cities in countries_cities_map.items():
        print(" {ncities} Cities found in {country} country".format(country=country, ncities=len(cities)))


if __name__ == '__main__':
    import sys
    filepath = sys.argv[1]
    format_cities_per_country(parse_cities(filepath))
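
As a rough idea of the "additional statistics" this structure allows, here is a sketch working on a hand-built map of the same shape. The sample data and the variable names `counts`, `largest` and `all_cities` are invented for illustration.

```python
import collections

# Made-up sample in the same shape as the country -> list-of-cities map
countries_cities_map = collections.defaultdict(list)
for city, country in [("Paris", "France"), ("Lyon", "France"), ("Berlin", "Germany")]:
    countries_cities_map[country].append(city)

# Number of cities per country
counts = {country: len(cities) for country, cities in countries_cities_map.items()}

# Country with the most cities
largest = max(counts, key=counts.get)

# Every city, sorted alphabetically, which a plain counter could not give you
all_cities = sorted(city for cities in countries_cities_map.values() for city in cities)

print(counts)      # {'France': 2, 'Germany': 1}
print(largest)     # France
print(all_cities)  # ['Berlin', 'Lyon', 'Paris']
```

Because the city names are kept rather than just counted, anything you can compute from the lists is still available later.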

Solution 4:

import collections

def readFile(fname):
    with open(fname) as inf:
        return [tuple(s.strip() for s in line.split(",")) for line in inf]

def countCountries(city_list):
    return collections.Counter(country for city,country in city_list)

def main():
    cities = readFile("input.txt")
    countries = countCountries(cities)

    print("{0} cities found in {1} countries:".format(len(cities), len(countries)))

    # items() works on both Python 2 and Python 3
    for country, num in countries.items():
        print("{country}: {num}".format(country=country, num=num))

if __name__=="__main__":
    main()
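
For what it's worth, a Counter also gives you a couple of conveniences on top of a plain dictionary. The sketch below uses a made-up list of (city, country) tuples in place of the file:

```python
import collections

# Made-up (city, country) tuples, shaped like a list parsed from the file
cities = [
    ("Paris", "France"), ("Lyon", "France"), ("Berlin", "Germany"),
    ("Rome", "Italy"), ("Milan", "Italy"), ("Turin", "Italy"),
]

countries = collections.Counter(country for city, country in cities)

top_two = countries.most_common(2)  # the two countries with the highest counts
missing = countries["Spain"]        # a Counter returns 0 for keys it has never seen

print(top_two)  # [('Italy', 3), ('France', 2)]
print(missing)  # 0
```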
