How To Chunk A CSV (Dict)Reader Object In Python 3.2?

August 19, 2022

I am trying to use Pool from the multiprocessing module to speed up reading large CSV files. For this, I adapted an example (from py2k), but it seems like the csv.DictReader object ha

Solution 1:

According to the csv.DictReader documentation (and that of csv.reader, which it wraps), a DictReader object is an iterator. Iterators have no length, so the code should have raised a TypeError when you called len().
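A minimal sketch of that behaviour, assuming a small in-memory sample (the column names and values here are made up for illustration and are not from the original question):

```python
import csv
import io

# Hypothetical two-row sample; the column names are illustrative only.
sample = io.StringIO("cell,seq_ei\nA1,10\nB2,20\n")
reader = csv.DictReader(sample)

try:
    len(reader)
    length_error = None
except TypeError as exc:
    # DictReader is an iterator and defines no __len__.
    length_error = exc

# Materializing the rows gives a list, which does have a length.
rows = list(reader)
row_count = len(rows)
```

After list() has consumed the reader, the reader itself is exhausted, so any further work has to use rows.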

You can still chunk the data, but to take its length or slice it you'll first have to read it entirely into memory, for example with list(). If you're concerned about memory, you can switch from csv.DictReader to csv.reader and skip the overhead of the dictionaries that csv.DictReader creates for every row. To keep csv2nodes() readable, you can assign constants for each field's index:

CELL = 0
SEQ_EI = 1
DAT_DEB_OCCUPATION = 4
DAT_FIN_OCCUPATION = 5
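As a rough sketch of how these constants might be used with csv.reader: the sample data below is invented, and columns 2 and 3 are placeholders, since the original file layout is not shown.

```python
import csv
import io

# Field positions, as in the answer above.
CELL = 0
SEQ_EI = 1
DAT_DEB_OCCUPATION = 4
DAT_FIN_OCCUPATION = 5

# Hypothetical data; columns 2 and 3 are placeholders.
sample = io.StringIO(
    "cell,seq_ei,x,y,dat_deb,dat_fin\n"
    "A1,10,-,-,2012-01-01,2012-01-31\n"
    "B2,20,-,-,2012-02-01,2012-02-29\n"
)

reader = csv.reader(sample)
header = next(reader)  # csv.reader does not treat the header row specially

# Each row is a plain list, indexed by the named constants.
occupations = [
    (row[CELL], row[SEQ_EI], row[DAT_DEB_OCCUPATION], row[DAT_FIN_OCCUPATION])
    for row in reader
]
```

Each row is a list of strings rather than a dictionary, which is where the memory saving comes from.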

I also recommend using a different variable name than id, since it shadows the built-in id() function.
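Putting the chunking advice together, here is one possible sketch. The helper names chunked, count_rows and process_in_parallel are hypothetical, count_rows merely stands in for the real per-chunk work, and the sample data is invented. The pool is closed explicitly rather than used in a with statement, on the assumption that the code has to run on Python 3.2, where Pool is not a context manager.

```python
import csv
import io
from multiprocessing import Pool


def chunked(rows, chunk_size):
    """Split a list of rows into consecutive lists of at most chunk_size rows."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def count_rows(chunk):
    """Hypothetical worker; stands in for the real per-chunk processing."""
    return len(chunk)


def process_in_parallel(rows, chunk_size=2, workers=2):
    """Hand each chunk to a worker process and collect the results in order."""
    pool = Pool(workers)
    try:
        return pool.map(count_rows, chunked(rows, chunk_size))
    finally:
        pool.close()
        pool.join()


# Hypothetical five-row sample.
sample = io.StringIO("cell,seq_ei\nA1,10\nB2,20\nC3,30\nD4,40\nE5,50\n")
rows = list(csv.DictReader(sample))  # read everything into memory first
chunks = chunked(rows, 2)
```

When running this as a script, call process_in_parallel(rows) from under an if __name__ == "__main__": guard so that worker processes do not re-execute the module-level code on platforms that spawn rather than fork.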
