
How Can I Merge Fields In A Csv String Using Python?

I am trying to merge three fields in each line of a CSV file using Python. This would be simple, except some of the fields are surrounded by double quotes and include commas. Here

Solution 1:

Something like this?

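import csv
# Python 2: the csv module expects files opened in binary mode
source = csv.reader(open("some file", "rb"))
dest = csv.writer(open("another file", "wb"))
for row in source:
    # concatenate fields 6, 7 and 8 into a single field
    result = row[:6] + [row[6] + row[7] + row[8]] + row[9:]
    dest.writerow(result)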

Example

>>> data=''',,Joe,Smith,New Haven,CT,"Moved from Portland, CT",,goo,
... '''.splitlines()
>>> rdr= csv.reader( data )
>>> row= rdr.next()
>>> row
['', '', 'Joe', 'Smith', 'New Haven', 'CT', 'Moved from Portland, CT', '', 'goo', '']
>>> row[:6] + [ row[6]+row[7]+row[8] ] +  row[9:]
['', '', 'Joe', 'Smith', 'New Haven', 'CT', 'Moved from Portland, CTgoo', '']
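
The session above is Python 2 (note rdr.next()). Below is a rough Python 3 sketch of the same idea for a CSV line held in a string. The helper name merge_fields_in_line and its start/stop parameters are made up for illustration, and it assumes the same column positions (6 to 8) as the example.

```python
import csv
import io

def merge_fields_in_line(line, start=6, stop=9):
    """Parse one CSV line, concatenate fields start..stop-1, return a CSV line."""
    # csv.reader accepts any iterable of lines, so a one-item list works
    row = next(csv.reader([line]))
    merged = row[:start] + ["".join(row[start:stop])] + row[stop:]
    out = io.StringIO()
    csv.writer(out).writerow(merged)      # the writer re-quotes fields that need it
    return out.getvalue().rstrip("\r\n")  # drop the writer's line terminator

line = ',,Joe,Smith,New Haven,CT,"Moved from Portland, CT",,goo,'
print(merge_fields_in_line(line))
```

Writing the row back through csv.writer, rather than joining with ",", keeps the embedded comma safely quoted.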

Solution 2:

You can use the csv module to do the heavy lifting: http://docs.python.org/library/csv.html

You didn't say exactly how you wanted to merge the columns; presumably you don't want your merged field to be "Moved from Portland, CTgoo". The code below allows you to specify a separator string (maybe ", ") and handles empty/blank fields.

[transcript of session]
prompt>type merge.py
import csv

def merge_csv_cols(infile, outfile, startcol, numcols, sep=", "):
    reader = csv.reader(open(infile, "rb"))
    writer = csv.writer(open(outfile, "wb"))
    endcol = startcol + numcols
    for row in reader:
        merged = sep.join(x for x in row[startcol:endcol] if x.strip())
        row[startcol:endcol] = [merged]
        writer.writerow(row)

if __name__ == "__main__":
    import sys
    args = sys.argv[1:6]
    args[2:4] = map(int, args[2:4])
    merge_csv_cols(*args)

prompt>type input.csv
1,2,3,4,5,6,7,8,9,a,b,c
1,2,3,4,5,6,,,,a,b,c
1,2,3,4,5,6,7,8,,a,b,c
1,2,3,4,5,6,7,,9,a,b,c

prompt>\python26\python merge.py input.csv output.csv 6 3 ", "

prompt>type output.csv
1,2,3,4,5,6,"7, 8, 9",a,b,c
1,2,3,4,5,6,,a,b,c
1,2,3,4,5,6,"7, 8",a,b,c
1,2,3,4,5,6,"7, 9",a,b,c
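
For readers on Python 3, here is one possible adaptation of merge.py, not the answerer's original. It assumes the function is handed open file-like objects instead of file names, which is a change of signature made for this sketch. The in-memory StringIO objects stand in for input.csv and output.csv.

```python
import csv
import io

def merge_csv_cols(src, dst, startcol, numcols, sep=", "):
    """Merge numcols columns starting at startcol, skipping blank fields."""
    endcol = startcol + numcols
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        # keep only the fields that contain something other than whitespace
        merged = sep.join(x for x in row[startcol:endcol] if x.strip())
        row[startcol:endcol] = [merged]
        writer.writerow(row)

# With real files, open them in text mode with newline="" (Python 3)
# instead of the "rb"/"wb" modes used under Python 2.
src = io.StringIO("1,2,3,4,5,6,7,8,9,a,b,c\n1,2,3,4,5,6,,,,a,b,c\n")
dst = io.StringIO()
merge_csv_cols(src, dst, 6, 3)
print(dst.getvalue())
```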

Solution 3:

Python's standard library includes a module for parsing CSV files:

http://docs.python.org/library/csv.html
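
A small illustration of why the csv module is worth using here instead of splitting on commas by hand. The sample line is borrowed from Solution 1, and the code is written for Python 3.

```python
import csv

line = ',,Joe,Smith,New Haven,CT,"Moved from Portland, CT",,goo,'

naive = line.split(",")            # cuts the quoted field in two at its comma
parsed = next(csv.reader([line]))  # treats the quoted field as one value

print(len(naive), len(parsed))
print(parsed[6])
```

The naive split yields one more piece than there are fields, because the comma inside the quotes is treated as a delimiter.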

Solution 4:

You have tagged this question as 'database'. In that case, it might be easier to load the two files into separate tables of a database (you can use SQLite or any Python SQL library, such as SQLAlchemy) and then join them.

That would also give you an advantage later: you could query the tables with SQL syntax, and the data would be stored on disk instead of being kept in memory, so think about it. :)
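
A minimal sketch of the database route using the sqlite3 module from the standard library (Python 3). The two CSV inputs, the table names, and the id column used for the join are all invented for illustration, since the post does not show the actual files.

```python
import csv
import io
import sqlite3

# Hypothetical data standing in for the two files
people_csv = "id,first,last\n1,Joe,Smith\n"
notes_csv = 'id,note\n1,"Moved from Portland, CT"\n'

conn = sqlite3.connect(":memory:")  # pass a file name to keep the data on disk
conn.execute("CREATE TABLE people (id INTEGER, first TEXT, last TEXT)")
conn.execute("CREATE TABLE notes (id INTEGER, note TEXT)")

def load(table, text, ncols):
    """Insert the rows of a CSV string into table, skipping the header."""
    rows = csv.reader(io.StringIO(text))
    next(rows)  # skip the header line
    placeholders = ",".join("?" * ncols)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)

load("people", people_csv, 3)
load("notes", notes_csv, 2)

# Merge fields with SQL string concatenation and join on the shared id
result = conn.execute(
    "SELECT p.first || ' ' || p.last, n.note "
    "FROM people p JOIN notes n ON p.id = n.id"
).fetchall()
print(result)
```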
