To Sum Column With Condition
I have data in a text file. Example:
A B C D E F
10 0 0.9775 39.3304 0.9311 60.5601
10 1 0.9802 32.3287 0.9433 56.1201
10 2 0.9816 39.9759 0.9446 54.0428
10 3 0.9737 37.8779 0.9419 5
Solution 1:
>>> L = map(str.split, """10 0 0.9775 39.3304 0.9311 60.5601
... 10 1 0.9802 32.3287 0.9433 56.1201
... 10 2 0.9816 39.9759 0.9446 54.0428
... 10 3 0.9737 37.8779 0.9419 56.3865
... 10 4 0.9798 34.9152 0.905 69.0879
... 10 5 0.9803 50.057 0.9201 64.6289
... 10 6 0.9805 39.1062 0.9093 68.4061
... 10 7 0.9781 33.8874 0.9327 60.7631
... 10 8 0.9802 32.5734 0.9376 60.9165
... 10 9 0.9798 32.3466 0.94 54.7645
... 11 0 0.9749 40.2712 0.9042 71.2873
... 11 1 0.9755 35.6546 0.9195 63.7436
... 11 2 0.9766 36.753 0.9507 51.7864
... 11 3 0.9779 35.6485 0.9371 59.2483
... 11 4 0.9803 35.2712 0.8833 79.0257
... 11 5 0.981 46.5462 0.9156 66.6951
... 11 6 0.9809 41.8181 0.8642 83.7533
... 11 7 0.9749 36.7484 0.9259 62.36
... 11 8 0.9736 36.8859 0.9395 58.1538
... 11 9 0.98 32.4069 0.9255 61.202
... 12 0 0.9812 37.2547 0.9121 68.1347
... 12 1 0.9808 31.4568 0.9372 55.9992
... 12 2 0.9813 36.5316 0.9497 53.1687
... 12 3 0.9803 33.1063 0.9051 69.8894
... 12 4 0.9786 35.0318 0.8968 72.9963
... 12 5 0.9756 63.441 0.9091 69.9482
... 12 6 0.9804 39.1602 0.9156 65.2399
... 12 7 0.976 35.5875 0.9248 62.6284
... 12 8 0.9779 33.7774 0.9416 56.3755
... 12 9 0.9804 32.0849 0.9401 55.2871""".split("\n"))
>>> from collections import defaultdict
>>> D = defaultdict(float)
>>> for a,b,c,d,e,f in L:
...     D[a] += float(c)
...
>>> D
defaultdict(<type 'float'>, {'11': 9.7756, '10': 9.791699999999999, '12': 9.7925})
>>> dict(D.items())
{'11': 9.7756, '10': 9.791699999999999, '12': 9.7925}
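The same idea can be wrapped in a small reusable function. This is a sketch for Python 3 (the session above is Python 2, as the `<type 'float'>` repr shows). The name `sum_column_by_key` and its parameters are illustrative, not part of the original answer, and the code assumes whitespace-separated columns.

```python
from collections import defaultdict


def sum_column_by_key(lines, key_col=0, value_col=2, has_header=False):
    """Sum the values in value_col, grouped by the string in key_col.

    Hypothetical helper: assumes whitespace-separated columns.
    """
    totals = defaultdict(float)
    rows = iter(lines)
    if has_header:
        next(rows, None)  # discard the header line, if there is one
    for line in rows:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        totals[fields[key_col]] += float(fields[value_col])
    return dict(totals)


# A few rows from the question, including the header line.
sample = """A B C D E F
10 0 0.9775 39.3304 0.9311 60.5601
10 1 0.9802 32.3287 0.9433 56.1201
11 0 0.9749 40.2712 0.9042 71.2873""".splitlines()

result = sum_column_by_key(sample, has_header=True)
print(result)
```

Passing an open file object instead of `sample` should work the same way, since the function only iterates over lines.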
Solution 2:
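with open('data.txt') as f:
    next(f)  # skip the header line
    d = dict()
    for x in f:
        fields = x.split()  # split each line only once
        if not fields:
            continue  # ignore blank lines
        if fields[0] not in d:
            d[fields[0]] = float(fields[2])
        else:
            d[fields[0]] += float(fields[2])
print(d)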
output:
{'11': 9.7756, '10': 9.791699999999999, '12': 9.7925}
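The long tail in `9.791699999999999` is a binary floating-point artifact, not a data error. If exact decimal totals matter, one option (not part of the original answer) is to accumulate with `decimal.Decimal`. A minimal sketch on three of the sample rows, with the variable names chosen only for illustration:

```python
from decimal import Decimal

# Three rows from the question's sample data (all with A == 10).
rows = [
    "10 0 0.9775 39.3304 0.9311 60.5601",
    "10 1 0.9802 32.3287 0.9433 56.1201",
    "10 2 0.9816 39.9759 0.9446 54.0428",
]

exact = {}
for row in rows:
    fields = row.split()
    # Build the Decimal from the original text, not from a float,
    # so no binary rounding is introduced.
    exact[fields[0]] = exact.get(fields[0], Decimal("0")) + Decimal(fields[2])

print(exact)
```

The trade-off is speed: `Decimal` arithmetic is slower than `float`, so for large files it may be simpler to keep floats and round only when printing.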
Solution 3:
For fun
#!/usr/bin/env ksh
while <file; do
    ((a[$1]+=$3))
done
print -C a
output
([10]=9.7917 [11]=9.7756 [12]=9.7925)
Requires the undocumented FILESCAN compile-time option.
Solution 4:
If you want the sum grouped by A value:
awk 'NR > 1 {sums[$1] += $3} END {for (sum in sums) print sum, sums[sum]}' inputfile
Solution 5:
import csv
with open("file.txt") as f:
    # assumes the columns are separated by single spaces
    reader = csv.reader(f, delimiter=" ")
    # read header
    next(reader)
    # summarize
    a_values = []
    total = 0
    for row in reader:
        if row[0] not in a_values:
            a_values.append(row[0])
            total += float(row[2])
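For per-group totals with the `csv` module, one possible approach is `itertools.groupby`. This is a sketch rather than the original author's method, and it relies on two assumptions: the columns are separated by spaces, and rows with the same A value are adjacent, as they are in the sample data. If they were not, a later run of the same key would overwrite the earlier total.

```python
import csv
from itertools import groupby

# A few rows from the question, including the header line.
lines = [
    "A B C D E F",
    "10 0 0.9775 39.3304 0.9311 60.5601",
    "10 1 0.9802 32.3287 0.9433 56.1201",
    "11 0 0.9749 40.2712 0.9042 71.2873",
    "11 1 0.9755 35.6546 0.9195 63.7436",
]

# csv.reader accepts any iterable of strings; an open file works the same way.
reader = csv.reader(lines, delimiter=" ", skipinitialspace=True)
next(reader)  # skip the header row

group_totals = {}
for a_value, rows in groupby(reader, key=lambda row: row[0]):
    # rows is an iterator over the consecutive rows sharing this A value
    group_totals[a_value] = sum(float(row[2]) for row in rows)

print(group_totals)
```

For unsorted input, a dictionary accumulator keyed on column A avoids the adjacency requirement.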