Csv Find Max In Column And Append New Data
Solution 1:
If you have the data in a pandas DataFrame, you can get the max of any column like this:
>>> max(data['time'])
'2012-01-18 15:52:26'
where data is the variable name for the DataFrame and time is the name of the column.
Solution 2:
I'll give you two answers, one that just returns the max value, and one that returns the row from the CSV that includes the max value.
import csv
import operator as op
import requests

symbol = "mtgoxUSD"
url = 'http://api.bitcoincharts.com/v1/trades.csv?symbol={}'.format(symbol)
csv_file = "trades_{}.csv".format(symbol)
data = requests.get(url)
with open(csv_file, "w") as f:
    f.write(data.text)

# First answer: just the max value from column 0
with open(csv_file) as f:
    next(f)  # discard first row from file -- see notes
    max_value = max(row[0] for row in csv.reader(f))

# Second answer: the whole row that contains the max value
with open(csv_file) as f:
    next(f)  # discard first row from file -- see notes
    max_row = max(csv.reader(f), key=op.itemgetter(0))
Notes:
- max() can directly consume an iterator, and csv.reader() gives us an iterator, so we can just pass that in. I'm assuming you might need to throw away a header line, so I showed how to do that. If you had multiple header lines to discard, you might want to use islice() from the itertools module.
- In the first one, we use a "generator expression" to select a single value from each row, and find the max. This is very similar to a "list comprehension", but it doesn't build a whole list; it just lets us iterate over the resulting values. Then max() consumes the iterable and we get the max value.
- max() can use a key= argument where you specify a "key function". It will use the key function to get a value and use that value to figure the max... but the value returned by max() will be the unmodified original value (in this case, a row value from the CSV). In this case, the key function is manufactured for you by operator.itemgetter()... you pass in which column you want, and operator.itemgetter() builds a function for you that gets that column.
The resulting function is the equivalent of:
def get_col_0(row):
    return row[0]
max_row = max(csv.reader(f), key=get_col_0)
Or, people will use lambda for this:
max_row = max(csv.reader(f), key=lambda row: row[0])
But I think operator.itemgetter() is convenient and nice to read. And it's fast.
- I showed saving the data in a file, then pulling from the file again. If you want to go through the data without saving it anywhere, you just need to iterate over it by lines.
Perhaps something like:
text = data.text
rows = [line.split(',') for line in text.split("\n") if line]
rows.pop(0)  # get rid of first row from data
max_value = max(row[0] for row in rows)
max_row = max(rows, key=op.itemgetter(0))
- I don't know which column you want... column "A" might be column 0 so I used 0 in the above. Replace the column number as you like.
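The notes above can be pulled together in a minimal, offline sketch. The sample text, the two header lines, the unixtime,price,amount layout, and the helper name read_rows are all assumptions made up for illustration; they are not taken from the real bitcoincharts feed. The sketch also shows one thing to watch for: csv.reader() yields strings, so max() compares them as text unless the key function converts them.

```python
import csv
import io
from itertools import islice
from operator import itemgetter

# Hypothetical sample data: two header lines, then unixtime,price,amount rows.
SAMPLE = """\
# exported by a hypothetical tool
unixtime,price,amount
1326900000,6.50,10.0
1326900100,12.25,0.5
1326900200,9.75,3.0
"""

def read_rows(text, header_lines=0):
    """Parse CSV text and skip the first `header_lines` rows with islice()."""
    reader = csv.reader(io.StringIO(text))
    return list(islice(reader, header_lines, None))

rows = read_rows(SAMPLE, header_lines=2)

# Column 0: equal-width timestamps, so comparing the strings gives the latest one.
newest_row = max(rows, key=itemgetter(0))

# Column 1: prices of different widths, so convert to float inside the key function.
highest_price_row = max(rows, key=lambda row: float(row[1]))

# Comparing the price strings directly picks '9.75', because '9' sorts after '1'.
string_max_price = max(row[1] for row in rows)

print(newest_row)
print(highest_price_row)
print(string_max_price)
```

Whether string comparison is good enough depends on the column: fixed-width timestamps sort the same way as text and as numbers, while prices and amounts generally do not.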
Solution 3:
It seems like something like this should work:
import requests
import csv
symbol = "mtgoxUSD"
url = 'http://api.bitcoincharts.com/v1/trades.csv?symbol={}'.format(symbol)
data = requests.get(url)
with open("trades_{}.csv".format(symbol), "r+") as f:
    all_values = list(csv.reader(f))
    max_value = max([int(row[2]) for row in all_values[1:]])
    # (write-out-the-value?)
EDITS: I used "row[2]" because that was the sample column I was taking max of in my csv. Also, I had to strip off the column headers, which were all text, which was why I looked at "all_values[1:]" from the second row to the end of the file.
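None of the answers spells out the "append new data" half of the title, so here is one possible sketch of it. It assumes a single header line, an integer timestamp in column 0, and that "new" means strictly greater than the largest timestamp already in the file. The function name append_newer_rows and the demo data are hypothetical.

```python
import csv
import os
import tempfile

def append_newer_rows(csv_path, new_rows, column=0):
    """Append rows whose integer value in `column` exceeds the file's current max."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # assumption: exactly one header line
        current_max = max((int(row[column]) for row in reader), default=None)

    # Keep everything if the file had no data rows yet.
    fresh = [row for row in new_rows
             if current_max is None or int(row[column]) > current_max]

    with open(csv_path, "a", newline="") as f:
        csv.writer(f).writerows(fresh)
    return len(fresh)

# Demo on a throwaway file with made-up rows.
path = os.path.join(tempfile.mkdtemp(), "trades_demo.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows([
        ["unixtime", "price", "amount"],
        ["1326900000", "6.50", "10.0"],
        ["1326900100", "6.55", "0.5"],
    ])

incoming = [
    ["1326900100", "6.55", "0.5"],  # already stored, should be skipped
    ["1326900200", "6.60", "3.0"],  # newer, should be appended
]
added = append_newer_rows(path, incoming)
print(added)
```

Opening the file in "a" mode leaves the existing rows untouched. A strict "greater than" test would drop distinct trades that share the last stored timestamp, so a real feed may need a different rule for deciding what counts as new.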