Csv Find Max In Column And Append New Data

I asked a question about two hours ago regarding the reading and writing of data from a website. I've spent the last two hours since then trying to find a way to read the maximum d

Solution 1:

If you have the data in a pandas DataFrame, you can get the max of any column like this:

>>> max(data['time'])
'2012-01-18 15:52:26'

Here data is the variable name for the DataFrame and time is the name of the column.
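As a rough sketch of the same idea from start to finish: the sample text and the column names below are made up, and pandas is assumed to be installed. Parsing the column as dates means the comparison is done on timestamps rather than on strings, and idxmax() gives the index of the row holding the max, which lets you pull the whole row.

```python
import io
import pandas as pd

# Made-up sample standing in for the real CSV file.
csv_text = (
    "time,price\n"
    "2012-01-17 09:00:00,5.9\n"
    "2012-01-18 15:52:26,6.1\n"
    "2012-01-16 12:30:00,5.7\n"
)

# parse_dates turns the 'time' column into real timestamps.
data = pd.read_csv(io.StringIO(csv_text), parse_dates=["time"])

latest = data["time"].max()                   # largest value in the column
latest_row = data.loc[data["time"].idxmax()]  # full row holding that value
```

With a real file you would pass the file name to pd.read_csv() instead of the StringIO object.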

Solution 2:

I'll give you two answers, one that just returns the max value, and one that returns the row from the CSV that includes the max value.

import csv
import operator as op
import requests

symbol = "mtgoxUSD"
url = 'http://api.bitcoincharts.com/v1/trades.csv?symbol={}'.format(symbol)
csv_file = "trades_{}.csv".format(symbol)

data = requests.get(url)
with open(csv_file, "w") as f:
    f.write(data.text)

with open(csv_file) as f:
    next(f) # discard first row from file -- see notes
    max_value = max(row[0] for row in csv.reader(f))

with open(csv_file) as f:
    next(f) # discard first row from file -- see notes
    max_row = max(csv.reader(f), key=op.itemgetter(0))

Notes:

  • max() can directly consume an iterator, and csv.reader() gives us an iterator, so we can just pass that in. I'm assuming you might need to throw away a header line so I showed how to do that. If you had multiple header lines to discard, you might want to use islice() from the itertools module.

  • In the first one, we use a "generator expression" to select a single value from each row, and find the max. This is very similar to a "list comprehension" but it doesn't build a whole list, it just lets us iterate over the resulting values. Then max() consumes the iterable and we get the max value.

  • max() can use a key= argument where you specify a "key function". It will use the key function to get a value and use that value to figure the max... but the value returned by max() will be the unmodified original value (in this case, a row value from the CSV). In this case, the key function is manufactured for you by operator.itemgetter()... you pass in which column you want, and operator.itemgetter() builds a function for you that gets that column.

The resulting function is the equivalent of:

def get_col_0(row):
    return row[0]
max_row = max(csv.reader(f), key=get_col_0)

Or, people will use lambda for this:

max_row = max(csv.reader(f), key=lambda row: row[0])

But I think operator.itemgetter() is convenient and nice to read. And it's fast.

  • I showed saving the data in a file, then pulling from the file again. If you want to go through the data without saving it anywhere, you just need to iterate over it by lines.

Perhaps something like:

text = data.text
rows = [line.split(',') for line in text.split("\n") if line]
rows.pop(0)  # get rid of first row from data
max_value = max(row[0] for row in rows)
max_row = max(rows, key=op.itemgetter(0))

  • I don't know which column you want... column "A" might be column 0 so I used 0 in the above. Replace the column number as you like.
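Two of the notes above can be sketched together: skipping several header lines with itertools.islice(), and picking the column in a key function. One thing to watch is that csv.reader() yields strings, so a plain max() compares text ("9" sorts above "10"); converting inside the key function makes the comparison numeric. The sample data, the two header lines, and the column index here are all made up for illustration.

```python
import csv
import io
from itertools import islice

# Hypothetical sample: two header lines, then rows of (id, price).
sample = io.StringIO(
    "exported by example tool\n"
    "id,price\n"
    "9,1.5\n"
    "10,2.5\n"
    "8,0.5\n"
)

HEADER_LINES = 2  # assumption: number of leading lines to discard
COLUMN = 0        # assumption: index of the column to take the max of

# islice() skips the first HEADER_LINES rows and yields the rest.
rows = list(islice(csv.reader(sample), HEADER_LINES, None))

# Compares the raw strings, so the result is lexicographic.
max_as_text = max(row[COLUMN] for row in rows)

# Converts each value to int first, so the result is numeric.
max_row = max(rows, key=lambda row: int(row[COLUMN]))
```

If the column holds decimals, float() in place of int() should do the same job.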

Solution 3:

It seems like something like this should work:

import requests
import csv
symbol = "mtgoxUSD"
url = 'http://api.bitcoincharts.com/v1/trades.csv?symbol={}'.format(symbol)
data = requests.get(url)
with open("trades_{}.csv".format(symbol), "r+") as f:
    all_values = list(csv.reader(f))
    max_value = max([int(row[2]) for row in all_values[1:]])
    # (write-out-the-value?)

EDITS: I used "row[2]" because that was the sample column I was taking the max of in my CSV. I also had to skip the column headers, which were all text; that is why I looked at "all_values[1:]", from the second row to the end of the file.
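For the "append new data" half of the question, one possible approach is to find the current max in the file and then append only the rows that come after it. This is only a sketch under some assumptions: append_newer_rows is a made-up helper name, the file already exists, has no header line, and ends with a newline, and the chosen column is numeric (for example a Unix timestamp).

```python
import csv

def append_newer_rows(csv_path, new_rows, column=0):
    """Append rows whose value in `column` is above the file's current max.

    Hypothetical helper: assumes no header line and a numeric column.
    Returns the number of rows appended.
    """
    with open(csv_path, newline="") as f:
        existing = [row for row in csv.reader(f) if row]  # skip blank lines

    # default=None covers an empty file, where there is no max yet.
    current_max = max((float(row[column]) for row in existing), default=None)

    fresh = [
        row for row in new_rows
        if current_max is None or float(row[column]) > current_max
    ]

    # Mode "a" adds to the end of the file without touching existing rows.
    with open(csv_path, "a", newline="") as f:
        csv.writer(f).writerows(fresh)
    return len(fresh)
```

The new_rows argument could be the rows parsed from data.text, so each download only adds what the file does not already have.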