Calculate Mean For Each CSV Row
Solution 1:
In this situation, pandas is really helpful. You can avoid explicit loops by reading each CSV file straight into a DataFrame, concatenating the three DataFrames side by side, and then calling pandas.DataFrame.mean row-wise (axis=1) on the value columns.
pandas.read_csv can limit the number of rows it reads through its nrows parameter.
import pandas as pd
df1=pd.read_csv('file1.txt',names=['x1','Y1','Value1'],nrows=5356)
df2=pd.read_csv('file2.txt',names=['x2','Y2','Value2'],nrows=5356)
df3=pd.read_csv('text3.txt',names=['x3','Y3','Value3'],nrows=5356)
df_concat= pd.concat([df1,df2,df3], axis=1)
print(df_concat)
df_concat['meanvalue']=df_concat[['Value1','Value2','Value3']].mean(axis=1)
print(df_concat.to_csv(columns=['meanvalue'],index=False))
Output:
meanvalue
-96.5
-97.0
-86.0
-95.0
Solution 2:
You may just want to build one large pandas table using join. DataFrame.join matches rows on the index, so the key you join on needs to be the index of each DataFrame.
This way, rows are combined where the x and y values are the same. You end up with five columns: x, y, and the three value columns you want to average. You can then add a new column holding the mean of those three values for each row. Whichever of x or y is unique can be used as the index.
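The index-based join described above could be sketched roughly as follows. The inline CSV strings, the load helper, and the column names x, y, Value1..Value3 are assumptions for illustration only; with real files you would pass the file paths to read_csv instead. Here both x and y are used together as the index, which is one way to apply the idea when neither is guaranteed unique on its own.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the three CSV files.
csv1 = "1,10,-95.0\n2,20,-97.0\n3,30,-85.0\n"
csv2 = "1,10,-96.0\n2,20,-98.0\n3,30,-86.0\n"
csv3 = "1,10,-97.0\n2,20,-99.0\n3,30,-87.0\n"


def load(text, value_name):
    """Read headerless x,y,value rows and index them by (x, y)."""
    df = pd.read_csv(io.StringIO(text), names=["x", "y", value_name])
    return df.set_index(["x", "y"])


df1 = load(csv1, "Value1")
df2 = load(csv2, "Value2")
df3 = load(csv3, "Value3")

# join aligns rows on the shared (x, y) index; "inner" keeps only
# keys that are present in every frame.
joined = df1.join(df2, how="inner").join(df3, how="inner")

# Row-wise mean across the three value columns.
joined["meanvalue"] = joined[["Value1", "Value2", "Value3"]].mean(axis=1)

print(joined.reset_index())
```

Because the rows are aligned on the index rather than on their position in the file, this still works when the files list the same points in a different order.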
The pandas merge function does the same kind of join, but matches rows on column values instead of the index.
In SQL terms, what you are doing is an inner join on the y values, which I assume are unique within each CSV file.
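As a minimal sketch of the merge approach, assuming the same x, y, value layout: the sample strings below are made up, and the second one is deliberately in a different row order with one unmatched point to show what the inner join does. This version merges on both x and y; if y alone is unique in your files, on="y" would be the direct equivalent of the SQL join mentioned above.

```python
import io
from functools import reduce

import pandas as pd

# Hypothetical sample data; csv2 is reordered and contains a point
# (4, 40) that the other files do not have.
csv1 = "1,10,-95.0\n2,20,-97.0\n3,30,-85.0\n"
csv2 = "2,20,-98.0\n1,10,-96.0\n4,40,-50.0\n"
csv3 = "1,10,-97.0\n2,20,-99.0\n3,30,-87.0\n"

frames = [
    pd.read_csv(io.StringIO(text), names=["x", "y", f"Value{i}"])
    for i, text in enumerate([csv1, csv2, csv3], start=1)
]

# Merge the frames pairwise on the x and y columns. An inner merge
# drops any point that is missing from at least one file.
merged = reduce(
    lambda left, right: pd.merge(left, right, on=["x", "y"], how="inner"),
    frames,
)

# Row-wise mean across the three value columns.
merged["meanvalue"] = merged[["Value1", "Value2", "Value3"]].mean(axis=1)

print(merged)
```

Only the points (1, 10) and (2, 20) appear in all three inputs, so only those two rows survive the merge. Switching to how="outer" would keep every point and leave NaN where a file has no value; mean skips NaN by default.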