Calculate average of every n rows from a csv file
Integer-divide the index by step to label consecutive groups of rows, then pass those labels to groupby and aggregate with mean:
step = 30
m_df = pd.read_csv(m_path, usecols=['Col-01'])
df = m_df.groupby(m_df.index // step).mean()
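Or, grouping by row position instead of index label (this also works when the index is not the default 0..n-1 range):
df = m_df.groupby(np.arange(len(m_df)) // step).mean()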
Sample data:
step = 3
df = m_df.groupby(m_df.index // step).mean()
print(df)
   H
0  3
1  1
2  2
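To make the difference between the two groupby variants concrete, here is a minimal sketch. The data values and the names by_label, by_position and filtered are made up for illustration; only the Col-01 column name comes from the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 7 rows, so the last group of 3 is incomplete.
m_df = pd.DataFrame({"Col-01": [1, 2, 3, 4, 5, 6, 10]})
step = 3

# With the default 0..n-1 index, both variants build the labels 0,0,0,1,1,1,2.
by_label = m_df.groupby(m_df.index // step).mean()
by_position = m_df.groupby(np.arange(len(m_df)) // step).mean()

# After filtering, the index labels are 1..6, so only the position-based
# variant still forms groups of exactly `step` consecutive rows.
filtered = m_df[m_df["Col-01"] > 1]
filtered_by_position = filtered.groupby(np.arange(len(filtered)) // step).mean()

print(by_label)
print(filtered_by_position)
```

The incomplete last group is averaged over the rows it has, not padded or dropped.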
Calculate Average Every x Rows with MySQL Query
It's not clear to me whether you actually want to persist the average values in your table or just calculate them in a SELECT query.
Assuming it's the latter, you could do something like this:
set @rownum := 0;
set @sum := 0;
select ts, messageVals, the_avg
from (
    select ts, messageVals,
        @rownum := (@rownum + 1) as rownum,
        @sum := IF(@rownum mod 20 = 1, 0 + messageVals, @sum + messageVals) as running_sum,
        IF(@rownum mod 20 = 0, @sum / 20, NULL) as the_avg
    from so9571582
    order by ts
) s;
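For readers more comfortable with Python, the logic of that query (an average that appears only on the last row of each complete block, NULL elsewhere) can be sketched with pandas. This is an illustration, not a translation of MySQL semantics; the function name block_average_on_last_row and the sample values are hypothetical.

```python
import numpy as np
import pandas as pd

def block_average_on_last_row(values, n):
    """Return a Series holding each block's mean on the block's last row, NaN elsewhere."""
    s = pd.Series(values, dtype=float)
    pos = np.arange(len(s))
    # Mean of each consecutive block of n rows, broadcast back to every row.
    block_mean = s.groupby(pos // n).transform("mean")
    # Keep the value only where a complete block ends (rows n, 2n, ... counted from 1).
    is_block_end = (pos + 1) % n == 0
    return block_mean.where(is_block_end)

the_avg = block_average_on_last_row([1, 2, 3, 4, 5, 6, 7], n=3)
print(the_avg)
```

As in the query, a trailing incomplete block produces no average.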
Python: I need to find the average over x amount of rows in a specific column of a large csv file
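Use NumPy: reshape the column into rows of n values and average each row.
n = 3
x = df['Pressure'].to_numpy()
# average each consecutive group of n values (len(x) must be a multiple of n)
avgResult = np.average(x.reshape(-1, n), axis=1)
The result is an array with one mean per consecutive group of n values. For example, with
n = 3
x = np.array([1, 4, 5, 2, 8, 4])
the result is
array([3.33333333, 4.66666667])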
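Note that reshape(-1, n) raises an error when the number of values is not a multiple of n. A minimal sketch of one way around this, assuming it is acceptable to drop the leftover rows at the end (the helper name block_means is hypothetical):

```python
import numpy as np

def block_means(values, n):
    """Mean of each complete block of n consecutive values; leftover values are dropped."""
    arr = np.asarray(values, dtype=float)
    usable = (len(arr) // n) * n  # largest multiple of n that fits
    return arr[:usable].reshape(-1, n).mean(axis=1)

# Seven values with n = 3: the trailing 9 is ignored.
result = block_means([1, 4, 5, 2, 8, 4, 9], 3)
print(result)
```

If the trailing partial block should be averaged too, a groupby on np.arange(len(x)) // n is the simpler tool.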
How to calculate the average every x steps
I do not know what your columns mean, but in principle you can do this:
Create a group variable with one integer value per cm.
Group by that variable and compute the mean of the target columns.
In the code below I assume Depth is the variable measured in cm (it goes from 0.0 to 3.9 in steps of 0.1). The groups can be obtained by rounding the values of this column down with floor.
I use data.table (the code below computes the mean of every column; you can specify a single column in the dd[, lapply(.SD, mean), by = grp] call).
library(data.table)
dd = data.table(data) # data is your data.frame
dd[, grp := floor(Depth)]
dd_avg = dd[, lapply(.SD, mean), by = grp]
print(dd_avg)
grp Depth AIncCoh BFeIncCoh CAlinccoh DPinccoh ECaTi FBrCl
1: 0 0.45 5.690180 1.9827592 0.0007325078 0.0004672228 3.117323 3.143967
2: 1 1.45 6.448082 0.6429498 0.0003361405 0.0002156380 2.728719 2.877215
3: 2 2.45 6.353116 1.5307018 0.0006007855 0.0004556655 2.902610 2.693379
4: 3 3.45 6.261482 1.7493288 0.0006819498 0.0004533617 2.014624 2.782969
GSrCa HSiinccoh IMnFe JFeTi KZrRb Age
1: 0.6493423 0.0009668790 0.005252742 315.63711 2.781960 NA
2: 0.5024054 0.0007016292 0.005712667 97.73804 3.356514 NA
3: 0.4322129 0.0008660846 0.007432665 221.57936 2.303042 NA
4: 0.4557961 0.0011636097 0.004084271 183.38096 2.633945 NA
This is just an example because I do not know the precise details of your problem. I think you can easily adapt the 'logic' to your problem.
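The same floor-and-group logic can be sketched in Python with pandas. The data here is synthetic: only the Depth range (0.0 to 3.9 in steps of 0.1) follows the answer, and the value column is a made-up stand-in for the measured columns.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the data: Depth runs 0.0, 0.1, ..., 3.9.
data = pd.DataFrame({
    "Depth": np.arange(40) / 10,
    "value": np.arange(40, dtype=float),  # hypothetical measurement column
})

# One group per whole cm: floor(Depth) gives 0, 1, 2, 3.
grp = np.floor(data["Depth"]).astype(int).rename("grp")

# Mean of every column within each group.
dd_avg = data.groupby(grp).mean()
print(dd_avg)
```

To average a single column, select it after the groupby, for example data.groupby(grp)["value"].mean().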
SQL - Average every n rows for items with same ID
This ended up working for me, although it is not what I initially planned:
select id, reference, avg(b1), avg(b25), avg(b10), max(created_at)
from
(
    select id,
        @row_number := case when @reference = reference then @row_number + 1 else 0 end as row_number,
        @reference := reference as reference,
        b1,
        b25,
        b10,
        created_at
    from history_air
    cross join (select @row_number := -1, @reference := '') as t
    order by reference, created_at
) as t
group by reference, row_number div 150
order by reference, row_number div 150;
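For comparison, the per-reference row numbering and integer division can be sketched in pandas with cumcount. This is an illustrative equivalent, not a drop-in replacement: the sample rows are invented, only the reference, b1 and created_at names come from the query, and n is set to 2 instead of 150 to keep the data small.

```python
import pandas as pd

# Invented sample rows; column names follow the query above.
history = pd.DataFrame({
    "reference": ["a"] * 5 + ["b"] * 3,
    "created_at": pd.date_range("2024-01-01", periods=8, freq="min"),
    "b1": [1, 2, 3, 4, 5, 10, 20, 30],
})
n = 2  # block size (150 in the query)

history = history.sort_values(["reference", "created_at"])
# Row number within each reference (0, 1, 2, ...), integer-divided into blocks.
block = (history.groupby("reference").cumcount() // n).rename("block")

result = (
    history.groupby(["reference", block])
    .agg(b1=("b1", "mean"), created_at=("created_at", "max"))
    .reset_index()
)
print(result)
```

Each reference restarts its block numbering at 0, which mirrors the reset of @row_number when the reference changes.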
Mean of every 15 rows of a dataframe in python
The following should work:
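import pandas as pd

n = 15
rows = []
# Walk the frame in blocks of n rows; the last block may be shorter.
for i in range(0, len(df), n):
    rows.append(df.iloc[i:i + n, :].mean())
# DataFrame.append was removed in pandas 2.0, so build the result in one step.
dfnew = pd.DataFrame(rows)
print(dfnew)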