Finding moving average from data points in Python
Before reading this answer, bear in mind that there is another answer below, from Roman Kh, which uses numpy.cumsum and is much faster than this one.
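For orientation, here is a minimal sketch of what a numpy.cumsum based running mean could look like. It is an illustration only, not a copy of that other answer, and the function name running_mean_cumsum is made up for this example.

```python
import numpy as np

def running_mean_cumsum(x, n):
    # Hypothetical helper name. Prepend 0 so that c[i] is the sum of the first i samples.
    c = np.cumsum(np.insert(np.asarray(x, dtype=float), 0, 0.0))
    # The sum of each window of n samples is a difference of two cumulative sums.
    return (c[n:] - c[:-n]) / n

print(running_mean_cumsum([1, 2, 3, 4, 5, 6, 7], 3))  # [2. 3. 4. 5. 6.]
```

Each output value costs one subtraction and one division regardless of the window size, which is probably where the speed difference comes from.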
One common way to apply a moving/sliding average (or any other sliding window function) to a signal is by using numpy.convolve().
import numpy

def movingaverage(interval, window_size):
    # Uniform weights that sum to 1, so the convolution is an average.
    window = numpy.ones(int(window_size)) / float(window_size)
    return numpy.convolve(interval, window, 'same')
Here, interval is your x array, and window_size is the number of samples to consider. The window will be centered on each sample, so it takes samples before and after the current sample in order to calculate the average. Your code would become:
plot(x,y)
xlim(0,1000)
x_av = movingaverage(interval, r)
plot(x_av, y)
xlabel("Months since Jan 1749.")
ylabel("No. of Sun spots")
show()
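One thing to watch with mode 'same': numpy.convolve treats samples outside the array as zeros, so the first and last few averages are pulled toward zero. A small sketch on a made-up constant signal shows the effect:

```python
import numpy as np

def movingaverage(interval, window_size):
    window = np.ones(int(window_size)) / float(window_size)
    return np.convolve(interval, window, 'same')

# Made-up test signal: six samples, all equal to 10.
signal = np.full(6, 10.0)
smoothed = movingaverage(signal, 3)

print(len(smoothed))  # 6, same length as the input
print(smoothed)       # interior values are 10, the two edge values are about 6.67
```

If the edge values matter, mode 'valid' avoids the padding at the cost of a shorter output.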
Moving average or running mean
For a short, fast solution that does the whole thing in one loop, without dependencies, the code below works great.
mylist = [1, 2, 3, 4, 5, 6, 7]
N = 3
cumsum, moving_aves = [0], []
for i, x in enumerate(mylist, 1):
    cumsum.append(cumsum[i-1] + x)
    if i >= N:
        moving_ave = (cumsum[i] - cumsum[i-N]) / N
        # can do stuff with moving_ave here
        moving_aves.append(moving_ave)
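To make the loop reusable, it could be wrapped in a function. This is only a sketch; the name running_mean is an assumption, and the second function is a slow brute-force version included purely as a cross-check.

```python
def running_mean(values, n):
    # Hypothetical wrapper around the cumulative-sum loop.
    cumsum = [0]
    averages = []
    for i, x in enumerate(values, 1):
        cumsum.append(cumsum[i - 1] + x)
        if i >= n:
            averages.append((cumsum[i] - cumsum[i - n]) / n)
    return averages

def running_mean_bruteforce(values, n):
    # Recomputes every window sum from scratch; for checking only.
    return [sum(values[i:i + n]) / n for i in range(len(values) - n + 1)]

data = [1, 2, 3, 4, 5, 6, 7]
print(running_mean(data, 3))  # [2.0, 3.0, 4.0, 5.0, 6.0]
print(running_mean(data, 3) == running_mean_bruteforce(data, 3))  # True
```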
Moving average on list of (x,y) points in python
Here is an example using a convolution of size s:
import numpy as np

v = np.array([(0, 4), (1, 5), (2, 6), (-1, 9), (3, 7), (4, 8), (5, 9)])
s = 2
kernel = np.ones(s)
x = np.convolve(v[:,0], kernel, 'valid') / s
y = np.convolve(v[:,1], kernel, 'valid') / s
res = np.hstack((x[:, None], y[:, None]))
print(res)
Output:
[[0.5 4.5]
 [1.5 5.5]
 [0.5 7.5]
 [1.  8. ]
 [3.5 7.5]
 [4.5 8.5]]
The bigger s, the smoother the path. However, the bigger s, the shorter the path: with mode 'valid', the result has len(v) - s + 1 points.
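The trade-off between smoothness and length can be checked directly. The sketch below wraps the same convolution in a helper (smooth_path is a made-up name) and counts how many points survive for a few window sizes:

```python
import numpy as np

v = np.array([(0, 4), (1, 5), (2, 6), (-1, 9), (3, 7), (4, 8), (5, 9)])

def smooth_path(points, s):
    # Hypothetical helper: average each coordinate over s consecutive points.
    kernel = np.ones(s)
    x = np.convolve(points[:, 0], kernel, 'valid') / s
    y = np.convolve(points[:, 1], kernel, 'valid') / s
    return np.column_stack((x, y))

lengths = {s: len(smooth_path(v, s)) for s in (1, 2, 3, 4)}
print(lengths)  # {1: 7, 2: 6, 3: 5, 4: 4}
```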
How to calculate rolling / moving average using python + NumPy / SciPy?
A simple way to achieve this is by using np.convolve.
The idea behind this is to leverage the way the discrete convolution is computed and use it to return a rolling mean. This can be done by convolving with a sequence of np.ones of a length equal to the sliding window length we want.
In order to do so we could define the following function:
import numpy as np

def moving_average(x, w):
    return np.convolve(x, np.ones(w), 'valid') / w
This function takes the convolution of the sequence x and a sequence of ones of length w. Note that the chosen mode is valid, so that the convolution product is only given for points where the sequences overlap completely.
Some examples:
x = np.array([5,3,8,10,2,1,5,1,0,2])
For a moving average with a window of length 2 we would have:
moving_average(x, 2)
# array([4. , 5.5, 9. , 6. , 1.5, 3. , 3. , 0.5, 1. ])
And for a window of length 4:
moving_average(x, 4)
# array([6.5 , 5.75, 5.25, 4.5 , 2.25, 1.75, 2. ])
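To see what the choice of mode changes, the following sketch compares the output lengths of the three modes that np.convolve accepts, using the same x and a window of 4:

```python
import numpy as np

x = np.array([5, 3, 8, 10, 2, 1, 5, 1, 0, 2])
w = 4

# Output length for each mode: 'full' includes every partial overlap,
# 'same' matches the longer input, 'valid' keeps complete overlaps only.
lengths = {mode: len(np.convolve(x, np.ones(w), mode)) for mode in ('full', 'same', 'valid')}
print(lengths)  # {'full': 13, 'same': 10, 'valid': 7}
```

Only 'valid' avoids windows that hang over the ends of x, which is why its output has len(x) - w + 1 values.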
How does convolve work?
Let's have a more in-depth look at the way the discrete convolution is being computed.
The following function aims to replicate the way np.convolve computes the output values:
def mov_avg(x, w):
    for m in range(len(x)-(w-1)):
        yield sum(np.ones(w) * x[m:m+w]) / w
Which, for the same example above would also yield:
list(mov_avg(x, 2))
# [4.0, 5.5, 9.0, 6.0, 1.5, 3.0, 3.0, 0.5, 1.0]
So what is being done at each step is to take the inner product between the array of ones and the current window. In this case the multiplication by np.ones(w) is superfluous, given that we are directly taking the sum of the sequence.
Below is an example of how the first outputs are computed, so that it is a little clearer. Let's suppose we want a window of w=4:
[1,1,1,1]
[5,3,8,10,2,1,5,1,0,2]
= (1*5 + 1*3 + 1*8 + 1*10) / w = 6.5
And the following output would be computed as:
  [1,1,1,1]
[5,3,8,10,2,1,5,1,0,2]
= (1*3 + 1*8 + 1*10 + 1*2) / w = 5.75
And so on, returning a moving average of the sequence once all overlaps have been performed.
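The sliding inner product can be checked against np.convolve numerically. One caveat is worth a sketch: a true convolution reverses the kernel before sliding it, which makes no difference for a kernel of ones but does for a non-symmetric one. The kernel k below is a made-up example; np.correlate is the non-reversing counterpart.

```python
import numpy as np

x = np.array([5, 3, 8, 10, 2, 1, 5, 1, 0, 2])
w = 4

# Inner product of the ones kernel with each full window, as described above.
manual = np.array([np.dot(np.ones(w), x[m:m + w]) / w for m in range(len(x) - w + 1)])
via_convolve = np.convolve(x, np.ones(w), 'valid') / w
print(np.allclose(manual, via_convolve))  # True

# With a non-symmetric kernel the reversal becomes visible.
k = np.array([1.0, 0.0])
print(np.convolve([1, 2, 3], k, 'valid'))   # [2. 3.]
print(np.correlate([1, 2, 3], k, 'valid'))  # [1. 2.]
```

For a plain moving average the two give the same result; the distinction only matters once the weights differ across the window.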
moving average in python
if lst[i] < n:
You do not define i, but I guess i is an index of lst, so try:
for i, _ in enumerate(lst):
    if lst[i] < n:
        s += lst[n - (n-i)]
EDIT:
def gen(lst, n):
    # Reject windows that are empty or longer than the data.
    if n <= 0 or n > len(lst):
        print("Sorry can't do this")
        return
    # range() also handles n == 1, where the slice lst[:-(n-1)] would be empty.
    for i in range(len(lst) - n + 1):
        yield sum(lst[i:i+n]) / n
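If the input is not a list but an arbitrary iterable (a file, a stream, another generator), slicing is not available. A possible variation, sketched here under that assumption, keeps only the last n values in a collections.deque and updates a running total. The name rolling_mean is made up for this example.

```python
from collections import deque

def rolling_mean(iterable, n):
    # Hypothetical helper: yields the mean of each run of n consecutive values.
    if n <= 0:
        raise ValueError("window size must be positive")
    window = deque(maxlen=n)
    total = 0
    for value in iterable:
        if len(window) == n:
            # The oldest value is about to be evicted by append().
            total -= window[0]
        window.append(value)
        total += value
        if len(window) == n:
            yield total / n

print(list(rolling_mean([1, 2, 3, 4, 5], 3)))  # [2.0, 3.0, 4.0]
```

If the input has fewer than n values, the generator simply yields nothing.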