There is a more accurate method for calculating focal statistics than usually implemented in numerical packages. This is documented here and the C++ code is download from here
The SWIG setup file is simple:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
/* File: RunningStats.i */ | |
%module RunningStats | |
%{ | |
#define SWIG_FILE_WITH_INIT | |
#include "RunningStats.h" | |
%} | |
%include "RunningStats.h" |
which you run using:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
swig -c++ -python RunningStats.i |
The Distutils setup script is:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
""" | |
setup.py file for RunningStats | |
""" | |
from distutils.core import setup, Extension | |
RunningStats_module = Extension('_RunningStats', | |
sources=['RunningStats_wrap.cxx', 'RunningStats.cpp'], | |
) | |
setup (name = 'RunningStats', | |
version = '0.1', | |
author = "John D. Cook / Daniel Buscombe", | |
description = """Compute running stats on an array""", | |
ext_modules = [RunningStats_module], | |
py_modules = ["RunningStats"], | |
) |
compile using "python setup.py install"
Then in python, a simple implementation of calculating the global stats from an array would be something like:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import numpy as np | |
# generate some data | |
x = np.random.rand(1000,1) | |
# get the class | |
import RunningStats | |
# get an instance of the class | |
rs = RunningStats.RunningStats() | |
# global stats | |
for k in x: | |
rs.Push(k[0]) | |
print rs.Variance() | |
print rs.Mean() | |
print rs.Kurtosis() | |
print rs.Skewness() | |
print rs.StandardDeviation() | |
rs.Clear() |
To show the convergence of locals means on the global mean, you might use the RunningStats class in the following way:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import numpy as np | |
import matplotlib.pylab as plt | |
# generate some data | |
x = np.random.rand(1000,1) | |
# get the class | |
import RunningStats | |
# get an instance of the class | |
rs = RunningStats.RunningStats() | |
# convergence on global mean | |
convav = [] | |
for k in x: | |
rs.Push(k[0]) | |
convav.append(rs.Mean()) | |
plt.plot(x); plt.plot(convav,'r'); plt.show() |
Finally, to compute running averages on the array you might use something like the following, where the function to window the data efficiently was downloaded from here
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import numpy as np | |
import matplotlib.pylab as plt | |
# generate some data | |
x = np.random.rand(1000,1) | |
# get the class | |
import RunningStats | |
# get an instance of the class | |
rs = RunningStats.RunningStats() | |
# define a window size for the running average | |
window_size = 10 | |
from itertools import product | |
# use sliding window_nd from here: http://www.johnvinyard.com/blog/?p=268 | |
xs = sliding_window_nd(np.squeeze(x),window_size,1) | |
runav = [] | |
for j in xrange(len(xs)): | |
for k in xs[j]: | |
rs.Push(k) | |
runav.append(rs.Mean()) | |
rs.Clear() | |
i = np.linspace(1,len(x),len(runav)) | |
plt.plot(x); plt.plot(i,runav,'r'); | |
# compute using a larger window size | |
window_size = 50 | |
xs = sliding_window_nd(np.squeeze(x),window_size,1) | |
runav = [] | |
for j in xrange(len(xs)): | |
for k in xs[j]: | |
rs.Push(k) | |
runav.append(rs.Mean()) | |
rs.Clear() | |
i = np.linspace(1,len(x),len(runav)) | |
plt.plot(i,runav,'k'); | |
plt.show() |
Obviously, as well as the mean, the above could be used for variance, standard deviation, kurtosis and skewness.