Chi-squared for determining people voting in each category
Your data is in the form of a contingency table. SciPy provides scipy.stats.chi2_contingency for applying the chi-squared test of independence to such a table. It returns the test statistic, the p-value, the degrees of freedom, and the expected frequencies.
For example:
In [48]: import numpy as np
In [49]: from scipy.stats import chi2_contingency
In [50]: tbl = np.array([[1500, 826, 431], [212, 652, 542]])
In [51]: stat, p, df, expected = chi2_contingency(tbl)
In [52]: stat
Out[52]: 630.0807418107023
In [53]: p
Out[53]: 1.5125346728116583e-137
In [54]: df
Out[54]: 2
In [55]: expected
Out[55]:
array([[1133.79389863,  978.82440548,  644.38169589],
       [ 578.20610137,  499.17559452,  328.61830411]])
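The expected frequencies above follow from the row and column totals under the assumption of independence. As a minimal sketch (the names row_totals, col_totals, expected_manual, and stat_manual are illustrative and not part of the original answer), they can be reproduced by hand and compared with what chi2_contingency returns:

```python
import numpy as np
from scipy.stats import chi2_contingency

tbl = np.array([[1500, 826, 431], [212, 652, 542]])

# Marginal totals of the contingency table.
row_totals = tbl.sum(axis=1, keepdims=True)  # shape (2, 1)
col_totals = tbl.sum(axis=0, keepdims=True)  # shape (1, 3)
total = tbl.sum()

# Expected count under independence: row total * column total / grand total.
expected_manual = row_totals * col_totals / total

# Pearson statistic: sum over all cells of (observed - expected)^2 / expected.
stat_manual = ((tbl - expected_manual) ** 2 / expected_manual).sum()

stat, p, df, expected = chi2_contingency(tbl)
print(np.allclose(expected_manual, expected))
print(np.isclose(stat_manual, stat))
```

The manual statistic matches here because the table has more than one degree of freedom. For a 2x2 table, chi2_contingency applies Yates' continuity correction by default, so the comparison would only hold with correction=False.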
Chi square test with different sample sizes in Python
You can't do this unless both f_exp and f_obs have the same length. You can achieve your goal by interpolating Y_data2 on the x-axis of Y_data1. You can do it as follows:
from scipy.interpolate import InterpolatedUnivariateSpline
spl = InterpolatedUnivariateSpline(X_data2, Y_data2)
new_Y_data2 = spl(X_data1)
As both Y_data1 and new_Y_data2 now have the same length, you can use them in stats.chisquare as follows:
from scipy import stats
stats.chisquare(f_obs=Y_data1, f_exp=new_Y_data2)
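For a version that runs on its own, here is a sketch with made-up data, since the original arrays are not shown. The values assigned to X_data1, Y_data1, X_data2, and Y_data2 are hypothetical. The rescaling step is an addition: recent SciPy releases raise an error when the totals of f_obs and f_exp differ, so the interpolated values are scaled to the same sum first. Keep in mind that chisquare is designed for frequency counts, so whether it is the right test depends on what your y-values represent.

```python
import numpy as np
from scipy import stats
from scipy.interpolate import InterpolatedUnivariateSpline

# Hypothetical data: two curves sampled on grids of different length.
X_data1 = np.linspace(0.0, 10.0, 25)
Y_data1 = 50.0 + 10.0 * np.sin(X_data1)
X_data2 = np.linspace(0.0, 10.0, 40)
Y_data2 = 52.0 + 9.0 * np.sin(X_data2)

# Interpolate the second curve onto the x-values of the first.
spl = InterpolatedUnivariateSpline(X_data2, Y_data2)
new_Y_data2 = spl(X_data1)

# Assumption: rescale so both arrays have the same total, which
# recent SciPy versions require of f_obs and f_exp.
new_Y_data2 = new_Y_data2 * (Y_data1.sum() / new_Y_data2.sum())

stat, p = stats.chisquare(f_obs=Y_data1, f_exp=new_Y_data2)
print(stat, p)
```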
Chi-Squared test in Python
scipy.stats.chisquare expects observed and expected absolute frequencies, not ratios. You can obtain what you want with
>>> import numpy as np
>>> from scipy.stats import chisquare
>>> observed = np.array([20., 20., 0., 0.])
>>> expected = np.array([.25, .25, .25, .25]) * np.sum(observed)
>>> chisquare(observed, expected)
(40.0, 1.065509033425585e-08)
If the expected values are uniformly distributed over the classes, you can leave out the expected values altogether:
>>> chisquare(observed)
(40.0, 1.065509033425585e-08)
The first returned value is the χ² statistic, the second the p-value of the test.
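To see where those two numbers come from, here is a small sketch that computes them by hand. The names stat_manual, p_manual, and dof are illustrative; the degrees of freedom are taken as the number of classes minus one, which is the default that chisquare uses.

```python
import numpy as np
from scipy.stats import chi2, chisquare

observed = np.array([20., 20., 0., 0.])

# Uniform expectation: the total count split evenly over the classes.
expected = np.full(observed.shape, observed.sum() / observed.size)

# Pearson statistic: sum of (observed - expected)^2 / expected.
stat_manual = ((observed - expected) ** 2 / expected).sum()

# p-value: upper tail of the chi-squared distribution with k - 1
# degrees of freedom, where k is the number of classes.
dof = observed.size - 1
p_manual = chi2.sf(stat_manual, dof)

stat, p = chisquare(observed)
print(stat_manual, p_manual)
```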