Count frequency of values in pandas DataFrame column
You can use value_counts and to_dict:
print(df['status'].value_counts())
N 14
S 4
C 2
Name: status, dtype: int64
counts = df['status'].value_counts().to_dict()
print(counts)
{'S': 4, 'C': 2, 'N': 14}
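The DataFrame itself isn't shown above; a self-contained sketch, with hypothetical data matching the printed counts:

```python
import pandas as pd

# Hypothetical data reproducing the counts shown above
df = pd.DataFrame({'status': ['N'] * 14 + ['S'] * 4 + ['C'] * 2})

# value_counts() returns a Series of frequencies; to_dict() converts it
counts = df['status'].value_counts().to_dict()
print(counts)  # {'N': 14, 'S': 4, 'C': 2}
```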
Count the frequency that a value occurs in a dataframe column
Use value_counts(), as @DSM commented.
In [37]:
df = pd.DataFrame({'a':list('abssbab')})
df['a'].value_counts()
Out[37]:
b 3
a 2
s 2
dtype: int64
Also groupby and count. Many ways to skin a cat here.
In [38]:
df.groupby('a').count()
Out[38]:
a
a
a 2
b 3
s 2
[3 rows x 1 columns]
See the online docs.
If you wanted to add frequency back to the original dataframe, use transform to return an aligned index:
In [41]:
df['freq'] = df.groupby('a')['a'].transform('count')
df
Out[41]:
a freq
0 a 2
1 b 3
2 s 2
3 s 2
4 b 3
5 a 2
6 b 3
[7 rows x 2 columns]
Count frequency of values in pandas DataFrame
Try (the top-level pd.value_counts is deprecated in recent pandas, so call it as pd.Series.value_counts):
result = df.apply(pd.Series.value_counts).fillna(0)
col1 col2 col3 col4 col5 col6 col7 col8
A 5.0 1.0 0.0 0.0 5.0 5.0 0.0 0.0
C 0.0 0.0 1.0 4.0 2.0 0.0 6.0 1.0
G 1.0 1.0 6.0 3.0 0.0 1.0 0.0 0.0
T 1.0 5.0 0.0 0.0 0.0 1.0 1.0 6.0
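The frame above isn't defined in the answer; a smaller self-contained sketch with hypothetical nucleotide columns shows the same per-column counting:

```python
import pandas as pd

# Hypothetical frame: each column holds one nucleotide per row
df = pd.DataFrame({
    'col1': list('AACGT'),
    'col2': list('AATTT'),
})

# pd.Series.value_counts runs on each column independently;
# fillna(0) fills in letters that never appear in a given column
result = df.apply(pd.Series.value_counts).fillna(0)
print(result)
```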
How to count frequency of values across the entire dataframe
You can use the stack function to stack all values into one column, and then use value_counts:
df.stack().value_counts()
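A minimal sketch with made-up data, showing how stacking tallies values across every column at once:

```python
import pandas as pd

# Hypothetical frame with values repeated across columns
df = pd.DataFrame({'x': ['a', 'b', 'a'], 'y': ['b', 'b', 'c']})

# stack() flattens all cells into a single Series,
# so value_counts() counts over the whole frame
freq = df.stack().value_counts()
print(freq)  # b appears 3 times, a twice, c once
```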
Pandas: how to count the frequency of words in a column based on another column
You can first select based on column2, then use .value_counts(). Try this:
>>> df[df['column2'] == 'Ford']['column4'].value_counts()
As DataFrame:
>>> pd.DataFrame(df[df['column2'] == 'Ford']['column4'].value_counts())\
...     .reset_index()\
...     .rename(columns={'index': 'model', 'column4': 'counts'})
model counts
0 Ford 2
1 Mustang 1
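A self-contained sketch of the filter-then-count step, with hypothetical data using the question's column names:

```python
import pandas as pd

# Hypothetical data mirroring the question's column names
df = pd.DataFrame({
    'column2': ['Ford', 'Ford', 'Ford', 'Toyota'],
    'column4': ['Ford', 'Mustang', 'Ford', 'Corolla'],
})

# Boolean mask keeps only Ford rows; value_counts tallies their models
counts = df[df['column2'] == 'Ford']['column4'].value_counts()
print(counts)  # Ford: 2, Mustang: 1
```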
pandas count frequency of column value in another dataframe column
Use Series.map with the Series created by Series.value_counts, and finally replace missing values with 0:
df2["freq"] = df2["col2"].map(df1["col1"].value_counts()).fillna(0).astype(int)
print(df2)
col2 freq
0 636 2
1 734 0
2 801 1
3 803 0
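A runnable version with hypothetical frames constructed to match the output shown:

```python
import pandas as pd

# Hypothetical frames matching the printed result
df1 = pd.DataFrame({'col1': [636, 636, 801]})
df2 = pd.DataFrame({'col2': [636, 734, 801, 803]})

# map() looks each col2 value up in df1's frequency table;
# values absent from df1 become NaN, hence fillna(0)
df2['freq'] = df2['col2'].map(df1['col1'].value_counts()).fillna(0).astype(int)
print(df2)
```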
Count frequency of categories in column across subsets of data
I believe you're looking for pd.crosstab
:
>>> pd.crosstab(index=df["Growth Class 2"], columns=[df.Date, df.Plot])
Date 2021-06-14 2021-06-20 2021-06-28 2021-07-05
Plot C1 C2 C3 C4 T1 T2 C3 C4 T2 T4 C1 C4 T1 T4 T2 T4
Growth Class 2
I 1 1 0 0 0 1 0 1 0 0 0 0 0 1 0 1
S 1 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0
SS 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0
V 1 0 1 1 0 0 0 0 1 1 1 0 1 0 1 0
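A small self-contained sketch of the same crosstab call, using hypothetical rows in the shape the question describes:

```python
import pandas as pd

# Hypothetical subset of the question's data
df = pd.DataFrame({
    'Date': ['2021-06-14', '2021-06-14', '2021-06-20'],
    'Plot': ['C1', 'C2', 'T1'],
    'Growth Class 2': ['I', 'S', 'I'],
})

# crosstab counts occurrences of each growth class
# per (Date, Plot) column pair
table = pd.crosstab(index=df['Growth Class 2'], columns=[df.Date, df.Plot])
print(table)
```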
Count the frequency that a bunch of values occurs in a dataframe column
IIUC, use pd.cut
:
out = df.groupby(pd.cut(df['col2'], np.linspace(0, 1, 101)))['col1'].sum()
print(out)
# Output
col2
(0.0, 0.01] 33
(0.01, 0.02] 0
(0.02, 0.03] 31
(0.03, 0.04] 12
(0.04, 0.05] 0
..
(0.95, 0.96] 0
(0.96, 0.97] 0
(0.97, 0.98] 0
(0.98, 0.99] 0
(0.99, 1.0] 0
Name: col1, Length: 100, dtype: int64
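A self-contained sketch with hypothetical data chosen so the first few bins match the output above (note np.linspace requires importing numpy):

```python
import numpy as np
import pandas as pd

# Hypothetical data: col2 holds fractions in (0, 1], col1 the amounts to sum
df = pd.DataFrame({
    'col1': [10, 23, 31, 12],
    'col2': [0.005, 0.008, 0.025, 0.031],
})

# pd.cut buckets col2 into 100 equal-width bins over (0, 1];
# observed=False keeps empty bins in the result
bins = pd.cut(df['col2'], np.linspace(0, 1, 101))
out = df.groupby(bins, observed=False)['col1'].sum()
print(out.head())  # first bin (0.0, 0.01] sums to 10 + 23 = 33
```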
R Count Frequency of Custom Dictionary in a Dataframe Column but Group them
Try group_by() and summarise(), and you can spread() afterwards to create a column for each year.
See if this works for you:
freq_auth <- tweetsanalysis1 %>%
mutate(authority_dic =str_c(str_extract(text, str_c(authority_dic, collapse = '|')))) %>%
group_by(authority_dic, year, user_username) %>%
summarise(freq_word = n()) %>%
arrange(desc(freq_word)) %>%
spread(year, freq_word)