Python: get a frequency count based on two columns (variables) in a pandas dataframe
You can use groupby's size:
In [11]: df.groupby(["Group", "Size"]).size()
Out[11]:
Group     Size
Moderate  Medium    1
          Small     1
Short     Small     2
Tall      Large     1
dtype: int64
In [12]: df.groupby(["Group", "Size"]).size().reset_index(name="Time")
Out[12]:
      Group    Size  Time
0  Moderate  Medium     1
1  Moderate   Small     1
2     Short   Small     2
3      Tall   Large     1
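The answer above never shows `df` itself; here is a minimal runnable sketch, with the rows reconstructed from the counts shown (an assumption):

```python
import pandas as pd

# Reconstructed sample data matching the counts in the answer (an assumption)
df = pd.DataFrame({
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size":  ["Small", "Small", "Medium",   "Small",    "Large"],
})

# size() counts rows per (Group, Size) combination;
# reset_index(name=...) turns the result back into a flat DataFrame
counts = df.groupby(["Group", "Size"]).size().reset_index(name="Time")
print(counts)
```

Unlike count(), size() includes NaN rows and needs no extra column to count.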
value_counts() based on two columns?
You can melt and groupby:
(df.melt(id_vars=['text', 'genre'], var_name='category', value_name='count')
.groupby(['genre', 'category'])
['count'].sum()
# below is for formatting only
.reset_index()
.query('count > 0')
 .assign(category=lambda d: d['category'].str[9:])
)
output:
        genre category  count
1     fiction  history      1
2     fiction   nature      2
4        news  history      1
5        news   nature      1
6  scientific      art      1
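The melt answer assumes one-hot `category_*` indicator columns alongside `text` and `genre` (the `.str[9:]` strips the 9-character `category_` prefix). A self-contained sketch with hypothetical data:

```python
import pandas as pd

# Hypothetical input: one-hot "category_*" columns per text (an assumption)
df = pd.DataFrame({
    "text":  ["t1", "t2", "t3", "t4", "t5"],
    "genre": ["fiction", "fiction", "news", "news", "scientific"],
    "category_history": [1, 0, 1, 0, 0],
    "category_nature":  [0, 1, 0, 1, 0],
    "category_art":     [0, 0, 0, 0, 1],
})

out = (df.melt(id_vars=["text", "genre"],          # unpivot the indicator columns
               var_name="category", value_name="count")
         .groupby(["genre", "category"])["count"].sum()
         .reset_index()
         .query("count > 0")                        # drop empty combinations
         .assign(category=lambda d: d["category"].str[9:]))  # strip "category_"
print(out)
```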
Group by two columns and count the occurrences of each combination in Pandas
Maybe this is what you want?
>>> data = pd.DataFrame({'user_id' : ['a1', 'a1', 'a1', 'a2','a2','a2','a3','a3','a3'], 'product_id' : ['p1','p1','p2','p1','p1','p1','p2','p2','p3']})
>>> count_series = data.groupby(['user_id', 'product_id']).size()
>>> count_series
user_id  product_id
a1       p1            2
         p2            1
a2       p1            3
a3       p2            2
         p3            1
dtype: int64
>>> new_df = count_series.to_frame(name='size').reset_index()
>>> new_df
  user_id product_id  size
0      a1         p1     2
1      a1         p2     1
2      a2         p1     3
3      a3         p2     2
4      a3         p3     1
>>> new_df['size']
0 2
1 1
2 3
3 2
4 1
Name: size, dtype: int64
Count values in one column based on the categories of other column
You can do this using a groupby on two columns:
results = df.groupby(by=['Age', 'LearnCode']).count()
This outputs a count for each ['Age', 'LearnCode'] pair.
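A runnable sketch with hypothetical survey-style data (the `Age` and `LearnCode` names come from the answer; the extra `Respondent` column is an assumption, since count() needs at least one non-grouped column to count):

```python
import pandas as pd

# Hypothetical survey-style data (an assumption)
df = pd.DataFrame({
    "Age":        ["18-24", "18-24", "25-34", "25-34", "25-34"],
    "LearnCode":  ["School", "Online", "Online", "Online", "Books"],
    "Respondent": [1, 2, 3, 4, 5],
})

# count() tallies non-null values in every remaining column per group
results = df.groupby(by=["Age", "LearnCode"]).count()
print(results)
```

Note that count() skips NaN values; if you simply want row counts per pair, size() is usually the better choice.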
Count number of each unique value in pandas column
Use crosstab:
pd.crosstab(df['group'], df['Sex'])
Sex     female  male
group
1            2     4
2            3     3
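A self-contained version, with rows reconstructed to match the crosstab output shown (an assumption):

```python
import pandas as pd

# Reconstructed data matching the crosstab output above (an assumption)
df = pd.DataFrame({
    "group": [1] * 6 + [2] * 6,
    "Sex":   ["female"] * 2 + ["male"] * 4 + ["female"] * 3 + ["male"] * 3,
})

# crosstab builds a frequency table: groups as rows, Sex values as columns
table = pd.crosstab(df["group"], df["Sex"])
print(table)
```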
Count the frequency that a value occurs in a dataframe column
Use value_counts() as @DSM commented.
In [37]:
df = pd.DataFrame({'a':list('abssbab')})
df['a'].value_counts()
Out[37]:
b    3
a    2
s    2
dtype: int64
Also groupby and count. Many ways to skin a cat here.
In [38]:
df.groupby('a').count()
Out[38]:
   a
a
a  2
b  3
s  2

[3 rows x 1 columns]
See the online docs.
If you wanted to add frequency back to the original dataframe use transform to return an aligned index:
In [41]:
df['freq'] = df.groupby('a')['a'].transform('count')
df
Out[41]:
   a  freq
0  a     2
1  b     3
2  s     2
3  s     2
4  b     3
5  a     2
6  b     3

[7 rows x 2 columns]
How to get value counts for multiple columns at once in Pandas DataFrame?
Just call apply and pass pd.Series.value_counts:
In [212]:
import numpy as np
df = pd.DataFrame(np.random.randint(0, 2, (10, 4)), columns=list('abcd'))
df.apply(pd.Series.value_counts)
Out[212]:
   a  b  c  d
0  4  6  4  3
1  6  4  6  7
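Because the data above is random, the printed counts vary between runs. A deterministic variant of the same pattern:

```python
import pandas as pd

# Fixed data instead of np.random, so the result is reproducible
df = pd.DataFrame({"a": [0, 1, 1], "b": [0, 0, 1], "c": [1, 1, 1]})

# One value_counts per column; the results are aligned on the value index
counts = df.apply(pd.Series.value_counts)
print(counts)
```

Columns that are missing a value get NaN in that row (here, column c never contains 0), which is why the result often has float dtype.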