Pandas Groupby Range of Values

You might be interested in pd.cut:

>>> df.groupby(pd.cut(df["B"], np.arange(0, 1.0 + 0.155, 0.155))).sum()
                      A         B
B
(0, 0.155]     2.775458  0.246394
(0.155, 0.31]  1.123989  0.471618
(0.31, 0.465]  2.051814  1.882763
(0.465, 0.62]  2.277960  1.528492
(0.62, 0.775]  1.577419  2.810723
(0.775, 0.93]  0.535100  1.694955
(0.93, 1.085]       NaN       NaN

[7 rows x 2 columns]
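The original df is not shown, so here is a self-contained sketch of the same pd.cut pattern with made-up data (the seed and column values are assumptions, not from the question); observed=False keeps empty bins in the result:

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the question's df.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": rng.random(20), "B": rng.random(20)})

# Bin column B into fixed-width intervals, then sum each bin.
bins = np.arange(0, 1.0 + 0.155, 0.155)
out = df.groupby(pd.cut(df["B"], bins), observed=False).sum()
print(out)
```

Because pd.cut returns a Categorical, bins with no rows (like the last interval above) still appear, filled with NaN.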

How to group by a list of value ranges in Python pandas

Use groupby + cut:

bins = [-1, 100, 200, np.inf]
labels = ['0-100', '100-200', 'more than 200']
df = df.groupby(pd.cut(df['value'], bins=bins, labels=labels)).size().reset_index(name='count')
print(df)
           value  count
0          0-100      2
1        100-200      3
2  more than 200      2
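A runnable version of the same idea, with hypothetical values chosen to reproduce the counts above (the data is invented; only the bins/labels technique is from the answer):

```python
import numpy as np
import pandas as pd

# Hypothetical data: two values under 100, three in 100-200, two above 200.
df = pd.DataFrame({"value": [50, 90, 150, 160, 170, 250, 300]})

bins = [-1, 100, 200, np.inf]
labels = ['0-100', '100-200', 'more than 200']
out = (df.groupby(pd.cut(df['value'], bins=bins, labels=labels), observed=False)
         .size()
         .reset_index(name='count'))
print(out)
```

Note the lower edge of -1: pd.cut intervals are open on the left by default, so a bin starting at 0 would otherwise exclude a value of exactly 0.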

Groupby range of numbers in Pandas and extract start and end values

IIUC you can use diff and cumsum to group, then check if the group has more than 1 element:

df["group"] = df["higher_count"].diff().ne(1).cumsum()
print (df.loc[df.groupby("group")["higher_count"].transform(len)>1]
.rename_axis("date")
.reset_index()
.groupby("group")[["date", "price"]].agg(["first", "last"]))

date price
first last first last
group
2 2020-03-19 01:00:00 2020-03-19 04:00:00 8 11
3 2020-03-19 05:00:00 2020-03-19 08:00:00 6 9
6 2020-03-19 11:00:00 2020-03-19 13:00:00 9 11
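Since the question's frame is not shown, here is a small invented frame that exercises the same diff/ne/cumsum trick (all the data below is an assumption; only the technique is from the answer). A new group starts whenever higher_count fails to increase by exactly 1:

```python
import pandas as pd

# Hypothetical frame: hourly timestamps, a running counter, and a price.
idx = pd.date_range("2020-03-19 00:00", periods=8, freq="h")
df = pd.DataFrame({"higher_count": [1, 1, 2, 3, 4, 1, 2, 3],
                   "price": [5, 8, 9, 10, 11, 6, 7, 9]}, index=idx)

# diff() != 1 marks the start of each consecutive run; cumsum() labels the runs.
df["group"] = df["higher_count"].diff().ne(1).cumsum()

# Keep only runs longer than one row, then grab each run's endpoints.
out = (df.loc[df.groupby("group")["higher_count"].transform(len) > 1]
         .rename_axis("date")
         .reset_index()
         .groupby("group")[["date", "price"]]
         .agg(["first", "last"]))
print(out)
```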

Pandas groupby range of values when range is unknown

Would something like this work? (You should modify the print call to write to a file.)

thresh = 10
s = df.groupby('range')['pos1'].diff().gt(thresh).cumsum()

for (r, g), d in df.groupby(['range', s])['pos1']:
    print(r, list(d))

Output:

range1 [1, 2, 3, 4]
range1 [100, 101, 102, 104, 107, 108]
range1 [207, 208, 209, 210]
range2 [10, 11, 12]
range2 [50, 51, 52, 54, 55]
range3 [50, 51, 52, 53]
range3 [107, 108, 109, 110, 111, 112, 113]
range3 [800, 802, 803, 804, 805]
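A self-contained sketch of the gap-threshold grouping, using invented data that reproduces the first three output lines above (the frame contents are assumptions; the diff/gt/cumsum pattern is the answer's):

```python
import pandas as pd

# Hypothetical frame: positions within named ranges, with large gaps
# marking where one run of positions ends and the next begins.
df = pd.DataFrame({
    "range": ["range1"] * 10 + ["range2"] * 3,
    "pos1":  [1, 2, 3, 4, 100, 101, 102, 104, 107, 108, 10, 11, 12],
})

thresh = 10
# Within each 'range', start a new run whenever the gap between
# consecutive pos1 values exceeds the threshold.
s = df.groupby('range')['pos1'].diff().gt(thresh).cumsum()

for (r, g), d in df.groupby(['range', s])['pos1']:
    print(r, list(d))
```

Grouping by the list ['range', s] mixes a column label with an external Series, which pandas aligns on the index.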

How to group by a range of values in pandas?

I am guessing the OP wants to group by categorical variables, then by a numeric variable binned into intervals. In that case you can use np.digitize().

smallest = np.min(df['strike'])
largest = np.max(df['strike'])
num_edges = 3
# np.digitize(input_array, bin_edges)
ind = np.digitize(df['strike'], np.linspace(smallest, largest, num_edges))

then ind should be

array([1, 1, 2, 2, 2, 2, 3], dtype=int64)

which corresponds to binning

 [10, 10, 12, 13, 12, 13, 14]

with bin edges

array([ 10.,  12.,  14.]) # == np.linspace(smallest, largest, num_edges)

Finally, group by all the columns you want, plus this additional bin column:

df['binned_strike'] = ind
for grp in df.groupby(['symbol', 'serie', 'binned_strike']):
    print("group key")
    print(grp[0])
    print("group content")
    print(grp[1])
    print("=============")

This should print

group key
('IP', 'A', 1)
group content
   last  price serie  strike symbol  type  binned_strike
0   1.0     11     A      10     IP  call              1
=============
group key
('IP', 'A', 2)
group content
   last  price serie  strike symbol  type  binned_strike
2   2.5     11     A      12     IP   put              2
4   4.5     11     A      12     IP  call              2
=============
group key
('IP', 'B', 1)
group content
   last  price serie  strike symbol  type  binned_strike
1   2.0     11     B      10     IP   put              1
=============
group key
('IP', 'B', 2)
group content
   last  price serie  strike symbol  type  binned_strike
3   3.0     11     B      13     IP   put              2
5   5.0     11     B      13     IP   put              2
=============
group key
('IP', 'B', 3)
group content
   last  price serie  strike symbol  type  binned_strike
6   6.0     11     B      14     IP  call              3
=============
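Putting the pieces together, here is a runnable sketch with data reconstructed from the printed groups above (the exact frame is an assumption inferred from the output; the np.digitize binning is the answer's technique):

```python
import numpy as np
import pandas as pd

# Hypothetical option data matching the groups printed above.
df = pd.DataFrame({
    "last":   [1.0, 2.0, 2.5, 3.0, 4.5, 5.0, 6.0],
    "price":  [11] * 7,
    "serie":  list("ABABABB"),
    "strike": [10, 10, 12, 13, 12, 13, 14],
    "symbol": ["IP"] * 7,
    "type":   ["call", "put", "put", "put", "call", "put", "call"],
})

# Equal-width bin edges over the strike range, then assign each strike a bin.
edges = np.linspace(df["strike"].min(), df["strike"].max(), 3)
df["binned_strike"] = np.digitize(df["strike"], edges)

for key, grp in df.groupby(["symbol", "serie", "binned_strike"]):
    print("group key", key)
    print(grp)
```

np.digitize returns 1-based bin indices here because every strike is at or above the first edge; values equal to the last edge land in an extra overflow bin (index 3).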

Pandas: Group by column and count values in range in another column and add that count to a new column

You could create a binning column indicating whether each position is between 0 and 10, then use a pivot_table with aggfunc set to count:

df['threshold'] = np.where(df['position'].between(0,10),'within 10','outside of 10')
df.pivot_table(index='page', columns='threshold', values='position', aggfunc='count',fill_value=0)

prints:

threshold  outside of 10  within 10
page
url/1                  0          1
url/2                  2          1
url/3                  0          1
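A self-contained version with invented data chosen to reproduce the pivot above (the page/position values are assumptions; the np.where + pivot_table pattern is the answer's):

```python
import numpy as np
import pandas as pd

# Hypothetical data: one position per row for each page URL.
df = pd.DataFrame({
    "page":     ["url/1", "url/2", "url/2", "url/2", "url/3"],
    "position": [3, 5, 12, 30, 7],
})

# Label each row by whether its position falls in [0, 10].
df['threshold'] = np.where(df['position'].between(0, 10),
                           'within 10', 'outside of 10')

# Count rows per page and threshold label; fill_value=0 covers empty cells.
out = df.pivot_table(index='page', columns='threshold',
                     values='position', aggfunc='count', fill_value=0)
print(out)
```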

