Pandas equivalent of Oracle Lead/Lag function
You can group by the partition column and call shift on the grouped column:
In [15]: df['Data_lagged'] = df.groupby(['Group'])['Data'].shift(1)
In [16]: df
Out[16]:
               Date  Group  Data  Data_lagged
2014-05-14 09:10:00      A     1          NaN
2014-05-14 09:20:00      A     2            1
2014-05-14 09:30:00      A     3            2
2014-05-14 09:40:00      A     4            3
2014-05-14 09:50:00      A     5            4
2014-05-14 10:00:00      B     1          NaN
2014-05-14 10:10:00      B     2            1
2014-05-14 10:20:00      B     3            2
2014-05-14 10:30:00      B     4            3
[9 rows x 4 columns]
To obtain the ORDER BY Date ASC effect, you must sort the DataFrame first:
df['Data_lagged'] = (df.sort_values(by=['Date'], ascending=True)
                       .groupby(['Group'])['Data'].shift(1))
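As a minimal sketch of the sort-then-shift approach above, the following builds a small, deliberately unsorted frame and derives both a lag (shift(1)) and a lead (shift(-1)) per group. The sample values and the column names Data_lag and Data_lead are made up for illustration; only sort_values, groupby and shift come from the answer.

```python
import pandas as pd

# Hypothetical sample data, deliberately out of date order.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2014-05-14 09:20", "2014-05-14 09:10", "2014-05-14 09:30",
        "2014-05-14 10:10", "2014-05-14 10:00",
    ]),
    "Group": ["A", "A", "A", "B", "B"],
    "Data": [2, 1, 3, 2, 1],
})

# Sort first so that "previous" and "next" mean previous/next by Date.
ordered = df.sort_values("Date")
by_group = ordered.groupby("Group")["Data"]

# shift keeps the original index labels, so assigning back to the
# unsorted frame aligns each value with the right row.
df["Data_lag"] = by_group.shift(1)    # like LAG(Data) OVER (PARTITION BY Group ORDER BY Date)
df["Data_lead"] = by_group.shift(-1)  # like LEAD(Data) OVER (PARTITION BY Group ORDER BY Date)

result = df.sort_values(["Group", "Date"]).reset_index(drop=True)
print(result)
```

The first row of each group has no predecessor and the last has no successor, so those cells come back as NaN, which also turns the integer column into floats.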
Groupby and lag all columns of a dataframe?
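IIUC, you can simply group on the index level with level="grp" and then shift by -1. Note that shift(-1) pulls the next row of each group up (a lead); use shift(1) if you want the previous row (a lag):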
>>> shifted = df.groupby(level="grp").shift(-1)
>>> df.join(shifted.rename(columns=lambda x: x+"_lag"))
                col1 col2  col1_lag col2_lag
time       grp
2015-11-20 A       1    a         2        b
2015-11-21 A       2    b         3        c
2015-11-22 A       3    c       NaN      NaN
2015-11-23 B       1    a         2        b
2015-11-24 B       2    b         3        c
2015-11-25 B       3    c       NaN      NaN
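For a self-contained version, here is a sketch that rebuilds a frame shaped like the output above and attaches both directions of shift to every column. The index construction and the "_lead" suffix are assumptions made for the example; the answer itself only shows shift(-1) with a "_lag" suffix.

```python
import pandas as pd

# Rebuild a frame with a (time, grp) MultiIndex like the one shown above.
index = pd.MultiIndex.from_arrays(
    [
        pd.to_datetime([
            "2015-11-20", "2015-11-21", "2015-11-22",
            "2015-11-23", "2015-11-24", "2015-11-25",
        ]),
        ["A", "A", "A", "B", "B", "B"],
    ],
    names=["time", "grp"],
)
df = pd.DataFrame(
    {"col1": [1, 2, 3, 1, 2, 3], "col2": list("abcabc")},
    index=index,
)

grouped = df.groupby(level="grp")

# shift(1) brings the previous row of the group down (lag);
# shift(-1) brings the next row of the group up (lead).
lagged = grouped.shift(1).add_suffix("_lag")
led = grouped.shift(-1).add_suffix("_lead")

# All three frames share the same index, so a column-wise concat lines up.
wide = pd.concat([df, lagged, led], axis=1)
print(wide)
```

Because the shift is applied to the whole grouped frame, every column gets a shifted copy without naming the columns one by one.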
Is there a similar pandas/numpy function to group_by lead/lag in dplyr with ifelse statements?
Using transform + idxmax:
cno = example_data['contract_no']
ob = example_data['outstanding_balance']
md = example_data['maturity_date']
drc = example_data['date_report_created']
i = ob.eq(0).groupby(cno).transform('idxmax')
j = md.eq(drc).groupby(cno).transform('idxmax')
i.eq(j).astype('int8')
0 1
1 1
2 0
3 0
4 0
5 0
dtype: int8
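The snippet above depends on an example_data frame that is not shown here. The sketch below uses invented data with the same four column names to illustrate the idea: for each contract, find the first row where the balance hits zero and the first row where the report date equals the maturity date, then flag contracts where both are the same row. The values, and the cast to int before idxmax, are assumptions made for this example.

```python
import pandas as pd

# Hypothetical data; only the column names follow the answer above.
example_data = pd.DataFrame({
    "contract_no": [1, 1, 1, 2, 2, 2],
    "outstanding_balance": [100, 0, 0, 50, 0, 0],
    "maturity_date": pd.to_datetime(["2020-01-31"] * 3 + ["2020-03-31"] * 3),
    "date_report_created": pd.to_datetime([
        "2019-12-31", "2020-01-31", "2020-02-29",
        "2020-01-31", "2020-02-29", "2020-03-31",
    ]),
})

cno = example_data["contract_no"]
paid_off = example_data["outstanding_balance"].eq(0).astype(int)
at_maturity = (
    example_data["maturity_date"].eq(example_data["date_report_created"]).astype(int)
)

# idxmax returns the label of the first maximum, i.e. the first row
# where the condition holds; transform broadcasts it to the whole group.
i = paid_off.groupby(cno).transform("idxmax")
j = at_maturity.groupby(cno).transform("idxmax")

# 1 where the balance first reached zero on the maturity row, else 0.
flag = i.eq(j).astype("int8")
print(flag)
```

One caveat: if a condition is never true within a group, idxmax still returns that group's first row label, so two conditions that both never occur would compare equal. A group-level any() check would be needed to rule that case out.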
Is there an equivalent of SQL GROUP BY ROLLUP in Python pandas?
Refer to the answer to the question "Pandas Pivot tables row subtotals".
It uses pivot_table() with margins=True, which adds an "All" row and an "All" column holding the totals.
The pivot table is then reshaped back into long form with stack().
Not as slick as GROUP BY ROLLUP, but it works.
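A rough sketch of that approach, using a made-up sales frame (the column names region, product and amount, and the "Total" label, are assumptions for illustration, not taken from the linked answer):

```python
import pandas as pd

# Hypothetical data for illustration.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "product": ["x", "y", "x", "y"],
    "amount": [10, 20, 30, 40],
})

# margins=True appends a totals row and a totals column;
# margins_name renames them from the default "All".
table = sales.pivot_table(
    index="region",
    columns="product",
    values="amount",
    aggfunc="sum",
    margins=True,
    margins_name="Total",
)
print(table)

# stack() folds the columns back into the index, giving one row per
# (region, product) pair, with the subtotal and grand-total rows included.
long = table.stack().rename("amount").reset_index()
print(long)
```

Unlike ROLLUP, this produces subtotals along both dimensions (closer to CUBE for two keys), so rows you do not want can be filtered out of the long form afterwards.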