Recursive definitions in Pandas
As I noted in a comment, you can use scipy.signal.lfilter. In this case (assuming A is a one-dimensional numpy array), all you need is:
B = lfilter([a], [1.0, -b], A)
Here's a complete script:
import numpy as np
from scipy.signal import lfilter
np.random.seed(123)
A = np.random.randn(10)
a = 2.0
b = 3.0
# Compute the recursion using lfilter.
# [a] and [1, -b] are the coefficients of the numerator and
# denominator, resp., of the filter's transfer function.
B = lfilter([a], [1, -b], A)
print(B)
# Compare to a simple loop.
B2 = np.empty(len(A))
for k in range(len(B2)):
    if k == 0:
        B2[k] = a*A[k]
    else:
        B2[k] = a*A[k] + b*B2[k-1]
print(B2)
print("max difference:", np.max(np.abs(B2 - B)))
The output of the script is:
[ -2.17126121e+00 -4.51909273e+00 -1.29913212e+01 -4.19865530e+01
-1.27116859e+02 -3.78047705e+02 -1.13899647e+03 -3.41784725e+03
-1.02510099e+04 -3.07547631e+04]
[ -2.17126121e+00 -4.51909273e+00 -1.29913212e+01 -4.19865530e+01
-1.27116859e+02 -3.78047705e+02 -1.13899647e+03 -3.41784725e+03
-1.02510099e+04 -3.07547631e+04]
max difference: 0.0
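The same coefficient rule extends to recursions that look further back. The following is a minimal sketch, assuming a hypothetical second-order recursion B[k] = a*A[k] + b*B[k-1] + c*B[k-2] with B[k] == 0 for k < 0; the coefficient values and the name B_loop are illustrative and not part of the original answer.

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
A = rng.standard_normal(10)
a, b, c = 2.0, 0.5, 0.25  # illustrative coefficients (assumption)

# Rewrite the recursion as B[k] - b*B[k-1] - c*B[k-2] = a*A[k]:
# the numerator is [a] and the denominator is [1, -b, -c].
B = lfilter([a], [1.0, -b, -c], A)

# Reference loop, with B[k] taken as 0 for k < 0.
B_loop = np.zeros(len(A))
for k in range(len(A)):
    B_loop[k] = a * A[k]
    if k >= 1:
        B_loop[k] += b * B_loop[k - 1]
    if k >= 2:
        B_loop[k] += c * B_loop[k - 2]

print(np.allclose(B, B_loop))
```

Each extra lag of B adds one more entry to the denominator, with the sign flipped because the term moves to the left-hand side.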
Another example, in IPython, using a pandas DataFrame instead of a numpy array:
If you have
In [12]: df = pd.DataFrame([1, 7, 9, 5], columns=['A'])
In [13]: df
Out[13]:
   A
0  1
1  7
2  9
3  5
and you want to create a new column, B, such that B[k] = A[k] + 2*B[k-1] (with B[k] == 0 for k < 0), you can write
In [14]: df['B'] = lfilter([1], [1, -2], df['A'].astype(float))
In [15]: df
Out[15]:
   A   B
0  1   1
1  7   9
2  9  27
3  5  59
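If the recursion should start from a nonzero previous value instead of B[-1] == 0, lfilter's zi argument can carry that state. This is a sketch under the assumption that B[-1] is known; the name B_prev and its value are hypothetical. For this first-order filter, the single state entry equals 2*B[-1], which follows from the filter-state equations given in the lfilter documentation.

```python
import numpy as np
import pandas as pd
from scipy.signal import lfilter

df = pd.DataFrame([1, 7, 9, 5], columns=["A"])
B_prev = 10.0  # assumed value of B[-1] (hypothetical, not from the original)

# B[k] = A[k] + 2*B[k-1]; the initial filter state is 2 * B[-1].
x = df["A"].to_numpy(dtype=float)
B, _ = lfilter([1.0], [1.0, -2.0], x, zi=np.array([2.0 * B_prev]))
df["B"] = B

# Reference loop starting from the same previous value.
expected = []
prev = B_prev
for value in x:
    prev = value + 2.0 * prev
    expected.append(prev)

print(df)
print(np.allclose(df["B"], expected))
```

When zi is passed, lfilter returns a tuple of the output and the final state, which is why the result is unpacked.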
Define recursive function in Pandas dataframe
You could try something like this.
import pandas as pd
import numpy as np
df = pd.DataFrame({'date': [1, 2, 3, 4, 5, 6],
                   'col_1': [951, 909, 867, 844, 824, 826],
                   'col_2': [179, 170, 164, 159, 153, 149]})
# Multiply today's col_1 by the previous day's col_2. The first row has no
# previous day, so it gets NaN.
col_2_update_list = []
for i, row in df.iterrows():
    if i != 0:
        today_col_1 = df.at[i, 'col_1']
        prev_day_col_2 = df.at[i-1, 'col_2']
        new_col_2_val = prev_day_col_2 * today_col_1
        col_2_update_list.append(new_col_2_val)
    else:
        col_2_update_list.append(np.nan)
df['updated_col_2'] = col_2_update_list
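Because the loop above only reads the original col_2 values (never the updated ones), the same column can probably be built without iterating. A minimal sketch using Series.shift, with the same data as above:

```python
import pandas as pd

df = pd.DataFrame({'date': [1, 2, 3, 4, 5, 6],
                   'col_1': [951, 909, 867, 844, 824, 826],
                   'col_2': [179, 170, 164, 159, 153, 149]})

# shift(1) moves col_2 down one row, so each row sees the previous day's
# value; the first row becomes NaN because it has no predecessor.
df['updated_col_2'] = df['col_1'] * df['col_2'].shift(1)
print(df)
```

This shortcut does not apply if each new value has to depend on the previously updated value; that case needs a loop or a filter-style approach.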
Recursive Dictionary for Pandas Dataframe
Try:
df.groupby([0,1]).agg(list).to_dict('index')
{('a', 'a'): {'index': [0, 1], '2': [0.2, 0.4]},
('a', 'b'): {'index': [0, 1], '2': [0.4, 0.7]}}
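The input frame is not shown above, so here is a self-contained sketch with a hypothetical DataFrame whose layout (integer-labelled key columns 0 and 1, plus columns 'index' and '2') is an assumption chosen to match the output:

```python
import pandas as pd

# Hypothetical input; the column labels and values are assumptions.
df = pd.DataFrame({
    0: ['a', 'a', 'a', 'a'],
    1: ['a', 'a', 'b', 'b'],
    'index': [0, 1, 0, 1],
    '2': [0.2, 0.4, 0.4, 0.7],
})

# Group on the two key columns, collect the remaining columns into lists,
# then turn each group into a nested dictionary keyed by the group tuple.
result = df.groupby([0, 1]).agg(list).to_dict(orient='index')
print(result)
```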
Pandas - Recursively look for children in dataframe
If you only want to print an indented graph, you could use a simple recursive function:
def desc(i, indent=0):
    print(' '*indent + i)
    for j in df.loc[df['id2'] == i, 'id1']:
        desc(j, indent + 2)

for i in ('111', '222'):
    desc(i)
With the example df, it gives:
111
aaa
ccc
333
222
bbb
zzz
999
888
ddd
eee
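The example df is not reproduced here, so the following sketch uses a small hypothetical edge table in which each row says that id1 is a child of id2. It is a variant of the function above that returns the indented lines instead of printing them (the name desc_lines is made up for illustration), and it assumes the data contains no cycles.

```python
import pandas as pd

# Hypothetical data: 'aaa' and 'bbb' are children of '111', 'ccc' of 'aaa'.
df = pd.DataFrame({
    'id1': ['aaa', 'bbb', 'ccc'],
    'id2': ['111', '111', 'aaa'],
})

def desc_lines(frame, node, indent=0):
    """Return the indented lines for node and all of its descendants."""
    lines = [' ' * indent + node]
    for child in frame.loc[frame['id2'] == node, 'id1']:
        lines.extend(desc_lines(frame, child, indent + 2))
    return lines

tree = desc_lines(df, '111')
print('\n'.join(tree))
```

Returning a list keeps the traversal separate from the printing, which can make the result easier to reuse or test.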
Recursive loop over pandas dataframe
Here's how I would approach this (explanations in the comments):
# Replace NaN in df["Employee Number"] with empty string
df["Employee Number"] = df["Employee Number"].fillna("")
# Add a column with sets that contain the individual employee numbers
df["EN_Sets"] = df["Employee Number"].str.findall(r"\d+").apply(set)
# Build the maximal distinct employee number sets
en_sets = []
for en_set in df.EN_Sets:
    union_sets = []
    keep_sets = []
    for s in en_sets:
        if s.isdisjoint(en_set):
            keep_sets.append(s)
        else:
            union_sets.append(s)
    en_sets = keep_sets + [en_set.union(*union_sets)]
# Build a dictionary with the replacement strings as keys and the distinct
# sets as values
en_sets = {", ".join(sorted(s)): s for s in en_sets}
# Apply-function to replace the original employee number strings
def setting_en_numbers(s):
    for en_set_str, en_set in en_sets.items():
        if not s.isdisjoint(en_set):
            return en_set_str
# Apply the function to df["Employee Number"]
df["Employee Number"] = df.EN_Sets.apply(setting_en_numbers)
df = df[["Company", "Employee Number"]]
Result for df:
   Company Employee Number
0        1              12
1        2          34, 12
2        3      56, 34, 78
3        4              90
4        5             NaN
is
   Company Employee Number
0        1  12, 34, 56, 78
1        2  12, 34, 56, 78
2        3  12, 34, 56, 78
3        4              90
4        5
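The set-merging step is the core of this approach, so it may help to see it in isolation. This is a sketch on plain Python sets; the function name merge_overlapping is hypothetical, and the logic mirrors the keep/union loop above.

```python
def merge_overlapping(sets):
    """Merge sets that share at least one element, directly or via a chain."""
    merged = []
    for current in sets:
        # Groups untouched by the current set are kept as they are.
        keep = [s for s in merged if s.isdisjoint(current)]
        # Groups that overlap with it are fused into one new group.
        overlap = [s for s in merged if not s.isdisjoint(current)]
        merged = keep + [set(current).union(*overlap)]
    return merged

groups = merge_overlapping([{'12'}, {'34', '12'}, {'56', '34', '78'}, {'90'}])
print(groups)

# A later set can bridge two groups that were separate until then.
bridged = merge_overlapping([{1, 2}, {3, 4}, {2, 3}])
print(bridged)
```

As in the original loop, an empty input set overlaps with nothing and ends up as its own empty group.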
Recursive Operation in Pandas
Check out networkx: you need a directed graph and the paths from the 'root' to each 'leaf'.
import networkx as nx
G = nx.from_pandas_edgelist(df, source='operator', target='nextval', edge_attr=None, create_using=nx.DiGraph())
road = []
for n in G:
    if G.out_degree(n) == 0:  # leaf
        road.append(nx.shortest_path(G, 1, n))
road
Out[82]: [[1, 2, 4], [1, 3, 5, 6]]
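Update: to collect every simple path from the root to each leaf, rather than only a shortest one, use nx.all_simple_paths: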
import networkx as nx
G = nx.from_pandas_edgelist(df, source='operator', target='nextval', edge_attr=None, create_using=nx.DiGraph())
road = []
for n in G:
    if G.out_degree(n) == 0:  # leaf
        road.append(list(nx.all_simple_paths(G, 1, n)))
road
Out[509]: [[[1, 3, 5, 6], [1, 6]], [[1, 2, 4]]]
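The snippets above hard-code node 1 as the root and rely on a df that is not shown. Here is a self-contained sketch with a hypothetical edge list, chosen to reproduce the first output, that finds the roots and leaves from the node degrees instead:

```python
import networkx as nx
import pandas as pd

# Hypothetical edge list; each row is an edge operator -> nextval.
df = pd.DataFrame({'operator': [1, 2, 1, 3, 5],
                   'nextval': [2, 4, 3, 5, 6]})

G = nx.from_pandas_edgelist(df, source='operator', target='nextval',
                            create_using=nx.DiGraph())

# Roots have no incoming edges; leaves have no outgoing edges.
roots = [n for n in G if G.in_degree(n) == 0]
leaves = [n for n in G if G.out_degree(n) == 0]

# Collect every simple path from each root to each leaf.
paths = [path
         for root in roots
         for leaf in leaves
         for path in nx.all_simple_paths(G, root, leaf)]
print(sorted(paths))
```

Detecting the roots this way also covers data with more than one starting node, at the cost of trying every root and leaf pair.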