How to Use and Print the Pandas Dataframe Name

How to print dataframe name in title of a plot?

I found a nice function here (Get the name of a pandas DataFrame):

def get_df_name(df):
    # first global variable name that is bound to this exact object
    name = [x for x in globals() if globals()[x] is df][0]
    return name

You can then use it to set the plot title:

import seaborn as sns

def plot_dist(df, col):
    ax = sns.countplot(x=col, data=df)
    ax.set_title(get_df_name(df))  # variable name becomes the plot title
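Note that the `globals()` lookup only sees names in the module where `get_df_name` is defined, and it returns whichever matching name comes first when several variables point to the same dataframe. A minimal sketch of a more explicit variant is shown here; `find_df_names` and its `namespace` parameter are hypothetical names introduced for illustration, not part of pandas.

```python
import pandas as pd

def find_df_names(df, namespace):
    """Return every name in `namespace` that is bound to this exact object."""
    return [name for name, value in namespace.items() if value is df]

sales = pd.DataFrame({"region": ["north", "south"]})
alias = sales  # a second name for the same object

# An explicit mapping stands in for globals() so the lookup is predictable.
candidates = {"sales": sales, "alias": alias, "other": pd.DataFrame()}
names = find_df_names(sales, candidates)
print(names)  # ['sales', 'alias']
```

In a script you could pass `globals()` as the namespace, which reproduces the original behaviour while making the ambiguity visible.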

How can I get the name of a DataFrame in Python?

Since Python lets you assign arbitrary attributes to objects, you can give your dataframe an attribute called name that holds its name:

import pandas as pd 
df = pd.DataFrame()
df.name = 'My Data Frame'
print(df.name) # My Data Frame
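A plain attribute like this is ordinary Python state on one object, so it may not carry over to dataframes derived from it, and it would clash with a column that happens to be called name. As a hedged alternative, pandas also provides the `DataFrame.attrs` dictionary for metadata (documented as experimental). The sketch below assumes a pandas version that has `attrs`; the key `"name"` is just a choice made here.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})
df.attrs["name"] = "My Data Frame"  # "name" is an arbitrary dictionary key

df_copy = df.copy()  # attrs are carried over to the copy
print(df.attrs["name"])
print(df_copy.attrs.get("name"))
```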

In your case, after you define a name attribute for dataAllR:

dataAllR.name = 'dataAllR'

You would use:

exportPath = exportPath + '\\final_py_upload' + data.name + '.csv'

Or, more readably, with an f-string:

exportPath = f'{exportPath}\\final_py_upload{data.name}.csv'
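Building the path by string concatenation works, but the separator is easy to get wrong. Here is a minimal sketch using the standard-library `pathlib` instead. The helper name `build_export_path` and the directory `C:\exports` are assumptions made for illustration, not taken from the question. `PureWindowsPath` is used so that the result is the same on any operating system; code that actually writes files would normally use `pathlib.Path`.

```python
from pathlib import PureWindowsPath

def build_export_path(export_dir, df_name):
    """Join the directory with a file name derived from the dataframe name."""
    return PureWindowsPath(export_dir) / f"final_py_upload{df_name}.csv"

# "C:\\exports" and "dataAllR" are example values only.
export_path = build_export_path("C:\\exports", "dataAllR")
print(export_path)
```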

How to extract the name of a dataframe from a list and print it as a heading to the output?

Dataframes do not store their own name; you need to provide it yourself, for example by iterating over the dfs together with a list of names:

for i, name in zip(list1, names_list):
    print(name)  # heading for dataframe i

Alternatively, you can gather all the dfs in a dictionary:

all_dfs = {name: df for name, df in zip(names_list, list1)}
# then iterate
for name, df in all_dfs.items():
    print(name)  # heading for this dataframe

Often, if several dfs have the same columns and their indexes have the same levels, it is a good idea to keep them as a single df and use df.groupby to iterate over the grouped rows. Since your dfs seem to have come from a groupby anyway, you could skip the step where you presumably split them into several dfs and iterate straight over the groupby, if that is your case.

# if dfs come from different sources
# concatenate them into a single df
df_main = pd.concat(
    list1,                 # collection of dfs to be concatenated
    keys=names_list,       # names for dfs, will be appended as the outermost index level
    names=['name_of_df'],  # name for the level that will be appended
)

# iterate over a groupby object
for name_of_df, df_sub in df_main.groupby('name_of_df'):
    # name_of_df: string provided in `names_list`
    # df_sub: filtered df
    print(name_of_df)
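To make the concatenate-then-group approach concrete, here is a small self-contained sketch. The two sample dataframes and the names "first" and "second" are invented for illustration. It groups on the index level by passing `level=`, and uses `droplevel` only to print each sub-frame without the extra index level.

```python
import pandas as pd

# Example data only; in the question these come from the asker's own code.
list1 = [pd.DataFrame({"x": [1, 2]}), pd.DataFrame({"x": [3]})]
names_list = ["first", "second"]

# The keys become the outermost index level, named "name_of_df".
df_main = pd.concat(list1, keys=names_list, names=["name_of_df"])

row_counts = {}
for name_of_df, df_sub in df_main.groupby(level="name_of_df"):
    print(name_of_df)                       # heading
    print(df_sub.droplevel("name_of_df"))   # rows that came from that df
    row_counts[name_of_df] = len(df_sub)
```

By default groupby yields the groups in sorted key order, which may differ from the order of `names_list`.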

EDIT:

Please understand that the code blocks above are mutually exclusive, i.e. pick one solution and stick with it. Your comment tries to combine the different solutions and, on top of that, uses all_dfs.items(), which is not necessary for this case.

If you choose the first option, then:

for i, name in zip(list1, names_list):
    print(name)
    needed_cols = i.columns  # from your code
    # the rest of your code inside the loop
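A runnable sketch of this first option, with invented sample data in place of the asker's `list1` and `names_list`, might look like this. The `headings` list is a hypothetical addition that only serves to record what was printed.

```python
import pandas as pd

# Example data only; the real list1 and names_list come from the question.
list1 = [pd.DataFrame({"x": [1], "y": [2]}), pd.DataFrame({"z": [3]})]
names_list = ["first", "second"]

headings = []
for df, name in zip(list1, names_list):
    print(name)                # name printed as a heading
    needed_cols = df.columns   # per-dataframe work goes here
    print(list(needed_cols))
    headings.append((name, list(needed_cols)))
```

Keep in mind that `zip` stops at the shorter input, so a missing name silently drops the last dataframe.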


