Standard Deviation on Dataframe Does Not Work

Dataframe Standard Deviation issue due to a single column of text

Try:

df.iloc[:, :-1].std()

In plain English: use all rows and every column except the last one.

If you want the standard deviation per row, then you will need:

df.iloc[:, :-1].std(axis=1)
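As a minimal sketch of the two calls above, assuming a made-up frame whose last column holds text (the column names and values are illustrative only):

```python
import pandas as pd

# Hypothetical data: two numeric columns followed by one text column.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [2.0, 4.0, 6.0],
    "label": ["x", "y", "z"],
})

numeric = df.iloc[:, :-1]        # all rows, every column except the last

col_std = numeric.std()          # one value per column (sample std, ddof=1)
row_std = numeric.std(axis=1)    # one value per row

# Alternative that selects by dtype instead of by column position.
col_std_by_dtype = df.select_dtypes(include="number").std()

print(col_std)
print(row_std)
```

Selecting by dtype may be more robust if the text column is not guaranteed to be last.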

Value error when calculating standard deviation on dataframe

Why is it not working?

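Because axis=1 computes the standard deviation across columns (one value per row), and that axis only exists on a DataFrame. df_stats.distance is a Series, which has a single axis (axis=0), so passing axis=1 raises a ValueError.

If you take the std of a single column, the output is a scalar: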

print(df_stats.distance.std())

df_stats['std'] = df_stats.distance.std()

If you need to process multiple columns, then axis=1 computes the std per row across those columns:

df_stats['std'] = df_stats[['distance','a1/a2','mean_distance']].std(axis=1)
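The difference between the Series and DataFrame cases can be sketched as follows. The values are invented for illustration; only the column names come from the question:

```python
import pandas as pd

# Hypothetical values; column names follow the question.
df_stats = pd.DataFrame({
    "distance": [1.0, 2.0, 3.0, 4.0],
    "a1/a2": [2.0, 2.0, 2.0, 2.0],
    "mean_distance": [3.0, 2.0, 1.0, 0.0],
})

# A Series has only axis 0, so axis=1 is rejected.
try:
    df_stats["distance"].std(axis=1)
    raised = False
except ValueError:
    raised = True

# Std of one column is a single scalar ...
scalar_std = df_stats["distance"].std()

# ... while axis=1 on several columns gives one value per row.
row_std = df_stats[["distance", "a1/a2", "mean_distance"]].std(axis=1)

print(raised, scalar_std)
print(row_std)
```

Assigning the scalar to a new column broadcasts the same number to every row, which is why the first variant in the answer yields a constant column.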

If you need the std per time period, e.g. per day (this requires a DatetimeIndex):

df_stats['std'] = df_stats.groupby(pd.Grouper(freq='d')).distance.transform('std')
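A small sketch of the per-day variant, assuming the frame is indexed by timestamps (the dates and distances below are made up). It uses 'D', the documented daily frequency alias:

```python
import pandas as pd

# Hypothetical timestamps spanning two calendar days.
idx = pd.to_datetime([
    "2021-01-01 08:00", "2021-01-01 12:00", "2021-01-01 18:00",
    "2021-01-02 09:00", "2021-01-02 15:00",
])
df_stats = pd.DataFrame({"distance": [1.0, 2.0, 3.0, 10.0, 14.0]}, index=idx)

# transform('std') returns a result aligned with the original rows,
# so every row receives the std of its own day.
df_stats["std"] = (
    df_stats.groupby(pd.Grouper(freq="D"))["distance"].transform("std")
)

print(df_stats)
```

Unlike an aggregation with .std(), transform keeps the original number of rows, so the result can be assigned straight back as a column.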

Standard deviation of dataframe?

I think there is a misunderstanding of the docs.

What pandas is deprecating is specifically the level parameter, in favor of its groupby counterpart (the link you shared). Nowhere does it say that pandas.Series.std is deprecated as a whole:

level: int or level name, default None
If the axis is a MultiIndex (hierarchical), count along a particular level, collapsing into a scalar.
Deprecated since version 1.3.0: The level keyword is deprecated. Use groupby instead.

and:

numeric_only: bool, default None
Include only float, int, boolean columns. If None, will attempt to use everything, then use only numeric data. Not implemented for Series.
Deprecated since version 1.5.0: Specifying numeric_only=None is deprecated. The default value will be False in a future version of pandas.

Given the line of code you propose, I see no reason for changing it. Keep using:

df['col'].std()
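For code that did rely on the level keyword, the groupby replacement mentioned in the deprecation note might look like the sketch below. The MultiIndex, its level names and the values are hypothetical:

```python
import pandas as pd

# Hypothetical two-level index; "group" and "item" are made-up level names.
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1), ("b", 2)], names=["group", "item"]
)
s = pd.Series([1.0, 3.0, 10.0, 14.0], index=index, name="col")

# Plain std over the whole Series: unaffected by the deprecation.
overall = s.std()

# Per-level std via groupby, instead of the deprecated level keyword.
per_group = s.groupby(level="group").std()

print(overall)
print(per_group)
```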

Python dataframe: Standard deviation of last one year of data

It looks like you are trying to calculate a rolling standard deviation, with the rolling window consisting of the previous 252 rows.

Pandas rolling windows (.rolling()) offer many aggregation methods, including one for standard deviation:

df['Daily_SD'] = df['Interday_Close_change'].rolling(252).std().shift()

If there are fewer than 252 rows available from which to calculate the standard deviation, the result for that row will be a null value (NaN). Think about whether you really want to apply the .fillna('') method to fill null values, as you are doing. That will convert the entire column from a numeric (float) data type to the object data type.
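The dtype change can be checked with a short sketch. It assumes a tiny made-up series and a window of 3 instead of 252 so the effect is visible:

```python
import pandas as pd

# Hypothetical values; window of 3 stands in for 252.
s = pd.Series([1.0, 2.0, 4.0, 7.0]).rolling(3).std()

# Filling the leading NaNs with an empty string forces a mixed-type column.
filled = s.fillna("")

print(s.dtype)       # numeric before filling
print(filled.dtype)  # no longer numeric after filling with ''
```

Leaving the NaN values in place keeps the column numeric, which is usually what later calculations expect.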

Without the .shift() method, the current row's value will be included in calculations. The .shift() method will shift all rolling standard deviation values down by 1 row, so the current row's result will be the standard deviation of the previous 252 rows, as you want.

With pandas version >= 1.2, you can use this instead:

df['Daily_SD'] = df['Interday_Close_change'].rolling(252, closed='left').std()

The closed='left' parameter excludes the last point in the window (the current row) from the calculation.
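To see that the two approaches agree, here is a small sketch that uses a window of 3 in place of 252 and invented values (only the column name comes from the question). It assumes pandas >= 1.2 for closed='left' on a fixed window:

```python
import numpy as np
import pandas as pd

window = 3  # stand-in for 252, to keep the example small

# Hypothetical values for the column used in the question.
df = pd.DataFrame({"Interday_Close_change": [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]})
col = df["Interday_Close_change"]

# Variant 1: rolling std including the current row, then shifted down by one.
shifted = col.rolling(window).std().shift()

# Variant 2: window closed on the left, so the current row is excluded.
left_closed = col.rolling(window, closed="left").std()

# Manual check for row 3: std of the three preceding rows (0, 1, 2).
manual = col.iloc[0:3].std()

print(pd.DataFrame({"shifted": shifted, "left_closed": left_closed}))
print(np.allclose(shifted, left_closed, equal_nan=True))
```

In both variants the first `window` rows are NaN, because fewer than `window` preceding rows exist for them.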