Why Does My Pandas Dataframe Not Display New Order Using 'Sort_Values'

Why does my Pandas DataFrame not display new order using `sort_values`?

df.sort_values(['Total Due']) returns a sorted copy of the DataFrame, but it doesn't update df in place.

So do it explicitly:

df = df.sort_values(['Total Due'])

or

df.sort_values(['Total Due'], inplace=True)
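A minimal sketch of the difference. Only the 'Total Due' column name comes from the question; the values and the helper variable names are made up for illustration:

```python
import pandas as pd

# Illustrative data; only the column name comes from the question.
df = pd.DataFrame({"Total Due": [30, 10, 20]})

# sort_values returns a new DataFrame and leaves df untouched.
sorted_copy = df.sort_values(["Total Due"])
original_order = df["Total Due"].tolist()
sorted_order = sorted_copy["Total Due"].tolist()

# Reassigning the result is what makes the new order visible through df.
df = df.sort_values(["Total Due"])
reassigned_order = df["Total Due"].tolist()
```

Here original_order still holds the unsorted values, while sorted_order and reassigned_order hold the ascending ones.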

Python pandas dataframe sort_values does not work

Presumably, what you're trying to do is sort by the numerical value after sso_. You can do this as follows:

import numpy as np

df.iloc[np.argsort(df.test_type.str.split('_').str[-1].astype(int).values)]

This

  1. Splits the strings at _

  2. Converts the part after this character to an integer

  3. Finds the positions that would sort those integers

  4. Reorders the DataFrame according to these positions

Example

In [15]: df = pd.DataFrame({'test_type': ['sso_1000', 'sso_500']})

In [16]: df.sort_values(by=['test_type'], ascending=True)
Out[16]:
test_type
0 sso_1000
1 sso_500

In [17]: df.iloc[np.argsort(df.test_type.str.split('_').str[-1].astype(int).values)]
Out[17]:
test_type
1 sso_500
0 sso_1000
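In pandas 1.1 and later, the same numeric-suffix sort can probably be written more directly with the key argument of sort_values. This is a sketch of an alternative, not the approach from the answer above, and it assumes every value ends in an underscore followed by digits:

```python
import pandas as pd

df = pd.DataFrame({"test_type": ["sso_1000", "sso_500"]})

# key receives the whole column as a Series and must return a Series
# of the same length; here it returns the integer after the underscore.
result = df.sort_values(
    by="test_type",
    key=lambda col: col.str.split("_").str[-1].astype(int),
)
```

The original labels are kept, so sso_500 (index 1) comes before sso_1000 (index 0).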

when sorting the data frame using sort_value(inplace=True) error is showing up

You can use this:

df = df.sort_values(by=['id_student'])
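The question does not show the error, so the following is only a guess at one common cause: with inplace=True, sort_values returns None, and assigning that return value back to df replaces the DataFrame with None. The id_student values are made up:

```python
import pandas as pd

df = pd.DataFrame({"id_student": [3, 1, 2]})

# With inplace=True the method sorts df itself and returns None.
returned = df.sort_values(by=["id_student"], inplace=True)
inplace_order = df["id_student"].tolist()

# Pitfall: `df = df.sort_values(..., inplace=True)` would set df to None,
# and the next attribute access on df would raise AttributeError.
```

Reassigning without inplace, as in the line above, avoids this.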

What is the issue with my date sorting in pandas dataframe?

You have two options.
Change the last line to either

df = df.sort_values(['Start Date'], ascending= True, kind= 'quicksort')

or

df.sort_values(['Start Date'], ascending= True, kind= 'quicksort', inplace=True)

Python pandas dataframe sort_values does not work for second term

Use read_csv with the thousands parameter to remove the , separators from the floats, and parse_dates to convert the date column to datetime. The sort failed on the second term because the values of the Sales column were read as strings:

df = pd.read_csv('sales.csv', thousands=',', parse_dates=['Order Date'])
print (df)
   Order ID  Order Date  Order Priority  Order Quantity    Sales
0     928.0  2009-01-01            High            32.0   180.36
1   10369.0  2009-01-02             Low            43.0  4083.19
2   10144.0  2009-01-02        Critical            16.0   137.63
3   32323.0  2009-01-01   Not Specified             9.0   872.48
4   48353.0  2009-01-02        Critical             3.0   124.81
5   51008.0  2009-01-03        Critical            15.0    85.56
6   26756.0  2009-01-02        Critical            43.0   614.80
7   18144.0  2009-01-02             Low             4.0  1239.06
8   22912.0  2009-01-02             Low            32.0  4902.38

df = df.sort_values(by=['Order Date', 'Sales'], ascending=[True, False])
print (df)
   Order ID  Order Date  Order Priority  Order Quantity    Sales
3   32323.0  2009-01-01   Not Specified             9.0   872.48
0     928.0  2009-01-01            High            32.0   180.36
8   22912.0  2009-01-02             Low            32.0  4902.38
1   10369.0  2009-01-02             Low            43.0  4083.19
7   18144.0  2009-01-02             Low             4.0  1239.06
6   26756.0  2009-01-02        Critical            43.0   614.80
2   10144.0  2009-01-02        Critical            16.0   137.63
4   48353.0  2009-01-02        Critical             3.0   124.81
5   51008.0  2009-01-03        Critical            15.0    85.56

Another solution is to use replace + astype or to_numeric:

df['Order Date'] = pd.to_datetime(df['Order Date'])
df['Sales'] = df['Sales'].replace(',', '', regex=True).astype(float)
# if astype fails because of bad data, coerce invalid values to NaN instead
#df['Sales'] = pd.to_numeric(df['Sales'].replace(',', '', regex=True), errors='coerce')
df = df.sort_values(by=['Order Date', 'Sales'], ascending=[True, False])
print (df)
   Order ID  Order Date  Order Priority  Order Quantity    Sales
3   32323.0  2009-01-01   Not Specified             9.0   872.48
0     928.0  2009-01-01            High            32.0   180.36
8   22912.0  2009-01-02             Low            32.0  4902.38
1   10369.0  2009-01-02             Low            43.0  4083.19
7   18144.0  2009-01-02             Low             4.0  1239.06
6   26756.0  2009-01-02        Critical            43.0   614.80
2   10144.0  2009-01-02        Critical            16.0   137.63
4   48353.0  2009-01-02        Critical             3.0   124.81
5   51008.0  2009-01-03        Critical            15.0    85.56
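To see the effect of thousands and parse_dates without the original sales.csv, here is a small self-contained sketch. The CSV text, the reduced column set and the variable names are assumptions for illustration only:

```python
import io
import pandas as pd

# Made-up CSV text; "Sales" uses a comma as thousands separator.
csv_text = (
    'Order Date,Sales\n'
    '2009-01-02,"4,083.19"\n'
    '2009-01-01,180.36\n'
    '2009-01-02,614.80\n'
)

# Without thousands=',', the value "4,083.19" keeps Sales as strings.
raw = pd.read_csv(io.StringIO(csv_text))
raw_is_numeric = pd.api.types.is_numeric_dtype(raw["Sales"])

# With thousands=',' and parse_dates, both columns get proper dtypes.
parsed = pd.read_csv(io.StringIO(csv_text), thousands=",", parse_dates=["Order Date"])
parsed = parsed.sort_values(by=["Order Date", "Sales"], ascending=[True, False])
sales_sorted = parsed["Sales"].tolist()
```

After parsing, the rows are ordered by date first and, within each date, by descending Sales.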

Pandas wrong sorting of DateTime column

(Turning my comment into an answer as suggested by @Wai Ha Lee)

df.sort_values('DateTime') returns a sorted copy of the dataframe, but doesn't change the original.

That can be done either by explicit reassignment:

df = df.sort_values('DateTime')

or by using the inplace flag

df.sort_values('DateTime', inplace=True)

Although the latter is generally discouraged, and deprecating the inplace option has been discussed by the pandas developers.

Pandas dataframe - sort and shift within a group

sort_values is not an inplace operation by default. Either pass inplace=True

df.sort_values(['O','A', 'N', 'time'], inplace=True)
# other operations

or reassign:

df = df.sort_values(...)
# other operations
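Since the question title mentions shifting within a group, here is a hedged sketch of how the reassigned sort might feed a groupby shift. The column names come from the snippet above; the data, the choice to group by 'O' alone, and the 'prev_time' column are assumptions for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "O": ["x", "y", "x", "y"],
    "A": [1, 1, 1, 1],
    "N": [1, 1, 1, 1],
    "time": [20, 15, 10, 5],
})

# Reassign so that later operations see the sorted rows.
df = df.sort_values(["O", "A", "N", "time"])

# Within each "O" group, fetch the previous row's time (NaN for the first row).
df["prev_time"] = df.groupby("O")["time"].shift(1)
```

Because shift follows the row order inside each group, forgetting the reassignment would shift the rows in their original, unsorted order.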

Custom sorting in pandas dataframe

Pandas 0.15 introduced Categorical Series, which allows a much clearer way to do this:

First make the month column a categorical and specify the ordering to use.

In [21]: df['m'] = pd.Categorical(df['m'], ["March", "April", "Dec"])

In [22]: df # looks the same!
Out[22]:
a b m
0 1 2 March
1 5 6 Dec
2 3 4 April

Now, when you sort the month column it will sort with respect to that list:

In [23]: df.sort_values("m")
Out[23]:
a b m
0 1 2 March
2 3 4 April
1 5 6 Dec

Note: if a value is not in the list it will be converted to NaN.
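A short sketch of that note, using an extra month ("June", made up here) that is not in the category list:

```python
import pandas as pd

df = pd.DataFrame({"m": ["March", "Dec", "April", "June"]})

# "June" is not among the categories, so it becomes NaN.
df["m"] = pd.Categorical(df["m"], categories=["March", "April", "Dec"])

# Rows sort by category order; NaN rows go last by default.
result = df.sort_values("m")
```

The missing value ends up at the bottom because sort_values places NaN last unless na_position='first' is passed.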


An older answer for those interested...

You could create an intermediary series, and set_index on that:

df = pd.DataFrame([[1, 2, 'March'],[5, 6, 'Dec'],[3, 4, 'April']], columns=['a','b','m'])
s = df['m'].apply(lambda x: {'March':0, 'April':1, 'Dec':3}[x])
s = s.sort_values()

In [4]: df.set_index(s.index).sort_index()
Out[4]:
a b m
0 1 2 March
1 3 4 April
2 5 6 Dec

As commented, in newer pandas, Series has a replace method to do this more elegantly:

s = df['m'].replace({'March':0, 'April':1, 'Dec':3})

The slight difference is that this won't raise if there is a value outside of the dictionary (it'll just stay the same).
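In pandas 1.1 or later, a similar dictionary-based order can be sketched without an intermediate index by passing a key to sort_values. This swaps in Series.map rather than replace; unlike replace, map turns values missing from the dictionary into NaN. The name order is an assumption:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, "March"], [5, 6, "Dec"], [3, 4, "April"]],
                  columns=["a", "b", "m"])
order = {"March": 0, "April": 1, "Dec": 3}

# key maps each month to its rank; rows are sorted by that rank.
result = df.sort_values("m", key=lambda col: col.map(order))
```

The m column itself is left unchanged; only the sort order is driven by the mapped ranks.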

Pandas sort_values does not sort numbers correctly

For whatever reason, you seem to be working with a column of strings, so sort_values is returning a lexicographically sorted result.

Here's an example.

df = pd.DataFrame({"Col": ['1', '2', '3', '10', '20', '19']})
df

Col
0 1
1 2
2 3
3 10
4 20
5 19

df.sort_values('Col')

Col
0 1
3 10
5 19
1 2
4 20
2 3

The remedy is to convert it to numeric, either using .astype or pd.to_numeric.

df.Col = df.Col.astype(float)

Or,

df.Col = pd.to_numeric(df.Col, errors='coerce')
df.sort_values('Col')

Col
0 1
1 2
2 3
3 10
5 19
4 20

The main difference between astype and pd.to_numeric is that the latter is more robust at handling non-numeric strings (with errors='coerce' they are converted to NaN), and it will attempt to preserve integers if a coercion to float is not necessary (as is seen in this case).
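A small sketch of both points, with a made-up Series that contains one non-numeric string:

```python
import pandas as pd

s = pd.Series(["1", "10", "2", "abc"])

# astype(float) raises on the non-numeric string.
try:
    s.astype(float)
    astype_failed = False
except ValueError:
    astype_failed = True

# to_numeric with errors="coerce" turns "abc" into NaN instead.
coerced = pd.to_numeric(s, errors="coerce")

# With clean input, to_numeric keeps an integer dtype.
clean = pd.to_numeric(pd.Series(["1", "10", "2"]))
```

Once the column is numeric, sort_values orders it by value rather than character by character.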


