Why does my Pandas DataFrame not display new order using `sort_values`?
df.sort_values(['Total Due'])
returns a sorted DF, but it doesn't update DF in place.
So do it explicitly:
df = df.sort_values(['Total Due'])
or
df.sort_values(['Total Due'], inplace=True)
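A minimal sketch of the difference, using a made-up three-row frame (the values are illustrative, not taken from the question):

```python
import pandas as pd

# Hypothetical data, just to show the behaviour.
df = pd.DataFrame({"Total Due": [30, 10, 20]})

# The sorted result is returned and then discarded; df itself is unchanged.
df.sort_values(["Total Due"])
print(df["Total Due"].tolist())

# Keeping the returned frame is what makes the new order visible.
sorted_df = df.sort_values(["Total Due"])
print(sorted_df["Total Due"].tolist())
```

The first print shows the original order, the second the sorted one.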
Python pandas dataframe sort_values does not work
Presumably, what you're trying to do is sort by the numerical value after sso_. You can do this as follows:
import numpy as np
# .ix has been removed from pandas; argsort returns positions, so use .iloc
df.iloc[np.argsort(df.test_type.str.split('_').str[-1].astype(int).values)]
This:
splits the strings at _
converts what follows this character to a numerical value
finds the indices that sort those numerical values
reorders the DataFrame according to these indices
Example
In [15]: df = pd.DataFrame({'test_type': ['sso_1000', 'sso_500']})
In [16]: df.sort_values(by=['test_type'], ascending=True)
Out[16]:
test_type
0 sso_1000
1 sso_500
In [17]: df.iloc[np.argsort(df.test_type.str.split('_').str[-1].astype(int).values)]
Out[17]:
test_type
1 sso_500
0 sso_1000
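In newer pandas (1.1 and later, as far as I know) the same idea can probably be written without NumPy by passing a key function to sort_values. This is only a sketch; the helper name numeric_suffix is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"test_type": ["sso_1000", "sso_500"]})

def numeric_suffix(col):
    # Hypothetical helper: take the part after the last "_" and compare it as an integer.
    return col.str.split("_").str[-1].astype(int)

# key receives the column as a Series and must return a Series of the same length.
result = df.sort_values(by="test_type", key=numeric_suffix)
print(result)
```

The key function only affects the comparison; the values stored in test_type are left as strings.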
when sorting the data frame using sort_value(inplace=True) error is showing up
You can use this:
df = df.sort_values(by=['id_student'])
What is the issue with my date sorting in pandas dataframe?
You have two options.
Change the last line to either
df = df.sort_values(['Start Date'], ascending= True, kind= 'quicksort')
or
df.sort_values(['Start Date'], ascending= True, kind= 'quicksort', inplace=True)
Python pandas dataframe sort_values does not work for second term
Use read_csv with the thousands parameter to remove the , from the floats, and parse_dates to convert the column to datetime, because the values of the Sales column were read as strings:
df = pd.read_csv('sales.csv', thousands=',', parse_dates=['Order Date'])
print (df)
Order ID Order Date Order Priority Order Quantity Sales
0 928.0 2009-01-01 High 32.0 180.36
1 10369.0 2009-01-02 Low 43.0 4083.19
2 10144.0 2009-01-02 Critical 16.0 137.63
3 32323.0 2009-01-01 Not Specified 9.0 872.48
4 48353.0 2009-01-02 Critical 3.0 124.81
5 51008.0 2009-01-03 Critical 15.0 85.56
6 26756.0 2009-01-02 Critical 43.0 614.80
7 18144.0 2009-01-02 Low 4.0 1239.06
8 22912.0 2009-01-02 Low 32.0 4902.38
df = df.sort_values(by=['Order Date', 'Sales'], ascending=[True, False])
print (df)
Order ID Order Date Order Priority Order Quantity Sales
3 32323.0 2009-01-01 Not Specified 9.0 872.48
0 928.0 2009-01-01 High 32.0 180.36
8 22912.0 2009-01-02 Low 32.0 4902.38
1 10369.0 2009-01-02 Low 43.0 4083.19
7 18144.0 2009-01-02 Low 4.0 1239.06
6 26756.0 2009-01-02 Critical 43.0 614.80
2 10144.0 2009-01-02 Critical 16.0 137.63
4 48353.0 2009-01-02 Critical 3.0 124.81
5 51008.0 2009-01-03 Critical 15.0 85.56
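To see why the original sort misbehaved, here is a small self-contained sketch. The CSV text is invented (only two of the columns, with ISO dates) and is read from memory rather than from sales.csv:

```python
import io
import pandas as pd

# Made-up sample: one Sales value carries a thousands separator.
csv_text = (
    'Order Date,Sales\n'
    '2009-01-02,"4,083.19"\n'
    '2009-01-01,180.36\n'
    '2009-01-02,137.63\n'
)

# Without thousands=",", the Sales column is read as text, not numbers.
raw = pd.read_csv(io.StringIO(csv_text))
print(raw["Sales"].dtype)

# With thousands="," and parse_dates, both columns get sortable dtypes.
parsed = pd.read_csv(io.StringIO(csv_text), thousands=",", parse_dates=["Order Date"])
print(parsed.dtypes)

result = parsed.sort_values(by=["Order Date", "Sales"], ascending=[True, False])
print(result)
```

Checking dtypes right after reading is usually the quickest way to spot this kind of problem.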
Another solution is to use replace + astype, or to_numeric:
df['Order Date'] = pd.to_datetime(df['Order Date'])
df['Sales'] = df['Sales'].replace(',', '', regex=True).astype(float)
#if astype does not work because bad data
#df['Sales'] = pd.to_numeric(df['Sales'].replace(',', '', regex=True), errors='coerce')
df = df.sort_values(by=['Order Date', 'Sales'], ascending=[True, False])
print (df)
Order ID Order Date Order Priority Order Quantity Sales
3 32323.0 2009-01-01 Not Specified 9.0 872.48
0 928.0 2009-01-01 High 32.0 180.36
8 22912.0 2009-01-02 Low 32.0 4902.38
1 10369.0 2009-01-02 Low 43.0 4083.19
7 18144.0 2009-01-02 Low 4.0 1239.06
6 26756.0 2009-01-02 Critical 43.0 614.80
2 10144.0 2009-01-02 Critical 16.0 137.63
4 48353.0 2009-01-02 Critical 3.0 124.81
5 51008.0 2009-01-03 Critical 15.0 85.56
Pandas wrong sorting of DateTime column
(Turning my comment into an answer as suggested by @Wai Ha Lee)
df.sort_values('DateTime')
returns a sorted copy of the dataframe, but doesn't change the original.
That can be done either by explicit reassignment:
df = df.sort_values('DateTime')
or by using the inplace flag:
df.sort_values('DateTime', inplace=True)
The latter is generally discouraged, though, and deprecating inplace has been discussed by the pandas developers.
Pandas dataframe - sort and shift within a group
sort_values is not an in-place operation by default. Either pass inplace=True:
df.sort_values(['O','A', 'N', 'time'], inplace=True)
# other operations
or reassign:
df = df.sort_values(...)
# other operations
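As a rough sketch of what "sort, then shift within a group" could look like once the sorted frame is kept: the column names group, time and value below are assumptions for illustration, not the O, A, N columns from the question.

```python
import pandas as pd

# Hypothetical data: two groups with unordered timestamps.
df = pd.DataFrame({
    "group": ["a", "b", "a", "b"],
    "time": [2, 1, 1, 2],
    "value": [20, 30, 10, 40],
})

# Reassign so the sorted order is actually kept.
df = df.sort_values(["group", "time"])

# shift() then moves values down by one row inside each group.
df["prev_value"] = df.groupby("group")["value"].shift()
print(df)
```

The first row of each group has no predecessor, so prev_value is NaN there.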
Custom sorting in pandas dataframe
Pandas 0.15 introduced Categorical Series, which allows a much clearer way to do this:
First make the month column a categorical and specify the ordering to use.
In [21]: df['m'] = pd.Categorical(df['m'], ["March", "April", "Dec"])
In [22]: df # looks the same!
Out[22]:
a b m
0 1 2 March
1 5 6 Dec
2 3 4 April
Now, when you sort the month column it will sort with respect to that list:
In [23]: df.sort_values("m")
Out[23]:
a b m
0 1 2 March
2 3 4 April
1 5 6 Dec
Note: if a value is not in the list it will be converted to NaN.
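A small sketch of that caveat, with an extra "May" row added purely for illustration:

```python
import pandas as pd

# "May" is deliberately missing from the category list below.
df = pd.DataFrame({"m": ["March", "Dec", "April", "May"]})
df["m"] = pd.Categorical(df["m"], categories=["March", "April", "Dec"], ordered=True)

# The unknown value has become NaN.
print(df["m"].isna().tolist())

# NaN rows are placed last by default when sorting.
result = df.sort_values("m")
print(result)
```

So it is worth checking for unexpected NaN values after the conversion, since a misspelled month would silently disappear.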
An older answer for those interested...
You could create an intermediary series, and set_index on that:
df = pd.DataFrame([[1, 2, 'March'],[5, 6, 'Dec'],[3, 4, 'April']], columns=['a','b','m'])
s = df['m'].apply(lambda x: {'March':0, 'April':1, 'Dec':3}[x])
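s = s.sort_values()  # sort_values returns a new Series, so keep the result
In [4]: df.set_index(s.index).sort_index()  # DataFrame.sort() no longer exists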
Out[4]:
a b m
0 1 2 March
1 3 4 April
2 5 6 Dec
As commented, in newer pandas, Series has a replace method to do this more elegantly:
s = df['m'].replace({'March':0, 'April':1, 'Dec':3})
The slight difference is that this won't raise if there is a value outside of the dictionary (it'll just stay the same).
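If your pandas version supports the key argument of sort_values (1.1 and later, I believe), the intermediary series can probably be skipped altogether. This sketch swaps in Series.map, which yields NaN for values missing from the dictionary rather than raising or leaving them unchanged; the name order is made up here:

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, "March"], [5, 6, "Dec"], [3, 4, "April"]],
    columns=["a", "b", "m"],
)

# Hypothetical ranking of the month labels.
order = {"March": 0, "April": 1, "Dec": 3}

# The key maps each label to its rank only for the comparison; column m keeps its strings.
result = df.sort_values("m", key=lambda col: col.map(order))
print(result)
```

Unlike the set_index approach, this keeps the original index labels attached to their rows.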
Pandas sort_values does not sort numbers correctly
For whatever reason, you seem to be working with a column of strings, and sort_values is returning a lexicographically sorted result.
Here's an example.
df = pd.DataFrame({"Col": ['1', '2', '3', '10', '20', '19']})
df
Col
0 1
1 2
2 3
3 10
4 20
5 19
df.sort_values('Col')
Col
0 1
3 10
5 19
1 2
4 20
2 3
The remedy is to convert it to numeric, either using .astype or pd.to_numeric.
df.Col = df.Col.astype(float)
Or,
df.Col = pd.to_numeric(df.Col, errors='coerce')
df.sort_values('Col')
Col
0 1
1 2
2 3
3 10
5 19
4 20
The only difference between astype and pd.to_numeric is that the latter is more robust at handling non-numeric strings (with errors='coerce' they're coerced to NaN), and will attempt to preserve integers if a coercion to float is not necessary (as is seen in this case).
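A quick sketch of that difference, using a made-up series with one bad value ("x"):

```python
import pandas as pd

# Hypothetical column containing a non-numeric string.
s = pd.Series(["1", "10", "x", "2"])

# astype(float) gives up on the first value it cannot parse.
try:
    s.astype(float)
    astype_failed = False
except ValueError:
    astype_failed = True
print(astype_failed)

# to_numeric with errors="coerce" turns the bad value into NaN instead.
coerced = pd.to_numeric(s, errors="coerce")
print(coerced.isna().tolist())

# With clean integer strings, to_numeric keeps an integer dtype.
clean = pd.to_numeric(pd.Series(["1", "2", "10"]))
print(clean.dtype)
```

Coercing hides bad data as NaN, so it can be worth counting the NaN values afterwards before sorting.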