How to Sort Pandas Dataframe from One Column

Use sort_values to sort the DataFrame by the values in a specific column:

In [18]:
df.sort_values('2')

Out[18]:
        0          1     2
4    85.6    January   1.0
3    95.5   February   2.0
7   104.8      March   3.0
0   354.7      April   4.0
8   283.5        May   5.0
6   238.7       June   6.0
5   152.0       July   7.0
1    55.4     August   8.0
11  212.7  September   9.0
10  249.6    October  10.0
9   278.8   November  11.0
2   176.5   December  12.0

If you want to sort by two columns, pass a list of column labels to sort_values, ordered by sort priority. For example, df.sort_values(['2', '0']) sorts by column '2' first and uses column '0' only to break ties. Granted, this does not really make sense for this example because each value in df['2'] is unique.
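As a minimal sketch of the two-column case, the snippet below uses a small made-up frame in which the first key contains ties, so the second key actually matters. The column names 'group' and 'score' are assumptions for illustration and do not come from the example above.

```python
import pandas as pd

# Hypothetical data: 'group' has repeated values, so 'score' breaks the ties.
df = pd.DataFrame({
    "group": [2, 1, 2, 1],
    "score": [0.5, 0.9, 0.1, 0.3],
})

# Sort by 'group' first, then by 'score' within each group.
result = df.sort_values(["group", "score"])
print(result)
```

The original index labels are kept, so they show how the rows moved.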

How to sort a pandas DataFrame on one column given an already ordered list of the values in that column?

Approach 1

Convert the fruit column to an ordered categorical type and sort the values:

df['fruit'] = pd.Categorical(df['fruit'], ordered_list, ordered=True)
df.sort_values('fruit')

Approach 2

Sort the values by passing a key function that maps the fruit names to their corresponding positions in the ordered list:

df.sort_values('fruit', key=lambda x: x.map({v:k for k, v in enumerate(ordered_list)}))

   id      fruit  trash
2   3  pineapple     93
1   2     banana     22
3   4     orange      1
4   5     orange     15
0   1      apple     38
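Neither approach shows how df and ordered_list are defined, so here is a self-contained sketch of both. The frame is reconstructed from the output above; the contents of ordered_list are an assumption that is merely consistent with that output. The key argument needs pandas 1.1 or later.

```python
import pandas as pd

# Frame reconstructed from the output shown above.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "fruit": ["apple", "banana", "pineapple", "orange", "orange"],
    "trash": [38, 22, 93, 1, 15],
})

# Assumed desired order (not given in the text, only consistent with the output).
ordered_list = ["pineapple", "banana", "orange", "apple"]

# Approach 1: sort on an ordered categorical version of the column.
as_cat = df.assign(
    fruit=pd.Categorical(df["fruit"], categories=ordered_list, ordered=True)
)
by_categorical = as_cat.sort_values("fruit")

# Approach 2: map each fruit name to its position in ordered_list via `key`.
rank = {name: pos for pos, name in enumerate(ordered_list)}
by_key = df.sort_values("fruit", key=lambda s: s.map(rank))

print(by_key)
```

Approach 1 changes the dtype of the column, which is handy if the order is needed again later; Approach 2 leaves the column as plain strings.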

Sort pandas dataframe by two columns using key in one of them, kind mergesort, not working

import pandas as pd

data = {
  "col1": ["chr5","chr5","chr5","chr3","chr3","chr3","chr3","chr2","chr2","chr2","chr11"],
  "col2": ["CDS","gene","mRNA","three_prime_UTR","gene","CDS","mRNA","CDS","gene","mRNA","CDS"]
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print("Before Sort:",df)
df['col2'] = pd.Categorical(df['col2'],categories=['gene','mRNA','five_prime_UTR', 'CDS', 'three_prime_UTR'],ordered=True)
df['new'] = df['col1'].str.extract(r'(\d+$)').astype(int)  # raw string avoids an invalid-escape warning
df = df.sort_values(by=['new', 'col2']).drop('new', axis=1)
df.reset_index(drop=True, inplace=True)
print("\n\nAfter sort:",df)

For col2 I used a categorical sort. For col1 I extracted the trailing number into a helper column "new", sorted on it, and then dropped that column.

Output:

Before Sort:      col1             col2
0    chr5              CDS
1    chr5             gene
2    chr5             mRNA
3    chr3  three_prime_UTR
4    chr3             gene
5    chr3              CDS
6    chr3             mRNA
7    chr2              CDS
8    chr2             gene
9    chr2             mRNA
10  chr11              CDS

After sort:      col1             col2
0    chr2             gene
1    chr2             mRNA
2    chr2              CDS
3    chr3             gene
4    chr3             mRNA
5    chr3              CDS
6    chr3  three_prime_UTR
7    chr5             gene
8    chr5             mRNA
9    chr5              CDS
10  chr11              CDS
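Since the question asks about key, here is a hedged sketch of the same sort without a helper column. It relies on sort_values calling the key function once per column listed in by, so the function can branch on the Series name. The function name chrom_key and the shortened sample data are made up for illustration; the sketch assumes pandas 1.1 or later and that every col1 value ends in digits.

```python
import pandas as pd

# Shortened, hypothetical sample in the same shape as the data above.
df = pd.DataFrame({
    "col1": ["chr5", "chr3", "chr3", "chr11", "chr2"],
    "col2": ["CDS", "three_prime_UTR", "gene", "CDS", "mRNA"],
})
df["col2"] = pd.Categorical(
    df["col2"],
    categories=["gene", "mRNA", "five_prime_UTR", "CDS", "three_prime_UTR"],
    ordered=True,
)

def chrom_key(col):
    # Called once per sort column; only col1 is transformed.
    if col.name == "col1":
        return col.str.extract(r"(\d+)$", expand=False).astype(int)
    return col  # col2 keeps its categorical order

result = df.sort_values(["col1", "col2"], key=chrom_key, ignore_index=True)
print(result)
```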

Sort a pandas dataframe by 2 columns (one with integers, one with alphanumerics) with priority for integer column

You can do it this way:

Split the second column with alphanumeric strings into 2 columns: one column Letter to hold the first letter and another column Number to hold a number of one or two digits.
Convert Number column from string to integer.
Then, sort these 2 new columns together with the first column of integers

Let's illustrate the process with an example below:

Assume we have the dataframe df as follows:

print(df)

   Col1 Col2
0     2  B12
1    11   C2
2     2   A1
3    11   B2
4     2   B1
5    11  C12
6     2  A12
7    11   C1
8     2   A2

Step 1 & 2: Split Col2 into 2 columns Letter & Number + Convert Number column from string to integer:

df['Letter'] = df['Col2'].str[0]               # take 1st char
df['Number'] = df['Col2'].str[1:].astype(int)  # take 2nd char onwards and convert to integer

Result:

print(df)

   Col1 Col2 Letter  Number
0     2  B12      B      12
1    11   C2      C       2
2     2   A1      A       1
3    11   B2      B       2
4     2   B1      B       1
5    11  C12      C      12
6     2  A12      A      12
7    11   C1      C       1
8     2   A2      A       2

Step 3: Sort Col1, Letter and Number with priority: Col1 ---> Number ---> Letter:

df = df.sort_values(by=['Col1', 'Number', 'Letter'])

Result:

print(df)

   Col1 Col2 Letter  Number
2     2   A1      A       1
4     2   B1      B       1
8     2   A2      A       2
6     2  A12      A      12
0     2  B12      B      12
7    11   C1      C       1
3    11   B2      B       2
1    11   C2      C       2
5    11  C12      C      12

After sorting, you can remove the Letter and Number columns, as follows:

df = df.drop(['Letter', 'Number'], axis=1)

If you want to do all in one step, you can also chain the instructions, as follows:

df = (df.assign(Letter=df['Col2'].str[0], 
                Number=df['Col2'].str[1:].astype(int))
        .sort_values(by=['Col1', 'Number', 'Letter'])
        .drop(['Letter', 'Number'], axis=1)
     )

Result:

print(df)

   Col1 Col2
2     2   A1
4     2   B1
8     2   A2
6     2  A12
0     2  B12
7    11   C1
3    11   B2
1    11   C2
5    11  C12
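For a version that runs on its own, the sketch below builds the sample frame explicitly and splits Col2 with a regular expression instead of fixed string positions. The regex variant is my assumption, not part of the answer above; it would also cope with prefixes longer than one letter, provided every value is letters followed by digits.

```python
import pandas as pd

# Sample frame as printed above.
df = pd.DataFrame({
    "Col1": [2, 11, 2, 11, 2, 11, 2, 11, 2],
    "Col2": ["B12", "C2", "A1", "B2", "B1", "C12", "A12", "C1", "A2"],
})

# Split Col2 into a letter prefix and a numeric suffix in one pass.
parts = df["Col2"].str.extract(r"^(?P<Letter>[A-Za-z]+)(?P<Number>\d+)$")
parts["Number"] = parts["Number"].astype(int)

result = (
    df.join(parts)                                   # add the helper columns
      .sort_values(by=["Col1", "Number", "Letter"])  # same priority as Step 3
      .drop(columns=["Letter", "Number"])            # remove the helpers again
)
print(result)
```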

How to sort ascending and descending depending on a value in another column in pandas?

If you can assume that your "price" column always contains non-negative values, you can "cheat": negate the prices of the sell operations, sort, and then take the absolute value to get back the original prices.

If type is "buy", the price remains positive (2 * 1 - 1 = 1). If type is "sell", the price will become negative (2 * 0 - 1 = -1).
```
df["price"] = df["price"] * (2 * (df["type"] == "buy").astype(int) - 1)
```
Now sort values normally. I've included both "initiator_id" and "type" columns to match your expected output:
```
df = df.sort_values(["initiator_id", "type", "price"])
```
Finally, calculate the absolute value of the "price" column to retrieve your original values:
```
df["price"] = df["price"].abs()
```

Expected output of this operation on your sample input:

   initiator_id   price  type  bidnum
0             1  170.81  sell       0
2             2  169.19   buy       0
1             2  170.81  sell       0
4             3  169.19   buy       0
3             3  170.81  sell       0
5             3   70.81  sell       1
9             4   69.19   buy       1
7             4  169.19   buy       0
6             4  170.81  sell       0
8             4   70.81  sell       1
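The same idea can be sketched without overwriting "price" by putting the signed values in a temporary column. The sample rows below are a hypothetical subset in the shape of the output above, and the column name _signed_price is made up for illustration.

```python
import pandas as pd

# Hypothetical sample in the same shape as the output above.
df = pd.DataFrame({
    "initiator_id": [3, 3, 3, 4, 4, 4, 4],
    "price": [170.81, 169.19, 70.81, 170.81, 169.19, 70.81, 69.19],
    "type": ["sell", "buy", "sell", "sell", "buy", "sell", "buy"],
})

# +1 for "buy" rows, -1 for "sell" rows.
sign = (df["type"] == "buy").astype(int) * 2 - 1

result = (
    df.assign(_signed_price=df["price"] * sign)  # temporary sort key
      .sort_values(["initiator_id", "type", "_signed_price"])
      .drop(columns="_signed_price")
)
print(result)
```

Within each initiator, buy prices come out ascending and sell prices descending, and the "price" column is never modified.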

How to sort dataframe rows by multiple columns

Use sort_values, which can accept a list of sorting targets. In this case it sounds like you want to sort by S/N, then Dis, then Rate:

df = df.sort_values(['S/N', 'Dis', 'Rate'])

#     S/N      Dis       Rate
# 0   332   4.6030  91.204062
# 3   332   9.1985  76.212943
# 6   332  14.4405  77.664282
# 9   332  20.2005  76.725955
# 12  332  25.4780  31.597510
# 15  332  30.6670  74.096975
# 1   445   5.4280  60.233917
# 4   445   9.7345  31.902842
# 7   445  14.6015  36.261851
# 10  445  19.8630  40.705467
# 13  445  24.9050   4.897008
# 16  445  30.0550  35.217889
# ...

Sort values in a dataframe by a column and take second one only if equal

Your solution almost works, but the result of reset_index with inplace=True is not picked up by sort_values.

A possible solution is to pass ignore_index=True to sort_values, so reset_index is not necessary.

import numpy as np
import pandas as pd

np.random.seed(2022)
df = pd.DataFrame({'col1':np.random.random(5), 'col2':np.random.random(5)})
df = df.sort_values(by=['col2','col1'],ascending=False, ignore_index=True)
print (df)
       col1      col2
0  0.499058  0.897657
1  0.049974  0.896963
2  0.685408  0.721135
3  0.113384  0.647452
4  0.009359  0.486988

Or, if you want to use inplace, add it only to sort_values, together with ignore_index=True:

df.sort_values(by=['col2','col1'],ascending=False, ignore_index=True,inplace=True)
print (df)
       col1      col2
0  0.499058  0.897657
1  0.049974  0.896963
2  0.685408  0.721135
3  0.113384  0.647452
4  0.009359  0.486988
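The question's code is not shown here, so the following is only a sketch of one way the inplace pitfall can arise, assuming a chained call. The frame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [0.2, 0.9, 0.5]})

# The sorted frame is a temporary object; resetting its index in place
# changes only that temporary, and the whole expression evaluates to None.
returned = df.sort_values("col2", ascending=False).reset_index(drop=True, inplace=True)
print(returned)              # None
print(df["col2"].tolist())   # df itself is still in its original order

# ignore_index=True sorts and renumbers the rows in a single call.
sorted_df = df.sort_values("col2", ascending=False, ignore_index=True)
print(sorted_df)
```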
