Add a "Rank" Column to a Data Frame - ITCodar

Adding a ranking column to a dataframe

You can compute a ranking from an ordering as follows:

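# order.scores is not defined in this excerpt; ordering by score, then by fname,
# reproduces the ranks shown below.
order.scores <- order(dat$score, dat$fname)
dat$rank <- NA
dat$rank[order.scores] <- 1:nrow(dat)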
dat
#   fname score rank
# 1   Joe   500    5
# 2   Bob   490    3
# 3  Bill   500    4
# 4   Tom   750    8
# 5   Sue   550    7
# 6   Sam   500    6
# 7  Jane   210    1
# 8  Ruby   320    2
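
For comparison, here is a minimal pandas sketch of the same idea (turn an ordering into a ranking). The data is copied from the output above; breaking ties by `fname` is an assumption made so that the result matches that output, and the variable names are only illustrative.

```python
import pandas as pd

dat = pd.DataFrame({
    "fname": ["Joe", "Bob", "Bill", "Tom", "Sue", "Sam", "Jane", "Ruby"],
    "score": [500, 490, 500, 750, 550, 500, 210, 320],
})

# Row labels in sorted order: by score, then by fname to break ties (assumption).
order_scores = dat.sort_values(["score", "fname"]).index

# Number the rows 1..n in that order; assigning the Series aligns on the index,
# so each rank lands back on its original row.
dat["rank"] = pd.Series(range(1, len(dat) + 1), index=order_scores)

print(dat["rank"].tolist())
# [5, 3, 4, 8, 7, 6, 1, 2]
```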

DataFrame - Add a new ranking column

One way to do this:

import pandas as pd

Original data:

df = pd.DataFrame({
      'fruit': ['Apple', 'Apple', 'Apple', 'Pear', 'Pear', 'Pear', 'Pear', 'Peach', 'Peach'], 
      'percentage': [23, 99, 50, 45, 87, 67, 70, 93, 75]
})


Create a new 'rank' column by grouping the df dataframe on fruit and ranking the percentage values within each group.

df['rank'] = df.groupby('fruit')['percentage'].rank()

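
As a self-contained sketch of the grouped ranking above, the following prints the ranks as plain lists. The `rank_desc` column (highest percentage first) is an addition for illustration and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "fruit": ["Apple", "Apple", "Apple", "Pear", "Pear", "Pear", "Pear", "Peach", "Peach"],
    "percentage": [23, 99, 50, 45, 87, 67, 70, 93, 75],
})

# Rank within each fruit, lowest percentage = 1 (rank() returns floats).
df["rank"] = df.groupby("fruit")["percentage"].rank()
print(df["rank"].tolist())
# [1.0, 3.0, 2.0, 1.0, 4.0, 2.0, 3.0, 2.0, 1.0]

# Illustrative variant: highest percentage = 1, stored as integers.
df["rank_desc"] = (
    df.groupby("fruit")["percentage"].rank(ascending=False).astype(int)
)
print(df["rank_desc"].tolist())
# [3, 1, 2, 4, 1, 3, 2, 1, 2]
```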

Pandas rank by column value

Here is one way to do it in pandas: group by Auction_ID and call rank() on Bid_Price with ascending=False.

In [68]: df['Auction_Rank'] = df.groupby('Auction_ID')['Bid_Price'].rank(ascending=False)

In [69]: df
Out[69]:
   Auction_ID  Bid_Price  Auction_Rank
0         123          9             1
1         123          7             2
2         123          6             3
3         123          2             4
4         124          3             1
5         124          2             2
6         124          1             3
7         125          1             1
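
The snippet above does not show how `df` was built. A runnable sketch, with the data reconstructed from the printed output, might look like this; the `top_bids` filter at the end is a hypothetical follow-up step, not something the original answer does.

```python
import pandas as pd

# Data reconstructed from the printed output above.
df = pd.DataFrame({
    "Auction_ID": [123, 123, 123, 123, 124, 124, 124, 125],
    "Bid_Price": [9, 7, 6, 2, 3, 2, 1, 1],
})

# Highest bid in each auction gets rank 1; cast to int because rank() returns floats.
df["Auction_Rank"] = (
    df.groupby("Auction_ID")["Bid_Price"].rank(ascending=False).astype(int)
)
print(df["Auction_Rank"].tolist())
# [1, 2, 3, 4, 1, 2, 3, 1]

# Hypothetical follow-up: keep only the winning bid of each auction.
top_bids = df[df["Auction_Rank"] == 1]
print(top_bids["Bid_Price"].tolist())
# [9, 3, 1]
```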

How to make a rank column in R

Using rank and relocate:

library(dplyr)

df1 %>% 
  mutate(across(M1:M2, ~ rank(-.x), .names = "{.col}_rank"),
         M3_rank = rank(M3)) %>% 
  relocate(order(colnames(.)))

   M1 M1_rank  M2 M2_rank  M3 M3_rank
1 400       1 500       1 420       4
2 300       2 200       2 330       3
3 200       3  10       4 230       2
4  50       4 100       3  51       1

If your vector contains duplicate values, you have to choose a method for handling ties. By default, rank() gives tied values their average rank, but you can choose another method such as "first".

Another possibility, which I think is what you want, is to convert the ranks to a factor and then to numeric, so that you get only whole-number ranks with no gaps (a dense ranking) instead of averages.

df1 <- data.frame(M1 = c(400,300, 50, 300))
df1 %>% 
  mutate(M1_rankAverage = rank(-M1),
         M1_rankFirst = rank(-M1, ties.method = "first"),
         M1_unique = as.numeric(as.factor(rank(-M1))))

   M1 M1_rankAverage M1_rankFirst M1_unique
1 400            1.0            1         1
2 300            2.5            2         2
3  50            4.0            4         3
4 300            2.5            3         2
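
For readers working in pandas, the three tie-handling choices above can be sketched with `Series.rank` and its `method` argument. Negating the values mirrors the `rank(-M1)` idiom in the R code; the column names are simply carried over from the R example.

```python
import pandas as pd

m1 = pd.Series([400, 300, 50, 300])
neg = -m1  # rank the negated values so the largest value gets rank 1

ranks = pd.DataFrame({
    "M1": m1,
    "M1_rankAverage": neg.rank(),                             # ties share the average rank
    "M1_rankFirst": neg.rank(method="first").astype(int),     # ties ranked in order of appearance
    "M1_unique": neg.rank(method="dense").astype(int),        # ties share a rank, no gaps
})
print(ranks)
#     M1  M1_rankAverage  M1_rankFirst  M1_unique
# 0  400             1.0             1          1
# 1  300             2.5             2          2
# 2   50             4.0             4          3
# 3  300             2.5             3          2
```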

Add a rank column to a data frame

There is a rank function to help you with that:

transform(df, 
          year.rank = ave(count, year, 
                          FUN = function(x) rank(-x, ties.method = "first")))
  item year count year.rank
1    a 2010     1         3
2    b 2010     4         2
3    c 2010     6         1
4    a 2011     3         2
5    b 2011     8         1
6    c 2011     3         3
7    a 2012     5         3
8    b 2012     7         2
9    c 2012     9         1
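
A rough pandas counterpart to the `ave` + `rank` call above, with the data taken from the printed output. The column is named `year_rank` here (an illustrative choice), and the counts are negated to mirror `rank(-x, ties.method = "first")`.

```python
import pandas as pd

df = pd.DataFrame({
    "item": ["a", "b", "c"] * 3,
    "year": [2010] * 3 + [2011] * 3 + [2012] * 3,
    "count": [1, 4, 6, 3, 8, 3, 5, 7, 9],
})

# Within each year, rank the negated counts so the largest count gets rank 1;
# method="first" breaks ties by order of appearance.
df["year_rank"] = (
    (-df["count"]).groupby(df["year"]).rank(method="first").astype(int)
)
print(df["year_rank"].tolist())
# [3, 2, 1, 2, 1, 3, 3, 2, 1]
```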

In pandas, how to create a new column with a rank according to the mean values of another column

Use groupby + transform for mean and then rank:

df['Average'] = df.groupby('Country')['value'].transform('mean')
df['Rank'] = df['Average'].rank(method='dense', ascending=False)
print (df)
   Country  value    Average  Rank
0       UK     42  42.000000   1.0
1       US      9   9.500000   4.0
2       US     10   9.500000   4.0
3   France     15  15.500000   3.0
4   France     16  15.500000   3.0
5  Germany     17  18.333333   2.0
6  Germany     18  18.333333   2.0
7  Germany     20  18.333333   2.0

Similar solution:

a = df.groupby('Country')['value'].transform('mean')
b = a.rank(method='dense', ascending=False)

df = df.assign(Average=a, Rank=b)
print (df)
   Country  value    Average  Rank
0       UK     42  42.000000   1.0
1       US      9   9.500000   4.0
2       US     10   9.500000   4.0
3   France     15  15.500000   3.0
4   France     16  15.500000   3.0
5  Germany     17  18.333333   2.0
6  Germany     18  18.333333   2.0
7  Germany     20  18.333333   2.0
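
Another way to sketch the same result is to rank the per-country means once and map them back onto the rows, which avoids needing `method='dense'`. The data is reconstructed from the output above; `country_mean` and `country_rank` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["UK", "US", "US", "France", "France", "Germany", "Germany", "Germany"],
    "value": [42, 9, 10, 15, 16, 17, 18, 20],
})

# One mean per country, then one rank per country (highest mean = 1).
country_mean = df.groupby("Country")["value"].mean()
country_rank = country_mean.rank(ascending=False).astype(int)

# Map the per-country results back to every row.
df["Average"] = df["Country"].map(country_mean)
df["Rank"] = df["Country"].map(country_rank)
print(df["Rank"].tolist())
# [1, 4, 4, 3, 3, 2, 2, 2]
```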

How to add a ranking column for this dataset?

This should work:

library(dplyr)
df %>% group_by(country) %>% mutate(rank = rank(desc(profit)))
# A tibble: 12 x 4
# Groups:   country [4]
   comp_name country profit  rank
   <chr>     <chr>    <dbl> <dbl>
 1 A         US         100     3
 2 B         UK         125     2
 3 C         France     150     1
 4 D         Germany    165     1
 5 E         US         150     1
 6 F         UK         110     3
 7 G         France     110     2
 8 H         Germany    125     2
 9 J         US         130     2
10 K         UK         250     1
11 L         France      95     3
12 M         Germany    100     3

How do I create a 'Rank' column in pandas?

df['Rank'] = df.sort_values(by=['col1', 'col2'], ascending=False) \
               .reset_index() \
               .sort_values('index') \
               .index + 1

This code goes through these steps:

1. Sorts the data frame into a new temporary one.
2. Resets the index, keeping the old one as a new column called 'index'. Make sure you don't already have a column named 'index': if you do, the new column is called 'level_0' instead, and if you have both 'index' and 'level_0', the code throws an exception.
3. Sorts the temporary dataframe by the old index, so it is in the original order again.
4. Uses the new index (which is in the rank order we want) as the new 'Rank' column, plus 1 because the question seems to want ranks that start at 1, not 0.
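
The steps above can be traced on a small frame. The data below is hypothetical (the original question's data is not shown), and it assumes the frame has a default 0..n-1 index and no existing column named 'index'.

```python
import pandas as pd

# Hypothetical data with a default RangeIndex.
df = pd.DataFrame({"col1": [2, 1, 2, 3], "col2": [5, 9, 7, 1]})

# Steps 1-2: sort descending, then keep the old row labels in an 'index' column.
temp = df.sort_values(by=["col1", "col2"], ascending=False).reset_index()

# Step 3: restore the original row order; the new index now holds each row's
# zero-based position in the sorted frame.
temp = temp.sort_values("index")

# Step 4: shift to 1-based ranks and attach them positionally.
df["Rank"] = temp.index + 1
print(df["Rank"].tolist())
# [3, 4, 2, 1]
```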

Warning while adding rank column to a pandas dataframe

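The warning (typically a SettingWithCopyWarning) appears because a filtered dataframe assigned to a new variable, even when selected with loc, may still be linked to the original dataframe, so pandas cannot tell whether adding a column should also affect the original. To avoid the warning, make an explicit copy like this: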

min_100k_volume_df = ohlcv_m30_groupby_df[filter_condition].copy()
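
A minimal sketch of the full pattern, assuming a hypothetical `volume` column and a 100,000 threshold (neither appears in the original answer); only the variable names are taken from the line above.

```python
import pandas as pd

# Hypothetical stand-in for the original dataframe.
ohlcv_m30_groupby_df = pd.DataFrame({
    "symbol": ["AAA", "BBB", "CCC", "DDD"],
    "volume": [50_000, 150_000, 300_000, 120_000],
})
filter_condition = ohlcv_m30_groupby_df["volume"] >= 100_000

# .copy() makes the filtered frame independent of the original.
min_100k_volume_df = ohlcv_m30_groupby_df[filter_condition].copy()

# Adding the rank column now only touches the copy.
min_100k_volume_df["rank"] = (
    min_100k_volume_df["volume"].rank(ascending=False).astype(int)
)
print(min_100k_volume_df["rank"].tolist())
# [2, 1, 3]
```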

ranking dataframe using two columns in R

You can use data.table::frank or dplyr::min_rank:

`data.table::frank`

library(data.table)
dt$Rank <- frank(dt, B, A, ties.method = "min")
dt
  A B Rank
1 1 1    1
2 2 1    2
3 2 1    2
4 4 4    5
5 5 3    4

`dplyr::min_rank`

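library(dplyr)
# Note: paste() builds strings, so the comparison is alphabetical; this only
# orders correctly when all values have the same number of digits.
mutate(dt, Rank = min_rank(paste(B,A)))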
  A B Rank
1 1 1    1
2 2 1    2
3 2 1    2
4 4 4    5
5 5 3    4

Data

dt <- data.frame(A = c(1,2,2,4,5), B = c(1,1,1,4,3))
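
In pandas, a comparable two-column "min" ranking can be sketched by sorting on both columns and giving every tied (B, A) pair the position of its first occurrence. This is one possible approach rather than a direct equivalent of `frank`; the helper column `pos` is illustrative.

```python
import pandas as pd

dt = pd.DataFrame({"A": [1, 2, 2, 4, 5], "B": [1, 1, 1, 4, 3]})

# Sort by B, then A, and number the rows 1..n in that order.
ordered = dt.sort_values(["B", "A"])
ordered["pos"] = list(range(1, len(ordered) + 1))

# Tied (B, A) pairs share the smallest position in their group ("min" ties);
# the assignment aligns on the original row labels.
dt["Rank"] = ordered.groupby(["B", "A"])["pos"].transform("min")
print(dt["Rank"].tolist())
# [1, 2, 2, 5, 4]
```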
