Sort a Factor Based on Value in One or More Other Columns

Here's a reproducible sample, with solution:

set.seed(0)
a = sample(1:20, replace=FALSE)
b = sample(1:20, replace=FALSE)
f = as.factor(letters[1:20])

> a
[1] 18 6 7 10 15 4 13 14 8 20 1 2 9 5 3 16 12 19 11 17
> b
[1] 16 18 4 12 3 5 6 1 15 10 19 17 9 11 2 8 20 7 13 14
> f
[1] a b c d e f g h i j k l m n o p q r s t
Levels: a b c d e f g h i j k l m n o p q r s t

Now for the new factor:

fn = factor(f, levels=unique(f[order(a,b,f)]), ordered=TRUE)

> fn
[1] a b c d e f g h i j k l m n o p q r s t
20 Levels: k < l < o < f < n < b < c < i < m < d < s < q < g < h < e < ... < j

The levels are sorted on 'a', then on 'b', and finally on 'f' itself (in this example 'a' has no repeated values, so the tie-breakers 'b' and 'f' never come into play).

Changing order of factor levels based on lookup in other column

Try this:

factor(df$val, levels = unique(df$val[order(df$id)]))
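
As a minimal sketch of how this one-liner behaves, the example below builds a small data frame and applies the same idea. The contents of df are made up for illustration; only the column names val and id come from the line above.

```r
# Hypothetical data: 'val' repeats, and 'id' defines the desired level order.
df <- data.frame(
  val = c("low", "high", "mid", "low", "high"),
  id  = c(1, 3, 2, 1, 3),
  stringsAsFactors = FALSE
)

# Sort the values by id, then keep the first occurrence of each value.
lev <- unique(df$val[order(df$id)])

# Use that ordering as the factor levels; the data itself is not reordered.
df$val <- factor(df$val, levels = lev)

levels(df$val)
# [1] "low"  "mid"  "high"
```

The unique() call matters here because factor() expects each level to appear only once.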

Set levels of a factor based on numeric value of another column

The answer, provided by @RichScriven, is that I was passing the reordered rows of the whole data.frame (df) as the levels instead of the reordered values of the column I wanted (df$ftor). The indexing was also wrong. So ultimately I replaced this:

df$ftor <- factor(df$ftor, levels=df[order(df$order_ID),], ordered=TRUE)

with this:

df$ftor <- factor(df$ftor, levels=df$ftor[order(df$order_ID)], ordered=TRUE)
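
The corrected line assumes every value of df$ftor occurs once. If values can repeat, recent versions of R reject duplicated levels, so wrapping the reordered values in unique() is a safer variant. A small sketch with made-up data (only the column names ftor and order_ID come from the answer above):

```r
# Hypothetical data in which the value "a" appears twice.
df <- data.frame(
  ftor     = c("b", "a", "c", "a"),
  order_ID = c(2, 3, 1, 3),
  stringsAsFactors = FALSE
)

# Reorder the column by order_ID, then drop repeated values.
lev <- unique(df$ftor[order(df$order_ID)])

df$ftor <- factor(df$ftor, levels = lev, ordered = TRUE)

levels(df$ftor)
# [1] "c" "b" "a"
```

Because the factor is ordered, comparisons such as df$ftor[1] < df$ftor[2] now follow the order_ID ranking rather than alphabetical order.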

Reorder a factor based on the ratio of the group sums of two columns - grouping by the factor to be reordered

Base R solution (using dat as your data.frame)

stu.tea <- names(sort(by(
  dat[c("Nstudents","Nteachers")], dat["District"],
  function(x) do.call("/", as.list(colSums(x)))
)))
#[1] "B" "A"

dat$District <- factor(dat$District,levels=stu.tea)
dat$District
#[1] A A A B B B
#Levels: B A
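
The same idea can be sketched with split() and sapply(), which some readers may find easier to follow than by() with do.call(). The values in dat below are invented for illustration; only the column names come from the answer above, and this is a swapped-in variant rather than the original author's code.

```r
# Hypothetical data: two districts with student and teacher counts.
dat <- data.frame(
  District  = c("A", "A", "A", "B", "B", "B"),
  Nstudents = c(30, 40, 50, 20, 25, 15),
  Nteachers = c(2, 2, 2, 2, 2, 2),
  stringsAsFactors = FALSE
)

# Ratio of the group sums, one value per district.
ratio <- sapply(
  split(dat, dat$District),
  function(d) sum(d$Nstudents) / sum(d$Nteachers)
)

# Levels ordered by increasing students-per-teacher ratio.
stu.tea <- names(sort(ratio))

dat$District <- factor(dat$District, levels = stu.tea)
levels(dat$District)
# [1] "B" "A"
```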

Sort data frame column by factor

order takes multiple arguments, and it does just what you want:

with(score, score[order(sex, y, x),])
## x y sex
## 3 SUSAN 6.636370 F
## 5 EMMA 6.873445 F
## 9 VIOLET 8.539329 F
## 6 LEONARD 6.082038 M
## 2 TOM 7.812380 M
## 8 MATT 8.248374 M
## 4 LARRY 8.424665 M
## 7 TIM 8.754023 M
## 1 MARK 8.956372 M

How can I sort a dataframe by a predetermined order of factor levels in R?

We can specify the levels of 'group' as category_order and then use that factor in arrange:

library(dplyr)
df1 <- df %>%
  arrange(factor(group, levels = category_order))
df1
# group value
#1 tree 50
#2 house 2
#3 lake 1
#4 human 5

Or using fct_relevel

library(forcats)
df %>%
  arrange(fct_relevel(group, category_order))
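
For comparison, here is a base R sketch that needs no packages. It uses match() to turn each group into its position in category_order and sorts on that. The contents of df and category_order below are assumptions, chosen so the result matches the output shown earlier.

```r
# Hypothetical input; the real df and category_order are not shown in the question.
df <- data.frame(
  group = c("house", "human", "lake", "tree"),
  value = c(2, 5, 1, 50),
  stringsAsFactors = FALSE
)
category_order <- c("tree", "house", "lake", "human")

# Position of each group within the predetermined order, then sort by it.
df1 <- df[order(match(df$group, category_order)), ]

df1$group
# [1] "tree"  "house" "lake"  "human"
```

Groups that are missing from category_order get NA from match() and are placed last by order().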

Creating a column with factor variables conditional on multiple other columns?

Here's an alternative using case_when -

library(dplyr)

long_fused %>%
  mutate(max = do.call(pmax, select(., -Threshold)),
         # If your data has no Threshold column, use . directly:
         # mutate(max = do.call(pmax, .),
         Threshold = case_when(between(max, 5, 10) ~ 'Low',
                               between(max, 11, 15) ~ 'Medium',
                               TRUE ~ 'High'))

# CNV.Gain Amplification Homozygous.Deletion.Frequency
#1 3 5 10
#2 0 0 11
#3 7 16 25

# Heterozygous.Deletion.Frequency max Threshold
#1 0 10 Low
#2 8 11 Medium
#3 0 25 High

ordering data frame based on factor levels indices in r

You could use dplyr as follows. This is a variant of another answer that avoids stringr.

library(dplyr)
df %>%
  arrange(as.numeric(gsub("\\D+", "", string)))

## Name string value
## 1 BB a8 0.35120965
## 2 DD a8 0.54526648
## 3 BB a11 -0.90101120
## 4 AA a11 1.65637910
## 5 DD a45 0.42240082
## 6 CC a45 -0.30438594
## 7 AA a120 -0.05781699
## 8 AA a120 -1.83615123
## 9 DD a140 -1.82698618

You can also further sort by Name in addition to string.

df %>%
  arrange(
    as.numeric(gsub("\\D+", "", string)),
    Name
  )
## Name string value
## 1 BB a8 0.35120965
## 2 DD a8 0.54526648
## 3 AA a11 1.65637910
## 4 BB a11 -0.90101120
## 5 CC a45 -0.30438594
## 6 DD a45 0.42240082
## 7 AA a120 -0.05781699
## 8 AA a120 -1.83615123
## 9 DD a140 -1.82698618
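
The key step is gsub("\\D+", "", string), which strips every non-digit character so the remainder can be converted to a number. A base R sketch of the same sort, using a small made-up data frame (only the column names Name and string come from the answer above):

```r
# Hypothetical data with codes that differ in digit count.
df <- data.frame(
  Name   = c("DD", "AA", "BB", "AA"),
  string = c("a45", "a120", "a8", "a11"),
  stringsAsFactors = FALSE
)

# Strip non-digits and convert: "a45" -> 45, "a120" -> 120, and so on.
num <- as.numeric(gsub("\\D+", "", df$string))

# Sort by the numeric part first, then by Name to break ties.
sorted <- df[order(num, df$Name), ]

sorted$string
# [1] "a8"   "a11"  "a45"  "a120"
```

Sorting on the extracted number avoids the character-by-character comparison that would otherwise place "a120" before "a45".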

Reorder factor levels within group

To reorder the factor levels you can use forcats (part of the tidyverse), and do something like this...

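library(dplyr)    # for %>% and mutate()
library(forcats)  # for fct_reorder()
# Negate value outside group1 so those levels sort in descending order
df2 <- df %>% mutate(a_factor = fct_reorder(a_factor,
                                            value*(-1 + 2 * (group=="group1"))))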

levels(df2$a_factor)
[1] "f" "e" "d" "a" "b" "c"

This does not rearrange the dataframe itself...

df2
a_factor group value
1 a group1 1
2 b group1 2
3 c group1 3
4 d group2 4
5 e group2 5
6 f group2 6

Forcats solution for reordering based on another column

We can change .fun from its default, median, to I, i.e. use the value as is. This works when each level maps to a single value, because .fun must return one value per level.

library(dplyr)
library(forcats)
df %>%
  mutate(ACADEMIC_PERIOD_DESC = fct_reorder(ACADEMIC_PERIOD_DESC,
                                            as.integer(ACADEMIC_PERIOD), .fun = I))

