
# Sort a Factor Based on Value in One or More Other Columns

## Sort a factor based on value in one or more other columns

Here's a reproducible sample, with solution:

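```r
set.seed(0)
a = sample(1:20, replace = FALSE)
b = sample(1:20, replace = FALSE)
f = as.factor(letters[1:20])

# Output as printed in the original answer. sample() results for a given
# seed changed in R 3.6.0, so newer versions print different values.
a
#  [1] 18  6  7 10 15  4 13 14  8 20  1  2  9  5  3 16 12 19 11 17
b
#  [1] 16 18  4 12  3  5  6  1 15 10 19 17  9 11  2  8 20  7 13 14
f
#  [1] a b c d e f g h i j k l m n o p q r s t
# Levels: a b c d e f g h i j k l m n o p q r s t
```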

Now for the new factor:

```r
# Take the level order from f after sorting by a, then b, then f
fn = factor(f, levels = unique(f[order(a, b, f)]), ordered = TRUE)
fn
#  [1] a b c d e f g h i j k l m n o p q r s t
# 20 Levels: k < l < o < f < n < b < c < i < m < d < s < q < g < h < e < ... < j
```

The levels are sorted on `a`, then `b`, and finally `f` itself, although in this example `a` has no repeated values, so the tie-breakers never come into play.
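To see the tie-breaking at work, here is a minimal sketch with made-up vectors (the values below are hypothetical, not from the original answer) in which `a` does contain repeats, so `b` decides the order within each tie:

```r
# Hypothetical data: 'a' has ties, so 'b' decides the order within each tie
a <- c(2, 1, 2, 1)
b <- c(1, 2, 0, 1)
f <- factor(c("w", "x", "y", "z"))

ord <- order(a, b, f)   # row indices sorted by a, then b, then f
fn  <- factor(f, levels = unique(f[ord]), ordered = TRUE)
levels(fn)
# [1] "z" "x" "y" "w"
```

Both `x` and `z` have `a == 1`, and `z` comes first because its `b` value is smaller.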

## Changing order of factor levels based on lookup in other column

Try this:

```r
factor(df$val, levels = unique(df$val[order(df$id)]))
```
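A small self-contained sketch of that one-liner follows. The data frame is invented for illustration; only the column names `val` and `id` come from the answer.

```r
# Hypothetical data: 'id' is the lookup column that defines the level order
df <- data.frame(val = c("low", "high", "mid", "high"),
                 id  = c(1, 3, 2, 3))

# Sort the values by id, then keep the first occurrence of each value
df$val <- factor(df$val, levels = unique(df$val[order(df$id)]))
levels(df$val)
# [1] "low"  "mid"  "high"
```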

## Set levels of a factor based on numeric value of another column

The answer, provided by @RichScriven, is that I was passing the whole data.frame (`df`) as the levels rather than the column I wanted (`df$ftor`), and the indexing was wrong as well. So ultimately I replaced this:

```r
df$ftor <- factor(df$ftor, levels=df[order(df$order_ID),], ordered=TRUE)
```

with this:

```r
df$ftor <- factor(df$ftor, levels=df$ftor[order(df$order_ID)], ordered=TRUE)
```
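One caveat, offered as an assumption about the data rather than something stated in the answer: if `df$ftor` contains repeated values, the vector passed to `levels` will contain duplicates, which `factor()` rejects in recent versions of R. Wrapping it in `unique()` avoids that. A sketch with invented data:

```r
# Hypothetical data: "x" appears twice, so the sorted column has duplicates
df <- data.frame(ftor     = c("x", "z", "x", "y"),
                 order_ID = c(2, 1, 2, 3))

# unique() keeps the first occurrence of each value in sorted order
lv <- unique(df$ftor[order(df$order_ID)])
df$ftor <- factor(df$ftor, levels = lv, ordered = TRUE)
levels(df$ftor)
# [1] "z" "x" "y"
```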

## Reorder a factor based on the ratio of the group sums of two columns - grouping by the factor to be reordered

Base R solution (using `dat` as your data.frame):

```r
# Ratio of the column sums (Nstudents / Nteachers) per District, sorted ascending
stu.tea <- names(sort(by(
  dat[c("Nstudents", "Nteachers")], dat["District"],
  function(x) do.call("/", as.list(colSums(x)))
)))
stu.tea
#[1] "B" "A"

dat$District <- factor(dat$District, levels = stu.tea)
dat$District
#[1] A A A B B B
#Levels: B A
```
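The same idea can be sketched with `split()` and `sapply()` instead of `by()`, which some readers may find easier to follow. The numbers below are made up; only the column names come from the answer.

```r
# Hypothetical data chosen so that district B has the lower ratio
dat <- data.frame(District  = c("A", "A", "A", "B", "B", "B"),
                  Nstudents = c(100, 120, 80, 50, 60, 40),
                  Nteachers = c(5, 6, 4, 5, 5, 5))

# Per-district ratio of summed students to summed teachers
ratio <- sapply(split(dat, dat$District),
                function(d) sum(d$Nstudents) / sum(d$Nteachers))

# Levels in ascending order of that ratio
dat$District <- factor(dat$District, levels = names(sort(ratio)))
levels(dat$District)
# [1] "B" "A"
```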

## Sort data frame column by factor

`order` takes multiple arguments, and it does just what you want:

```r
with(score, score[order(sex, y, x),])
##         x        y sex
## 3   SUSAN 6.636370   F
## 5    EMMA 6.873445   F
## 9  VIOLET 8.539329   F
## 6 LEONARD 6.082038   M
## 2     TOM 7.812380   M
## 8    MATT 8.248374   M
## 4   LARRY 8.424665   M
## 7     TIM 8.754023   M
## 1    MARK 8.956372   M
```
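If one of the keys should run in the opposite direction, a numeric column can simply be negated inside `order`. The sketch below uses a small invented `score` data frame (not the one from the question) and sorts by `sex` ascending, then `y` descending, then `x` as a final tie-breaker.

```r
# Hypothetical data with a tie in y for the two "M" rows
score <- data.frame(x   = c("ANN", "BOB", "CAT", "DAN"),
                    y   = c(7.5, 8.1, 6.2, 8.1),
                    sex = c("F", "M", "F", "M"),
                    stringsAsFactors = FALSE)

# Negating y reverses its direction; x breaks the remaining tie
sorted <- score[order(score$sex, -score$y, score$x), ]
sorted$x
# [1] "ANN" "CAT" "BOB" "DAN"
```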

## How can I sort a dataframe by a predetermined order of factor levels in R?

We can set the `levels` of `group` to `category_order` and then use that factor in `arrange`:

```r
library(dplyr)
df1 <- df %>%
  arrange(factor(group, levels = category_order))
df1
#  group value
#1  tree    50
#2 house     2
#3  lake     1
#4 human     5
```

Or using `fct_relevel`:

```r
library(dplyr)
library(forcats)
df %>%
  arrange(fct_relevel(group, category_order))
```
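For comparison, a base R sketch of the same sort using `match()`. The input rows and the contents of `category_order` are reconstructed from the printed output above, so treat them as assumptions.

```r
# Assumed input, inferred from the output shown in the answer
df <- data.frame(group = c("house", "lake", "tree", "human"),
                 value = c(2, 1, 50, 5),
                 stringsAsFactors = FALSE)
category_order <- c("tree", "house", "lake", "human")

# match() gives each row its position in category_order; order() sorts by it
df1 <- df[order(match(df$group, category_order)), ]
df1$group
# [1] "tree"  "house" "lake"  "human"
```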

## Creating a column with factor variables conditional on multiple other columns?

Here's an alternative using `case_when` -

```r
library(dplyr)
long_fused %>%
  mutate(max = do.call(pmax, select(., -Threshold)),
  #If you don't have Threshold column in your data just use .
  #mutate(max = do.call(pmax, .),
         Threshold = case_when(between(max, 5, 10) ~ 'Low',
                               between(max, 11, 15) ~ 'Medium',
                               TRUE ~ 'High'))
#  CNV.Gain Amplification Homozygous.Deletion.Frequency
#1        3             5                            10
#2        0             0                            11
#3        7            16                            25
#  Heterozygous.Deletion.Frequency max Threshold
#1                               0  10       Low
#2                               8  11    Medium
#3                               0  25      High
```
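If an actual factor with ordered levels is wanted, base R's `cut()` is a possible substitute for `case_when`. This is a different technique from the answer's, and the bins are not identical: with the breaks below, a row maximum under 5 lands in `Low`, whereas the `case_when` version sends it to `High`. The data frame is rebuilt from the printed output.

```r
# Input reconstructed from the output shown in the answer
long_fused <- data.frame(CNV.Gain                        = c(3, 0, 7),
                         Amplification                   = c(5, 0, 16),
                         Homozygous.Deletion.Frequency   = c(10, 11, 25),
                         Heterozygous.Deletion.Frequency = c(0, 8, 0))

# Row-wise maximum across all columns
mx <- do.call(pmax, long_fused)

# Right-closed bins: (-Inf, 10] Low, (10, 15] Medium, (15, Inf) High
long_fused$Threshold <- cut(mx, breaks = c(-Inf, 10, 15, Inf),
                            labels = c("Low", "Medium", "High"))
as.character(long_fused$Threshold)
# [1] "Low"    "Medium" "High"
```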

## Ordering data frame based on factor levels indices in R

You could use `dplyr` as follows. This is a variant of another answer, without using `stringr`.

```r
library(dplyr)
# Strip the non-digits from 'string' and sort on the remaining number
df %>%
  arrange(as.numeric(gsub("\\D+", "", string)))
##   Name string       value
## 1   BB     a8  0.35120965
## 2   DD     a8  0.54526648
## 3   BB    a11 -0.90101120
## 4   AA    a11  1.65637910
## 5   DD    a45  0.42240082
## 6   CC    a45 -0.30438594
## 7   AA   a120 -0.05781699
## 8   AA   a120 -1.83615123
## 9   DD   a140 -1.82698618
```

You can also sort by `Name` as a second key, after `string`.

```r
df %>%
  arrange(
    as.numeric(gsub("\\D+", "", string)),
    Name
  )
##   Name string       value
## 1   BB     a8  0.35120965
## 2   DD     a8  0.54526648
## 3   AA    a11  1.65637910
## 4   BB    a11 -0.90101120
## 5   CC    a45 -0.30438594
## 6   DD    a45  0.42240082
## 7   AA   a120 -0.05781699
## 8   AA   a120 -1.83615123
## 9   DD   a140 -1.82698618
```
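The same numeric-key sort can be sketched in base R with `order()`. The four rows below are invented for illustration; the column names follow the answer.

```r
# Hypothetical data: a plain alphabetical sort would put "a11" before "a8"
df <- data.frame(Name   = c("AA", "BB", "DD", "BB"),
                 string = c("a120", "a8", "a45", "a11"),
                 stringsAsFactors = FALSE)

# Numeric part of 'string' is the primary key, Name the secondary key
key <- as.numeric(gsub("\\D+", "", df$string))
df_sorted <- df[order(key, df$Name), ]
df_sorted$string
# [1] "a8"   "a11"  "a45"  "a120"
```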

## Reorder factor levels within group

To reorder the factor levels you can use `forcats` (part of the `tidyverse`), and do something like this...

```r
library(dplyr)
library(forcats)
# The sort key is +value for group1 and -value for every other group
df2 <- df %>%
  mutate(a_factor = fct_reorder(a_factor,
                                value * (-1 + 2 * (group == "group1"))))
levels(df2$a_factor)
# [1] "f" "e" "d" "a" "b" "c"
```

This does not rearrange the dataframe itself...

```r
df2
#   a_factor  group value
# 1        a group1     1
# 2        b group1     2
# 3        c group1     3
# 4        d group2     4
# 5        e group2     5
# 6        f group2     6
```
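If the rows should follow the new level order as well, one option is to sort on the factor afterwards. The sketch below is a base R analogue, using `reorder()` in place of `fct_reorder()` (note that `reorder()` summarises with the mean by default, not the median); the input is rebuilt from the printed `df2`.

```r
# Input reconstructed from the output shown above
df <- data.frame(a_factor = factor(letters[1:6]),
                 group    = rep(c("group1", "group2"), each = 3),
                 value    = 1:6)

# Same sort key as in the answer: +value for group1, -value for group2
key <- df$value * (-1 + 2 * (df$group == "group1"))
df$a_factor <- reorder(df$a_factor, key)

# order() on a factor sorts by level position, so rows follow the new levels
df_sorted <- df[order(df$a_factor), ]
as.character(df_sorted$a_factor)
# [1] "f" "e" "d" "a" "b" "c"
```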

## Forcats solution for reordering based on another column

We can change `.fun` from the default `median` to `I`, i.e. use each value as is:

```r
library(dplyr)
library(forcats)
df %>%
  mutate(ACADEMIC_PERIOD_DESC = fct_reorder(ACADEMIC_PERIOD_DESC,
                                            as.integer(ACADEMIC_PERIOD), .fun = I))
```
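A base R sketch of the same reordering, which does not rely on a summary function at all. The period codes and descriptions are hypothetical; only the column names come from the answer.

```r
# Hypothetical data: alphabetical order (Fall, Spring, Winter) is not chronological
df <- data.frame(
  ACADEMIC_PERIOD_DESC = c("Winter 2020", "Fall 2019", "Spring 2020", "Fall 2019"),
  ACADEMIC_PERIOD      = c("202010", "201940", "202020", "201940"),
  stringsAsFactors = FALSE
)

# Sort the descriptions by the integer period code, keep first occurrences
ord <- order(as.integer(df$ACADEMIC_PERIOD))
df$ACADEMIC_PERIOD_DESC <- factor(df$ACADEMIC_PERIOD_DESC,
                                  levels = unique(df$ACADEMIC_PERIOD_DESC[ord]))
levels(df$ACADEMIC_PERIOD_DESC)
# [1] "Fall 2019"   "Winter 2020" "Spring 2020"
```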