Order a mixed vector (numbers with letters)
> library(gtools)
> mixedsort(alph)
[1] "7" "8" "9" "10a" "10b" "10c" "11a" "11b" "11c" "12"
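The snippet above never defines alph. Judging from the row order of the data frame printed further down, it was presumably the vector below; that reconstruction is an assumption, not something the original answer states. The sketch contrasts plain sort(), which compares strings character by character, with the order mixedsort returns.

```r
# Assumed definition of alph, reconstructed from the printed data frame
alph <- c("7", "10a", "10b", "10c", "8", "9", "11c", "11b", "11a", "12")

# Plain sort() compares character by character, so "10a" lands before "7"
plain <- sort(alph)
print(plain)

# mixedsort() compares the embedded numbers numerically (needs gtools installed)
if (requireNamespace("gtools", quietly = TRUE)) {
  print(gtools::mixedsort(alph))
}
```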
To sort a data.frame, use mixedorder instead:
> mydf <- data.frame(alph, USArrests[seq_along(alph),])
> mydf[mixedorder(mydf$alph),]
alph Murder Assault UrbanPop Rape
Alabama 7 13.2 236 58 21.2
California 8 9.0 276 91 40.6
Colorado 9 7.9 204 78 38.7
Alaska 10a 10.0 263 48 44.5
Arizona 10b 8.1 294 80 31.0
Arkansas 10c 8.8 190 50 19.5
Florida 11a 15.4 335 80 31.9
Delaware 11b 5.9 238 72 15.8
Connecticut 11c 3.3 110 77 11.1
Georgia 12 17.4 211 60 25.8
mixedorder on multiple vectors (columns)
Apparently mixedorder cannot handle multiple vectors. I have written a function that circumvents this by converting every character vector to a factor whose levels are ordered with mixedsort, and then passing all vectors on to the standard order function.
multi.mixedorder <- function(..., na.last = TRUE, decreasing = FALSE){
    do.call(order, c(
        lapply(list(...), function(l){
            if(is.character(l)){
                # Levels follow mixedsort, so order() ranks the strings naturally
                factor(l, levels=mixedsort(unique(l)))
            } else {
                l
            }
        }),
        list(na.last = na.last, decreasing = decreasing)
    ))
}
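The function only makes a difference when an earlier key has ties, so that the character key is actually used to break them. Below is a small sketch with made-up vectors grp and id (both hypothetical, not from the question). The function is repeated so the snippet runs on its own, and gtools is assumed to be installed.

```r
# Same function as above, with gtools called by namespace
multi.mixedorder <- function(..., na.last = TRUE, decreasing = FALSE) {
  keys <- lapply(list(...), function(l) {
    if (is.character(l)) factor(l, levels = gtools::mixedsort(unique(l))) else l
  })
  do.call(order, c(keys, list(na.last = na.last, decreasing = decreasing)))
}

# Hypothetical data: grp has ties, id mixes digits and letters
grp <- c(1, 1, 1, 2, 2)
id  <- c("10a", "9", "10b", "2", "11")

# Ties in grp are broken by character comparison of id
plain_ord <- order(grp, id)
print(id[plain_ord])

# Ties in grp are broken by the numbers embedded in id
if (requireNamespace("gtools", quietly = TRUE)) {
  mixed_ord <- multi.mixedorder(grp, id)
  print(id[mixed_ord])
}
```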
However, in your particular case multi.mixedorder gets you the same result as the standard order, since V2 is numeric and has no ties, so V3 is never consulted.
df <- data.frame(
V1 = c("A","A","B","B","C","C","D","D","E","E"),
V2 = 19:10,
V3 = alph,
stringsAsFactors = FALSE)
df[multi.mixedorder(df$V2, df$V3),]
V1 V2 V3
10 E 10 12
9 E 11 11a
8 D 12 11b
7 D 13 11c
6 C 14 9
5 C 15 8
4 B 16 10c
3 B 17 10b
2 A 18 10a
1 A 19 7
Notice that:
- 19:10 is equivalent to c(19:10). c means concatenate, that is, make one long vector out of many short ones. In your case you only have one vector (19:10), so there is nothing to concatenate. For V1, however, you have 10 vectors of length 1, so there you do need c, as you already do.
- You need stringsAsFactors=FALSE so that V1 and V3 are not converted to (incorrectly sorted) factors, which was the default before R 4.0.0.
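Both points can be checked in a few lines of base R. The vector x below is a made-up example, not data from the question.

```r
# 19:10 is already a vector, so wrapping it in c() changes nothing
same <- identical(19:10, c(19:10))
print(same)

# Converting strings to a factor sorts the levels as text, not as numbers
x <- c("7", "10a", "8")
f <- factor(x)
print(levels(f))

# order() on a factor follows the level order, so "10a" is ranked first
print(x[order(f)])
```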
How to control the order of a variable mixed with string and numbers in R
Here's one possible way within dplyr:
df %>%
arrange(nchar(x), x)
x y
1 S1 a
2 S2 b
3 S3 c
4 S4 d
5 S5 e
6 S6 f
7 S7 g
8 S8 h
9 S9 i
10 S10 j
11 S11 k
12 S12 l
13 S13 m
14 S14 n
15 S15 o
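Sorting by string length first works here because every value has the same one-letter prefix and no leading zeros. A base R sketch of the same idea follows, with a made-up second vector y showing where it breaks down.

```r
# Same idea as arrange(nchar(x), x), written with base order()
x <- c("S10", "S2", "S1", "S11", "S9")
sorted_x <- x[order(nchar(x), x)]
print(sorted_x)

# Hypothetical counter-example: with different prefixes, length wins over the letter
y <- c("A10", "B2")
sorted_y <- y[order(nchar(y), y)]
print(sorted_y)
```

In the second case "B2" comes before "A10", so this shortcut is best kept for columns with a single fixed prefix.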
Order vector in R: Letter with number sorts funny
We can use mixedsort from gtools. According to ?mixedsort:
These functions sort or order character strings containing embedded numbers so that the numbers are numerically sorted rather than sorted by character value.
library(gtools)
mixedsort(v1)
#[1] "r_1" "r_2" "r_10"
The reason for the unexpected order is that v1 is a character vector, not a numeric one, so it is sorted by character value.
data
v1 <- c("r_1", "r_2", "r_10")
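For comparison, here is a base R sketch that shows the default order and one way to get the numeric order without gtools, assuming every element has the form r_ followed by a number.

```r
v1 <- c("r_1", "r_2", "r_10")

# Default sort compares strings character by character: "r_10" precedes "r_2"
default_sorted <- sort(v1)
print(default_sorted)

# Strip the assumed "r_" prefix, convert to numeric, and order by that
num <- as.numeric(sub("^r_", "", v1))
numeric_sorted <- v1[order(num)]
print(numeric_sorted)
```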
How to do a sort of mixed values in R
It's slightly ugly, but you could just split the data frame in two using filter statements, arrange each section individually, and then bind them back together:
df <- bind_rows(
  df %>%
    filter(!is.na(as.numeric(level))) %>%
    arrange(variable, as.numeric(level)),
  df %>%
    filter(is.na(as.numeric(level))) %>%
    arrange(variable, level))
Gives you:
# A tibble: 6 x 2
variable level
<chr> <chr>
1 comp_ded 500
2 comp_ded 750
3 comp_ded 1000
4 channel DIR
5 channel EA
6 channel IA
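The same split can probably be expressed as a single order() call in base R. This is a sketch rather than the answer's method; the input data frame is assumed from the printed result, since the question's original data is not shown here.

```r
# Example data assumed from the printed result, in shuffled order
df <- data.frame(
  variable = c("channel", "comp_ded", "channel", "comp_ded", "channel", "comp_ded"),
  level    = c("IA", "1000", "DIR", "500", "EA", "750"),
  stringsAsFactors = FALSE
)

# Non-numeric levels become NA; suppressWarnings() hides the coercion warning
num <- suppressWarnings(as.numeric(df$level))

# Numeric rows first, then by variable, then numerically, then alphabetically
ord <- order(is.na(num), df$variable, num, df$level)
res <- df[ord, ]
print(res)
```

Putting is.na(num) first keeps the numeric block ahead of the text block, which is what binding the two filtered pieces achieves.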
Sort a dataframe based on a character column containing letters followed by numbers in R
You can try using something like this that does numeric day sorting:
Day <- c("Day1","Day20","Day5","Day10")
A <- c(5,7,2,0)
B <- c(15,12,16,30)
df <- data.frame(Day,A,B, stringsAsFactors = FALSE)
df$DayNum <- as.numeric(gsub('Day', '', df$Day))
df <- df[order(df$DayNum), ]
Output as follows:
df
Day A B DayNum
1 Day1 5 15 1
3 Day5 2 16 5
4 Day10 0 30 10
2 Day20 7 12 20
The extra column is only there to show each step in full detail. You can avoid creating it by doing the following:
df <- df[order(as.numeric(substr(df$Day, 4, nchar(df$Day)))), ]
The rows come out in the same order as above.
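If the prefix is not always Day, the number can also be pulled out with a regular expression. This is a minimal sketch, assuming each value contains exactly one run of digits; the helper name order_by_number is made up for illustration.

```r
# Hypothetical helper: order strings by the digits they contain
order_by_number <- function(x) {
  order(as.numeric(gsub("[^0-9]", "", x)))  # drop every non-digit character
}

Day <- c("Day1", "Day20", "Day5", "Day10")
sorted_days <- Day[order_by_number(Day)]
print(sorted_days)
```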