Convert from K to thousand (1000) in R
Another way similar to what @RichardScriven has suggested:
x <- c("1.4k", "14k")
as.numeric(sub("k", "e3", x, fixed = TRUE))
## [1] 1400 14000
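The same scientific-notation trick can probably be stretched to cover millions and upper-case suffixes. A minimal sketch, assuming the suffix is always the last character and that only "k" and "m" occur (the sample vector is made up for illustration):

```r
# Hypothetical input mixing suffixes, case, and a plain number
x <- c("1.4k", "14K", "2.5m", "300")

# Swap each suffix for its exponent; as.numeric() understands "1.4e3"
y <- sub("k$", "e3", x, ignore.case = TRUE)
y <- sub("m$", "e6", y, ignore.case = TRUE)

result <- as.numeric(y)
result
## [1]    1400   14000 2500000     300
```

Note that `fixed = TRUE` has to go here, because `$` and `ignore.case` only work with regular-expression matching.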
How do I replace k and m with thousands and millions?
Using stringr and dplyr from the tidyverse:
library(tidyverse)
df %>%
  mutate(students = case_when(
    str_detect(students, "m") ~ as.numeric(str_extract(students, "[\\d\\.]+")) * 1000000,
    str_detect(students, "k") ~ as.numeric(str_extract(students, "[\\d\\.]+")) * 1000
  ))
# A tibble: 3 x 2
uni students
<chr> <dbl>
1 Yale 16000000
2 Toronto 240000
3 NYU 7500
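One thing to watch: a value with no suffix at all matches neither branch of the case_when and ends up as NA. A sketch of one way to handle that, assuming input values like "16m", "240k" and "7.5k" (the original strings are not shown above, so this data frame, including the extra MIT row, is invented):

```r
library(dplyr)
library(stringr)

# Hypothetical data; the last row has no suffix
df <- tibble(
  uni      = c("Yale", "Toronto", "NYU", "MIT"),
  students = c("16m", "240k", "7.5k", "4500")
)

result <- df %>%
  mutate(
    # pull out the numeric part once, then scale it by suffix
    number   = as.numeric(str_extract(students, "[\\d.]+")),
    students = case_when(
      str_detect(students, "m$") ~ number * 1e6,
      str_detect(students, "k$") ~ number * 1e3,
      TRUE                       ~ number   # no suffix: keep as is
    )
  ) %>%
  select(-number)

result
```

Anchoring the patterns with `$` also avoids matching an "m" or "k" that appears somewhere other than the end of the string.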
Convert a factor column with numbers in k format into numeric without losing any data
First, detect which records contain a "k".
df$is_k <- grepl("k", df$Likes)
Strip the "k", and then convert to numeric. If the record had a "k", multiply by 1000; otherwise multiply by 1.
df$Likes_num <- as.numeric(gsub("k", "", df$Likes)) * ifelse(df$is_k, 1000, 1)
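Since the question is about a factor column, it may help to convert to character explicitly first, because calling as.numeric() directly on a factor returns the internal level codes rather than the displayed values. A small self-contained sketch of the two steps above (the data frame is made up):

```r
# Hypothetical factor column with "k" suffixes
df <- data.frame(Likes = factor(c("99k", "997", "15.5k")))

# Work on the labels, not on the factor's integer codes
likes_chr <- as.character(df$Likes)

# Step 1: flag the records that end in "k"
is_k <- grepl("k$", likes_chr)

# Step 2: strip the suffix, convert, and scale
df$Likes_num <- as.numeric(sub("k$", "", likes_chr)) * ifelse(is_k, 1000, 1)

df$Likes_num
## [1] 99000   997 15500
```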
Edit
For multiple units, I adapted something I had elsewhere for a more complex problem. This shows the steps and is simple enough, though I am not sure how robust it is.
Function
convert_units <- function(x) {
  # numeric input needs no conversion
  if (is.numeric(x)) return(x)
  # named vector of scalings (you can add to this)
  unit_scale <- c("k" = 1e3, "m" = 1e6)
  # clean up some potential nuisances with the input
  x_str <- gsub(",", "", trimws(tolower(as.character(x))))
  # extract out the letters
  unit_char <- gsub("[^a-z]", "", x_str)
  # extract out the numbers and convert to numeric
  x_num <- as.numeric(gsub("[a-z]", "", x_str))
  # develop a vector of multipliers
  multiplier <- unit_scale[match(unit_char, names(unit_scale))]
  multiplier[is.na(multiplier)] <- 1
  # multiply
  x_num * multiplier
}
Application
df$Likes2 <- convert_units(df$Likes)
Sample Result
ID Likes Likes2
1 1 99k 99000
2 2 997 997
3 3 15.5k 15500
4 4 9.25k 9250
5 5 575 575
6 6 800 800
7 7 8.5k 8500
8 8 2,400 2400
How to replace K for thousands and M for Millions in the same column in R
Supposing you have data that looks like this:
fifa2 <- data.frame(Value = c("€565K", "€5.65M", "€777777"))
you can do this:
library(dplyr)
fifa2 %>%
  mutate(Value1 = as.numeric(gsub("[€MK]", "", Value)),
         Value1 = ifelse(grepl("K$", Value), Value1 * 1000,
                  ifelse(grepl("M$", Value), Value1 * 1000000,
                         Value1)))
Value Value1
1 €565K 565000
2 €5.65M 5650000
3 €777777 777777
Format thousand to Ks in R
You can also have a look at the function scales::label_number_si, which rounds the number.
a <- c(465456.6789, 3567.5, 1465458.12)
scales::label_number_si(accuracy = 0.1)(a)
#[1] "465.5K" "3.6K" "1.5M"
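As far as I know, label_number_si() has been deprecated since scales 1.2.0, and the suggested replacement is label_number() with the scale_cut argument. A sketch of the equivalent call, assuming a scales version that provides cut_short_scale():

```r
library(scales)

a <- c(465456.6789, 3567.5, 1465458.12)

# cut_short_scale() supplies the K / M / B / T break points
out <- label_number(accuracy = 0.1, scale_cut = cut_short_scale())(a)
out
```

The labels should come out in the same K/M style as above, but check against your installed version.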
How to replace K with 1000 in a string using regular expression
This is a quick and dirty way which matches one or more digits followed by K and appends 000 to it:
data %>%
  mutate(comment3 = str_replace(comment3, "(\\d+)K", "\\1000"))
Here your data are stored in `data`.
Using \1 (or \\1 when escaped) to include the contents of the matched group (here, \\d+) seems to be the piece that you were missing in your attempt.
Results from your sample data:
# A tibble: 7 x 1
comment3
<chr>
1 3.22%-1ST $100000/1.15% BAL
2 3.25% ON 1ST $100000/1.16% ON BAL
3 3.22% 1ST 100000/1.16 ON BAL
4 3.22% 1ST 100000/1.15% ON BAL
5 3.26% 1ST 100000/1.16% ON BAL
6 3.20% 1ST 100000/1.15% ON BAL
7 3.22% ON 1ST 100000 & 1.15% ON BALANCE
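The backreference idea is not specific to stringr. A minimal base R sketch of the same replacement, using two strings modelled on the sample data (the inputs with "K" are assumed, since only the converted output is shown above):

```r
# Assumed inputs, reconstructed from the converted output
comment <- c("3.22%-1ST $100K/1.15% BAL", "3.22% 1ST 100K/1.16 ON BAL")

# (\\d+) captures the digits; \\1 puts them back, followed by a literal 000
result <- sub("(\\d+)K", "\\1000", comment)
result
## [1] "3.22%-1ST $100000/1.15% BAL"  "3.22% 1ST 100000/1.16 ON BAL"

# Caveat of the quick and dirty approach: decimals are not scaled correctly
decimal_case <- sub("(\\d+)K", "\\1000", "2.5K")
decimal_case
## [1] "2.5000"
```

So this is fine for whole numbers like 100K, but a value like 2.5K needs a numeric conversion instead of string padding.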
Format a number 1000 as 1k, 1000000 as 1m etc. in R
Using dplyr::case_when:
so_formatter <- function(x) {
  dplyr::case_when(
    x < 1e3 ~ as.character(x),
    x < 1e6 ~ paste0(as.character(x/1e3), "K"),
    x < 1e9 ~ paste0(as.character(x/1e6), "M"),
    TRUE ~ "To be implemented..."
  )
}
test <- c(1, 999, 1000, 999000, 1000000, 1500000, 1000000000, 100000000000)
so_formatter(test)
# [1] "1"
# [2] "999"
# [3] "1K"
# [4] "999K"
# [5] "1M"
# [6] "1.5M"
# [7] "To be implemented..."
# [8] "To be implemented..."
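If you would rather not write one branch per magnitude, a lookup with findInterval() is one possible base R alternative. This is only a sketch: the function name format_short and the "B" suffix for billions are my own assumptions, not part of the answer above.

```r
# Hypothetical helper: pick a suffix and divisor by magnitude
format_short <- function(x) {
  breaks   <- c(0, 1e3, 1e6, 1e9)      # lower bound of each band
  suffixes <- c("", "K", "M", "B")     # "B" is an assumed extension
  divisors <- c(1, 1e3, 1e6, 1e9)

  idx <- findInterval(abs(x), breaks)  # which band each value falls in
  paste0(x / divisors[idx], suffixes[idx])
}

test <- c(1, 999, 1000, 999000, 1000000, 1500000, 1000000000)
result <- format_short(test)
result
## [1] "1"    "999"  "1K"   "999K" "1M"   "1.5M" "1B"
```

Adding another magnitude then means extending the three vectors rather than adding a new condition.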
converting k and M to thousands and millions using mutate across and an ifelse statement
Try this: if you account for the string length, you can decide how many zeros to append to your data. In the example below, all I am doing is adding a condition to the ifelse statement: if the value has 4 characters and ends in 'M', append five zeros; otherwise append four. You can add more ifelse statements to decide how many zeros to add. Note that this is fragile: values such as '1.47k' or '37M' do not fit these hard-coded lengths and come out wrong, so treat it as an illustration only.
library(tidyverse)
dat <- data.frame(
col1 = c('1.3M', '1.47k', '900k'),
col2 = c('1.31M', '20k', '999k'),
col3 = c('2.20M', '2.2M', '37M')
)
dat %>%
mutate(across(contains("col"), ~ifelse(grepl('k$',.), gsub('k','000',.),
ifelse(grepl('M$',.) & nchar(.) ==4,gsub('M','00000',.),gsub('M','0000',.))))) %>%
mutate(across(contains("col"), ~str_remove_all(., '\\.')))
Edit:
This might be a cleaner (and more dynamic) way of doing the same thing: remove the k or M suffix, then multiply by 1000 or 1000000 to get the full number.
dat %>%
  mutate(across(contains("col"),
    ~ as.numeric(gsub('[kM]$', '', .)) *
      ifelse(grepl('k$', .), 1000, ifelse(grepl('M$', .), 1000000, 1))))
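If you want to avoid the tidyverse dependency, the same column-wise conversion can be sketched in base R with a small helper and lapply(). The helper name to_number is made up for this example, and it assumes every column holds strings with an optional trailing k or M:

```r
dat <- data.frame(
  col1 = c('1.3M', '1.47k', '900k'),
  col2 = c('1.31M', '20k', '999k'),
  col3 = c('2.20M', '2.2M', '37M')
)

# Hypothetical helper: look up a multiplier from the trailing letter
to_number <- function(x) {
  suffix <- tolower(sub("^[0-9.]+", "", x))   # what is left after the digits
  mult   <- c(k = 1e3, m = 1e6)[suffix]       # NA when there is no suffix
  mult[is.na(mult)] <- 1
  unname(as.numeric(sub("[kKmM]$", "", x)) * mult)
}

# dat[] keeps the data.frame structure while replacing every column
dat[] <- lapply(dat, to_number)
dat
```

Because the numeric part is converted before scaling, values such as '1.47k' and '37M' are handled without counting characters.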
How do I format a number in thousands in R
How about sprintf("%.0f", 123542.52/1000)?
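For comparison, here is what that call returns, plus a variant that keeps one decimal and appends a suffix. The "K" suffix is my own addition for illustration, not something the question asked for:

```r
x <- 123542.52

# Whole thousands, rounded
rounded <- sprintf("%.0f", x / 1000)
rounded
## [1] "124"

# One decimal place with a literal K appended
labelled <- sprintf("%.1fK", x / 1000)
labelled
## [1] "123.5K"
```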