Convert written number to number in R
Here's a start that should get you to hundreds of thousands.
word2num <- function(word){
  # Split the lower-cased input into individual words
  wsplit <- strsplit(tolower(word)," ")[[1]]
  one_digits <- list(zero=0, one=1, two=2, three=3, four=4, five=5,
                     six=6, seven=7, eight=8, nine=9)
  teens <- list(eleven=11, twelve=12, thirteen=13, fourteen=14, fifteen=15,
                sixteen=16, seventeen=17, eighteen=18, nineteen=19)
  ten_digits <- list(ten=10, twenty=20, thirty=30, forty=40, fifty=50,
                     sixty=60, seventy=70, eighty=80, ninety=90)
  doubles <- c(teens,ten_digits)
  out <- 0
  i <- 1
  while(i <= length(wsplit)){
    j <- 1  # number of words consumed in this step
    # Value of the current word
    if(i==1 && wsplit[i]=="hundred")
      temp <- 100
    else if(i==1 && wsplit[i]=="thousand")
      temp <- 1000
    else if(wsplit[i] %in% names(one_digits))
      temp <- as.numeric(one_digits[wsplit[i]])
    else if(wsplit[i] %in% names(teens))
      temp <- as.numeric(teens[wsplit[i]])
    else if(wsplit[i] %in% names(ten_digits))
      temp <- (as.numeric(ten_digits[wsplit[i]]))
    else
      stop("unrecognized word: ", wsplit[i])  # avoid reusing a stale temp
    # Combine the value with the word that follows it
    if(i < length(wsplit) && wsplit[i+1]=="hundred"){
      if(i>1 && wsplit[i-1] %in% c("hundred","thousand"))
        out <- out + 100*temp
      else
        out <- 100*(out + temp)
      j <- 2
    }
    else if(i < length(wsplit) && wsplit[i+1]=="thousand"){
      if(i>1 && wsplit[i-1] %in% c("hundred","thousand"))
        out <- out + 1000*temp
      else
        out <- 1000*(out + temp)
      j <- 2
    }
    else if(i < length(wsplit) && wsplit[i+1] %in% names(doubles)){
      # Shorthand such as "four fifty seven": treat "four" as four hundred
      temp <- temp*100
      out <- out + temp
    }
    else{
      out <- out + temp
    }
    i <- i + j
  }
  return(list(word,out))
}
Results:
> word2num("fifty seven")
[[1]]
[1] "fifty seven"
[[2]]
[1] 57
> word2num("four fifty seven")
[[1]]
[1] "four fifty seven"
[[2]]
[1] 457
> word2num("six thousand four fifty seven")
[[1]]
[1] "six thousand four fifty seven"
[[2]]
[1] 6457
> word2num("forty six thousand four fifty seven")
[[1]]
[1] "forty six thousand four fifty seven"
[[2]]
[1] 46457
> word2num("forty six thousand four hundred fifty seven")
[[1]]
[1] "forty six thousand four hundred fifty seven"
[[2]]
[1] 46457
> word2num("three forty six thousand four hundred fifty seven")
[[1]]
[1] "three forty six thousand four hundred fifty seven"
[[2]]
[1] 346457
I can tell you already that this won't work for word2num("four hundred thousand fifty"), because it doesn't know how to handle consecutive "hundred" and "thousand" terms, but the algorithm can probably be modified. Anyone should feel free to edit this if they have improvements, or to build on it in their own answer. I just thought this was a fun problem to play with (for a little while).
Edit: Apparently Bill Venables has a package called english that may achieve this even better than the above code.
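One way to handle consecutive "hundred" and "thousand" terms is to keep two running values: the group currently being built and the total of the groups already closed by "thousand". The following is only a sketch of that idea, not a drop-in replacement for word2num. The name word2num_acc is made up for illustration, and it deliberately does not support the "four fifty seven" shorthand.

```r
# Sketch: accumulator-based parser (hypothetical helper, not the code above).
# Handles values below one million written with "hundred" and "thousand".
word2num_acc <- function(word) {
  values <- c(zero = 0, one = 1, two = 2, three = 3, four = 4, five = 5,
              six = 6, seven = 7, eight = 8, nine = 9, ten = 10,
              eleven = 11, twelve = 12, thirteen = 13, fourteen = 14,
              fifteen = 15, sixteen = 16, seventeen = 17, eighteen = 18,
              nineteen = 19, twenty = 20, thirty = 30, forty = 40,
              fifty = 50, sixty = 60, seventy = 70, eighty = 80, ninety = 90)
  # Split on spaces and hyphens so "fifty-seven" also works
  tokens <- strsplit(tolower(trimws(word)), "[ -]+")[[1]]
  total <- 0    # sum of the groups already closed by "thousand"
  current <- 0  # group being built (0 to 999)
  for (tok in tokens) {
    if (tok == "and") next                     # "four hundred and five"
    if (tok %in% names(values)) {
      current <- current + values[[tok]]
    } else if (tok == "hundred") {
      current <- max(current, 1) * 100         # bare "hundred" means 100
    } else if (tok == "thousand") {
      total <- total + max(current, 1) * 1000  # close the thousands group
      current <- 0
    } else {
      stop("unrecognized word: ", tok)
    }
  }
  total + current
}

word2num_acc("four hundred thousand fifty")
word2num_acc("forty six thousand four hundred fifty seven")
```

Because "hundred" multiplies only the current group and "thousand" closes it, "four hundred thousand" becomes 400 * 1000 before "fifty" is added.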
A function to convert words to numbers
The function works like this (note that you also need the stringr package).
First, it takes the word you input (i.e., "five" if you called words_to_numbers("five")).
Then str_to_lower() normalizes it to lower case (avoiding issues if you typed "Five" or "FIVE" instead of "five").
It then iterates over a loop (for some reason ending at 11), so i takes the value 1, then 2, then 3, all the way to 11.
Within the loop, str_replace_all() takes your string (i.e., "five") and looks for a matching pattern. Here, the pattern is words(i) (i.e., words(5) when i == 5 yields the pattern "five"). In the english package, the words() function returns the English words for a number. For instance, english::words(1000) returns "one thousand". Once it finds the pattern, it replaces it with as.character(i). The as.character() function converts the number i to a character string, since str_replace_all() requires a character replacement. If you need the return value to be numeric, you could use as.numeric(words_to_numbers("five")).
For some reason, the function stops at 11, meaning that words_to_numbers("twelve") won't work (it returns "twelve"). So you will need to adjust that number if you want to use the function for values > 11.
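To make the loop concrete, here is a rough base R sketch of the same replace-in-a-loop idea with a larger upper limit. It is not the original function: tolower() and gsub() stand in for str_to_lower() and str_replace_all(), a hard-coded vector (number_words, a made-up name) stands in for english::words(), and the word-boundary anchors are my own addition so that "seven" is not replaced inside "seventeen".

```r
# Sketch only: base R stand-in for the loop described above.
# number_words[i + 1] holds the English word for i (0 to 20).
number_words <- c("zero", "one", "two", "three", "four", "five", "six",
                  "seven", "eight", "nine", "ten", "eleven", "twelve",
                  "thirteen", "fourteen", "fifteen", "sixteen", "seventeen",
                  "eighteen", "nineteen", "twenty")

words_to_numbers_sketch <- function(s, max_n = length(number_words) - 1) {
  s <- tolower(s)                       # same role as str_to_lower()
  for (i in 0:max_n) {
    # \\b anchors the match to whole words only
    pattern <- paste0("\\b", number_words[i + 1], "\\b")
    s <- gsub(pattern, as.character(i), s, perl = TRUE)
  }
  s                                     # still a character string
}

words_to_numbers_sketch("Twelve")
as.numeric(words_to_numbers_sketch("FIVE"))
```

Raising max_n only goes as far as the lookup vector allows; with the english package you would generate the words instead of typing them out.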
Hope this helps and good luck learning R!
Convert integer to words
Option 1:
Use the as.english
function from the 'english' package:
library(english)
as.english(2017)
Option 2:
Use the replace_number
function from the 'qdap' package.
library(qdap)
replace_number(2017)
Option 3:
Use the numbers_to_words
function from the 'xfun' package.
library(xfun)
numbers_to_words(2017)
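If you would rather not install a package, the idea is simple enough to sketch in base R for small numbers. The function below (int_to_words is a made-up name) only covers 0 to 999, and its wording (no "and", no hyphens) is my own choice, so it may not match what the packages above print.

```r
# Sketch: base R conversion of a whole number from 0 to 999 to words.
int_to_words <- function(n) {
  stopifnot(length(n) == 1, n == floor(n), n >= 0, n < 1000)
  small <- c("one", "two", "three", "four", "five", "six", "seven", "eight",
             "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
             "fifteen", "sixteen", "seventeen", "eighteen", "nineteen")
  tens <- c("ten", "twenty", "thirty", "forty", "fifty",
            "sixty", "seventy", "eighty", "ninety")
  if (n == 0) return("zero")
  parts <- character(0)
  if (n >= 100) {                 # hundreds digit
    parts <- c(parts, small[n %/% 100], "hundred")
    n <- n %% 100
  }
  if (n >= 20) {                  # tens digit for 20 to 99
    parts <- c(parts, tens[n %/% 10])
    n <- n %% 10
  }
  if (n > 0) {                    # remaining 1 to 19
    parts <- c(parts, small[n])
  }
  paste(parts, collapse = " ")
}

int_to_words(457)
```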
Convert letters to numbers
Thanks for all the ideas, but I am a dumdum.
Here's what I did: made a mapping from each letter to a specific number, then looked up the number for a given letter.
df=data.frame(L=letters[1:26],N=rnorm(26))
df[df$L=='e',2]
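The data frame lookup works, but the same thing can be done with less typing. Two sketches: match() against the built-in letters vector when the number you want is just the position in the alphabet, and a named vector when the numbers are arbitrary. The values in lookup are made up for illustration.

```r
# Position in the alphabet: "a" -> 1, ..., "z" -> 26 (case-insensitive)
letter_to_number <- function(x) match(tolower(x), letters)
letter_to_number(c("e", "A", "z"))

# Arbitrary mapping: a named vector indexed by letter
# (10, 20, ..., 260 are placeholder values)
lookup <- setNames(seq(10, 260, by = 10), letters)
unname(lookup[c("e", "b")])
```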
Numeric to Alphabetic Lettering Function in R
Here are some alternatives:
1) encode. Let b be the base. Here b = 26. Then there are b^k appendices having k letters, so a particular appendix with number x has n letters if n is the smallest integer for which b + b^2 + ... + b^n >= x. The LHS of this inequality is a geometric series and therefore has a closed-form solution. Replacing the LHS with that expression and solving the resulting equation for n gives the formula for n in the code below. Then we subtract from number all b^k terms for which k < n and use an APL-like encode function (not shown here; versions can be found on the web). encode does the base conversion, giving digits, a vector of digits in base base. Finally, add 1 to each digit and use that as a lookup into LETTERS.
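For reference, the closed form and the resulting expression for n can be written out as follows, with b the base and x the appendix number. This is my own working of the step described above; the last line is algebraically the same as the expression used for n in app2.

```latex
\begin{align*}
b + b^2 + \dots + b^n &= \frac{b\,(b^n - 1)}{b - 1} \ge x \\
b^n &\ge 1 + \frac{x\,(b - 1)}{b} \\
b^{n+1} &\ge b + x\,(b - 1) \\
n &= \left\lceil \log_b\bigl(b + x\,(b - 1)\bigr) \right\rceil - 1
\end{align*}
```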
app2 <- function(number, base = 26) {
n <- ceiling(log((1/(1 - base) - 1 - number) * (1 - base), base = base)) - 1
digits <- encode(number - sum(base^seq(0, n-1)), rep(base, n))
paste(LETTERS[digits + 1], collapse = "")
}
sapply(1:29, app2) # test
giving:
[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O"
[16] "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z" "AA" "AB" "AC"
Another test to try is:
sapply(1:60, app2, base = 3)
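app2 calls encode, which is not defined in this post. A minimal sketch of such a function, assuming the APL convention that it returns one digit per radix with the most significant digit first, could look like this. Define it before running app2.

```r
# Sketch of an APL-like encode (assumed behaviour, not the linked original):
# express `number` as digits in the mixed radix given by `base`,
# most significant digit first.
encode <- function(number, base) {
  n <- length(base)
  digits <- numeric(n)
  for (i in rev(seq_len(n))) {       # fill from the last digit backwards
    digits[i] <- number %% base[i]   # digit in this position
    number <- number %/% base[i]     # carry the rest to the left
  }
  digits
}

encode(27, c(26, 26))   # digits of 27 in base 26
encode(5, c(2, 2, 2))   # digits of 5 in base 2
```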
2) Recursive solution. Here is an alternative that works recursively. It computes the last letter of the appendix number, removes it, and then recursively computes the portion to its left.
app2r <- function(number, base = 26, suffix = "") {
number1 <- number - 1
last_digit <- number1 %% base
rest <- number1 %/% base
suffix <- paste0(LETTERS[last_digit + 1], suffix)
if (rest > 0) Recall(rest, base, suffix) else suffix
}
# tests
identical(sapply(1:29, app2r), sapply(1:29, app2))
## [1] TRUE
identical(sapply(1:60, app2r, base = 3), sapply(1:60, app2, base = 3))
## [1] TRUE
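A handy extra check is to go the other way, from the letters back to the number. The sketch below (letters2num is a made-up name) treats each letter as a digit from 1 to base and accumulates left to right, which is the reverse of what the recursive version does.

```r
# Sketch: inverse mapping, e.g. "AC" -> 29 for base 26.
letters2num <- function(label, base = 26) {
  digits <- match(strsplit(toupper(label), "")[[1]], LETTERS)
  stopifnot(!anyNA(digits), all(digits <= base))
  # Horner-style accumulation: shift left by one position, add the next digit
  Reduce(function(acc, d) acc * base + d, digits, 0)
}

sapply(c("A", "Z", "AA", "AC"), letters2num)
```

If app2r is loaded, all(sapply(sapply(1:60, app2r), letters2num) == 1:60) should hold as a round-trip test.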