Binary operations in a dataframe
If the Gross value is always in millions, you can extract the number from it, multiply by 1e6 to get the amount in dollars, and then divide by Weeks.
library(rvest)
library(dplyr)
url = "https://www.imdb.com/chart/boxoffice"
# Download the page and take the first table on it
read_table = read_html(url)
movie_table = html_table(html_nodes(read_table, "table")[[1]])
# Drop the first and last columns
movie_table <- movie_table[-c(1, ncol(movie_table))]
# Extract the number from Gross, scale it to dollars, and divide by Weeks
movie_table %>% mutate(per_week_calc = readr::parse_number(Gross) * 1e6/Weeks)
#                  Title Weekend   Gross Weeks per_week_calc
#1                Onward  $10.5M  $60.3M     2      30150000
#2       I Still Believe   $9.5M   $9.5M     1       9500000
#3             Bloodshot   $9.3M  $10.5M     1      10500000
#4     The Invisible Man   $6.0M  $64.4M     3      21466667
#5              The Hunt   $5.3M   $5.8M     1       5800000
#6    Sonic the Hedgehog   $2.6M $145.8M     5      29160000
#7          The Way Back   $2.4M  $13.4M     2       6700000
#8  The Call of the Wild   $2.2M  $62.1M     4      15525000
#9                 Emma.   $1.4M  $10.0M     4       2500000
#10    Bad Boys for Life   $1.1M $204.3M     9      22700000
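To try the same calculation without scraping the live page, here is a minimal sketch that hard-codes three rows from the output above into a data frame and uses base R `sub()` in place of `readr::parse_number()`. The name `movies` is hypothetical, and the regular expression assumes every Gross value has the form `$<number>M`.

```r
# Three rows copied from the output above; `movies` is a hypothetical name.
movies <- data.frame(
  Title = c("Onward", "I Still Believe", "The Invisible Man"),
  Gross = c("$60.3M", "$9.5M", "$64.4M"),
  Weeks = c(2, 1, 3),
  stringsAsFactors = FALSE
)

# Strip the leading "$" and the trailing "M", then convert to numeric.
gross_millions <- as.numeric(sub("^\\$(.*)M$", "\\1", movies$Gross))

# Scale to dollars and divide by the number of weeks (element-wise).
movies$per_week_calc <- gross_millions * 1e6 / movies$Weeks
movies
```

Arithmetic operators in R work element-wise on columns of equal length, so no explicit loop over rows is needed.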
If your data is in billions or thousands, refer to the questions
"Changing Million/Billion abbreviations into actual numbers? ie. 5.12M -> 5,120,000" and "Convert from K to thousand (1000) in R".
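One possible way to handle mixed suffixes in a single pass is a lookup table of multipliers. The sketch below is an illustration, not the method from either linked question: `expand_abbrev` is a hypothetical helper, and it assumes each input is an optional `$`, a plain number, and an optional K, M, or B suffix.

```r
# Hypothetical helper: convert strings such as "5.12M" or "$1.5K" to numbers.
# Assumes an optional "$", a plain number, and an optional K/M/B suffix.
expand_abbrev <- function(x) {
  multipliers <- c(K = 1e3, M = 1e6, B = 1e9)
  # Remove "$", thousands separators, and whitespace; normalise the case.
  cleaned <- toupper(gsub("[$,[:space:]]", "", x))
  # Whatever remains after the leading digits and dots is the suffix.
  suffix <- sub("^[0-9.]+", "", cleaned)
  value <- as.numeric(sub("[KMB]$", "", cleaned))
  # Inputs without a suffix get a multiplier of 1.
  scale <- ifelse(suffix == "", 1, multipliers[suffix])
  unname(value * scale)
}

expand_abbrev(c("5.12M", "$1.5K", "2B", "750"))
```

Inputs that do not match the assumed shape come back as `NA`, which makes unexpected values easy to spot.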
regex pattern a few numbers followed by the letter k
How about the following, which handles whole numbers followed by a lowercase k:
data = gsub("([0-9]+)k", "\\1000", data)
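As a quick check of that pattern, the sketch below runs it on a made-up vector (the sample values are assumptions, not data from the question). It also shows the limitation with decimals: only the digits directly before the `k` are captured, so `"1.5k"` does not become 1500.

```r
# Hypothetical sample input.
data <- c("5k", "120k", "costs 3k per unit", "42")

# "\\1" is the captured run of digits; the literal "000" is appended to it.
converted <- gsub("([0-9]+)k", "\\1000", data)
converted

# Decimal inputs are not handled: only "5k" is matched inside "1.5k".
decimal_case <- gsub("([0-9]+)k", "\\1000", "1.5k")
decimal_case

# ignore.case = TRUE lets the same pattern match an uppercase "K" as well.
upper_case <- gsub("([0-9]+)k", "\\1000", "20K", ignore.case = TRUE)
upper_case
```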
How to replace string from numeric value in R?
Edit: For inputs that include non-integer values, replace each suffix with the matching exponent so that as.numeric() can parse the result:
x <- c("19M","20K","1K", "1.25M", "1.5K"); x
x <- sub("M", "e6", x); x
x <- sub("K", "e3", x); x
as.numeric(x)
[1] 19000000 20000 1000 1250000 1500
For integer values, the following is sufficient.
x <- c("19M","20K","1K")
x <- sub("M","000000", x)
x <- sub("K","000", x)
as.numeric(x)
[1] 1.9e+07 2.0e+04 1.0e+03