Scrolling a page in RSelenium
Assuming you have the following setup:
library(RSelenium)
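# startServer() is defunct in current RSelenium releases;
# rsDriver() starts the server and opens a browser session
rD <- rsDriver(browser = "firefox")
remDr <- rD[["client"]]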
remDr$setWindowSize(width = 800, height = 300)
remDr$navigate("https://www.r-project.org/about.html")
You could scroll to the bottom like this:
webElem <- remDr$findElement("css", "body")
webElem$sendKeysToElement(list(key = "end"))
And you could scroll to the top like this:
webElem$sendKeysToElement(list(key = "home"))
And in case you want to scroll down just a bit, use
webElem$sendKeysToElement(list(key = "down_arrow"))
The names of the keys are listed in selKeys.
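To press the same key several times in one call, the list passed to sendKeysToElement() can hold repeated key entries. Here is a minimal sketch, assuming a hypothetical helper named key_presses() (it is not part of RSelenium) and a key name that exists in selKeys:

```r
# Hypothetical helper (not part of RSelenium): build a list that repeats one
# named key so it can be passed to sendKeysToElement() in a single call.
key_presses <- function(key, times = 1L) {
  stopifnot(is.character(key), length(key) == 1L, times >= 1L)
  stats::setNames(rep(list(key), times), rep("key", times))
}

presses <- key_presses("page_down", 3L)
str(presses)

# With a live session (assumes remDr is an open remoteDriver client):
# webElem <- remDr$findElement("css", "body")
# webElem$sendKeysToElement(presses)
# names(RSelenium::selKeys)  # lists the valid key names
```

Scrolling by "page_down" a few times sits between a single "down_arrow" press and jumping straight to "end".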
RSelenium: Scroll down to load web content
If sending keys does not scroll the page, try executeScript() instead:
remDr$executeScript("window.scrollTo(0,document.body.scrollHeight);")
Scroll down a page and load all items before using read_html()
To load the whole page, we need to scroll bit by bit instead of jumping straight to the end of the page, so that each batch of items has time to load.
# after navigating and accepting cookies, scroll bit by bit
for(i in 1:30){
print(i)
remDr$executeScript("window.scrollBy(0,500);")
Sys.sleep(1)
}
#get nodes of all houses
html_full_page = remDr$getPageSource()[[1]] %>%
read_html()
x <- html_full_page %>%
html_nodes('.re-CardPackPremium-carousel')
#> {xml_nodeset (30)}
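The fixed count of 30 steps above is a guess at the page length. As a rough sketch, the loop can instead stop once the window reaches the bottom. The helper name scroll_in_steps() and its arguments are hypothetical, and a fake page stands in for the browser so the example runs without Selenium:

```r
# Hypothetical helper: call scroll_step() until at_bottom() is TRUE or
# max_steps is reached; returns the number of steps taken.
scroll_in_steps <- function(scroll_step, at_bottom, max_steps = 30L, pause = 1) {
  steps <- 0L
  while (steps < max_steps && !isTRUE(at_bottom())) {
    scroll_step()
    steps <- steps + 1L
    Sys.sleep(pause)
  }
  steps
}

# Stand-in for a browser: a 2000px page viewed through a 300px window.
fake <- new.env()
fake$y <- 0
steps_taken <- scroll_in_steps(
  scroll_step = function() fake$y <- min(fake$y + 500, 2000 - 300),
  at_bottom   = function() fake$y + 300 >= 2000,
  pause       = 0
)
print(steps_taken)

# With a live session (assumes remDr is an open remoteDriver client):
# scroll_in_steps(
#   scroll_step = function() remDr$executeScript("window.scrollBy(0,500);"),
#   at_bottom   = function() unlist(remDr$executeScript(
#     "return window.innerHeight + window.scrollY >= document.body.scrollHeight;"))
# )
```

On pages that append content lazily, the bottom check can briefly be true before new items arrive, so the pause between steps still matters.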
Scrolling through an entire page with RSelenium, then extracting tabular data into a data frame
I solved this issue. Two things were going on. First, the page loaded with the cursor inside a search bar; I got around this by calling remDr$findElement(using = "css", "body")$clickElement() to click into the body of the page. Second, as another answer pointed out, if the scroll/arrow keys do not work with sendKeysToElement(list(key = "up_arrow")), you should try remDr$executeScript("window.scrollTo(0,document.body.scrollHeight);") instead.
A small sample of my script follows:
library(RSelenium)
library(rvest)
library(tidyverse)
## opens the driver
rD <- rsDriver(browser = "firefox", port = 4545L, verbose = FALSE)
remDr <- rD[["client"]]
link_texts <- c("Base Set", "Promo", "Fossil")
## navigates to the correct page
remDr$navigate("https://www.pricecharting.com/category/pokemon-cards")
for (name in link_texts) {
## finds the link and clicks on it
remDr$findElement(using = "link text", name)$clickElement()
## clicks into the page body to move focus out of the search bar
remDr$findElement(using = "css", "body")$clickElement()
## finds the body element - this line may be extraneous
table <- remDr$findElement(using = "css", "body")
## scrolls to the bottom six times, pausing so that more rows can load
for (i in 1:6) {
  remDr$executeScript("window.scrollTo(0,document.body.scrollHeight);")
  Sys.sleep(1)
}
## get the entire page source that's been loaded
html <- remDr$getPageSource()[[1]]
## read in the page source
page <- read_html(html)
## builds a variable name such as "base_set" (replaces every space)
data_name <- str_to_lower(str_replace_all(name, " ", "_"))
## extract the tabular table
df <- page %>%
html_elements("#games_table") %>%
html_table() %>%
pluck(1) %>%
select(1:4)
assign(data_name, df)
Sys.sleep(3)
remDr$navigate("https://www.pricecharting.com/category/pokemon-cards")
}
## close driver
remDr$close()
rD$server$stop()
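The script above uses assign() to create one variable per set. One alternative is to collect the tables in a named list, which is easier to loop over or bind together afterwards. This is only a sketch: to_data_name() and scrape_table() are hypothetical names, and scrape_table() returns a dummy data frame so the example runs without a browser.

```r
# Hypothetical helper: "Base Set" -> "base_set"
to_data_name <- function(x) gsub(" ", "_", tolower(x), fixed = TRUE)

link_texts <- c("Base Set", "Promo", "Fossil")

# Placeholder for the navigate/scroll/html_table() steps; it returns a dummy
# data frame here instead of scraped data.
scrape_table <- function(name) data.frame(set = name, stringsAsFactors = FALSE)

# One list element per set, named after the set
tables <- lapply(link_texts, scrape_table)
names(tables) <- to_data_name(link_texts)

print(names(tables))

# Individual tables are reachable by name, e.g. tables$base_set,
# and the whole list can be stacked into one data frame.
combined <- do.call(rbind, tables)
```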
Check if it's possible to scroll down with RSelenium
I stumbled across a way to do this in Python and adapted it to work in R. Below is a working update of the code I originally posted.
# Open webpage
library(RSelenium)
rD = rsDriver(browser = "firefox")
remDr = rD[["client"]]
url = "https://stocktwits.com/symbol/NZDCHF"
remDr$navigate(url)
# Keep scrolling down page, loading new content each time.
last_height = 0
repeat {
remDr$executeScript("window.scrollTo(0,document.body.scrollHeight);")
Sys.sleep(3) # wait 3 seconds to give new content a chance to load
# Break out of the loop once the page height stops growing
new_height = remDr$executeScript("return document.body.scrollHeight")
if(unlist(last_height) == unlist(new_height)) {
break
} else {
last_height = new_height
}
}
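On a feed that never stops loading, the repeat loop above would run indefinitely. A possible variant adds a cap on the number of rounds. The function name scroll_until_stable() and the max_rounds parameter are assumptions for illustration, and a fake page replaces the browser so the sketch runs on its own:

```r
# Hypothetical helper: scroll to the bottom until the page height stops
# changing, or give up after max_rounds attempts.
scroll_until_stable <- function(scroll_to_bottom, get_height,
                                max_rounds = 50L, pause = 3) {
  last_height <- -1
  for (round in seq_len(max_rounds)) {
    scroll_to_bottom()
    Sys.sleep(pause)  # give new content a chance to load
    new_height <- get_height()
    if (new_height == last_height) {
      return(list(height = new_height, rounds = round, finished = TRUE))
    }
    last_height <- new_height
  }
  list(height = last_height, rounds = max_rounds, finished = FALSE)
}

# Stand-in for a browser: the page grows by 1000px on the first three scrolls.
fake <- new.env()
fake$height <- 1000
fake$loads_left <- 3L
result <- scroll_until_stable(
  scroll_to_bottom = function() {
    if (fake$loads_left > 0L) {
      fake$height <- fake$height + 1000
      fake$loads_left <- fake$loads_left - 1L
    }
  },
  get_height = function() fake$height,
  pause = 0
)
str(result)

# With a live session (assumes remDr is an open remoteDriver client):
# scroll_until_stable(
#   scroll_to_bottom = function()
#     remDr$executeScript("window.scrollTo(0,document.body.scrollHeight);"),
#   get_height = function()
#     unlist(remDr$executeScript("return document.body.scrollHeight;"))
# )
```

The finished flag lets the caller tell a page that really ended apart from one that hit the cap.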