How to Set Up RSelenium for R

RSelenium is not working

From comments:

Click Start

Select Control Panel > System

Select Advanced system settings

Click Environment Variables...

Under System Variables

Scroll to Path and double click

At the end of Variable value, add ;C:\path\to\directory, where the directory is the one that holds the chromedriver.exe file. Note the ; that separates the paths.

Restart your R session and you should now be able to run:

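library(RSelenium)
# Note: startServer() is defunct in recent RSelenium releases; this call only works on older versions
RSelenium::startServer()
# Connect to the local Selenium server and open a Chrome session
remDr <- remoteDriver(browserName = "chrome")
remDr$open()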

EDIT

For RSelenium to operate with Chrome, you first need to download chromedriver.exe. You can download it from https://sites.google.com/a/chromium.org/chromedriver/downloads. Once it has downloaded, unzip the folder and place chromedriver.exe wherever you would like to store it.

The directory where you store chromedriver.exe, and which you add to your system PATH, can be anywhere you choose. As stated in the comments, mine currently resides in C:\Python27\Scripts, for example.
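After restarting R, one way to confirm that the PATH change took effect is to ask R where it finds the driver. This is a minimal sketch using only base R; find_on_path() is a hypothetical helper name introduced here, not an RSelenium function.

```r
# find_on_path() is a hypothetical helper, not part of RSelenium.
# It returns the full path of an executable found on the PATH, or NA if none is found.
find_on_path <- function(exe) {
  hit <- unname(Sys.which(exe))
  if (!is.na(hit) && nzchar(hit)) hit else NA_character_
}

driver_path <- find_on_path("chromedriver")
if (is.na(driver_path)) {
  message("chromedriver was not found on the PATH; recheck the Environment Variables step.")
} else {
  message("chromedriver found at: ", driver_path)
}
```

If the first message appears, the directory was probably not added to Path correctly, or the R session was not restarted afterwards.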

RSelenium won't open a session in Chrome

@user2554330 said that there might be issues with RSelenium working with the current version of Chrome. Based on this and some other comments I read, I decided to use RSelenium with Firefox instead of Chrome, and it worked. The code I used is:

library(RSelenium)
library(netstat)  # provides free_port()
rs_driver_object <- rsDriver(browser = 'firefox',
                             port = free_port())

It might be worth trying this again with Chrome in the future, but for now this basic code with Firefox seems to work.

Is RSelenium::remDr$open() actually supposed to open a browser window?

I finally got it working and thought I would share what worked, as this same question has been asked many times with no replies.

So to answer my question: yes, remDr$open() should open a browser, but inside a container that you already have up and running and that you view over VNC. Here's my process to get there.

First, make sure everything is up to date. So many problems could have been solved by making sure my programs, packages and libraries were the newest versions. With that said, I read somewhere that a program (can't remember which) was having issues with Big Sur...

Second, this resource was great at explaining how to set up Docker + Selenium on Windows/Linux machines and how to scrape using RSelenium. In it, the author warns that there are some security issues with setting up the ports using the method below. I don't know how to solve that, so be warned that you run the code at your own risk!
https://www.youtube.com/watch?v=OxbvFiYxEzI&t=4838s

OK, now to get started... you need the Docker app installed on your system and up and running. On my Mac it's called Docker Desktop.

On the Docker website you use Docker Hub to find the selenium files you need. It will give you a link to copy and paste into your terminal to download the files. I used standalone-firefox and standalone-firefox-debug. I also think you need selenium-server-standalone-xxx.xx.jar but I can't remember what for or the process. It's sitting in my Rproject folder... don't know if that matters.

If you want to see a window with the pages on it you would run in terminal:

docker run -d -p 4445:4444 -p 5901:5900 selenium/standalone-firefox-debug

This will start a new container in Docker.

You now need a VNC viewer to see the browser running inside the container. The Mac has one built in (Screen Sharing) that you can easily access in one of two ways: press cmd+k in Finder and enter vnc://127.0.0.1:5901, or use the terminal:

open vnc://127.0.0.1:5901

The password is "secret".

Now that your VNC session is up and running, you can run your code in RStudio and, yes, $open() will open Firefox in the VNC/Screen Sharing window!

library(RSelenium)

remDr <- remoteDriver(port = 4445L)  # 4445 is the host port mapped in the docker run command (-p 4445:4444)
remDr$open()
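If remDr$open() fails right after docker run, one possible cause is that nothing is listening on the mapped port yet. The following is a minimal base-R sketch of a reachability check, not part of RSelenium: port_is_open() is a hypothetical helper, and port 4445 is assumed from the docker run command above. A reachable port does not guarantee that Selenium itself has finished starting.

```r
# port_is_open() is a hypothetical helper, not part of RSelenium.
# It returns TRUE if a TCP connection to host:port can be opened, FALSE otherwise.
port_is_open <- function(host, port, timeout = 2) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    ),
    error = function(e) NULL
  )
  if (is.null(con)) {
    return(FALSE)
  }
  close(con)
  TRUE
}

# 4445 is the host port assumed from the docker run command above.
if (port_is_open("127.0.0.1", 4445L)) {
  message("Port 4445 is reachable; remDr$open() can be tried now.")
} else {
  message("Nothing is listening on port 4445; check that the container is running.")
}
```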

A final note: once you have tested your code and everything is working properly, it is recommended that you run a headless browser instead of the debug one. Your code will work the same; you just won't need to open the VNC viewer and watch, which slows everything down. Here is the command to run instead:

docker run -d -p 4445:4444 selenium/standalone-firefox

Installing RSelenium from GitHub

Your problem is that some of the dependencies are not available on your default repo. Specifically, this doesn't work:

install.packages("binman", repos = "http://www.omegahat.net/R")

RSelenium is also currently available on CRAN. So all you should have to do is to select a CRAN mirror which has these packages. For example:

install.packages(c("XML", "wdman", "binman", "subprocess", "semver", "RSelenium"),
                 repos = "https://cloud.r-project.org")
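To see which of these packages are still missing before (or after) installing, a small base-R check can help. This is only a sketch: missing_pkgs() is a hypothetical helper name, and the package list is copied from the command above.

```r
# missing_pkgs() is a hypothetical helper, not part of any package.
# It returns the subset of pkgs whose namespaces cannot be loaded.
missing_pkgs <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

needed <- c("XML", "wdman", "binman", "subprocess", "semver", "RSelenium")
to_install <- missing_pkgs(needed)
print(to_install)

# Install only what is missing (uncomment to run):
# install.packages(to_install, repos = "https://cloud.r-project.org")
```

An empty result (character(0)) means every listed package can already be loaded.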

