zipping lists in R
I think you're looking for mapply:
‘mapply’ is a multivariate version of ‘sapply’. ‘mapply’ applies
‘FUN’ to the first elements of each ... argument, the second
elements, the third elements, and so on. Arguments are recycled
if necessary.
For your example, use mapply(f, A, B)
zip/unzip functions in R
The purrr package aims to provide many functional programming (FP) primitives. purrr's version of zip is called transpose().
L1 <- list(as.list(1:3),as.list(9:7))
library(purrr)
(L2 <- transpose(L1))
## List of 3
## $ :List of 2
## ..$ : int 1
## ..$ : int 9
## $ :List of 2
## ..$ : int 2
## ..$ : int 8
## $ :List of 2
## ..$ : int 3
## ..$ : int 7
identical(transpose(L2),L1) ## TRUE
transpose() also works on your second (unzip) example.
Which is more pythonic in a for loop: zip or enumerate?
No doubt, zip is more pythonic. It doesn't require a variable to store an index (which you don't otherwise need), and it handles the lists uniformly. With enumerate, you iterate over one list and index into the other, i.e. non-uniform handling.
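To make the comparison concrete, here is a minimal sketch of the two loop styles. The lists `names` and `ages` are made up for illustration; both loops build the same result, but only the enumerate version needs an index variable.

```python
# Hypothetical example data, not taken from the original question.
names = ["Ann", "Bob", "Cy"]
ages = [31, 45, 27]

# zip: both lists are consumed in lockstep, no index variable needed.
with_zip = []
for name, age in zip(names, ages):
    with_zip.append(f"{name} is {age}")

# enumerate: iterate over one list and index into the other.
with_enumerate = []
for idx, name in enumerate(names):
    with_enumerate.append(f"{name} is {ages[idx]}")

print(with_zip == with_enumerate)
```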
However, you should be aware of the caveat that zip runs only up to the shorter of the two lists. To avoid duplicating someone else's answer I'd just include a reference here: someone else's answer.
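The truncation caveat can be seen in a short sketch. The lists below are invented for illustration, and `itertools.zip_longest` is shown only as one possible alternative when the extra items must not be dropped; it is not something the original answer prescribes.

```python
from itertools import zip_longest

# Hypothetical inputs of unequal length.
letters = ["a", "b", "c"]
numbers = [1, 2]

# zip silently stops at the shorter input, so "c" is dropped.
truncated = list(zip(letters, numbers))

# zip_longest pads the shorter input with a fill value instead.
padded = list(zip_longest(letters, numbers, fillvalue=None))

print(truncated)
print(padded)
```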
@user3100115 aptly points out that in Python 2 you should prefer itertools.izip over zip, due to its lazy nature (faster and more memory efficient). In Python 3, zip already behaves like Python 2's izip.
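The lazy behaviour of Python 3's zip can be illustrated with a small sketch. It pairs an infinite counter with a short string, which only works because zip produces pairs on demand; the variable names are chosen for this example.

```python
from itertools import count, islice

# count() never ends; zip stops as soon as the shorter input is exhausted.
pairs = zip(count(), "abc")

# A zip object is its own iterator: nothing is computed until requested.
is_lazy_iterator = iter(pairs) is pairs

# Take the first two pairs, then drain whatever is left.
first_two = list(islice(pairs, 2))
rest = list(pairs)

print(is_lazy_iterator, first_two, rest)
```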
R equivalent of [x[y] for x,y in zip(i,j)] in python?
In R you can use the Map function:
i = list(c(1,2,3),c(2,3,4),c(2,4,2))
j = c(2,3,1)
Map(`[`, i, j)
[[1]]
[1] 2
[[2]]
[1] 4
[[3]]
[1] 2
You can also use mapply, which here simplifies the result to a vector instead of a list:
mapply(`[`, i, j)
[1] 2 4 2
How do I add a loop when using R to scrape data?
The comment by @r2evans already provides an answer. Since @ShanCham asked how to actually implement this, I wanted to offer the following code as a guide. It is simply more verbose than the comment and therefore could not be posted as an additional comment.
library(rvest)
# only two example zip codes; could be more, of course
zipcodes <- c("02110", "02125")
crime <- lapply(zipcodes, function(z) {
  site <- read_html(paste0("https://www.trulia.com/real_estate/", z, "-Boston/crime/"))
  # for illustrative purposes:
  # added as.numeric() to the numeric columns
  # excluded some of your other columns and shortened the text in type
  data.frame(zip = z,
             theft = site %>% html_nodes(".crime-text-0") %>% html_text() %>% as.numeric(),
             assault = site %>% html_nodes(".crime-text-1") %>% html_text() %>% as.numeric(),
             type = site %>% html_nodes(".clearfix") %>% html_text() %>% paste(collapse = " ") %>% substr(1, 50),
             stringsAsFactors = FALSE)
})
class(crime)
# [1] "list"
# the output is a list of data.frames that can be bound together into one data.frame
crime <- do.call(rbind, crime)
# crime is now a data.frame, so column classes/types are kept
class(crime$type)
# [1] "character"
class(crime$assault)
# [1] "numeric"
Is there more to enumerate() than just zip(range(len()))?
Because not every iterable has a length.
>>> def countdown(x):
...     while x >= 0:
...         yield x
...         x -= 1
...
>>> down = countdown(3)
>>> len(down)
Traceback (most recent call last):
[...]
TypeError: object of type 'generator' has no len()
>>> enum = enumerate(down)
>>> next(enum)
(0, 3)
>>> next(enum)
(1, 2)
This is a trivial example, of course. But I could think of lots of real-world objects where you can't reasonably pre-compute a length, either because the length is infinite (see itertools.count) or because the object you are iterating over does not itself know when the party is over.
Your iterator could be fetching chunks of data from a remote database of unknown size or to which the connection might get lost without warning. Or it could process user input.
def get_user_input():
    while True:
        i = input('input value or Q to quit: ')
        if i == 'Q':
            break
        yield i
You cannot get the length of get_user_input(), but you can enumerate all inputs as you fetch them via next (or iteration).
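As a non-interactive sketch of the same idea, the generator below reads from a simulated input source instead of calling input(). The helper name `read_until_quit` and the sample values are assumptions made for this example; the point is that enumerate numbers the items as they arrive, without ever needing a length.

```python
def read_until_quit(values):
    """Yield items from `values` until the sentinel 'Q' is seen."""
    for value in values:
        if value == "Q":
            break
        yield value

# Simulated user input (hypothetical data); the stream's length is not
# known to the consumer in advance.
simulated_input = iter(["apple", "pear", "Q", "never reached"])

# enumerate attaches a running index while the generator is consumed.
numbered = list(enumerate(read_until_quit(simulated_input)))

print(numbered)
```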
Python zip pattern in R
You can emulate the Python zip with a data.frame like this:
medias_name <- c("print", "ooh", "tv", "digital")
medias_img <- c("Print.png", "Ooh.png", "Tv.png", "Digital.png")
myenv <- new.env()
zip_df <- data.frame(name = medias_name, img = medias_img, stringsAsFactors = FALSE)
# seq_len() is safe even if the data.frame has zero rows
for (i in seq_len(nrow(zip_df))) {
  myenv[[zip_df[i, 'name']]] <- zip_df[i, 'img']
}
So we get:
> zip_df
     name         img
1   print   Print.png
2     ooh     Ooh.png
3      tv      Tv.png
4 digital Digital.png
> ls(myenv)
[1] "digital" "ooh" "print" "tv"
> myenv$tv
[1] "Tv.png"
Another solution is to emulate zip with a list to pack items:
medias_name <- c("print", "ooh", "tv", "digital")
medias_img <- c("Print.png", "Ooh.png", "Tv.png", "Digital.png")
myenv <- new.env()
ziplist <- list(medias_name, medias_img)
# seq_along() is safe even if the vectors are empty
for (i in seq_along(ziplist[[1]])) {
  name <- ziplist[[1]][i]
  img <- ziplist[[2]][i]
  myenv[[name]] <- img
}