Zip or Enumerate in R

Zipping lists in R

I think you're looking for mapply:

   ‘mapply’ is a multivariate version of ‘sapply’.  ‘mapply’ applies
‘FUN’ to the first elements of each ... argument, the second
elements, the third elements, and so on. Arguments are recycled
if necessary.

For your example, use mapply(f, A, B)
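For readers coming from Python, a rough analogue of mapply(f, A, B) is map with several iterables, or a comprehension over zip. This is only a sketch: f, A and B below are hypothetical stand-ins with made-up values, not the ones from the question. One difference to keep in mind is that R recycles shorter arguments, while Python's map and zip stop at the shortest input.

```python
# Hypothetical stand-ins for the f, A and B mentioned above.
def f(a, b):
    return a + b

A = [1, 2, 3]
B = [10, 20, 30]

# map with two iterables calls f(A[0], B[0]), f(A[1], B[1]), and so on.
via_map = list(map(f, A, B))

# The same pairing written with zip and a comprehension.
via_zip = [f(a, b) for a, b in zip(A, B)]

print(via_map)  # [11, 22, 33]
```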

zip/unzip functions in R

The purrr package attempts to provide a lot of FP primitives. purrr's version of zip is called transpose().

 L1 <- list(as.list(1:3),as.list(9:7))
library(purrr)
(L2 <- transpose(L1))
## List of 3
## $ :List of 2
## ..$ : int 1
## ..$ : int 9
## $ :List of 2
## ..$ : int 2
## ..$ : int 8
## $ :List of 2
## ..$ : int 3
## ..$ : int 7
identical(transpose(L2),L1) ## TRUE

transpose() also works on your second (unzip) example.

Which is more pythonic in a for loop: zip or enumerate?

No doubt, zip is more pythonic. It doesn't require that you use a variable to store an index (which you don't otherwise need), and using it allows handling the lists uniformly, while with enumerate, you iterate over one list, and index the other list, i.e. non-uniform handling.
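To make the difference concrete, here is a minimal sketch of both loop styles. The lists names and scores are invented for illustration.

```python
# Made-up example data.
names = ["ada", "bob", "cy"]
scores = [90, 72, 85]

# enumerate: iterate over one list and index into the other.
pairs_enum = []
for i, name in enumerate(names):
    pairs_enum.append((name, scores[i]))

# zip: both lists are treated the same way and no index variable is needed.
pairs_zip = []
for name, score in zip(names, scores):
    pairs_zip.append((name, score))

print(pairs_zip)  # [('ada', 90), ('bob', 72), ('cy', 85)]
```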

However, be aware that zip runs only as far as the shorter of the two lists and silently drops the extra items of the longer one. Another answer to this question covers that caveat in more detail.
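A short illustration of that truncation, with made-up lists. If you need every item of the longer list, itertools.zip_longest from the standard library pads the shorter one (with None by default).

```python
from itertools import zip_longest

letters = ["a", "b", "c"]
numbers = [1, 2]

# zip stops as soon as the shorter input is exhausted; "c" is dropped.
truncated = list(zip(letters, numbers))

# zip_longest keeps going and fills the missing values with None.
padded = list(zip_longest(letters, numbers))

print(truncated)  # [('a', 1), ('b', 2)]
print(padded)     # [('a', 1), ('b', 2), ('c', None)]
```

On Python 3.10 and later, zip also accepts strict=True, which raises a ValueError on unequal lengths instead of truncating.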

@user3100115 aptly points out that in Python 2 you should prefer itertools.izip over zip because it is lazy (faster and more memory efficient). In Python 3, zip already behaves like Python 2's izip.
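The laziness is easy to see in Python 3: zip can be handed an infinite iterator and still returns immediately, because pairs are only produced on demand. A small sketch:

```python
from itertools import count

# count() never ends, yet building the zip object does no work up front.
lazy = zip(count(), "abc")

first = next(lazy)  # the first pair is computed only now
rest = list(lazy)   # the remaining pairs; zip stops when "abc" runs out

print(first)  # (0, 'a')
print(rest)   # [(1, 'b'), (2, 'c')]
```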

R equivalent of [x[y] for x,y in zip(i,j)] in python?

In R, use the Map function:

i = list(c(1,2,3),c(2,3,4),c(2,4,2))
j = c(2,3,1)

Map(`[`, i, j)
[[1]]
[1] 2

[[2]]
[1] 4

[[3]]
[1] 2

You can also use mapply, which simplifies the result to a vector where possible instead of returning a list:

mapply(`[`, i, j)
[1] 2 4 2
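For comparison, here is the Python side of the same computation. This is a sketch under one assumption: since Python indexes from 0 and R from 1, the index list is shifted down by one so that both versions pick the same elements.

```python
i = [[1, 2, 3], [2, 3, 4], [2, 4, 2]]
j = [1, 2, 0]  # 0-based counterparts of R's c(2, 3, 1)

# Pair each sublist with its index and pick that element.
picked = [x[y] for x, y in zip(i, j)]

print(picked)  # [2, 4, 2]
```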

How do I add a loop when using R to scrape data?

The comment by @r2evans already provides an answer. Since @ShanCham asked how to actually implement it, here is the code. It is simply more verbose than the comment and therefore could not be posted as another comment.

library(rvest)

# only two example zip codes; there could be more, of course
zipcodes <- c("02110", "02125")

crime <- lapply(zipcodes, function(z) {

  site <- read_html(paste0("https://www.trulia.com/real_estate/", z, "-Boston/crime/"))

  # for illustrative purposes:
  # added as.numeric() to the numeric columns,
  # excluded some of your other columns and shortened the text in type
  data.frame(zip = z,
             theft = site %>% html_nodes(".crime-text-0") %>% html_text() %>% as.numeric(),
             assault = site %>% html_nodes(".crime-text-1") %>% html_text() %>% as.numeric(),
             type = site %>% html_nodes(".clearfix") %>% html_text() %>% paste(collapse = " ") %>% substr(1, 50),
             stringsAsFactors = FALSE)
})

class(crime)
# [1] "list"

# The output is a list of data.frames that can be bound together into one data.frame
crime <- do.call(rbind, crime)

#crime is a data.frame, hence, classes/types are kept
class(crime$type)
# [1] "character"
class(crime$assault)
# [1] "numeric"

Is there more to enumerate() than just zip(range(len()))?

Because not every iterable has a length.

>>> def countdown(x):
...     while x >= 0:
...         yield x
...         x -= 1
...
>>> down = countdown(3)
>>> len(down)
Traceback (most recent call last):
[...]
TypeError: object of type 'generator' has no len()
>>> enum = enumerate(down)
>>> next(enum)
(0, 3)
>>> next(enum)
(1, 2)

This is a trivial example, of course. But I can think of lots of real-world objects where you can't reasonably pre-compute a length, either because the length is infinite (see itertools.count) or because the object you are iterating over does not itself know when the party is over.
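The infinite case can be sketched with itertools.count. The variable names here are made up for illustration; islice is used only to stop after a few items.

```python
from itertools import count, islice

# count(10, 5) yields 10, 15, 20, ... forever, so it has no length.
steps = count(10, 5)

try:
    len(steps)
    has_len = True
except TypeError:
    has_len = False

# enumerate does not need a length; it numbers items as they arrive.
first_three = list(islice(enumerate(steps), 3))

print(has_len)      # False
print(first_three)  # [(0, 10), (1, 15), (2, 20)]
```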

Your iterator could be fetching chunks of data from a remote database of unknown size or to which the connection might get lost without warning. Or it could process user input.

def get_user_input():
    while True:
        i = input('input value or Q to quit: ')
        if i == 'Q':
            break
        yield i

You cannot get the length of get_user_input(), but you can enumerate all inputs as you fetch them via next (or iteration).
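To try this without typing at a prompt, here is a sketch that replaces input() with a prepared iterator. The names read_until_quit and fake_input are hypothetical and exist only for this illustration.

```python
def read_until_quit(source):
    """Yield values from source until the sentinel 'Q' appears."""
    for value in source:
        if value == "Q":
            break
        yield value

# Simulated user input; anything after "Q" is never read.
fake_input = iter(["3", "14", "Q", "ignored"])

# enumerate numbers the values as they are fetched, here starting at 1.
numbered = list(enumerate(read_until_quit(fake_input), start=1))

print(numbered)  # [(1, '3'), (2, '14')]
```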

Python zip pattern in R

You can emulate the Python zip with a data.frame like this:

medias_name <- c("print", "ooh", "tv", "digital")
medias_img <- c("Print.png", "Ooh.png", "Tv.png", "Digital.png")

myenv <- new.env()
zip_df <- data.frame(name = medias_name, img = medias_img, stringsAsFactors = FALSE)
# seq_len() is safe even when the data.frame has zero rows
for (i in seq_len(nrow(zip_df))) {
  myenv[[zip_df[i, 'name']]] <- zip_df[i, 'img']
}

So we get:

> zip_df
     name         img
1   print   Print.png
2     ooh     Ooh.png
3      tv      Tv.png
4 digital Digital.png

> ls(myenv)
[1] "digital" "ooh"     "print"   "tv"

> myenv$tv
[1] "Tv.png"

Another solution is to emulate zip with a list that packs the items:

medias_name <- c("print", "ooh", "tv", "digital")
medias_img <- c("Print.png", "Ooh.png", "Tv.png", "Digital.png")

myenv <- new.env()
ziplist <- list(medias_name, medias_img)
# seq_along() is safe even when the vectors are empty
for (i in seq_along(ziplist[[1]])) {
  name <- ziplist[[1]][i]
  img <- ziplist[[2]][i]
  myenv[[name]] <- img
}
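For reference, the Python pattern that the R code above emulates is presumably something like the following. This is a sketch that reuses the same example values; the name lookup is made up.

```python
medias_name = ["print", "ooh", "tv", "digital"]
medias_img = ["Print.png", "Ooh.png", "Tv.png", "Digital.png"]

# zip pairs each name with its image; dict turns the pairs into a lookup table,
# which plays the role of the R environment myenv.
lookup = dict(zip(medias_name, medias_img))

print(lookup["tv"])    # Tv.png
print(sorted(lookup))  # ['digital', 'ooh', 'print', 'tv']
```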

