How to Create a Presence-Absence Matrix

Create a presence-absence matrix with presence on specific dates

One option is a data.table non-equi join followed by dcast:

library(data.table)

setDT(df1)
setDT(df2)

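na.omit(                       # drop the NA-date row created for IDs that never match
  dcast(
    # non-equi join: for each date, keep the IDs whose Start < Date < End;
    # the second join brings back IDs with no matching date (Date = NA)
    df1[df2, .(Date, ID), on = .(Start < Date, End > Date)][df1, on = .(ID)],
    Date ~ ID,
    fun.aggregate = length
  )
)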

which gives

         Date Afr Ahe Art
1: 2015-07-01 1 0 0
2: 2015-07-02 1 0 1
3: 2015-07-03 1 0 1

Data

> dput(df1)
structure(list(ID = c("Afr", "Ahe", "Art"), Start = structure(c(16615,
17153, 16617), class = "Date"), End = structure(c(16847, 17586,
18382), class = "Date")), class = "data.frame", row.names = c(NA,
-3L))

> dput(df2)
structure(list(Date = structure(c(16617, 16618, 16619), class = "Date")), class = "data.frame", row.names = c(NA,
-3L))

How to transform a dataset into a presence/absence matrix?

Here's a tidy solution:

library(stringr)
library(dplyr)
library(tidyr)
dat <- data.frame(
  species = c("species_1", "species_1, species_2", "species_2, species_3"),
  year = c(2000, 2003, 2005)
)
dat %>%
  rowwise() %>%
  mutate(species = list(str_split(species, ",")[[1]])) %>%
  unnest(species) %>%
  mutate(species = trimws(species),
         value = 1) %>%
  pivot_wider(names_from = "species", values_fill = 0)
#> # A tibble: 3 × 4
#> year species_1 species_2 species_3
#> <dbl> <dbl> <dbl> <dbl>
#> 1 2000 1 0 0
#> 2 2003 1 1 0
#> 3 2005 0 1 1

Created on 2022-06-30 by the reprex package (v2.0.1)

How to generate species presence/absence matrix from lat/long data using R

Easy enough to do with tidyverse. First some example data:

library(tidyverse)

df <- tibble(
Sp = c('SP1', 'SP1', 'SP2', 'SP2', 'SP2'),
Long = c(118, 119, 118, 119, 119),
Lat = c(10, 11, 10, 11, 12)
)

Sp Long Lat
<chr> <dbl> <dbl>
1 SP1 118 10
2 SP1 119 11
3 SP2 118 10
4 SP2 119 11
5 SP2 119 12

And then a pivot operation. spread has recently been superseded by pivot_wider in tidyr (though spread will still be supported for now).

df2 <- df %>%
  mutate(present = 1) %>%  # create a dummy column
  pivot_wider(
    names_from = Sp,       # turn 'Sp' values into columns 'SP1' and 'SP2'
    values_from = present,
    values_fill = 0        # fill missing combinations with 0
  )

Long Lat SP1 SP2
<dbl> <dbl> <dbl> <dbl>
1 118 10 1 1
2 119 11 1 1
3 119 12 0 1

Creating a 'presence-absence' matrix from a pandas dataframe

>>> pd.crosstab(df['Site'], df['Species'])
Species Neofelis Panthera
Site
A 0 1
B 1 1
C 0 1
D 1 0
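Note that pd.crosstab returns counts, so a site with repeated records of the same species would show a value above 1. A minimal sketch of capping the counts at 1 is given below; the data frame is hypothetical (the original question's data is not shown), with one duplicate record added at site C to illustrate the point.

```python
import pandas as pd

# Hypothetical records, made up for illustration: site C has Panthera twice.
df = pd.DataFrame({
    "Site": ["A", "B", "B", "C", "C", "D"],
    "Species": ["Panthera", "Neofelis", "Panthera",
                "Panthera", "Panthera", "Neofelis"],
})

# crosstab counts how many records fall in each Site/Species cell.
counts = pd.crosstab(df["Site"], df["Species"])

# Cap every count at 1 so the table records presence (1) or absence (0).
presence = counts.clip(upper=1)

print(presence)
```

Here counts has a 2 for Panthera at site C, while presence has a 1, which matches the table shown above.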

How to create a presence-absence matrix?

We can use dcast

library(reshape2)
dcast(df1, Site~Species, length)

Presence-absence matrix

According to the error message, the problem is most likely the records object you pass to lets.presab.points. The function expects a character vector of species names, but you are passing a whole data frame. In the example code, species is a character vector, so your input needs to be in the same format. Something like the following may work (although I'm not sure how your records data is structured):

library(letsR)

PAM2 <- lets.presab.points(xy, records$species, xmn = -11.5, xmx = 3,
ymn = 49, ymx = 61)

species must contain one name for each row of xy.

Create a presence/absence column based on presence records

Create an occupancy column set to 1, use complete to generate every sample/family combination (filling occupancy with 0 for the new rows), and then use fill to carry site and n_days into those rows.

library(dplyr)
library(tidyr)

df %>%
  mutate(occupancy = 1) %>%
  complete(sample, family, fill = list(occupancy = 0)) %>%
  group_by(sample) %>%
  fill(site, n_days, .direction = 'updown') %>%
  ungroup()

# sample family site n_days occupancy
# <chr> <chr> <chr> <int> <dbl>
# 1 A_1_17/06/12 U A1 3 0
# 2 A_1_17/06/12 V A1 3 0
# 3 A_1_17/06/12 W A1 3 0
# 4 A_1_17/06/12 X A1 3 1
# 5 A_1_17/06/12 Y A1 3 1
# 6 A_1_17/06/12 Z A1 3 1
# 7 A_1_22/02/2011 U A1 3 0
# 8 A_1_22/02/2011 V A1 3 1
# 9 A_1_22/02/2011 W A1 3 0
#10 A_1_22/02/2011 X A1 3 1
# … with 14 more rows

Create a presence/absence matrix from two variables of a dataframe but adding the information of one third variable from the df instead of value 1

Try dcast from reshape2

library(reshape2)
dcast(mydf, day~paste0('ind_', individual),
value.var='weight', sum, fill=NA_real_)
# day ind_1 ind_2 ind_3 ind_4 ind_5 ind_6 ind_7
#1 1 20 18 36 36 41 NA NA
#2 2 25 NA 40 NA 46 30 12

and for 'length'

dcast(mydf, day~paste0('ind_', individual),
value.var='length', sum, fill=NA_integer_)
# day ind_1 ind_2 ind_3 ind_4 ind_5 ind_6 ind_7
#1 1 12 23 26 15 56 NA NA
#2 2 16 NA 30 NA 60 30 35

Or using base R (note that xtabs fills missing day/individual combinations with 0 rather than NA):

xtabs(weight~day+individual, mydf)

