Create a presence-absence matrix with presence on specific dates
We can try the code below
library(data.table)
setDT(df1)
setDT(df2)
na.omit(
  dcast(
    df1[df2, .(Date, ID), on = .(Start < Date, End > Date)][df1, on = .(ID)],
    Date ~ ID,
    fun.aggregate = length
  )
)
which gives
         Date Afr Ahe Art
1: 2015-07-01   1   0   0
2: 2015-07-02   1   0   1
3: 2015-07-03   1   0   1
Data
> dput(df1)
structure(list(ID = c("Afr", "Ahe", "Art"), Start = structure(c(16615,
17153, 16617), class = "Date"), End = structure(c(16847, 17586,
18382), class = "Date")), class = "data.frame", row.names = c(NA,
-3L))
> dput(df2)
structure(list(Date = structure(c(16617, 16618, 16619), class = "Date")), class = "data.frame", row.names = c(NA,
-3L))
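For comparison, the same interval check can be sketched in base R without a join. This is only a minimal sketch, not the answer's method: it rebuilds df1 and df2 from the dput output above, and the names d, starts_before, ends_after and pa are illustrative. It keeps the same strict inequalities (Start < Date and End > Date) as the join.

```r
# Rebuild the example data from the dput() output above
df1 <- data.frame(
  ID    = c("Afr", "Ahe", "Art"),
  Start = as.Date(c(16615, 17153, 16617), origin = "1970-01-01"),
  End   = as.Date(c(16847, 17586, 18382), origin = "1970-01-01")
)
df2 <- data.frame(Date = as.Date(c(16617, 16618, 16619), origin = "1970-01-01"))

# Compare every date (rows) with every interval (columns);
# as.numeric() lets outer() work on plain day counts
d <- as.numeric(df2$Date)
starts_before <- outer(d, as.numeric(df1$Start), ">")  # Start < Date
ends_after    <- outer(d, as.numeric(df1$End), "<")    # End > Date

# TRUE/FALSE -> 1/0, labelled by date and ID
pa <- (starts_before & ends_after) * 1L
dimnames(pa) <- list(format(df2$Date), df1$ID)
pa
```

Under these assumptions, pa holds the same 0/1 values as the data.table result, as a plain matrix with the dates as row names.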
How to transform a dataset into a presence/absence matrix?
Here's a tidy solution:
library(stringr)
library(dplyr)
library(tidyr)
dat <- data.frame(
species = c("species_1", "species_1, species_2", "species_2, species_3"),
year = c(2000, 2003, 2005)
)
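dat %>%
  rowwise() %>%
  mutate(species = list(str_split(species, ",")[[1]])) %>% # split each string into a list-column
  unnest(species) %>% # one row per species
  mutate(species = trimws(species), # drop the space left after each comma
         value = 1) %>%
  pivot_wider(names_from = "species", values_from = "value", values_fill = 0)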
#> # A tibble: 3 × 4
#>    year species_1 species_2 species_3
#>   <dbl>     <dbl>     <dbl>     <dbl>
#> 1  2000         1         0         0
#> 2  2003         1         1         0
#> 3  2005         0         1         1
Created on 2022-06-30 by the reprex package (v2.0.1)
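If the tidyverse is not available, the same split-then-widen idea can be sketched in base R. This is an illustrative alternative rather than the answer's approach; parts, long and pa are names chosen here, and the result is a matrix with years as row names instead of a tibble.

```r
dat <- data.frame(
  species = c("species_1", "species_1, species_2", "species_2, species_3"),
  year = c(2000, 2003, 2005)
)

# Split each string on commas (plus any spaces that follow)
parts <- strsplit(dat$species, ",\\s*")

# Long format: repeat each year once per species listed for it
long <- data.frame(
  year    = rep(dat$year, lengths(parts)),
  species = unlist(parts)
)

# Cross-tabulate year by species, then reduce the counts to 0/1
pa <- (unclass(table(long$year, long$species)) > 0) * 1L
pa
```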
How to generate species presence/absence matrix from lat/long data using R
Easy enough to do with tidyverse. First some example data:
library(tidyverse)
df <- tibble(
  Sp = c('SP1', 'SP1', 'SP2', 'SP2', 'SP2'),
  Long = c(118, 119, 118, 119, 119),
  Lat = c(10, 11, 10, 11, 12)
)
  Sp     Long   Lat
  <chr> <dbl> <dbl>
1 SP1     118    10
2 SP1     119    11
3 SP2     118    10
4 SP2     119    11
5 SP2     119    12
And then a pivot operation. spread has recently been superseded by pivot_wider in tidyr (though spread will still be supported for now).
df2 <- df %>%
  mutate(present = 1) %>% # create a dummy column
  pivot_wider(names_from = Sp, values_from = present, # turn 'Sp' column into 'SP1' and 'SP2'
              values_fill = 0) # combinations with no record become 0
   Long   Lat   SP1   SP2
  <dbl> <dbl> <dbl> <dbl>
1   118    10     1     1
2   119    11     1     1
3   119    12     0     1
Creating a 'presence-absence' matrix from a pandas dataframe
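pd.crosstab counts how often each Site/Species pair occurs (assuming pandas is imported as pd and df has Site and Species columns):
>>> pd.crosstab(df['Site'], df['Species'])
Species  Neofelis  Panthera
Site
A               0         1
B               1         1
C               0         1
D               1         0
If a pair can occur more than once, the counts will exceed 1; appending .clip(upper=1) to the result caps them at 1.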
How to create a presence-absence matrix?
We can use dcast from reshape2:
library(reshape2)
dcast(df1, Site~Species, length)
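Note that aggregating with length counts records, so a Site/Species pair that is recorded twice yields 2 rather than 1. A minimal base R sketch of one way around this is to drop duplicate pairs before cross-tabulating. The df1 below is hypothetical (the question's data is not shown here), and pa is an illustrative name.

```r
# Hypothetical records; the A/Panthera pair appears twice on purpose
df1 <- data.frame(
  Site    = c("A", "A", "B", "B", "D"),
  Species = c("Panthera", "Panthera", "Neofelis", "Panthera", "Neofelis")
)

# Keep one row per Site/Species pair, then cross-tabulate:
# every cell is now either 0 (absent) or 1 (present)
pa <- table(unique(df1[c("Site", "Species")]))
pa
```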
Presence-absence matrix
According to the error message, the issue is probably with records in lets.presab.points. The function expects a vector of species names, but you are giving it a data frame. In your example code, species is a character vector, so your own data needs to be in the same format. You might need to do something like this (although I'm not sure what format your records data is in):
library(letsR)
PAM2 <- lets.presab.points(xy, records$species, xmn = -11.5, xmx = 3,
                           ymn = 49, ymx = 61)
species needs to be the same length as xy (one species name per coordinate pair).
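As a rough sketch of preparing the two inputs, the base R code below builds xy and species from a single data frame and checks that they line up. The records data frame and its column names (species, lon, lat) are assumptions made for illustration, since the real format of records is unknown; the sketch stops short of calling lets.presab.points itself.

```r
# Hypothetical occurrence records: one row per observation
records <- data.frame(
  species = c("sp_a", "sp_a", "sp_b"),
  lon     = c(-3.2, -1.5, 0.4),
  lat     = c(55.9, 53.0, 51.5)
)

# xy: two-column numeric matrix, longitude first, then latitude
xy <- as.matrix(records[, c("lon", "lat")])

# species: plain character vector, in the same row order as xy
species <- as.character(records$species)

# Check the two inputs line up before passing them to lets.presab.points()
stopifnot(is.character(species), nrow(xy) == length(species))
```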
Create a presence/absence column based on presence records
Create an occupancy column with value 1, use complete to create the missing sample/family combinations (with occupancy set to 0), and use fill to fill in the remaining missing values.
library(dplyr)
library(tidyr)
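df %>%
  mutate(occupancy = 1) %>% # every recorded row is a presence
  complete(sample, family, fill = list(occupancy = 0)) %>% # add unobserved combinations as absences
  group_by(sample) %>%
  fill(site, n_days, .direction = 'updown') %>% # copy site and n_days into the new rows
  ungroup()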
#    sample         family site  n_days occupancy
#    <chr>          <chr>  <chr>  <int>     <dbl>
#  1 A_1_17/06/12   U      A1         3         0
#  2 A_1_17/06/12   V      A1         3         0
#  3 A_1_17/06/12   W      A1         3         0
#  4 A_1_17/06/12   X      A1         3         1
#  5 A_1_17/06/12   Y      A1         3         1
#  6 A_1_17/06/12   Z      A1         3         1
#  7 A_1_22/02/2011 U      A1         3         0
#  8 A_1_22/02/2011 V      A1         3         1
#  9 A_1_22/02/2011 W      A1         3         0
# 10 A_1_22/02/2011 X      A1         3         1
# … with 14 more rows
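The "complete the combinations, then fill with 0" step can also be sketched in base R. This is a minimal illustration on hypothetical data (the sample and family values below are made up), and it covers only the occupancy column, not the site and n_days fill.

```r
# Hypothetical presence-only records: one row per family seen in a sample
df <- data.frame(
  sample = c("s1", "s1", "s2"),
  family = c("X", "Y", "X")
)
df$occupancy <- 1

# Every sample/family combination, observed or not
grid <- expand.grid(
  sample = unique(df$sample),
  family = unique(df$family),
  stringsAsFactors = FALSE
)

# Left join the records onto the grid; unobserved pairs get NA ...
full <- merge(grid, df, by = c("sample", "family"), all.x = TRUE)

# ... which is then recorded as an absence
full$occupancy[is.na(full$occupancy)] <- 0
full
```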
Create a presence/absence matrix from two variables of a dataframe but adding the information of one third variable from the df instead of value 1
Try dcast from reshape2:
library(reshape2)
dcast(mydf, day~paste0('ind_', individual),
      value.var='weight', sum, fill=NA_real_)
#  day ind_1 ind_2 ind_3 ind_4 ind_5 ind_6 ind_7
#1   1    20    18    36    36    41    NA    NA
#2   2    25    NA    40    NA    46    30    12
And for 'length':
dcast(mydf, day~paste0('ind_', individual),
      value.var='length', sum, fill=NA_integer_)
#  day ind_1 ind_2 ind_3 ind_4 ind_5 ind_6 ind_7
#1   1    12    23    26    15    56    NA    NA
#2   2    16    NA    30    NA    60    30    35
Or using base R (note that xtabs shows 0 rather than NA for combinations with no records):
xtabs(weight~day+individual, mydf)
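To keep NA for missing combinations in base R, tapply is one option. The sketch below is hedged: mydf is not shown in the answer, so it is reconstructed here from the printed weights above on the assumption that there is one record per day/individual pair, and w_na and w_zero are illustrative names.

```r
# Reconstructed from the printed weights above (an assumption, not the
# original data): one record per day/individual pair
mydf <- data.frame(
  day        = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  individual = c(1, 2, 3, 4, 5, 1, 3, 5, 6, 7),
  weight     = c(20, 18, 36, 36, 41, 25, 40, 46, 30, 12)
)

# tapply() leaves combinations with no records as NA
w_na <- tapply(mydf$weight, list(day = mydf$day, ind = mydf$individual), sum)

# xtabs() sums over the same cells but reports empty ones as 0
w_zero <- xtabs(weight ~ day + individual, data = mydf)

w_na
w_zero
```

Which one fits depends on whether a 0 could be mistaken for a real measurement; for weights, NA is usually the safer marker of absence.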