Calculating All Distances Between One Point and a Group of Points Efficiently in R

Calculating all distances between one point and a group of points efficiently in R

Rather than iterating across data points, you can just condense that to a matrix operation, meaning you only have to iterate across K.

# Generate some fake data.
n <- 3823
K <- 10
d <- 64
x <- matrix(rnorm(n * d), ncol = n)
centers <- matrix(rnorm(K * d), ncol = K)

system.time(
  dists <- apply(centers, 2, function(center) {
    colSums((x - center)^2)
})
)

Runs in:

utilisateur     système      écoulé 
      0.100       0.008       0.108

on my laptop.

Calculate distance from one point to the others by R

See ?distm: you can use two sets of points:

distm(coordinaties[6,2:1],coordinaties[-6,2:1])

         [,1]     [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]    [,10]    [,11]    [,12]
[1,] 11075.61 11075.61    0    0    0    0    0    0    0 10183.02 10183.02 10183.02
        [,13]    [,14] [,15] [,16] [,17] [,18]
[1,] 10183.02 10183.02     0     0     0     0

Calculate minimum distance between groups of points in data frame

One way in base R using combn :

do.call(rbind, combn(unique(df$Group), 2, function(x) {
  df1 <- subset(df, Group == x[1])
  df2 <- subset(df, Group == x[2])
  df3 <- merge(df1, df2, by = 'Time')
  value <- abs(df3$Value.x - df3$Value.y)
  data.frame(combn = paste(x, collapse = ''), 
             time = df3$Time[which.max(value)],
             max_difference = max(value))
}, simplify = FALSE))

#  combn time max_difference
#1    AB    1              4
#2    AC    0              8
#3    BC    0              5

We create all combination of unique Group values, subset the data for them and merge them on Time. Subtract the corresponding value columns and return the max difference between them.

data

df <- structure(list(Time = c(0L, 1L, 2L, 0L, 1L, 2L, 0L, 0L, 0L), 
    Value = c(1, 2, 3, 4, 6, 6, 7, 7, 9), Group = c("A", "A", 
    "A", "B", "B", "B", "C", "C", "C")), 
    class = "data.frame", row.names = c(NA, -9L))

R: Calculate distance between consecutive points per group and group them

I worked out a little use case that can get you started. It is a base approach using a for loop and aggregation based on vector of columns to which you could apply a paired vector of functions by which to aggregate.

df <- read.table(text = "
Group    X   Y  Z  Distance
1   110 3762 431 10        NA
2   112 4950 880 10        NA
3   113 5062 873 20        NA
4   113 5225 874 30 163.00307
5   113 5262 875 10  37.01351
6   113 5300 874 20  38.01316
7   114 5300 874 30  NA
8   114 5300 874 20  38.01316", header = T, stringsAsFactors = F)

aggregateIt <- function(df = data, #data.frame
                        returnRaw = F, #to get the raw unaggregted df (only first case from column `grouped` by `subgroup` usable in this application)
                        colsToAgg = c("Z1", "Z2", "Z3"), #cols to aggregate
                        how = c("sum", "sum", "max")) #how to aggregate the columns, `Z1` by sum, `Z2` by sum and `Z3` by max
  {
  count <- 1L
  result <- vector("integer", nrow(df))
  grouped <- vector("character", nrow(df))
  for(i in seq_len(length(result)-1L)){
    if(df$Group[i] != df$Group[i+1L]) {
      result[i] <- count
      grouped[i] <- "no"
      count <- count + 1L
      if((i+1L) == length(result)) {
        result[i+1L] <- count
        grouped[i+1L] <- "no"
      }
    } else {
        if(df$Distance[i+1L] > 100L) {
          result[i] <- count
          grouped[i] <- "no"
          count <- count + 1L
          if((i+1L) == length(result)) {
            result[i+1L] <- count
            grouped[i+1L] <- "no"
          }
        } else {
          result[i] <- count
          grouped[i] <- "yes"
          if((i+1L) == length(result)) {
            result[i+1L] <- count
            grouped[i+1L] <- "yes"
          }
        }
    }
  }
  df <- within(df, {subgroup <- result; grouped <- grouped})
  if(returnRaw) return(df)
  A <- Reduce(function(a, b) merge(a, b, by = "subgroup"), 
         lapply(seq_along(how), function(x) aggregate(.~subgroup, df[, c(colsToAgg[x], "subgroup")], how[x])))
  B <- df[!duplicated(df$subgroup, fromLast = F), c("Group", "subgroup", "grouped")]
  out <- merge(A, B, by = "subgroup")
  return(out[, c("Group", colsToAgg, "grouped")])
}

aggregateIt(df = df, colsToAgg = "Z", how = "sum")
#  Group  Z grouped
#1   110 10      no
#2   112 10      no
#3   113 20      no
#4   113 60     yes
#5   114 50     yes

Not claiming this is most efficient solution but it points out the solution. Hope this helps!

Calculate distance of one point in DF with all other points in R

If your points exist in 2D space (e.g. Euclidean), then you can use the Cluster package:

library(cluster)
data(agriculture)

##  Dissimilarities using Euclidean metric
d.agr <- daisy(agriculture, metric = "euclidean")
as.matrix(d.agr)

The final matrix will give you the "distance" between each point, according to the metric you set (Euclidean in the above example).

Calculate the distances between pairs of points in r

rbind(x,y) has 2 rows, 10 columns and is interpreted as 2 points in 10-dimensional space. dist(rbind(x,y)) is calculating the Euclidean distance between these 2 points.

Calculating the distance between two long/lat points in the same data.frame

This is a easily solved with the distGeo function (similar to your functions above) from geosphere package:

library(geosphere)
#calculate distances in meters
df$distance<-distGeo(df[,c("lon1", "lat1")], df[,c("lon2", "lat2")])

#remove columns
df[, -c(3:6)]

customer_id      id distance
1   353808874    8474 498.2442
2    69516747    8107 668.4088
3   357032052 1617436 366.9541
4   307735090    7698 531.0785
5   307767260 1617491 343.3051

Distance between a matrix of points, simple if & for's

You could use the dist function:

df  <- data.frame(easting=easting,northing = northing)
dist(df) # or round(dist(df,upper=T,diag=T),3)

example for the first three rows:

round(dist(df[1:3,], upper=T,diag=T),3)

        1       2       3
1   0.000 310.409 581.588
2 310.409   0.000 271.221
3 581.588 271.221   0.000

Comparison:

round(dist(df[1:3,]),3)

        1       2
2 310.409        
3 581.588 271.221

Calculating distance between coordinates and reference point

points_in_circle() returns the points within a given radius from a reference point. The following returns all points within 1000km from the reference point:


library(spatialrisk)
points_in_circle(df, lat_center = 52.92343, lon_center = 5.04127, 
                 lon = Longitude, lat = Latitude, radius = 1e6)
#>         Day Month Year            Location.Receiver    Transmitter
#> 1095729  26    07 2021         Den Oever Ijsselmeer A69-1602-59776
#> 1072657  17    08 2021         Den Oever Ijsselmeer A69-1602-59776
#> 1092667  18    08 2021         Den Oever Ijsselmeer A69-1602-59776
#> 716601   19    08 2021         Den Oever Ijsselmeer A69-1602-59769
#> 1077415  19    08 2021         Den Oever Ijsselmeer A69-1602-59776
#> 1180267  05    08 2021 Medemblik Ijsselmeer, gemaal A69-1602-59777
#>         Batch.location BatchNr Latitude Longitude       Date distance_m
#> 1095729      Den Oever       8 52.92343   5.04127 2021-07-26       0.00
#> 1072657      Den Oever       8 52.92343   5.04127 2021-08-17       0.00
#> 1092667      Den Oever       8 52.92343   5.04127 2021-08-18       0.00
#> 716601       Den Oever       1 52.92343   5.04127 2021-08-19       0.00
#> 1077415      Den Oever       8 52.92343   5.04127 2021-08-19       0.00
#> 1180267      Den Oever       9 52.76098   5.12172 2021-08-05   18875.55

^{Created on 2021-12-02 by the reprex package (v2.0.1)}

Calculating All Distances Between One Point and a Group of Points Efficiently in R