How to Calculate Euclidean Distance Between Two Matrices in R

You can use the package pdist:

library(pdist)
dists <- pdist(t(mat1), t(mat2))
as.matrix(dists)
         [,1]      [,2]      [,3]
[1,]  9220.40  9260.735  8866.033
[2,] 12806.35 12820.086 12121.927
[3,] 11630.86 11665.869 11155.823

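This gives the Euclidean distance for every pair of columns: (mat1$x, mat2$x), (mat1$x, mat2$y), ..., (mat1$z, mat2$z). The transposes are needed because pdist compares rows, so t() turns the columns of mat1 and mat2 into rows.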
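To see the column-pair layout on numbers that are easy to check by hand, here is a minimal base R sketch. The toy matrices and the helper col_dists() are made up for illustration and are not part of pdist; the helper only mimics the layout that pdist(t(mat1), t(mat2)) returns.

```r
# Toy matrices with named columns; the values are made up for illustration.
mat1 <- cbind(x = c(0, 0), y = c(3, 4), z = c(1, 1))
mat2 <- cbind(x = c(0, 0), y = c(6, 8), z = c(1, 2))

# Hypothetical helper: distance between every column of a and every column of b.
# Rows of the result follow the columns of a, columns follow the columns of b.
col_dists <- function(a, b) {
  apply(b, 2, function(bj) sqrt(colSums((a - bj)^2)))
}

d <- col_dists(mat1, mat2)
d
```

Entry d["y", "x"] is the distance between column y of mat1 and column x of mat2.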

Calculate Euclidean distance between multiple pairs of points in dataframe in R

If you just want to add another column to your data frame, try the following (assuming P and S hold the differences between the two points along each axis):

testdf$distance <- with(testdf, sqrt(P^2 + S^2))
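If the data frame stores the raw coordinates of both points rather than their differences, the subtraction can be done inline. A minimal sketch, assuming hypothetical columns x1, y1, x2 and y2 (these names are not from the original question):

```r
# Hypothetical layout: one row per pair of points, (x1, y1) and (x2, y2).
testdf <- data.frame(
  x1 = c(0, 1), y1 = c(0, 1),
  x2 = c(3, 4), y2 = c(4, 5)
)

# Row-wise Euclidean distance between the two points of each pair.
testdf$distance <- with(testdf, sqrt((x2 - x1)^2 + (y2 - y1)^2))
testdf
```

All the arithmetic is vectorized, so no loop over rows is needed.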

How to calculate the euclidean distance in R between two matrices each with unequal dimensions

Change your for loop so that, for each row, it fills all three columns of the result matrix:

for (i in seq_len(nrow(MatrixA))) {
  resultMatrix[i, 1] <- sqrt(rowSums((t(MatrixA[i, ]) - MatrixB[i, 1:2])^2))
  resultMatrix[i, 2] <- sqrt(rowSums((t(MatrixA[i, ]) - MatrixB[i, 3:4])^2))
  resultMatrix[i, 3] <- sqrt(rowSums((t(MatrixA[i, ]) - MatrixB[i, 5:6])^2))
}

Generalized for an arbitrary number of columns:

for (i in seq_len(nrow(MatrixA))) {
  for (j in seq_len(ncol(MatrixB) / 2)) {
    k <- (j * 2) - 1
    resultMatrix[i, j] <- sqrt(rowSums((t(MatrixA[i, ]) - MatrixB[i, k:(k + 1)])^2))
  }
}
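The loops above assume that resultMatrix already exists. The following self-contained sketch allocates it and replaces the loop over rows with whole-column arithmetic. The sample values are made up, and the layout (two columns in MatrixA, consecutive column pairs in MatrixB) is an assumption taken from the loop indices.

```r
# Made-up data: MatrixA has one (x, y) point per row,
# MatrixB has three (x, y) points per row in columns 1:2, 3:4 and 5:6.
MatrixA <- matrix(c(0, 0,
                    1, 1), ncol = 2, byrow = TRUE)
MatrixB <- matrix(c(3, 4, 0, 1, 6, 8,
                    1, 1, 4, 5, 1, 2), ncol = 6, byrow = TRUE)

n_pairs <- ncol(MatrixB) / 2
resultMatrix <- matrix(NA_real_, nrow = nrow(MatrixA), ncol = n_pairs)

# One pass per column pair; rowSums handles all rows at once.
for (j in seq_len(n_pairs)) {
  k <- 2 * j - 1
  resultMatrix[, j] <- sqrt(rowSums((MatrixA - MatrixB[, k:(k + 1)])^2))
}
resultMatrix
```

Only the loop over column pairs remains, which is usually far shorter than the loop over rows.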

How to use apply function to calculate the distance between two matrices

Use two apply instances with the second nested in the first:

d1 <- apply(xtest, 1, function(x) apply(xtrain, 1, function(y) sqrt(crossprod(x-y))))

Check against pdist:

library(pdist)
d2 <- as.matrix(pdist(xtrain, xtest))

all.equal(d1, d2, tolerance = 1e-7)
## [1] TRUE
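If pdist is not installed, the same check can be sketched with base R only. The data below is random and purely illustrative. Note the orientation of the result: rows correspond to xtrain and columns to xtest.

```r
set.seed(1)
xtrain <- matrix(rnorm(5 * 3), nrow = 5)  # 5 training points
xtest  <- matrix(rnorm(2 * 3), nrow = 2)  # 2 test points

# Nested apply: the outer call walks over xtest rows, the inner over xtrain rows.
d1 <- apply(xtest, 1, function(x) apply(xtrain, 1, function(y) sqrt(crossprod(x - y))))
dim(d1)

# Reference: stack both matrices, let dist() compute all pairwise distances,
# then keep only the train-by-test block.
full <- as.matrix(dist(rbind(xtrain, xtest)))
d3 <- full[1:5, 6:7]

all.equal(d1, d3, check.attributes = FALSE)
```

The dist() route computes more distances than needed, so it is meant as a cross-check rather than a faster method.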

Distance matrix from two separate data frames

You could use the fields package; its rdist function may do what you want:

rdist : Euclidean distance matrix

Description: Given two sets of locations computes the Euclidean distance matrix among all pairings.

> rdist(df1, df2)
         [,1]     [,2]     [,3]     [,4]     [,5]
[1,] 4.582576 6.782330 2.000000 1.732051 2.828427
[2,] 4.242641 5.744563 1.732051 0.000000 1.732051
[3,] 4.123106 5.099020 3.464102 3.316625 4.000000
[4,] 5.477226 5.000000 4.358899 3.464102 3.316625
[5,] 7.000000 5.477226 5.656854 4.358899 3.464102

The pdist package offers a similar function:

pdist : Distances between Observations for a Partitioned Matrix

Description: Computes the euclidean distance between rows of a matrix X and rows of another matrix Y.

> pdist(df1, df2)
An object of class "pdist"
Slot "dist":
[1] 4.582576 6.782330 2.000000 1.732051 2.828427 4.242640 5.744563 1.732051
[9] 0.000000 1.732051 4.123106 5.099020 3.464102 3.316625 4.000000 5.477226
[17] 5.000000 4.358899 3.464102 3.316625 7.000000 5.477226 5.656854 4.358899
[25] 3.464102
attr(,"Csingle")
[1] TRUE

Slot "n":
[1] 5

Slot "p":
[1] 5

Slot ".S3Class":
[1] "pdist"

NOTE: If you are looking for the Euclidean distance between the rows of two matrices, here is a small reproducible example:

a <- c(1,2,3,4,5)
b <- c(5,4,3,2,1)
c <- c(5,4,1,2,3)
df1 <- rbind(a, b, c)

a2 <- c(2,7,1,2,3)
b2 <- c(7,6,5,4,3)
c2 <- c(1,2,3,4,5)
df2 <- rbind(a2,b2,c2)

library(fields)
rdist(df1, df2)

This gives:

         [,1]     [,2]     [,3]
[1,] 6.164414 7.745967 0.000000
[2,] 5.099020 4.472136 6.324555
[3,] 4.242641 5.291503 5.656854

Compute the Euclidean distances between corresponding rows of 2 matrices

There is indeed a nice way to do that. Let

A <- matrix(rnorm(4 * 8), nrow = 4, ncol = 8)
B <- matrix(rnorm(4 * 8), nrow = 4, ncol = 8)

Then

sqrt(rowSums((A - B)^2))
# [1] 3.295312 3.222073 6.857711 2.991980

Here A - B subtracts element-wise, ^2 squares each element of the result, rowSums adds up each row, and sqrt takes the element-wise square root, giving one distance per row.
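Because A and B above are random, the printed values change from run to run. A small sketch with made-up fixed values makes the result easy to verify by hand, and compares it against an explicit row-by-row computation:

```r
# Made-up matrices with one point per row.
A <- rbind(c(0, 0, 0),
           c(1, 2, 3))
B <- rbind(c(2, 3, 6),
           c(1, 2, 5))

# Vectorized: one distance per pair of corresponding rows.
d_rows <- sqrt(rowSums((A - B)^2))

# Same computation spelled out row by row, for comparison.
d_loop <- vapply(seq_len(nrow(A)),
                 function(i) sqrt(sum((A[i, ] - B[i, ])^2)),
                 numeric(1))

d_rows
all.equal(d_rows, d_loop)
```

The first row gives sqrt(4 + 9 + 36) = 7 and the second gives sqrt(0 + 0 + 4) = 2.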

(Speed Challenge) Any faster method to calculate distance matrix between rows of two matrices, in terms of Euclidean distance?

method_XXX <- function() {
  sqrt(outer(rowSums(x^2), rowSums(y^2), '+') - tcrossprod(x, 2 * y))
}

Unit: relative
                       expr       min        lq     mean    median        uq      max
 method_ThomasIsCoding_v1() 12.151624 10.486417 9.213107 10.162740 10.235274 5.278517
 method_ThomasIsCoding_v2()  6.923647  6.055417 5.549395  6.161603  6.140484 3.438976
 method_ThomasIsCoding_v3()  7.133525  6.218283 5.709549  6.438797  6.382204 3.383227
      method_AllanCameron()  7.093680  6.071482 5.776172  6.447973  6.497385 3.608604
               method_XXX()  1.000000  1.000000 1.000000  1.000000  1.000000 1.000000
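The fast method relies on the identity |a - b|^2 = |a|^2 + |b|^2 - 2 a.b, which turns the whole distance matrix into one outer sum and one matrix product. The sketch below wraps it in a function that takes x and y as arguments; the name fast_dist and the pmax() guard are additions for illustration, not part of the benchmarked code. The guard is there because rounding can make a squared distance slightly negative when two rows are (nearly) identical, and sqrt() would then return NaN.

```r
# Hypothetical wrapper around the identity |a - b|^2 = |a|^2 + |b|^2 - 2 a.b
fast_dist <- function(x, y) {
  sq <- outer(rowSums(x^2), rowSums(y^2), "+") - 2 * tcrossprod(x, y)
  sqrt(pmax(sq, 0))  # clamp tiny negative values caused by rounding
}

set.seed(42)
x <- matrix(rnorm(4 * 3), nrow = 4)
y <- matrix(rnorm(5 * 3), nrow = 5)

d_fast <- fast_dist(x, y)

# Cross-check against dist() on the stacked matrices.
d_ref <- as.matrix(dist(rbind(x, y)))[1:4, 5:9]
all.equal(d_fast, d_ref, check.attributes = FALSE)
```

The result has one row per row of x and one column per row of y.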

How to calculate Euclidean distance between two points defined by matrix containing x, y?

If you just want a vector of distances, something like this will work:

euc.dist <- function(x1, x2) sqrt(sum((x1 - x2) ^ 2))

library(foreach)
foreach(i = 1:nrow(x1), .combine = c ) %do% euc.dist(x1[i,],x2[i,])

This works for any number of dimensions.

If you don't want to use foreach, you can use a simple loop:

dists <- numeric(nrow(x1))
for (i in seq_len(nrow(x1))) dists[i] <- euc.dist(x1[i, ], x2[i, ])
dists

That said, I would recommend foreach, because it is easy to use for tasks like this. Read more about it in the package documentation.

Calculate euclidean distance with R

How about something like this:

First, I'll make some fake data

set.seed(4304)
df <- data.frame(
  x = runif(1000, -1, 1),
  y = runif(1000, -1, 1),
  z = runif(1000, -1, 1)
)

Make a sequence of values from 1 to the number of rows of your dataset by 2s.

s <- seq(1, nrow(df), by=2)

Use sapply() to compute the distance between each pair of points.

out <- sapply(s, function(i) {
  sqrt(sum((df[i, ] - df[(i + 1), ])^2))
})

Organize the distances into a data frame:

res <- data.frame(
  pair = paste(rownames(df)[s], rownames(df)[(s + 1)], sep = "-"),
  dist = out)
head(res)
#    pair     dist
# 1   1-2 1.379992
# 2   3-4 1.303511
# 3   5-6 1.242302
# 4   7-8 1.257228
# 5  9-10 1.107484
# 6 11-12 1.392247
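The per-pair sapply() call can also be replaced by a single vectorized expression. This is a minimal sketch on a smaller made-up data set (10 rows instead of 1000); converting to a matrix first is a choice made here so that rowSums works on plain numeric rows.

```r
set.seed(4304)
df <- data.frame(
  x = runif(10, -1, 1),
  y = runif(10, -1, 1),
  z = runif(10, -1, 1)
)
s <- seq(1, nrow(df), by = 2)  # first row of each consecutive pair

# Vectorized: subtract the even rows from the odd rows in one step.
m <- as.matrix(df)
out_vec <- sqrt(rowSums((m[s, ] - m[s + 1, ])^2))

# The sapply() version, for comparison.
out_loop <- sapply(s, function(i) sqrt(sum((df[i, ] - df[i + 1, ])^2)))

all.equal(unname(out_vec), unname(out_loop))
```

Both versions assume an even number of rows, so that every row in s has a partner at s + 1.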

How to calculate the max euclidean distance between two elements in a matrix - R?

Here is one approach that avoids visiting each element of the matrix with a for loop.

# set up
set.seed(123)
n <- 100
m <- matrix(sample(c(1,0), size = n^2, replace = TRUE), n, n)

# find the ones in the matrix and calculate the pairwise distances
ind <- which(m == 1, arr.ind = TRUE)
dists <- dist(ind) # default is Euclidean

# a dist object stores only the packed lower triangle, so expand it to a
# full matrix before converting the position of the largest entry to (i, j)
dmat <- as.matrix(dists)
ind2d <- arrayInd(which.max(dmat), .dim = dim(dmat))

# get answer: each row of ans is the (row, col) position of one endpoint
ans <- ind[as.vector(ind2d), ]
ans
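A tiny matrix where the answer is obvious makes it easy to check the index lookup. Positions in a dist object refer to the packed lower triangle rather than to a full square matrix, so this sketch converts to a full matrix before calling arrayInd(). The 5 x 5 matrix and the positions of its ones are made up for illustration.

```r
# Made-up 5 x 5 matrix with ones at (1,1), (4,2), (2,3) and (5,5).
m <- matrix(0, 5, 5)
m[cbind(c(1, 4, 2, 5), c(1, 2, 3, 5))] <- 1

ind <- which(m == 1, arr.ind = TRUE)  # one (row, col) pair per one
dmat <- as.matrix(dist(ind))          # full symmetric distance matrix

# Position of the largest distance, as indices into the rows of ind.
pair <- arrayInd(which.max(dmat), .dim = dim(dmat))
ans <- ind[as.vector(pair), ]
ans
```

The two farthest ones are at (1,1) and (5,5), with distance sqrt(32). The full matrix needs memory proportional to the square of the number of ones, so this suits moderately sized inputs.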

