## Geographic / geospatial distance between 2 lists of lat/lon points (coordinates)

To calculate the geographic distance between two points with latitude/longitude coordinates, you can use several formula's. The package `geosphere`

has the `distCosine`

, `distHaversine`

, `distVincentySphere`

and `distVincentyEllipsoid`

for calculating the distance. Of these, the `distVincentyEllipsoid`

is considered the most accurate one, but is computationally more intensive than the other ones.

With one of these functions, you can make a distance matrix. Based on that matrix you can then assign `locality`

names based on shortest distance with `which.min`

and the corresponding distance with `min`

(see for this the last part of the answer) like this:

`library(geosphere)`

# create distance matrix

mat <- distm(list1[,c('longitude','latitude')], list2[,c('longitude','latitude')], fun=distVincentyEllipsoid)

# assign the name to the point in list1 based on shortest distance in the matrix

list1$locality <- list2$locality[max.col(-mat)]

this gives:

`> list1`

longitude latitude locality

1 80.15998 12.90524 D

2 72.89125 19.08120 A

3 77.65032 12.97238 C

4 77.60599 12.90927 D

5 72.88120 19.08225 A

6 76.65460 12.81447 E

7 72.88232 19.08241 A

8 77.49186 13.00984 D

9 72.82228 18.99347 A

10 72.88871 19.07990 A

Another possibility is to assign the `locality`

based on the average longitude and latitude values of the `locality`

s in `list2`

:

`library(dplyr)`

list2a <- list2 %>% group_by(locality) %>% summarise_each(funs(mean)) %>% ungroup()

mat2 <- distm(list1[,c('longitude','latitude')], list2a[,c('longitude','latitude')], fun=distVincentyEllipsoid)

list1 <- list1 %>% mutate(locality2 = list2a$locality[max.col(-mat2)])

or with `data.table`

:

`library(data.table)`

list2a <- setDT(list2)[,lapply(.SD, mean), by=locality]

mat2 <- distm(setDT(list1)[,.(longitude,latitude)], list2a[,.(longitude,latitude)], fun=distVincentyEllipsoid)

list1[, locality2 := list2a$locality[max.col(-mat2)] ]

this gives:

`> list1`

longitude latitude locality locality2

1 80.15998 12.90524 D D

2 72.89125 19.08120 A B

3 77.65032 12.97238 C C

4 77.60599 12.90927 D C

5 72.88120 19.08225 A B

6 76.65460 12.81447 E E

7 72.88232 19.08241 A B

8 77.49186 13.00984 D C

9 72.82228 18.99347 A B

10 72.88871 19.07990 A B

As you can see, this leads in most (7 out of 10) occasions to another assigned `locality`

.

You can add the distance with:

`list1$near_dist <- apply(mat2, 1, min)`

or another approach with `max.col`

(which is highly probable faster):

`list1$near_dist <- mat2[matrix(c(1:10, max.col(-mat2)), ncol = 2)]`

# or using dplyr

list1 <- list1 %>% mutate(near_dist = mat2[matrix(c(1:10, max.col(-mat2)), ncol = 2)])

# or using data.table (if not already a data.table, convert it with 'setDT(list1)' )

list1[, near_dist := mat2[matrix(c(1:10, max.col(-mat2)), ncol = 2)] ]

the result:

`> list1`

longitude latitude locality locality2 near_dist

1: 80.15998 12.90524 D D 269966.8970

2: 72.89125 19.08120 A B 65820.2047

3: 77.65032 12.97238 C C 739.1885

4: 77.60599 12.90927 D C 9209.8165

5: 72.88120 19.08225 A B 66832.7223

6: 76.65460 12.81447 E E 0.0000

7: 72.88232 19.08241 A B 66732.3127

8: 77.49186 13.00984 D C 17855.3083

9: 72.82228 18.99347 A B 69456.3382

10: 72.88871 19.07990 A B 66004.9900

## How to convert the geospatial distance of two points (lat, lon) to a similarity measuremnt

A simple answer is to define the extremes, and then interpolate linearly between them. So you would have

- if
`dv == 0`

,`similarity = 1`

(same point) - if
`dv == MAX`

,`similarity = 0`

(antipodal points; you can take 20 000 000 m as an approximation for MAX, or use the exact value, whichever you prefer) - otherwise,
`similarity = (MAX - dv) / MAX`

You can use other interpolations besides linear - but the basic idea holds: this is easier than you make it seem. Any monotonous and continous function `f`

such that `f(0) == 0 && f(1) == 1`

will make `f((MAX - dv) / MAX)`

a similarity that matches your constraints.

## Function to calculate geospatial distance between two points (lat,long) using R

Loading the geosphere package you can use a number of different functions

`library(geosphere)`

distm(c(lon1, lat1), c(lon2, lat2), fun = distHaversine)

Also:

`distHaversine()`

distMeeus()

distRhumb()

distVincentyEllipsoid()

distVincentySphere()

...

## Calculating geographic distance between a list of coordinates (lat, lng)

Here's the solution I ended up using. It's called the Haversine (distance) function if you want to look up what it does for yourself.

I changed my approach a little as well. My input (`positions`

) is a list of tuple coordinates:

`def calculate_distance(positions):`

results = []

for i in range(1, len(positions)):

loc1 = positions[i - 1]

loc2 = positions[i]

lat1 = loc1[0]

lng1 = loc1[1]

lat2 = loc2[0]

lng2 = loc2[1]

degreesToRadians = (math.pi / 180)

latrad1 = lat1 * degreesToRadians

latrad2 = lat2 * degreesToRadians

dlat = (lat2 - lat1) * degreesToRadians

dlng = (lng2 - lng1) * degreesToRadians

a = math.sin(dlat / 2) * math.sin(dlat / 2) + math.cos(latrad1) * \

math.cos(latrad2) * math.sin(dlng / 2) * math.sin(dlng / 2)

c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))

r = 6371000

results.append(r * c)

return (sum(results) / 1000) # Converting from m to km

## Getting distance between two points based on latitude/longitude

**Edit:** Just as a note, if you just need a quick and easy way of finding the distance between two points, I strongly recommend using the approach described in Kurt's answer below instead of re-implementing Haversine -- see his post for rationale.

This answer focuses just on answering the specific bug OP ran into.

It's because in Python, all the trig functions use radians, not degrees.

You can either convert the numbers manually to radians, or use the `radians`

function from the math module:

`from math import sin, cos, sqrt, atan2, radians`

# approximate radius of earth in km

R = 6373.0

lat1 = radians(52.2296756)

lon1 = radians(21.0122287)

lat2 = radians(52.406374)

lon2 = radians(16.9251681)

dlon = lon2 - lon1

dlat = lat2 - lat1

a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2

c = 2 * atan2(sqrt(a), sqrt(1 - a))

distance = R * c

print("Result:", distance)

print("Should be:", 278.546, "km")

The distance is now returning the correct value of `278.545589351`

km.

## Evaluating the closest distance from one point between multiple options?

If you only want the nearest building to each person, and they're relatively close:

`library(sf)`

## load data here from @dcarlson's dput

person_location <- person_location %>%

st_as_sf(coords = c('longitude', 'latitude')) %>%

st_set_crs(4326)

building_location <- building_location %>%

st_as_sf(coords = c('longitude', 'latitude')) %>%

st_set_crs(4326)

st_nearest_feature(person_location, building_location)

#although coordinates are longitude/latitude, st_nearest_feature assumes that they #are planar

#[1] 6 2 6 6 8

So person 1,3 & 4 are closest to building #6. Person 2 -> building #2 ...

All distances can be calculated with `st_distance(person_location, building_location)`

.

You can use the `nngeo`

library to easily find the shortest distance for each person.

`library(nngeo)`

st_connect(person_location, building_location) %>% st_length()

Calculating nearest IDs

|===============================================================================================================| 100%

Calculating lines

|===============================================================================================================| 100%

Done.

Units: [m]

[1] 5054.381 5856.388 1923.254 1796.608 1976.786

Things are easier to understand with a graph:

`st_connect(person_location, building_location) %>% `

ggplot() +

geom_sf() +

geom_sf(data = person_location, color = 'green') +

geom_sf(data = building_location, color = 'red')

And even easier on a map:

`st_connect(person_location, building_location) %>% `

mapview::mapview() +

mapview::mapview(person_location, color = 'green', col.regions = 'green') +

mapview::mapview(building_location, color = 'black', col.regions = 'black')

geosphere is probably more accurate, but if you're dealing with relatively small areas these tools are probably good enough. I find it easier to work with, and don't often need extreme precision.

## Distm function for calculate distance between coordinates in R

This should work:

`library(geosphere)`

distm(df[,c('Longitude','Latitude')],

df1[,c('Longitude','Latitude')],

fun=distVincentyEllipsoid)

[,1] [,2] [,3] [,4]

[1,] 45461.49 23203.37 44300.99 10190.84

[2,] 60243.58 15053.19 53852.61 40763.35

[3,] 63272.26 22151.07 59016.34 32505.87

[4,] 56308.59 46393.08 59016.34 15048.01

The first row indicates the distance between property 1 and industries 1, 2, 3 and 4.

See also here:

Function to calculate geospatial distance between two points (lat,long) using R

Geographic / geospatial distance between 2 lists of lat/lon points (coordinates)

## Geographic / geospatial distance between 2 lists of lat/lon points (coordinates)

To calculate the geographic distance between two points with latitude/longitude coordinates, you can use several formula's. The package `geosphere`

has the `distCosine`

, `distHaversine`

, `distVincentySphere`

and `distVincentyEllipsoid`

for calculating the distance. Of these, the `distVincentyEllipsoid`

is considered the most accurate one, but is computationally more intensive than the other ones.

With one of these functions, you can make a distance matrix. Based on that matrix you can then assign `locality`

names based on shortest distance with `which.min`

and the corresponding distance with `min`

(see for this the last part of the answer) like this:

`library(geosphere)`

# create distance matrix

mat <- distm(list1[,c('longitude','latitude')], list2[,c('longitude','latitude')], fun=distVincentyEllipsoid)

# assign the name to the point in list1 based on shortest distance in the matrix

list1$locality <- list2$locality[max.col(-mat)]

this gives:

`> list1`

longitude latitude locality

1 80.15998 12.90524 D

2 72.89125 19.08120 A

3 77.65032 12.97238 C

4 77.60599 12.90927 D

5 72.88120 19.08225 A

6 76.65460 12.81447 E

7 72.88232 19.08241 A

8 77.49186 13.00984 D

9 72.82228 18.99347 A

10 72.88871 19.07990 A

Another possibility is to assign the `locality`

based on the average longitude and latitude values of the `locality`

s in `list2`

:

`library(dplyr)`

list2a <- list2 %>% group_by(locality) %>% summarise_each(funs(mean)) %>% ungroup()

mat2 <- distm(list1[,c('longitude','latitude')], list2a[,c('longitude','latitude')], fun=distVincentyEllipsoid)

list1 <- list1 %>% mutate(locality2 = list2a$locality[max.col(-mat2)])

or with `data.table`

:

`library(data.table)`

list2a <- setDT(list2)[,lapply(.SD, mean), by=locality]

mat2 <- distm(setDT(list1)[,.(longitude,latitude)], list2a[,.(longitude,latitude)], fun=distVincentyEllipsoid)

list1[, locality2 := list2a$locality[max.col(-mat2)] ]

this gives:

`> list1`

longitude latitude locality locality2

1 80.15998 12.90524 D D

2 72.89125 19.08120 A B

3 77.65032 12.97238 C C

4 77.60599 12.90927 D C

5 72.88120 19.08225 A B

6 76.65460 12.81447 E E

7 72.88232 19.08241 A B

8 77.49186 13.00984 D C

9 72.82228 18.99347 A B

10 72.88871 19.07990 A B

As you can see, this leads in most (7 out of 10) occasions to another assigned `locality`

.

You can add the distance with:

`list1$near_dist <- apply(mat2, 1, min)`

or another approach with `max.col`

(which is highly probable faster):

`list1$near_dist <- mat2[matrix(c(1:10, max.col(-mat2)), ncol = 2)]`

# or using dplyr

list1 <- list1 %>% mutate(near_dist = mat2[matrix(c(1:10, max.col(-mat2)), ncol = 2)])

# or using data.table (if not already a data.table, convert it with 'setDT(list1)' )

list1[, near_dist := mat2[matrix(c(1:10, max.col(-mat2)), ncol = 2)] ]

the result:

`> list1`

longitude latitude locality locality2 near_dist

1: 80.15998 12.90524 D D 269966.8970

2: 72.89125 19.08120 A B 65820.2047

3: 77.65032 12.97238 C C 739.1885

4: 77.60599 12.90927 D C 9209.8165

5: 72.88120 19.08225 A B 66832.7223

6: 76.65460 12.81447 E E 0.0000

7: 72.88232 19.08241 A B 66732.3127

8: 77.49186 13.00984 D C 17855.3083

9: 72.82228 18.99347 A B 69456.3382

10: 72.88871 19.07990 A B 66004.9900

### Related Topics

Convert Dataframe Column to 1 or 0 for "True"/"False" Values and Assign to Dataframe

Loop Through Data Frame and Variable Names

Using Ifelse Statement on the Whole Dataset Instead of a Single Column

Remove Unwanted Symbols from Expression Function - R

R - Getting Characters After Symbol

Coerce Multiple Columns to Factors At Once

Change Rows into Columns in R With Values Yes/No (1/0)

How to Generate the First N Terms in the Series:

Column Name Changes in R for Loop for Defined Data Frame

Installing Rgl on Ubuntu and Mac: X11 Not Found

R: Error in Usemethod("Group_By_"):Applied to an Object of Class

Calculate the Area Under a Curve

Replacing Na Values from Another Dataframe by Id

R Count Distinct Elements Based on Two Columns by Group

How to Output the Columns With the Maximum Value

How to Italicize Part (One or Two Words) of an Axis Title