Pandas Latitude-Longitude to Distance Between Successive Rows

Pandas Latitude-Longitude to distance between successive rows

You can use this great solution by @derricw (don't forget to upvote it):

import numpy as np

# vectorized haversine function
def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    """
    Slightly modified version of http://stackoverflow.com/a/29546836/2901002

    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees or in radians).

    All (lat, lon) coordinates must have numeric dtypes and be of equal length.
    """
    if to_radians:
        lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])

    a = np.sin((lat2-lat1)/2.0)**2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2

    return earth_radius * 2 * np.arcsin(np.sqrt(a))

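# shift() pairs each row with the previous one; the first row has no
# predecessor, so its distance is NaN. Full columns are passed so that
# all four inputs have equal length, as the docstring requires.
df['dist'] = haversine(df['LAT'].shift(), df['LONG'].shift(),
                       df['LAT'], df['LONG'])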

Result:

In [566]: df
Out[566]:
Ser_Numb LAT LONG dist
0 1 74.166061 30.512811 NaN
1 2 72.249672 33.427724 232.549785
2 3 67.499828 37.937264 554.905446
3 4 84.253715 69.328767 1981.896491
4 5 72.104828 33.823462 1513.397997
5 6 63.989462 51.918173 1164.481327
6 7 80.209112 33.530778 1887.256899
7 8 68.954132 35.981256 1252.531365
8 9 83.378214 40.619652 1606.340727
9 10 68.778571 6.607066 1793.921854

UPDATE: the following shows how shift() lines each row up with its predecessor:

In [573]: pd.concat([df['LAT'].shift(), df.loc[1:, 'LAT']], axis=1, ignore_index=True)
Out[573]:
0 1
0 NaN NaN
1 74.166061 72.249672
2 72.249672 67.499828
3 67.499828 84.253715
4 84.253715 72.104828
5 72.104828 63.989462
6 63.989462 80.209112
7 80.209112 68.954132
8 68.954132 83.378214
9 83.378214 68.778571
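
For readers who want to run the idea end to end, here is a minimal self-contained sketch. The three-row frame and its coordinates are made up for illustration (they are not the data above); only the column names LAT and LONG follow the example, and the haversine helper is a trimmed version that always expects degrees.

```python
import numpy as np
import pandas as pd

def haversine(lat1, lon1, lat2, lon2, earth_radius=6371):
    """Great-circle distance in km; all inputs in decimal degrees."""
    lat1, lon1, lat2, lon2 = (np.radians(v) for v in (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * np.arcsin(np.sqrt(a))

# Hypothetical points on the equator, one degree of longitude apart.
demo = pd.DataFrame({"LAT": [0.0, 0.0, 0.0], "LONG": [0.0, 1.0, 2.0]})

# shift() moves every value down one row, so row i is paired with row i-1.
demo["dist"] = haversine(demo["LAT"].shift(), demo["LONG"].shift(),
                         demo["LAT"], demo["LONG"])
print(demo)
```

The first distance is NaN because the first row has no predecessor; each one-degree step along the equator comes out at roughly 111 km with this radius.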

Calculate distance between latitude longitude columns for pandas data frame

Assuming you have more than one row for which you would like to compute the distance, you can use apply as follows:

df['Dist'] = df.apply(lambda row: h3.point_dist((row['lat1'], row['long1']), (row['lat2'], row['long2'])), axis=1)

This will add a column to your dataframe similar to the following:

        lat1      long1       lat2      long2      Dist
0  52.229676  21.012229  52.406374  16.925168  2.796556
1  57.229676  30.001176  48.421365  17.256314  6.565542

Please note that my distance values may not agree with yours, since I used a dummy function in place of the real h3.point_dist computation.
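
If the h3 package is not at hand, the same row-wise pattern works with any two-point distance function. The sketch below substitutes a plain haversine written with the standard math module; point_dist_km is a hypothetical helper name (it is not part of h3), and the 6371 km radius is an assumption. The coordinates are the two rows shown above.

```python
import math
import pandas as pd

def point_dist_km(p1, p2, earth_radius=6371.0):
    """Haversine distance in km between two (lat, lon) tuples in degrees.

    Hypothetical stand-in for a library call such as h3.point_dist.
    """
    lat1, lon1, lat2, lon2 = map(math.radians, (*p1, *p2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return earth_radius * 2 * math.asin(math.sqrt(a))

df = pd.DataFrame({
    "lat1": [52.229676, 57.229676], "long1": [21.012229, 30.001176],
    "lat2": [52.406374, 48.421365], "long2": [16.925168, 17.256314],
})

# axis=1 hands each row to the lambda, which builds the two (lat, lon) tuples.
df["Dist"] = df.apply(
    lambda row: point_dist_km((row["lat1"], row["long1"]),
                              (row["lat2"], row["long2"])),
    axis=1,
)
print(df)
```

Row-wise apply is convenient but calls the function once per row; for large frames a vectorized NumPy version of the same formula is usually faster.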

Calculate distance of successive row AND group by column

Your function has been slightly changed to return a DataFrame; a groupby followed by an apply then does the job:

>>> def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
...     if to_radians:
...         lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])
...     a = np.sin((lat2-lat1)/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2
...     return pd.DataFrame(earth_radius * 2 * np.arcsin(np.sqrt(a)))

>>> df['dist'] = (df.groupby(["Group"])
...                 .apply(lambda x: haversine(x['LAT'],
...                                            x['LONG'],
...                                            x['LAT'].shift(),
...                                            x['LONG'].shift())).values)
>>> df
Group ID LAT LONG dist
0 1 1 74.166061 30.512811 NaN
1 1 2 72.249672 33.427724 232.695882
2 1 3 67.499828 37.937264 555.254059
3 1 4 84.253715 69.328767 1983.141596
4 2 5 72.104828 33.823462 NaN
5 2 6 63.989462 51.918173 1165.212900
6 2 7 80.209112 33.530778 1888.442548
7 2 8 68.954132 35.981256 1253.318254
8 2 9 83.378214 40.619652 1607.349894
9 2 0 68.778571 6.607066 1795.048866
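
As an alternative to groupby plus apply, the previous row can be fetched per group with a grouped shift and the distance computed in one vectorized call. This is a sketch under assumptions: the five-row frame is invented for illustration, and the haversine helper is a trimmed, degrees-only variant rather than the function shown above.

```python
import numpy as np
import pandas as pd

def haversine(lat1, lon1, lat2, lon2, earth_radius=6371):
    """Great-circle distance in km; all inputs in decimal degrees."""
    lat1, lon1, lat2, lon2 = (np.radians(v) for v in (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * np.arcsin(np.sqrt(a))

# Made-up data: two groups, with simple one-degree steps.
df = pd.DataFrame({
    "Group": [1, 1, 2, 2, 2],
    "LAT":   [0.0, 0.0, 10.0, 11.0, 12.0],
    "LONG":  [0.0, 1.0, 5.0, 5.0, 5.0],
})

# Shift inside each group, so the first row of a group gets NaN
# instead of the last row of the previous group.
prev = df.groupby("Group")[["LAT", "LONG"]].shift()

df["dist"] = haversine(prev["LAT"], prev["LONG"], df["LAT"], df["LONG"])
print(df)
```

Because the shifted frame keeps the original index, the result aligns with df directly and no .values or reset_index step is needed.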

Longitude and Latitude Distance Calculation between 2 dataframes

The following code worked for me:

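import math
import numpy as np

# 19 is the row count used in the original; adjust it to len(df1)
for i in range(19):
    Lat1 = df1.iloc[i, 2]  # works down the 3rd column
    Lon1 = df1.iloc[i, 3]  # works down the 4th column
    Lat2 = df2['Latitude']
    Lon2 = df2['Longitude']

    # df1.iloc[i, 0] works down the 1st column to grab the location name,
    # which becomes the name of the new distance column in df2
    # (spherical law of cosines on colatitudes; 3958.756 is the radius in miles)
    df2[df1.iloc[i, 0]] = 3958.756 * np.arccos(
        np.cos(math.radians(90 - Lat1)) * np.cos(np.radians(90 - Lat2))
        + np.sin(math.radians(90 - Lat1)) * np.sin(np.radians(90 - Lat2))
        * np.cos(np.radians(Lon1 - Lon2)))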

Note that this calculates the straight-line (great-circle) distance in miles between each pair of locations. It does not account for the twists and turns of an actual route.
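
The loop above can also be replaced by NumPy broadcasting, which builds the whole distance table at once. The sketch below is one way to do that under assumptions: the column names Name, Latitude and Longitude for df1 and the sample coordinates are hypothetical, and it uses the haversine form rather than the law-of-cosines expression above. The 3958.756-mile radius is taken from the loop.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs; real frames would come from your own data.
df1 = pd.DataFrame({"Name": ["A", "B"],
                    "Latitude": [0.0, 0.0],
                    "Longitude": [0.0, 90.0]})
df2 = pd.DataFrame({"Latitude": [0.0, 0.0, 90.0],
                    "Longitude": [0.0, 90.0, 0.0]})

# df2 as column vectors (n, 1) and df1 as row vectors (1, m):
# arithmetic between them broadcasts to an (n, m) grid of pairs.
lat2 = np.radians(df2["Latitude"].to_numpy())[:, None]
lon2 = np.radians(df2["Longitude"].to_numpy())[:, None]
lat1 = np.radians(df1["Latitude"].to_numpy())[None, :]
lon1 = np.radians(df1["Longitude"].to_numpy())[None, :]

a = (np.sin((lat2 - lat1) / 2) ** 2
     + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
miles = 3958.756 * 2 * np.arcsin(np.sqrt(a))

# One row per df2 location, one column per df1 location.
dist = pd.DataFrame(miles, index=df2.index, columns=df1["Name"])
print(dist)
```

The resulting frame can be joined back onto df2 with pd.concat([df2, dist], axis=1) if the per-location columns are wanted alongside the original data.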

Repeated calculation between consecutive rows of pandas dataframe

Vectorized Haversine function:

def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    """
    Slightly modified version of http://stackoverflow.com/a/29546836/2901002

    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees or in radians).

    All (lat, lon) coordinates must have numeric dtypes and be of equal length.
    """
    if to_radians:
        lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])

    a = np.sin((lat2-lat1)/2.0)**2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2

    return earth_radius * 2 * np.arcsin(np.sqrt(a))

Solution:

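# to_radians=False passes the lat/lng values through unconverted,
# i.e. they are treated as radians
df['dist'] = haversine(df['lat'], df['lng'],
                       df['lat'].shift(), df['lng'].shift(),
                       to_radians=False)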

Result:

In [65]: df
Out[65]:
label lat lng dist
0 foo 1.0 1.0 NaN
1 bar 2.5 1.0 9556.500000
2 zip 3.0 2.1 7074.983158
3 foo 1.2 1.0 10206.286067

Haversine Distance between consecutive rows for each Customer

I'll reuse the vectorized haversine_np function from derricw's answer:

def haversine_np(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)

    All args must be of equal length.
    """
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2

    c = 2 * np.arcsin(np.sqrt(a))
    km = 6367 * c
    return km

def distance(x):
    y = x.shift()
    # haversine_np expects longitude first, then latitude
    return haversine_np(x['Lon'], x['Lat'], y['Lon'], y['Lat']).fillna(0)

df['Distance'] = df.groupby('Customer').apply(distance).reset_index(level=0, drop=True)

Result:

  Customer  Lat  Lon    Distance
0        A    1    2    0.000000
1        A    1    2    0.000000
2        B    3    2    0.000000
3        B    4    2  111.125113

How to calculate distance using latitude and longitude in a pandas dataframe?

Please note: the following script does not account for the curvature of the earth. There are numerous documents (for example, "Convert lat/long to XY") explaining this problem.

However, the distance between coordinates can be roughly determined. The output is a Series, which can easily be concatenated with your original df to provide a separate column holding the distance from each row to the previous one.

import numpy as np
import pandas as pd

d = ({
    'Lat' : [43.937845,44.310739,44.914698],
    'Long' : [-97.905537,-97.588820,-99.003517],
    })

df = pd.DataFrame(d)

df = df[['Lat','Long']]

point1 = df.iloc[0]

def to_xy(point):

    r = 6371000 # radius of the earth (m)
    # rows are (Lat, Long) tuples, so lam receives the latitude
    # and phi the longitude here
    lam,phi = point
    cos_phi_0 = np.cos(np.radians(phi))

    return (r * np.radians(lam) * cos_phi_0,
            r * np.radians(phi))

point1_xy = to_xy(point1)

df['to_xy'] = df.apply(lambda x:
                       tuple(x.values),
                       axis=1).map(to_xy)

df['Y'], df['X'] = df.to_xy.str[0], df.to_xy.str[1]

df = df[['X','Y']]
df = df.diff()

dist = np.sqrt(df['X']**2 + df['Y']**2)

# Convert to km
dist = dist/1000

print(dist)

0 NaN
1 41.149537
2 204.640462
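
To gauge how rough a flat approximation is, it can help to compare it with haversine on the same points. The sketch below is not the script above: it uses the common equirectangular shortcut, scaling each longitude difference by the cosine of the pair's mean latitude, and the 6371 km radius is an assumed value. Its figures need not match the output above, which projects each point to XY separately before differencing.

```python
import numpy as np
import pandas as pd

R_KM = 6371.0  # mean Earth radius in km (assumed value)

df = pd.DataFrame({
    "Lat": [43.937845, 44.310739, 44.914698],
    "Long": [-97.905537, -97.588820, -99.003517],
})

lat = np.radians(df["Lat"])
lon = np.radians(df["Long"])

# Equirectangular approximation between successive rows: shrink the
# longitude difference by the cosine of the mean latitude of each pair.
mean_lat = (lat + lat.shift()) / 2
x = lon.diff() * np.cos(mean_lat)
y = lat.diff()
approx_km = R_KM * np.sqrt(x ** 2 + y ** 2)

# Haversine on the same successive pairs, for comparison.
a = (np.sin(lat.diff() / 2) ** 2
     + np.cos(lat) * np.cos(lat.shift()) * np.sin(lon.diff() / 2) ** 2)
haversine_km = R_KM * 2 * np.arcsin(np.sqrt(a))

print(pd.DataFrame({"approx_km": approx_km, "haversine_km": haversine_km}))
```

At separations of this size the two columns should agree closely; the gap grows with distance and toward the poles, which is where a great-circle formula becomes the safer choice.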

