Pandas Latitude-Longitude to distance between successive rows
You can use this great solution (c) @derricw (don't forget to upvote it ;-):
import numpy as np

# vectorized haversine function
def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    """
    slightly modified version of http://stackoverflow.com/a/29546836/2901002
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees or in radians)
    All (lat, lon) coordinates must have numeric dtypes and be of equal length.
    """
    if to_radians:
        lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])
    a = np.sin((lat2-lat1)/2.0)**2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2
    return earth_radius * 2 * np.arcsin(np.sqrt(a))
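# Pass the full columns so that all four inputs have equal length;
# shift() leaves NaN in row 0, so the first distance is NaN.
df['dist'] = \
    haversine(df.LAT.shift(), df.LONG.shift(),
              df['LAT'], df['LONG'])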
Result:
In [566]: df
Out[566]:
Ser_Numb LAT LONG dist
0 1 74.166061 30.512811 NaN
1 2 72.249672 33.427724 232.549785
2 3 67.499828 37.937264 554.905446
3 4 84.253715 69.328767 1981.896491
4 5 72.104828 33.823462 1513.397997
5 6 63.989462 51.918173 1164.481327
6 7 80.209112 33.530778 1887.256899
7 8 68.954132 35.981256 1252.531365
8 9 83.378214 40.619652 1606.340727
9 10 68.778571 6.607066 1793.921854
UPDATE: this will help to understand the logic (shift() pairs each row with the previous one):
In [573]: pd.concat([df['LAT'].shift(), df.loc[1:, 'LAT']], axis=1, ignore_index=True)
Out[573]:
0 1
0 NaN NaN
1 74.166061 72.249672
2 72.249672 67.499828
3 67.499828 84.253715
4 84.253715 72.104828
5 72.104828 63.989462
6 63.989462 80.209112
7 80.209112 68.954132
8 68.954132 83.378214
9 83.378214 68.778571
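To make the pairing concrete, here is a minimal self-contained sketch on made-up points (the `demo` frame and its values are hypothetical, not from the question). Each step moves one degree north along the same meridian, so every distance should be about one degree of arc on a 6371 km sphere.

```python
import numpy as np
import pandas as pd

def haversine(lat1, lon1, lat2, lon2, earth_radius=6371):
    """Great-circle distance in km; inputs are decimal degrees."""
    lat1, lon1, lat2, lon2 = (np.radians(np.asarray(v, dtype=float))
                              for v in (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * np.arcsin(np.sqrt(a))

# Hypothetical data: three points one degree of latitude apart.
demo = pd.DataFrame({"LAT": [0.0, 1.0, 2.0], "LONG": [10.0, 10.0, 10.0]})

# shift() moves values down one row, so row i is paired with row i - 1.
demo["dist"] = haversine(demo["LAT"].shift(), demo["LONG"].shift(),
                         demo["LAT"], demo["LONG"])
print(demo)
```

Row 0 has no predecessor, so its distance is NaN; the other rows each hold the distance from the previous row.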
Calculate distance between latitude longitude columns for pandas data frame
Assuming you have more than a single row for which you would like to compute the distance, you can use apply as follows:
df['Dist'] = df.apply(lambda row: h3.point_dist((row['lat1'], row['long1']), (row['lat2'], row['long2'])), axis=1)
This will add a column to your dataframe similar to the following:
lat1 long1 lat2 long2 Dist
0 52.229676 21.012229 52.406374 16.925168 2.796556
1 57.229676 30.001176 48.421365 17.256314 6.565542
Please note, my distance calculations may not agree with yours, since I used a dummy function for the h3.point_dist computation.
Calculate distance of successive row AND group by column
Your function has been slightly changed to return a DataFrame; then a groupby and an apply can do the job:
>>> def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
...     if to_radians:
...         lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])
...     a = np.sin((lat2-lat1)/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2
...     return pd.DataFrame(earth_radius * 2 * np.arcsin(np.sqrt(a)))
...
>>> df['dist'] = (df.groupby(["Group"])
...               .apply(lambda x: haversine(x['LAT'],
...                                          x['LONG'],
...                                          x['LAT'].shift(),
...                                          x['LONG'].shift())).values)
>>> df
Group ID LAT LONG dist
0 1 1 74.166061 30.512811 NaN
1 1 2 72.249672 33.427724 232.695882
2 1 3 67.499828 37.937264 555.254059
3 1 4 84.253715 69.328767 1983.141596
4 2 5 72.104828 33.823462 NaN
5 2 6 63.989462 51.918173 1165.212900
6 2 7 80.209112 33.530778 1888.442548
7 2 8 68.954132 35.981256 1253.318254
8 2 9 83.378214 40.619652 1607.349894
9 2 0 68.778571 6.607066 1795.048866
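An alternative that avoids apply is to shift within each group first and then call a vectorized haversine once on whole columns. The following is only a sketch under assumptions: the `df_demo` frame and its values are made up, and the helper is a simplified degrees-only variant of the function above.

```python
import numpy as np
import pandas as pd

def haversine(lat1, lon1, lat2, lon2, earth_radius=6371):
    """Great-circle distance in km; inputs are decimal degrees."""
    lat1, lon1, lat2, lon2 = (np.radians(v) for v in (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * np.arcsin(np.sqrt(a))

# Hypothetical data: two groups, each moving one degree north.
df_demo = pd.DataFrame({
    "Group": [1, 1, 2, 2],
    "LAT": [0.0, 1.0, 50.0, 51.0],
    "LONG": [0.0, 0.0, 5.0, 5.0],
})

# Shift inside each group, so the first row of every group gets NaN
# instead of the last row of the previous group.
prev = df_demo.groupby("Group")[["LAT", "LONG"]].shift()

df_demo["dist"] = haversine(prev["LAT"], prev["LONG"],
                            df_demo["LAT"], df_demo["LONG"])
print(df_demo)
```

Because the grouped shift keeps the original index, the result lines up with the frame directly and no `.values` or index reset is needed.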
Longitude and Latitude Distance Calculation between 2 dataframes
The following code worked for me:
import math
import numpy as np

for i in range(19):  # one pass per row of df1
    Lat1 = df1.iloc[i, 2]  # works down 3rd column
    Lon1 = df1.iloc[i, 3]  # works down 4th column
    Lat2 = df2['Latitude']
    Lon2 = df2['Longitude']
    # the i in the below piece works down the 1st column to grab names
    # the code then places them into column names
    df2[df1.iloc[i, 0]] = 3958.756 * np.arccos(
        np.cos(math.radians(90 - Lat1)) * np.cos(np.radians(90 - Lat2))
        + np.sin(math.radians(90 - Lat1)) * np.sin(np.radians(90 - Lat2))
        * np.cos(np.radians(Lon1 - Lon2)))
Note that this calculates the miles between each pair of locations as a direct shot; it doesn't factor in twists and turns.
Repeated calculation between consecutive rows of pandas dataframe
Vectorized Haversine function:
def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    """
    slightly modified version of http://stackoverflow.com/a/29546836/2901002
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees or in radians)
    All (lat, lon) coordinates must have numeric dtypes and be of equal length.
    """
    if to_radians:
        lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])
    a = np.sin((lat2-lat1)/2.0)**2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2
    return earth_radius * 2 * np.arcsin(np.sqrt(a))
Solution:
df['dist'] = haversine(df['lat'], df['lng'],
                       df['lat'].shift(), df['lng'].shift(),
                       to_radians=False)
Result:
In [65]: df
Out[65]:
label lat lng dist
0 foo 1.0 1.0 NaN
1 bar 2.5 1.0 9556.500000
2 zip 3.0 2.1 7074.983158
3 foo 1.2 1.0 10206.286067
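The figures above come from to_radians=False, so the lat/lng values are interpreted as radians. If the columns actually hold decimal degrees (an assumption, since the answer does not say), the default to_radians=True is probably what you want. A minimal sketch on the same four rows, with a `df_deg` frame and `dist_km` column named here only for illustration:

```python
import numpy as np
import pandas as pd

def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    """Great-circle distance in km between paired points."""
    if to_radians:
        lat1, lon1, lat2, lon2 = (np.radians(v) for v in (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * np.arcsin(np.sqrt(a))

df_deg = pd.DataFrame({"label": ["foo", "bar", "zip", "foo"],
                       "lat": [1.0, 2.5, 3.0, 1.2],
                       "lng": [1.0, 1.0, 2.1, 1.0]})

# Assumption: lat/lng are decimal degrees, so keep the default to_radians=True.
df_deg["dist_km"] = haversine(df_deg["lat"], df_deg["lng"],
                              df_deg["lat"].shift(), df_deg["lng"].shift())
print(df_deg)
```

Under that assumption the first step (1.5 degrees of latitude) comes out at roughly 167 km rather than 9556.5.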
Haversine Distance between consecutive rows for each Customer
I'll reuse the vectorized haversine_np function from derricw's answer:
def haversine_np(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    All args must be of equal length.
    """
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2
    c = 2 * np.arcsin(np.sqrt(a))
    km = 6367 * c
    return km
def distance(x):
    y = x.shift()
    # haversine_np expects (lon, lat) order, so pass Lon before Lat
    return haversine_np(x['Lon'], x['Lat'], y['Lon'], y['Lat']).fillna(0)

df['Distance'] = df.groupby('Customer').apply(distance).reset_index(level=0, drop=True)
Result:
  Customer  Lat  Lon    Distance
0        A    1    2    0.000000
1        A    1    2    0.000000
2        B    3    2    0.000000
3        B    4    2  111.125113
How to calculate distance using latitude and longitude in a pandas dataframe?
Please note: The following script does not account for the curvature of the earth. There are numerous documents (for example, "Convert lat/long to XY") explaining this problem.
However, the distance between coordinates can be roughly determined. The export is a Series, which can be easily concatenated with your original df to provide a separate column displaying distance relative to your coordinates.
import numpy as np
import pandas as pd

d = ({
    'Lat' : [43.937845,44.310739,44.914698],
    'Long' : [-97.905537,-97.588820,-99.003517],
    })
df = pd.DataFrame(d)
df = df[['Lat','Long']]
point1 = df.iloc[0]

def to_xy(point):
    r = 6371000 # radius of the earth (m)
    # each row is passed as (Lat, Long), so lam receives Lat and phi receives Long
    lam,phi = point
    cos_phi_0 = np.cos(np.radians(phi))
    return (r * np.radians(lam) * cos_phi_0,
            r * np.radians(phi))

point1_xy = to_xy(point1)
df['to_xy'] = df.apply(lambda x:
                       tuple(x.values),
                       axis=1).map(to_xy)
df['Y'], df['X'] = df.to_xy.str[0], df.to_xy.str[1]
df = df[['X','Y']]
df = df.diff()
dist = np.sqrt(df['X']**2 + df['Y']**2)
# Convert to km
dist = dist/1000
print(dist)
0 NaN
1 41.149537
2 204.640462
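For comparison, here is a sketch of a flat (equirectangular) approximation next to the haversine formula on the same three points. Scaling the longitude difference by the cosine of the mean latitude of each pair is an assumption of this sketch, not something the script above does, and the names `pts`, `approx_km` and `exact_km` are illustrative.

```python
import numpy as np
import pandas as pd

R_KM = 6371.0  # mean earth radius in km

pts = pd.DataFrame({"Lat": [43.937845, 44.310739, 44.914698],
                    "Long": [-97.905537, -97.588820, -99.003517]})
lat = np.radians(pts["Lat"])
lon = np.radians(pts["Long"])

# Flat approximation: treat small lat/lon steps as legs of a right triangle,
# shrinking the east-west leg by cos(mean latitude of the two points).
mean_lat = (lat + lat.shift()) / 2
x = lon.diff() * np.cos(mean_lat)
y = lat.diff()
approx_km = R_KM * np.sqrt(x ** 2 + y ** 2)

# Haversine on the same consecutive pairs, for reference.
a = (np.sin(lat.diff() / 2) ** 2
     + np.cos(lat) * np.cos(lat.shift()) * np.sin(lon.diff() / 2) ** 2)
exact_km = R_KM * 2 * np.arcsin(np.sqrt(a))

print(pd.DataFrame({"approx_km": approx_km, "exact_km": exact_km}))
```

For points this close together the two columns should agree closely; the gap grows with distance and near the poles.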