Distance Calculations in MySQL Queries

Find distance between two points using latitude and longitude in mysql

I think your question says you have the city values for the two cities between which you wish to compute the distance.

This query will do the job for you, yielding the distance in km. It uses the spherical cosine law formula.

Notice that you join the table to itself so you can retrieve two coordinate pairs for the computation.

SELECT a.city AS from_city, b.city AS to_city, 
111.111 *
DEGREES(ACOS(LEAST(1.0, COS(RADIANS(a.Latitude))
* COS(RADIANS(b.Latitude))
* COS(RADIANS(a.Longitude - b.Longitude))
+ SIN(RADIANS(a.Latitude))
* SIN(RADIANS(b.Latitude))))) AS distance_in_km
FROM city AS a
JOIN city AS b ON a.id <> b.id
WHERE a.city = 3 AND b.city = 7

Notice that the constant 111.1111 is the number of kilometres per degree of latitude, based on the old Napoleonic definition of the metre as one ten-thousandth of the distance from the equator to the pole. That definition is close enough for location-finder work.

If you want statute miles instead of kilometres, use 69.0 instead.

http://sqlfiddle.com/#!9/21e06/412/0

If you're looking for nearby points you may be tempted to use a clause something like this:

   HAVING distance_in_km < 10.0    /* slow ! */
ORDER BY distance_in_km DESC

That is (as we say near Boston MA USA) wicked slow.

In that case you need to use a bounding box computation. See this writeup about how to do that. http://www.plumislandmedia.net/mysql/haversine-mysql-nearest-loc/

The formula contains a LEAST() function. Why? Because the ACOS() function throws an error if its argument is even slightly greater than 1. When the two points in question are very close together, the expression with the COS() and SIN() computations can sometimes yield a value slightly greater than 1 due to floating-point epsilon (inaccuracy). The LEAST(1.0, dirty-great-expression) call copes with that problem.

There's a better way, a formula by Thaddeus Vincenty. It uses ATAN2() rather than ACOS() so it's less susceptible to epsilon problems.


Edit 2022 (by Alexio Vay):
As of today the modern solution should be the following short code:

   select ST_Distance_Sphere(
point(-87.6770458, 41.9631174),
point(-73.9898293, 40.7628267))

Please check out the answer of Naresh Kumar.

Fastest Way to Find Distance Between Two Lat/Long Points

  • Create your points using Point values of Geometry data types in MyISAM table. As of Mysql 5.7.5, InnoDB tables now also support SPATIAL indices.

  • Create a SPATIAL index on these points

  • Use MBRContains() to find the values:

      SELECT  *
    FROM table
    WHERE MBRContains(LineFromText(CONCAT(
    '('
    , @lon + 10 / ( 111.1 / cos(RADIANS(@lat)))
    , ' '
    , @lat + 10 / 111.1
    , ','
    , @lon - 10 / ( 111.1 / cos(RADIANS(@lat)))
    , ' '
    , @lat - 10 / 111.1
    , ')' )
    ,mypoint)

, or, in MySQL 5.1 and above:

    SELECT  *
FROM table
WHERE MBRContains
(
LineString
(
Point (
@lon + 10 / ( 111.1 / COS(RADIANS(@lat))),
@lat + 10 / 111.1
),
Point (
@lon - 10 / ( 111.1 / COS(RADIANS(@lat))),
@lat - 10 / 111.1
)
),
mypoint
)

This will select all points approximately within the box (@lat +/- 10 km, @lon +/- 10km).

This actually is not a box, but a spherical rectangle: latitude and longitude bound segment of the sphere. This may differ from a plain rectangle on the Franz Joseph Land, but quite close to it on most inhabited places.

  • Apply additional filtering to select everything inside the circle (not the square)

  • Possibly apply additional fine filtering to account for the big circle distance (for large distances)

distance calculations in mysql queries

Option 1:
Do the calculation on the database by switching to a database that supports GeoIP.

Option 2:
Do the calculation on the databaseusing a stored procedure like this:

CREATE FUNCTION calcDistance (latA double, lonA double, latB double, LonB double)
RETURNS double DETERMINISTIC
BEGIN
SET @RlatA = radians(latA);
SET @RlonA = radians(lonA);
SET @RlatB = radians(latB);
SET @RlonB = radians(LonB);
SET @deltaLat = @RlatA - @RlatB;
SET @deltaLon = @RlonA - @RlonB;
SET @d = SIN(@deltaLat/2) * SIN(@deltaLat/2) +
COS(@RlatA) * COS(@RlatB) * SIN(@deltaLon/2)*SIN(@deltaLon/2);
RETURN 2 * ASIN(SQRT(@d)) * 6371.01;
END//

If you have an index on latitude and longitude in your database, you can reduce the number of calculations that need to be calculated by working out an initial bounding box in PHP ($minLat, $maxLat, $minLong and $maxLong), and limiting the rows to a subset of your entries based on that (WHERE latitude BETWEEN $minLat AND $maxLat AND longitude BETWEEN $minLong AND $maxLong). Then MySQL only needs to execute the distance calculation for that subset of rows.

If you're simply using a stored procedure to calculate the distance) then SQL still has to look through every record in your database, and to calculate the distance for every record in your database before it can decide whether to return that row or discard it.

Because the calculation is relatively slow to execute, it would be better if you could reduce the set of rows that need to be calculated, eliminating rows that will clearly fall outside of the required distance, so that we're only executing the expensive calculation for a smaller number of rows.

If you consider that what you're doing is basically drawing a circle on a map, centred on your initial point, and with a radius of distance; then the formula simply identifies which rows fall within that circle... but it still has to checking every single row.

Using a bounding box is like drawing a square on the map first with the left, right, top and bottom edges at the appropriate distance from our centre point. Our circle will then be drawn within that box, with the Northmost, Eastmost, Southmost and Westmost points on the circle touching the borders of the box. Some rows will fall outside that box, so SQL doesn't even bother trying to calculate the distance for those rows. It only calculates the distance for those rows that fall within the bounding box to see if they fall within the circle as well.

Within your PHP (guess you're running PHP from the $ variable name), we can use a very simple calculation that works out the minimum and maximum latitude and longitude based on our distance, then set those values in the WHERE clause of your SQL statement. This is effectively our box, and anything that falls outside of that is automatically discarded without any need to actually calculate its distance.

There's a good explanation of this (with PHP code) on the Movable Type website that should be essential reading for anybody planning to do any GeoPositioning work in PHP.

EDIT
The value 6371.01 in the calcDistance stored procedure is the multiplier to give you a returned result in kilometers. Use appropriate alternative multipliers if you want to result in miles, nautical miles, meters, whatever

How can I calculate the distance between two positions in mysql query?

It requires MySQL’s built-in Math functions, including cos(), sin(), acos() and radians().

SELECT id, ( 3959 * acos( cos( radians(latitude) ) *
cos( radians( latitude) )
* cos( radians( longitude ) - radians(longitude ) )
+ sin( radians(latitude) ) * sin(radians(latitude)) )
) AS distance
FROM myTable

Where value 3959 is the Earth radius in miles

How to optimize SQL query with calculating distance by longitude and latitude?

Here's a few ideas, some of which may not apply depending on your exact situation.

  1. You could do the conversion of latitude and longitude to radians and store that in the row as well. This would save the cost of those calculations (actually the cost would be accrued once when storing the data).
  2. If your table is very large, you could use a simple linear distance calculation rather than the Haversince formula to limit the results to which you apply the Haversince formula.
  3. If you have other data in the table that would serve as a good first filter (country/region/etc.), you could apply it first.
  4. You could reorder your joins so that they are applied after the distance filter so that you don't incur the cost of the join on data that doesn't qualify.

MySQL Calculate the Distance between Zipcodes Faster and More Accurately?

On Accuracy

The only way to calculate distance accurately is with 3D trig, as you're doing. You can read more on that topic here: https://en.wikipedia.org/wiki/Geographical_distance

Although giving a pretty accurate distance between the lat/lng center-points of zipcodes, those center-points are arbitrarily picked, and the distance is calculated "as the crow flies", so you won't get an accurate representation of actual travel distance between two points within each.

For example, you may have two homes next-door to each other in adjacent zipcodes, or two homes on opposite ends of each zipcode, which will calculate as equidistant given this calculation.

The only way to correct that issue is to calculate address distance, which requires USPS data to map an address to a more specific point, or the use of an API like Google Maps, which will also calculate actual travel distance given available roads.

On Performance

There are a couple ways to speed up your query.

1. Reduce the Real-time Math

The fastest way to do your calculations in real-time is to precalculate and store the expensive trig values in columns in your table, e.g.:

ALTER TABLE Location
ADD COLUMN cos_rad_lat DOUBLE,
ADD COLUMN cos_rad_lng DOUBLE,
ADD COLUMN sin_rad_lat DOUBLE;

Then

UPDATE Location
SET cos_rad_lat = cos(radians(latitude)),
cos_rad_lng = cos(radians(longitude)),
sin_rad_lat = sin(radians(latitude));

Do your cos(radians(78.3232)) type calculations outside the query, so that math isn't done for each row of data.

Thus reducing all calculations to constant values (before getting to SQL) and calculated columns will make your query look like this:

SELECT
zipcode,
3959 * acos(
0.20239077538110228
* cos_rad_lat
* cos_rad_lng - 1.140108408597264
)
+ 0.979304842243025 * sin_rad_lat AS distance
FROM Location
HAVING distance < 25
ORDER BY distance

2. Bounding-box Reduction

Note: You can combine this with method 1.

You could probably increase performance slightly by adding a bounding-box reduction of zips in a subquery before doing the trig, but that may be more complicated than you would like.

For example, instead of:

FROM Location

You could do

FROM (
SELECT *
FROM Location
WHERE latitude BETWEEN A and B
AND longitude BETWEEN C and D
) AS Location

Where A, B, C, and D are numbers corresponding to your center-point +- about 0.3 (As each 10th of a degree of lat/lng corresponds to about 5-7 miles in the US).

This method gets tricky at -180 / 180 Longitude, but that doesn't affect the US.

3. Store All Calculated Distances
Another thing you could do is precalculate all distances of all zips, and store then in a separate table

CREATE TABLE LocationDistance (
zipcode1 varchar(5) NOT NULL REFERENCES Location(zipcode),
zipcode2 varchar(5) NOT NULL REFERENCES Location(zipcode)
distance double NOT NULL,
PRIMARY KEY (zipcode1, zipcode2),
INDEX (zipcode1, distance)
);

Populate this table with every combination of zip and their calculated distance.

Your query would then look like:

SELECT zipcode2
FROM LocationDistance
WHERE zipcode1 = 12345
AND distance < 25;

This would by far be the fastest solution, though it involves storing on the order of 1 Billion records.



Related Topics



Leave a reply



Submit