How to Group Nearby Latitude and Longitude Locations Stored in SQL

How to group nearby latitude and longitude locations stored in SQL

Well, your problem description reads exactly like the DBSCAN clustering algorithm (Wikipedia). It avoids chain effects in the sense that it requires them to be at least minPts objects.

As for the differences in densities across, that is what OPTICS (Wikipedia) is supposed do solve. You may need to use a different way of extracting clusters though.

Well, ok, maybe not 100% - you maybe want to have single hotspots, not areas that are "density connected". When thinking of an OPTICS plot, I figure you are only interested in small but deep valleys, not in large valleys. You could probably use the OPTICS plot an scan for local minima of "at least 10 accidents".

Update: Thanks for the pointer to the data set. It's really interesting. So I did not filter it down to cyclists, but right now I'm using all 1.2 million records with coordinates. I've fed them into ELKI for analysis, because it's really fast, and it actually can use the geodetic distance (i.e. on latitude and longitude) instead of Euclidean distance, to avoid bias. I've enabled the R*-tree index with STR bulk loading, because that is supposed to help to get the runtime down a lot. I'm running OPTICS with Xi=.1, epsilon=1 (km) and minPts=100 (looking for large clusters only). Runtime was around 11 Minutes, not too bad. The OPTICS plot of course would be 1.2 million pixels wide, so it's not really good for full visualization anymore. Given the huge threshold, it identified 18 clusters with 100-200 instances each. I'll try to visualize these clusters next. But definitely try a lower minPts for your experiments.

So here are the major clusters found:

  • 51.690713 -0.045545 a crossing on A10 north of London just past M25
  • 51.477804 -0.404462 "Waggoners Roundabout"
  • 51.690713 -0.045545 "Halton Cross Roundabout" or the crossing south of it
  • 51.436707 -0.499702 Fork of A30 and A308 Staines By-Pass
  • 53.556186 -2.489059 M61 exit to A58, North-West of Manchester
  • 55.170139 -1.532917 A189, North Seaton Roundabout
  • 55.067229 -1.577334 A189 and A19, just south of this, a four lane roundabout.
  • 51.570594 -0.096159 Manour House, Picadilly Line
  • 53.477601 -1.152863 M18 and A1(M)
  • 53.091369 -0.789684 A1, A17 and A46, a complex construct with roundabouts on both sides of A1.
  • 52.949281 -0.97896 A52 and A46
  • 50.659544 -1.15251 Isle of Wight, Sandown.
  • ...

Note, these are just random points taken from the clusters. It may be sensible to compute e.g. cluster center and radius instead, but I didn't do that. I just wanted to get a glimpse of that data set, and it looks interesting.

Here are some screenshots, with minPts=50, epsilon=0.1, xi=0.02:

Some clusters in central London

Notice that with OPTICS, clusters can be hierarchical. Here is a detail:

Nested clusters - OPTICS produces a hierarchy

How to group nearby latitude and longitude and assign a Name/Number to that group in SQL Server?

First, let's create a computed geography column that will store the location coordinates. We will use this column to let SQL Server calculate distances for us:

ALTER TABLE #temp
ADD Point_Geolocation AS geography::STPointFromText('POINT(' + CAST(LONGITUDE AS VARCHAR(100))+ ' ' + CAST(LATITUDE AS VARCHAR(100)) +')', 4326) PERSISTED

Second, let's create a table of all nearby locations:

IF OBJECT_ID('tempdb..#Nearby_Points') IS NOT NULL DROP TABLE #Nearby_Points
CREATE TABLE #Nearby_Points (
ID_1 VARCHAR(10) NOT NULL,
ID_2 VARCHAR(10) NOT NULL,
PRIMARY KEY (ID_1, ID_2)
)

INSERT INTO #Nearby_Points
(
ID_1,
ID_2
)
SELECT t1.ID AS p1_ID
,t2.ID AS p2_ID
FROM #temp t1
INNER JOIN #temp t2
ON t1.ID < t2.ID
WHERE t1.Point_Geolocation.STDistance(t2.Point_Geolocation) < 40 -- Specify distance criteria here

-- SELECT * FROM #Nearby_Points

Note: with 100k+ coordinates, we're looking at approx 5 billion calculations: (100,000 ^ 2) / 2. The above query might take a while to execute.

Third, let's create a table to store our cluster list:

IF OBJECT_ID('tempdb..#Clusters') IS NOT NULL DROP TABLE #Clusters
CREATE TABLE #Clusters(
Cluster_ID INT NOT NULL,
Point_ID VARCHAR(10) NOT NULL,
PRIMARY KEY(Cluster_ID, Point_ID)
);

-- This index may improve performance a little
CREATE NONCLUSTERED INDEX IX_Point_ID ON #Clusters(Point_ID);

Finally, the following code will:

  1. create a new cluster for the first point that isn't already in a
    cluster.
  2. repeatedly re-scan the cluster table and add additional points to existing clusters, until each cluster contains all points that should belong to it.
  3. Go to step 1. above and repeat, until no new clusters are created.
DECLARE @Rowcount INT

INSERT INTO #Clusters
(
Cluster_ID,
Point_ID
)
SELECT COALESCE((SELECT MAX(Cluster_ID) FROM #Clusters),0) + 1
,MIN(np.ID_1)
FROM #Nearby_Points np
WHERE np.ID_1 NOT IN (SELECT Point_ID FROM #Clusters)
HAVING MIN(np.ID_1) IS NOT NULL

SET @Rowcount = @@ROWCOUNT

WHILE @Rowcount > 0
BEGIN

WHILE @Rowcount > 0
BEGIN

INSERT INTO #Clusters
(
Cluster_ID,
Point_ID
)
SELECT Cluster_ID
,Point_ID
FROM (
SELECT np.ID_2 AS Point_ID
,c.Cluster_ID
FROM #Nearby_Points np
INNER JOIN #Clusters c
ON np.ID_1 = c.Point_ID

UNION

SELECT np.ID_1
,c.Cluster_ID
FROM #Nearby_Points np
INNER JOIN #Clusters c
ON np.ID_2 = c.Point_ID
) vals
WHERE NOT EXISTS (
SELECT 1
FROM #Clusters
WHERE Cluster_ID = vals.Cluster_ID
AND Point_ID = vals.Point_ID
)

SET @Rowcount = @@ROWCOUNT
END

INSERT INTO #Clusters
(
Cluster_ID,
Point_ID
)
SELECT COALESCE((SELECT MAX(Cluster_ID) FROM #Clusters),0) + 1
,MIN(np.ID_1)
FROM #Nearby_Points np
WHERE np.ID_1 NOT IN (SELECT Point_ID FROM #Clusters)
HAVING MIN(np.ID_1) IS NOT NULL

SET @Rowcount = @@ROWCOUNT
END

And voilà:

SELECT *
FROM #Clusters c

|Cluster_ID | Point_ID|
|-----------|---------|
| 1 | ID1 |
| 1 | ID2 |
| 1 | ID3 |
| 1 | ID4 |
| 1 | ID5 |
| 1 | ID6 |
| 2 | ID7 |
| 2 | ID8 |
| 2 | ID9 |

How to group latitude/longitude points that are 'close' to each other?

There are a number of ways of determining the distance between two points, but for plotting points on a 2-D graph you probably want the Euclidean distance. If (x1, y1) represents your first point and (x2, y2) represents your second, the distance is

d = sqrt( (x2-x1)^2 + (y2-y1)^2 )

Regarding grouping, you may want to use some sort of 2-D mean to determine how "close" things are to each other. For example, if you have three points, (x1, y1), (x2, y2), (x3, y3), you can find the center of these three points by simple averaging:

x(mean) = (x1+x2+x3)/3
y(mean) = (y1+y2+y3)/3

You can then see how close each is to the center to determine whether it should be part of the "cluster".


There are a number of ways one can define clusters, all of which use some variant of a clustering algorithm. I'm in a rush now and don't have time to summarize, but check out the link and the algorithms, and hopefully other people will be able to provide more detail. Good luck!

How to find nearest location using latitude and longitude from SQL database?

SELECT id, ( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * sin( radians( lat ) ) ) ) AS distance FROM markers HAVING distance < 25 ORDER BY distance LIMIT 0 , 20;

How to find near stores from store location (latitude, longitude) from store table?

Spatial queries are better handled using the extension PostGIS. It has loads of really handy functions that make spatial queries very easy to write and to maintain. My suggestion:

Install Postgis (see this other answer)

Add a geometry column to your table, e.g.

SELECT AddGeometryColumn ('public','store','geom',4326,'POINT',2);

Create point geometries based on your latitude and longitude values:

UPDATE store SET geom = ST_MakePoint(longitude,latitude);

Index it (to make queries faster)

CREATE INDEX idx_store_geom ON store USING gist (geom);

After that, this is how a query to list the nearest neighbours of a given point would look like:

SELECT * FROM store
ORDER BY geom <-> ST_SetSRID(ST_MakePoint(92.215,111.12),4326)

Or if you want the nearest store to each store ..

SELECT * FROM store mds,
LATERAL (SELECT store_name,ST_Distance(geom,mds.geom) FROM store
WHERE id <> mds.id
ORDER BY geom <-> mds.geom
LIMIT 1) c (closest_store,distance);
  • The operator <-> stands for distance, so using it in the ORDER BY clause with LIMIT 1 selects only the record that is closest to a reference geometry.
  • 4326 stands for the spatial reference system WGS84. It might vary depending on your coordinates.

Demo: db<>fiddle

show nearby places based on saved longitude and latitude in database. angular6 + sql server

I recommend using Geography data type instead of latitude and longtitude. See following demo

--this is your table
CREATE TABLE Landmark (
Id int,
Name VARCHAR(100),
Latitude FLOAT,
Longitude FLOAT
)

INSERT Landmark VALUES
(1, 'Greek Restaurant', -72.374984, 41.274672),
(2, 'Italian Restaurant', -73.483947, 40.739283)

--this is better table to query
WITH GeographyLandmark AS
(
SELECT Id, Name, geography::STPointFromText('POINT(' + CAST(Latitude AS VARCHAR(20)) + ' ' + CAST(Longitude AS VARCHAR(20)) + ')', 4326) Location
FROM LandMark
)
--this query calculates distance between point and localizations in meters
SELECT Id, Name,
geography::STPointFromText('POINT(' + CAST(-74.009056 AS VARCHAR(20)) + ' ' + CAST(40.713744 AS VARCHAR(20)) + ')', 4326).STDistance(Location) Distance
FROM GeographyLandmark

Results:

Id    Name                 Distance
----- -------------------- ----------------
1 Greek Restaurant 150944,610588657
2 Italian Restaurant 44456,82536079

References: STDistance, STPointFromText

find the nearest location by latitude and longitude in postgresql

select * from (
SELECT *,( 3959 * acos( cos( radians(6.414478) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(12.466646) ) + sin( radians(6.414478) ) * sin( radians( lat ) ) ) ) AS distance
FROM station_location
) al
where distance < 5
ORDER BY distance
LIMIT 20;

How to store longitude & latitude as a geography in sql server 2014?

How can i use the longitute and latitute to store location within a geography column?(because it's supposed to be only one geographic point not two right? not one for longitute and one for latitute?)

You can use geography::STPointFromText / geography::Point to store longitude and latitude in a geography datatype.

SELECT geography::STPointFromText('POINT(' + CAST([Longitude] AS VARCHAR(20)) + ' ' + CAST([Latitude] AS VARCHAR(20)) + ')', 4326)

or

SELECT geography::Point(Latitude, Longitude , 4326)

Reference Link:

Update Geography column in table

Now that I've got the geography points, how can i select all the rows within a specific distance(in my case 2km)?

You can use STDistance like this.

DECLARE @g geography;
DECLARE @h geography;
SET @g = geography::STGeomFromText('POINT(-122.35900 47.65129)', 4326);
SET @h = geography::STGeomFromText('POINT(-122.34720 47.65100)', 4326);
SELECT @g.STDistance(@h);

Reference Link:

Distance between two points using Geography datatype in sqlserver 2008?

Insert Query

DECLARE @GeoTable TABLE 
(
id int identity(1,1),
location geography
)
--Using geography::STGeomFromText
INSERT INTO @GeoTable
SELECT geography::STGeomFromText('POINT(-122.35900 47.65129)', 4326)

--Using geography::Point
INSERT INTO @GeoTable
SELECT geography::Point(47.65100,-122.34720, 4326);

Get Distance Query

DECLARE @DistanceFromPoint geography
SET @DistanceFromPoint = geography::STGeomFromText('POINT(-122.34150 47.65234)', 4326);

SELECT id,location.Lat Lat,location.Long Long,location.STDistance(@DistanceFromPoint) Distance
FROM @GeoTable;


Related Topics



Leave a reply



Submit