What's the Fastest Way of Checking If a Point Is Inside a Polygon in Python

What's the fastest way of checking if a point is inside a polygon in python

You can consider shapely:

from shapely.geometry import Point
from shapely.geometry.polygon import Polygon

point = Point(0.5, 0.5)
polygon = Polygon([(0, 0), (0, 1), (1, 1), (1, 0)])
print(polygon.contains(point))

From the methods you've mentioned I've only used the second, path.contains_points, and it works fine. In any case depending on the precision you need for your test I would suggest creating a numpy bool grid with all nodes inside the polygon to be True (False if not). If you are going to make a test for a lot of points this might be faster (although notice this relies you are making a test within a "pixel" tolerance):

from matplotlib import path
import matplotlib.pyplot as plt
import numpy as np

first = -3
size  = (3-first)/100
xv,yv = np.meshgrid(np.linspace(-3,3,100),np.linspace(-3,3,100))
p = path.Path([(0,0), (0, 1), (1, 1), (1, 0)])  # square with legs length 1 and bottom left corner at the origin
flags = p.contains_points(np.hstack((xv.flatten()[:,np.newaxis],yv.flatten()[:,np.newaxis])))
grid = np.zeros((101,101),dtype='bool')
grid[((xv.flatten()-first)/size).astype('int'),((yv.flatten()-first)/size).astype('int')] = flags

xi,yi = np.random.randint(-300,300,100)/100,np.random.randint(-300,300,100)/100
vflag = grid[((xi-first)/size).astype('int'),((yi-first)/size).astype('int')]
plt.imshow(grid.T,origin='lower',interpolation='nearest',cmap='binary')
plt.scatter(((xi-first)/size).astype('int'),((yi-first)/size).astype('int'),c=vflag,cmap='Greens',s=90)
plt.show()

, the results is this:

point inside polygon within pixel tolerance

Check if geo-point is inside or outside of polygon

Here is a possible solution to my problem.

Geographical coordinates must be stored properly. Example np.array([[Lon_A, Lat_A], [Lon_B, Lat_B], [Lon_C, Lat_C]])
Create the polygon
Create the point to be tested
Use polygon.contains(point) to test if point is inside (True) or outside (False) the polygon.

Here is the missing part of the code:

from shapely.geometry import Point
from shapely.geometry.polygon import Polygon

lons_lats_vect = np.column_stack((lons_vect, lats_vect)) # Reshape coordinates
polygon = Polygon(lons_lats_vect) # create polygon
point = Point(y,x) # create point
print(polygon.contains(point)) # check if polygon contains point
print(point.within(polygon)) # check if a point is in the polygon

Note: the polygon does not take into account great circles, therefore it is necessary to split the edges into many segments thus increasing the number of vertices.

Special case: If point lies on borders of Polygon

E.g. print(Polygon([(0, 0), (1, 0), (1, 1)]).contains(Point(0, 0))) will fail

So one can use

print(polygon.touches(point)) # check if point lies on border of polygon

Checking if a point is contained in a polygon/multipolygon - for many points

If you'd like to perform less comparison operations, you can try to use the shapely str-tree feature. Consider the following code:

from shapely.strtree import STRtree
from shapely.geometry import Polygon, Point

# this is the grid. It includes a point for each rectangle center. 
points = [Point(i, j) for i in range(10) for j in range(10)]
tree = STRtree(points). # create a 'database' of points

# this is one of the polygons. 
p = Polygon([[0.5, 0], [2, 0.2], [3, 4], [0.8, 0.5], [0.5, 0]])

res = tree.query(p). # run the query (a single shot). 

# do something with the results
print([o.wkt for o in res])

The result of this code is:

['POINT (3 0)', 'POINT (2 0)', 'POINT (1 0)', 'POINT (1 1)', 'POINT (3 1)',
 'POINT (2 1)', 'POINT (2 2)', 'POINT (3 2)', 'POINT (1 2)', 'POINT (2 3)',
 'POINT (3 3)', 'POINT (1 3)', 'POINT (2 4)', 'POINT (1 4)', 'POINT (3 4)']

How can I determine whether a 2D Point is within a Polygon?

For graphics, I'd rather not prefer integers. Many systems use integers for UI painting (pixels are ints after all), but macOS, for example, uses float for everything. macOS only knows points and a point can translate to one pixel, but depending on monitor resolution, it might translate to something else. On retina screens half a point (0.5/0.5) is pixel. Still, I never noticed that macOS UIs are significantly slower than other UIs. After all, 3D APIs (OpenGL or Direct3D) also work with floats and modern graphics libraries very often take advantage of GPU acceleration.

Now you said speed is your main concern, okay, let's go for speed. Before you run any sophisticated algorithm, first do a simple test. Create an axis aligned bounding box around your polygon. This is very easy, fast and can already save you a lot of calculations. How does that work? Iterate over all points of the polygon and find the min/max values of X and Y.

E.g. you have the points (9/1), (4/3), (2/7), (8/2), (3/6). This means Xmin is 2, Xmax is 9, Ymin is 1 and Ymax is 7. A point outside of the rectangle with the two edges (2/1) and (9/7) cannot be within the polygon.

// p is your point, p.x is the x coord, p.y is the y coord
if (p.x < Xmin || p.x > Xmax || p.y < Ymin || p.y > Ymax) {
    // Definitely not within the polygon!
}

This is the first test to run for any point. As you can see, this test is ultra fast but it's also very coarse. To handle points that are within the bounding rectangle, we need a more sophisticated algorithm. There are a couple of ways how this can be calculated. Which method works also depends on whether the polygon can have holes or will always be solid. Here are examples of solid ones (one convex, one concave):

Polygon without hole

And here's one with a hole:

Polygon with hole

The green one has a hole in the middle!

The easiest algorithm, that can handle all three cases above and is still pretty fast is named ray casting. The idea of the algorithm is pretty simple: Draw a virtual ray from anywhere outside the polygon to your point and count how often it hits a side of the polygon. If the number of hits is even, it's outside of the polygon, if it's odd, it's inside.

Demonstrating how the ray cuts through a polygon

The winding number algorithm would be an alternative, it is more accurate for points being very close to a polygon line but it's also much slower. Ray casting may fail for points too close to a polygon side because of limited floating point precision and rounding issues, but in reality that is hardly a problem, as if a point lies that close to a side, it's often visually not even possible for a viewer to recognize if it is already inside or still outside.

You still have the bounding box of above, remember? Just pick a point outside the bounding box and use it as starting point for your ray. E.g. the point (Xmin - e/p.y) is outside the polygon for sure.

But what is e? Well, e (actually epsilon) gives the bounding box some padding. As I said, ray tracing fails if we start too close to a polygon line. Since the bounding box might equal the polygon (if the polygon is an axis aligned rectangle, the bounding box is equal to the polygon itself!), we need some padding to make this safe, that's all. How big should you choose e? Not too big. It depends on the coordinate system scale you use for drawing. If your pixel step width is 1.0, then just choose 1.0 (yet 0.1 would have worked as well)

Now that we have the ray with its start and end coordinates, the problem shifts from "is the point within the polygon" to "how often does the ray intersects a polygon side". Therefore we can't just work with the polygon points as before, now we need the actual sides. A side is always defined by two points.

side 1: (X1/Y1)-(X2/Y2)
side 2: (X2/Y2)-(X3/Y3)
side 3: (X3/Y3)-(X4/Y4)
:

You need to test the ray against all sides. Consider the ray to be a vector and every side to be a vector. The ray has to hit each side exactly once or never at all. It can't hit the same side twice. Two lines in 2D space will always intersect exactly once, unless they are parallel, in which case they never intersect. However since vectors have a limited length, two vectors might not be parallel and still never intersect because they are too short to ever meet each other.

// Test the ray against all sides
int intersections = 0;
for (side = 0; side < numberOfSides; side++) {
    // Test if current side intersects with ray.
    // If yes, intersections++;
}
if ((intersections & 1) == 1) {
    // Inside of polygon
} else {
    // Outside of polygon
}

So far so well, but how do you test if two vectors intersect? Here's some C code (not tested), that should do the trick:

#define NO 0
#define YES 1
#define COLLINEAR 2

int areIntersecting(
    float v1x1, float v1y1, float v1x2, float v1y2,
    float v2x1, float v2y1, float v2x2, float v2y2
) {
    float d1, d2;
    float a1, a2, b1, b2, c1, c2;

    // Convert vector 1 to a line (line 1) of infinite length.
    // We want the line in linear equation standard form: A*x + B*y + C = 0
    // See: http://en.wikipedia.org/wiki/Linear_equation
    a1 = v1y2 - v1y1;
    b1 = v1x1 - v1x2;
    c1 = (v1x2 * v1y1) - (v1x1 * v1y2);

    // Every point (x,y), that solves the equation above, is on the line,
    // every point that does not solve it, is not. The equation will have a
    // positive result if it is on one side of the line and a negative one 
    // if is on the other side of it. We insert (x1,y1) and (x2,y2) of vector
    // 2 into the equation above.
    d1 = (a1 * v2x1) + (b1 * v2y1) + c1;
    d2 = (a1 * v2x2) + (b1 * v2y2) + c1;

    // If d1 and d2 both have the same sign, they are both on the same side
    // of our line 1 and in that case no intersection is possible. Careful, 
    // 0 is a special case, that's why we don't test ">=" and "<=", 
    // but "<" and ">".
    if (d1 > 0 && d2 > 0) return NO;
    if (d1 < 0 && d2 < 0) return NO;

    // The fact that vector 2 intersected the infinite line 1 above doesn't 
    // mean it also intersects the vector 1. Vector 1 is only a subset of that
    // infinite line 1, so it may have intersected that line before the vector
    // started or after it ended. To know for sure, we have to repeat the
    // the same test the other way round. We start by calculating the 
    // infinite line 2 in linear equation standard form.
    a2 = v2y2 - v2y1;
    b2 = v2x1 - v2x2;
    c2 = (v2x2 * v2y1) - (v2x1 * v2y2);

    // Calculate d1 and d2 again, this time using points of vector 1.
    d1 = (a2 * v1x1) + (b2 * v1y1) + c2;
    d2 = (a2 * v1x2) + (b2 * v1y2) + c2;

    // Again, if both have the same sign (and neither one is 0),
    // no intersection is possible.
    if (d1 > 0 && d2 > 0) return NO;
    if (d1 < 0 && d2 < 0) return NO;

    // If we get here, only two possibilities are left. Either the two
    // vectors intersect in exactly one point or they are collinear, which
    // means they intersect in any number of points from zero to infinite.
    if ((a1 * b2) - (a2 * b1) == 0.0f) return COLLINEAR;

    // If they are not collinear, they must intersect in exactly one point.
    return YES;
}

The input values are the two endpoints of vector 1 (v1x1/v1y1 and v1x2/v1y2) and vector 2 (v2x1/v2y1 and v2x2/v2y2). So you have 2 vectors, 4 points, 8 coordinates. YES and NO are clear. YES increases intersections, NO does nothing.

What about COLLINEAR? It means both vectors lie on the same infinite line, depending on position and length, they don't intersect at all or they intersect in an endless number of points. I'm not absolutely sure how to handle this case, I would not count it as intersection either way. Well, this case is rather rare in practice anyway because of floating point rounding errors; better code would probably not test for == 0.0f but instead for something like < epsilon, where epsilon is a rather small number.

If you need to test a larger number of points, you can certainly speed up the whole thing a bit by keeping the linear equation standard forms of the polygon sides in memory, so you don't have to recalculate these every time. This will save you two floating point multiplications and three floating point subtractions on every test in exchange for storing three floating point values per polygon side in memory. It's a typical memory vs computation time trade off.

Last but not least: If you may use 3D hardware to solve the problem, there is an interesting alternative. Just let the GPU do all the work for you. Create a painting surface that is off screen. Fill it completely with the color black. Now let OpenGL or Direct3D paint your polygon (or even all of your polygons if you just want to test if the point is within any of them, but you don't care for which one) and fill the polygon(s) with a different color, e.g. white. To check if a point is within the polygon, get the color of this point from the drawing surface. This is just a O(1) memory fetch.

Of course this method is only usable if your drawing surface doesn't have to be huge. If it cannot fit into the GPU memory, this method is slower than doing it on the CPU. If it would have to be huge and your GPU supports modern shaders, you can still use the GPU by implementing the ray casting shown above as a GPU shader, which absolutely is possible. For a larger number of polygons or a large number of points to test, this will pay off, consider some GPUs will be able to test 64 to 256 points in parallel. Note however that transferring data from CPU to GPU and back is always expensive, so for just testing a couple of points against a couple of simple polygons, where either the points or the polygons are dynamic and will change frequently, a GPU approach will rarely pay off.

Fastest way to produce a grid of points that fall within a polygon or shape?

I saw that you answered your question (and seems to be happy with using intersection) but also note that shapely (and the underlying geos library) have prepared geometries for more efficient batch operations on some predicates (contains, contains_properly, covers, and intersects).
See Prepared geometry operations.

Adapted from the code in your question, it could be used like so:

from shapely.prepared import prep

# determine maximum edges
polygon = shape(geojson['features'][i]['geometry'])
latmin, lonmin, latmax, lonmax = polygon.bounds

# create prepared polygon
prep_polygon = prep(polygon)

# construct a rectangular mesh
points = []
for lat in np.arange(latmin, latmax, resolution):
    for lon in np.arange(lonmin, lonmax, resolution):
        points.append(Point((round(lat,4), round(lon,4))))

# validate if each point falls inside shape using
# the prepared polygon
valid_points.extend(filter(prep_polygon.contains, points))

Is there a clever way to determine the points that are inside of an arbitrary shape?

It turns out, this algorithm has already been standardized in the matplotlib.path module. You can produce the same results using the Path class. Consider the following changes to the above code:

import numpy as np
from matplotlib import pyplot as plt
from matplotlib import path

r1 = 10
r2 = 4
a = 12  # x shift for circle 2
b = -4  # y shift for circle 2

theta = np.arange(0, 2*np.pi, 0.0006)

r1_complex = r1*np.exp(1j*theta)
r1_x, r1_y = np.real(r1_complex), np.imag(r1_complex)
stacked1 = np.stack((r1_x, r1_y), axis=1)  # A list of coordinates

r2_complex = r2*np.exp(1j*theta)
r2_x, r2_y = np.real(r2_complex) + a, np.imag(r2_complex) + b
stacked2 = np.stack((r2_x, r2_y), axis=1)  # A list of coordinates

p = path.Path(stacked2)
r1_inside = p.contains_points(stacked1)

fig, ax = plt.subplots()

ax.plot(r1_x, r1_y)
ax.plot(r2_x, r2_y)
ax.plot(r1_x[r1_inside], r1_y[r1_inside])

ax.set_aspect('equal')
ax.grid()
plt.show()

This produces the same image without knowledge of the mathematical properties of the shapes.

What's the Fastest Way of Checking If a Point Is Inside a Polygon in Python