Spatial Join Returns Empty Results in GeoPandas: How to Fix It

Problem statement

A common GIS task is joining one layer to another based on location, such as matching customer points to sales territories, addresses to census tracts, or parcels to administrative boundaries. In GeoPandas, this is usually done with geopandas.sjoin().

The problem is that sjoin() can run without an error and still return an empty GeoDataFrame, or return far fewer matches than expected. In practice, this usually happens for one of these reasons:

  • the two layers use different CRS
  • the datasets do not actually overlap
  • the spatial predicate is wrong for the task
  • some geometries are null, empty, or invalid
  • the join direction or join type hides the missing matches

This guide shows how to diagnose each cause and apply the right fix.

Quick answer

If geopandas.sjoin() returns no matches:

  1. check that both GeoDataFrames have the same CRS with .crs
  2. reproject with .to_crs(), not .set_crs()
  3. compare .total_bounds to confirm the layers overlap
  4. test a different predicate such as intersects
  5. inspect null, empty, and invalid geometries before joining

In many cases, the fix is simply reprojecting one layer correctly and rerunning the join.

Step-by-step solution

Check whether both GeoDataFrames use the same CRS

A CRS mismatch is one of the most common reasons a spatial join returns no rows.

Compare gdf.crs values before running sjoin

import geopandas as gpd

points = gpd.read_file("data/customer_locations.shp")
polygons = gpd.read_file("data/sales_territories.shp")

print("Points CRS:", points.crs)
print("Polygons CRS:", polygons.crs)

If one layer is in EPSG:4326 and the other is in EPSG:3857, the coordinates are not in the same system, so the geometries will not line up.

Reproject one layer with to_crs() instead of assigning CRS manually

Use to_crs() when the layer already has the correct CRS definition but needs to be converted.

if points.crs != polygons.crs:
    points = points.to_crs(polygons.crs)

joined = gpd.sjoin(points, polygons, how="inner", predicate="intersects")
print(joined.head())

Common mistake: confusing set_crs() with to_crs()

set_crs() only labels the data. It does not transform coordinates.

Wrong approach:

points = points.set_crs(polygons.crs)

Correct approach:

points = points.to_crs(polygons.crs)

Use set_crs() only when the data has no CRS assigned and you know what the original CRS should be.

Confirm the layers actually overlap in space

Even if both layers have the same CRS, they may still be in different places.

Inspect total_bounds for both layers

print("Points bounds:", points.total_bounds)
print("Polygons bounds:", polygons.total_bounds)

total_bounds returns:

[minx, miny, maxx, maxy]

If one layer has bounds near [-74, 40, -73, 41] and the other has bounds near [500000, 4500000, 600000, 4600000], they are probably still in different coordinate systems or were imported incorrectly.

Plot both layers together to verify overlap

A visual check is often the fastest way to confirm whether a join should produce matches.

ax = polygons.plot(facecolor="none", edgecolor="black", figsize=(8, 8))
points.plot(ax=ax, color="red", markersize=10)

If the points are not visible or appear far away from the polygons, the join will return no matches.

Watch for wrong CRS metadata and bad source files

This often happens when one file is longitude/latitude and the other is projected in meters. It can also happen when a shapefile has a wrong .prj file or missing CRS metadata.

Use the correct spatial predicate

The predicate controls what counts as a match. If you choose the wrong one, the join can return zero rows even when the layers are close or overlapping.

Start with intersects when debugging

intersects is usually the safest first test because it matches geometries that overlap, cross, touch, or fall within one another.

joined = gpd.sjoin(points, polygons, how="inner", predicate="intersects")

Be careful with within, contains, and direction

For a point-in-polygon join:

  • points are usually within polygons
  • polygons usually contain points

Direction matters. This can return no matches:

gpd.sjoin(points, polygons, how="inner", predicate="contains")

That asks whether each point contains a polygon, which is false.

Test a few predicates quickly

for pred in ["intersects", "within", "contains", "touches"]:
    try:
        result = gpd.sjoin(points, polygons, how="inner", predicate=pred)
        print(pred, len(result))
    except Exception as e:
        print(pred, "not available:", e)

Check available predicates in your environment

Available predicates can vary by GeoPandas, Shapely, and the spatial index backend.

print(points.sindex.valid_query_predicates)

If a predicate is unsupported in your environment, GeoPandas will usually raise an error. Check the available predicates before running the join.

Check for empty, null, or invalid geometries

Bad geometry can prevent expected matches.

Find null geometries with isna()

print("Null point geometries:", points.geometry.isna().sum())
print("Null polygon geometries:", polygons.geometry.isna().sum())

Find empty geometries with is_empty

print("Empty point geometries:", points.geometry.is_empty.sum())
print("Empty polygon geometries:", polygons.geometry.is_empty.sum())

Detect invalid geometries with is_valid

print("Invalid polygons:", (~polygons.geometry.is_valid).sum())

Try repairing invalid polygons before joining

For some polygon issues, buffer(0) can repair self-intersections, but it is not guaranteed to fix every invalid geometry.

polygons = polygons.copy()
polygons["geometry"] = polygons.geometry.buffer(0)

print("Invalid polygons after repair:", (~polygons.geometry.is_valid).sum())

Then rerun the join:

joined = gpd.sjoin(points, polygons, how="inner", predicate="intersects")

If buffer(0) changes geometry in a way you do not want, go back to the source data and fix it there.

Make sure the geometry types match the task

The geometry types in each layer affect which predicates make sense.

Point-in-polygon joins

If your goal is assigning points to polygons, the usual pattern is:

joined = gpd.sjoin(points, polygons, how="left", predicate="within")

Polygon-to-polygon joins

For polygon overlays, intersects is often more useful than within or contains, especially when boundaries only partially overlap.

Line and polygon cases

If one layer contains lines and the other polygons, within may be too strict. If geometries only touch at boundaries, contains may return no matches while intersects or touches returns some.

Verify the join direction and join type

The join may not be empty. It may only be hiding unmatched rows.

Difference between left, right, and inner

  • inner: returns only matched rows
  • left: keeps all rows from the left GeoDataFrame
  • right: keeps all rows from the right GeoDataFrame

If you use inner, unmatched features disappear completely, which makes debugging harder.

Keep the layer you want to inspect on the left

Use the dataset you want to preserve as the left table.

joined = gpd.sjoin(points, polygons, how="left", predicate="within")

Inspect unmatched rows after the join

unmatched = joined[joined["index_right"].isna()]
print(unmatched[["geometry"]].head())
print("Unmatched rows:", len(unmatched))

This is useful when every left-side feature failed to match and you need to inspect where they are.

Run a minimal test to isolate the issue

A small test can help confirm whether the problem is with the data or with your join logic.

Test with a small subset of rows

points_sample = points.iloc[:5].copy()
polygons_sample = polygons.iloc[:5].copy()

test_join = gpd.sjoin(points_sample, polygons_sample, how="inner", predicate="intersects")
print(test_join)

This only helps if the sample features are expected to overlap. A zero-row result from arbitrary first rows does not prove sjoin() is the problem.

Compare one geometry against another directly

pt = points.geometry.iloc[0]
poly = polygons.geometry.iloc[0]

print("Intersects:", pt.intersects(poly))
print("Within:", pt.within(poly))
print("Touches:", pt.touches(poly))

If direct geometry tests also return False, the problem is usually CRS, bad geometry, the wrong predicate, or no real overlap.

Code examples

Example: debug a point-to-polygon join

This is a practical sequence for checking a typical customer-points-to-territories workflow.

import geopandas as gpd

points = gpd.read_file("data/customer_locations.shp")
polygons = gpd.read_file("data/sales_territories.shp")

# 1. Check CRS
print(points.crs)
print(polygons.crs)

if points.crs != polygons.crs:
    points = points.to_crs(polygons.crs)

# 2. Check extents
print(points.total_bounds)
print(polygons.total_bounds)

# 3. Check geometry quality
print("Null points:", points.geometry.isna().sum())
print("Empty points:", points.geometry.is_empty.sum())
print("Invalid polygons:", (~polygons.geometry.is_valid).sum())

# 4. Run a safe first join
joined = gpd.sjoin(points, polygons, how="left", predicate="intersects")

# 5. Inspect unmatched rows
unmatched = joined[joined["index_right"].isna()]
print("Unmatched rows:", len(unmatched))
print(joined.head())

Explanation

A spatial join only returns matches when GeoPandas finds the expected spatial relationship between features in two layers. That depends on four things:

  • both layers must use compatible coordinates
  • the features must actually overlap or touch in space
  • the predicate must match the relationship you want
  • the geometries must be valid enough for spatial operations

If any one of these is wrong, sjoin() can run successfully and still return zero rows.

GeoPandas uses a spatial index to narrow down candidate matches and then applies a predicate such as intersects or within. If the extents do not overlap, the index finds no candidates. If the predicate is too strict, the candidates are rejected. If geometries are null, empty, or invalid, expected relationships may fail.

Edge cases and notes

  • Layers that only touch at borders: within and contains may return no matches, while intersects or touches may work.
  • Multipart geometries: MultiPolygons can behave differently than expected when testing containment near holes or gaps.
  • Precision issues: Very small geometries or nearly coincident boundaries can produce unexpected results.
  • Version differences: Predicate support can vary across GeoPandas and Shapely versions. Check valid_query_predicates.
  • Missing or wrong CRS metadata: to_crs() cannot fix incorrectly labeled source data until the original CRS is identified.
  • Sampling for tests: head() or iloc[:5] is only useful if those features are expected to overlap.

If you need the underlying concept, read How Spatial Join Works in GeoPandas.

For the standard workflow, see Perform a Spatial Join in GeoPandas and Reproject a GeoDataFrame in GeoPandas.

If the problem looks projection-related, use GeoPandas CRS Mismatch: How to Fix Projection Problems.

You may also need How to Fix CRS Mismatch in GeoPandas or GeoPandas Not Reading Shapefile: Common Causes and Fixes if the issue starts earlier in the workflow.

FAQ

Why does sjoin() return an empty GeoDataFrame without an error?

Because the function can run correctly even when no features satisfy the spatial relationship. The most common causes are CRS mismatch, no spatial overlap, the wrong predicate, or bad geometries.

Should I use set_crs() or to_crs() before a spatial join?

Use to_crs() to reproject data into the same coordinate system. Use set_crs() only when the layer is missing CRS metadata and you know the correct original CRS.

Which predicate should I use for point-in-polygon joins?

Start with within when points are the left layer and polygons are the right layer. If you are debugging, try intersects first because it is less restrictive.

How do I check whether two GeoDataFrames actually overlap?

Compare .total_bounds for both layers and plot them together on the same axis. If the extents do not overlap visually or numerically, the join will return no matches.