How to Convert a Raster to a Vector in Python

Problem statement

A common GIS task is converting raster data into vector polygons. This usually comes up when you have a classified raster, binary mask, or land cover grid and need polygon features for analysis, editing, or export.

Typical examples include:

extracting land cover classes from a classified TIFF
converting a flood mask into polygons
turning valid raster regions into a Shapefile or GeoJSON
creating vector boundaries from a suitability or segmentation raster

In Python, the standard workflow uses:

rasterio to read the raster
rasterio.features.shapes to polygonize pixel regions
GeoPandas to store and export the result
Shapely to build geometry objects

This works best for categorical rasters where neighboring pixels share the same class value.

Quick answer

To convert a raster to vector polygons in Python, the usual workflow is:

open the raster with Rasterio
read the raster band and define a valid-data mask
extract polygons with rasterio.features.shapes
load the results into a GeoDataFrame
save to Shapefile or GeoJSON

Basic example:

import os
import rasterio
from rasterio.features import shapes
import geopandas as gpd
from shapely.geometry import shape

raster_path = "data/landcover.tif"
output_path = "output/landcover_polygons.shp"

os.makedirs("output", exist_ok=True)

with rasterio.open(raster_path) as src:
    band = src.read(1)
    crs = src.crs
    mask = band != src.nodata if src.nodata is not None else band > 0

    records = []
    for geom, value in shapes(band, mask=mask, transform=src.transform):
        records.append({"geometry": shape(geom), "class_value": int(value)})

gdf = gpd.GeoDataFrame(records, crs=crs)
gdf.to_file(output_path)

This approach is best for classified rasters. For continuous rasters like elevation or imagery, reclassify into a small number of classes first; otherwise nearly every pixel can end up as its own polygon.

Step-by-step solution

Step 1: Load the raster and inspect its values

Before polygonizing, check the raster metadata and values. You need to know:

CRS
affine transform
nodata value
raster class values

import numpy as np
import rasterio

raster_path = "data/landcover.tif"

with rasterio.open(raster_path) as src:
    band = src.read(1)
    print("CRS:", src.crs)
    print("Transform:", src.transform)
    print("NoData:", src.nodata)

    unique_values = np.unique(band)
    print("Unique values:", unique_values[:20])
    print("Total unique values:", len(unique_values))

If the raster has only a small set of repeated values such as 1, 2, 3, 4, it is suitable for polygonizing. If it has many unique values, it is probably continuous data and should usually be reclassified first.

For very large rasters, np.unique() can be slow or memory-intensive. In that case, inspect a smaller clipped area first or review the raster class definitions from its source.
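One lightweight way to get a first impression of the class values is to look at a regular subsample instead of every cell. The sketch below assumes the band is a NumPy array; `sample_unique_values` and its `step` parameter are hypothetical names, and the random grid only stands in for a real raster.

```python
import numpy as np

def sample_unique_values(band, step=10, nodata=None):
    """Return unique values and counts from every `step`-th row and column."""
    sample = band[::step, ::step]
    if nodata is not None:
        # Drop nodata cells so they do not show up as a class.
        sample = sample[sample != nodata]
    return np.unique(sample, return_counts=True)

# Synthetic 1000 x 1000 grid with four classes, standing in for src.read(1).
rng = np.random.default_rng(0)
band = rng.integers(1, 5, size=(1000, 1000), dtype=np.uint8)

values, counts = sample_unique_values(band, step=10)
print("Sampled values:", values)
print("Cells sampled:", counts.sum())
```

A subsample can miss rare classes, so treat the result as a quick check rather than a full inventory. When the raster is too large to load at all, passing a window to src.read() limits what is read from disk.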

Step 2: Filter out nodata or background pixels

You usually do not want nodata or background cells turned into polygons. Build a mask so only valid cells are included.

with rasterio.open(raster_path) as src:
    band = src.read(1)

    if src.nodata is not None:
        mask = band != src.nodata
    else:
        mask = band > 0  # example if 0 is background

The mask controls which cells are polygonized. This is important for clean output.
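The comparison band != src.nodata does not work when the nodata value is NaN, because NaN never compares equal to anything and every cell would be treated as valid. A minimal sketch of a mask builder that covers this case follows; `build_valid_mask` is a hypothetical helper, and the fallback to band > 0 simply mirrors the background assumption used above.

```python
import numpy as np

def build_valid_mask(band, nodata=None):
    """Return a boolean array that is True where cells should be polygonized."""
    if nodata is None:
        # Assumption carried over from the text: 0 is background.
        return band > 0
    if np.isnan(nodata):
        # NaN != NaN is always True, so test for NaN explicitly.
        return ~np.isnan(band)
    return band != nodata

float_band = np.array([[np.nan, 1.0], [2.0, np.nan]], dtype=np.float32)
int_band = np.array([[0, 1], [255, 2]], dtype=np.uint8)

print(build_valid_mask(float_band, nodata=float("nan")))
print(build_valid_mask(int_band, nodata=255.0))
```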

Step 3: Convert raster regions to vector geometries

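Use rasterio.features.shapes to extract polygons. It yields (geometry, value) pairs, where each geometry is a GeoJSON-like dictionary. Adjacent pixels with the same value are grouped into polygon features; by default only pixels that share an edge (connectivity=4) count as adjacent.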

from rasterio.features import shapes

with rasterio.open(raster_path) as src:
    band = src.read(1)
    mask = band != src.nodata if src.nodata is not None else band > 0

    results = shapes(band, mask=mask, transform=src.transform)

    for geom, value in results:
        print(value, geom["type"])
        break

Pass the transform so raster row and column positions become real map coordinates. Without it, shapes falls back to an identity transform and returns geometries in pixel coordinates.
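To make the role of the transform concrete, the arithmetic can be written out by hand. An affine transform has six coefficients, which rasterio exposes as src.transform.a through src.transform.f. The sketch below uses made-up values for a north-up raster with 30 m pixels; `pixel_to_map` is a hypothetical helper, not a rasterio function.

```python
def pixel_to_map(col, row, a, b, c, d, e, f):
    """Apply an affine transform to a pixel corner (col, row)."""
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y

# Illustrative coefficients: 30 m pixels, upper-left corner at
# x=500000, y=4100000, no rotation. e is negative because rows
# increase downward while map y increases upward.
coeffs = (30.0, 0.0, 500000.0, 0.0, -30.0, 4100000.0)

print(pixel_to_map(0, 0, *coeffs))   # upper-left corner of the raster
print(pixel_to_map(10, 5, *coeffs))  # 10 columns right, 5 rows down
```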

Step 4: Build a GeoDataFrame from the extracted shapes

Convert the GeoJSON-like geometry dictionaries into Shapely geometries, then create a GeoDataFrame.

import geopandas as gpd
from shapely.geometry import shape

with rasterio.open(raster_path) as src:
    band = src.read(1)
    crs = src.crs
    mask = band != src.nodata if src.nodata is not None else band > 0

    records = []
    for geom, value in shapes(band, mask=mask, transform=src.transform):
        records.append({
            "geometry": shape(geom),
            "class_value": int(value)
        })

gdf = gpd.GeoDataFrame(records, crs=crs)
print(gdf.head())

Now each polygon has a class_value attribute from the raster.

Step 5: Save the output as Shapefile or GeoJSON

Export the GeoDataFrame in the format you need.

import os

os.makedirs("output", exist_ok=True)

gdf.to_file("output/landcover_polygons.shp")
gdf.to_file("output/landcover_polygons.geojson", driver="GeoJSON")

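Shapefile is widely supported, but its attribute field names are limited to 10 characters, so class_value is typically written as class_valu. GeoJSON has no such limit and is often easier for web mapping and data exchange.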

Code examples

Example 1: Convert a classified raster to polygons

This is the standard workflow for a land cover raster.

import os
import rasterio
from rasterio.features import shapes
import geopandas as gpd
from shapely.geometry import shape

input_raster = "data/landcover.tif"
output_vector = "output/landcover_polygons.shp"

os.makedirs("output", exist_ok=True)

with rasterio.open(input_raster) as src:
    band = src.read(1)
    crs = src.crs
    mask = band != src.nodata if src.nodata is not None else band > 0

    features = [
        {"geometry": shape(geom), "class_value": int(value)}
        for geom, value in shapes(band, mask=mask, transform=src.transform)
    ]

gdf = gpd.GeoDataFrame(features, crs=crs)
gdf.to_file(output_vector)

Example 2: Convert only one raster class to vector

If you only want one value, such as flooded cells coded as 1, create a boolean mask for that class.

import os
import rasterio
from rasterio.features import shapes
import geopandas as gpd
from shapely.geometry import shape
import numpy as np

input_raster = "data/flood_mask.tif"
output_vector = "output/flooded_areas.shp"
target_value = 1

os.makedirs("output", exist_ok=True)

with rasterio.open(input_raster) as src:
    band = src.read(1)
    class_mask = band == target_value

    features = []
    for geom, value in shapes(
        band.astype(np.int16),
        mask=class_mask,
        transform=src.transform
    ):
        if int(value) == target_value:
            features.append({
                "geometry": shape(geom),
                "class_value": int(value)
            })

    gdf = gpd.GeoDataFrame(features, crs=src.crs)

gdf.to_file(output_vector)

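This is useful when only one class matters, such as flooded versus not flooded. Because the mask already restricts extraction to the target class, the value check inside the loop acts only as a safeguard.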
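Example 2 casts the band to int16 before polygonizing. The likely reason is that rasterio.features.shapes is documented as accepting only a limited set of dtypes, so arrays such as int64, float64, or bool may be rejected. The sketch below shows one way to normalize the dtype first. `to_supported_dtype` is a hypothetical helper, and the dtype list is an assumption that should be checked against the documentation for your rasterio version.

```python
import numpy as np

# Assumption: dtypes listed in the rasterio.features.shapes documentation.
SUPPORTED_DTYPES = ("int16", "int32", "uint8", "uint16", "float32")

def to_supported_dtype(band):
    """Return band unchanged if its dtype is supported, else a cast copy."""
    if band.dtype.name in SUPPORTED_DTYPES:
        return band
    if band.dtype == np.bool_:
        return band.astype(np.uint8)
    if np.issubdtype(band.dtype, np.integer):
        # Guard against silent wrap-around when narrowing to int32.
        info = np.iinfo(np.int32)
        if band.size and (band.min() < info.min or band.max() > info.max):
            raise ValueError("values do not fit in int32; reclassify first")
        return band.astype(np.int32)
    # Floating-point data: float32 still represents small integer
    # class codes exactly, but loses precision on continuous values.
    return band.astype(np.float32)

labels = np.array([[1, 2], [3, 4]], dtype=np.int64)
print(to_supported_dtype(labels).dtype)
```

Unlike a fixed cast to int16, this raises an error instead of wrapping values that do not fit.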

Example 3: Export raster polygons to GeoJSON

To create GeoJSON output, use the same extraction process and write GeoJSON.

import os
import rasterio
from rasterio.features import shapes
import geopandas as gpd
from shapely.geometry import shape

input_raster = "data/landcover.tif"
output_geojson = "output/landcover_polygons.geojson"

os.makedirs("output", exist_ok=True)

with rasterio.open(input_raster) as src:
    band = src.read(1)
    crs = src.crs
    mask = band != src.nodata if src.nodata is not None else band > 0

    records = []
    for geom, value in shapes(band, mask=mask, transform=src.transform):
        records.append({
            "geometry": shape(geom),
            "class_value": int(value)
        })

gdf = gpd.GeoDataFrame(records, crs=crs)
gdf.to_file(output_geojson, driver="GeoJSON")

Explanation

When you polygonize a raster in Python, the raster is scanned for groups of adjacent pixels with the same value. Each region becomes a polygon.

The key parts are:

raster band values: define which class each cell belongs to
mask: limits extraction to valid cells
transform: converts pixel coordinates into map coordinates
CRS: keeps the output aligned with other GIS layers

rasterio.features.shapes does not create one polygon per cell unless every cell is isolated. It groups neighboring cells with the same value into larger polygons.

This is different from other raster-to-vector tasks:

polygonizing creates area features
point conversion creates a point for each cell
contour extraction creates lines from continuous surfaces

For most workflows, polygonizing is the right method when the raster represents zones or classes.

Edge cases and notes

Continuous rasters can produce too many polygons

Elevation rasters, temperature grids, and imagery often contain many unique values. If you polygonize them directly, the output may contain a huge number of tiny polygons. Reclassify first.
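One simple way to reclassify is to bin the values with np.digitize before polygonizing. The break values and the tiny elevation array below are made up for illustration; pick breaks that are meaningful for your data.

```python
import numpy as np

# Illustrative elevation values in meters.
elevation = np.array([
    [12.5, 48.0, 103.2],
    [250.0, 75.1, 9.9],
], dtype=np.float32)

# Class 1: below 50, class 2: 50 to <100, class 3: 100 to <200, class 4: 200+.
breaks = [50, 100, 200]

# np.digitize returns 0-based bin indexes; add 1 so 0 stays free for background.
classes = np.digitize(elevation, breaks).astype(np.uint8) + 1

print(classes)
```

The resulting uint8 array has only four distinct values, so it can be polygonized like any other classified raster.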

Small pixel regions may create noisy output

Classified rasters often contain isolated cells and slivers. These become small polygons. A common follow-up step is to:

dissolve polygons by class
filter small areas
smooth boundaries if appropriate
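The first two follow-up steps can be sketched with GeoPandas. The boxes below stand in for polygonized output, and `min_area` is an illustrative threshold expressed in the units of the layer's CRS (the CRS is omitted here for brevity; real output would carry the raster CRS).

```python
import geopandas as gpd
from shapely.geometry import box

# Stand-in for polygonized output: two adjacent class-1 polygons,
# one tiny class-2 sliver, and one larger class-2 polygon.
gdf = gpd.GeoDataFrame(
    {"class_value": [1, 1, 2, 2]},
    geometry=[
        box(0, 0, 10, 10),
        box(10, 0, 20, 10),
        box(0, 10, 1, 11),
        box(30, 30, 40, 40),
    ],
)

min_area = 5.0  # illustrative threshold

# Drop small polygons first, then merge what remains by class.
filtered = gdf[gdf.geometry.area >= min_area]
dissolved = filtered.dissolve(by="class_value").reset_index()

print(dissolved.assign(area=dissolved.geometry.area)[["class_value", "area"]])
```

Note that dissolving produces one feature per class, which may be a multipart geometry. If you need separate polygons per region, filter by area and skip the dissolve.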

Nodata handling affects the result

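If nodata is missing or incorrect, you may get unwanted polygons around empty cells. Always check src.nodata and confirm the mask logic. A NaN nodata value never compares equal to anything, so band != src.nodata marks every cell as valid; test for it with np.isnan(band) instead.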

CRS must be preserved

The output GeoDataFrame should use the raster CRS:

gdf = gpd.GeoDataFrame(records, crs=crs)

If the output does not align with other layers, the problem is usually CRS assignment or later reprojection.

Invalid geometries can appear

Polygonized output can sometimes include invalid geometries, especially from complex raster edges. Check validity if later overlay operations fail:

invalid = ~gdf.is_valid
print(gdf[invalid])
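If invalid polygons do show up, one common repair is Shapely's make_valid function. The sketch below assumes Shapely 1.8 or newer and uses a self-intersecting "bowtie" polygon as a stand-in for a problematic polygonized shape.

```python
from shapely.geometry import Polygon
from shapely.validation import make_valid

# A bowtie: the ring crosses itself at (1, 1), which makes it invalid.
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
print("Valid before:", bowtie.is_valid)

fixed = make_valid(bowtie)
print("Valid after:", fixed.is_valid)
print("Geometry type after repair:", fixed.geom_type)
```

On a GeoDataFrame, the same repair can be applied with gdf["geometry"] = gdf.geometry.apply(make_valid). The repaired geometry may change type, for example from a polygon to a multipolygon, so check the result before writing to a format that expects a single geometry type.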

Large rasters can be slow to polygonize

Very large rasters can be slow and memory-intensive to process. The output can also contain complex or multipart geometries. In practice, it often helps to clip the raster to the area of interest first.

Internal links

If you need background on when this workflow makes sense, see Raster vs vector data in GIS.

Related next steps:

If your output layer does not line up with other data, see How to fix CRS mismatch errors in GeoPandas.

FAQ

How do I convert a raster to polygons in Python?

Use rasterio to read the raster and rasterio.features.shapes to extract polygon geometries from grouped pixel regions. Then store the result in a GeoDataFrame and export it.

What Python library is used to polygonize a raster?

The standard tool is rasterio.features.shapes. GeoPandas is commonly used after that to manage and save the vector output.

Can I convert only one raster value to vector polygons?

Yes. Build a mask such as band == 1 and polygonize only that class. This is common for binary masks like flooded versus not flooded.

Why does raster to vector conversion create too many polygons?

This usually happens when the raster is continuous or noisy. Many unique values or isolated pixels create many separate polygon regions. Reclassification or filtering is often needed first.

How do I save polygonized raster output as a Shapefile or GeoJSON?

Use GeoPandas:

gdf.to_file("output.shp")
gdf.to_file("output.geojson", driver="GeoJSON")

Shapefile is widely supported, while GeoJSON is often easier for web and exchange workflows.
