Fast, memory-efficient, zero-copy spatial indexes for Python.
This documentation is for the Python bindings. Refer here for the Rust crate documentation.
- An R-tree and k-d tree written in Rust, compiled for Python.
- Fast. The Rust core and immutability lends the spatial indexes to be very fast. Additionally, building the indexes accepts vectorized Numpy or Arrow input.
- Memory efficient. The index is fully packed, meaning that all nodes are at full capacity (except for the last node at each tree level). This means the RTree and k-d tree use less memory.
- Bounded memory. For any given number of items and node size, you can infer the total memory used by the RTree or KDTree.
- Multiple R-tree sorting methods. Currently, hilbert and sort-tile-recursive (STR) sorting methods are implemented.
- ABI-stable: the index is contained in a single buffer, compatible with the
flatbushandkdbushJavaScript libraries. Being ABI-stable means that the spatial index can be persisted for later use or shared zero-copy between Rust and Python. - Supports float64 or float32 coordinates: for 2x memory savings, use float32 coordinates in the spatial index.
pip install geoindex-rs
or with Conda:
conda install geoindex-rs
Building an RTree and searching for nearest neighbors.
import numpy as np
from geoindex_rs import rtree as rt
# Three bounding boxes
min_x = np.arange(0, 3)
min_y = np.arange(0, 3)
max_x = np.arange(2, 5)
max_y = np.arange(2, 5)
# When creating the builder, the total number of items must be passed
builder = rt.RTreeBuilder(num_items=3)
# Add the bounding boxes to the builder
builder.add(min_x, min_y, max_x, max_y)
# Consume the builder (sorting the index) and create the RTree.
tree = builder.finish()
# Find the nearest neighbors in the RTree to the point (5, 5)
results = rt.neighbors(tree, 5, 5)
# For performance, results are returned as an Arrow array.
assert results.to_pylist() == [2, 1, 0]Building a KDTree and searching within a bounding box.
import numpy as np
from geoindex_rs import kdtree as kd
# Three points: (0, 2), (1, 3), (2, 4)
x = np.arange(0, 3)
y = np.arange(2, 5)
# When creating the builder, the total number of items must be passed
builder = kd.KDTreeBuilder(3)
# Add the points to the builder
builder.add(x, y)
# Consume the builder (sorting the index) and create the KDTree.
tree = builder.finish()
# Search within this bounding box:
results = kd.range(tree, 2, 4, 7, 9)
# For performance, results are returned as an Arrow array.
assert results.to_pylist() == [2]The RTree and KDTree classes implement the Python buffer protocol, so you
can pass an instance of the index directly to bytes to copy the underlying
spatial index into a buffer. Then you can save that buffer somewhere, load it
again, and use it directly for queries!
import numpy as np
from geoindex_rs import rtree as rt
min_x = np.arange(0, 3)
min_y = np.arange(0, 3)
max_x = np.arange(2, 5)
max_y = np.arange(2, 5)
builder = rt.RTreeBuilder(num_items=3)
builder.add(min_x, min_y, max_x, max_y)
tree = builder.finish()
# Copy to a Python bytes object
copied_tree = bytes(tree)
# The entire RTree is contained within this 144 byte buffer
assert len(copied_tree) == 144
# We can use the bytes object (or anything else implementing the Python buffer
# protocol) directly in searches
results = rt.neighbors(copied_tree, 5, 5)
assert results.to_pylist() == [2, 1, 0]- Trees are immutable. After creating the index, items can no longer be added or removed.
- Only two-dimensional indexes is supported. This can still be used with higher-dimensional input data as long as it's fine to only index two of the dimensions.
- Queries return insertion indexes into the input set, so you must manage your own collections.