Design¶
Pyspatialml provides access to raster datasets using two main structures, comprising the
Raster
and RasterLayer
classes.
Raster Objects¶
The main class that facilitates working with multiple raster datasets is the
Raster
class. The Raster
object takes a list
GDAL-supported raster datasets and ‘stacks’ them into a single Raster
object. The underlying file-based raster datasets are not physically-stacked,
but rather the Raster object internally represents each band within the datasets
as a RasterLayer
. This means that metadata regarding what each raster
dataset represents (e.g. the dataset’s name) can be retained, and additional
raster datasets can easily be added or removed from the stack without physical
on-disk changes.
Note these raster datasets need to be spatially aligned in terms of their extent, resolution and coordinate reference system.
Creating a new Raster object¶
There are three methods of creating a new Raster object:
1. Raster(file_path=[raster1.tif, raster2.tif, raster3.tif])
creates a
Raster object from existing file-based GDAL-supported datasets.
2. Raster(arr=new_numpy_array, crs=crs, transform=transform)
creates a
Raster object from a 3D numpy array (band, row, column). The crs
and
transform
arguments are optional but are required to provide coordinate
reference system information to the Raster object. The crs argument has to be
represented by `rasterio.crs.CRS`
object, and the transform parameter requires
a `affine.Affine`
object.
3. Raster(layers=[RasterLayer1, RasterLayer2, RasterLayer3])
creates a
Raster object from a single or list of RasterLayer objects. RasterLayers are a
thin wrapper around rasterio.Band objects with additional methods. This is
mostly used internally. A RasterLayer itself is initiated directly from a
rasterio.Band object.
Generally, Pyspatialml intends users to work with the Raster object. However,
access to individual RasterLayer objects, or the underlying rasterio.band
datasets can be useful if pyspatialml is being used in conjunction with other
functions and methods in the rasterio
package.
Overview of attributes and methods¶
Attributes¶
Raster.loc
: Access pyspatialml.RasterLayer objects within a Raster using a key, or a list of keys.Raster.iloc
: Access pyspatialml.RasterLayer objects using an index position.Raster.files
: A list of the on-disk files that the Raster object references.Raster.dtypes
: A list of numpy dtypes for each RasterLayer.Raster.nodatavals
: A list of the nodata values for each RasterLayer.Raster.count
: The number of RasterLayers in the Raster.Raster.res
: The resolution in (x, y) dimensions of the Raster.Raster.meta
: A dict containing the raster metadata.Raster.names
: A list of the RasterLayer names.Raster.block_shape
: The default block_shape in (rows, cols) for reading windows of data in the Raster for out-of-memory processing.
Methods¶
read
: eads data from the Raster object into a numpy array.write
: Write the Raster object to a file.predict_proba
: Apply class probability prediction of a scikit learn model to a Raster.predict
: Apply prediction of a scikit learn model to a Raster.append
: Method to add new RasterLayers to a Raster object.drop
: Drop individual RasterLayers from a Raster object.rename
: Rename a RasterLayer within the Raster object.plot
: Plot a Raster object as a raster matrix.mask
: Mask a Raster object based on the outline of shapes in ageopandas.GeoDataFrame
.intersect
: Perform a intersect operation on the Raster object.crop
: Crops a Raster object by the supplied bounds.to_crs
: Reprojects a Raster object to a different crs.aggregate
: Aggregates a raster to (usually) a coarser grid cell size.apply
: Apply user-supplied function to a Raster object.block_shapes
: Generator for windows for optimal reading and writing based on the raster.astype
: Not currently implemented.
RasterLayer¶
A RasterLayer
is an object that represents a single raster band. It is based
on a rasterio.band
object with some additional attributes and methods. A RasterLayer
is used because the rasterio.Band.ds.read
method reads all bands from a multi-band dataset,
whereas the RasterLayer read method always refers to a single band.
A RasterLayer object is generally not intended to be initiated directly, but rather is used
internally as part of a Raster
to represent individual bands. However, a RasterLayer
can be initiated from a rasterio.band object.
Overview of attributes and methods¶
Methods encapsulated in RasterLayer objects represent those that typically would only be applied to a single-band of a raster, i.e. sieve-clump, distance to non-NaN pixels, or arithmetic operations on individual layers.
Attributes¶
bidx
: The band index of the RasterLayer within the file dataset.dtype
: The data type of the RasterLayer.nodata
: The number that is used to represent nodata pixels in the RasterLayer.file
: The file path to the dataset.ds
: The underlying rasterio.band object.driver
: The name of the GDAL format driver.meta
: A python dict storing the RasterLayer metadata.cmap
: The name of matplotlib map, or a custom matplotlib.cm.LinearSegmentedColormap or ListedColormap object.norm
: A matplotlib.colors.Normalize to apply to the RasterLayer.count
: Number of layers; always equal to 1.
Methods¶
close()
: Closes a RasterLayer.read()
: Reads data from the RasterLayer into a numpy array.write()
: Write the RasterLayer to a GDAL-supported raster dataset.fill()
: Fill nodata gaps in a RasterLayer. Wrapper around therasterio.fill.fillnodata
method.sieve()
: Replace pixels with their largest neighbor. Wrapper around therasterio.features.sieve
method.distance()
: Calculate euclidean grid distances to non-NaN pixels.plot()
: Plot a RasterLayer usingmatplotlib.pyplot.imshow
.