With the new releases of odc-geo
and odc-stac
there are a lot of new features that make working with geospatial data in a notebook much more pleasant and productive. This notebook walks through some of the features
pip install odc-geo==0.2.0 odc-stac==0.3.0
# OR
mambba install odc-geo==0.2.0 odc-stac==0.3.0
This notebook was tested on Planetary Computer Hub. It loads 16-days worth of Landsat8 scenes in a UTM zone 33 (covers parts of Europe and Africa). Imagery is loaded at 1km per pixel resolution in Web Mercator projection. We convert pixels to RGBA<uint8>
by clipping 6_000, 30_000 -> 0, 255
and display result on the map.
You can download this notebook to try it out on your own.
Query 16 days at the start of June 2021. Landsat 8 revisit is 16 days so we should get one observation from each scene in the queried region.
import pystac_client
catalog = pystac_client.Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")
items_all = catalog.search(
collections=["landsat-8-c2-l2"],
datetime="2021-06-01T00:00:00Z/2021-06-16T23:59:59Z",
bbox=[10, -71, 21, 71],
).get_all_items()
# only keep items in utm=33, to select one tall column of data
items = [item for item in items_all if item.properties["proj:epsg"] in [32633, 32733]]
print(f"Using {len(items)} of {len(items_all)} items (utm 33)")
Using 272 of 605 items (utm 33)
Request to load data with odc.stac.load
.
red
,green
,blue
bandsEPSG:3857
)uint16
pixels 0
marking nodata pixels (this can also be extracted from STAC extension data if configured)average
resampling for converting to desired projectiongroupby=
)groupby="time"
or groupby="solar_day"
chunks=
patch_url=planetary_computer.sign
No pixel loading happens yet, this just builds a recipe that we will execute later on.
import planetary_computer
from dask.distributed import Client
from odc.stac import configure_rio
from odc.stac import load as stac_load
configure_rio(cloud_defaults=True) # No need to look for side-car files
# fmt: off
xx = stac_load(
items,
["red", "green", "blue"], # Can be omited, then you'll get all the bancs
#################################################################
# All options below have reasonable defaults derived
# from supplied metadata.
#################################################################
crs="epsg:3857", # Defaults to most common CRS across Items
resolution=1000, # Defaults to native resolution
dtype="uint16", # Defaults to native pixel type
nodata=0, # Defaults to native
resampling="average", # Default is `nearest`
# Default is to group by time
# groupby="solar_day", # is often what one wants
groupby=lambda item, parsed, idx: 1, # all items in the same "group"
# Enable and configure Dask
# since each pixel is 1x1km we opt for smaller chunk sizes
# Image will be Tall and narrow, so only chunk along Y
chunks={"x": -1, "y": 1000},
# Needed for planetary computer data sources
patch_url=planetary_computer.sign,
)
# fmt: on
thumb = xx.odc.to_rgba(vmin=6_000, vmax=30_000)
display(xx)
display(thumb)
<xarray.Dataset> Dimensions: (y: 15386, x: 1314, time: 1) Coordinates: * y (y) float64 1.132e+07 1.132e+07 ... -4.064e+06 -4.064e+06 * x (x) float64 9.885e+05 9.895e+05 ... 2.300e+06 2.302e+06 spatial_ref int32 3857 * time (time) datetime64[ns] 2021-06-01T09:38:46.486871 Data variables: red (time, y, x) uint16 dask.array<chunksize=(1, 1000, 1314), meta=np.ndarray> green (time, y, x) uint16 dask.array<chunksize=(1, 1000, 1314), meta=np.ndarray> blue (time, y, x) uint16 dask.array<chunksize=(1, 1000, 1314), meta=np.ndarray>
<xarray.DataArray 'ro_rgba-90fc57049f07f68941b70cad943de8b0-264118d43e4e5f217ad076644d1ab06d' ( time: 1, y: 15386, x: 1314, band: 4)> dask.array<ro_rgba-90fc57049f07f68941b70cad943de8b0, shape=(1, 15386, 1314, 4), dtype=uint8, chunksize=(1, 1000, 1314, 4), chunktype=numpy.ndarray> Coordinates: * y (y) float64 1.132e+07 1.132e+07 ... -4.064e+06 -4.064e+06 * x (x) float64 9.885e+05 9.895e+05 ... 2.300e+06 2.302e+06 spatial_ref int32 3857 * time (time) datetime64[ns] 2021-06-01T09:38:46.486871 * band (band) <U1 'r' 'g' 'b' 'a'
# Start local Dask client to run across CPUs
# one can also connect to remote cluster
dask_client = Client()
Should take about a minute, longer if running from "home".
%%time
thumb = thumb.compute()
CPU times: user 4.47 s, sys: 1 s, total: 5.47 s Wall time: 48.1 s
We use folium
here, but same works with ipyleaflet
too.
note: we are using webp
compression here to reduce size of rendered notebook. If you don't see an image on a map below it could be that your browser doesn't support webp.
import folium
_map = folium.Map()
# fmt: off
thumb.odc.add_to(
_map,
fmt="webp", # Default is png, jpeg is also supported
quality=60, # webp/jpeg quality configuration
max_size=100_000, # disable further shrinking, keep 1km resolution
opacity=0.90, # folium layer options are passed-through
)
# fmt: on
_map.fit_bounds(thumb.odc.map_bounds())
display(_map)
We can also use geopandas
to plot stac item metadata on top of imagery. Handy for narrowing down test scenes when looking for experiment sites.
import geopandas
import pystac
items = pystac.ItemCollection(sorted(items, key=lambda x: x.datetime), clone_items=False)
df = geopandas.GeoDataFrame.from_features(items.to_dict(), crs="epsg:4326")
# https://github.com/geopandas/geopandas/issues/1208
df["id"] = [x.id for x in items]
df[["geometry", "id", "datetime", "landsat:wrs_path", "landsat:wrs_row", "proj:epsg"]].explore(
"landsat:wrs_path",
style_kwds=dict(fillOpacity=0, width=0.3, opacity=0.3),
popup=True,
categorical=True,
m=_map,
)
display(_map)
Let's have a closer look at one scene in Africa.
items_179_073 = [
item
for item in items
if (item.properties["landsat:wrs_path"], item.properties["landsat:wrs_row"]) == ("179", "073")
]
items_179_073
[<Item id=LC08_L2SP_179073_20210611_02_T1>]
If chunks=
parameter is not supplied data loading happens locally without using Dask. Concurrency is still suported though, we request to use 4 CPU cores with pool=4
. Progress can be observed by passing progress=tqdm
option.
from IPython.display import Image
from tqdm.auto import tqdm
yy = stac_load(
items_179_073,
["red", "green", "blue", "nir08"],
dtype="uint16",
nodata=0,
patch_url=planetary_computer.sign,
pool=4,
progress=tqdm,
)
# display raster GeoBox
# - Loaded in Native projection/resolution
yy.odc.geobox
cubic
resamplingjpeg
and displayjpeg
does not support transparency so we ask for missing pixels to be whitepng
that does support transparency and is lossless, but is more heavy in terms of byteswebp
when size matters but transparency is also needed (some browsers might not support it though)Image(
data=yy.odc.to_rgba(vmin=6_000, vmax=30_000)
.isel(time=0)
.odc.reproject(yy.odc.geobox.zoom_to(1600), resampling="cubic")
.odc.compress("jpeg", 90, transparent=(255, 255, 255))
)
Individual bands can be visualized in false colour with .odc.colorize
method (supports matplotlib colormaps).
nir = yy.isel(time=0).nir08
Image(
data=nir.odc.reproject(yy.odc.geobox.zoom_to(1600), resampling="cubic")
.odc.colorize("magma")
.odc.compress("webp", 80)
)
Cloud Optimized GeoTIFF (COG) is a popular and well supported format. We can easily save xarrays to COG format.
fname_rgb = f"{items_179_073[0].id}-rgba-cog.tif"
fname_nir = f"{items_179_073[0].id}-nir-cog.tif"
print(f"Saving {fname_rgb}")
yy.isel(time=0).odc.to_rgba(vmin=6_000, vmax=30_000).odc.write_cog(
fname_rgb,
overwrite=True,
overview_resampling="average",
# there are other options, but we'll leave them as defaults
)
print(f"Saving {fname_nir}")
nir.odc.write_cog(
fname_nir,
overwrite=True,
overview_resampling="average",
# change compression to ZSTD (default is DEFLATE)
compress="zstd",
zstd_level=9,
)
print("Done")
Saving LC08_L2SP_179073_20210611_02_T1-rgba-cog.tif Saving LC08_L2SP_179073_20210611_02_T1-nir-cog.tif Done
!ls -lh {fname_rgb} {fname_nir}
!gdalinfo {fname_rgb}
!gdalinfo {fname_nir}
-rw-r--r-- 1 jovyan users 83M Jun 7 00:02 LC08_L2SP_179073_20210611_02_T1-nir-cog.tif -rw-r--r-- 1 jovyan users 82M Jun 7 00:01 LC08_L2SP_179073_20210611_02_T1-rgba-cog.tif Driver: GTiff/GeoTIFF Files: LC08_L2SP_179073_20210611_02_T1-rgba-cog.tif Size is 7768, 7872 Coordinate System is: PROJCRS["WGS 84 / UTM zone 33S", BASEGEOGCRS["WGS 84", DATUM["World Geodetic System 1984", ELLIPSOID["WGS 84",6378137,298.257223563, LENGTHUNIT["metre",1]]], PRIMEM["Greenwich",0, ANGLEUNIT["degree",0.0174532925199433]], ID["EPSG",4326]], CONVERSION["UTM zone 33S", METHOD["Transverse Mercator", ID["EPSG",9807]], PARAMETER["Latitude of natural origin",0, ANGLEUNIT["degree",0.0174532925199433], ID["EPSG",8801]], PARAMETER["Longitude of natural origin",15, ANGLEUNIT["degree",0.0174532925199433], ID["EPSG",8802]], PARAMETER["Scale factor at natural origin",0.9996, SCALEUNIT["unity",1], ID["EPSG",8805]], PARAMETER["False easting",500000, LENGTHUNIT["metre",1], ID["EPSG",8806]], PARAMETER["False northing",10000000, LENGTHUNIT["metre",1], ID["EPSG",8807]]], CS[Cartesian,2], AXIS["(E)",east, ORDER[1], LENGTHUNIT["metre",1]], AXIS["(N)",north, ORDER[2], LENGTHUNIT["metre",1]], USAGE[ SCOPE["Engineering survey, topographic mapping."], AREA["Between 12°E and 18°E, southern hemisphere between 80°S and equator, onshore and offshore. Angola. Congo. Democratic Republic of the Congo (Zaire). Gabon. Namibia. South Africa."], BBOX[-80,12,0,18]], ID["EPSG",32733]] Data axis to CRS axis mapping: 1,2 Origin = (519720.000000000000000,8040090.000000000000000) Pixel Size = (30.000000000000000,-30.000000000000000) Metadata: AREA_OR_POINT=Area Image Structure Metadata: COMPRESSION=DEFLATE INTERLEAVE=PIXEL LAYOUT=COG PREDICTOR=2 Corner Coordinates: Upper Left ( 519720.000, 8040090.000) ( 15d11' 9.58"E, 17d43'34.53"S) Lower Left ( 519720.000, 7803930.000) ( 15d11'18.07"E, 19d51'38.28"S) Upper Right ( 752760.000, 8040090.000) ( 17d22'59.58"E, 17d42'42.79"S) Lower Right ( 752760.000, 7803930.000) ( 17d24'48.26"E, 19d50'39.82"S) Center ( 636240.000, 7922010.000) ( 16d17'33.85"E, 18d47'20.85"S) Band 1 Block=512x512 Type=Byte, ColorInterp=Red Overviews: 3884x3936, 1942x1968, 971x984, 486x492, 243x246 Mask Flags: PER_DATASET ALPHA Overviews of mask band: 3884x3936, 1942x1968, 971x984, 486x492, 243x246 Band 2 Block=512x512 Type=Byte, ColorInterp=Green Overviews: 3884x3936, 1942x1968, 971x984, 486x492, 243x246 Mask Flags: PER_DATASET ALPHA Overviews of mask band: 3884x3936, 1942x1968, 971x984, 486x492, 243x246 Band 3 Block=512x512 Type=Byte, ColorInterp=Blue Overviews: 3884x3936, 1942x1968, 971x984, 486x492, 243x246 Mask Flags: PER_DATASET ALPHA Overviews of mask band: 3884x3936, 1942x1968, 971x984, 486x492, 243x246 Band 4 Block=512x512 Type=Byte, ColorInterp=Alpha Overviews: 3884x3936, 1942x1968, 971x984, 486x492, 243x246 Driver: GTiff/GeoTIFF Files: LC08_L2SP_179073_20210611_02_T1-nir-cog.tif Size is 7768, 7872 Coordinate System is: PROJCRS["WGS 84 / UTM zone 33S", BASEGEOGCRS["WGS 84", DATUM["World Geodetic System 1984", ELLIPSOID["WGS 84",6378137,298.257223563, LENGTHUNIT["metre",1]]], PRIMEM["Greenwich",0, ANGLEUNIT["degree",0.0174532925199433]], ID["EPSG",4326]], CONVERSION["UTM zone 33S", METHOD["Transverse Mercator", ID["EPSG",9807]], PARAMETER["Latitude of natural origin",0, ANGLEUNIT["degree",0.0174532925199433], ID["EPSG",8801]], PARAMETER["Longitude of natural origin",15, ANGLEUNIT["degree",0.0174532925199433], ID["EPSG",8802]], PARAMETER["Scale factor at natural origin",0.9996, SCALEUNIT["unity",1], ID["EPSG",8805]], PARAMETER["False easting",500000, LENGTHUNIT["metre",1], ID["EPSG",8806]], PARAMETER["False northing",10000000, LENGTHUNIT["metre",1], ID["EPSG",8807]]], CS[Cartesian,2], AXIS["(E)",east, ORDER[1], LENGTHUNIT["metre",1]], AXIS["(N)",north, ORDER[2], LENGTHUNIT["metre",1]], USAGE[ SCOPE["Engineering survey, topographic mapping."], AREA["Between 12°E and 18°E, southern hemisphere between 80°S and equator, onshore and offshore. Angola. Congo. Democratic Republic of the Congo (Zaire). Gabon. Namibia. South Africa."], BBOX[-80,12,0,18]], ID["EPSG",32733]] Data axis to CRS axis mapping: 1,2 Origin = (519720.000000000000000,8040090.000000000000000) Pixel Size = (30.000000000000000,-30.000000000000000) Metadata: AREA_OR_POINT=Area Image Structure Metadata: COMPRESSION=ZSTD INTERLEAVE=BAND LAYOUT=COG PREDICTOR=2 Corner Coordinates: Upper Left ( 519720.000, 8040090.000) ( 15d11' 9.58"E, 17d43'34.53"S) Lower Left ( 519720.000, 7803930.000) ( 15d11'18.07"E, 19d51'38.28"S) Upper Right ( 752760.000, 8040090.000) ( 17d22'59.58"E, 17d42'42.79"S) Lower Right ( 752760.000, 7803930.000) ( 17d24'48.26"E, 19d50'39.82"S) Center ( 636240.000, 7922010.000) ( 16d17'33.85"E, 18d47'20.85"S) Band 1 Block=512x512 Type=UInt16, ColorInterp=Gray NoData Value=0 Overviews: 3884x3936, 1942x1968, 971x984, 486x492, 243x246
import odc.geo
import odc.stac
print(f"odc.stac=={odc.stac.__version__}")
print(f"odc.geo=={odc.geo.__version__}")
odc.stac==0.3.0 odc.geo==0.2.0