6.2 Raster data processing in Python¶
Sentinel-2 is a satellite mission under the European Space Agency’s (ESA) Copernicus program, designed for Earth observation. It consists of two satellites, Sentinel-2A and Sentinel-2B, which provide high-resolution optical imagery for land monitoring. Sentinel-2 data includes:
- Multispectral Imaging: 13 spectral bands ranging from visible to shortwave infrared.
- High Resolution: Spatial resolution of 10m, 20m, and 60m, depending on the band.
- Frequent Revisit Time: A 5-day global coverage cycle with both satellites.
Applications: Used for agriculture, forestry, land cover mapping, disaster monitoring, and environmental studies.
The data is freely available and widely used for remote sensing and geographic analysis.
Download Sentinel-2 data¶
Either use Copernicus Browser (https://browser.dataspace.copernicus.eu/) or eodag Python package (https://eodag.readthedocs.io/).
Define Area Of Interest¶
Let's use NUTS regions from lesson 2: NUTS_RG_20M_2021_3035.shp
import fiona
nuts_name = "Berlin"
with fiona.open('/mnt/repository/155ISDP/01/NUTS_RG_20M_2021_3035.shp') as nuts:
nuts_crs = nuts.crs
for feature in nuts:
if feature["properties"]["NUTS_NAME"] == nuts_name:
aoi = feature["geometry"]
break
nuts_crs, aoi.coordinates
(CRS.from_epsg(3035), [[(4565854.8829, 3276635.620100001), (4570053.7798999995, 3269037.6789999995), (4574087.5297, 3265054.381100001), (4572823.348999999, 3258448.2174999993), (4553875.3859, 3257338.2413), (4546340.8794, 3259545.0768999998), (4536436.5251, 3258557.022), (4532432.7773, 3262181.204), (4534088.271299999, 3270461.741699999), (4535330.4365, 3281303.397399999), (4550928.4684, 3287499.4212999996), (4565854.8829, 3276635.620100001)]])
Vizualize AOI¶
To vizualize AOI polygon GeoPandas package may be used.
aoi_map = aoi_g.set_crs(epsg=3035)
aoi_map.explore()
Query Copernicus Data Space based on AOI¶
Connection to Copernicus Data Space will be perfomed by eodag package.
In our case, we focus on the L2A product (images that have been corrected for atmospheric effects).
from eodag import EODataAccessGateway
dag = EODataAccessGateway()
product = "S2_MSI_L2A"
dag.available_providers(product)
onda skipped: could not be loaded from user configuration astraea_eod skipped: could not be loaded from user configuration wekeo skipped: could not be loaded from user configuration
['cop_dataspace', 'creodias', 'planetary_computer', 'sara']
In our case, we will query the Copernicus Data Space service.
dag.set_preferred_provider("cop_dataspace")
Polygon defining AOI must be provided in WGS-84 (EPSG:4326) and in Well Known Text (WKT) representation.
Transformation from ETRS-89 (EPSG:3035) to WGS-84 will be done by to_crs method.
extent = tuple(aoi_map.to_crs(4326).total_bounds)
extent
(np.float64(13.10941283735977), np.float64(52.3762469971005), np.float64(13.723263691263968), np.float64(52.64819449710592))
Let's search for Sentinel-2 data Level 2A with low cloud coverage (less than 5%) and available for March 2025.
results = dag.search(
productType="S2_MSI_L2A",
start="2025-03-01",
end="2025-03-31",
geom=tuple(extent),
count=True
)
print(results.number_matched)
45
results_cc = results.filter_property(operator="le", cloudCover=5)
results_cc
SearchResult (8) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
0 EOProduct(id=S2C_MSIL2A_20250303T100951_N0511_R022_T33UUU_20250303T155811, provider=cop_dataspace)
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1 EOProduct(id=S2C_MSIL2A_20250303T100951_N0511_R022_T32UQD_20250303T155811, provider=cop_dataspace)
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2 EOProduct(id=S2C_MSIL2A_20250306T101931_N0511_R065_T33UVU_20250306T152957, provider=cop_dataspace)
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3 EOProduct(id=S2C_MSIL2A_20250306T101931_N0511_R065_T32UQD_20250306T152957, provider=cop_dataspace)
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
4 EOProduct(id=S2C_MSIL2A_20250306T101931_N0511_R065_T33UUU_20250306T152957, provider=cop_dataspace)
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5 EOProduct(id=S2B_MSIL2A_20250308T100749_N0511_R022_T33UVU_20250308T122841, provider=cop_dataspace)
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
6 EOProduct(id=S2B_MSIL2A_20250308T100749_N0511_R022_T32UQD_20250308T122841, provider=cop_dataspace)
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
7 EOProduct(id=S2B_MSIL2A_20250308T100749_N0511_R022_T33UUU_20250308T122841, provider=cop_dataspace)
|
For the next steps, we will need to set the access data.
- Create an account: here
- Create a configuration file
import os
username = "?"
password = "?"
dag.update_providers_config(f"""
cop_dataspace:
download:
extract: False
outputs_prefix:
output_dir: {os.environ['HOME']}/Downloads/sentinel2
delete_archive: False
auth:
credentials:
username: {username}
password: {password}
""")
View the scenes before downloading (source):
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
fig = plt.figure(figsize=(10, 8))
for i, product in enumerate(results_cc, start=1):
# This line takes care of downloading the quicklook
quicklook_path = product.get_quicklook()
# Plot the quicklook
img = mpimg.imread(quicklook_path)
ax = fig.add_subplot(3, 4, i)
ax.set_title(i - 1)
plt.imshow(img)
plt.tight_layout()
Download selected Sentinel product¶
Let's select and download the product with full coverage.
data_path = dag.download(results_cc[3])
0.00B [00:00, ?B/s]
from pathlib import Path
print(data_path, Path(data_path).exists())
/home/martin/Downloads/sentinel2/S2C_MSIL2A_20250306T101931_N0511_R065_T32UQD_20250306T152957.zip True
NDVI computation¶
For normalized difference vegetation index computation red and near-infrared bands are needed. It means that we need 4th (red) and 8th (nir) Sentinel-2 bands (see spectral bands description).
Prepare input data¶
Raster data may be open by rasterio package.
import rasterio as rio
with rio.open(data_path) as ds:
sd = ds.subdatasets
meta = ds.meta
print(meta)
print(len(sd), sd[0])
{'driver': 'SENTINEL2', 'dtype': 'float_', 'nodata': None, 'width': 512, 'height': 512, 'count': 0, 'crs': None, 'transform': Affine(1.0, 0.0, 0.0, 0.0, 1.0, 0.0)} 4 SENTINEL2_L2A:/vsizip//home/martin/Downloads/sentinel2/S2C_MSIL2A_20250306T101931_N0511_R065_T32UQD_20250306T152957.zip/S2C_MSIL2A_20250306T101931_N0511_R065_T32UQD_20250306T152957.SAFE/MTD_MSIL2A.xml:10m:EPSG_32632
/home/martin/git/k155cvut/isdp/venv/lib/python3.13/site-packages/rasterio/__init__.py:356: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned. dataset = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
with rio.open(sd[0]) as b10m:
print("CRS:", b10m.crs)
print("Bounds:", b10m.bounds)
print("Bands:", b10m.count)
CRS: EPSG:32632 Bounds: BoundingBox(left=699960.0, bottom=5790240.0, right=809760.0, top=5900040.0) Bands: 6
with rio.open(sd[0]) as b10m:
meta = b10m.meta
for b in b10m.indexes:
print(b, b10m.tags(b)["BANDNAME"])
1 B4 2 B3 3 B2 4 B8 5 AOT 6 WVP
Raster data can be vizualized by raster.plot.
Task: Test the greyscale (plt.cm.Greys_r
) and design a procedure for histogram matching.
Clip input data by AOI¶
For this purpose AOI must be transformed into UTM zone 32N.
with rio.open(sd[0]) as b10m:
print(b10m.crs)
EPSG:32632
from shapely.geometry import box
aoi_utm = aoi_map.to_crs(epsg=b10m.crs.to_epsg())[0]
print(aoi_utm)
POLYGON ((812595.0019881215 5831566.489820243, 816904.1235225211 5824024.387674039, 820997.3783632959 5820096.361232187, 819824.4351269903 5813468.332224231, 800875.9326156294 5812088.701389414, 793305.1209924341 5814189.652446337, 783408.0870345002 5813060.849563709, 779351.3652835945 5816629.581559535, 780892.0871073608 5824936.499978881, 781983.1294661271 5835799.75621576, 797504.3142354856 5842221.699483118, 812595.0019881215 5831566.489820243))
Raster data can be easily clipped by rasterio.mask module.
import rasterio.mask
with rio.open(sd[0]) as b10m:
out_image, out_transform = rasterio.mask.mask(b10m, [aoi_utm], crop=True)
out_image.shape
(6, 3015, 3041)
Let's define function for raster data clipping. Additionally:
- raster values are casted to float,
- zero value is replaced by NaN (Not a Number)
import numpy as np
def clip_raster(path, aoi):
with rio.open(data_path) as ds:
with rasterio.open(ds.subdatasets[0]) as b10m:
image, transform = rasterio.mask.mask(b10m, [aoi], crop=True)
data = image.astype(float)
data[data==0] = np.nan
return data, transform
data_aoi, clip_transform = clip_raster(data_path, aoi_utm)
data_aoi.shape, clip_transform
/home/martin/git/k155cvut/isdp/venv/lib/python3.13/site-packages/rasterio/__init__.py:356: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned. dataset = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
((6, 3015, 3041), Affine(10.0, 0.0, 779350.0, 0.0, -10.0, 5842230.0))
Compute NDVI¶
Computation can be performed by numpy package.
import numpy as np
b8_aoi = data_aoi[3]
b4_aoi = data_aoi[0]
ndvi = (b8_aoi - b4_aoi) / (b8_aoi + b4_aoi)
np.nanmin(ndvi), np.nanmax(ndvi)
(np.float64(-0.737578683615474), np.float64(0.8879857079053148))
Vizualize NDVI¶
Reclassify NDVI¶
ndvi[ndvi < 0.1] = 1
ndvi[ndvi < 0.4] = 2
ndvi[ndvi < 1.0] = 3
np.unique(ndvi, return_counts=True)
(array([ 1., 2., 3., nan]), array([1622731, 5425382, 254463, 1866039]))
Store reclassified NDVI as GeoTIFF file¶
with rio.open(sd[0]) as b10m:
meta = b10m.meta
meta.update({
"driver": 'GTiff',
"dtype": "int8",
"height": ndvi.shape[0],
"width": ndvi.shape[1],
"transform": clip_transform,
"nodata": 0
})
np.nan_to_num(ndvi, copy=False, nan=0)
with rasterio.open('/tmp/ndvi_cat.tif', 'w', **meta) as dst:
dst.write_band(1, ndvi.astype(rasterio.int8))