6.1 Vector data processing in Python¶
Converting vector data to other data formats is often done through the GDAL (https://gdal.org/) library (or its part referred to as OGR). GDAL is the huge library written in C++ which in simple use cases of working with vector data might be a bit "overkilling". For this reason, the Fiona (https://fiona.readthedocs.io/) library combined with Shapely (https://shapely.readthedocs.io/) has become popular in Python.
Fiona is the Python application interface to the OGR library. The library maps the loaded vector data into a GeoJSON data structure and writes the same structures back to the output files. Fiona is designed to be highly productive and to make it easy to write code which is easy to read.
Shapely can be use together with Fiona, however, it has a different purpose than data conversion. Shapely includes functions for modifying geometries, for example generalization, a buffer zone or comparing two geometries.
Data preparations¶
Originally, these layers come from OpenStreetMaps. They were downloaded using QuickOSM plugin in QGIS. Another way how to download OSM data is for example Overpass Turbo API (http://overpass-turbo.eu/).
QuickOSM is the QGIS extension which can be installed via Plugins -> Manage and Install Plugins
.
Two layers are prepared for you. First, we will work with protected areas around Lovosice (a city in northern Bohemia). In QuickOSM we select boundary key and its protected_area value. We would like to select all protected areas in the distance of 30 km from that city. Similarly, we select all highways in 30 km from Lovosice city. The selected tags (see OSM Map Features) can be seen in figures below.
We downloaded several geometry types - points, lines, in the case of protected areas also polygons. Further, we will work only with lines (highways) and polygons (protected areas).
Both layers were transformed to EPSG:5514 (Czech national Krovak coordinate system) before exporting from QGIS. They are available at https://geo.fsv.cvut.cz/courses/155isdp/data/06/ or on GIS.lab in /mnt/repository/155ISDP/06/ directory.
Simple vector data operations using Fiona library¶
1. Opening files by Fiona¶
import fiona
protected_areas = fiona.open('/mnt/repository/155ISDP/06/protected_areas.geojson', 'r')
2. Layer properties¶
# driver
protected_areas.driver
'GeoJSON'
# coordinate system
protected_areas.crs
CRS.from_epsg(5514)
# path to file
protected_areas.path
'/mnt/repository/155ISDP/06/protected_areas.geojson'
# layer name
protected_areas.name
'protected_areas'
# bound coordinates
protected_areas.bounds
(-787545.5171200077, -1008755.295851619, -724271.1223364118, -963765.0720619478)
# metadata
protected_areas.meta
{'driver': 'GeoJSON', 'schema': {'properties': {'full_id': 'str', 'osm_id': 'str', 'osm_type': 'str', 'boundary': 'str', 'wikimedia_commons': 'str', 'sorting_name': 'str', 'short_name': 'str', 'website': 'str', 'start_date': 'str', 'protection_title:en': 'str', 'leisure': 'str', 'eea:cdda:sitecode': 'str', 'eea:cdda:parentiso': 'str', 'eea:cdda:objectid': 'str', 'area:ha': 'str', 'wikipedia': 'str', 'wikidata': 'str', 'type': 'str', 'protection_title': 'str', 'protect_class': 'str', 'name:de': 'str', 'name': 'str', 'iucn_level': 'str'}, 'geometry': 'MultiPolygon'}, 'crs': CRS.from_epsg(5514), 'crs_wkt': 'PROJCS["S-JTSK / Krovak East North",GEOGCS["S-JTSK",DATUM["System_of_the_Unified_Trigonometrical_Cadastral_Network",SPHEROID["Bessel 1841",6377397.155,299.1528128,AUTHORITY["EPSG","7004"]],AUTHORITY["EPSG","6156"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4156"]],PROJECTION["Krovak"],PARAMETER["latitude_of_center",49.5],PARAMETER["longitude_of_center",24.8333333333333],PARAMETER["azimuth",30.2881397527778],PARAMETER["pseudo_standard_parallel_1",78.5],PARAMETER["scale_factor",0.9999],PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","5514"]]'}
# number of geometric entities
len(protected_areas)
43
# print EPSG CRS definition
from fiona.crs import to_string
print(to_string(protected_areas.crs))
EPSG:5514
# print geometry type for first 10 entities
i = 1
for feature in protected_areas:
print("entity {} has geometry type of {}".format(i, feature['geometry']['type']))
i += 1
if i > 10:
break
entity 1 has geometry type of MultiPolygon entity 2 has geometry type of MultiPolygon entity 3 has geometry type of MultiPolygon entity 4 has geometry type of MultiPolygon entity 5 has geometry type of MultiPolygon entity 6 has geometry type of MultiPolygon entity 7 has geometry type of MultiPolygon entity 8 has geometry type of MultiPolygon entity 9 has geometry type of MultiPolygon entity 10 has geometry type of MultiPolygon
# print properties of particular item
print(protected_areas[2]['properties']['name'])
NPR Milešovka
3. Closing the file¶
# we finished work so we need to close the file
protected_areas.close()
# or better way for opening a file, with automatic close using WITH formula
with fiona.open("/mnt/repository/155ISDP/06/protected_areas.geojson", "r") as protected_areas:
print(protected_areas)
<open Collection '/mnt/repository/155ISDP/06/protected_areas.geojson:protected_areas', mode 'r' at 0x7faa6a987230>
Simple vector data analysis using Shapely library¶
Import shapely
and store the most protected area with protect class number "5" (see OSM key for protect class).
Protected Landscape/Seascape (IUCN Category V):
from shapely.geometry import shape
protected_areas = fiona.open("/mnt/repository/155ISDP/06/protected_areas.geojson", "r")
Iterating through features and appending to the list:
protected_class_5 = []
for feature in protected_areas:
if feature['properties']['protect_class'] == "5":
protected_class_5.append(feature)
len(protected_class_5)
1
protected_class_5[0]['properties']['name']
'CHKO České středohoří'
Tip: same selection using list comprehension:
protected_class_5 = [feature for feature in protected_areas if feature['properties']['protect_class'] == "5"]
protected_class_5[0]['properties']['name']
'CHKO České středohoří'
To work with geometries, we need to convert the JSON data into Shapely structures:
pa_geom.bounds
(-787545.5171200077, -1005646.1435298981, -724271.1223364118, -963765.0720619478)
Geometry metadata:
pa_geom.geom_type
'MultiPolygon'
pa_geom.area # 1054 km²
1054023678.4064878
Creating buffer zone of 1000 meters around protected area:
buffer.area
1261416863.6013732
Converting Buffer shapely geometry object back to JSON object:
from shapely.geometry import mapping
new_feature = {
"type": "Feature",
"properties": {
"name": "feature with buffer zone of 1000 meters",
},
"geometry": mapping(buffer)
}
Again, we can have a look at the layer name, path, CRS ...
new_feature["properties"]["name"]
'feature with buffer zone of 1000 meters'
More complex analysis using Fiona and Shapely¶
TASK: Find the areas directly affected by the construction of the D8 motorway from Prague to Dresden. Let's assume that the affected area is a strip 200m from the highway on both sides. How big is the area?
Prepare the JSON file containing only the highway D8
import fiona
motorways = fiona.open("/mnt/repository/155ISDP/06/motorways.geojson", "r")
motorways.crs
CRS.from_epsg(5514)
d8 = list(filter(lambda hw: hw["properties"]["ref"] == "D8", motorways))
Create buffer zone around the D8 highway:
from shapely.geometry import shape
Again, we use list comprehension:
buffered_motorways = [shape(feature["geometry"]).buffer(200) for feature in d8]
buffered_motorways[:10]
[<POLYGON ((-750546.028 -1010122.768, -750547.585 -1010119.279, -750576.388 -...>, <POLYGON ((-755571.192 -1001377.4, -755570.995 -1001377.469, -755333.433 -10...>, <POLYGON ((-755533.22 -1001802.192, -755533.547 -1001802.077, -755700.104 -1...>, <POLYGON ((-763268.444 -993251.719, -763269.749 -993250.972, -763341.409 -99...>, <POLYGON ((-763035.031 -993384.11, -763051.503 -993373.437, -763066.848 -993...>, <POLYGON ((-762804.691 -993070.331, -762788.188 -993080.956, -762772.806 -99...>, <POLYGON ((-762758.21 -993096.871, -762741.664 -993107.429, -762726.232 -993...>, <POLYGON ((-762947.962 -993434.03, -762993.378 -993408.274, -763009.956 -993...>, <POLYGON ((-762560.288 -993654.574, -762576.849 -993644.04, -762592.298 -993...>, <POLYGON ((-762216.991 -993404.484, -762214.918 -993405.682, -762122.294 -99...>]
Create intersection of buffered highway and given protected area:
intersections = []
for hw in buffered_motorways:
if hw.intersects(pa_geom):
out_geom = hw.intersection(pa_geom)
intersections.append(out_geom)
intersections[:10]
[<POLYGON ((-765855.438 -980088.611, -765887.827 -980109.19, -765916.563 -980...>, <POLYGON ((-765896.505 -980100.851, -765928.46 -980124.205, -765977.823 -980...>, <POLYGON ((-766143.227 -980324.902, -766172.769 -980356.439, -766183.471 -98...>, <POLYGON ((-764316.861 -977656.125, -764302.265 -977669.247, -764289.025 -97...>, <POLYGON ((-764459.491 -978856.529, -764476.566 -978924.46, -764478.844 -978...>, <POLYGON ((-764485.751 -978531.602, -764471.372 -978544.962, -764458.373 -97...>, <POLYGON ((-763906.029 -984938.096, -763934.285 -984905.482, -763971.457 -98...>, <POLYGON ((-766555.376 -983240.592, -766584.603 -983185.443, -766589.021 -98...>, <POLYGON ((-763767.214 -985212.857, -763781.132 -985171.227, -763800.152 -98...>, <POLYGON ((-763736.982 -985377.607, -763739.292 -985353.746, -763742.298 -98...>]
Merge intersection geometries into one multifeature:
Count the area which was affected.
affected_area = mergedPolys.area # m2
affected_area
6981184.070089217
Count how many percents of the whole protected area (class 5) was affected by the D8 highway construction?
protected_area = pa_geom.area # m2
protected_area
1054023678.4064878
(affected_area/protected_area) * 100
0.6623365502228215