The process in a nutshell

This scraping pipeline was developed to collect all the data on the official Italian postal codes (CAP, Codici di Avviamento Postale) from the Poste Italiane website.

Scraped CAP codes of Lazio and Abruzzo. The CAP zones for Rome and Pescara were reconstructed using Voronoi polygons.

The CAP boundaries of the so-called “multi-CAP” cities (i.e., cities divided into several postal areas) were reconstructed in four steps. First, the information extracted from Poste Italiane (i.e., the list of addresses and house numbers belonging to each CAP code) was cross-referenced with the ANNCSU database. Next, a Voronoi polygon was built around each house address and labelled with the CAP assigned to that address. The polygons sharing a CAP were then merged with the dissolve() operator, and the result was cleaned up by removing holes, sliver polygons, and overlaps. Finally, the resulting CAP zones were clipped to the municipality-level boundaries from ISTAT.
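The Voronoi, dissolve, and clip steps can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: it assumes geopandas and shapely are installed, and the function name `build_cap_zones`, the `cap` column, the sample coordinates, and the square boundary are all hypothetical stand-ins for the geocoded ANNCSU addresses and the ISTAT polygon. The cleanup of holes, slivers, and overlaps is left out.

```python
import geopandas as gpd
from shapely.geometry import MultiPoint, Point, box
from shapely.ops import voronoi_diagram

# Hypothetical geocoded addresses, each tagged with its CAP code.
addresses = gpd.GeoDataFrame(
    {"cap": ["00100", "00100", "00200", "00200"]},
    geometry=[Point(1, 1), Point(1, 3), Point(3, 1.5), Point(3, 3.2)],
)
# Hypothetical municipality boundary (stand-in for the ISTAT polygon).
municipality = box(0, 0, 4, 4)


def build_cap_zones(addresses, municipality):
    # 1. Build one Voronoi cell per address point. The cells may extend
    #    beyond the envelope, which is why step 4 clips them.
    sites = MultiPoint(list(addresses.geometry))
    diagram = voronoi_diagram(sites, envelope=municipality)
    cells = gpd.GeoDataFrame(geometry=list(diagram.geoms), crs=addresses.crs)

    # 2. Cells are not returned in input order, so recover each cell's CAP
    #    from the address point it contains.
    cells = gpd.sjoin(cells, addresses, predicate="contains")

    # 3. Merge all cells that share a CAP into a single zone.
    zones = cells.dissolve(by="cap")

    # 4. Clip the zones to the municipality boundary.
    return gpd.clip(zones, municipality)


zones = build_cap_zones(addresses, municipality)
```

The result is a GeoDataFrame indexed by CAP code with one (multi)polygon per zone. On real data, the coordinates should be in a projected CRS before building the diagram, so that the nearest-address partition is computed on metric distances.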

Reconstructed CAP zones of Bologna

The pipeline is built on the following libraries: