What data is needed to run a routing analysis?

Running a routing analysis sounds straightforward until you realize how much depends on the data beneath it. Whether you are planning maintenance routes for a utility network, optimizing field crew dispatch, or modeling how a failure might propagate through infrastructure, the quality and completeness of your input data determine how useful your results will be. Spatial analysis sits at the heart of GIS technology, and routing is one of its most operationally powerful applications.

This article walks through the data requirements to consider before running a routing analysis, from the network geometry itself to attribute values, supporting spatial layers, and the formats that hold everything together.

What is routing analysis in a geospatial context?

Routing analysis in a geospatial context is the process of calculating optimal paths through a network based on spatial data and defined rules. It uses connected geographic features, such as roads, pipelines, or cable networks, to determine the best route between two or more points according to criteria such as distance, travel time, capacity, or cost.

In infrastructure and utility environments, routing analysis goes well beyond navigation. It helps organizations answer questions such as: Which field crew should respond to a fault? What is the most efficient inspection corridor for a pipeline network? How does a service interruption cascade through a connected grid? The analysis depends on a topologically correct network model, meaning every connection point, segment, and junction must be accurately represented in the spatial data.

Routing analysis also incorporates rules and constraints. A vehicle route might avoid certain road types, while a utility routing model might respect pressure zones or voltage boundaries. These constraints come from attribute data attached to the network features, which is why geometry alone is never enough.

What network data is required for routing analysis?

Routing analysis requires a connected network dataset made up of edges and nodes. Edges represent the linear features you route along, such as road segments, pipes, or cables. Nodes represent the connection points between edges, including intersections, junctions, valves, or substations. Every edge must begin and end at a node, and the topology must be clean and consistent.
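As a minimal sketch of the edge-and-node structure described above, the following Python builds an adjacency list from edge records. The field names (`id`, `from_node`, `to_node`, `length_m`) and the sample values are assumptions made for illustration, not a schema from any particular GIS platform.

```python
from collections import defaultdict

# Hypothetical edge records: each edge begins and ends at a named node.
EDGES = [
    {"id": "e1", "from_node": "A", "to_node": "B", "length_m": 120.0},
    {"id": "e2", "from_node": "B", "to_node": "C", "length_m": 80.0},
    {"id": "e3", "from_node": "B", "to_node": "D", "length_m": 200.0},
]


def build_adjacency(edges):
    """Return {node: [(neighbor, edge_id, length_m), ...]} for an undirected network."""
    adjacency = defaultdict(list)
    for edge in edges:
        a, b = edge["from_node"], edge["to_node"]
        adjacency[a].append((b, edge["id"], edge["length_m"]))
        adjacency[b].append((a, edge["id"], edge["length_m"]))
    return dict(adjacency)


ADJACENCY = build_adjacency(EDGES)
```

Here node `B` ends up with three neighbors, which is what a junction between three pipes or road segments should look like once the data is loaded.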

Network topology

Topology defines how features connect to each other. For routing to work correctly, edges must share endpoints with adjacent edges rather than simply crossing visually. A gap of even a few centimeters in the underlying geometry can break a route calculation entirely. Building a valid network topology is often the most time-consuming part of preparing data for routing analysis, particularly when working with legacy infrastructure datasets that were digitized at varying levels of precision.
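One way to look for such gaps is to flag endpoints that lie very close together without sharing a coordinate. The sketch below assumes a projected coordinate system in meters and a hypothetical 5 cm tolerance; the endpoint labels and the tolerance are illustrative choices, not recommended values.

```python
import math
from itertools import combinations


def find_gaps(endpoints, tolerance=0.05):
    """Return pairs of endpoints that are close together but not identical.

    `endpoints` maps a label to (x, y) in a projected CRS with meter units;
    `tolerance` is the assumed snapping distance in meters.
    """
    gaps = []
    for (name_a, pt_a), (name_b, pt_b) in combinations(sorted(endpoints.items()), 2):
        distance = math.dist(pt_a, pt_b)
        if 0 < distance <= tolerance:
            gaps.append((name_a, name_b, round(distance, 4)))
    return gaps


ENDPOINTS = {
    "e1_end": (100.00, 50.00),
    "e2_start": (100.02, 50.00),  # 2 cm away from e1_end: a likely digitizing gap
    "e3_start": (100.00, 50.00),  # same coordinate as e1_end: properly connected
    "e4_start": (250.00, 75.00),  # far away: unrelated
}
GAPS = find_gaps(ENDPOINTS)
```

The pairwise comparison is quadratic in the number of endpoints, so it only suits small samples. A real dataset would likely need a spatial index to keep the check fast.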

Node and edge geometry

Each node needs accurate coordinate data, and each edge needs a correctly ordered sequence of vertices. For infrastructure networks, this typically means working with high-resolution survey data or data captured through structured field collection processes. The coordinate reference system must be consistent across all layers involved in the analysis.
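The vertex-ordering requirement can be checked with a small helper like the one below. It is a sketch that assumes node coordinates match the line ends exactly (that is, after any snapping has been done); the function name and sample coordinates are hypothetical.

```python
def orient_vertices(vertices, from_xy, to_xy):
    """Return vertices ordered from `from_xy` to `to_xy`, reversing if needed.

    Raises ValueError when the line's ends do not match both nodes exactly.
    """
    if vertices[0] == from_xy and vertices[-1] == to_xy:
        return list(vertices)
    if vertices[0] == to_xy and vertices[-1] == from_xy:
        return list(reversed(vertices))
    raise ValueError("edge geometry does not start and end at its nodes")


# A line digitized in the opposite direction to its from/to nodes.
LINE = [(10.0, 0.0), (5.0, 2.0), (0.0, 0.0)]
ORIENTED = orient_vertices(LINE, from_xy=(0.0, 0.0), to_xy=(10.0, 0.0))
```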

What attribute data does a routing analysis depend on?

Routing analysis depends on attribute data to define the cost or impedance of traveling along each network edge. Without attribute values, the algorithm can only minimize raw distance. With attributes, it can optimize for travel time, operational cost, risk level, capacity, or any other measurable factor relevant to your use case.

The most common attributes used in routing analysis include:

Impedance values: travel time, distance, or a custom cost assigned to each edge
Direction restrictions: one-way designations for roads or flow direction for pipelines
Capacity attributes: maximum load, flow rate, or bandwidth limits
Turn restrictions: rules governing whether a vehicle or flow can transition between specific edge pairs at a node
Barrier attributes: temporary or permanent restrictions such as road closures, valve positions, or maintenance zones
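To illustrate how some of these attributes interact, here is a minimal sketch of Dijkstra's algorithm that uses an impedance value per edge, honors one-way restrictions, and skips edges listed as barriers. The tuple layout and the sample network are assumptions for illustration. Turn restrictions and capacity limits are not modeled here.

```python
import heapq


def shortest_path(edges, start, goal, barriers=frozenset()):
    """Dijkstra over edges given as (edge_id, from_node, to_node, cost, one_way)."""
    graph = {}
    for edge_id, a, b, cost, one_way in edges:
        if edge_id in barriers:
            continue  # a barrier removes the edge from the search
        graph.setdefault(a, []).append((b, cost))
        if not one_way:
            graph.setdefault(b, []).append((a, cost))

    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost_so_far, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost_so_far > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, cost in graph.get(node, []):
            new_cost = cost_so_far + cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))

    if goal not in best:
        return None, float("inf")  # no route exists in the data
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1], best[goal]


# Hypothetical network: cost is travel time in minutes.
NETWORK = [
    ("e1", "A", "B", 4.0, False),
    ("e2", "B", "C", 3.0, False),
    ("e3", "A", "C", 10.0, False),
    ("e4", "C", "A", 2.0, True),  # one-way from C to A only
]
ROUTE, COST = shortest_path(NETWORK, "A", "C")
DETOUR, DETOUR_COST = shortest_path(NETWORK, "A", "C", barriers={"e2"})
```

With no barriers the route runs A, B, C at a cost of 7 minutes. Marking `e2` as a barrier forces the direct but slower edge `e3`, and the one-way edge `e4` is never used because it only permits travel from C to A.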

For utility networks specifically, attributes such as operating pressure, material type, installation year, and condition rating can all feed into routing logic. When you integrate multiple data sources and build relationships between them, you can create derived attribute fields that combine several factors into a single routing cost, giving you a much more realistic model of how your network actually behaves.
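A derived routing cost of this kind might look like the sketch below. The weighting scheme, the material factors, the 1-to-5 condition scale, and the reference year are all assumptions chosen for illustration; real weights would come from your own asset policy.

```python
# Assumed multipliers for illustration only.
MATERIAL_FACTOR = {"PVC": 1.0, "cast_iron": 1.3}


def routing_cost(edge, current_year=2024):
    """Combine length, age, condition, and material into one cost value."""
    age_years = current_year - edge["installation_year"]
    age_factor = 1.0 + min(age_years, 100) / 100.0  # 1.0 when new, capped at 2.0
    condition_factor = 1.0 + (5 - edge["condition_rating"]) * 0.25  # rating 5 is best
    material_factor = MATERIAL_FACTOR.get(edge["material"], 1.5)  # unknown material is penalized
    return edge["length_m"] * age_factor * condition_factor * material_factor


NEW_PIPE = {"length_m": 100.0, "installation_year": 2024, "condition_rating": 5, "material": "PVC"}
OLD_PIPE = {"length_m": 100.0, "installation_year": 1974, "condition_rating": 3, "material": "cast_iron"}
```

Both pipes are the same length, but the older cast-iron pipe in worse condition receives a cost close to three times higher, so a routing engine would prefer the newer pipe when both lead to the same destination.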

What additional spatial layers improve routing accuracy?

Additional spatial layers improve routing accuracy by providing context that the network dataset alone cannot capture. Elevation models, land use classifications, administrative boundaries, and real-time event data all help the routing engine make decisions that reflect actual conditions rather than an idealized network model.

Useful supporting layers include:

Digital elevation models (DEM): relevant for gravity-fed water networks, slope-dependent vehicle routing, or flood risk assessment
Land use and zoning layers: help apply speed limits, access restrictions, or environmental constraints
Administrative and operational boundaries: service zones, maintenance districts, or pressure zones that define which parts of the network belong to which operational unit
Asset condition layers: spatial data on asset health scores or inspection results that can increase the cost of routing through degraded infrastructure
Incident and event data: real-time or historical records of faults, outages, or disruptions that should be treated as dynamic barriers

The more of these layers you can integrate natively, without extracting them from their source systems, the more responsive your routing analysis becomes. Connecting directly to live data sources means your routing model reflects current network conditions rather than a snapshot taken days or weeks ago.
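As a sketch of how incident records could act as dynamic barriers, the function below returns the edges blocked at a given moment. The record fields (`edge_id`, `reported_at`, `resolved_at`) and the sample incidents are hypothetical; a live system would read them from its own event source.

```python
from datetime import datetime


def active_barriers(incidents, at_time):
    """Return the edge IDs blocked by incidents that are open at `at_time`."""
    return {
        incident["edge_id"]
        for incident in incidents
        if incident["reported_at"] <= at_time
        and (incident["resolved_at"] is None or incident["resolved_at"] > at_time)
    }


INCIDENTS = [
    {"edge_id": "e2", "reported_at": datetime(2024, 5, 1, 9, 0), "resolved_at": None},
    {"edge_id": "e7", "reported_at": datetime(2024, 5, 1, 8, 0), "resolved_at": datetime(2024, 5, 1, 10, 0)},
]
BARRIERS_NOON = active_barriers(INCIDENTS, datetime(2024, 5, 1, 12, 0))
BARRIERS_MORNING = active_barriers(INCIDENTS, datetime(2024, 5, 1, 9, 30))
```

Because the function takes a timestamp, the same records can serve both a current routing run and a historical replay of how the network looked during an earlier outage.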

How does data quality affect routing analysis results?

Data quality directly determines whether routing analysis produces reliable results or misleading ones. Poor geometry, missing attributes, outdated records, and inconsistent coordinate systems all introduce errors that compound through the analysis. A routing algorithm will return a path whenever the data contains one, but that path is only as trustworthy as the data it runs on.

The most damaging quality issues are topological errors, specifically gaps and overlaps in the network geometry that prevent the algorithm from recognizing valid connections. These often go undetected until a route calculation fails or produces an obviously wrong result. Attribute completeness matters equally: an edge with a missing impedance value may be ignored entirely or assigned a default cost that distorts the output.

Data currency is another factor. Infrastructure networks change constantly through extensions, replacements, and reconfigurations. A routing model built on asset registration data that has not been updated recently will not reflect the actual state of the network. Tracking data changes automatically and storing them incrementally is a practical way to keep your routing inputs current without requiring full data refreshes every time something changes.

Before running any routing analysis, it is worth conducting a data quality assessment that checks for topological integrity, attribute completeness, coordinate system consistency, and record currency. Identifying and fixing these issues upstream saves significant time compared to diagnosing incorrect routing results after the fact.
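Such an assessment could start with a few simple checks, sketched below. The field names, the one-year staleness threshold, and the sample EPSG codes are assumptions for illustration. The connectivity check treats more than one connected component as a hint that the topology may be broken, although some networks are legitimately split.

```python
from datetime import date


def assess_quality(edges, expected_crs, as_of, max_age_days=365):
    """Return {check_name: [edge_id, ...]} listing the edges that fail each check."""
    issues = {"missing_impedance": [], "crs_mismatch": [], "stale_record": []}
    for edge in edges:
        if edge.get("cost") is None:
            issues["missing_impedance"].append(edge["id"])
        if edge.get("crs") != expected_crs:
            issues["crs_mismatch"].append(edge["id"])
        if (as_of - edge["last_updated"]).days > max_age_days:
            issues["stale_record"].append(edge["id"])
    return issues


def connected_components(edges):
    """Group node IDs into connected components using a depth-first search."""
    neighbors = {}
    for edge in edges:
        neighbors.setdefault(edge["from_node"], set()).add(edge["to_node"])
        neighbors.setdefault(edge["to_node"], set()).add(edge["from_node"])
    components, seen = [], set()
    for node in neighbors:
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical sample records.
SAMPLE = [
    {"id": "e1", "from_node": "A", "to_node": "B", "cost": 4.0,
     "crs": "EPSG:28992", "last_updated": date(2024, 3, 1)},
    {"id": "e2", "from_node": "B", "to_node": "C", "cost": None,
     "crs": "EPSG:28992", "last_updated": date(2024, 2, 1)},
    {"id": "e3", "from_node": "X", "to_node": "Y", "cost": 2.0,
     "crs": "EPSG:4326", "last_updated": date(2020, 1, 1)},
]
REPORT = assess_quality(SAMPLE, "EPSG:28992", as_of=date(2024, 6, 1))
COMPONENTS = connected_components(SAMPLE)
```

In this sample, `e2` lacks an impedance value, `e3` uses a different coordinate reference system and has not been updated in years, and the network splits into two components that no route can bridge.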

What data formats and standards are used in routing analysis?

Routing analysis uses a range of data formats and standards depending on the source systems, the GIS platform, and the type of network being modeled. Widely used formats include GeoJSON, Shapefile, GML, and geodatabase formats. In regulated infrastructure sectors in the Netherlands and across Europe, data frameworks and standards such as INSPIRE, NEN, and IMKL are also common.
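As an example of working with one of these formats, the sketch below reads LineString features from a GeoJSON FeatureCollection using only Python's standard `json` module. The property names (`id`, `travel_time_min`) and the inline sample document are assumptions for illustration; other geometry types are skipped rather than handled.

```python
import json

GEOJSON_TEXT = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"id": "e1", "travel_time_min": 4.0},
     "geometry": {"type": "LineString",
                  "coordinates": [[0.0, 0.0], [5.0, 2.0], [10.0, 0.0]]}},
    {"type": "Feature",
     "properties": {"id": "v1"},
     "geometry": {"type": "Point", "coordinates": [10.0, 0.0]}}
  ]
}
"""


def linestring_edges(geojson_text):
    """Extract (id, start_xy, end_xy, properties) for each LineString feature."""
    collection = json.loads(geojson_text)
    edges = []
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "LineString":
            continue  # points, polygons, and multi-geometries are skipped in this sketch
        coords = geometry["coordinates"]
        properties = feature["properties"]
        edges.append((properties["id"], tuple(coords[0]), tuple(coords[-1]), properties))
    return edges


PARSED = linestring_edges(GEOJSON_TEXT)
```

The start and end coordinates of each line are what a network builder would match against node positions to establish connectivity.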

For utility networks, data is often stored in asset management systems or spatial databases rather than flat-file formats. Connecting to these sources natively, without extracting data into intermediate formats, reduces the risk of version mismatches and keeps the routing model synchronized with operational records. Web service standards also play an important role: WFS delivers vector features that a routing application can consume directly, while WMS delivers rendered map images that are useful as visual context. Both let applications work with distributed sources without requiring local copies.

Standardization matters particularly when multiple organizations or systems contribute data to the same routing model. A consistent coordinate reference system, shared attribute naming conventions, and agreed-upon topology rules make it possible to integrate data from different sources into a single coherent network dataset. Where standards are not enforced at the source, data-shaping tools that allow filtering, renaming, and creating derived fields become important for preparing inputs before analysis begins.
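The filtering, renaming, and derived-field steps mentioned above can be sketched in a few lines of Python. The source field names, the rename mapping, and the travel-time formula are hypothetical examples, not the behavior of any specific data-shaping tool.

```python
def shape_records(records, rename, keep, derive):
    """Filter records, rename their fields, then add derived fields."""
    shaped = []
    for record in records:
        if not keep(record):
            continue
        row = {rename.get(key, key): value for key, value in record.items()}
        for field, func in derive.items():
            row[field] = func(row)  # derived fields see the renamed keys
        shaped.append(row)
    return shaped


SOURCE = [
    {"ID": "e1", "LEN": 1200.0, "SPEED_KMH": 60.0, "STATUS": "in_service"},
    {"ID": "e2", "LEN": 300.0, "SPEED_KMH": 30.0, "STATUS": "abandoned"},
]
SHAPED = shape_records(
    SOURCE,
    rename={"ID": "id", "LEN": "length_m", "SPEED_KMH": "speed_kmh", "STATUS": "status"},
    keep=lambda r: r["STATUS"] == "in_service",
    derive={"travel_time_min": lambda r: r["length_m"] / 1000.0 / r["speed_kmh"] * 60.0},
)
```

Only the in-service edge survives the filter, and it gains a `travel_time_min` field that a routing engine could use as its impedance value.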

At Spatial Eye, our spatial analysis capabilities are built to handle exactly this kind of complexity. We connect to your data natively, apply routing and topology functions directly, and help you synthesize information from multiple sources into a coherent, actionable network model. If you want to understand how your infrastructure data can support more reliable routing analysis, we are happy to show you what is possible.

Spatial Eye