Cygnus – conflation at your fingertips!

This is a follow-up blog post after the State of the Map US 2017 conference held in Denver.

The process of conflation in GIS is defined as the act of merging two data layers to create one layer containing the features and attributes of both original layers.

Cygnus is a tool that compares external data with OSM, giving you a result file in JOSM XML format with all the changes. The comparison is made in a non-destructive way, so no OSM ways are ever deleted or degraded.

Workflow

NOTE – The license compatibility between the local data file and OSM has to be taken into account before adding anything in OSM. Also, please follow the OSM import procedures if you are planning to add external data to OSM.

First of all, you need to have a shapefile with local data in WGS84 spatial reference. This shapefile has to be filtered in different ways, depending on the tags you want to compare. For example, if you want to compare oneways, make sure to have a flow-direction/oneway/etc. attribute in the shapefile.

Translation

The first thing to take care of is a proper attribute translation. I created a simple example for this exercise. I don’t want to get neck-deep in technical details, so the main focus remains the process as a whole. I kept the attribute information for this example straightforward:

In order to create an OSM file from this data, I wrote a simple translation file that will be used together with ogr2osm.
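An ogr2osm translation file is a small Python module whose filterTags function receives the attributes of each feature and returns the OSM tags to write. The sketch below is illustrative only: the attribute names (NAME, TYPE, ONEWAY) and their values are assumptions, since the attribute table of the example shapefile is not reproduced here.

```python
# simple_translation.py (illustrative sketch, not the original file).
# ogr2osm calls filterTags() once per feature with the shapefile attributes.
# The attribute names NAME, TYPE and ONEWAY and their values are assumptions.

def filterTags(attrs):
    """Map shapefile attributes to OSM tags."""
    if not attrs:
        return None

    tags = {}

    name = (attrs.get("NAME") or "").strip()
    if name:
        tags["name"] = name

    # Hypothetical road classes mapped to highway values.
    highway_by_type = {
        "primary": "primary",
        "secondary": "secondary",
        "local": "residential",
    }
    road_type = (attrs.get("TYPE") or "").strip().lower()
    tags["highway"] = highway_by_type.get(road_type, "road")

    # Hypothetical flow-direction flag: "Y" means one-way in drawing direction.
    if (attrs.get("ONEWAY") or "").strip().upper() == "Y":
        tags["oneway"] = "yes"

    return tags
```

Unknown road classes fall back to highway=road here so that they stand out during review in JOSM; a real translation would follow the classification of the local dataset.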

Next, run the command below to obtain the OSM file.

python ogr2osm.py simple_streets.shp -t simple_translation.py -o simple_output.osm

Finally, I converted the OSM file to PBF using osmosis, because Cygnus requires a PBF file as input.

Cygnus goes to work!

Now that you have pre-processed the local data file, you can offer it to Cygnus for processing. Note that your upload needs to be small-ish – the spatial extent needs to be smaller than 50×50 km and the file needs to be 20 MB or smaller.
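If you want to check these limits before uploading, a rough pre-flight test could look like the sketch below. It is not part of Cygnus: the function names are made up for illustration, the extent is approximated from the bounding box with the haversine formula, and "20 MB" is read as 20 × 1024 × 1024 bytes, which is an assumption.

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_EXTENT_KM = 50.0               # extent limit quoted in the post
MAX_SIZE_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" read as 20 MiB


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two WGS84 points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def bbox_extent_km(min_lon, min_lat, max_lon, max_lat):
    """Approximate width and height of a bounding box, in kilometres."""
    mid_lat = (min_lat + max_lat) / 2
    width = haversine_km(min_lon, mid_lat, max_lon, mid_lat)
    height = haversine_km(min_lon, min_lat, min_lon, max_lat)
    return width, height


def fits_cygnus_limits(bbox, size_bytes):
    """Return True if the bounding box and file size are within the limits."""
    width, height = bbox_extent_km(*bbox)
    return (width < MAX_EXTENT_KM
            and height < MAX_EXTENT_KM
            and size_bytes <= MAX_SIZE_BYTES)
```

The bounding box is given as (min_lon, min_lat, max_lon, max_lat) and the file size can be read with os.path.getsize.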

The interface of the Cygnus service is very simple – there are just two pages:

  • the home page where you add new jobs
  • the job queue page where you can see your progress and download the result

If your input file was uploaded successfully, Cygnus will go to work. Your job will be added to the back of the queue. When it’s your turn, Cygnus will read your PBF file and download the OSM data for the same extent using the Overpass API. It will then compare your upload with the existing OSM data and produce the output file that you can download from the job queue.

NOTE – Everyone’s jobs are listed here, so be careful not to touch other users’ stuff.

Process the output in JOSM

Once Cygnus gives us the output, we can open it in JOSM and inspect it. This is by far the most important, and time-consuming, step. Even though Cygnus makes a best effort to connect ways where needed, it acts conservatively, so it will not snap together ways that do not belong together.

Here are a few ways that got properly connected to the existing highway=secondary:

But there are situations where the distance was too great, so Cygnus did not snap:

In this case, you need to manually connect the ways if that is appropriate.

When you are finally satisfied with your manually post-processed conflation result, you can go ahead and merge it with the OSM data and upload it!


New ImproveOSM tiles are ready to be used!

New ImproveOSM missing road tiles are available! The new data is very helpful: it lets you target the missing roads and add them to OSM, greatly improving the map.

Worldwide, there are 113048 new road tiles. The countries with the highest number of tiles are Russia (38669 tiles), Kazakhstan (10993), India (9418), the United Kingdom (8890), and the United States (7560); see the graph below. There are a few new tiles in Detroit too, so you are welcome to give us a hand with them! You can find more information about our work in Detroit on our blog (http://blog.improveosm.org/en/2017/08/lane-number-and-turn-lane-editing-in-detroit/).


Lane number and turn lane editing in Detroit

Since we started editing in Detroit, we have focused on making OSM navigation-ready. We started with the basics (road geometry, road name, turn restrictions) and then built on this foundation by adding details like lanes and turn lanes. In the last four months, we focused on adding and updating the lane info (lane number and turn lane) on motorway, motorway_link, trunk, trunk_link, primary, primary_link, secondary, and secondary_link roads in Detroit, Michigan.

For editing lanes and turn lanes we used JOSM, the TurnLanes-tagging Editor plugin and the Lane and road attributes map paint style.

We had two kinds of lane editing: unidirectional road editing and bidirectional road editing. The only difference between the two is the direction tag used in the second case, as you can see in the table below:

For every edited case, we used a simple workflow:

  • we split the way where the number of lanes changes
  • we checked and double-checked the aerial imagery to make sure we enter the correct number of lanes and add the appropriate lanes tag
  • we opened the turn lanes-tagging plugin and activated the Lane and road attributes map style
  • using the plugin, we selected the type of the road: Unidirectional road or Bidirectional road
  • we marked the number of lanes for each way needed
  • we marked the direction on each lane
  • before uploading the data, we checked again that the turn lanes that we had added were similar to the markings on the road!
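The tagging difference between the two cases can be sketched in a few lines of Python. This follows the lanes and turn:lanes keys documented on the OSM wiki rather than the exact output of the plugin, and the helper names are hypothetical.

```python
def unidirectional_tags(turns):
    """Tags for a one-way road.

    `turns` holds one turn value per lane, listed left to right
    in the driving direction, e.g. ["left", "through", "through;right"].
    """
    return {
        "lanes": str(len(turns)),
        "turn:lanes": "|".join(turns),
    }


def bidirectional_tags(forward_turns, backward_turns):
    """Tags for a two-way road; each list describes one driving direction."""
    return {
        "lanes": str(len(forward_turns) + len(backward_turns)),
        "lanes:forward": str(len(forward_turns)),
        "lanes:backward": str(len(backward_turns)),
        "turn:lanes:forward": "|".join(forward_turns),
        "turn:lanes:backward": "|".join(backward_turns),
    }
```

For example, bidirectional_tags(["left", "through"], ["through"]) describes a road with two lanes in the forward direction and one in the backward direction, so the direction suffix is the only thing that separates it from the one-way case.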

The GIFs below show how we handled the main cases we met during our edits.

Editing the number of lanes

Adding both ways lane

In some particular cases, when there were doubts, we consulted the OSM community on GitHub and Talk-US.

While editing, we paid special attention to other already existing features (like route relations, turn restrictions, speed limits, etc.). Because the whole Telenav Mapping team was involved in this project, we established some rules from the beginning to keep our edits consistent:

  • Add a new lane only when you have a line marked on the road (use the satellite imagery, OSC photos to validate the marks).
  • Links without any marks on-road or without one-way tag should be edited as a bidirectional road, adding one lane on both driving directions.
  • Never add the turn lane before or after the continuous line mark on the road. The turn lane will be added starting from the beginning of the continuous line mark on the road.
  • We split and edit lane numbers even when we have small segments of ways.
  • The location of the junction nodes should be at the beginning of the continuous line marks.
  • We always add the yellow both-way lane.
  • We DO NOT add the yellow striped lanes and double marked line lanes.

The main sources used during the project were aerial imagery (Bing, Mapbox, NAIP, DigitalGlobe) and street-level imagery (OSC, Mapillary).

We worked on this issue for 2 months and managed to review a large part of the motorway, trunk, primary and secondary roads in the Detroit area in order to add or update lane info. During this project, we reviewed 3100 miles and edited 1730 miles of roads.

Here’s how the number of miles of roads with lane information has increased during the project:

The edits we made cover a large area of the Wayne, Macomb and Oakland counties. In the GIF below you can see an evolution (difference between March and July) of our lane info edits in OpenStreetMap.

Heatmaps with our edits during the last four months:

When we finished editing lanes and turn lanes in Detroit, we started assessing the general quality of the lane info by using different approaches. Internally, we call this process quality assurance and we think it is vital to do it after the end of each project.

During the QA process, we edited lane info on about 400 miles of roads, and the main issues that we corrected were:

  • incorrect number of lanes and turn lanes
  • duplicated/overlapping ways
  • missing both-way lanes
  • oneways with lanes:forward/lanes:backward info
  • roundabouts without the proper number of lanes
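Some of these issues can be detected from the tags alone. The sketch below is not the tooling we used; it is a hypothetical helper that takes the tags of one way as a dictionary and reports two of the problems listed above.

```python
def lane_tag_issues(tags):
    """Return a list of lane-tagging problems found in one way's tags."""
    issues = []

    # Direction-specific lane counts make no sense on a one-way road.
    if tags.get("oneway") == "yes":
        for key in ("lanes:forward", "lanes:backward"):
            if key in tags:
                issues.append(key + " used on a oneway")

    # The lane count should match the number of turn:lanes entries.
    lanes = tags.get("lanes", "")
    turn_lanes = tags.get("turn:lanes")
    if lanes.isdigit() and turn_lanes is not None:
        if int(lanes) != len(turn_lanes.split("|")):
            issues.append("lanes does not match turn:lanes")

    return issues
```

Checks like the number of lanes on a roundabout still need a look at the imagery, so a script of this kind only narrows down where to look.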

Below you can see some examples of our improvements:


Improving OSM in Canada one day at a time

Ever since we started our mapping project in Canada, nearly 8 months ago, we’ve been continuously working on bringing the OSM data to the level where all elements needed for routing get as detailed as possible.

From the basics of road networks, such as geometry, naming, and traffic flow direction, to in-depth details like the number of lanes, turn lanes, turn restrictions, signposts, and even complex relations referring to highways, we edit everything.

Our main focus is the top 5 metro areas: Toronto, Montreal, Ottawa, Vancouver, and Calgary. These are the places where we spent most of our time researching open data, adding new features, and editing existing ones. To make sure that OSM is in a navigation-ready state throughout Canada, we’ve also included the 50 largest cities by population.

So, let’s see some numbers and graphs because everybody likes those. Looking at the numbers for the entire region, we can see a significant rise in road geometry: around 3% (25,330 miles) of the total number of miles was added. The same goes for roads that previously did not have name tags, with a rise of a little over 3.5% (16,799 miles).

A more significant change can be noticed for features that weren’t extensively mapped before in the area, such as turn restrictions, which rose from 5254 to 54891, or signposts, which hadn’t been mapped under the same standardized method. With the help of OpenStreetCam and Mapillary pictures, we’ve managed to add relevant signpost information, increasing the number of nodes by well over 68%.

If we break down the numbers for the Top 5 areas, the most noticeable changes can be observed for both Toronto and Montreal where one-way tags and signpost information have been improved.

One of our main goals is to focus not only on quantity but especially on quality. This is why we have multiple tools for integrity checking that are run periodically on the entire region of Canada. These tools cover a wide variety of cases that are being corrected weekly, such as road name flip-flops, unconnected ways, smoothness problems, misnamed roads, road names having their suffixes or prefixes abbreviated, and many more.

We make use of different QA tools (KeepRight/Osmose) to search and track issues in OSM that have either been added by mistake or have remained unedited after large imports. We’re also on the lookout to improve way accuracy and fix alignment issues.

An overview of our edits.

Below you can see some examples of our improvements.

Road geometry updates.
Road geometry alignment.
Missing geometry and minor refinements.
Turning loops updates.

Mapping traffic signals and stop signs using MapRoulette

In our journey of improving OpenStreetMap, we are constantly searching for open-source data. This search is very important and is done before we start improving the map in a new area.

Currently, part of our team is focused on improving the Detroit area. So, before we started mapping we searched for useful geospatial data and we came across open data about traffic signals and stop signs for Wayne County, Detroit. The data can be found here and here.

Traffic signals mapped in OSM
Stop signs mapped in OSM

We filtered out the traffic signals and stop signs that were already in OSM, but there is still a significant amount of data that can be added to OSM (912 traffic signals and 8755 stop signs). Because of this, we decided to create a MapRoulette challenge.

About MapRoulette

MapRoulette is a micro-tasking tool used to fix bugs in OpenStreetMap and to improve it. A user can create tasks by uploading files that contain the locations (ways or points) of errors that have to be fixed, or files with features that are missing from the map and can be added by other users.

When creating a new task, the user gives specific instructions on the steps to follow when editing through this tool. Once logged in, a user can see the created challenge on the map, along with the pins that represent the tasks they can solve.

So, given the available data that we found, we created two challenges – one for traffic signals and the other for stop signs. Some general rules for mapping traffic signals and stop signs can be found on the OSM wiki – here and here.
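MapRoulette can build the tasks of a challenge from a GeoJSON file. As a rough sketch of how the filtered points could be turned into such a file, assuming each record is a (source_id, lon, lat) tuple and that the property names kind and source_id are our own invention:

```python
import json


def to_task_geojson(points, kind):
    """Build a GeoJSON FeatureCollection with one point feature per record.

    `points` is an iterable of (source_id, lon, lat) tuples and `kind`
    is a free-text label such as "stop" or "traffic_signals".
    """
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"kind": kind, "source_id": source_id},
        }
        for source_id, lon, lat in points
    ]
    return {"type": "FeatureCollection", "features": features}


def write_geojson(collection, path):
    """Save the collection so it can be uploaded when creating a challenge."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(collection, handle)
```

Note that GeoJSON stores coordinates as longitude first, then latitude.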

Tags that we use for mapping
  • Stop signs – highway=stop
  • Traffic signals – highway=traffic_signals
Notes
  • If the traffic signal/stop sign is referring to all the highways entering the intersection, we add the traffic signal/stop sign at the intersection point.

  • If the traffic signal/stop sign is not referring to all the highways entering the intersection we add the traffic signal/stop sign before the intersection, where the sign/signal is positioned.
  • We need to add an additional tag if the road is bidirectional:
    • for traffic signals, we use the traffic_signals:direction key with the forward or backward values to indicate the affected direction.

    • for stop signs add direction=forward or direction=backward to indicate the affected direction.
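These rules can be summed up in a small helper. It is only an illustration (the function name is hypothetical) and it uses highway=traffic_signals, the value documented on the OSM wiki.

```python
def control_node_tags(kind, direction=None):
    """Tags for a stop sign or traffic signal node.

    `kind` is "stop" or "traffic_signals". Pass `direction` ("forward" or
    "backward") only when the node sits on a bidirectional road before the
    intersection and affects a single driving direction.
    """
    if kind not in ("stop", "traffic_signals"):
        raise ValueError("kind must be 'stop' or 'traffic_signals'")

    tags = {"highway": kind}
    if direction is not None:
        if direction not in ("forward", "backward"):
            raise ValueError("direction must be 'forward' or 'backward'")
        # Stop signs use the plain direction key, signals a prefixed one.
        key = "direction" if kind == "stop" else "traffic_signals:direction"
        tags[key] = direction
    return tags
```

A node placed on the intersection itself, where the sign or signal applies to all incoming roads, gets only the highway tag.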

The data has been published under a Public Domain license.

Everyone who is keen on mapping is welcome to help us.

Let’s improve OSM together!


How to deal with orphan nodes in OSM?

This is a guest post by one of our Map Analyst team interns, Manuela.


While editing the map I stumbled upon clusters of orphan nodes. Basically, orphan nodes have no tags and are not part of a way.

Some online tools report these as bugs/issues (e.g. osmose). You should be careful, though. These may have been created for a reason. Before proceeding to deletion, ask yourself:

  • Were these nodes orphan from the very beginning?
  • Is there just one orphan node in a changeset or are there many?
  • Are they arranged in a special way/shape?

If so, they may be GPS traces that could be used for mapping. Still, even nodes from GPS traces should eventually be deleted: there is no reason to keep orphan nodes in the map AFTER you have extracted all the needed information from them.

My advice: research! There are many tools to find out the history of an OSM element. To name a few: object history, WHODIDIT, OSM History Viewer, attic data, etc.

You’ll probably find yourself in one of the following situations:

  • You’ve found a newbie that creates orphan nodes by mistake
  • The nodes are correct but the user forgot to add the tags
  • A Redaction bot deleted ways without deleting the corresponding nodes (read more here)

Your options:

  • Contact the creator or the user who made the last change via a changeset comment or private message (in a friendly way, of course)
  • Add relevant tags if that’s the case
  • Delete the nodes or leave a fixme tag for other users that may know the area better than you

How to find orphan nodes

Osmose will report orphan node clusters as issues.

In JOSM, orphan nodes (even isolated ones) can be easily found using the Search tool, with the following queries:

type:node tags:0 -child

or

type:node untagged -child

The -child tells JOSM to select only those nodes that are not part of a way.
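Outside JOSM, the same selection can be approximated on a downloaded .osm file. The following is a minimal sketch using only the Python standard library; the function name is made up, and it treats a node as orphan when it has no tags and is not referenced by any way or relation in the same file.

```python
import xml.etree.ElementTree as ET


def find_orphan_nodes(osm_xml):
    """Return the ids of untagged nodes that no way or relation refers to."""
    root = ET.fromstring(osm_xml)

    # Collect every node id referenced by a way (<nd>) or relation (<member>).
    referenced = {nd.get("ref") for nd in root.iter("nd")}
    referenced |= {
        member.get("ref")
        for member in root.iter("member")
        if member.get("type") == "node"
    }

    return [
        node.get("id")
        for node in root.iter("node")
        if node.find("tag") is None and node.get("id") not in referenced
    ]
```

Keep in mind that a node at the edge of a partial download may look orphan only because its parent way is not in the file.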

As if that wasn’t enough already, I’ve created a map paint style that highlights orphan nodes and dims other elements. This will help you analyze the distribution of the orphan nodes without needing to select them, and will help you decide whether to delete them or not.

This is the default JOSM map paint style, where you can’t really see the orphan nodes that well.
This is what the map looks like after applying the new paint style.

If you want to use this map paint style, you can find the script and a step by step guide on the GitHub page:

https://github.com/manuelabutuc/JOSM-Orphan-nodes-map-paint-style/tree/master

Have fun spotting orphan nodes! But remember to delete them only if you are sure they have no reason to be on the map. If you have any improvement suggestions for this map paint style, feel free to comment below or fork the project.


Using PostGIS to answer geodata questions

One of the biggest challenges when working with large sets of data is to find the least costly workflow that you have to follow in order to get the most accurate answers.

Let’s say you have a huge dataset composed of all sorts of geometry features (points, lines, areas, etc.) and you want to do a bit of cleaning – because messy and redundant information is no fun!

So you might be thinking “Hmmm… which are the areas that have an unnecessarily high density of points?”

The same issue can arise when working with OpenStreetMap data. This can be easily solved using PostGIS and a command-line tool that we’ve created and used.

Note: The following steps require a Linux environment, PostgreSQL 9.x, PostGIS 2.x, Osmosis 0.43+, and QGIS 2.12.2+.

Getting the data

Download a *.osm.pbf file using the command line:

wget https://s3.amazonaws.com/metro-extracts.mapzen.com/san-francisco_california.osm.pbf

This is the metro extract for San Francisco, provided by Mapzen. Geofabrik is also a very good resource for OSM data extracts.

In the same folder, download SCOPE – databaSe Creator Osmosis Postgis loadEr.

wget https://raw.githubusercontent.com/baditaflorin/osm-postgis-scripts/master/scope.sh

Make sure to set the file to be executable by using

chmod +x scope.sh

Load the data

Using SCOPE and following the instructions on the screen, load the *.osm.pbf into a database.

SCOPE automatically creates the database with hstore and PostGIS extensions and the pgsnapshot schema.

Play with the data

Now that you have the data set up, you can easily query it using the DB Manager from QGIS and some PostGIS scripts.

Interesting examples

For example, using the find_duplicate_nodes query, we can see that this building (@20.805088941495338, -104.92877615339032), appears on the same spot 23 times!


The one next to it (@20.8054225, -104.9278152) appears 22 times!


The node density for these areas (@20.4411867, -97.3172739) is too high – 168 nodes!


Also, 171 nodes for a small fence segment (@46.7487683, 23.559687)!

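The idea behind these two checks can be illustrated without a database. The sketch below is not the find_duplicate_nodes query from the repository; it is a plain-Python approximation that assumes the nodes are available as (lat, lon) pairs, and the grid cell size is an arbitrary choice.

```python
import math
from collections import Counter


def find_duplicate_nodes(nodes, precision=7):
    """Return {(lat, lon): count} for coordinates used by more than one node."""
    counts = Counter(
        (round(lat, precision), round(lon, precision)) for lat, lon in nodes
    )
    return {coord: n for coord, n in counts.items() if n > 1}


def node_density(nodes, cell_deg=0.001):
    """Count nodes per grid cell; cells are `cell_deg` degrees on a side."""
    return Counter(
        (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        for lat, lon in nodes
    )
```

Calling node_density(nodes).most_common(10) lists the ten busiest cells, which are good candidates for a closer look in QGIS or JOSM.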

Feel free to fork the GitHub repository and modify the code to suit your needs! Also, if you feel inspired, you can suggest a better and shorter name or acronym for SCOPE!


Turn restrictions – a vital part of any routing system

The best part of using everyday OSM technologies and relying on OSM to make sure that you get “there” on time is that you can directly influence the quality of the experience.

Regardless of which OSM technology you use, the routing software can only give you the best possible experience if it knows as much as possible about the roads between you and your destination: one-way streets, turn restrictions, speed limits, road closures, and much more.

For example, turn restrictions contribute significantly to the total travel time and to the correctness of the route altogether. If the traffic network model ignores them, essential characteristics of the network might be missed, leading to substandard and unreasonable paths.

Dealing with turn restrictions in OSM

To help us navigate the complexities of translating real-world scenarios into the ways and nodes schema of OSM, we will rely on JOSM with the turn restrictions plugin installed.

Turn restrictions in OSM are handled by creating a relation

A relation is one of the core data elements that consists of one or more tags and also an ordered list of one or more nodes, ways, and/or relations as members which is used to define logical or geographic relationships between other elements. (source)

There is a mandatory requirement when creating a turn restriction relation: it must have at least three members and two tags. For example, the ‘type=restriction’ tag flags the relation as a turn restriction and ‘restriction=no_u_turn’ indicates the restriction type.

A ‘no_’ type relation can also be represented in map data as an ‘only_’ type relation. Some routing engines prefer the prohibited (‘no_’) form of a turn restriction over the allowed (‘only_’) form.

More details here – https://wiki.openstreetmap.org/wiki/Relation:restriction; US regulatory signs – http://mutcd.fhwa.dot.gov/services/publications/fhwaop02084/

Members of a turn restriction relation are ways and nodes

One simple case can be a turn restriction relation that consists of three members – two ways and one node. The two ways would represent the beginning (‘from’ role) and end (‘to’ role) of the turn restriction. The node would represent the continuity of travel between two ways and has a ‘via’ role.

Way (A) – node (B) – way (C) sequence in a ‘no_left_turn’ restriction relation.

Another case is where a turn restriction relation can consist of three or more ways. Two ways from this type of relation would represent the beginning and end of the turn restriction and at least one way would represent the continuity of travel between the aforementioned ways (‘via’ role).

Way (A) – way (B) – way (C) sequence in a ‘no_u_turn’ restriction relation.
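Both member layouts can be expressed as OSM XML. The sketch below builds the relation element with the Python standard library; the function name and all ids are hypothetical, and the negative relation id simply follows the JOSM convention for objects that have not been uploaded yet.

```python
import xml.etree.ElementTree as ET


def build_restriction(relation_id, restriction, from_way, via, to_way):
    """Build a turn restriction <relation> element.

    `via` is a list of (member_type, ref) pairs: either exactly one node,
    or one or more ways.
    """
    via_types = {member_type for member_type, _ in via}
    if via_types == {"node"}:
        if len(via) != 1:
            raise ValueError("a via node must be the only via member")
    elif via_types != {"way"}:
        raise ValueError("via must be one node or one or more ways")

    relation = ET.Element("relation", id=str(relation_id))
    ET.SubElement(relation, "member", type="way", ref=str(from_way), role="from")
    for member_type, ref in via:
        ET.SubElement(relation, "member", type=member_type, ref=str(ref), role="via")
    ET.SubElement(relation, "member", type="way", ref=str(to_way), role="to")

    # The two mandatory tags described above.
    ET.SubElement(relation, "tag", k="type", v="restriction")
    ET.SubElement(relation, "tag", k="restriction", v=restriction)
    return relation
```

For instance, build_restriction(-1, "no_left_turn", 101, [("node", 202)], 303) corresponds to the way (A) – node (B) – way (C) case, and passing [("way", 202)] as via gives the way – way – way case. ET.tostring(relation, encoding="unicode") returns the XML text.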

Workflow for adding turn restrictions

The traditional way

This approach uses the relation editor built into JOSM. A slight disadvantage of this method is that you spend a bit more time manually constructing the relation. Click on the image below for the how-to video.

traditional_way_vid

The user-friendly way

This approach uses the turn restrictions plugin, which automatically recognizes the type of relation and the role of each member. Click on the image below for the how-to video.

user_friendly_vid

Using the aforementioned tools, we have reviewed 2,000 miles of field trip footage and added nearly 2,500 turn restrictions in the LA/Orange County area, where 85% of the turn restrictions that were added to the map are no_u_turns, followed by 11% of no_left_turns, the rest being covered by the other categories.

Hopefully, we’ve managed to illustrate how easy it is to map turn restrictions in OSM. Now, it’s your turn!


How we imported Administrative Boundaries for Mexico from INEGI

The INEGI boundaries import project focuses on importing the national, state, municipal and sub-municipal divisions present in the MGN (Marco Geoestadístico Nacional) published by INEGI, in a community-monitored process.

One of the current problems in OSM regarding Mexico’s data is the incompleteness of the administrative boundaries for municipalities. Municipalities are the second-level administrative division in Mexico, the first being the state. There are 2456 municipalities, including the ones in Mexico City which are also a second-level division just with a different name – delegations.

The main goal of this process is to enhance the current OSM administrative division coverage of Mexico with open data made available by the government at the end of 2014.

Import Process

The following steps describe the entire workflow we followed to import the boundary data.

  • Step 0 – Reprojection of INEGI dataset

Before any other step, the data released by INEGI has to be reprojected from ITRF92 to WGS84 (EPSG:4326) using QGIS and saved as a .shp file. An important thing to mention is that no simplification of the boundary geometries is applied in this or any of the subsequent steps, since the geometries are official government data.

  • Step 1 – Conversion to OSM data

Download the boundary relation of the state of interest from OSM and save it as a .osm file. In QGIS, using Vector > Research Tools > Select by Location, select the INEGI municipality boundaries that are within the area of interest (in this case, Quintana Roo state) and export the selection as a .shp file.

Municipalities in Quintana Roo, as polygons.

The exported features will be polygons. In order to process them, they must be converted to lines in QGIS using the Polygons to Lines option, available in Vector > Geometry tools. Visually, the output will look the same as when the municipalities were polygons.

Municipalities in Quintana Roo, as lines.

Using ogr2osm, the .shp file containing the boundaries as lines is converted into a .osm file.

Before moving forward, the resulting .osm file has to be modified a bit. Using Notepad++, open the file and search and replace <nd ref=' with <nd ref='- and <node id=' with <node id='-, so that the ids in the file become negative.

The negative id is important because it tells JOSM that this is new data, not yet added to the map.
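The same search and replace can be scripted. This is a minimal sketch rather than the procedure we actually used: it handles the two patterns mentioned above, accepts both quote styles, and leaves ids that are already negative untouched. The function name is made up.

```python
import re

# Matches <nd ref=' or <node id=' (with either quote style) followed by a
# digit, so ids that already start with a minus sign are skipped.
_POSITIVE_ID = re.compile(r"""(<nd ref=|<node id=)(['"])(\d)""")


def negate_ids(osm_text):
    """Insert a minus sign in front of positive node ids and node refs."""
    return _POSITIVE_ID.sub(r"\1\2-\3", osm_text)
```

Because already negative ids are skipped, running the function twice on the same file gives the same result as running it once.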

Next, the .osm file can be converted to a .osm.pbf file using osmosis.

  • Step 2 – Processing

We load the .osm.pbf file from the previous step into an internal tool called Mexico Split. The tool is designed to eliminate duplicate/overlapping ways by detaching them from their parent polygons and replacing them with a single way shared by the two polygons involved.

Detects overlapping ways and replaces them with a single common way.

Besides this main purpose, the tool also splits any resulting ways longer than 2000 segments into shorter ways, groups the ways into relations according to the borders they define, and adds some predefined tags to these ways and relations.
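Mexico Split is an internal tool, so its code is not shown here. The splitting step on its own can be sketched as follows, assuming a way is represented as an ordered list of node ids; the function name is hypothetical.

```python
def split_way(node_ids, max_segments=2000):
    """Split an ordered list of node ids into consecutive parts.

    Each part has at most `max_segments` segments, and neighbouring parts
    share their boundary node so the resulting ways stay connected.
    """
    if max_segments < 1:
        raise ValueError("max_segments must be at least 1")
    if len(node_ids) < 2:
        return [list(node_ids)]

    parts = []
    start = 0
    last = len(node_ids) - 1
    while start < last:
        end = min(start + max_segments, last)
        parts.append(list(node_ids[start:end + 1]))
        start = end  # the next part begins on the shared node
    return parts
```

A way with 2000 segments has 2001 nodes, while the OSM API accepts at most 2000 nodes per way, so a slightly lower max_segments (for example 1999) may be needed in practice.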

Tags added to relations:

type=boundary

INEGI:MUNID=<value_from_the_original_polygon>

name=<value_from_the_original_polygon>

Tags added to both ways and relations:

boundary=administrative

admin_level=6

source=INEGI, MGN 2014 v6.2

For example, data for Bacalar municipality contains the following information:

Bacalar municipality in Quintana Roo state.
  • Step 3 – Backup and metrics of existing OSM data

Before the import, we took a backup of the current OSM data for the regions that were going to be impacted, using the Overpass API. We also recorded tag-related metrics (source, population, admin_center, admin_label, Wikipedia, etc.) in order to have an overview of the newly added information.
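A backup of this kind can be requested with an Overpass QL query. The sketch below is an assumption about how such a query could look, not the exact one we ran: it selects the admin_level=6 boundary relations inside a state, together with their member ways and nodes, and uses the public overpass-api.de endpoint.

```python
import urllib.parse
import urllib.request

# Public Overpass instance; which instance was used for the import is an assumption.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_backup_query(state_name, admin_level="6", timeout=300):
    """Overpass QL for the boundary relations of one level inside a state."""
    return (
        f"[out:xml][timeout:{timeout}];\n"
        f'area["boundary"="administrative"]["admin_level"="4"]'
        f'["name"="{state_name}"]->.state;\n'
        f'relation["boundary"="administrative"]'
        f'["admin_level"="{admin_level}"](area.state);\n'
        "(._;>;);\n"   # add the member ways and their nodes
        "out meta;\n"  # keep version, timestamp and user information
    )


def fetch_backup(query, path):
    """Send the query to Overpass and save the response to `path`."""
    data = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(OVERPASS_URL, data=data) as response:
        with open(path, "wb") as out:
            out.write(response.read())
```

For example, fetch_backup(build_backup_query("Quintana Roo"), "quintana_roo_backup.osm") would store the result as an .osm file that can be opened in JOSM.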

  • Step 4 – Delete existing data from OSM and upload fresh data

In some cases, the states already have some information regarding municipality boundaries (admin_level=6). This data will be deleted, but before deletion we take a look at all the features and relations to get a clear picture of what we should put back in the map data after the import.

Next, we upload the municipalities on a state-by-state basis.

  • Step 5 – Clean/verify the newly added data

This is a very important step because we verify the data that we’ve uploaded to make sure that there are no errors and manually re-link the admin_level=6 relations to the admin_level=4 boundaries, where required. Any other manual corrections are done at this step.

An example of the newly added municipalities boundaries in Tabasco.

To ease the process of importing the municipality boundaries, we use the Mexico Import Map paint style for JOSM. It highlights the last node of every way, making it simple to see the length of every way.

Map styles – JOSM default vs. Mexico Import.

The square node also has a certain degree of transparency, so we can see whether another node lies under it. This lets us work systematically: duplicated nodes are quick to spot, and so is the difference between admin_level=4 and admin_level=6 boundaries.
