Help Train OpenStreetCam’s Open Sign Detection Platform

Telenav has open-sourced the machine-learning-based sign detection platform that powers the automatic detection of nearly 100 sign types in the OpenStreetCam images you contributed. You can already see these detections in the latest version of the OpenStreetCam JOSM plugin to help you map, and iD integration will follow soon.

Machine learning gets better with training. The more known instances of a particular sign that are fed into the system, the more reliable the automatic detections for that sign type will become.

Our Map Team has spent thousands of hours manually tagging and validating traffic signs in images, and the resulting training data is open source too. But did you know you can help improve the detection system yourself? Let us show you how.

If you go to the trip details on the OpenStreetCam web site, you will see three ‘tabs’ on the left. The first one takes you to the main trip info. The second one takes you to an OSM edit mode that lets you quickly go over detections and see if they need to be added to OSM. (More on that in a separate post!) The third tab is the sign validation mode. If the tab icon has a number with it, there are unverified signs to work on.

The detection validation mode on the OpenStreetCam web site

The bottom part of the screen shows all detected signs. The ones that have been validated already will have a green checkmark with them. The ones that have been invalidated will have a red ‘X’. 

If the sign in the image exactly matches the automatic detection, validate it; if it does not match, invalidate it. In both cases, click the corresponding button on the left.

Power Validator Workflow

You can validate entire trips with many detected signs very quickly by using some of the power functions available:

  • Next to the trip slider, underneath the image, you will find a small magnifying glass button. Clicking this will automatically zoom and pan the image to the detection.
  • Use Cmd (Mac) / Alt (Windows / Linux) and the left and right arrows to quickly jump between detections.
  • Use Cmd / Alt up and down to validate or invalidate the currently highlighted detection.

Skipping through detections quickly using shortcut keys Cmd / Alt up and down


Summer Dispatch From The Telenav Map Team

It has been an exciting summer! Besides our regular work, there was the annual State of the Map conference that we were all really looking forward to. We launched a new ImproveOSM web site. OpenStreetCam dash-cams are being distributed to OSM US members. And more. Read all about it in our Summer Dispatch below!

State of the Map

Quite a few of us got to go to State of the Map in Milan, Italy! Our team hosted four presentations at the conference, and we are really happy with the interest and feedback we received. We made a lot of new map friends as well!

All SOTM presentations were recorded and posted on YouTube, so if you missed any of us, you can watch the presentations at your leisure:

Alina and Bogdan presenting our Machine Learning stack at SOTM 2018

We also had a booth at the conference where we talked about ImproveOSM and OpenStreetCam, and where 6 lucky winners received a Waylens OpenStreetCam dashboard camera!

Excited crowd right before one of the Waylens cameras is given away!

Mapping

We continue to map in Canada, the United States, and Mexico. As always you can track our work on GitHub. We have been focusing a lot on adding missing road names for the larger metropolitan areas in the US. Our typical workflow is to identify local government road centerline data sources, verify the license, process them with Cygnus to find changed / new names, and manually add the names if we can verify them.

Local road centerline data the team identified in Colorado

We are excited that the US community is looking to build an overview of available road centerline databases from (local) governments. We hope the ones we identified can help bootstrap this initiative.

We also published some MapRoulette challenges around this topic. 

ImproveOSM

Right on time for State of the Map, we launched a complete redesign of improveosm.org, our portal for everything Telenav❤️OSM. The new site gives you quick access to our OSM initiatives, data and tools. Check it out!

We also released more than 20 thousand new missing road locations. These are added to the existing database, which currently holds more than 2.4 million missing road locations. An easy way to start editing based on these locations is to download the ImproveOSM plugin for JOSM.

Locations of the new missing roads

OpenStreetCam

The steady growth of OpenStreetCam continues. Almost 4.5 million kilometers of trips are in the OSC database. This amounts to about 165 million images!

We started a collaboration with OpenStreetMap US to run a Camera Lending program. Through the program, OSM US members can apply to borrow a custom Waylens Horizon camera for up to three months. The camera captures high resolution images for OSC and uploads them automatically. Almost 20 mappers have a camera already, and they have driven about 30 thousand kilometers in the past couple of months!

The passenger’s seat of our Camera Man ToeBee, as he gets ready to dispatch a bunch of Waylens cameras

That’s a wrap for our summer dispatch, folks! Thanks for reading, and keep an eye on the blog for more from the Telenav Map Team. Be sure to follow us on Twitter as well: @improveOSM and @openstreetcam. 👋🏼

 


New version of OpenStreetCam JOSM plugin with sign detections

This post also appears on my OSM diary.

The Telenav OSM team just released a new version of the OpenStreetCam JOSM plugin. The major new feature is the ability to show and manipulate street sign detections. Images in only a few areas are currently processed for sign detection, so it’s not very likely that you will see anything yet, but that will change over time as we catch up processing over 140 million images.


To enable detections, right-click on the OpenStreetCam layer in the Layers panel, and check ‘Detections’ under ‘Data to display’. You can filter the detections by the following criteria:

  • Not older than — show only detections (or images) from that date or newer.
  • Only mine — show only detections / images from my own OSM / OSC account.
  • OSM Comparison — show detections based on comparison with OSM data:
    • Same data — Only show signs that have corresponding tags / data already mapped in OSM
    • New data — Only show signs that do not have corresponding data in OSM and need to be mapped
    • Changed data — Only show signs that have existing tags in OSM but the value is different (for example a 50 km/h sign and the OSM way is mapped as 60 km/h)
    • Unknown — No match could be made between the detected sign and OSM data
  • Edit status — show detections based on manually set status of the detection:
    • Open — new detection, status not changed yet
    • Mapped — manually marked as mapped
    • Bad sign — manually marked as a bad detection
    • Other — other status
  • Detection type — show only signs of the selected types.
  • Mode — Show only automatic detections, manually tagged detections, or both.

For the filters OSM Comparison, Edit status and Detection type, you can select multiple values by using shift-click and command/ctrl-click.

In the main editor window, you can select a sign to load the corresponding photo, which will show an outline of the detected sign. If there are multiple signs in an image, you can select the next one by clicking on the location again. (This is something we hope to improve.)


In the new ‘OpenStreetMap detections’ panel, you can see metadata for the detection, and set the status to Mapped, Bad Detection, or Other. By marking signs that are not detected correctly as Bad Detection, you hide them from other mappers, and we will use that information to improve the detection system.

The plugin is available from the JOSM plugin list, and the source is on GitHub.


Working with ImproveOSM Data Dumps

Our ImproveOSM pipeline produces a pretty impressive number of suggested roads missing from OSM, missing oneway tags, and missing turn restrictions, based on analysis of billions of GPS data points. We make the results available as frequent data dumps in CSV format. In this post, I want to look at a way to integrate this data into your OSM mapping workflow.

If you just want to see ImproveOSM data in JOSM wherever you are currently mapping, you can just use the ImproveOSM JOSM plugin. For advanced users who want more flexibility, or who want to use this data in different ways, this post offers some guidance.

The data dumps are available from here. For this example, I will work with the most recent Direction of Flow data file. This highlights ways with a potentially missing oneway tag. After downloading and unzipping it, you will have a CSV file of about 16.5 megabytes that looks like this:

wayId;fromNodeId;toNodeId;percentage;status;roadType;theGeom;numberOfTrips
148617028;1867720648;89191396;99.5378927911275;SOLVED;THROUGHWAY;LINESTRING(2.217821 48.922613,2.217719 48.922618,2.217408 48.922633);1082
33555379;322840377;322840383;98.6301369863014;INVALID;LOCAL_ROAD;LINESTRING(4.999815 47.34294,4.999957 47.343062,4.999965 47.34315);146
17271190;178942503;2341050872;100;OPEN;LOCAL_ROAD;LINESTRING(11.070503 50.139245,11.070525 50.139213,11.070616 50.139099,11.070693 50.139032);74
.....
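If you would rather inspect the file in a script before opening QGIS, here is a minimal Python sketch that reads the semicolon-delimited rows and parses the WKT geometry. It uses the three sample rows above as inline data. The helper names are my own, and treating OPEN as "not handled yet" is an assumption based on the sample, not something the dump documents.

```python
import csv
import io

# Sample rows copied from the excerpt above; a real run would open the unzipped CSV instead.
SAMPLE = """wayId;fromNodeId;toNodeId;percentage;status;roadType;theGeom;numberOfTrips
148617028;1867720648;89191396;99.5378927911275;SOLVED;THROUGHWAY;LINESTRING(2.217821 48.922613,2.217719 48.922618,2.217408 48.922633);1082
33555379;322840377;322840383;98.6301369863014;INVALID;LOCAL_ROAD;LINESTRING(4.999815 47.34294,4.999957 47.343062,4.999965 47.34315);146
17271190;178942503;2341050872;100;OPEN;LOCAL_ROAD;LINESTRING(11.070503 50.139245,11.070525 50.139213,11.070616 50.139099,11.070693 50.139032);74
"""


def parse_linestring(wkt):
    """Turn 'LINESTRING(lon lat,lon lat,...)' into a list of (lon, lat) float tuples."""
    inner = wkt[wkt.index("(") + 1 : wkt.rindex(")")]
    return [tuple(float(v) for v in pair.split()) for pair in inner.split(",")]


def read_rows(handle):
    """Yield one dict per CSV row, with numeric fields and the geometry converted."""
    for row in csv.DictReader(handle, delimiter=";"):
        row["percentage"] = float(row["percentage"])
        row["numberOfTrips"] = int(row["numberOfTrips"])
        row["coords"] = parse_linestring(row["theGeom"])
        yield row


rows = list(read_rows(io.StringIO(SAMPLE)))

# Assumption: OPEN marks suggestions that nobody has resolved or rejected yet.
open_rows = [r for r in rows if r["status"] == "OPEN"]
for r in open_rows:
    print(r["wayId"], r["percentage"], len(r["coords"]))
```

To run this on the real dump, replace `io.StringIO(SAMPLE)` with an open file handle on the downloaded CSV.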

Since the theGeom field is in WKT, you can import it as a layer in QGIS pretty easily. Let’s fire up QGIS (I use 2.18) and add a Delimited Text layer.

In the dialog, select the downloaded CSV file as the file source. Set the delimiter to semicolon. QGIS detected for me that the geometry was in the theGeom field, and of type WKT, but you can set that manually if needed:

Upon clicking OK, QGIS wants us to define which CRS the coordinates are defined in. Select WGS84.

Now, we have a layer of line geometries that correspond to OSM ways that may be missing a oneway tag.

To make the file more manageable, let’s limit our selection to one country. I get country boundaries from Natural Earth (a fantastic resource!). After adding the country borders to QGIS, I can perform a spatial query. Before you do this, select the country you are interested in. I pick Mexico as an example.

Bring up the Spatial Query window. If you don’t see this menu item, you will need to enable the Spatial Query plugin.

Select the ImproveOSM layer as the source, and the Natural Earth layer as the query layer. Make sure to check the ‘1 Selected geometries’ checkbox, so we limit our query to Mexico.

The matching features will now be selected in the ImproveOSM layer. Make sure that layer is selected in the Layers Panel before you select Layer -> Save As.. from the QGIS menu. In this dialog, choose GeoJSON as the output type. Select a destination filename. Make sure that the CRS is set to WGS84. Make sure the ‘Save only selected features’ is checked, and Save.

Now you have a GeoJSON file with all OSM way geometries that may need a oneway tag. You can load this file into JOSM, using its GeoJSON plugin. To organize your work going through these, I would recommend using the Todo plugin and add the GeoJSON features to the todo list.
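For a scripted alternative to the QGIS export, the sketch below turns rows from the CSV into a GeoJSON FeatureCollection. It is only a rough approximation of the steps above: a plain bounding box stands in for the country polygon, and the box values and function names are hypothetical.

```python
import json


def parse_linestring(wkt):
    """Turn 'LINESTRING(lon lat,...)' into [[lon, lat], ...], the order GeoJSON expects."""
    inner = wkt[wkt.index("(") + 1 : wkt.rindex(")")]
    return [[float(v) for v in pair.split()] for pair in inner.split(",")]


def in_bbox(coords, west, south, east, north):
    """True if every vertex lies inside the box (a crude stand-in for a country polygon)."""
    return all(west <= lon <= east and south <= lat <= north for lon, lat in coords)


def to_feature_collection(rows, bbox):
    """Build a GeoJSON FeatureCollection from CSV rows whose geometry falls inside bbox."""
    features = []
    for row in rows:
        coords = parse_linestring(row["theGeom"])
        if not in_bbox(coords, *bbox):
            continue
        features.append({
            "type": "Feature",
            "geometry": {"type": "LineString", "coordinates": coords},
            # Keep every CSV column except the raw WKT as a feature property.
            "properties": {k: v for k, v in row.items() if k != "theGeom"},
        })
    return {"type": "FeatureCollection", "features": features}


# Two rows from the sample file, shaped the way csv.DictReader would return them.
rows = [
    {"wayId": "33555379", "status": "INVALID",
     "theGeom": "LINESTRING(4.999815 47.34294,4.999957 47.343062,4.999965 47.34315)"},
    {"wayId": "17271190", "status": "OPEN",
     "theGeom": "LINESTRING(11.070503 50.139245,11.070525 50.139213,11.070616 50.139099,11.070693 50.139032)"},
]

# Hypothetical box (west, south, east, north) around the second row, not a real country outline.
collection = to_feature_collection(rows, (11.0, 50.0, 11.2, 50.3))
print(json.dumps(collection, indent=2))
```

A bounding box will pull in features from neighbouring countries, so for real work the polygon-based spatial query in QGIS remains the better choice.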


New Features and Enhancements in Cygnus+

Cygnus is the Telenav Mapping conflation tool. We use it a lot internally to compare approved external data sources with existing OSM data, but there is also a public version. We outlined how it works in an earlier blog post. In this post, I want to highlight some of the newer features in Cygnus. These new features are based on the feedback from our team of Map Analysts, who use the tool in their day to day work.

Discarding Very Short Segments

Cygnus outputs the differences in geometry between existing OSM data and the spatial data that we want to use to improve OSM. Sometimes, when the differences are very tiny, Cygnus used to export very short ways. These are not really meaningful enhancements, and clutter up the result data. Therefore, we implemented a length filter. Ways shorter than a defined length threshold will not be included in the output. Based on experience, we set the default to 5 meters. In the internal (command line) version our team uses, this can be tweaked using a parameter. In the public web version, this is not yet possible. We can consider adding it if there is sufficient demand.
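To illustrate the idea, here is a minimal Python sketch of a length filter. It is not the Cygnus implementation: the haversine distance, the function names, and the sample coordinates are all assumptions made for the example. Only the 5 meter default comes from the text above.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def way_length_m(coords):
    """Sum the segment lengths of a way given as a list of (lon, lat) tuples."""
    return sum(haversine_m(*a, *b) for a, b in zip(coords, coords[1:]))


def drop_short_ways(ways, min_length_m=5.0):
    """Keep only ways that are at least min_length_m long."""
    return [w for w in ways if way_length_m(w) >= min_length_m]


# Made-up coordinates on the equator: one way of a few meters, one of roughly 111 meters.
short_way = [(0.0, 0.0), (0.00002, 0.0)]
long_way = [(0.0, 0.0), (0.001, 0.0)]

kept = drop_short_ways([short_way, long_way])
print(len(kept), round(way_length_m(long_way), 1))
```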

An example of Cygnus in action. It finds an opportunity for improvement (possibly incorrect street name) as well as a false positive (degraded road geometry)

Road Names

Cygnus compares not only road geometry, but also road names. An annoying side effect we noticed is that road names are often not exactly the same in OSM as they are in the external data we compare with. This does not mean that the external data is necessarily better. For example, OSM could say that the name of a road is “River Road”, and the external data source could say it is “River Rd”. This is not a meaningful difference, and we would want to exclude such cases most of the time. So we added a string-distance-based threshold in Cygnus to filter out similar strings. It is set to a sensible default which, again, can be tweaked in the command line version we use internally, but not yet in the web version.
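As an illustration only, the Python sketch below shows how a similarity threshold can suppress differences like “River Road” versus “River Rd”. The post does not say which string distance Cygnus uses, so the `difflib` ratio, the 0.8 threshold, and the function names here are assumptions.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Similarity between two names in the range 0..1, ignoring case."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def is_meaningful_name_change(osm_name, source_name, threshold=0.8):
    """Report a name difference only when the two names are less similar than the threshold.

    The 0.8 default is a hypothetical value chosen for this example.
    """
    return name_similarity(osm_name, source_name) < threshold


for osm_name, source_name in [("River Road", "River Rd"), ("River Road", "Lake Street")]:
    verdict = "report" if is_meaningful_name_change(osm_name, source_name) else "skip"
    print(osm_name, "|", source_name, "|", verdict)
```

A ratio-based measure is convenient because it needs no abbreviation table, but it can also hide real changes between short, similar names, which is one reason to keep the threshold adjustable.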

Another Cygnus improvement related to road names is to ignore name differences on certain types of ways: roundabouts and service roads. Roundabout ways in OSM do not have names by convention, unless the roundabout itself has a name, so they should generally not be added. Service roads technically can have names in OSM, but it is not common. In external data, they do sometimes have names, but if they do, it usually does not make sense to add them to OSM. Based on our experience, they often have descriptive names like ‘driveway’ or ‘access road’ in the source data.

Using Cygnus

You can use Cygnus yourself by going to http://cygnus.improve-osm.org/ and uploading your source data file. You need to do a fair amount of work to prepare the source data: translating the source attributes into valid OSM tags, and converting to OSM PBF. And always remember to consider carefully what you do with the result. Cygnus is not designed to be an automated import tool. Every suggested change should be manually reviewed.

Let us know how you have used, or would like to use Cygnus!


Fire up the editors: ImproveOSM updated with many new things to fix in OSM

Our OSM team continually processes billions of anonymized GPS traces we receive through the Scout app and partners, in order to discover things potentially wrong or missing in OSM. We call this effort ImproveOSM, and it is a big part of Telenav’s overall mission to keep making OSM even better.

Missing Roads in Northern Brazil. The denser the GPS point cloud, the more trips and the more likely you are helping people get around more accurately!

Our most recent update to ImproveOSM was a particularly big one. In the last month, we added:

  • 133 thousand missing roads tiles
    • Another 75 thousand tiles that are likely parking areas or tracks
    • Another 670 thousand (!) water tiles (see below)
  • 300 thousand suspected turn restrictions with over 50% high confidence

Using ImproveOSM data

Perhaps you have not looked at ImproveOSM data before. It is available through the ImproveOSM web site, which is based on the iD editor. The screenshots on this page are from that web site. If you know how to edit with iD, you will find it easy to work with ImproveOSM data and use it to edit OSM. We wrote a post that goes into more detail a little while ago.

If you prefer JOSM, we have created an ImproveOSM JOSM plugin as well. It works similarly to the web site: you choose what ImproveOSM data you want to see (suspected missing roads, suspected wrong one-way roads, suspected missing turn restrictions, or all of the above!) and the plugin will show you the ImproveOSM data as a separate layer. We also have a blog post about using the JOSM plugin.

Finally, a few interesting / funny examples of ImproveOSM data around the world.

ImproveOSM data points out that a new road alignment is now in use. Aerial imagery and OSM have not been updated yet. This is in northern Sweden.

Here, we stumble upon an undermapped town north of Surat, India. Of course, there are un- and undermapped areas everywhere in the world, but the ImproveOSM data shows that there are people driving around on these streets using a GPS enabled app or vehicle — people who would benefit from better OSM data in their everyday lives. It is not hard to find places like this around the world.

Finally, an animation showing clusters of ‘water’ tiles. This is a side effect of the partner data we process. Since it’s anonymized there is no way to say anything about why these traces exist. Useful for OSM? Perhaps.. Interesting? I think so!

Are you finding interesting, useful, funny or wrong data in ImproveOSM? Let us know! Happy Mapping!


Is OpenStreetMap Big Data ready?

This article was written by Adrian Bona as a draft for a talk at State of the Map US in Boulder, Colorado this past month. The talk did not make it into the program, but the technology lives on as a central part of our OpenStreetMap technology stack here at Telenav. We will continue to deliver weekly Parquet files of OSM data. Adrian has recently moved on from Telenav, but our OSM team is looking forward to hearing from you about this topic! — Martijn

Getting started with OpenStreetMap at large scale (the entire planet) can be painful. A few years ago we were a bit intrigued to see people waiting hours or even days to get a piece of OSM imported in PostgreSQL on huge machines. But we said OK … this is not Big Data.

Meanwhile, we started to work on various geospatial analyses involving technologies from a Big Data stack, where OSM was used, and we were again intrigued, as the regular way to handle the OSM data was to run osmosis over the huge PBF planet file and dump some CSV files for various scenarios. Even if this works, it’s sub-optimal, and so we wrote an OSM converter to a big data friendly columnar format called Parquet.

The converter is available at github.com/adrianulbona/osm-parquetizer. Hopefully, this will make the valuable work of so many OSM contributors easily available for the Big Data world.

How fast?

Less than a minute for romania-latest.osm.pbf and ~3 hours (on a decent laptop with SSD) for the planet-latest.osm.pbf.

Getting started with Apache Spark and OpenStreetMap

The converter mentioned above takes one file and not only converts the data but also splits it into three files, one for each OSM entity type. Each file basically represents a collection of structured data (a table). The schemas of the tables are the following:

node
 |-- id: long
 |-- version: integer
 |-- timestamp: long
 |-- changeset: long
 |-- uid: integer
 |-- user_sid: string
 |-- tags: array
 |    |-- element: struct
 |    |    |-- key: string
 |    |    |-- value: string
 |-- latitude: double
 |-- longitude: double

way
 |-- id: long
 |-- version: integer
 |-- timestamp: long
 |-- changeset: long
 |-- uid: integer
 |-- user_sid: string
 |-- tags: array
 |    |-- element: struct
 |    |    |-- key: string
 |    |    |-- value: string
 |-- nodes: array
 |    |-- element: struct
 |    |    |-- index: integer
 |    |    |-- nodeId: long

relation
 |-- id: long
 |-- version: integer
 |-- timestamp: long
 |-- changeset: long
 |-- uid: integer
 |-- user_sid: string
 |-- tags: array
 |    |-- element: struct
 |    |    |-- key: string
 |    |    |-- value: string
 |-- members: array
 |    |-- element: struct
 |    |    |-- id: long
 |    |    |-- role: string
 |    |    |-- type: string

Now, loading the data in Apache Spark becomes extremely convenient:

val nodeDF = sqlContext.read.parquet("romania-latest.osm.pbf.node.parquet")
nodeDF.createOrReplaceTempView("nodes")

val wayDF = sqlContext.read.parquet("romania-latest.osm.pbf.way.parquet")
wayDF.createOrReplaceTempView("ways")

val relationDF = sqlContext.read.parquet("romania-latest.osm.pbf.relation.parquet")
relationDF.createOrReplaceTempView("relations")


From this point on, the Spark world opens up and we can either play around with DataFrames or use the beloved SQL that we all know. Let’s consider the following task:

For the most active OSM contributors, highlight the distribution of their work over time.

The DataFrames API solution looks like:

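import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.{count, year}
import org.apache.spark.sql.types.TimestampType
import sqlContext.implicits._ // enables the $"column" syntax

// The timestamp column holds milliseconds; convert it to a proper timestamp.
val nodesWithTime = nodeDF
    .withColumn("created_at", ($"timestamp" / 1000).cast(TimestampType))
nodesWithTime.createOrReplaceTempView("nodes")

// Names of the ten users with the most nodes.
val top10Users = nodesWithTime.groupBy("user_sid")
    .agg(count($"id").as("node_count"))
    .orderBy($"node_count".desc)
    .limit(10)
    .collect
    .map({ case Row(user_sid: String, _) => user_sid })

// Node counts per user and year, restricted to those ten users.
nodesWithTime.filter($"user_sid".isin(top10Users: _*))
    .groupBy($"user_sid", year($"created_at").as("year"))
    .agg(count("id").as("node_count"))
    .orderBy($"year")
    .createOrReplaceTempView("top10UsersOverTime")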


The Spark SQL solution looks like:

select 
    user_sid, 
    year(created_at) as year,
    count(*) as node_count
from 
    nodes
where 
    user_sid in (
        select user_sid from (
            select 
                user_sid, 
                count(*) as c 
            from 
                nodes 
            group by 
                user_sid 
            order by 
                c desc 
            limit 10
        )
    )
group by 
    user_sid, 
    year(created_at)
order by 
    year


Both solutions are equivalent, and give the following results:

[Figure: yearly node counts for the ten most active contributors]

Even if we touched only a tiny piece of OSM, there is nothing to stop us from analyzing and getting valuable insights from it, in a scalable way.

If you are curious about more advanced interaction between OpenStreetMap and Apache Spark, take a look at this Databricks notebook.

OpenStreetMap Parquet files for the entire planet?

Telenav is happy to announce weekly releases of OpenStreetMap Parquet files for the entire planet at osm-data.skobbler.net.


Find your MapRoulette Challenge

MapRoulette is a fun way to spend a few minutes (or hours…) improving OpenStreetMap. MapRoulette will present you with a random, easy to solve issue in OSM. MapRoulette is organized in ‘Challenges’, groups of tasks that are of the same nature. For example, there is a challenge to add missing crosswalks in various areas in Switzerland, based on analysis of aerial images.

How do you find a challenge you would like to work on? The MapRoulette home page provides a map of all the challenges, but this has some shortcomings. The challenge ‘centers’ are not always representative of where the tasks actually are located. It is also hard to search by topic. MapRoulette also has a search bar that you can use to find a challenge by keyword.

I want to work on making it much easier to find interesting MapRoulette challenges, and I would like to hear from you how you think that should work. Please add a comment below with your ideas!

In the meantime, I made a page that lists the most popular and newest challenges. It is a bit of a hack so let me know if it stops working 😉

Happy Mapping!


Help fix up TIGER v1 ways

Old, untouched TIGER ways are still abundant in OSM 🙁 and fixing them up seems to be an endless task.

ugh!

I don’t know why I didn’t do this before, but I finally got around to making a MapRoulette challenge so we can fix them together:

>> Go to the challenge <<

Because the number of old TIGER ways is huge, this challenge covers only a tiny part of the U.S. as you can see here:

Once this part is done, we can reload the challenge with more old TIGER ways.

If you look at the screenshot above, you can also see what the query is that goes into Overpass to create the challenge in the first place. You can easily adapt it to make your own local challenge if you want to start fixing up old TIGER ways with your local mapping friends! (Why not organize a TIGER fixing party? OSM US will pay for pizza!)

If you’re interested in the Overpass details and some ideas for improving it, keep reading. Otherwise, just start fixing! 

Query Overpass for old TIGER

Here is my extremely simplified way to query Overpass for old TIGER ways:

way[highway]["tiger:tlid"](40, -113, 41, -111);
out body geom qt;

It takes the bounding box (40, -113, 41, -111) and searches for ways that have the highway tag as well as the tiger:tlid tag. This query should be a pretty good approximation of a real old TIGER way query, because the tiger:tlid tag is removed automatically when you edit such a way in iD or JOSM. So any way that still has this tag must not have been edited since the import.

This query falls short of a real old TIGER ways query, because the nodes that make up the way may very well have been edited. I am also not 100% sure under which circumstances the editors remove the tiger:tlid and other unnecessary TIGER import tags. It may be safer to look for last edited date or version number. If you have suggestions for improvement, please let me know in the comments.
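One way to try the version number idea is to request metadata from Overpass and filter the response in a script. The Python sketch below only builds the query text and filters an already-downloaded JSON response, and it makes no network call. The function names and the sample elements are made up, and treating version 1 as "untouched" is an assumption: a way touched only by an automated edit would be excluded by this test.

```python
def build_query(south, west, north, east):
    """Overpass QL for highway ways that still carry tiger:tlid, with metadata in JSON."""
    bbox = f"{south},{west},{north},{east}"
    return (
        "[out:json];\n"
        f'way[highway]["tiger:tlid"]({bbox});\n'
        "out meta geom qt;"
    )


def untouched_ways(elements):
    """Keep ways whose version is still 1, meaning no edit since they were created."""
    return [e for e in elements if e.get("type") == "way" and e.get("version") == 1]


query = build_query(40, -113, 41, -111)
print(query)

# Made-up elements shaped like the "elements" list of an Overpass JSON response.
sample_elements = [
    {"type": "way", "id": 1, "version": 1, "tags": {"highway": "residential"}},
    {"type": "way", "id": 2, "version": 4, "tags": {"highway": "residential"}},
]
print([e["id"] for e in untouched_ways(sample_elements)])
```

The `meta` output level is what adds the version and timestamp fields to each element, so the same response could also be filtered on the last edited date.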

Happy mapping!


More and Updated Data for ImproveOSM

ImproveOSM has been updated with many new roads. We processed recent GPS data from a number of data partners with some great results. A total of 30,000 new missing road tiles were added, over 17,000 in Indonesia alone.

Aside from the missing roads, we added 67,000 potential missing one-way roads that we detected with high confidence. Internal testing revealed only 6% false positives.

We are happy to continue providing OSM mappers with high quality data about missing things in OSM based on billions of GPS traces. Because ImproveOSM is based on actual drives from people using navigation or mapping software in their vehicles, and we apply a pretty high threshold for number of trips and quality of the GPS data, you can be pretty confident that every ImproveOSM feature will lead you to something you can add to OSM, even if the aerial imagery is poor.

You should see the new data in your ImproveOSM plugin or on the ImproveOSM web site very shortly. Happy mapping and let us know what you mapped using ImproveOSM!
