Mapping a city’s flow using Uber data

Over the past few months we have seen some great data visualisations tackling a range of topics, from Google’s Hurricane Sandy map to Foursquare’s Big Data in the Big Apple map, which I wrote about. I write about visualisation for two reasons. One, I want to stress how important visualisation is for making (big) data accessible. Two, visualisation is a perfect way to tell stories about and with data and to provide new insights about places such as cities. Today’s post focuses on the second reason.

Uber, a company based in San Francisco, has introduced a smartphone app that lets available taxi drivers and cab-seeking riders find one another. It first launched in its hometown, but is now active in many cities in the US as well as in Paris, London and Toronto, and is secretly finding its way into Sydney and Amsterdam.

Bradley Voytek (no stranger to our blog; we interviewed him a while back) is turning Uber’s ridership data into a map of a city’s flow. Here’s how it works:

“The neighborhoods are outlined in grey and at the centroid of each neighborhood is a circle, the size of which represents the proportion of rides that flow out of that neighborhood. The circles are colored according to which statistically-identified subnetwork they belong. Every neighborhood that sends a ride in has a line of the same color as the source neighborhood connecting it to its destination. The weight of each line represents the proportion of rides that go from the source neighborhood to its target. Technically speaking this is a weighted digraph.”
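To make the description above a bit more concrete, here is a minimal Python sketch of how such a weighted digraph could be built from ride records. It is not Voytek's code: the ride list, the neighborhood names "A", "B" and "C", and the function `build_flow_graph` are all made up for illustration. It also assumes that the circle size is a neighborhood's share of all rides and that a line's weight is the share of the source neighborhood's rides going to that target. The statistical identification of subnetworks (the colors) is left out.

```python
from collections import Counter, defaultdict

# Hypothetical ride records as (source, destination) pairs.
# These are invented for illustration, not real Uber data.
rides = [
    ("A", "B"),
    ("A", "B"),
    ("A", "C"),
    ("B", "A"),
    ("C", "A"),
]


def build_flow_graph(rides):
    """Build node sizes and edge weights for a weighted directed graph.

    node_size[n]        = share of all rides that start in neighborhood n
    edge_weight[s][d]   = share of rides starting in s that end in d
    (both definitions are assumptions about the quoted description)
    """
    pair_counts = Counter(rides)                  # rides per (source, destination)
    out_counts = Counter(src for src, _ in rides)  # rides leaving each source
    total = len(rides)

    node_size = {hood: count / total for hood, count in out_counts.items()}

    edge_weight = defaultdict(dict)
    for (src, dst), count in pair_counts.items():
        edge_weight[src][dst] = count / out_counts[src]

    return node_size, dict(edge_weight)


node_size, edge_weight = build_flow_graph(rides)
```

In this toy data, neighborhood "A" is the source of three of the five rides, so it would get the largest circle, and its line to "B" would be thicker than its line to "C".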

Here are NYC and Washington:

So, what does this information tell us? It identifies networks of “related” neighborhoods and finds the neighborhoods that are the “hub” of the city, into and out of which the most people flow. That is roughly what the Foursquare map showed as well. It seems that most people tend to stay within a neighborhood, taking relatively short rides. So what makes these places so popular?

These types of questions can be answered by combining data sets. As an example: if we take the popular neighbourhoods of a city based on the Uber data and combine them with the kinds of Foursquare check-ins these areas generate, and maybe some Yelp reviews with specific location tags, we can get pretty good insight into what makes a particular part of the city popular. It’s about combining data sets and their individual types of data to create a data layer that enables a new way to look at things, or in this case a city. That can be a useful add-on to the data that cities already use.
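As a rough illustration of what "combining data sets" could look like in practice, here is a small Python sketch that joins three per-neighborhood summaries into one layer. Everything in it is hypothetical: the numbers, the neighborhood names, the check-in categories and the function `combine_layers` are invented, and none of it reflects the actual Uber, Foursquare or Yelp APIs or data formats.

```python
# Hypothetical per-neighborhood summaries (invented values, not real data).
uber_ride_share = {"A": 0.6, "B": 0.2, "C": 0.2}
foursquare_checkins = {
    "A": {"bar": 120, "restaurant": 80},
    "B": {"office": 200},
}
yelp_review_counts = {"A": 45, "C": 10}


def combine_layers(ride_share, checkins, review_counts):
    """Join the three data sets on neighborhood name into one layer."""
    combined = {}
    for hood, share in ride_share.items():
        categories = checkins.get(hood, {})
        # Most frequent check-in category, or None if there is no check-in data.
        top_category = max(categories, key=categories.get) if categories else None
        combined[hood] = {
            "ride_share": share,
            "top_checkin_category": top_category,
            "review_count": review_counts.get(hood, 0),
        }
    return combined


layer = combine_layers(uber_ride_share, foursquare_checkins, yelp_review_counts)
```

With this toy input, the entry for "A" pairs a high ride share with "bar" as its top check-in category, which is the kind of hint about why a neighborhood is popular that a single data set would not give on its own.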
