transcript: Open Knowledge Australia: VicRoads open data talk

Open Knowledge group presenting a talk by VicRoads, hosted at ThoughtWorks, 4/16/2015: 

This transcript is made straight from the notes I took at the event – it is definitely missing some comments, and quite likely there are mistakes in what I did capture. If you see any, comment here or email me and I will fix them. 

Three guys (Adrian Porteous, Evan Quick, Phil Reid) from VicRoads talking about their open data – current, planned, and what people would want in the future.

Steve Bennett, from Open Knowledge, has worked with VicRoads and has many open data mapping projects – http://stevebennett.me/, http://cycletour.org/, http://www.opentrees.org/, openbins.org, opencouncildata.org and melbdataguru.tumblr.com

 

In 2013-ish the Victorian government created a mandate to make government data open where possible. VicRoads specifically had already been doing this, with crash data etc. Public Transport Victoria is under heavy pressure to make their data available in the General Transit Feed Specification (GTFS) format for e.g. Google Maps to use, and VicRoads is under similar pressure to provide real-time traffic data. Unfortunately they don’t currently have the infrastructure to serve everything, inventories of what they have available, or even a good understanding of what data would be useful to the public.

 

Their focus right now is on:

  • Real-time traffic data
  • Integrating the released data with spatial data
  • Releasing an inventory of available (and externally relevant) data

 

Steve was seconded from his position at the University of Melbourne for six weeks in December to start looking at what they had and help them figure out what to do. While there, he found one pretty cool dataset: every two years, VicRoads runs cars with cameras along every road they own (freeways and main roads in Victoria) to get a Google Street View-like set of images, which they use in the office to decide on maintenance priorities, etc. They were looking at this dataset wondering if they could somehow automate road sign recognition in it to help with inventory management (road signs are all owned by VicRoads, including every speed zone sign in the state). Steve said that they could actually publish this complete dataset, about 2.5 terabytes worth, on VicNode. (VicNode is a service for hosting research data by scientists across Victoria: Steve also works with them). Once hosted there, they could submit it to Mapillary, a crowdsourced Google Street View analog. This started extensive discussions about who owned the dataset, what restrictions there were on sharing it, etc – but now it’s on track to be on Mapillary this week (update: approval came through and the uploading is in process!).

This is a great example of how they hope getting open-data people involved will help them recognise data that can be published and find ways to do it easily. Part of the reason they are here giving this talk is to solicit ideas from people here about what they want from VicRoads – what kind of data do they want, what quality do they need, what format would be useful, etc.

 

Question time!

Q. How does real-time traffic data collection work?
A. Every traffic signal has sensors, mainly to make adaptive light timing work. There are 6-12 detectors per approach at an intersection.

Q. So collection on rural roads [that have no traffic lights] is not as good?
A. Correct. They also do a lot of manual collection, where you lay a cord across the road and count vehicles that way, and all freeways have detectors every 500m. This was all set up for traffic management purposes, but once the data was being collected they started using it for other things. They’ve also set up Bluetooth sensors – the BT sensor sees your phone broadcasting a Bluetooth ID at the start of a collection path, and then another sensor x miles later sees you again and they aggregate that data into trip time information. Also, the entire taxi fleet has GPS data that VicRoads can track.
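The talk didn't go into how the Bluetooth matching is actually implemented, so the following is only a rough sketch of the idea as I understood it: pair up sightings of the same ID at the start and end of a path, then aggregate the elapsed times. The function names, the `(device_id, timestamp)` input format and the `max_seconds` cutoff are all my own assumptions, not anything VicRoads described.

```python
from statistics import median


def trip_times(upstream, downstream, max_seconds=3600):
    """Pair sightings of the same Bluetooth ID at two sensors.

    upstream / downstream: lists of (device_id, timestamp_in_seconds).
    This input format is hypothetical, chosen only for illustration.
    Returns one elapsed time (seconds) per matched device.
    """
    # Earliest sighting of each device at the start of the path.
    first_seen = {}
    for device_id, ts in upstream:
        if device_id not in first_seen or ts < first_seen[device_id]:
            first_seen[device_id] = ts

    times = []
    matched = set()
    # Walk the downstream sightings in time order; use the first valid one per device.
    for device_id, ts in sorted(downstream, key=lambda s: s[1]):
        start = first_seen.get(device_id)
        if start is None or device_id in matched:
            continue
        elapsed = ts - start
        # Drop impossible or very long trips (e.g. someone stopped for lunch).
        if 0 < elapsed <= max_seconds:
            times.append(elapsed)
            matched.add(device_id)
    return times


def summarise(times):
    """Aggregate individual trips into a single figure, hiding individual IDs."""
    if not times:
        return None
    return {"vehicles": len(times), "median_seconds": median(times)}


if __name__ == "__main__":
    start_sensor = [("a", 0), ("b", 10), ("c", 20)]
    end_sensor = [("a", 300), ("b", 370), ("d", 400)]
    print(summarise(trip_times(start_sensor, end_sensor)))
```

Only the aggregate leaves the function, which matches the later answer that they only look at aggregated data. A median is used here rather than a mean so that one slow straggler doesn't skew the trip time, but that is my choice, not something from the talk.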

Q. What about traffic measurement on bike routes?
A. Yes, we have traffic sensors on 30-40 of the major bike routes. Also, there are truck weight sensors in the roads that can count the number of axles on a truck as it goes over.

Q. Do these sensors catch every car going over?
A. The sensors themselves do, but we only look at aggregated data.

Q. What’s the timing of the release of CrashStats data? [https://www.data.vic.gov.au/data/dataset/crash-stats-data-extract: asked by one of the guys who runs http://www.crowdspot.com.au/ – he asks because they tried to work with this data and its release seems to be delayed]
A. There is a fairly complex process around the CrashStats data. VicRoads get it from the Victoria Police for crashes that caused injuries or fatalities, basically just as an incident report. (The Victoria Police incident report data is also published by TAC. VicRoads do also have RACV towtruck calls data for crash responses, which is less complete and not used in CrashStats?) Once the incident report is received, they investigate to identify the cause of the crash and the eventual outcome (delayed fatalities or recoveries), and once that’s finalised, they publish it to the internal dataset. This includes details like the angles and speeds of cars involved, crash diagrams, etc – any identifying details are removed before public release. Data is published to a semi-private system (available to councils) six-monthly. The public data host was built as a student project several years ago and looks like it. They are currently planning a new version and trying to decide what it needs.

Q. Are hospital records included?
A. No.
They don’t really capture details of accidents with no injuries reported – perhaps they should, perhaps it would show that those crashes are leading indicators of where fatal crashes will be. They are resource constrained on all this (nobody has a day job of releasing VicRoads data to the public) but are working hard to make the data more available. So they put a dump of 6 years of raw data on data.vic.gov.au last year for GovHack, but haven’t kept updating it. They are embarrassed.

Q. So, weighing vehicles. I live in a residential area and we get a lot of trucks coming down back streets. Do you know where they go?
A. Through the Intelligent Access program, they actually get the GPS records from most trucks. Theoretically they then get checked and told not to go off allowed routes – but in practice they mostly only get followed up on for going over bridges that aren’t rated to support their weight. The Intelligent Access data isn’t open (it is a federal program and probably the data is owned by the federal government?) but it would be awesome if it were.

Q. So those truck weight sites don’t tell you?
A. Actually they know that trucks deliberately avoid the weight sites, which are all public knowledge, because the amount of traffic the sites see is lower than the amount of traffic they know exists – but they don’t know where the trucks go instead except by looking at the federal data. If this were opened up, it’d probably be possible to get some neat matches between the two datasets using linkage mapping.
All speed signs in Victoria must be approved by VicRoads (not the case in all states), and the dataset of all speed zones is available on data.vic.gov. They are trying to get that dataset into a realtime update so that e.g. the variable speed on the West Gate Bridge is reflected online immediately.

Q. Is there any data on whether former blackspots (high accident intersections) are safer after speed zone changes?
A. There probably is, but these guys don’t know it. The Monash University Accident Research Centre is involved? Anecdotally, they think the data showed it didn’t have a huge impact and that’s why the program wasn’t continued.
VicRoads is considered a leading example of government departments providing open data, but these guys feel skeptical because they don’t feel like they are doing such a great job – for instance, they have 650 internal datasets, and only about 40 have been published as open data. The unreleased datasets include stuff like

  • bridge height measurements, which could be used for intelligent routing of trucks. Currently there are FIVE bridge strikes EVERY DAY in Victoria, which require traffic management, possibly bridge closures and repairs, etc etc.
  • maximum bridge crossing weight, ditto

so if you have a specific idea of data that would be useful to you, they might just have it already! There is a ‘suggest a dataset’ option on data.vic.gov, or you can email these guys directly to kickstart the process.

Back when they started doing open data, it wasn’t really run well: they had quotas on how many unique datasets needed to be released, but no quality control, so people released silly stuff like an entire dataset consisting of 4 values. These days, they still have issues like liability concerns on releasing data, e.g. if the bridge height data is released and contains an incorrect value and someone’s intelligent routing relies on this and directs a truck under a bridge that it hits – who is liable, for what? And this data does change: each time a road is resurfaced it rises by like an inch so the bridge clearance decreases, and if the clearance hasn’t been measured in a decade it could be several inches off.

Q. Do you have any datasets that could be crowdsourced for fixes to make them more complete/accurate?
A. They don’t have any plans (or means with which) to do so atm, although it’s a pretty interesting idea to explore in the future. IMHO it’s something that could be driven by the open data community.

Q. Is it conceivable to provide the data from traffic sensors as a realtime raw feed, or other sensors?
A. It depends on the data. Road closure information is up to the minute already, but parking spot sensors are delayed two weeks. [Questioner wants timelapse traffic models of particular streets at different times of day to evaluate development decisions]. So some of it might not be realtime, but we have built an app that shows average traffic for any road at a chosen time of day, which is already publicly available. (??Didn’t catch where this is, couldn’t find it online).

Q. On those five bridge strikes a day that were mentioned earlier: is that data available anywhere?
A. No, actually. Someone must collect it, for instance there is a specific bridge strike response team – but they just write up freeform reports and there’s no system that parses those for the bridge name/location etc.

 

Currently on data.vic.gov: mostly the high-interest datasets like heavy vehicle information (allowed routes vary by weight of truck, size, day, time of day, etc – this is all used by trucking companies), speed sign locations, and crash stats (available both as a Java app and a raw dump) which covers all accidents with injuries since 2005 (internally they have this data back to 1986). They recognize that they aren’t doing well at keeping released datasets up to date, and this is one reason they are interested in ArcGIS Online (vicroadsopendata.vicroadsmaps.opendata.arcgis.com) – instead of having to format data for data.vic.gov, they could set up an automated export of data in the format they already have (they are an ESRI shop) and just upload it there, and somehow get data.vic.gov to see the updates without anyone doing anything else. ArcGIS Online also has some basic charting capabilities built in to run on the datasets, and automatically serves it in multiple formats.

 

There is a Road Use Priority dataset showing which population a road is primarily intended to serve – e.g. truck routes, bike routes, pedestrian areas. This is an example of how VicRoads has shifted its purpose from road management to transport systems management.

Q. Is that traffic data for bikes available?
A. Yes, on data.vic.gov, and they are about to release a bike traffic modeller like the car traffic modeller mentioned above.

Q. Do traffic light sensors count bikes?
A. Hm, they’re not sure. Anecdotally, bikes in car lanes trigger traffic lights – they definitely have sensors in bike lanes and also in some of the main trails around Melbourne.

Q. Could residents on a backroad rat run volunteer to pay for e.g. a Bluetooth sensor that will track how many cars are going through, to get it looked at?
A. Actually, councils already have temporary counters available – when a complaint is made about inappropriate traffic, the council installs it (costs a couple hundred dollars to run for a couple weeks), then looks at the collected data and decides whether there is a real problem that needs traffic management. So all these councils have small collections of data about back streets, and VicRoads has started wondering if they can do something with it.

They don’t currently have a way to aggregate data from multiple councils (but opencouncildata.org is trying to work on this), or from any crowdsourced data collection. They probably won’t get around to this, and an ideal solution is that VicRoads releases what they have, the councils release what they have, and the internet aggregates it all.

 

edited with some corrections from Steve 4/23

transcript: Reining in the Surveillance State (an ACLU Town Hall event)

Delayed publishing of a transcript I made from the March 11 ACLU event at the Seattle Town Hall. This is all from my own notes at the time so is probably missing pieces and adding inaccuracies – there is a video of the event available to check, but I haven’t done so myself. Putting it up anyway because transcripts are way more usable than videos.

Activist technology researchers = hackers

Phones are designed from the ground up to be logged to the government.

FBI tried to control encryption tech, eventually they allowed PGP encryption to be exported. Twenty years later nobody uses it because it is unusable. FBI predicts that unbreakable encryption will lead to invulnerable criminals.

Apple says there is no way for them to ever decrypt an iMessage you send, and turns this encryption on by default. WhatsApp also uses great encryption, says nobody can eavesdrop on it. And people use this all the time even without looking for encryption. (Chris recommends an app called Signal – free and encrypted. But requires you to get your friends on it.)

And Microsoft? Cooperated with PRISM for Skype, Outlook.com, says they were required to comply. Says that their tools offer the same level of security as a regular call aka not much.

How will the government respond to people using these encrypted tools?

Chris knew that other governments were buying hacking tools. He looked up whether the FBI was doing it and confirmed in 2012 that they were. He found they could hack into your computer, use webcams without turning on that light, etc. The first court order allowing it was made in 2002 and this court order was released to the public in 2012.

Besides the FBI, local police now have DHS grants to afford this stuff. Drones, Stingray, etc – all DHS funds. Suppliers like Raytheon etc built them for the military but that’s a finite market, so they expanded into the domestic police market. He says he thinks reasonable people can debate whether these tools are appropriate for use in Afghanistan etc, but that we can agree that tools developed for a hostile warzone are not appropriate in a domestic environment (paraphrase not quote). And because the money is federal, they don’t have to go to the city council or local authorities and debate the value of it and ask for money. The argument the police use is that if they debated it publicly then it would tip off the bad guys. And that’s the conflict in the area of surveillance. And no debate also means no oversight: even the courts granting warrants to use this don’t know what they’re going to do.

Re: Tacoma police using Stingray: they say they always get court orders, but the judges say they have never heard of this. The judges were mad and have made the police be more specific – after some frontpage stories. And similar around the country, but only so far around cell phone tracking, not computer hacking. And we won’t get that until someone proves it is happening.

Question time:

Q: Come to our march on April 14 to protest police violence which is genocide against black and poor people

A: Really damn good redirect by Chris back to surveillance disproportionately affecting the poor – WhatsApp, encrypted messaging, available by default on expensive iPhones but not at all on cheap Walmart phones.

 

Q: So you’ve told us that this is bad, but is using WhatsApp actually reining it in? And how is it so much more important today than 100 years ago when telephone operators had party lines? Is it because people buy in to the line that it is for our own safety that there’s no uproar?

A: It is about economics. If the government really cares about you they will line your house with cameras. But probably nobody in this room is worth $1million in surveillance tools to anyone. So what we are doing by using Signal is raising the cost of surveillance. And about uproar it is probably because it is abstract to people, until you have red light cameras and webcams enabled on children’s school iPads.

 

Q: big data is not a government thing but it finds stuff.

A: effectively they have had all our secrets for decades, but now they can find it and connect it. With facial recognition for instance you can suddenly make so many more connections between all our personalities. Chris is much more worried about facial recognition tech than big data per se.

Q: what do you expect from social media companies regarding surveillance?

A: It’s really difficult to get companies to do something against their interest. Getting them to retain less data is against their business model and I haven’t been very successful there. And realistically I don’t expect them to change that until they find another way to make money. Google is trying desperately to find another way to make money but until they are out of advertising they will need your data.

 

Q: who do we make FoIA requests to in order to find out who is giving our police money?

A: DHS (within which most of the grants are handed out by FEMA) and the Department of Defense. Expect them to take a long long time, so also file the requests against your local agency receiving the money. And in Washington our state public records act is much stronger than FoIA.

 

Q: New research at UW shows that watching power consumption at a house can tell the difference between two different TVs of the same model being turned on, and Chris thinks that power data is under-protected. What is the ACLU doing about this?

A: This is a state level fight, the ACLU has so far done best in California in concert with the EFF – results there include power companies releasing reports on police requests, which show us that the most requests are made in San Diego.

 

Q: metadata is not protected by encryption, what to do there?

A: we don’t have great tools for that yet except e.g. Tor, or tunneling, but those are kind of slow and not good enough for say video chat, and also none of them can protect you from the cell tower knowing who you connected to.

 

Q: ?? Missed it

A: so phones have encryption keys that are supposed to protect your communication with the cell tower. Recently GCHQ hacked these from Gemalto, which provides SIM cards to AT&T as well as RFID passport chips. So we can’t trust the phone network for privacy even if they kind of wanted to provide it. And remember that this eavesdropping ability of having the keys will eventually be available to your local police.

 

Q: the EU is way ahead of us, can we get data protection like them?

A: haha realistically lol not from Congress (note: not an exact quote). Technology has the ability to protect us where the laws never will. But we rely on these mega corporations to provide it and so we have to get them to play along. And if you get a law passed for a specific state or industry then often those protections will be built in for everyone because it’s easier for the company to do that.

 

Q: There is a bill proposed by a guy in Congress, not passed (yet..)? (Russ Feingold?)

A: there are a number of bills about this. They are hard to get passed. The NSA stuff is outside Congress anyway; it is ruled by executive order. I think we need a massive overhaul of that system and my job of lobbying companies is way easier.

Q cont: I think people don’t understand that we have laws today since 2012 that allow indefinite detention of Americans and the ACLU isn’t doing enough.

 

Q: back to the smart meters, Seattle City Light is planning to put them in place over the next couple years, and we have privacy concerns and also safety concerns with fires and with frequencies and we have fliers outside.

 

Q: what about Amazon and their contracts with the CIA? Maybe people should be protesting them? And can you address the argument the FBI used that they use this against bad guys?

A: I regularly communicate with lawyers at all tech companies but it is very difficult with Amazon. Other companies like Apple are now publishing transparency reports but nothing from them. We should maybe be focusing on them more, especially as they provide a storage and copying back end now.

For the second, I think it’s important to distinguish between domestic and foreign use. For domestic, like Stingray, you can’t target that to an individual and it’s unacceptable. Internationally, it is very hard because the government says there are terrorists. But we know since Snowden that they use it not on terrorists but on interesting people, like a phone company in Belgium. And I think that spying on engineers everywhere shouldn’t be an acceptable tactic.

 

Q: something about Linux? (missed it)

A: I use Linux and I have less and less trust in closed source software, and people are working on reproducible builds where you can verify that the code you are downloading came from the source code published online.