This past Thursday (15th October 2015) I was invited to give my comments at an event alliteratively titled “An open dialogue on open data.” The dialogue was organized by InterNews and Transparency International and held at the Sri Lanka Press Institute in Colombo.
I was part of a panel that included Nalaka Gunawardene and Sanjana Hattotuwa. I was asked to speak on the challenges and issues of Big and Open Data. The topic itself is a bit of a misnomer in Sri Lanka, since there are currently no datasets in the country that can be considered (or are even amenable to being considered) both “big” and “open”.
As a preamble to my comments I used some brief slides to highlight LIRNEasia’s ongoing big data research, which leverages mobile network big data to produce insights for developmental policy.
I summarize here some of my comments at the event as well as my retrospective reflections:
- Irrespective of whether we are talking about big data or open data in Sri Lanka, some cross-cutting issues apply to both.
- Standardization is still a big issue amongst the datasets generated by different government agencies. For example, the number of divisional secretariat divisions (DSDs) in Sri Lanka differs amongst agencies. The Department of Census and Statistics says there are 331 as of the last census, but the latest shape files from the Survey Department only have 229. And that is leaving aside the fact that the identifiers themselves don’t match, that each agency uses varying spellings for DSD names, and that in some cases the names are not even the same.
- Accountability and incentives: It’s all well and good for us to sit on the outside and say that there must be policies that open up government data, but we also need to understand some of the genuine concerns of government officials. For example, having worked on agriculture issues, I know how much agriculture department officials (not just in Sri Lanka but also in other countries in our region) worry about who will be liable or accountable if third parties who use the government data to offer crop advisory services give incorrect advice. Understanding these concerns is what prompted LIRNEasia to conduct its current line of agriculture research (partnering with the Department of Agriculture) to digitize and open up some crop advisory information for specific crops in the export value chain in Sri Lanka. LIRNEasia will use the lessons from this engagement to offer concrete suggestions on how government data can be opened up and used beneficially by others.
- With respect to privacy, we need to understand cultural contexts before we import western models directly to our countries. Privacy is a very personal issue, and different people articulate it in different ways through use cases (e.g. I want to be told of discounts on the purchase of cigarettes, but my purchasing history shouldn’t be shared with my health insurance company). We also need to understand that our privacy needs evolve (e.g. in the early 1900s one could be arrested for using a camera in NY’s Central Park, something that is unthinkable today). We have a variety of solutions for protecting Personally Identifiable Information (PII), such as pseudonymization, but we also have to worry about how such pseudonymized datasets can be combined with other public datasets to infer PII.
- Finally, if we want more data easily available to the public, we have to build ecosystems and upgrade the capacities of people to analyse that data or, for that matter, even consume the insights from such data. Data-driven journalism requires, first and foremost, numerate journalists.
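
To make the standardization point above a little more concrete, here is a minimal sketch of how one might start reconciling DSD names between two agency lists when the spellings differ. It uses Python's standard `difflib` for approximate string matching. The function names, the 0.8 similarity cutoff and the sample spellings are all my own assumptions for illustration; they are not taken from the census or survey department files, and any real reconciliation would still need manual review.

```python
import difflib
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents, and keep only letters and digits."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "".join(ch for ch in ascii_only.lower() if ch.isalnum())


def match_names(left, right, cutoff=0.8):
    """Map each name in `left` to its closest name in `right`, or None.

    `cutoff` is an assumed similarity threshold (0 to 1), not a tuned value.
    """
    by_key = {normalize(name): name for name in right}
    matches = {}
    for name in left:
        candidates = difflib.get_close_matches(
            normalize(name), list(by_key), n=1, cutoff=cutoff
        )
        matches[name] = by_key[candidates[0]] if candidates else None
    return matches


# Illustrative spellings only; not drawn from either agency's dataset.
census_names = ["Thimbirigasyaya", "Dehiwala", "Kolonnawa", "Padukka"]
survey_names = ["Timbirigasyaya", "Dehiwela", "Kolonnawa"]

matches = match_names(census_names, survey_names)
unmatched = [name for name, found in matches.items() if found is None]

for census_name, survey_name in matches.items():
    print(f"{census_name} -> {survey_name}")
print("Needs manual review:", unmatched)
```

Names that find no counterpart end up in `unmatched`, which is where the harder cases (a DSD missing from one list, or carrying an entirely different name) would have to be resolved by hand.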