Big Data 4 Development


Big data for social good

Posted by Rohan Samarajiva on March 4, 2018  /  0 Comments

The first post on big data on this website was in September 2011. By 2012, we were working on the topic with mobile network big data in hand. Six years ago, we were alone in the field. The meetings we had in multiple countries with multiple operators did not yield the additional data we desired. But we can be happy that our efforts, such as early dissemination at ITU Telecom World, may have contributed to a more enlightened attitude that made possible the effort described below: The GSMA has announced that more operators have joined its “Big Data for Social Good” initiative and that the first wave of trials have been conducted by Bharti Airtel, Telefonica and Telenor.
On the 13th of February, a team from LIRNEasia (Professor Rohan Samarajiva, Dr. Sujata Gamage, and myself) presented some of our research at the Trivedi Center at Ashoka University in Delhi, India. Ashoka, for those who are not familiar with it, is a private university that focuses on the liberal arts; its capital stems from philanthropic contributions. The Trivedi audience was a mix of high-level academics and students, most with a base degree in computer science. Trivedi is dedicated to putting together datasets on Indian politics.

What is not bias?

Posted by Rohan Samarajiva on February 21, 2018  /  0 Comments

Bias is an important topic in general. It is of special significance to a research organization. Issues of bias being built into models that are beginning to play significant roles in society and economy are coming to the forefront of public discourse. So we decided to talk about this topic at a Journal Club. Colleagues from University of Moratuwa’s DataSearch also attended.
Data sovereignty, or the desire of states to exert unfettered control over data associated with natural and legal persons under their jurisdiction, was seen as an issue with the highest salience at a foresight event I attended in Bengaluru a few days ago. Several of us thought that states will push for greater control in the immediate future and will be met with significant resistance from citizens and companies (varying across different kinds of data and different countries; health data may be easier than traffic data; states in big countries are more likely to prevail than those in small ones). We expected some kind of equilibrium to be achieved in around 5-10 years. The effort by law enforcement authorities in the US to compel Microsoft to hand over the contents of email stored in Ireland and the case of the Great Firewall of China came up in discussion. Here is a discussion on recent developments in China: This is worrying not just for people who want to surf the web without annoying obstructions.
I have been cautious about buying the Western media story on China’s social credit system, but I had little other than gut feeling for my caution. Now finally, here is an analysis that will balance the Western narrative, which the author says is more about what those in the West worry about and not about what is actually happening in China. Debate over the appropriate balance between security and liberty is nothing new. While technological and data management innovations have introduced new tools, it can’t surprise anyone to learn where China’s Communist Party draws the line between individual privacy and social stability. If anything, I would have thought that the plan’s faith in market forces and private business, or the efforts at government legitimacy through increasing transparency, would have been more surprising to those outside of China.
Dharmawardana, K. G. S., Lokuge, J. N.
It is natural to think of state entities as the key actors in south-south cooperation (SSC) for improving public-service delivery. But as the highlighted example of Bangladesh’s Union Digital Centers (UDCs) shows, non-state actors can play important roles in public-service innovation. If true innovation is the objective, it would behoove the UN Office for South-South Cooperation and other interested parties to cast the net wider to include innovative organizational mechanisms as well as government innovations.
By Keshan de Silva and Yudhanjaya Wijeratne
One of the most useful datasets we have is a collection of pseudonymized call detail records (CDRs) for all of Sri Lanka, largely from the year 2013. Given that Sri Lanka has extremely high cell coverage and subscription rates (we’re actually oversubscribed: there are more subscriptions than people in the country, an artifact of people owning multiple SIMs), this dataset is ripe for analysis at big data scale. We recently used it to examine attendance at the annual Nallur festival in Jaffna, Sri Lanka. Using the CDRs, we were able to analyze the increase in the population of the region during the festival. A lengthy writeup on Medium describes the work, explaining the importance of the festival and the logic for picking it.
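To give a feel for the kind of calculation involved, here is a minimal sketch, not the method used in the study. It assumes each record has already been reduced to a (pseudonymized subscriber ID, date, region) triple, counts the distinct subscribers seen in a region each day, and compares festival days against the remaining days. The record layout, the function names, the toy data and the festival dates are all invented for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def daily_unique_subscribers(records, region):
    """Map each date to the number of distinct subscriber IDs seen in `region`."""
    seen = defaultdict(set)
    for subscriber_id, day, rec_region in records:
        if rec_region == region:
            seen[day].add(subscriber_id)
    return {day: len(ids) for day, ids in seen.items()}


def festival_uplift(daily_counts, festival_days):
    """Ratio of the median festival-day count to the median baseline-day count."""
    festival = [n for d, n in daily_counts.items() if d in festival_days]
    baseline = [n for d, n in daily_counts.items() if d not in festival_days]
    if not festival or not baseline:
        raise ValueError("need at least one festival day and one baseline day")
    return median(festival) / median(baseline)


# Toy records (hypothetical): (pseudonymized ID, date, region).
records = [
    ("a", date(2013, 8, 1), "Jaffna"),
    ("a", date(2013, 8, 1), "Jaffna"),   # repeat activity counts once per day
    ("b", date(2013, 8, 1), "Jaffna"),
    ("c", date(2013, 8, 1), "Colombo"),  # other regions are ignored
    ("a", date(2013, 8, 2), "Jaffna"),
    ("b", date(2013, 8, 2), "Jaffna"),
    ("a", date(2013, 8, 3), "Jaffna"),
    ("b", date(2013, 8, 3), "Jaffna"),
    ("c", date(2013, 8, 3), "Jaffna"),
    ("d", date(2013, 8, 3), "Jaffna"),
    ("a", date(2013, 8, 4), "Jaffna"),
    ("b", date(2013, 8, 4), "Jaffna"),
    ("c", date(2013, 8, 4), "Jaffna"),
    ("d", date(2013, 8, 4), "Jaffna"),
]

# Placeholder festival dates, not the real festival calendar.
festival_days = {date(2013, 8, 3), date(2013, 8, 4)}

counts = daily_unique_subscribers(records, "Jaffna")
uplift = festival_uplift(counts, festival_days)
print(counts)
print(f"Festival-day uplift over baseline: {uplift:.1f}x")
```

Distinct active SIMs are only a proxy for people: multiple-SIM ownership and subscribers who make no calls on a given day both distort the count, so a ratio against a baseline is easier to defend than an absolute headcount.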
Blumenstock, JE, Maldeniya, D, & Lokanathan, S
A confluence is the junction of two rivers, especially rivers of approximately equal width. My session at SAARC Law 2017 is entitled Confluence of Law and Technology. The way I see it, there is no alternative but to relax the requirement that the metaphorical rivers be of equal width. Unless, of course, we define law in the Lessig manner, with East Coast Code being old-style, ink-on-paper law interpreted by judges and West Coast Code being self-enforcing rules built into hardware and software. So, anyway, being from the tech side of the world, I worked up a set of slides.
The inaugural board meeting of the Global Partnership for Sustainable Development Data (GPSDD, better known by its Twitter handle, @data4SDGs) was held on the 22nd of September. I participated as a GPSDD board member. GPSDD has made significant achievements since its inception, culminating in high-level support for the need for good data to measure the SDGs, with many nation states making statements at the UN General Assembly, which concluded just two days before the board meeting. But countries saying the right things (i.e.
Perera-Gomez, T. & Lokanathan, S.
A team of GIS experts at LIRNEasia is building an open re-demarcation tool to encourage trust in the process of electoral reforms.
Governments should not be flying blind. Now the tools of big data are available to reduce their ignorance. But we will not be able to use big data effectively if the narrative is dominated by utopian hype and dystopian scaremongering. For that we need effective, fit-for-purpose public policy and regulation for big data (including algorithms), not remnants of 1970s thinking such as informed consent and strict purpose specification. For example, the above shibboleths do not provide any remedy for the real harms of lack of security of data storage.
Big data is a team sport. We have people with different skill sets in our team. I can’t code, but I sit in on meetings where arcane details of software are discussed. Our coders spend most of their time on analytics, but think about broader issues such as fairness. So here is a snippet that caught the eye of Lasantha Fernando: If you’ve ever applied for a loan or checked your credit score, algorithms have played a role in your life.
Linnet Taylor correctly points out that US case law does not have applicability outside the US. However, the third-party doctrine set out in the Smith v Maryland case differentiated between transaction-generated data on a telecom network and the content of what was communicated. Now there’s likely to be a different governing precedent, for those under US law: The Supreme Court agreed on Monday to decide whether the government needs a warrant to obtain information from cellphone companies showing their customers’ locations. The Supreme Court has limited the government’s ability to use GPS devices to track suspects’ movements, and it has required a warrant to search cellphones. The new case, Carpenter v.