Big Data 4 Development

Here is what I wrote about Smith v Maryland and the third-party doctrine two years ago. The US government’s justification for the collection and use of telephone metadata pertaining to US citizens by the National Security Agency (NSA) exposed by Snowden was based on the third-party doctrine, derived from the above judgments (Savage, 2013). A 2013 decision from the District Court of the District of Columbia (perhaps the most important, because Washington DC is within the District) attracted significant attention because it explicitly contradicted the Smith rationale, stating that the surveillance of metadata in 2013 was qualitatively different from what was at issue in 1979. However, a subsequent decision by a District Judge from the Foreign Intelligence Surveillance Act (FISA) Court responsible for oversight of the National Security Agency’s surveillance activities reaffirmed the third-party doctrine. Until the various appeals work their way up to the Supreme Court, Smith v Maryland will continue as the ruling precedent in the US.

Helani Galpaya at GIZ, Berlin

Posted on June 17, 2018  /  0 Comments

Helani Galpaya was one of the keynote speakers at a GIZ-organized event in Berlin, Germany on the 14th of June 2018.
Slides presented by Helani Galpaya at GIZ, Berlin in June 2018.
We’ve been talking about competition as one of the major policy issues in the data analytics space. But this is an angle we had not thought about: China. Still, with the European Union enacting tough new privacy laws, and some in the United States eager to follow, Google and Facebook could soon be forced to find ways to make money beyond selling users’ personal information to advertisers, said Raj Rajgopal, president of digital business strategy at Virtusa Corporation, a consulting firm. “As profitability reduces, they’ll say, ‘Now I need to monetize my customer base,’” Mr. Rajgopal said.
It is one thing to talk about the second circle of associates in terms of lawful interception, as we did in our bulk surveillance paper that will be out of review shortly. It is quite something else to see how many records are actually yielded by surveillance of a relatively small number of targets. In 2017, 543 million records were collected from surveillance of just 40 people. In 2016, the first full year for which that replacement system was in operation, the government obtained orders to target 42 people and collected just over 151 million call detail records. In 2017, the government obtained orders for 40 targets.
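To put those figures in per-target terms, here is a minimal arithmetic sketch in Python. It uses only the numbers quoted above; the 2016 total is given as "just over 151 million", so it is rounded down to 151 million here, and the variable names are ours.

```python
# Figures as quoted in the excerpt above. The 2016 total is "just over
# 151 million", so 151_000_000 is a slight underestimate (an assumption).
records_2016 = 151_000_000
targets_2016 = 42
records_2017 = 543_000_000
targets_2017 = 40

# Average number of call detail records collected per targeted person.
per_target_2016 = records_2016 / targets_2016
per_target_2017 = records_2017 / targets_2017

print(f"2016: about {per_target_2016:,.0f} records per target")
print(f"2017: about {per_target_2017:,.0f} records per target")
print(f"Year-on-year ratio: about {per_target_2017 / per_target_2016:.1f}x")
```

Even as a rough average, this works out to millions of records per target, which is the point about the second circle of associates.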
Much of the discussion on privacy is premised on the implicit imposition of a private-property model on data or information that is subject to control/consent. This could have worked when all we were dealing with were relatively simple data like a social security number or an address. But the really interesting data are transaction-generated data (TGD). These necessarily involve more than one person. How can I give or not give consent to the use of my TGD, when multiple entities have been involved in its production?
Our paper on bulk surveillance is under review and will be public soon. We did not go deep into predictive policing because most of the extant material was US-centric. It is interesting that the Economist, which keeps putting out city-level murder data, has chosen to publish a piece on the use of data in controlling violence in Latin America’s cities: Rodrigo Guerrero, the city’s mayor and a surgeon by training, launched a plan inspired by the epidemiological approach some North American cities were taking at the time. He set up “violence observatories” where police, public-health officials, academics and concerned citizens could study crime data. This revealed that most of the city’s murders took place in drunken brawls, not in conflict between gangs, and that they were late at night a day or so after payday.
LIRNEasia was proud to partner on Sri Lanka’s first national symposium on Data for Sustainable Development. Held over two days (March 20-21), the event offered a unique platform to share LIRNEasia’s views and experience on the use of data for the SDGs, particularly in relation to our big data work. Sriganesh Lokanathan, Team Leader – Big Data research, moderated the first session of the symposium and was part of a panel discussion on day 2. I had the opportunity to showcase LIRNEasia’s experience of using big data for the SDGs at a session that was co-presented with UN Global Pulse. At the first session, Sriganesh moderated a panel that emphasized the importance of data in achieving the SDGs and set the tone for the symposium.

Big data for social good

Posted by Rohan Samarajiva on March 4, 2018  /  0 Comments

The first post on big data on this website was in September 2011. By 2012, we were working on the topic with mobile network big data in hand. Six years ago, we were alone in the field. The meetings we had in multiple countries with multiple operators did not yield the additional data we desired. But we can be happy that our efforts, such as an early dissemination effort at ITU Telecom World, may have contributed to a more enlightened attitude that made possible the effort described below: The GSMA has announced that more operators have joined its “Big Data for Social Good” initiative and that the first wave of trials have been conducted by Bharti Airtel, Telefonica and Telenor.
On the 13th of February, a team from LIRNEasia, comprising Professor Rohan Samarajiva, Dr. Sujata Gamage, and myself, presented some of our research at the Trivedi Center at Ashoka University in Delhi, India. Ashoka, for those who are not familiar with it, is a private university that focuses on liberal arts; its capital stems from philanthropic contributions. The Trivedi audience was a mix of high-level academics and students, most with a base degree in computer science. Trivedi is dedicated to putting together datasets on Indian politics.

What is not bias?

Posted by Rohan Samarajiva on February 21, 2018  /  0 Comments

Bias is an important topic in general. It is of special significance to a research organization. Issues of bias being built into models that are beginning to play significant roles in society and economy are coming to the forefront of public discourse. So we decided to talk about this topic at a Journal Club. Colleagues from University of Moratuwa’s DataSearch also attended.
Data sovereignty, or the desire of states to exert unfettered control over data associated with natural and legal persons under their jurisdiction, was seen as an issue with the highest salience at a foresight event I participated in at Bengaluru a few days ago. Several of us thought that states will push for greater control in the immediate future and will be met with significant resistance from citizens and companies (varying across different kinds of data and different countries; health data may be easier than traffic data; states in big countries are more likely to prevail than those in small ones). We expected some kind of equilibrium to be achieved in around 5-10 years. The effort by law enforcement authorities in the US to compel Microsoft to hand over the contents of email stored in Ireland and the case of the Great Firewall of China came up in discussion. Here is a discussion on recent developments in China: This is worrying not just for people who want to surf the web without annoying obstructions.
I have been cautious about buying the Western media story on China’s social credit system, but I had little other than gut feeling for my caution. Now finally, here is an analysis that will balance the Western narrative, which the author says is more about what those in the West worry about and not about what is actually happening in China. Debate over the appropriate balance between security and liberty is nothing new. While technological and data management innovations have introduced new tools, it can’t surprise anyone to learn where China’s Communist Party draws the line between individual privacy and social stability. If anything, I would have thought that the plan’s faith in market forces and private business, or the efforts at government legitimacy through increasing transparency, would have been more surprising to those outside of China.
Dharmawardana, K. G. S., Lokuge, J. N.
It is natural to think of state entities as the key actors in south-south cooperation (SSC) for improving public-service delivery. But as the highlighted example of Bangladesh’s Union Digital Centers (UDCs) shows, non-state actors can play important roles in public-service innovation. If true innovation is the objective, it would behoove the UN Office for South-South Cooperation and other interested parties to cast the net wider to include innovative organizational mechanisms as well as government innovations.
By Keshan de Silva and Yudhanjaya Wijeratne. One of the most useful datasets we have is a collection of pseudonymized call detail records (CDRs) for all of Sri Lanka, largely from the year 2013. Given that Sri Lanka has extremely high cell coverage and subscription rates (we’re actually oversubscribed: there are more subscriptions than people in the country, an artifact of people owning multiple SIMs), this dataset is ripe for analysis at big data scale. We recently used it to examine attendance at the annual Nallur festival in Jaffna, Sri Lanka. Using the CDRs, we were able to analyze the increase in the population of the region during the festival. A lengthy writeup on Medium describes the analysis, explaining the importance of the festival and the logic for picking it.
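As a rough illustration of how call records can indicate a change in the number of people in a region, here is a minimal Python sketch. It is not the method used in the writeup: the record layout (pseudonymized subscriber ID, date, region), the function names, and the toy data and dates are all assumptions made for this example. It counts distinct subscriber IDs seen in a region per day and compares event days against a baseline.

```python
from collections import defaultdict
from datetime import date


def daily_unique_subscribers(records, region):
    """Count distinct subscriber IDs seen in `region` on each day.

    `records` is an iterable of (subscriber_id, day, region) tuples.
    This layout is hypothetical, not the real CDR schema.
    """
    seen = defaultdict(set)
    for subscriber_id, day, rec_region in records:
        if rec_region == region:
            seen[day].add(subscriber_id)
    return {day: len(ids) for day, ids in seen.items()}


def relative_increase(daily_counts, baseline_days, event_days):
    """Relative change in mean daily unique subscribers, event vs baseline."""
    baseline = sum(daily_counts.get(d, 0) for d in baseline_days) / len(baseline_days)
    event = sum(daily_counts.get(d, 0) for d in event_days) / len(event_days)
    return (event - baseline) / baseline


# Toy data, invented for illustration only.
ordinary_day = date(2013, 7, 1)
festival_day = date(2013, 8, 20)
records = [
    ("a", ordinary_day, "Nallur"),
    ("b", ordinary_day, "Nallur"),
    ("a", ordinary_day, "Nallur"),   # second call by the same subscriber
    ("c", ordinary_day, "Colombo"),  # different region, ignored
    ("a", festival_day, "Nallur"),
    ("b", festival_day, "Nallur"),
    ("d", festival_day, "Nallur"),
    ("e", festival_day, "Nallur"),
]

counts = daily_unique_subscribers(records, "Nallur")
increase = relative_increase(counts, [ordinary_day], [festival_day])
print(counts)
print(f"Relative increase during the event: {increase:.0%}")
```

Counting distinct IDs rather than raw records keeps heavy callers from inflating the estimate. Because people may own multiple SIMs, distinct IDs are still only a proxy for distinct people.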