The recent Ebola crisis in West Africa has brought attention to the potential of leveraging mobile network big data in combating the spread of infectious diseases. Whilst it is Ebola that has everyone’s attention right now, such methods are equally relevant to malaria and dengue, both diseases that have affected, and continue to affect, many in developing economies. The BBC, the Economist, and others have already chimed in, making the case for wider access to mobile network data for such efforts as well as for facilitating general post-disaster recovery. The Economist article goes further and suggests that, given the attendant privacy challenges, regulatory and legal instruments must be brought to bear to compel the sharing of such data with selected researchers (where the state decides who these experts are) in emergencies. Whether you subscribe to such knee-jerk reactions or not, it is first important to understand what mobile network data can do to help stem the spread of infectious diseases.
What insights can mobile network data provide?
There are commonly two types of positioning data generated by all mobile networks. The first is cell-handoff data such as Visitor Location Register (VLR) data, which records the tower/cell a phone is connected to at all times so that the operator can service a call to or from that phone. This gets updated almost every minute, but given the volumes in question, most operators never store historical VLR data. The second type of positioning data is available from the passive logs that operators create when phone events occur, e.g. outbound and inbound calls/SMSes/MMSes and mobile internet sessions (collectively called Call Detail Records, or CDRs), airtime reloads, etc. Because these are event-based, a subscriber’s location is recorded only when the phone is used, so they are less ideal than VLR data. In either case, the location of a user is often resolvable only up to the geographic footprint of the cell/tower, which can range from a few hundred square meters in urban areas to a few square kilometers in less dense rural areas. More accurate positioning data, such as GPS or network/handset-enhanced triangulation, is currently not generated for the majority of mobile subscribers in developing economies.
Regular mobility patterns can be established relatively easily from CDRs, especially when one considers data covering a month of activity. These can give insights such as:
- Which regions people from region X regularly travel to.
- How many people travel to each of those regions over the same time period.
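As a minimal sketch of how these two insights might be derived, assume each CDR has already been reduced to an anonymized subscriber ID, a timestamp, and the region of the serving tower. The record layout, the "most frequently seen region is home" heuristic, and the function names are all illustrative assumptions, not a description of any operator's actual data or of a published method.

```python
from collections import Counter, defaultdict

# Hypothetical, simplified CDRs: (anonymized subscriber ID, timestamp, region of tower).
SAMPLE_RECORDS = [
    ("a", "2014-10-01T08:00", "X"),
    ("a", "2014-10-01T20:00", "X"),
    ("a", "2014-10-02T12:00", "Y"),
    ("b", "2014-10-01T09:00", "X"),
    ("b", "2014-10-01T21:00", "X"),
    ("b", "2014-10-03T10:00", "Y"),
    ("b", "2014-10-04T10:00", "Z"),
    ("c", "2014-10-01T07:00", "Y"),
    ("c", "2014-10-02T07:00", "Y"),
    ("c", "2014-10-05T15:00", "X"),
]

def home_regions(records):
    """Assumed heuristic: a subscriber's home is the region they appear in most often."""
    counts = defaultdict(Counter)
    for subscriber, _timestamp, region in records:
        counts[subscriber][region] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}

def travel_volumes(records):
    """Count distinct subscribers from each home region seen in each other region."""
    homes = home_regions(records)
    visitors = defaultdict(set)
    for subscriber, _timestamp, region in records:
        home = homes[subscriber]
        if region != home:
            visitors[(home, region)].add(subscriber)
    return {pair: len(subs) for pair, subs in visitors.items()}

if __name__ == "__main__":
    for (home, visited), volume in sorted(travel_volumes(SAMPLE_RECORDS).items()):
        print(f"{home} -> {visited}: {volume} subscriber(s)")
```

Counting distinct subscribers rather than raw events keeps heavy phone users from dominating the totals. Real data would also need a more careful definition of "home" and of what counts as a regular trip.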
Using the above, it is possible to establish mobility hubs, population sources and sinks, and the resultant population flows over varying temporal periods. For example, see Wesolowski et al. (2014), who in September used existing datasets for some West African countries (released over the last few years to various researchers) to model population mobility patterns in the region. Even without any ground epidemiological data, CDR-based mobility profiles can help us understand how infectious diseases such as Ebola could spread from one region to another (along with the associated probabilities). Adding incidence data into the mix allows for more accurate analyses. This is what Wesolowski et al. (2012) did to understand how malaria would spread in Kenya, identifying the main sources and sinks of the infection. These studies clearly highlight the value of CDRs for generating high-resolution mobility models that can be used to predict disease propagation.
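To make the idea of sources, sinks, and incidence-weighted flows concrete, here is a simplified sketch. It is not the method used in the studies cited above. The `flows` and `prevalence` inputs, the numbers in them, and the function names are hypothetical, and weighting each flow by the prevalence at its origin is a deliberately crude assumption.

```python
from collections import defaultdict

# Hypothetical traveller counts between regions: (origin, destination) -> volume.
EXAMPLE_FLOWS = {("X", "Y"): 30, ("X", "Z"): 10, ("Y", "X"): 20}

# Hypothetical share of the population infected in each region (made-up numbers).
EXAMPLE_PREVALENCE = {"X": 0.01, "Y": 0.0, "Z": 0.0}

def destination_shares(flows):
    """For each origin, the fraction of its travellers going to each destination."""
    totals = defaultdict(int)
    for (origin, _dest), volume in flows.items():
        totals[origin] += volume
    return {(o, d): v / totals[o] for (o, d), v in flows.items()}

def net_inflow(flows):
    """Inflow minus outflow per region: positive is a net sink, negative a net source."""
    net = defaultdict(int)
    for (origin, dest), volume in flows.items():
        net[origin] -= volume
        net[dest] += volume
    return dict(net)

def expected_imported_cases(flows, prevalence):
    """Crude assumption: travellers are infected at their origin's prevalence rate."""
    imported = defaultdict(float)
    for (origin, dest), volume in flows.items():
        imported[dest] += volume * prevalence.get(origin, 0.0)
    return dict(imported)

if __name__ == "__main__":
    print(destination_shares(EXAMPLE_FLOWS))
    print(net_inflow(EXAMPLE_FLOWS))
    print(expected_imported_cases(EXAMPLE_FLOWS, EXAMPLE_PREVALENCE))
```

The first two functions need only mobility data, which matches the point that useful profiles can be built without ground epidemiological data. The third shows where incidence data would enter.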
Scenarios
In the case of infectious diseases (especially Ebola), we can consider three different scenarios:
- Being prepared
Before Ebola strikes a particular country, we can use insights from CDR-based analyses to understand how it may spread within the country. As such, they are useful for planning purposes. Furthermore, we can do this with anonymized historical data, so privacy concerns are minimized. Airline flight records (or similar records for other inter-country travel modes) have a role here, helping to calculate the probability of a country being affected.
- Once a case is detected
When this happens, what is required is a more traditional investigative procedure. Mobile network data is more useful in this case to verify who the person may have come into contact with, which requires non-anonymized data. Timeliness of detection, quarantining of the affected person, and the investigations are critical. Initially, at least, this is not a big data problem, unless the disease spreads quickly to others.
- If it becomes an epidemic/ pandemic
Mobile network data has a big role to play in this instance. VLR data is the most useful, but it is not technically feasible to obtain for the whole country. What may be more feasible is to get some VLR data for a specific time period for specific regions. CDR data is only second best, but it is still better to derive mobility patterns and changes in mobility from CDRs than not to do so at all. Even with CDR data, historical data is of some value, but what is needed is near-real-time data, so anonymization is not very practical at this juncture. Clearly there will be technical challenges, and operators will have to expend resources to capture some VLR data and to make the data available to researchers in a very short time frame.
ICT-based solutions can also help in detecting whether an epidemic might be happening. This may be less of a problem with highly dangerous diseases like Ebola, which attract much more attention and resources, but for others, e.g. dengue and malaria, the epidemiological infrastructure in developing economies is often insufficient. Disease reporting is often slow and, as LIRNEasia research found, could take up to a month to propagate up the system in Sri Lanka. Past research by LIRNEasia that enabled data capture through mobile phones, coupled with sophisticated software from Carnegie Mellon University, showed how such a system could facilitate early detection of potential epidemics.
Conclusion
Clearly, the use of mobile network data in such instances can help with the resource allocation problems that a country or region may face, so that it can target its actions at the highest-risk regions. Equally true is that there are technical challenges as well as privacy issues to be considered in obtaining mobile network data. Last week, at the ITU’s Plenipotentiary Conference in Busan, the ITU, GSMA, UN Global Pulse and the Internet Society (ISOC) announced that they are joining forces to fight Ebola. What results come out of that collaboration, and when, are good questions to ask. Others have already started work. Flowminder.org is developing national mobility estimates for Western Africa in coordination with others. Some results are available at Worldpop/Ebola.
As horrible as this Ebola crisis is, for developing countries such as Sri Lanka the resultant public discussion creates a more conducive environment for greater use of CDR-based analyses for the other infectious diseases that are a problem in these countries. Sri Lanka (and, for that matter, some other South Asian countries) has frequent problems with dengue, and fighting it can be greatly aided by the use of mobile network data.