Dap — Documents


Pinpointing where poverty is most severe and tracking its changes over time is crucial for helping communities effectively. However, traditional benchmarks like household surveys and national censuses often fall short: they’re expensive, slow, and infrequent. In countries like Sri Lanka, this means we’re often relying on outdated information, hindering our ability to respond to sudden economic shocks or disasters. On top of that, poverty cannot be determined by income data alone; rather, it is multidimensional, with factors such as infrastructure, access to services, and economic activity also playing a role in determining a community’s well-being. To capture these complexities, our DAP team (Data, Algorithms, and Policy) explored something different: how can we rethink the way we measure poverty in Sri Lanka using AI with non-traditional data sources?
LIRNEasia has drafted a regional (Asia) report for the Global Index on Responsible AI (GIRAI) that focuses on responsible Artificial Intelligence in the Asia region. The draft is open for public review until April 13, 2025. This report, the final output of a Global Center on AI Governance (GCG)-funded project, is organized in three main parts. The first section examines where Asia stands in the Global Index, identifying key trends and regional patterns. The second section contextualizes these findings through in-depth national case studies, highlighting both best practices and governance gaps. The final section takes a forward-looking approach, identifying the key developments that will shape AI governance in the region. This report was authored by Merl Chandana and Sukitha Bandaranayake, with the India case study written by Anushka Jain and Aarushi Gupta.
The following document is a summary of an upcoming regional report for the Global Index on Responsible AI (GIRAI) that focuses on responsible Artificial Intelligence in the Asia region. The broader report, to be released in April 2025 as the final output of a Global Center on AI Governance (GCG)-funded project, was authored by Merl Chandana and Sukitha Bandaranayake from LIRNEasia, with the India case study written by Anushka Jain and Aarushi Gupta (of Digital Futures Lab, India). Part II was co-authored by Merl Chandana, Sukitha Bandaranayake, and Ana Florido. The report containing global findings of the Index can be found here.
Last year we conducted research to explore the possibility of leveraging online job portal data for economic analysis in 13 Asia Pacific countries, as part of a project for the Asian Development Bank. We examined the types of information available on major portals across the region, to discern the nature and format of available data. We also tested and refined methodologies to analyse a dataset comprising online job vacancies sourced from a Sri Lankan job portal, to demonstrate use cases for exploring the impacts of shocks on the labour market. The first step in this exploration was to review where in practice online job portal data has been used, to identify the methods and techniques available along with their strengths and limitations. The full review is published below.
On 2nd October 2023, Research Manager and Team Lead (Data, Algorithms, and Policy) Merl Chandana, alongside Junior Researcher Chanuka Algama, held a session titled ‘Applied data science research for social good’ at the University of Kelaniya’s Department of Statistics and Computer Science. The session delved into LIRNEasia’s journey of forming a data science team and using large datasets to yield critical insights for public policy. They contrasted LIRNEasia’s applied data science approach with traditional academic research and private sector practices. Additionally, they highlighted the emerging ‘AI for Social Good’ movement and its potential as a career avenue. The slides used can be accessed below.
By employing unsupervised and supervised machine learning techniques, we explore the feasibility of utilizing mobile call detail records (CDRs) as well as geographic information system (GIS) and remote sensing (RS) data to map poverty spatially.
Many countries around the world have adopted artificial intelligence (AI) policies. However, Sri Lanka is yet to adopt one. This discussion paper considers factors that may be taken into account if an AI policy were to be drafted in Sri Lanka.
This policy brief looks at the current status of Sri Lanka's Open Data Portal, and what may be done to improve it. 
Keynote presentation for South Eastern University, 10th Annual Science Research Sessions 2021, 30 November 2021 - by Rohan Samarajiva, LIRNEasia
Over the past decade, both internet penetration and the digital media user base have increased substantially.
We present a dataset consisting of 3468 documents in Bengali, drawn from Bangladeshi news websites and factchecking operations, annotated as CREDIBLE, FALSE, PARTIAL or UNCERTAIN. The dataset has markers for the content of the document, the classification, the web domain from which each document was retrieved, and the date on which the document was published. We also present the results of misinformation classification models built for the Bengali language, as well as comparisons to prior work in English and Sinhala.
This research report analyses the implementation of AI ethics principles in the policy, legal and regulatory, and technical arenas in Singapore and India.
The Institute of Chartered Professional Managers of Sri Lanka’s (CPM Sri Lanka) 26th Webinar was held on the 27th of August 2021 with a focus on the ‘Safety of information in a technically driven world’, a timely subject in cyber security. LIRNEasia Chair Prof. Rohan Samarajiva shared his expertise in the key presentation (below), focusing on current issues in information security, including potential risks to organizations and their management, data storage and back-ups, as well as prevention and recovery.
We present a dataset consisting of 3576 documents in Sinhala, drawn from Sri Lankan news websites and factchecking operations, annotated as CREDIBLE, FALSE, PARTIAL or UNCERTAIN. The dataset has markers for the content of the document, the classification, the web domain from which each document was retrieved, and the date on which the document was published. We also present the results of misinformation classification models built for the Sinhala language, as well as comparisons to English benchmarks, and suggest that for smaller media ecosystems it may make more practical sense to model uncertainty instead of truth vs falsehood binaries.
As hate speech on social media becomes an ever-increasing problem, policymakers may look to more authoritarian measures for policing content. Several countries have already, at some stage, banned networks such as Facebook and Twitter (Liebelson, 2017).
This paper presents two colloquial Sinhala language corpora from the language efforts of the Data, Analysis and Policy team of LIRNEasia, as well as a list of algorithmically derived stopwords. The larger of the two corpora spans 2010 to 2020 and contains 28,825,820 to 29,549,672 words of multilingual text posted by 533 Sri Lankan Facebook pages, including politics, media, celebrities, and other categories; the smaller corpus amounts to 5,402,76 words of only Sinhala text extracted from the larger.