Bengali language Archives — LIRNEasia


We present a dataset of 3468 documents in Bengali, drawn from Bangladeshi news websites and fact-checking operations, each annotated as CREDIBLE, FALSE, PARTIAL or UNCERTAIN. The dataset has markers for the content of the document, the classification, the web domain from which each document was retrieved, and the date on which the document was published. We also present the results of misinformation classification models built for the Bengali language, as well as comparisons to prior work in English and Sinhala.
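As a rough illustration of the four markers described above, one record could be modelled as follows. This is a minimal sketch only: the names `Label`, `Document`, `label_counts`, the field names, and the sample rows are hypothetical placeholders, not the dataset's actual schema or contents.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Label(Enum):
    """The four annotation classes named in the dataset description."""
    CREDIBLE = "CREDIBLE"
    FALSE = "FALSE"
    PARTIAL = "PARTIAL"
    UNCERTAIN = "UNCERTAIN"


@dataclass(frozen=True)
class Document:
    """One annotated document; field names are assumptions for illustration."""
    content: str      # text of the document
    label: Label      # annotated classification
    domain: str       # web domain the document was retrieved from
    published: date   # publication date


def label_counts(docs):
    """Count how many documents carry each label."""
    return Counter(doc.label for doc in docs)


# Invented placeholder rows, not taken from the real dataset.
sample = [
    Document("placeholder text 1", Label.CREDIBLE, "news.example", date(2021, 1, 5)),
    Document("placeholder text 2", Label.FALSE, "factcheck.example", date(2021, 2, 9)),
    Document("placeholder text 3", Label.CREDIBLE, "news.example", date(2021, 3, 1)),
]

counts = label_counts(sample)
```

A per-label count like this is a common first check on class balance before training a classifier.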
For practical reasons, we mostly limit our dissemination to English. This is a workable strategy in South Asia, as policy makers read English more than local languages. Still, local languages are vital in all the countries where we work. In Bangladesh we gave equal priority to Bangla and English. Research findings of two LIRNEasia’s mobile 2.