PUBLICATIONS
Our amazing team of researchers has led and contributed to a variety of important publications:
- A Study of Acquisition Functions for Medical Imaging Deep Active Learning. Bonaventure F. P. Dossou (arXiv 2024)
- Adapting Pretrained ASR Models to Low-resource Clinical Speech using Epistemic Uncertainty-based Data Selection. Bonaventure F. P. Dossou (under review at NeurIPS 2024)
- FonMTL: Towards Multitask Learning for the Fon Language. Bonaventure F. P. Dossou, Iffanice B. Houndayi, Pamely Zantou, Gilles Hacheme (EMNLP 2023)
- Pretrained Vision Models for Predicting High-Risk Breast Cancer Stage. Bonaventure F. P. Dossou, Yeno K. S. Gbenou, Miglanche Ghomsi Nono (ICLR 2023)
- AfriNames: Most ASR models “butcher” African Names. Tobi Olatunji, Tejumade Afonja, Bonaventure F. P. Dossou, Atnafu Lambebo Tonja, Chris C. Emezue, Amina Mardiyyah Rufai, Sahib Singh (Interspeech 2023)
- GFlowOut: Dropout with Generative Flow Networks. Dianbo Liu, Moksh Jain, Bonaventure F. P. Dossou, Qianli Shen, Salem Lahlou, Anirudh Goyal, Nikolay Malkin, Chris Chinenye Emezue, Dinghuai Zhang, Nadhir Hassen, Xu Ji, Kenji Kawaguchi, Yoshua Bengio (ICML 2023)
- AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages. Odunayo Ogundepo, T. Gwadabe, Clara Rivera, J. Clark, Sebastian Ruder, David Ifeoluwa Adelani, Bonaventure F. P. Dossou, et al. (EMNLP 2023)
- AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR. Tobi Olatunji, Tejumade Afonja, Aditya Yadavalli, C. Emezue, Sahib Singh, Bonaventure F. P. Dossou, Joanne Osuchukwu, Salomey Osei, Atnafu Lambebo Tonja, Naome A. Etori, Clinton Mbataku (EMNLP 2023)
- Adapting to the Low-Resource Double-Bind: Investigating Low-Compute Methods on Low-Resource African Languages. Colin Leong, Herumb Shandilya, Bonaventure F. P. Dossou, Atnafu Lambebo Tonja, Joel Mathew, Abdul-Hakeem Omotayo, Oreen Yousuf, Zainab Akinjobi, Chris C. Emezue, Shamsudeen Muhammad, Steven Kolawole, Younwoo Choi, Tosin P. Adewumi (ICLR 2023)
- MasakhaNEWS: News Topic Classification for African languages. David Ifeoluwa Adelani, Marek Masiak, Israel Abebe Azime, Jesujoba Oluwadara Alabi, Atnafu Lambebo Tonja, Christine Mwase, Odunayo Ogundepo, Bonaventure F. P. Dossou, et al. (AACL 2023 – Best Paper Award)
- MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African Languages. Cheikh M. Bamba Dione, David Ifeoluwa Adelani, Peter Nabende, …, Bonaventure F. P. Dossou, et al. (ACL 2023)
- The Less the Merrier? Investigating Language Representation in Multilingual Models. Hellina Hailu Nigatu, Atnafu Lambebo Tonja, Jugal Kalita (EMNLP 2023)
- PuoBERTa: Training and evaluation of a curated language model for Setswana. Vukosi Marivate, Moseli Mots’Oehli, Valencia Wagner, Richard Lastrucci, Isheanesu Dzingirai (SACAIR 2023)
- MphayaNER: Named Entity Recognition for Tshivenda. Rendani Mbuvha, David I. Adelani, Tendani Mutavhatsindi, Tshimangadzo Rakhuhu, Aluwani Mauda, Tshifhiwa Joshua Maumela, Andisani Masindi, Seani Rananga, Vukosi Marivate, Tshilidzi Marwala
- Fine-tuning Multilingual Pretrained African Language Models. Rozina Myoya, Fiskani Banda, Vukosi Marivate, Abiodun Modupe (ICLR 2023)
- Unsupervised Cross-lingual Word Embedding Representation for English-isiZulu. Derwin Ngomane, Vukosi Marivate, Jade Abbott, Rooweither Mabuya (RAIL 2023)
- Consultative engagement of stakeholders toward a roadmap for African language technologies. Kathleen Siminyu, Jade Abbott, Kọ́lá Túbọ̀sún, Angela Thandizwe Mthembu, Arshath Ramkilowan, Babatunde Oladimeji (Cell Press Publication)
- How good are Large Language Models on African Languages? Jessica Ojo, Kelechi Ogueji, Pontus Stenetorp, David Ifeoluwa Adelani
- EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation. Atnafu Lambebo Tonja, et al. (LREC-COLING 2024)
- Walia-LLM: Enhancing Amharic-LLaMA by Integrating Task-Specific and Generative Datasets. Israel Abebe Azime, Atnafu Lambebo Tonja, et al. (In Review)
- AfriMTE and AfriCOMET: Empowering COMET to Embrace Under-resourced African Languages. Jiayi Wang, David Ifeoluwa Adelani, Sweta Agrawal, Ricardo Rei, Eleftheria Briakou, Marine Carpuat, Marek Masiak, Xuanli He, Sofia Bourhim, Andiswa Bukula, Muhidin Mohamed, Temitayo Olatoye, Hamam Mokayede, Christine Mwase, Wangui Kimotho, Foutse Yuehgoh, Anuoluwapo Aremu, Jessica Ojo, et al. (NAACL 2024)
- Preparing the Vuk’uzenzele and ZA-gov-multilingual South African multilingual corpora. Richard Lastrucci, Isheanesu Dzingirai, Jenalea Rajab, Andani Madodonga, Matimba Shingange, Daniel Njini, Vukosi Marivate