ICIR’s journey with NativeAI
This investigative unit shares their progress and lessons learned in developing an AI-powered tool designed to transcribe and translate news content into Igbo, Hausa and Yoruba
Image generated from ChatGPT AI with prompt input from Abdulaziz Gobir.
By: Eunice Enoch
At the International Center for Investigative Reporting (ICIR), we embarked on an exciting journey with the JournalismAI Innovation Challenge, supported by the Google News Initiative, to bring to life a project we call NativeAI for Newsrooms. Our vision is to develop an AI-powered tool capable of transcribing audio-visual content into text and then translating this text into the three major Nigerian languages – Igbo, Hausa, and Yoruba.
Why this project, you might ask? Nigeria is a blend of ethnic groups, cultures and languages, making it one of the most linguistically diverse countries in the world. For newsrooms, researchers, educators, and even individuals with hearing impairments, accessing and understanding information across this linguistic landscape can be a significant challenge. NativeAI was conceived to break down these barriers, encouraging inclusivity and making information accessible to a wider audience.
Our experience: A deep dive into innovation
The past few months, spanning from December 2024 to April 2025, have been a whirlwind of innovation, learning, and problem-solving. Building NativeAI from the ground up has been an incredibly enriching experience. We've delved into the intricacies of natural language processing, explored the nuances of Nigerian languages, and grappled with the challenges of building a robust and accurate AI model.
One of the most rewarding aspects has been witnessing the potential of NativeAI firsthand. Imagine a newsroom seamlessly converting hours of interview footage into text, or a researcher effortlessly accessing information originally presented in a language they don't understand. These possibilities fuel our dedication and highlight the transformative power of this technology.
What's working: Celebrating our milestones
We've made significant strides in several key areas:
Transcription functionality: Our AI model has demonstrated promising accuracy in transcribing audio-visual content into English text. This foundational capability is crucial for the subsequent translation process.
Translation capabilities: We've successfully developed the ability to translate the transcribed text into Igbo, Hausa, and Yoruba. While still under refinement, the initial results showcase the potential for accurate translations.
Team collaboration: The project has fostered a strong collaborative spirit within our team. The dedication and expertise of each member have been instrumental in overcoming technical hurdles and driving progress.
Infrastructure setup: We've successfully established the necessary infrastructure for developing and testing NativeAI.
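For readers curious about how the two stages fit together, here is a minimal sketch of a transcribe-then-translate flow. It is not NativeAI's actual code: the function `run_pipeline`, the `PipelineResult` class and the stand-in `transcribe` and `translate` callables are hypothetical names chosen for illustration, and the real speech and translation models are left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence

# ISO 639-1 codes for Igbo, Hausa and Yoruba.
TARGET_LANGUAGES = ("ig", "ha", "yo")


@dataclass
class PipelineResult:
    transcript: str               # English text produced by the transcription step
    translations: Dict[str, str]  # language code -> translated text


def run_pipeline(
    media_path: str,
    transcribe: Callable[[str], str],
    translate: Callable[[str, str], str],
    languages: Sequence[str] = TARGET_LANGUAGES,
) -> PipelineResult:
    """Transcribe one media file, then translate the transcript into each language.

    `transcribe` and `translate` are placeholders (assumptions) for whatever
    models a newsroom plugs in; this function only wires the two steps together.
    """
    transcript = transcribe(media_path)
    translations = {lang: translate(transcript, lang) for lang in languages}
    return PipelineResult(transcript=transcript, translations=translations)


if __name__ == "__main__":
    # Stand-in functions so the sketch runs without any model installed.
    def fake_transcribe(path: str) -> str:
        return f"transcript of {path}"

    def fake_translate(text: str, lang: str) -> str:
        return f"[{lang}] {text}"

    result = run_pipeline("interview.mp4", fake_transcribe, fake_translate)
    print(result.transcript)
    for lang, text in result.translations.items():
        print(lang, text)
```

Because translation runs on the transcript, any error in the transcription step carries through to all three languages, which is why transcription accuracy is treated as the foundation.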
Navigating the challenges: Lessons learnt
Innovation rarely comes without its set of challenges. We've encountered a few along the way, which have provided invaluable learning opportunities:
Data scarcity and diversity: One of the most significant hurdles has been the limited availability of high-quality, diverse datasets for Nigerian-accented audio and for the three target languages. This scarcity limits the model's ability to accurately transcribe and translate conversations.
Model robustness: Ensuring the model performs consistently well across various audio and video qualities, as well as different speaking styles, requires continuous refinement and more extensive training data.
Resource-intensive development: Building and training sophisticated AI models demands significant computational resources and time. We are constantly exploring strategies to optimise our processes and enhance efficiency.
These challenges have not deterred us. Instead, they have spurred us to explore innovative solutions. We've recognised the critical need for more diverse and representative datasets and are actively investigating techniques like data augmentation and model distillation to enhance the robustness and performance of NativeAI. Combining publicly available datasets with augmented data is a key area of focus moving forward.
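To illustrate what data augmentation can mean for audio, here is a minimal sketch of one common technique: mixing random noise into a clean recording at a chosen signal-to-noise ratio (SNR), so that a model sees noisier versions of the same speech. It is an assumption-laden example rather than a description of our training setup; the helper names `mean_power` and `add_noise` are hypothetical, and the audio is represented as a plain list of sample values.

```python
import math
import random
from typing import List, Optional, Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Average squared amplitude of a signal."""
    return sum(s * s for s in samples) / len(samples)


def add_noise(
    samples: Sequence[float],
    snr_db: float,
    seed: Optional[int] = None,
) -> List[float]:
    """Return a copy of `samples` with Gaussian noise added at roughly `snr_db`.

    A lower `snr_db` means a noisier copy. Passing `seed` makes the result
    reproducible.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    rng = random.Random(seed)
    # SNR in dB is 10 * log10(signal_power / noise_power); solve for noise_power.
    noise_power = mean_power(samples) / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


if __name__ == "__main__":
    # One second of a 440 Hz tone at 16 kHz stands in for a clean recording.
    clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    # Create three noisier variants of the same clip.
    variants = [add_noise(clean, snr_db=db, seed=0) for db in (20.0, 10.0, 5.0)]
    print(len(variants), "augmented copies of", len(clean), "samples each")
```

Generating several variants per clip stretches a small dataset further, although it does not replace the need for more real recordings of Nigerian-accented speech.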
Looking ahead: A sustainable future for NativeAI
Our vision for NativeAI extends beyond the grant period. We are actively exploring avenues for the project's sustainability and future expansion. We believe that NativeAI holds immense potential for newsrooms and other organisations seeking to optimise their workflows and reach broader audiences. Our future plans include scaling the project to partner with more news organisations, enabling them to leverage the power of AI to streamline their journalistic processes and foster greater inclusivity.
The journey of building NativeAI has been both challenging and incredibly rewarding. We are excited about the progress we've made and the potential impact this technology can have on bridging language gaps and creating greater understanding across Nigeria. Stay tuned for more updates as we continue to refine and expand the capabilities of NativeAI!
—
This article is part of a series providing updates from 35 grantees on the JournalismAI Innovation Challenge, supported by the Google News Initiative. Click here to read other articles from our grantees.