Kiswahili Language Model · Research Preview

KW5-Lite

A small, Swahili-first model, published openly with what it does well and where it falls short.

KW5-Lite is an early research release. It is a compact Swahili-language conversational model, fine-tuned from an open-weight base and built to run on modest hardware. It is not a finished product, and we treat it accordingly. This page describes what it is, what it is good for today, and the limits we have measured.

View on Hugging Face ↗Research Partnership →

Status

KW5-Lite is distributed as a LoRA adapter, which is the verified, working artifact. An earlier standalone merged build showed quality regressions during verification and should not be used until it is rebuilt and re-checked. Use the LoRA adapter until this notice is removed.

Why a Swahili-first model matters.

Over 200 million people speak Kiswahili across East Africa. Many general models reach the language through a translation layer that can flatten nuance and cultural context. We think the language deserves a model built for it directly.

KW5-Lite is built the other way around. It is Swahili-first, fine-tuned from an open-weight base on Kiswahili conversational data, and aligned to be helpful and to defer appropriately on medical, legal, and safety questions rather than to overreach.

Because it is small, it runs at low latency on modest hardware and can be deployed privately, on a laptop or a school server, without sending data away. That is the trade we chose: the right size for short, everyday Swahili conversations, deployed where the work happens.

What it is today.

Small and fast

A 1.5B-parameter base with a lightweight LoRA adapter. Runs on a single consumer GPU, no data centre required.

Swahili-first

Trained on Kiswahili conversational data, strongest in the standard and formal register.

Private by default

Runs locally on your own hardware. No API calls and no telemetry by design.

Culturally grounded

Handles standard-register grammar and common cultural patterns without a translation intermediary.

Careful on sensitive topics

Redirects acute medical concerns toward proper care and declines clearly harmful requests, briefly and without lecturing.

Open

Open weights under Apache 2.0, with a published evaluation record. Training data details are being finalized.

What it does not do.

Long conversations drift

Quality reliably degrades after roughly seven to ten turns. It is best suited to short, self-contained exchanges. Reset the history for longer sessions.

Informal and Sheng register

Slang, code-switching, and typo-heavy input are less reliable than clean standard Swahili and can drift off topic.

Not an authority

Like any small model, it can produce plausible-sounding but incorrect facts, figures, or invented proverbs. Verify anything where accuracy matters.

Not a professional advisor

It is not a substitute for medical, legal, or financial advice, and it is built to decline rather than pretend otherwise.

Where it fits.

KW5-Lite is the Kiswahili language layer we are building underneath Motif, our education ecosystem, and a foundation for East African deployments that need private, local inference in Swahili.

This is the first release. Better multi-turn stability, wider register coverage, and larger checkpoints are on the roadmap. We publish as we go.

Interested in Kiswahili AI?

We work with universities, institutions, and teams building AI that serves East African languages natively. The model and its evaluation are public.

View on Hugging Face ↗Propose a Partnership →