API Authentication
Authenticate each request by sending your API key as a Bearer token in the Authorization header. The platform validates the token server-side before queuing any ML pipeline task.
Authorization: Bearer yr_live_your_secret_api_key
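As a minimal sketch of attaching the header from Python, the helper below builds a request object with the Bearer token set and sends nothing. The helper name `build_authenticated_request` and the environment variable `YERE_API_KEY` are illustrative assumptions, not part of the documented API.

```python
import os
import urllib.request
from typing import Optional


def build_authenticated_request(
    url: str, api_key: str, data: Optional[bytes] = None
) -> urllib.request.Request:
    """Return a Request carrying the Bearer token. Nothing is sent yet."""
    if not api_key:
        raise ValueError("API key is empty")
    req = urllib.request.Request(url, data=data)
    req.add_header("Authorization", f"Bearer {api_key}")
    return req


# YERE_API_KEY is a hypothetical variable name; the fallback is the placeholder key.
api_key = os.environ.get("YERE_API_KEY", "yr_live_your_secret_api_key")
req = build_authenticated_request("http://localhost:8000/api/translate", api_key)
print(req.get_header("Authorization"))
```

Reading the key from the environment keeps it out of source control; the same header can be passed to any HTTP client.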
Audio Transcription
Accepts a binary audio file (mp3, wav, or m4a), runs Whisper inference locally, and returns the transcript in the original spoken language.
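import requests

url = "http://localhost:8000/api/transcribe"
data = {"language": "sw"}  # language code of the recording ("sw" = Swahili)

# Open the file in a context manager so the handle is closed after the upload.
with open("speech.wav", "rb") as audio:
    res = requests.post(url, files={"file": audio}, data=data)

res.raise_for_status()  # surface HTTP errors before parsing the body
print(res.json())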
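Because the endpoint lists three accepted formats, a client can reject other files before uploading. The sketch below checks only the file extension. The helper name is hypothetical, and the server may apply its own, stricter validation.

```python
from pathlib import Path

# Formats named in the endpoint description above.
SUPPORTED_AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension is one of the listed audio formats."""
    return Path(path).suffix.lower() in SUPPORTED_AUDIO_SUFFIXES


for name in ("speech.wav", "notes.txt"):
    print(name, is_supported_audio(name))
```

An extension check is cheap but does not inspect the file contents, so a mislabeled file would still reach the server.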
Text Translation
Translates textual input from a source language to a target language using the M2M-100 model.
const response = await fetch("http://localhost:8000/api/translate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    text: "Habari gani rafiki yetu?",
    src_lang: "sw",
    tgt_lang: "en"
  })
});
// Stop early on HTTP errors instead of parsing an error body.
if (!response.ok) throw new Error(`Translate failed: ${response.status}`);
const result = await response.json();
console.log(result.translated_text);
Speech Synthesis (Voicing)
Generates high-fidelity speech from text in the requested target language using the MMS-VITS model architecture. The response is a binary WAV audio stream.
curl -X POST http://localhost:8000/api/voice \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello, welcome to Yere.", "lang": "en"}' \
  --output response_speech.wav
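Since the endpoint returns raw WAV bytes, a quick sanity check on the client side is to parse the header and compute the clip duration. The sketch below uses Python's standard `wave` module and assumes the server emits PCM-encoded WAV with a correct length header; a streamed or float-encoded file may fail to parse. The helper name is hypothetical.

```python
import io
import wave


def wav_duration_seconds(audio_bytes: bytes) -> float:
    """Parse WAV bytes and return the duration in seconds.

    Raises wave.Error if the bytes are not a WAV file the module can read.
    """
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


# Demo on one second of 16 kHz mono silence built in memory (no server needed).
buf = io.BytesIO()
with wave.open(buf, "wb") as demo:
    demo.setnchannels(1)
    demo.setsampwidth(2)  # 16-bit samples
    demo.setframerate(16000)
    demo.writeframes(b"\x00\x00" * 16000)
print(wav_duration_seconds(buf.getvalue()))  # prints 1.0
```

With a real response, the bytes written to `response_speech.wav` could be read back and passed to the same function.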
Model Pipeline Chaining
The core execution layer, which chains L1, L2, L3, and L4 sequentially. It accepts a multipart speech upload and returns transcription logs alongside a Base64-encoded voice response.
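import requests

url = "http://localhost:8000/api/chain"
data = {
    "src_lang": "sw",
    "tgt_lang": "en",
    # Comma-separated pipeline stages to chain.
    "actions_str": "transcribe,translate,voice",
}

# Upload the recording as multipart form data; the file closes when the block exits.
with open("swahili_audio.wav", "rb") as audio:
    res = requests.post(url, files={"file": audio}, data=data)

res.raise_for_status()  # surface HTTP errors before parsing the body
output = res.json()
print("Logs:", output["logs"])
print("Translated text:", output["text_output"])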
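The chain response also carries the synthesized voice as Base64 text, which must be decoded before it can be played or saved. The sketch below shows one way to build `actions_str` and decode the audio. The response key `audio_base64` is an assumption (the field name is not given above), and the set of valid actions is inferred only from the example request.

```python
import base64
from typing import Iterable

# Actions seen in the example request; the server may support others.
KNOWN_ACTIONS = ("transcribe", "translate", "voice")


def build_actions_str(actions: Iterable[str]) -> str:
    """Join pipeline stages into the comma-separated actions_str form field."""
    actions = list(actions)
    unknown = [a for a in actions if a not in KNOWN_ACTIONS]
    if unknown:
        raise ValueError(f"Unknown actions: {unknown}")
    return ",".join(actions)


def decode_chain_audio(output: dict, audio_key: str = "audio_base64") -> bytes:
    """Decode the Base64 audio field of a chain response into raw bytes.

    audio_key is a hypothetical field name; adjust it to the real response.
    """
    encoded = output.get(audio_key)
    if not encoded:
        raise KeyError(f"Response has no '{audio_key}' field")
    return base64.b64decode(encoded)


print(build_actions_str(["transcribe", "translate", "voice"]))
```

The decoded bytes can then be written to a `.wav` file opened in binary mode.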