How It Works
TruVector's mobile workflow collects African-context data with clear consent, automated quality checks, structured validation, and traceable delivery. Every data point ships with metadata covering language, region, consent, quality results, and provenance to support responsible AI development.
1. Onboarding and Identity
Users create accounts using phone-number authentication, providing basic context for dataset representativeness: region, primary languages, device type, and optional demographic data (with explicit consent). Granular consent is captured per data type, clearly explaining what will be collected, why it's needed, and how it will be used.
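As an illustration of per-data-type consent, here is a minimal Python sketch. It is not TruVector's actual data model: the class names, field names, and the `DATA_TYPES` set are assumptions made for the example, chosen to mirror the onboarding fields described above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical set of data types, mirroring the capture categories on this page.
DATA_TYPES = {"audio", "video", "image", "text", "survey"}


@dataclass
class ConsentGrant:
    """One consent decision for one data type (illustrative structure)."""
    data_type: str
    purpose: str           # why the data is needed and how it will be used
    granted_at: datetime


@dataclass
class ContributorProfile:
    """Basic onboarding context; all field names are assumptions."""
    phone_number: str
    region: str
    primary_languages: list[str]
    device_type: str
    # Optional demographic data, stored only with explicit consent.
    demographics: dict[str, str] = field(default_factory=dict)
    consents: dict[str, ConsentGrant] = field(default_factory=dict)

    def grant_consent(self, data_type: str, purpose: str) -> None:
        if data_type not in DATA_TYPES:
            raise ValueError(f"unknown data type: {data_type}")
        self.consents[data_type] = ConsentGrant(
            data_type=data_type,
            purpose=purpose,
            granted_at=datetime.now(timezone.utc),
        )

    def revoke_consent(self, data_type: str) -> None:
        # Revoking a consent that was never granted is a no-op.
        self.consents.pop(data_type, None)

    def may_collect(self, data_type: str) -> bool:
        """Collection is allowed only for data types with a recorded grant."""
        return data_type in self.consents
```

In this sketch a contributor who has granted consent for audio only would pass `may_collect("audio")` and fail `may_collect("video")`, so a task feed could filter tasks by consent before showing them.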
2. Data Capture
Users access a task feed with clear instructions, examples, and quality requirements. The platform supports multiple data types:
Audio: Speech, read-aloud, conversational snippets, pronunciation variants
Video: Controlled prompts, environment samples, scripted phrases
Images: Objects, scenes, context-specific visuals
Text: Messages, code-switching examples, local phrasing
Surveys: Structured metadata and questionnaires
In-app guidance helps users submit high-quality data: quiet environments for audio, stable framing for video, and privacy reminders.
3. Automated Quality Checks
Before a submission enters validation, automated checks flag common problems: audio clipping or distortion, low video resolution or unstable footage, image blur, and spam patterns in text. Users receive immediate feedback and can resubmit if needed.
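To make one of these checks concrete, the sketch below flags audio clipping in pure Python. It assumes samples are floats normalized to the range -1.0 to 1.0; the function names and both thresholds are hypothetical values chosen for illustration, not the platform's actual settings.

```python
def clipping_ratio(samples, limit=0.99):
    """Fraction of samples at or above the clipping limit (in absolute value).

    Assumes samples are floats normalized to [-1.0, 1.0].
    """
    if not samples:
        return 0.0
    clipped = sum(1 for s in samples if abs(s) >= limit)
    return clipped / len(samples)


def check_audio(samples, max_clipping_ratio=0.01):
    """Return a list of issue codes; an empty list means the check passed.

    The 1% default threshold is an illustrative assumption.
    """
    issues = []
    if not samples:
        issues.append("empty_audio")
    elif clipping_ratio(samples) > max_clipping_ratio:
        issues.append("clipping")
    return issues
```

Returning a list of issue codes rather than a single pass/fail flag lets the app show the contributor specific feedback before they resubmit. Checks for video, images, and text could follow the same shape.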
4. Structured Validation
Multi-tier validation ensures quality: peer reviewers confirm basic correctness, higher-trust reviewers handle complex tasks, and expert reviewers validate linguistic/cultural context. Golden tasks calibrate reviewer accuracy. Confidence-weighted decisions route uncertain samples for additional review.
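One way confidence-weighted routing could work is sketched below. This is an assumed scheme, not a description of TruVector's algorithm: each reviewer's vote is weighted by a hypothetical accuracy score (for example, derived from golden tasks), and samples whose weighted agreement falls below a threshold are sent for additional review. The function name `decide` and the 0.8 threshold are illustrative.

```python
def golden_accuracy(correct, total):
    """Reviewer weight: share of golden tasks answered correctly."""
    return correct / total if total else 0.0


def decide(votes, confidence_threshold=0.8):
    """Combine weighted accept/reject votes into a routing decision.

    votes: list of (accept, weight) pairs, where accept is a bool and
    weight is the reviewer's accuracy weight.
    Returns (status, confidence), with status one of
    "accepted", "rejected", or "needs_review".
    """
    total_weight = sum(weight for _, weight in votes)
    if total_weight == 0:
        return "needs_review", 0.0

    accept_weight = sum(weight for accept, weight in votes if accept)
    accept_share = accept_weight / total_weight
    # Confidence is the weighted share held by the winning side.
    confidence = max(accept_share, 1.0 - accept_share)

    if confidence < confidence_threshold:
        return "needs_review", confidence
    status = "accepted" if accept_share >= 0.5 else "rejected"
    return status, confidence
```

With these assumptions, two high-accuracy reviewers who agree produce a confident decision, while an even split between equally weighted reviewers yields a confidence of 0.5 and is routed for another look.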
5. Rewards and Compensation
Users earn credits for validated submissions and eligible review work. Rewards are paid out as mobile data bundles, airtime, vouchers, or other region-appropriate options, configured to match each dataset's requirements.
6. Dataset Delivery
Licensed datasets are delivered with validated samples and comprehensive metadata per data point:
Sample data: Validated audio/video/image/text
Rich metadata: Language, region, consent records, and age and phenotypical data where the contributor has consented to share them
Quality indicators: Automated check results and validation outcomes
Provenance: Complete audit trail and validation history
Documentation & licensing: Dataset scope, limitations, intended use, and licensing terms
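For a sense of what a delivered record might look like, the sketch below groups the components listed above into a single structure and serializes it as one JSON line. The schema is hypothetical: every field name, and the choice of JSON Lines as a delivery format, is an assumption made for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DeliveredSample:
    """One validated data point with its metadata (illustrative schema)."""
    sample_uri: str     # location of the validated audio/video/image/text
    data_type: str
    metadata: dict      # e.g. language, region, consent record reference
    quality: dict       # automated check results and validation outcome
    provenance: list    # ordered audit trail of events
    license_id: str     # reference to the dataset's licensing terms


def to_json_line(sample: DeliveredSample) -> str:
    """Serialize a sample as a single JSON line with stable key order."""
    return json.dumps(asdict(sample), sort_keys=True)


def from_json_line(line: str) -> DeliveredSample:
    """Rebuild a sample from its JSON line."""
    return DeliveredSample(**json.loads(line))
```

Keeping consent, quality, and provenance on each record, rather than only at the dataset level, is what would allow a consumer to filter or audit individual samples.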
Outcomes
This workflow produces datasets that enable AI systems to:
Perform better on African languages and accents
Generalize to real African environments
Reduce bias from thin or skewed data
Deploy responsibly with traceable consent and validation
Ready to build with trusted African datasets?
Contact us to discuss your dataset requirements.