REGIONAL // LUSOPHONE-INTELLIGENCE

Brazilian Portuguese AI Training Datasets

Capture the vibrancy of the world's largest Portuguese-speaking market. Our high-fidelity, ethically sourced corpora span PT-BR regional accents and both formal and informal registers.

PT-BR Text

High-volume text datasets for LLM pre-training, covering Brazilian news, social media, and legal documents.

Multiaccent Speech

ASR data capturing the rhythmic and melodic variations of all five Brazilian regions.

Video & Context

Brazilian media streams paired with accurate audio for multimodal training and activity recognition.

OCR & Signage

Digitized Brazilian documents and street-level imagery for localized computer vision models.

Native Accent Coverage

Brazil's continental size demands diverse acoustic sampling. We provide specialized training data for all major regional variants.

SOUTHEAST (SUDESTE)

Paulista & Carioca

The economic hubs. High-volume conversational data covering the distinct intonations of São Paulo and Rio de Janeiro.

Paulistano, Caipira, Carioca

NORTHEAST (NORDESTE)

Nordestino Variants

Rich phonetic diversity and unique vocabulary. Crucial for inclusive ASR models across the Brazilian territory.

Baiano, Recifense, Cearense

SOUTH (SUL)

Sulista Variants

Distinctive vowels and 'tu' usage. Specialized datasets for the southern states with European linguistic influences.

Gaúcho, Catarinense, Paranaense

CENTER-WEST (CENTRO-OESTE)

Sertanejo & Central

Data from the agricultural heartland and the capital, Brasília, featuring specific regional idioms and neutral registers.

Brasiliense, Goiano, Sertanejo

Technical Matrix // Brazilian AI Solutions

Capability | Brazilian Portuguese (PT-BR) | Data Format
ASR Performance | Accounts for 's' vs 'z' phonetic differences and regional intonations. | WAV / JSONL Transcripts
LLM Fine-tuning | Handles formal 'você' vs regional 'tu' and diverse slang registers. | Cleaned Parquet / JSONL
Cultural RLHF | Model alignment with Brazilian social norms and cultural specificities. | Ranked Comparisons
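
As a rough illustration of how a JSONL transcript file such as the one named in the ASR row might be consumed, the sketch below parses a few records and counts utterances per accent label. The field names (`audio`, `text`, `accent`) and the sample lines are hypothetical assumptions made for this example; they do not describe an actual delivered schema.

```python
import json
from collections import Counter

# Hypothetical JSONL transcript lines. The field names and values are
# illustrative assumptions, not a documented schema.
SAMPLE_JSONL = """\
{"audio": "clip_0001.wav", "text": "Tu vais sair hoje?", "accent": "Gaúcho"}
{"audio": "clip_0002.wav", "text": "Você vai sair hoje?", "accent": "Paulistano"}
{"audio": "clip_0003.wav", "text": "Oxente, cadê você?", "accent": "Baiano"}
"""


def load_transcripts(jsonl_text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def accent_counts(records):
    """Count utterances per accent label to check regional balance."""
    return Counter(record["accent"] for record in records)


records = load_transcripts(SAMPLE_JSONL)
counts = accent_counts(records)
```

A per-accent count like this is one simple way to check that a training split is not dominated by a single region before fine-tuning an ASR model.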

Build Smarter Brazilian AI

Ensure your models resonate with 210+ million Brazilians. Consult with our regional data architects today.