DS-10 Language & NLP

African Customer Service Conversation Dataset

200K+ multi-turn conversation threads from African banks and telcos — spanning WhatsApp logs, IVR transcripts, and live-chat sessions — each labelled with intent, resolution outcome, sentiment trajectory, and agent-action annotations for training production-grade customer service AI.

This is a synthetic dataset generated from high-quality expert-labelled seed data. All records are algorithmically derived — statistical distributions, inter-field correlations, and annotation characteristics faithfully replicate real-world patterns from the source data, while ensuring no real individual, organisation, or transaction can be identified or reconstructed.

The African Customer Service Conversation Dataset contains 200K+ multi-turn dialogue threads collected from customer service operations at major banks, mobile network operators, and fintech platforms across Nigeria, Kenya, South Africa, and Senegal. Data spans three channels — WhatsApp Business API logs, IVR call transcripts (ASR-generated and human-corrected), and web/app live-chat sessions — providing channel-diverse training signal for omnichannel conversational AI deployments.

Each conversation thread is annotated at two levels of granularity. At the turn level, every customer utterance carries an intent label from a 36-class taxonomy (covering account queries, transaction disputes, loan applications, airtime/data purchase, complaint escalation, and more), a sentiment score, and an entity extraction tag set. At the thread level, each conversation carries an overall resolution label (resolved, escalated, abandoned), a first-contact resolution flag, a handle-time bucket, and an industry-vertical tag (BANKING, TELCO, FINTECH).
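
The two annotation levels described above can be pictured as a pair of nested record types. This is a minimal sketch: the field names follow the schema table and sample records on this page, while the class names `Turn` and `Thread` and the helper `thread_from_dict` are hypothetical and not part of any published loader.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Turn:
    """Turn-level annotations (one object per utterance)."""
    role: str                  # "customer" or "agent" in the sample records
    text: str
    intent: Optional[str]      # null on agent turns in the sample records
    sentiment: str
    entities: list[str] = field(default_factory=list)


@dataclass
class Thread:
    """Thread-level annotations plus the nested turns."""
    conversation_id: str
    industry: str
    resolution: str
    first_contact_resolved: bool
    handle_time_bucket: str
    turns: list[Turn]


def thread_from_dict(raw: dict) -> Thread:
    """Build a Thread from one parsed JSON record (hypothetical helper)."""
    return Thread(
        conversation_id=raw["conversation_id"],
        industry=raw["industry"],
        resolution=raw["resolution"],
        first_contact_resolved=raw["first_contact_resolved"],
        handle_time_bucket=raw["handle_time_bucket"],
        turns=[Turn(**t) for t in raw["turns"]],
    )
```

Keeping turn-level and thread-level labels in separate types makes it harder to mix up, for example, a single turn's sentiment with the thread's opening or closing sentiment.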

Language composition reflects the real-world multilingual nature of African customer interactions: 48 % English, 22 % Nigerian Pidgin, 14 % Swahili, 9 % Zulu/Xhosa, 4 % French (Senegal), and 3 % Hausa/Yoruba code-switched conversations. All personally identifiable information (names, account numbers, phone numbers, addresses) has been replaced with typed placeholders (e.g., [CUSTOMER_NAME], [ACCOUNT_ID]) by a rule-based PII redaction pipeline that was validated through legal review.
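
Typed placeholders are easy to locate programmatically. The sketch below assumes every placeholder is an upper-case token in square brackets, as in the examples given; the full placeholder inventory is not listed on this page, and `PLACEHOLDER_RE` and `find_placeholders` are hypothetical names.

```python
import re

# Assumption: placeholders look like [CUSTOMER_NAME] or [ACCOUNT_ID],
# i.e. upper-case letters, digits and underscores inside square brackets.
PLACEHOLDER_RE = re.compile(r"\[([A-Z][A-Z0-9_]*)\]")


def find_placeholders(text: str) -> list[str]:
    """Return the placeholder type names in order of appearance."""
    return PLACEHOLDER_RE.findall(text)


print(find_placeholders("It's [ACCOUNT_ID], the amount na [AMOUNT]."))
# ['ACCOUNT_ID', 'AMOUNT']
```

A check like this could be used to count placeholder types per thread or to confirm that the `pii_redacted` flag matches the text.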

Key Use Cases

Intent classification for customer service chatbot NLU engines
End-to-end dialogue policy training (RLHF / supervised fine-tuning)
Automated call-centre quality scoring and agent coaching
First-contact resolution prediction and escalation routing
Sentiment trajectory modelling for proactive intervention
Multilingual entity extraction (account IDs, dates, amounts)
Omnichannel conversation summarisation for CRM integration
PII-redaction pipeline benchmarking and compliance testing

Language Distribution

English 48 %
Nigerian Pidgin 22 %
Swahili 14 %
Zulu / Xhosa 9 %
French (Senegal) 4 %
Hausa / Yoruba code-switch 3 %
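
As a quick sanity check, the shares above sum to 100 % and can be turned into rough per-language counts. This sketch assumes the percentages are measured over conversation threads (the page does not say whether they refer to threads, turns, or tokens) and uses 200,000 as a lower bound for the "200K+" figure; `approx_thread_counts` is a hypothetical helper.

```python
# Shares copied from the Language Distribution list (percent).
LANGUAGE_SHARE_PCT = {
    "English": 48,
    "Nigerian Pidgin": 22,
    "Swahili": 14,
    "Zulu / Xhosa": 9,
    "French (Senegal)": 4,
    "Hausa / Yoruba code-switch": 3,
}


def approx_thread_counts(total_threads: int) -> dict[str, int]:
    """Rough per-language thread counts, assuming shares are per thread."""
    return {
        lang: total_threads * pct // 100
        for lang, pct in LANGUAGE_SHARE_PCT.items()
    }


assert sum(LANGUAGE_SHARE_PCT.values()) == 100

# 200_000 is an assumed lower bound; the exact dataset size is not stated.
for lang, count in approx_thread_counts(200_000).items():
    print(f"{lang}: ~{count:,} threads")
```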

Dataset Highlights

Conversation Threads: 200K+ multi-turn dialogues
Intent Classes: 36 (banking, telco, fintech taxonomy)
Channels: 3 (WhatsApp, IVR, live-chat)
Languages: 6+ (English, Pidgin, Swahili, Zulu, French, Hausa/Yoruba)

Geographic Coverage

[Map legend: Primary Coverage, Other Regions]

Dataset Schema

Each record represents one conversation thread. Fields cover channel provenance, language, thread-level labels, and aggregated turn statistics. Individual turn arrays are stored as nested JSON in the turns field.

Field Name | Type | Description | Nullable | Example
conversation_id | STRING | Unique conversation thread identifier | No | CONV-NGA-BK-0094712
country_code | STRING | ISO 3166-1 alpha-2 country code | No | NG
industry | ENUM | Industry vertical: BANKING, TELCO, FINTECH | No | BANKING
channel | ENUM | Interaction channel: WHATSAPP, IVR, LIVE_CHAT | No | WHATSAPP
primary_language | ENUM | Dominant language: ENGLISH, PIDGIN, SWAHILI, ZULU, FRENCH, HAUSA, YORUBA | No | PIDGIN
has_code_switching | BOOLEAN | True if the conversation mixes two or more languages | No | true
turn_count | INTEGER | Total number of turns in the conversation | No | 8
primary_intent | STRING | Dominant customer intent from 36-class taxonomy | No | TRANSACTION_DISPUTE
resolution | ENUM | Thread outcome: RESOLVED, ESCALATED, ABANDONED | No | RESOLVED
first_contact_resolved | BOOLEAN | True if resolved without escalation or callback | No | true
handle_time_bucket | ENUM | Handle time: UNDER_2MIN, 2_5MIN, 5_10MIN, OVER_10MIN | No | 5_10MIN
opening_sentiment | ENUM | Customer sentiment at conversation open: POSITIVE, NEGATIVE, NEUTRAL | No | NEGATIVE
closing_sentiment | ENUM | Customer sentiment at conversation close: POSITIVE, NEGATIVE, NEUTRAL | Yes | POSITIVE
pii_redacted | BOOLEAN | True if PII placeholders were applied to this thread | No | true
split | ENUM | Dataset partition: TRAIN, VAL, TEST | No | TRAIN
turns | JSON | Array of turn objects {role, text, intent, sentiment, entities}, one per turn | No | [...]
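
The enumerated values and nullability in the schema lend themselves to a lightweight record check. The sketch below copies the allowed values from the table; `ENUM_VALUES`, `NULLABLE`, and `validate_record` are hypothetical names, and the check covers only the ENUM fields, not types or the nested turns.

```python
# Allowed values copied from the schema table.
SENTIMENTS = {"POSITIVE", "NEGATIVE", "NEUTRAL"}
ENUM_VALUES = {
    "industry": {"BANKING", "TELCO", "FINTECH"},
    "channel": {"WHATSAPP", "IVR", "LIVE_CHAT"},
    "primary_language": {"ENGLISH", "PIDGIN", "SWAHILI", "ZULU",
                         "FRENCH", "HAUSA", "YORUBA"},
    "resolution": {"RESOLVED", "ESCALATED", "ABANDONED"},
    "handle_time_bucket": {"UNDER_2MIN", "2_5MIN", "5_10MIN", "OVER_10MIN"},
    "opening_sentiment": SENTIMENTS,
    "closing_sentiment": SENTIMENTS,
    "split": {"TRAIN", "VAL", "TEST"},
}
# closing_sentiment is the only field the table marks as nullable.
NULLABLE = {"closing_sentiment"}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in the ENUM fields of one record."""
    problems = []
    for name, allowed in ENUM_VALUES.items():
        value = record.get(name)
        if value is None:
            if name not in NULLABLE:
                problems.append(f"{name}: missing or null")
        elif value not in allowed:
            problems.append(f"{name}: unexpected value {value!r}")
    return problems
```

An empty list means the record passed; any other result lists each offending field, which is convenient when scanning a whole partition before training.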

Sample Records

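Four representative conversation threads spanning industries, channels, languages, and resolution outcomes. The turns arrays shown here are excerpts: each lists fewer turns than the thread's turn_count value.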

customer_service_sample.json
[
  {
    "conversation_id": "CONV-NGA-BK-0094712",
    "country_code": "NG",
    "industry": "BANKING",
    "channel": "WHATSAPP",
    "primary_language": "PIDGIN",
    "has_code_switching": true,
    "turn_count": 8,
    "primary_intent": "TRANSACTION_DISPUTE",
    "resolution": "RESOLVED",
    "first_contact_resolved": true,
    "handle_time_bucket": "5_10MIN",
    "opening_sentiment": "NEGATIVE",
    "closing_sentiment": "POSITIVE",
    "pii_redacted": true,
    "split": "TRAIN",
    "turns": [
      { "role": "customer", "text": "Abeg, dem deduct money from my account but I never buy anything o.", "intent": "TRANSACTION_DISPUTE", "sentiment": "NEGATIVE", "entities": [] },
      { "role": "agent", "text": "Sorry for the stress. Pls share your [ACCOUNT_ID] so I can check.", "intent": null, "sentiment": "NEUTRAL", "entities": [ "[ACCOUNT_ID]" ] },
      { "role": "customer", "text": "It's [ACCOUNT_ID], the amount na [AMOUNT].", "intent": "PROVIDE_DETAILS", "sentiment": "NEGATIVE", "entities": [ "[ACCOUNT_ID]", "[AMOUNT]" ] },
      { "role": "agent", "text": "I can see the deduction. It has been reversed now. Please check.", "intent": null, "sentiment": "POSITIVE", "entities": [] }
    ]
  },
  {
    "conversation_id": "CONV-KEN-TL-0031085",
    "country_code": "KE",
    "industry": "TELCO",
    "channel": "LIVE_CHAT",
    "primary_language": "SWAHILI",
    "has_code_switching": false,
    "turn_count": 5,
    "primary_intent": "DATA_BUNDLE_PURCHASE",
    "resolution": "RESOLVED",
    "first_contact_resolved": true,
    "handle_time_bucket": "UNDER_2MIN",
    "opening_sentiment": "NEUTRAL",
    "closing_sentiment": "POSITIVE",
    "pii_redacted": true,
    "split": "TRAIN",
    "turns": [
      { "role": "customer", "text": "Nataka kununua data bundle ya 5GB.", "intent": "DATA_BUNDLE_PURCHASE", "sentiment": "NEUTRAL", "entities": [ "5GB" ] },
      { "role": "agent", "text": "Sawa, bundle ya 5GB ni KES 500 kwa siku 30. Uthibitishe?", "intent": null, "sentiment": "NEUTRAL", "entities": [ "5GB", "KES 500", "30" ] },
      { "role": "customer", "text": "Ndiyo, thibitisha tafadhali.", "intent": "CONFIRM_ACTION", "sentiment": "POSITIVE", "entities": [] }
    ]
  },
  {
    "conversation_id": "CONV-ZAF-FT-0058341",
    "country_code": "ZA",
    "industry": "FINTECH",
    "channel": "IVR",
    "primary_language": "ENGLISH",
    "has_code_switching": false,
    "turn_count": 12,
    "primary_intent": "LOAN_APPLICATION_STATUS",
    "resolution": "ESCALATED",
    "first_contact_resolved": false,
    "handle_time_bucket": "OVER_10MIN",
    "opening_sentiment": "NEUTRAL",
    "closing_sentiment": "NEGATIVE",
    "pii_redacted": true,
    "split": "VAL",
    "turns": [
      { "role": "customer", "text": "I applied for a loan two weeks ago and I still haven't heard back.", "intent": "LOAN_APPLICATION_STATUS", "sentiment": "NEUTRAL", "entities": [] },
      { "role": "agent", "text": "I apologise for the delay. Can I have your application reference?", "intent": null, "sentiment": "NEUTRAL", "entities": [] },
      { "role": "customer", "text": "It's [APP_REF]. This is taking too long, I need the money urgently.", "intent": "COMPLAINT_ESCALATION", "sentiment": "NEGATIVE", "entities": [ "[APP_REF]" ] }
    ]
  },
  {
    "conversation_id": "CONV-SEN-BK-0072904",
    "country_code": "SN",
    "industry": "BANKING",
    "channel": "WHATSAPP",
    "primary_language": "FRENCH",
    "has_code_switching": false,
    "turn_count": 6,
    "primary_intent": "ACCOUNT_BALANCE_ENQUIRY",
    "resolution": "RESOLVED",
    "first_contact_resolved": true,
    "handle_time_bucket": "UNDER_2MIN",
    "opening_sentiment": "NEUTRAL",
    "closing_sentiment": "POSITIVE",
    "pii_redacted": true,
    "split": "TRAIN",
    "turns": [
      { "role": "customer", "text": "Bonjour, je voudrais connaître le solde de mon compte.", "intent": "ACCOUNT_BALANCE_ENQUIRY", "sentiment": "NEUTRAL", "entities": [] },
      { "role": "agent", "text": "Bonjour! Votre solde actuel est de [BALANCE] FCFA. Puis-je vous aider avec autre chose?", "intent": null, "sentiment": "POSITIVE", "entities": [ "[BALANCE]" ] },
      { "role": "customer", "text": "Non merci, c'est tout. Bonne journée.", "intent": "CLOSE_CONVERSATION", "sentiment": "POSITIVE", "entities": [] }
    ]
  }
]

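
Records shaped like the samples above can be turned into training signal with a few lines of code. The sketch below is illustrative only: it embeds an abridged copy of the first sample thread as a Python dict (as `json.load` would produce it), and the helper names `intent_training_pairs` and `customer_sentiment_trajectory` are hypothetical.

```python
# Abridged copy of the first sample record, as parsed JSON.
thread = {
    "conversation_id": "CONV-NGA-BK-0094712",
    "opening_sentiment": "NEGATIVE",
    "closing_sentiment": "POSITIVE",
    "turns": [
        {"role": "customer",
         "text": "Abeg, dem deduct money from my account but I never buy anything o.",
         "intent": "TRANSACTION_DISPUTE", "sentiment": "NEGATIVE", "entities": []},
        {"role": "agent",
         "text": "Sorry for the stress. Pls share your [ACCOUNT_ID] so I can check.",
         "intent": None, "sentiment": "NEUTRAL", "entities": ["[ACCOUNT_ID]"]},
        {"role": "customer",
         "text": "It's [ACCOUNT_ID], the amount na [AMOUNT].",
         "intent": "PROVIDE_DETAILS", "sentiment": "NEGATIVE",
         "entities": ["[ACCOUNT_ID]", "[AMOUNT]"]},
    ],
}


def intent_training_pairs(thread: dict) -> list[tuple[str, str]]:
    """(text, intent) pairs for customer turns; agent turns carry no intent."""
    return [
        (turn["text"], turn["intent"])
        for turn in thread["turns"]
        if turn["role"] == "customer" and turn["intent"] is not None
    ]


def customer_sentiment_trajectory(thread: dict) -> list[str]:
    """Customer sentiment labels in conversation order."""
    return [t["sentiment"] for t in thread["turns"] if t["role"] == "customer"]


print([intent for _, intent in intent_training_pairs(thread)])
# ['TRANSACTION_DISPUTE', 'PROVIDE_DETAILS']
print(customer_sentiment_trajectory(thread))
# ['NEGATIVE', 'NEGATIVE']
```

Filtering on role keeps agent turns, whose intent is null in the samples, out of an intent-classification training set.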
Request Dataset Access

All datasets are available under a commercial licence agreement. Our team typically responds within 2 business days.

NDA may be required
