DS-10 Language & NLP

African Customer Service Conversation Dataset

200K+ multi-turn conversation threads from African banks and telcos — spanning WhatsApp logs, IVR transcripts, and live-chat sessions — each labelled with intent, resolution outcome, sentiment trajectory, and agent-action annotations for training production-grade customer service AI.

This is a synthetic dataset generated from high-quality expert-labelled seed data. All records are algorithmically derived — statistical distributions, inter-field correlations, and annotation characteristics faithfully replicate real-world patterns from the source data, while ensuring no real individual, organisation, or transaction can be identified or reconstructed.

The African Customer Service Conversation Dataset contains 200K+ multi-turn dialogue threads collected from customer service operations at major banks, mobile network operators, and fintech platforms across Nigeria, Kenya, South Africa, and Senegal. Data spans three channels — WhatsApp Business API logs, IVR call transcripts (ASR-generated and human-corrected), and web/app live-chat sessions — providing channel-diverse training signal for omnichannel conversational AI deployments.

Each conversation thread is annotated at two levels of granularity. At the turn level, every customer utterance carries an intent label from a 36-class taxonomy (covering account queries, transaction disputes, loan applications, airtime/data purchase, complaint escalation, and more), a sentiment score, and an entity extraction tag set. At the thread level, each conversation carries an overall resolution label (resolved, escalated, abandoned), a first-contact resolution flag, a handle-time bucket, and an industry-vertical tag (BANKING, TELCO, FINTECH).

Language composition reflects the real-world multilingual nature of African customer interactions: 48 % English, 22 % Nigerian Pidgin, 14 % Swahili, 9 % Zulu/Xhosa, 4 % French (Senegal), and 3 % Hausa/Yoruba code-switching. All personally identifiable information (names, account numbers, phone numbers, addresses) has been replaced with typed placeholders (e.g., [CUSTOMER_NAME], [ACCOUNT_ID]) by a rule-based PII redaction pipeline that was validated by legal review.
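As a rough illustration of how the typed placeholders might be consumed downstream, the sketch below pulls placeholder types out of a turn and flags long digit runs that could indicate a missed redaction. The pattern assumes placeholders are upper-case tokens in square brackets, as in the examples above; the function names and the six-digit threshold are hypothetical and not part of the dataset tooling.

```python
import re

# Assumption: placeholders look like [CUSTOMER_NAME] or [ACCOUNT_ID],
# i.e. upper-case letters, digits and underscores inside square brackets.
PLACEHOLDER_RE = re.compile(r"\[([A-Z][A-Z0-9_]*)\]")


def find_placeholders(text: str) -> list[str]:
    """Return placeholder type names in order of appearance."""
    return PLACEHOLDER_RE.findall(text)


def has_long_digit_run(text: str, min_run: int = 6) -> bool:
    """Heuristic only: a long digit run may be a leaked account or phone number."""
    return re.search(r"\d{%d,}" % min_run, text) is not None


sample = "It's [ACCOUNT_ID], the amount na [AMOUNT]."
print(find_placeholders(sample))
print(has_long_digit_run(sample))
```

A check like this is only a first-pass filter; it says nothing about names or addresses that slipped through in free text.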

Key Use Cases

Intent classification for customer service chatbot NLU engines

End-to-end dialogue policy training (RLHF / supervised fine-tuning)

Automated call-centre quality scoring and agent coaching

First-contact resolution prediction and escalation routing

Sentiment trajectory modelling for proactive intervention

Multilingual entity extraction (account IDs, dates, amounts)

Omnichannel conversation summarisation for CRM integration

PII-redaction pipeline benchmarking and compliance testing

Language Distribution

English 48 %

Nigerian Pidgin 22 %

Swahili 14 %

Zulu / Xhosa 9 %

French (Senegal) 4 %

Hausa / Yoruba code-switch 3 %
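The shares above can be turned into approximate per-language thread counts, which helps when planning stratified sampling. The sketch below assumes a total of exactly 200,000 threads; the page only states "200K+", so the figures are lower-bound estimates rather than published counts.

```python
# Shares copied from the Language Distribution list (percent of threads).
LANGUAGE_SHARE_PCT = {
    "English": 48,
    "Nigerian Pidgin": 22,
    "Swahili": 14,
    "Zulu / Xhosa": 9,
    "French (Senegal)": 4,
    "Hausa / Yoruba code-switch": 3,
}

# Sanity check: the published shares should cover the whole dataset.
assert sum(LANGUAGE_SHARE_PCT.values()) == 100


def approx_threads(total_threads: int) -> dict[str, int]:
    """Estimate thread counts per language from the percentage shares."""
    return {
        lang: total_threads * pct // 100
        for lang, pct in LANGUAGE_SHARE_PCT.items()
    }


ASSUMED_TOTAL = 200_000  # assumption: "200K+" treated as a lower bound
for lang, count in approx_threads(ASSUMED_TOTAL).items():
    print(f"{lang}: ~{count:,} threads")
```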

Dataset Highlights

Conversation Threads

200K+

multi-turn dialogues

Intent Classes

36

banking, telco, fintech taxonomy

Channels

3

WhatsApp, IVR, live-chat

Languages

English, Pidgin, Swahili, Zulu, French, Hausa/Yoruba

Geographic Coverage

Nigeria, Kenya, South Africa, Senegal

Dataset Schema

Each record represents one conversation thread. Fields cover channel provenance, language, thread-level labels, and aggregated turn statistics. Individual turn arrays are stored as nested JSON in the turns field.

Field Name	Type	Description	Nullable	Example
conversation_id	STRING	Unique conversation thread identifier	No	CONV-NGA-BK-0094712
country_code	STRING	ISO 3166-1 alpha-2 country code	No	NG
industry	ENUM	Industry vertical: BANKING, TELCO, FINTECH	No	BANKING
channel	ENUM	Interaction channel: WHATSAPP, IVR, LIVE_CHAT	No	WHATSAPP
primary_language	ENUM	Dominant language: ENGLISH, PIDGIN, SWAHILI, ZULU, FRENCH, HAUSA, YORUBA	No	PIDGIN
has_code_switching	BOOLEAN	True if the conversation mixes two or more languages	No	true
turn_count	INTEGER	Total number of turns in the conversation	No	8
primary_intent	STRING	Dominant customer intent from 36-class taxonomy	No	TRANSACTION_DISPUTE
resolution	ENUM	Thread outcome: RESOLVED, ESCALATED, ABANDONED	No	RESOLVED
first_contact_resolved	BOOLEAN	True if resolved without escalation or callback	No	true
handle_time_bucket	ENUM	Handle time: UNDER_2MIN, 2_5MIN, 5_10MIN, OVER_10MIN	No	5_10MIN
opening_sentiment	ENUM	Customer sentiment at conversation open: POSITIVE, NEGATIVE, NEUTRAL	No	NEGATIVE
closing_sentiment	ENUM	Customer sentiment at conversation close: POSITIVE, NEGATIVE, NEUTRAL	Yes	POSITIVE
pii_redacted	BOOLEAN	True if PII placeholders were applied to this thread	No	true
split	ENUM	Dataset partition: TRAIN, VAL, TEST	No	TRAIN
turns	JSON	Array of turn objects {role, text, intent, sentiment, entities}, one per turn; intent is null for agent turns	No	[...]
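One way to guard an ingestion pipeline is to check each record against the enumerations and nullability listed in the table. The following is a minimal sketch under that reading of the schema; `validate_record` and the constant names are hypothetical helpers, not part of any supplied tooling.

```python
from typing import Any

SENTIMENTS = {"POSITIVE", "NEGATIVE", "NEUTRAL"}

# Allowed values copied from the schema table above.
ENUMS = {
    "industry": {"BANKING", "TELCO", "FINTECH"},
    "channel": {"WHATSAPP", "IVR", "LIVE_CHAT"},
    "primary_language": {"ENGLISH", "PIDGIN", "SWAHILI", "ZULU",
                         "FRENCH", "HAUSA", "YORUBA"},
    "resolution": {"RESOLVED", "ESCALATED", "ABANDONED"},
    "handle_time_bucket": {"UNDER_2MIN", "2_5MIN", "5_10MIN", "OVER_10MIN"},
    "opening_sentiment": SENTIMENTS,
    "split": {"TRAIN", "VAL", "TEST"},
}

# Every field except closing_sentiment is marked non-nullable.
REQUIRED = [
    "conversation_id", "country_code", "industry", "channel",
    "primary_language", "has_code_switching", "turn_count",
    "primary_intent", "resolution", "first_contact_resolved",
    "handle_time_bucket", "opening_sentiment", "pii_redacted",
    "split", "turns",
]

BOOLEAN_FIELDS = ["has_code_switching", "first_contact_resolved", "pii_redacted"]


def validate_record(rec: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; empty means the record passed."""
    errors = []
    for field in REQUIRED:
        if rec.get(field) is None:
            errors.append(f"missing required field: {field}")
    for field, allowed in ENUMS.items():
        value = rec.get(field)
        if value is not None and value not in allowed:
            errors.append(f"{field}: unexpected value {value!r}")
    closing = rec.get("closing_sentiment")  # nullable, so None is accepted
    if closing is not None and closing not in SENTIMENTS:
        errors.append(f"closing_sentiment: unexpected value {closing!r}")
    for field in BOOLEAN_FIELDS:
        if rec.get(field) is not None and not isinstance(rec[field], bool):
            errors.append(f"{field}: expected a boolean")
    turns = rec.get("turns")
    if turns is not None and not isinstance(turns, list):
        errors.append("turns: expected a list of turn objects")
    return errors


example = {
    "conversation_id": "CONV-NGA-BK-0094712", "country_code": "NG",
    "industry": "BANKING", "channel": "WHATSAPP", "primary_language": "PIDGIN",
    "has_code_switching": True, "turn_count": 8,
    "primary_intent": "TRANSACTION_DISPUTE", "resolution": "RESOLVED",
    "first_contact_resolved": True, "handle_time_bucket": "5_10MIN",
    "opening_sentiment": "NEGATIVE", "closing_sentiment": None,
    "pii_redacted": True, "split": "TRAIN", "turns": [],
}
print(validate_record(example))
```

The sketch deliberately does not compare turn_count with the length of turns, since published samples may carry abbreviated turn arrays.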

Sample Records

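Four representative conversation threads spanning industries, channels, languages, and resolution outcomes. The turns arrays in these samples are abbreviated, so each shows fewer turns than its turn_count value.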

customer_service_sample.json

[
  {
    "conversation_id": "CONV-NGA-BK-0094712",
    "country_code": "NG",
    "industry": "BANKING",
    "channel": "WHATSAPP",
    "primary_language": "PIDGIN",
    "has_code_switching": true,
    "turn_count": 8,
    "primary_intent": "TRANSACTION_DISPUTE",
    "resolution": "RESOLVED",
    "first_contact_resolved": true,
    "handle_time_bucket": "5_10MIN",
    "opening_sentiment": "NEGATIVE",
    "closing_sentiment": "POSITIVE",
    "pii_redacted": true,
    "split": "TRAIN",
    "turns": [
      { "role": "customer", "text": "Abeg, dem deduct money from my account but I never buy anything o.", "intent": "TRANSACTION_DISPUTE", "sentiment": "NEGATIVE", "entities": [] },
      { "role": "agent", "text": "Sorry for the stress. Pls share your [ACCOUNT_ID] so I can check.", "intent": null, "sentiment": "NEUTRAL", "entities": [ "[ACCOUNT_ID]" ] },
      { "role": "customer", "text": "It's [ACCOUNT_ID], the amount na [AMOUNT].", "intent": "PROVIDE_DETAILS", "sentiment": "NEGATIVE", "entities": [ "[ACCOUNT_ID]", "[AMOUNT]" ] },
      { "role": "agent", "text": "I can see the deduction. It has been reversed now. Please check.", "intent": null, "sentiment": "POSITIVE", "entities": [] }
    ]
  },
  {
    "conversation_id": "CONV-KEN-TL-0031085",
    "country_code": "KE",
    "industry": "TELCO",
    "channel": "LIVE_CHAT",
    "primary_language": "SWAHILI",
    "has_code_switching": false,
    "turn_count": 5,
    "primary_intent": "DATA_BUNDLE_PURCHASE",
    "resolution": "RESOLVED",
    "first_contact_resolved": true,
    "handle_time_bucket": "UNDER_2MIN",
    "opening_sentiment": "NEUTRAL",
    "closing_sentiment": "POSITIVE",
    "pii_redacted": true,
    "split": "TRAIN",
    "turns": [
      { "role": "customer", "text": "Nataka kununua data bundle ya 5GB.", "intent": "DATA_BUNDLE_PURCHASE", "sentiment": "NEUTRAL", "entities": [ "5GB" ] },
      { "role": "agent", "text": "Sawa, bundle ya 5GB ni KES 500 kwa siku 30. Uthibitishe?", "intent": null, "sentiment": "NEUTRAL", "entities": [ "5GB", "KES 500", "30" ] },
      { "role": "customer", "text": "Ndiyo, thibitisha tafadhali.", "intent": "CONFIRM_ACTION", "sentiment": "POSITIVE", "entities": [] }
    ]
  },
  {
    "conversation_id": "CONV-ZAF-FT-0058341",
    "country_code": "ZA",
    "industry": "FINTECH",
    "channel": "IVR",
    "primary_language": "ENGLISH",
    "has_code_switching": false,
    "turn_count": 12,
    "primary_intent": "LOAN_APPLICATION_STATUS",
    "resolution": "ESCALATED",
    "first_contact_resolved": false,
    "handle_time_bucket": "OVER_10MIN",
    "opening_sentiment": "NEUTRAL",
    "closing_sentiment": "NEGATIVE",
    "pii_redacted": true,
    "split": "VAL",
    "turns": [
      { "role": "customer", "text": "I applied for a loan two weeks ago and I still haven't heard back.", "intent": "LOAN_APPLICATION_STATUS", "sentiment": "NEUTRAL", "entities": [] },
      { "role": "agent", "text": "I apologise for the delay. Can I have your application reference?", "intent": null, "sentiment": "NEUTRAL", "entities": [] },
      { "role": "customer", "text": "It's [APP_REF]. This is taking too long, I need the money urgently.", "intent": "COMPLAINT_ESCALATION", "sentiment": "NEGATIVE", "entities": [ "[APP_REF]" ] }
    ]
  },
  {
    "conversation_id": "CONV-SEN-BK-0072904",
    "country_code": "SN",
    "industry": "BANKING",
    "channel": "WHATSAPP",
    "primary_language": "FRENCH",
    "has_code_switching": false,
    "turn_count": 6,
    "primary_intent": "ACCOUNT_BALANCE_ENQUIRY",
    "resolution": "RESOLVED",
    "first_contact_resolved": true,
    "handle_time_bucket": "UNDER_2MIN",
    "opening_sentiment": "NEUTRAL",
    "closing_sentiment": "POSITIVE",
    "pii_redacted": true,
    "split": "TRAIN",
    "turns": [
      { "role": "customer", "text": "Bonjour, je voudrais connaître le solde de mon compte.", "intent": "ACCOUNT_BALANCE_ENQUIRY", "sentiment": "NEUTRAL", "entities": [] },
      { "role": "agent", "text": "Bonjour! Votre solde actuel est de [BALANCE] FCFA. Puis-je vous aider avec autre chose?", "intent": null, "sentiment": "POSITIVE", "entities": [ "[BALANCE]" ] },
      { "role": "customer", "text": "Non merci, c'est tout. Bonne journée.", "intent": "CLOSE_CONVERSATION", "sentiment": "POSITIVE", "entities": [] }
    ]
  }
]
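For intent classification or sentiment trajectory modelling, the nested turns usually need to be flattened into turn-level rows. The sketch below shows one possible approach on an inline copy of the Kenyan sample thread; the helper names and the numeric mapping of sentiment labels (-1, 0, 1) are assumptions made for illustration, not something the dataset defines.

```python
from typing import Any, Iterator

# Assumption: an ordinal encoding of the three sentiment labels.
SENTIMENT_SCORE = {"NEGATIVE": -1, "NEUTRAL": 0, "POSITIVE": 1}


def customer_intent_pairs(thread: dict[str, Any]) -> Iterator[tuple[str, str]]:
    """Yield (text, intent) for customer turns; agent turns carry a null intent."""
    for turn in thread["turns"]:
        if turn["role"] == "customer" and turn.get("intent"):
            yield turn["text"], turn["intent"]


def customer_sentiment_trajectory(thread: dict[str, Any]) -> list[int]:
    """Encode the customer's sentiment per turn, in conversation order."""
    return [
        SENTIMENT_SCORE[turn["sentiment"]]
        for turn in thread["turns"]
        if turn["role"] == "customer"
    ]


# Inline excerpt of one sample record (thread-level fields omitted).
thread = {
    "conversation_id": "CONV-KEN-TL-0031085",
    "turns": [
        {"role": "customer", "text": "Nataka kununua data bundle ya 5GB.",
         "intent": "DATA_BUNDLE_PURCHASE", "sentiment": "NEUTRAL",
         "entities": ["5GB"]},
        {"role": "agent",
         "text": "Sawa, bundle ya 5GB ni KES 500 kwa siku 30. Uthibitishe?",
         "intent": None, "sentiment": "NEUTRAL",
         "entities": ["5GB", "KES 500", "30"]},
        {"role": "customer", "text": "Ndiyo, thibitisha tafadhali.",
         "intent": "CONFIRM_ACTION", "sentiment": "POSITIVE", "entities": []},
    ],
}

for text, intent in customer_intent_pairs(thread):
    print(intent, "|", text)
print(customer_sentiment_trajectory(thread))
```

Keeping agent turns out of the intent pairs mirrors the samples, where only customer utterances carry an intent label.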

Request Dataset Access

All datasets are available under a commercial licence agreement. Our team typically responds within 2 business days.

NDA may be required
