DS-01 Financial Services

African Banking Transaction Classification Dataset

Labelled transaction records from Nigerian, Ghanaian, and Kenyan retail banks, categorised by type — transfer, bill payment, merchant POS, mobile money, and USSD. Includes merchant codes, payment channel, time-of-day patterns, and binary fraud flags. Purpose-built for fraud detection, AML compliance, and spending analytics across three of Africa's largest fintech markets.

This is a synthetic dataset generated from high-quality expert-labelled seed data. All records are algorithmically derived — statistical distributions, inter-field correlations, and annotation characteristics faithfully replicate real-world patterns from the source data, while ensuring no real individual, organisation, or transaction can be identified or reconstructed.

This dataset is the most comprehensive labelled banking transaction corpus for Sub-Saharan Africa's three largest fintech markets — Nigeria, Ghana, and Kenya — available under a commercial licence. It addresses a critical scarcity of row-level, ML-ready transaction data from African financial institutions in open-access repositories.

Records are sourced from a combination of synthetic datasets calibrated against real central bank statistics (CBN Nigeria, CBK Kenya, Bank of Ghana) and community-contributed datasets validated against actual transaction distributions. The result is a Gold Tier asset covering all five primary transaction categories, eight payment channels, and binary fraud labels aligned to real-world AML incidence rates across the three target economies.

The dataset incorporates Nigeria-specific behavioural signals (religious giving patterns, family remittance flows), Kenya mobile money (M-Pesa Paybill, agent cash-in/cash-out), and Ghana-specific payment instruments (GhanaPay, E-Zwich, GH-Link ATM), making it uniquely suited for training models that must generalise across distinct market contexts rather than a single generic African proxy.

Use Cases

Fraud detection model training
AML compliance & suspicious activity flagging
Customer spending analytics & segmentation
Credit scoring & alternative underwriting
Payment channel classification models
Merchant category code (MCC) tagging
Mobile money behaviour modelling
Cross-market fintech ML benchmarking

Transaction Types Covered

🔄 P2P Transfer
🧾 Bill Payment
🏪 Merchant POS
📱 Mobile Money
📟 USSD
🏧 ATM Withdrawal
💸 Airtime Top-Up
💼 Salary / Payroll

Geographic Coverage

Primary Coverage: Nigeria, Ghana, Kenya

Dataset Schema

Each record represents a single banking transaction with associated metadata, classification labels, fraud flags, and quality signals. Field names use snake_case, and values follow ISO standards for currencies (ISO 4217), dates (ISO 8601), and country codes (ISO 3166-1 alpha-2).

| Field Name | Type | Description | Nullable |
|---|---|---|---|
| transaction_id | STRING | Unique transaction identifier (UUID), prefixed with ISO country code | No |
| country_code | ENUM | ISO 3166-1 alpha-2 country code: NG / KE / GH | No |
| transaction_type | ENUM | Canonical category: transfer / bill_payment / merchant_pos / mobile_money / ussd | No |
| channel | ENUM | Payment channel: mobile_app / ussd / pos / atm / internet / agent | No |
| amount_local | FLOAT | Transaction amount in local currency (NGN / KES / GHS) | No |
| currency_code | ENUM | ISO 4217 currency code derived from country_code | No |
| timestamp | DATETIME | ISO 8601 transaction datetime (UTC) | Yes |
| hour_of_day | INTEGER | Hour of transaction (0–23), derived from timestamp | No |
| day_of_week | INTEGER | Day of week (0=Monday, 6=Sunday), derived from timestamp | No |
| merchant_category_code | ENUM | ISO 18245 MCC or Africa-extended local equivalent (e.g. 5411 = Grocery) | Yes |
| originator_id | STRING | Anonymised sender / account holder identifier | No |
| beneficiary_id | STRING | Anonymised recipient / merchant identifier (prefix 'M' = merchant) | Yes |
| originator_balance_before | FLOAT | Originator account balance before transaction (fraud signal) | Yes |
| originator_balance_after | FLOAT | Originator account balance after transaction (fraud signal) | Yes |
| beneficiary_balance_before | FLOAT | Beneficiary balance before transaction | Yes |
| beneficiary_balance_after | FLOAT | Beneficiary balance after transaction | Yes |
| fraud_flag | BOOLEAN | Binary fraud label: 1 = fraudulent, 0 = legitimate; primary ML target | No |
| aml_flag | BOOLEAN | AML suspicious activity flag (rule-based; high-value transfers) | Yes |
| state_region | ENUM | Sub-national geographic unit (state/county/region); Nigeria has all 36 states + FCT | Yes |
| account_type | ENUM | Account type: current / savings / mobile_money | Yes |
| operator | STRING | Financial institution or MNO operator (e.g. GTBank, Safaricom, MTN) | Yes |
| financial_inclusion_segment | ENUM | Findex-derived user segment: banked / mobile_only / underbanked / unbanked | Yes |
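
The schema's constraints can be checked mechanically before a record enters a training pipeline. The following is a minimal sketch, not a vendor-supplied validator: the function name `validate_record` and the constant names are illustrative assumptions, while the enum values, non-nullable fields, and derivation rules are taken from the field descriptions in the schema.

```python
from datetime import datetime

# Enum values copied from the schema descriptions.
TRANSACTION_TYPES = {"transfer", "bill_payment", "merchant_pos", "mobile_money", "ussd"}
CHANNELS = {"mobile_app", "ussd", "pos", "atm", "internet", "agent"}

# currency_code is described as derived from country_code.
CURRENCY_BY_COUNTRY = {"NG": "NGN", "KE": "KES", "GH": "GHS"}

# Fields marked "No" in the Nullable column.
REQUIRED_FIELDS = [
    "transaction_id", "country_code", "transaction_type", "channel",
    "amount_local", "currency_code", "hour_of_day", "day_of_week",
    "originator_id", "fraud_flag",
]


def validate_record(record: dict) -> list:
    """Return a list of problems found; an empty list means the record passed."""
    problems = []

    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            problems.append(f"missing required field: {field}")

    country = record.get("country_code")
    if country not in CURRENCY_BY_COUNTRY:
        problems.append(f"unknown country_code: {country}")
    elif record.get("currency_code") != CURRENCY_BY_COUNTRY[country]:
        problems.append(
            f"currency_code {record.get('currency_code')} does not match country {country}"
        )

    if record.get("transaction_type") not in TRANSACTION_TYPES:
        problems.append(f"unknown transaction_type: {record.get('transaction_type')}")
    if record.get("channel") not in CHANNELS:
        problems.append(f"unknown channel: {record.get('channel')}")

    # timestamp is nullable; only check the derived fields when it is present.
    ts = record.get("timestamp")
    if ts is not None:
        parsed = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        if parsed.hour != record.get("hour_of_day"):
            problems.append("hour_of_day does not match timestamp")
        # datetime.weekday() uses 0=Monday .. 6=Sunday, the same convention as the schema.
        if parsed.weekday() != record.get("day_of_week"):
            problems.append("day_of_week does not match timestamp")

    return problems
```

Returning a list of problems rather than raising on the first failure makes it easier to profile a whole file and see which constraints fail most often.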

Sample Records

The following representative synthetic samples illustrate the dataset structure, the transaction categories, and both values of the fraud label.

sample_records.json
[
  {
    "transaction_id": "TXN-20240315-NG-000183",
    "country_code": "NG",
    "transaction_type": "merchant_pos",
    "channel": "pos",
    "amount_local": 12500,
    "currency_code": "NGN",
    "timestamp": "2024-03-15T14:32:00Z",
    "hour_of_day": 14,
    "day_of_week": 4,
    "merchant_category_code": "5411",
    "originator_id": "C00482917",
    "beneficiary_id": "M00193847",
    "originator_balance_before": 87400,
    "originator_balance_after": 74900,
    "fraud_flag": false,
    "aml_flag": false,
    "state_region": "Lagos",
    "account_type": "current",
    "financial_inclusion_segment": "banked"
  },
  {
    "transaction_id": "TXN-20240316-KE-002741",
    "country_code": "KE",
    "transaction_type": "mobile_money",
    "channel": "agent",
    "amount_local": 3200,
    "currency_code": "KES",
    "timestamp": "2024-03-16T09:11:00Z",
    "hour_of_day": 9,
    "day_of_week": 5,
    "merchant_category_code": null,
    "originator_id": "C00917364",
    "beneficiary_id": "C00482011",
    "originator_balance_before": 15800,
    "originator_balance_after": 12600,
    "fraud_flag": false,
    "aml_flag": false,
    "operator": "Safaricom",
    "account_type": "mobile_money",
    "financial_inclusion_segment": "mobile_only"
  },
  {
    "transaction_id": "TXN-20240317-GH-007902",
    "country_code": "GH",
    "transaction_type": "transfer",
    "channel": "mobile_app",
    "amount_local": 9500,
    "currency_code": "GHS",
    "timestamp": "2024-03-17T22:47:00Z",
    "hour_of_day": 22,
    "day_of_week": 6,
    "originator_id": "C00728340",
    "beneficiary_id": "C00019023",
    "originator_balance_before": 9500,
    "originator_balance_after": 0,
    "beneficiary_balance_before": 200,
    "beneficiary_balance_after": 9700,
    "fraud_flag": true,
    "aml_flag": true,
    "operator": "MTN",
    "account_type": "mobile_money",
    "financial_inclusion_segment": "mobile_only"
  }
]
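
The schema describes the originator balance fields as fraud signals. As a rough illustration of how they might be turned into model features, the sketch below computes two values per record: whether the balance change matches the amount, and what fraction of the originator's balance the transaction consumed. The function name `balance_features`, the feature names, and the 0.01 tolerance are assumptions made for this example, and the inline records are trimmed copies of the samples above. Three records say nothing about how predictive these features are on the full dataset.

```python
import json

# Trimmed copies of the three sample records (only the fields used here).
SAMPLE = """
[
  {"transaction_id": "TXN-20240315-NG-000183", "amount_local": 12500,
   "originator_balance_before": 87400, "originator_balance_after": 74900,
   "fraud_flag": false},
  {"transaction_id": "TXN-20240316-KE-002741", "amount_local": 3200,
   "originator_balance_before": 15800, "originator_balance_after": 12600,
   "fraud_flag": false},
  {"transaction_id": "TXN-20240317-GH-007902", "amount_local": 9500,
   "originator_balance_before": 9500, "originator_balance_after": 0,
   "fraud_flag": true}
]
"""


def balance_features(record: dict) -> dict:
    """Derive two illustrative features from the nullable balance fields."""
    before = record.get("originator_balance_before")
    after = record.get("originator_balance_after")
    amount = record["amount_local"]

    # Both balance fields are nullable, so features may be unavailable.
    if before is None or after is None:
        return {"balance_consistent": None, "drained_fraction": None}

    # True when the balance dropped by exactly the transaction amount
    # (within a small tolerance for floating-point amounts).
    consistent = abs((before - after) - amount) < 0.01

    # Share of the pre-transaction balance consumed by this transaction.
    drained = amount / before if before > 0 else None

    return {"balance_consistent": consistent, "drained_fraction": drained}


records = json.loads(SAMPLE)
for record in records:
    print(record["transaction_id"], record["fraud_flag"], balance_features(record))
```

In these samples the only record flagged as fraudulent is also the only one that empties the originator account (`drained_fraction` of 1.0).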
Request Dataset Access

All datasets are available under a commercial licence agreement. Our team typically responds within 2 business days.

NDA may be required
