African Banking Transaction Classification Dataset
Labelled synthetic transaction records modelled on Nigerian, Ghanaian, and Kenyan retail banking, categorised by type: transfer, bill payment, merchant POS, mobile money, and USSD. Includes merchant codes, payment channel, time-of-day patterns, and binary fraud flags. Purpose-built for fraud detection, AML compliance, and spending analytics across three of Africa's largest fintech markets.
This is a synthetic dataset generated from high-quality expert-labelled seed data. All records are algorithmically derived — statistical distributions, inter-field correlations, and annotation characteristics faithfully replicate real-world patterns from the source data, while ensuring no real individual, organisation, or transaction can be identified or reconstructed.
This dataset is a labelled banking transaction corpus covering Nigeria, Ghana, and Kenya, three of Sub-Saharan Africa's largest fintech markets, and is available under a commercial licence. It addresses the scarcity of row-level, ML-ready transaction data from African financial institutions in open-access repositories.
Records are sourced from a combination of synthetic datasets calibrated against real central bank statistics (CBN Nigeria, CBK Kenya, Bank of Ghana) and community-contributed datasets validated against actual transaction distributions. The result is a Gold Tier asset covering all five primary transaction categories, eight payment channels, and binary fraud labels aligned to real-world AML incidence rates across the three target economies.
The dataset incorporates Nigeria-specific behavioural signals (religious giving patterns, family remittance flows), Kenya-specific mobile money flows (M-Pesa Paybill, agent cash-in/cash-out), and Ghana-specific payment instruments (GhanaPay, E-Zwich, GH-Link ATM). This makes it suited to training models that must generalise across distinct market contexts rather than rely on a single generic African proxy.
Use Cases
Transaction Types Covered
Geographic Coverage
Dataset Schema
Each record represents a single banking transaction with associated metadata, classification labels, fraud flags, and quality signals. Field naming follows snake_case with ISO standards for currencies (ISO 4217), dates (ISO 8601), and country codes (ISO 3166-1 alpha-2).
| Field Name | Type | Description | Nullable |
|---|---|---|---|
| transaction_id | STRING | Unique transaction identifier (UUID), prefixed with ISO country code | No |
| country_code | ENUM | ISO 3166-1 alpha-2 country code: NG / KE / GH | No |
| transaction_type | ENUM | Canonical category: transfer / bill_payment / merchant_pos / mobile_money / ussd | No |
| channel | ENUM | Payment channel: mobile_app / ussd / pos / atm / internet / agent | No |
| amount_local | FLOAT | Transaction amount in local currency (NGN / KES / GHS) | No |
| currency_code | ENUM | ISO 4217 currency code derived from country_code | No |
| timestamp | DATETIME | ISO 8601 transaction datetime (UTC) | Yes |
| hour_of_day | INTEGER | Hour of transaction (0–23), derived from timestamp | No |
| day_of_week | INTEGER | Day of week (0=Monday, 6=Sunday), derived from timestamp | No |
| merchant_category_code | ENUM | ISO 18245 MCC or Africa-extended local equivalent (e.g. 5411 = Grocery) | Yes |
| originator_id | STRING | Anonymised sender / account holder identifier | No |
| beneficiary_id | STRING | Anonymised recipient / merchant identifier (prefix 'M' = merchant) | Yes |
| originator_balance_before | FLOAT | Originator account balance before transaction — fraud signal | Yes |
| originator_balance_after | FLOAT | Originator account balance after transaction — fraud signal | Yes |
| beneficiary_balance_before | FLOAT | Beneficiary balance before transaction | Yes |
| beneficiary_balance_after | FLOAT | Beneficiary balance after transaction | Yes |
| fraud_flag | BOOLEAN | Binary fraud label: 1 = fraudulent, 0 = legitimate — primary ML target | No |
| aml_flag | BOOLEAN | AML suspicious activity flag (rule-based; high-value transfers) | Yes |
| state_region | ENUM | Sub-national geographic unit (state/county/region); Nigeria covers all 36 states plus the FCT | Yes |
| account_type | ENUM | Account type: current / savings / mobile_money | Yes |
| operator | STRING | Financial institution or MNO operator (e.g. GTBank, Safaricom, MTN) | Yes |
| financial_inclusion_segment | ENUM | Findex-derived user segment: banked / mobile_only / underbanked / unbanked | Yes |
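
The schema's enum and derived-field rules can be checked mechanically before training. The following is a minimal sketch rather than an official loader: it assumes each record is available as a Python dict keyed by the field names above, that `timestamp` arrives as an ISO 8601 string, that `hour_of_day` and `day_of_week` are derived in UTC, and that the country prefix on `transaction_id` is the bare two-letter code at the start of the string. The function name `validate_record` is hypothetical.

```python
from datetime import datetime, timezone

# ISO 3166-1 alpha-2 country code -> ISO 4217 currency code, per the schema.
CURRENCY_BY_COUNTRY = {"NG": "NGN", "KE": "KES", "GH": "GHS"}
TRANSACTION_TYPES = {"transfer", "bill_payment", "merchant_pos", "mobile_money", "ussd"}
CHANNELS = {"mobile_app", "ussd", "pos", "atm", "internet", "agent"}


def validate_record(record):
    """Return a list of schema violations for one record (empty if none found)."""
    errors = []

    country = record.get("country_code")
    if country not in CURRENCY_BY_COUNTRY:
        errors.append("unknown country_code")
    else:
        if record.get("currency_code") != CURRENCY_BY_COUNTRY[country]:
            errors.append("currency_code does not match country_code")
        # Assumption: the prefix is the bare two-letter code at the start of the ID.
        if not str(record.get("transaction_id", "")).startswith(country):
            errors.append("transaction_id is not prefixed with country_code")

    if record.get("transaction_type") not in TRANSACTION_TYPES:
        errors.append("unknown transaction_type")
    if record.get("channel") not in CHANNELS:
        errors.append("unknown channel")
    if record.get("fraud_flag") not in (0, 1):  # True/False also pass, since True == 1
        errors.append("fraud_flag must be 0 or 1")

    timestamp = record.get("timestamp")  # nullable in the schema
    if timestamp is not None:
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        if parsed.tzinfo is not None:
            parsed = parsed.astimezone(timezone.utc)
        if record.get("hour_of_day") != parsed.hour:
            errors.append("hour_of_day does not match timestamp")
        if record.get("day_of_week") != parsed.weekday():  # Monday == 0
            errors.append("day_of_week does not match timestamp")

    return errors
```

Calling `validate_record` on each row and keeping only rows that return an empty list gives a simple pre-training filter; the remaining nullable fields are not checked here.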
Sample Records
The following are representative synthetic samples demonstrating the dataset structure, transaction category depth, and fraud label distribution.