DS-02 Financial Services

African FinTech Loan Application & Approval Dataset

500K+ loan application records from digital lending platforms across Nigeria, Kenya, Ghana, and South Africa. Records include applicant demographics, employment status, mobile money history, credit bureau scores, loan amounts, and approval/rejection outcomes with reason codes. The dataset is purpose-built for credit risk modelling, alternative scoring, and financial inclusion analytics.

This is a synthetic dataset generated from high-quality expert-labelled seed data. All records are algorithmically derived — statistical distributions, inter-field correlations, and annotation characteristics faithfully replicate real-world patterns from the source data, while ensuring no real individual, organisation, or transaction can be identified or reconstructed.

The DS-02 African FinTech Loan Application & Approval Dataset is the most comprehensive open-access-derived credit dataset for Sub-Saharan Africa currently available under a commercial licence. Assembled by unifying demand-side survey microdata, competition loan records, and macro financial indicators, it addresses a critical data gap for teams building credit risk and financial inclusion models across four of Africa's largest economies.

The dataset draws on eight primary sources — including EFInA's Access to Finance Nigeria 2023 survey, Kenya's FinAccess 2021 microdata, the World Bank Enterprise Survey, World Bank Findex 2021, and the Zindi African Credit Scoring Challenge — producing a unified schema of 20 canonical fields per applicant record. Macro enrichment from the IMF Financial Access Survey 2024 provides country-year indicators for contextual modelling.

Key features include binary loan approval outcomes, rejection reason codes, mobile money usage flags, employment and income proxies, and a default outcome label derived from repayment records. The unified schema is designed for direct ingestion into credit risk pipelines, with preprocessing steps documented per source.

Use Cases

Credit risk model training & validation
Alternative credit scoring for thin-file borrowers
Financial inclusion analytics & reporting
Loan default prediction & early warning
Gender & demographic bias analysis in lending
SME lending underwriting model development
Mobile money feature engineering for credit
Regulatory reporting & financial sector research
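
As a rough illustration of the gender and demographic bias analysis use case listed above, the sketch below computes approval rates per group and the gap between the highest and lowest rate. It assumes records are plain dictionaries keyed by the schema field names `applicant_gender` and `loan_approved`; the helper name `approval_rate_by_group` is hypothetical and not part of any delivered tooling. The three toy records only show the mechanics and are far too few to support any conclusion.

```python
from collections import defaultdict


def approval_rate_by_group(records, group_field="applicant_gender"):
    """Return {group value: share of applications approved} for one field."""
    counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
    for record in records:
        group = record.get(group_field)
        if group is None:
            # The grouping field is nullable, so skip records without it.
            continue
        counts[group][1] += 1
        if record["loan_approved"]:
            counts[group][0] += 1
    return {group: approved / total for group, (approved, total) in counts.items()}


# Toy input for illustration only.
records = [
    {"applicant_gender": "F", "loan_approved": True},
    {"applicant_gender": "M", "loan_approved": True},
    {"applicant_gender": "M", "loan_approved": False},
]

rates = approval_rate_by_group(records)
# Difference between the best- and worst-served group (a simple parity gap).
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

A raw approval-rate gap does not control for income, loan size, or lender type, so in practice it would be a starting point for analysis, not a finding in itself.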

Data Quality Scores

Recency 90%
Geographic Coverage 84%
Volume 78%
Schema Richness 86%
Update Cadence 74%
Reliability / Provenance 96%

Geographic Coverage

[Map: primary coverage versus other regions]

Source Summary

Primary Sources: 8 (qualifying datasets ingested)
Top Quality Score: 26/30 (EFInA A2F, World Bank ES, IMF FAS)
Canonical Fields: 20 (unified schema fields)
Target Economies: 4 (Nigeria · Kenya · Ghana · South Africa)

Dataset Schema

Each record represents a single loan application or survey respondent mapped to the unified canonical schema. Fields use snake_case; country-specific fields are prefixed with ISO 3166 codes where unification is not possible. All currency values are expressed in USD equivalent.

Field Name | Type | Description | Nullable
applicant_id | STRING | Unique identifier for the loan application or survey respondent | No
country_code | ENUM | ISO 3166-1 alpha-2 country code of applicant (NG / KE / GH / ZA) | No
application_date | DATE | Date of loan application or survey interview (ISO 8601) | Yes
applicant_age | INTEGER | Age of applicant in years at time of application | Yes
applicant_gender | ENUM | Gender of applicant: M / F / Other | Yes
employment_status | ENUM | employed_formal / employed_informal / self_employed / unemployed | Yes
income_monthly_usd | FLOAT | Applicant's monthly income in USD (converted from local currency) | Yes
education_level | ENUM | Highest education: none / primary / secondary / tertiary | Yes
has_bank_account | BOOLEAN | Whether applicant holds a formal bank account | Yes
has_mobile_money_account | BOOLEAN | Whether applicant has an active mobile money account | Yes
loan_amount_usd | FLOAT | Requested loan amount in USD | Yes
loan_purpose | ENUM | business / household_expense / education / health / agriculture / other | Yes
loan_term_days | INTEGER | Duration of loan in days | Yes
loan_approved | BOOLEAN | Primary label: whether loan application was approved (1) or rejected (0) | No
default_outcome | BOOLEAN | Whether approved loan subsequently defaulted (repayment failure) | Yes
rejection_reason_code | ENUM | no_collateral / low_income / no_credit_history / documentation / other | Yes
urban_rural | ENUM | Urban or rural location of applicant | Yes
lender_type | ENUM | bank / MFI / mobile_lender / SACCO / informal | Yes
macro_sme_loans_gdp | FLOAT | Country-year macro: SME loans as % of GDP (from IMF FAS enrichment) | Yes
macro_mfi_accounts_per1000 | FLOAT | Country-year macro: MFI loan accounts per 1,000 adults (IMF FAS) | Yes
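
The schema above lends itself to a lightweight pre-ingestion check. The following is a minimal sketch, assuming each record arrives as a Python dictionary keyed by the field names in the table. The allowed enum values and the three non-nullable fields are copied from the table; the names `ENUMS`, `REQUIRED`, and `validate_record` are hypothetical and not part of any shipped tooling. `urban_rural` is left out of the enum check because the table does not list its allowed values.

```python
from datetime import date

# Allowed values copied from the schema table (illustrative constant name).
ENUMS = {
    "country_code": {"NG", "KE", "GH", "ZA"},
    "applicant_gender": {"M", "F", "Other"},
    "employment_status": {"employed_formal", "employed_informal",
                          "self_employed", "unemployed"},
    "education_level": {"none", "primary", "secondary", "tertiary"},
    "loan_purpose": {"business", "household_expense", "education",
                     "health", "agriculture", "other"},
    "rejection_reason_code": {"no_collateral", "low_income",
                              "no_credit_history", "documentation", "other"},
    "lender_type": {"bank", "MFI", "mobile_lender", "SACCO", "informal"},
}

# Fields marked "No" in the Nullable column.
REQUIRED = ("applicant_id", "country_code", "loan_approved")


def validate_record(record):
    """Return a list of problems found in one record (empty list = passed)."""
    problems = []
    for field in REQUIRED:
        if record.get(field) is None:
            problems.append(f"missing required field: {field}")
    for field, allowed in ENUMS.items():
        value = record.get(field)
        # Nullable enums may be absent; only check values that are present.
        if value is not None and value not in allowed:
            problems.append(f"unexpected value for {field}: {value!r}")
    raw_date = record.get("application_date")
    if raw_date is not None:
        try:
            date.fromisoformat(raw_date)
        except (TypeError, ValueError):
            problems.append(f"application_date is not ISO 8601: {raw_date!r}")
    return problems
```

Returning a list of problems instead of raising on the first failure makes it easier to log every issue in a record before deciding whether to drop or repair it.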

Sample Records

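The following are representative synthetic samples demonstrating the dataset structure. Each record corresponds to a single loan application mapped to the unified schema. Nullable fields that do not apply to a record (for example, default_outcome on a rejected application, or rejection_reason_code on an approved one) are omitted.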

sample_records.json
[
  {
    "applicant_id": "NG_20230041",
    "country_code": "NG",
    "application_date": "2023-07-15",
    "applicant_age": 34,
    "applicant_gender": "F",
    "employment_status": "self_employed",
    "income_monthly_usd": 210.5,
    "education_level": "secondary",
    "has_bank_account": true,
    "has_mobile_money_account": true,
    "loan_amount_usd": 450,
    "loan_purpose": "business",
    "loan_term_days": 30,
    "loan_approved": true,
    "default_outcome": false,
    "urban_rural": "urban",
    "lender_type": "mobile_lender",
    "macro_sme_loans_gdp": 4.2,
    "macro_mfi_accounts_per1000": 87.3
  },
  {
    "applicant_id": "KE_20210882",
    "country_code": "KE",
    "application_date": "2021-04-03",
    "applicant_age": 27,
    "applicant_gender": "M",
    "employment_status": "employed_informal",
    "income_monthly_usd": 145,
    "education_level": "primary",
    "has_bank_account": false,
    "has_mobile_money_account": true,
    "loan_amount_usd": 80,
    "loan_purpose": "household_expense",
    "loan_term_days": 14,
    "loan_approved": true,
    "default_outcome": true,
    "urban_rural": "rural",
    "lender_type": "MFI",
    "macro_sme_loans_gdp": 5.8,
    "macro_mfi_accounts_per1000": 134.1
  },
  {
    "applicant_id": "GH_20240319",
    "country_code": "GH",
    "application_date": "2024-01-22",
    "applicant_age": 42,
    "applicant_gender": "M",
    "employment_status": "employed_formal",
    "income_monthly_usd": 520,
    "education_level": "tertiary",
    "has_bank_account": true,
    "has_mobile_money_account": true,
    "loan_amount_usd": 1200,
    "loan_purpose": "business",
    "loan_term_days": 90,
    "loan_approved": false,
    "rejection_reason_code": "no_credit_history",
    "urban_rural": "urban",
    "lender_type": "bank",
    "macro_sme_loans_gdp": 3.6,
    "macro_mfi_accounts_per1000": 62.4
  }
]

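
Because the sample records leave out nullable fields that do not apply, code that reads them should not assume every key is present. The sketch below is one way to handle that, assuming the records are parsed with Python's standard `json` module. The embedded JSON is a trimmed copy of the three samples (only the outcome fields are kept), and the helper name `summarise` is hypothetical.

```python
import json

# Trimmed copy of the three sample records: outcome-related fields only.
SAMPLE_JSON = """
[
  {"applicant_id": "NG_20230041", "loan_approved": true, "default_outcome": false},
  {"applicant_id": "KE_20210882", "loan_approved": true, "default_outcome": true},
  {"applicant_id": "GH_20240319", "loan_approved": false,
   "rejection_reason_code": "no_credit_history"}
]
"""


def summarise(records):
    """Compute approval rate, default rate among approved loans, and reason counts."""
    approved = [r for r in records if r["loan_approved"]]
    # default_outcome is nullable, so use .get() and skip records without it.
    with_outcome = [r for r in approved if r.get("default_outcome") is not None]
    defaults = sum(1 for r in with_outcome if r["default_outcome"])
    reasons = {}
    for r in records:
        reason = r.get("rejection_reason_code")
        if reason is not None:
            reasons[reason] = reasons.get(reason, 0) + 1
    return {
        "approval_rate": len(approved) / len(records) if records else None,
        "default_rate_among_approved": (
            defaults / len(with_outcome) if with_outcome else None
        ),
        "rejection_reasons": reasons,
    }


summary = summarise(json.loads(SAMPLE_JSON))
print(summary)
```

The default rate is computed only over approved loans with a recorded outcome, since a rejected application has no repayment history to default on.
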
Request Dataset Access

All datasets are available under a commercial licence agreement. Our team typically responds within 2 business days.

NDA may be required
