DS-16 HealthCare

African Mental Health & Wellbeing Survey Dataset

50K+ mental health and wellbeing survey responses from Nigeria, Kenya, Ghana, and Uganda — collected using culturally adapted versions of PHQ-9, GAD-7, and SRQ-20 — for training risk-stratification models, personalising digital mental health tools, and supporting public health planning across African populations.

This is a synthetic dataset generated from high-quality expert-labelled seed data. All records are algorithmically derived — statistical distributions, inter-field correlations, and annotation characteristics faithfully replicate real-world patterns from the source data, while ensuring no real individual, organisation, or transaction can be identified or reconstructed.

The African Mental Health & Wellbeing Survey Dataset aggregates 50K+ structured survey responses collected across Nigeria, Kenya, Ghana, and Uganda using culturally adapted versions of three validated instruments: the Patient Health Questionnaire (PHQ-9) for depression screening, the Generalised Anxiety Disorder scale (GAD-7) for anxiety, and the Self-Reporting Questionnaire (SRQ-20) for common mental disorders. Adaptations were developed through community co-design workshops and validated for psychometric equivalence. Surveys were administered in local languages, either by trained lay counsellors or through digital self-completion.

Each respondent record captures the full item-level responses for all three instruments alongside demographic variables, help-seeking behaviour, reported stressors, and a clinician-validated severity label for a stratified sub-sample. Stressor taxonomy covers financial stress, bereavement, intimate partner violence, food insecurity, chronic illness, and employment loss — contextually prevalent drivers of mental health burden in the study populations. Urban, peri-urban, and rural respondents are represented in proportions reflecting national population distributions.

The dataset supports a range of AI applications: risk stratification models that triage users of digital mental health apps, population-level burden estimation and sub-group analysis, personalisation engines that adapt therapeutic content based on symptom profiles, and fairness auditing to detect demographic performance disparities in mental health screening tools. A longitudinal follow-up sub-sample (n=8,000) with 3-month and 6-month re-survey data is included for temporal modelling of treatment response and natural recovery trajectories.
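As a rough illustration of the temporal modelling mentioned above, the sketch below pairs each respondent's follow-up PHQ-9 scores with their baseline score. The field names (respondent_id, survey_wave, phq9_score) and wave values come from the Dataset Schema section; the rows themselves and the helper name phq9_change_from_baseline are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical rows for illustration only; these values are NOT taken from the dataset.
rows = [
    {"respondent_id": "RSP-XXX-0000001", "survey_wave": "BASELINE", "phq9_score": 14},
    {"respondent_id": "RSP-XXX-0000001", "survey_wave": "FOLLOWUP_3M", "phq9_score": 10},
    {"respondent_id": "RSP-XXX-0000001", "survey_wave": "FOLLOWUP_6M", "phq9_score": 6},
    {"respondent_id": "RSP-XXX-0000002", "survey_wave": "BASELINE", "phq9_score": 8},
    {"respondent_id": "RSP-XXX-0000002", "survey_wave": "FOLLOWUP_3M", "phq9_score": 9},
]


def phq9_change_from_baseline(rows):
    """Return {respondent_id: {wave: follow-up score minus baseline score}}."""
    # Group scores by respondent, keyed by survey wave.
    by_respondent = defaultdict(dict)
    for row in rows:
        by_respondent[row["respondent_id"]][row["survey_wave"]] = row["phq9_score"]

    changes = {}
    for respondent_id, waves in by_respondent.items():
        if "BASELINE" not in waves:
            continue  # no change can be computed without a baseline score
        baseline = waves["BASELINE"]
        changes[respondent_id] = {
            wave: score - baseline
            for wave, score in waves.items()
            if wave != "BASELINE"
        }
    return changes


changes = phq9_change_from_baseline(rows)
```

A negative value indicates a lower (improved) PHQ-9 score relative to baseline. Respondents without a given follow-up wave simply have no entry for it, which keeps missing waves distinct from a change of zero.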

Key Use Cases

Depression and anxiety risk stratification for digital mental health apps
Population-level mental health burden estimation and mapping
Personalisation of therapeutic content based on symptom profiles
Stressor-aware chatbot and conversational AI for mental wellbeing
Fairness and bias auditing across demographic subgroups
Longitudinal treatment response and natural recovery modelling
PHQ-9 / GAD-7 adaptive item selection (CAT) research
Integration with CHW decision-support for community mental health

Dataset Highlights

Survey Respondents: 50K+ (across 4 countries)
Instruments: 3 (PHQ-9, GAD-7, SRQ-20, adapted)
Stressor Categories: 6 (financial, bereavement, IPV, food, illness, employment)
Longitudinal Sub-sample: 8,000 (3-month & 6-month follow-up)

Severity Distribution (PHQ-9)

Minimal (0–4): 41%
Mild (5–9): 28%
Moderate (10–14): 18%
Moderately Severe (15–19): 9%
Severe (20–27): 4%
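The score ranges above can be turned into a small lookup. The following is a minimal sketch, assuming the band boundaries listed here and the phq9_severity labels from the schema (MINIMAL, MILD, MODERATE, MOD_SEVERE, SEVERE); the function name phq9_band is hypothetical and not part of the dataset.

```python
# Inclusive score ranges and their severity labels, as listed in this document.
PHQ9_BANDS = [
    (0, 4, "MINIMAL"),
    (5, 9, "MILD"),
    (10, 14, "MODERATE"),
    (15, 19, "MOD_SEVERE"),
    (20, 27, "SEVERE"),
]


def phq9_band(score: int) -> str:
    """Map a total PHQ-9 score (0-27) to its severity band label."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int):
        raise TypeError("PHQ-9 total score must be an integer")
    for low, high, label in PHQ9_BANDS:
        if low <= score <= high:
            return label
    raise ValueError(f"PHQ-9 total score out of range 0-27: {score}")
```

A check such as phq9_band(record["phq9_score"]) == record["phq9_severity"] could then be used to confirm that the two fields agree within a record.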

Geographic Coverage

Primary coverage: Nigeria, Kenya, Ghana, and Uganda.

Dataset Schema

Each record represents one survey respondent at one time point. Fields cover demographics, instrument scores, stressor flags, help-seeking behaviour, and validated severity labels.

| Field Name | Type | Description | Nullable | Example |
| --- | --- | --- | --- | --- |
| respondent_id | STRING | Anonymised persistent respondent identifier | No | RSP-NGA-0094821 |
| country_code | STRING | ISO 3166-1 alpha-2 country code | No | NG |
| survey_wave | ENUM | Survey round: BASELINE, FOLLOWUP_3M, FOLLOWUP_6M | No | BASELINE |
| age_group | ENUM | Age band: 18_24, 25_34, 35_44, 45_54, 55_PLUS | No | 25_34 |
| gender | ENUM | Gender: MALE, FEMALE, NON_BINARY | No | FEMALE |
| residence_type | ENUM | Settlement type: URBAN, PERI_URBAN, RURAL | No | URBAN |
| phq9_score | INTEGER | Total PHQ-9 score (0–27) | No | 11 |
| gad7_score | INTEGER | Total GAD-7 score (0–21) | No | 8 |
| srq20_score | INTEGER | Total SRQ-20 score (0–20) | Yes | 7 |
| phq9_severity | ENUM | PHQ-9 severity band: MINIMAL, MILD, MODERATE, MOD_SEVERE, SEVERE | No | MODERATE |
| primary_stressor | ENUM | Self-reported primary stressor: FINANCIAL, BEREAVEMENT, IPV, FOOD_INSECURITY, CHRONIC_ILLNESS, EMPLOYMENT_LOSS, NONE | Yes | FINANCIAL |
| help_seeking | ENUM | Current help-seeking status: NONE, INFORMAL, TRADITIONAL, FORMAL_HEALTH | No | NONE |
| prior_diagnosis | BOOLEAN | True if respondent reports a prior formal mental health diagnosis | Yes | false |
| clinician_label | ENUM | Clinician-validated severity (stratified sub-sample only): NONE, MILD, MODERATE, SEVERE; null if not in sub-sample | Yes | MODERATE |
| is_longitudinal | BOOLEAN | True if respondent is in the 8,000-person longitudinal follow-up sub-sample | No | false |
| administration_mode | ENUM | How survey was completed: SELF_DIGITAL, LAY_COUNSELLOR, PHONE | No | LAY_COUNSELLOR |
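The schema above lends itself to a lightweight validation pass before modelling. The sketch below is one possible approach, not a validator shipped with the dataset: it checks only the enum fields and the three score ranges documented in the table, and the names ENUMS, SCORE_RANGES, NULLABLE, and validate_record are hypothetical.

```python
# Allowed values and nullability, transcribed from the schema table in this document.
ENUMS = {
    "survey_wave": {"BASELINE", "FOLLOWUP_3M", "FOLLOWUP_6M"},
    "age_group": {"18_24", "25_34", "35_44", "45_54", "55_PLUS"},
    "gender": {"MALE", "FEMALE", "NON_BINARY"},
    "residence_type": {"URBAN", "PERI_URBAN", "RURAL"},
    "phq9_severity": {"MINIMAL", "MILD", "MODERATE", "MOD_SEVERE", "SEVERE"},
    "primary_stressor": {"FINANCIAL", "BEREAVEMENT", "IPV", "FOOD_INSECURITY",
                         "CHRONIC_ILLNESS", "EMPLOYMENT_LOSS", "NONE"},
    "help_seeking": {"NONE", "INFORMAL", "TRADITIONAL", "FORMAL_HEALTH"},
    "clinician_label": {"NONE", "MILD", "MODERATE", "SEVERE"},
    "administration_mode": {"SELF_DIGITAL", "LAY_COUNSELLOR", "PHONE"},
}
SCORE_RANGES = {"phq9_score": (0, 27), "gad7_score": (0, 21), "srq20_score": (0, 20)}
NULLABLE = {"srq20_score", "primary_stressor", "prior_diagnosis", "clinician_label"}


def validate_record(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for field, allowed in ENUMS.items():
        value = record.get(field)
        if value is None:
            if field not in NULLABLE:
                problems.append(f"{field}: missing or null")
        elif value not in allowed:
            problems.append(f"{field}: unexpected value {value!r}")
    for field, (low, high) in SCORE_RANGES.items():
        value = record.get(field)
        if value is None:
            if field not in NULLABLE:
                problems.append(f"{field}: missing or null")
        elif isinstance(value, bool) or not isinstance(value, int) or not low <= value <= high:
            problems.append(f"{field}: {value!r} is not an integer in {low}-{high}")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to summarise data-quality issues across many records. The identifier, country code, and boolean fields are deliberately left unchecked here.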

Sample Records

Four representative respondent records spanning countries, severity levels, stressors, and help-seeking behaviours.

mental_health_sample.json
[
  {
    "respondent_id": "RSP-NGA-0094821",
    "country_code": "NG",
    "survey_wave": "BASELINE",
    "age_group": "25_34",
    "gender": "FEMALE",
    "residence_type": "URBAN",
    "phq9_score": 11,
    "gad7_score": 8,
    "srq20_score": 7,
    "phq9_severity": "MODERATE",
    "primary_stressor": "FINANCIAL",
    "help_seeking": "NONE",
    "prior_diagnosis": false,
    "clinician_label": "MODERATE",
    "is_longitudinal": false,
    "administration_mode": "LAY_COUNSELLOR"
  },
  {
    "respondent_id": "RSP-KEN-0041203",
    "country_code": "KE",
    "survey_wave": "BASELINE",
    "age_group": "35_44",
    "gender": "MALE",
    "residence_type": "RURAL",
    "phq9_score": 3,
    "gad7_score": 2,
    "srq20_score": 1,
    "phq9_severity": "MINIMAL",
    "primary_stressor": "NONE",
    "help_seeking": "NONE",
    "prior_diagnosis": false,
    "clinician_label": null,
    "is_longitudinal": false,
    "administration_mode": "SELF_DIGITAL"
  },
  {
    "respondent_id": "RSP-UGA-0078341",
    "country_code": "UG",
    "survey_wave": "FOLLOWUP_3M",
    "age_group": "18_24",
    "gender": "FEMALE",
    "residence_type": "PERI_URBAN",
    "phq9_score": 19,
    "gad7_score": 15,
    "srq20_score": 14,
    "phq9_severity": "MOD_SEVERE",
    "primary_stressor": "IPV",
    "help_seeking": "INFORMAL",
    "prior_diagnosis": null,
    "clinician_label": "SEVERE",
    "is_longitudinal": true,
    "administration_mode": "LAY_COUNSELLOR"
  },
  {
    "respondent_id": "RSP-GHA-0033871",
    "country_code": "GH",
    "survey_wave": "BASELINE",
    "age_group": "45_54",
    "gender": "FEMALE",
    "residence_type": "URBAN",
    "phq9_score": 7,
    "gad7_score": 5,
    "srq20_score": null,
    "phq9_severity": "MILD",
    "primary_stressor": "BEREAVEMENT",
    "help_seeking": "TRADITIONAL",
    "prior_diagnosis": true,
    "clinician_label": null,
    "is_longitudinal": true,
    "administration_mode": "PHONE"
  }
]
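Because several fields are nullable, downstream code needs to treat null explicitly. The sketch below shows one way to do that with the standard json module, assuming records shaped like the sample above. To stay self-contained it embeds a trimmed copy of the four sample records (only four fields each) instead of reading mental_health_sample.json from disk.

```python
import json

# Trimmed copy of the four sample records; only the fields used below are kept.
SAMPLE = """
[
  {"respondent_id": "RSP-NGA-0094821", "phq9_severity": "MODERATE",
   "srq20_score": 7, "clinician_label": "MODERATE"},
  {"respondent_id": "RSP-KEN-0041203", "phq9_severity": "MINIMAL",
   "srq20_score": 1, "clinician_label": null},
  {"respondent_id": "RSP-UGA-0078341", "phq9_severity": "MOD_SEVERE",
   "srq20_score": 14, "clinician_label": "SEVERE"},
  {"respondent_id": "RSP-GHA-0033871", "phq9_severity": "MILD",
   "srq20_score": null, "clinician_label": null}
]
"""

records = json.loads(SAMPLE)  # JSON null becomes Python None

# Keep only respondents in the clinician-validated sub-sample.
labelled = [r for r in records if r["clinician_label"] is not None]

# Average SRQ-20 over respondents who have a score, skipping nulls
# instead of silently treating them as zero.
srq20_values = [r["srq20_score"] for r in records if r["srq20_score"] is not None]
mean_srq20 = sum(srq20_values) / len(srq20_values)

print(len(labelled), "clinician-labelled records")
print("mean SRQ-20 over non-null scores:", round(mean_srq20, 2))
```

With the full file, json.load on an open file handle would replace the embedded string. Note that a clinician_label of "NONE" is a valid label meaning no disorder, so it must not be confused with null, which means the respondent was outside the validated sub-sample.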
Request Dataset Access

All datasets are available under a commercial licence agreement. Our team typically responds within 2 business days.

NDA may be required
