African Mental Health & Wellbeing Survey Dataset
50K+ mental health and wellbeing survey responses from Nigeria, Kenya, Ghana, and Uganda — collected using culturally adapted versions of PHQ-9, GAD-7, and SRQ-20 — for training risk-stratification models, personalising digital mental health tools, and supporting public health planning across African populations.
This is a synthetic dataset generated from high-quality expert-labelled seed data. All records are algorithmically derived — statistical distributions, inter-field correlations, and annotation characteristics faithfully replicate real-world patterns from the source data, while ensuring no real individual, organisation, or transaction can be identified or reconstructed.
The African Mental Health & Wellbeing Survey Dataset aggregates 50K+ structured survey responses collected across Nigeria, Kenya, Ghana, and Uganda using culturally adapted versions of three validated instruments: the Patient Health Questionnaire (PHQ-9) for depression screening, the Generalised Anxiety Disorder scale (GAD-7) for anxiety, and the Self-Reporting Questionnaire (SRQ-20) for common mental disorders. The adaptations were developed through community co-design workshops, validated for psychometric equivalence, and administered in local languages, either by trained lay counsellors or through digital self-completion.
Each respondent record captures the full item-level responses for all three instruments alongside demographic variables, help-seeking behaviour, and reported stressors. Respondents in a stratified sub-sample also carry a clinician-validated severity label. The stressor taxonomy covers financial stress, bereavement, intimate partner violence, food insecurity, chronic illness, and employment loss, all contextually prevalent drivers of mental health burden in the study populations. Urban, peri-urban, and rural respondents are represented in proportions reflecting national population distributions.
The dataset supports a range of AI applications: risk stratification models that triage users of digital mental health apps, population-level burden estimation and sub-group analysis, personalisation engines that adapt therapeutic content based on symptom profiles, and fairness auditing to detect demographic performance disparities in mental health screening tools. A longitudinal follow-up sub-sample (n=8,000) with 3-month and 6-month re-survey data is included for temporal modelling of treatment response and natural recovery trajectories.
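As one illustration of the fairness-auditing use case, the sketch below compares how often a PHQ-9 screen flags clinician-confirmed cases in each demographic group. It uses the `gender`, `phq9_score`, and `clinician_label` field names from the dataset schema. The cutoff of 10 and the choice to treat `MODERATE` and `SEVERE` clinician labels as cases are assumptions made for illustration, not settings documented for this dataset, and the toy records are invented.

```python
from collections import defaultdict

# Assumption: a PHQ-9 total of 10 or more counts as a positive screen.
PHQ9_CUTOFF = 10
# Assumption: clinician labels treated as confirmed cases for the audit.
CASE_LABELS = {"MODERATE", "SEVERE"}


def sensitivity_by_group(records, group_field="gender"):
    """Return {group: share of clinician-confirmed cases flagged by the screen}."""
    true_pos = defaultdict(int)
    cases = defaultdict(int)
    for rec in records:
        label = rec.get("clinician_label")
        if label not in CASE_LABELS:
            # Skips non-cases and respondents outside the validated sub-sample.
            continue
        group = rec[group_field]
        cases[group] += 1
        if rec["phq9_score"] >= PHQ9_CUTOFF:
            true_pos[group] += 1
    return {group: true_pos[group] / n for group, n in cases.items()}


# Invented toy records for illustration only; not drawn from the dataset.
toy = [
    {"gender": "FEMALE", "phq9_score": 14, "clinician_label": "MODERATE"},
    {"gender": "FEMALE", "phq9_score": 7, "clinician_label": "MODERATE"},
    {"gender": "MALE", "phq9_score": 21, "clinician_label": "SEVERE"},
    {"gender": "MALE", "phq9_score": 3, "clinician_label": None},
    {"gender": "MALE", "phq9_score": 12, "clinician_label": "MILD"},
]

rates = sensitivity_by_group(toy)
# Gap between the best- and worst-served groups; larger gaps merit a closer look.
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

The same function can be pointed at `residence_type`, `age_group`, or `country_code` by changing `group_field`. A real audit would also want confidence intervals and specificity, since group sizes in the clinician-validated sub-sample may be small.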
Key Use Cases
Dataset Highlights
Severity Distribution (PHQ-9)
Geographic Coverage
Dataset Schema
Each record represents one survey respondent at one time point. Fields cover demographics, instrument scores, stressor flags, help-seeking behaviour, and validated severity labels.
| Field Name | Type | Description | Nullable | Example |
|---|---|---|---|---|
| respondent_id | STRING | Anonymised persistent respondent identifier | No | RSP-NGA-0094821 |
| country_code | STRING | ISO 3166-1 alpha-2 country code | No | NG |
| survey_wave | ENUM | Survey round: BASELINE, FOLLOWUP_3M, FOLLOWUP_6M | No | BASELINE |
| age_group | ENUM | Age band: 18_24, 25_34, 35_44, 45_54, 55_PLUS | No | 25_34 |
| gender | ENUM | Gender: MALE, FEMALE, NON_BINARY | No | FEMALE |
| residence_type | ENUM | Settlement type: URBAN, PERI_URBAN, RURAL | No | URBAN |
| phq9_score | INTEGER | Total PHQ-9 score (0–27) | No | 11 |
| gad7_score | INTEGER | Total GAD-7 score (0–21) | No | 8 |
| srq20_score | INTEGER | Total SRQ-20 score (0–20) | Yes | 7 |
| phq9_severity | ENUM | PHQ-9 severity band: MINIMAL, MILD, MODERATE, MOD_SEVERE, SEVERE | No | MODERATE |
| primary_stressor | ENUM | Self-reported primary stressor: FINANCIAL, BEREAVEMENT, IPV, FOOD_INSECURITY, CHRONIC_ILLNESS, EMPLOYMENT_LOSS, NONE | Yes | FINANCIAL |
| help_seeking | ENUM | Current help-seeking status: NONE, INFORMAL, TRADITIONAL, FORMAL_HEALTH | No | NONE |
| prior_diagnosis | BOOLEAN | True if respondent reports a prior formal mental health diagnosis | Yes | false |
| clinician_label | ENUM | Clinician-validated severity (stratified sub-sample only): NONE, MILD, MODERATE, SEVERE; null if the respondent is not in the sub-sample | Yes | MODERATE |
| is_longitudinal | BOOLEAN | True if respondent is in the 8,000-person longitudinal follow-up sub-sample | No | false |
| administration_mode | ENUM | How survey was completed: SELF_DIGITAL, LAY_COUNSELLOR, PHONE | No | LAY_COUNSELLOR |
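The schema above can be turned into a lightweight record check. The following is a minimal sketch rather than an official validator: the allowed values, score ranges, and nullability are copied from the table, and the country codes are the ISO 3166-1 alpha-2 codes for the four countries named in the description. The PHQ-9 band boundaries are the conventional cut-offs (0 to 4, 5 to 9, 10 to 14, 15 to 19, 20 to 27), which is an assumption here, since the culturally adapted instrument may use different thresholds. The example record is assembled from the Example column of the table.

```python
ENUMS = {
    "country_code": {"NG", "KE", "GH", "UG"},
    "survey_wave": {"BASELINE", "FOLLOWUP_3M", "FOLLOWUP_6M"},
    "age_group": {"18_24", "25_34", "35_44", "45_54", "55_PLUS"},
    "gender": {"MALE", "FEMALE", "NON_BINARY"},
    "residence_type": {"URBAN", "PERI_URBAN", "RURAL"},
    "phq9_severity": {"MINIMAL", "MILD", "MODERATE", "MOD_SEVERE", "SEVERE"},
    "primary_stressor": {"FINANCIAL", "BEREAVEMENT", "IPV", "FOOD_INSECURITY",
                         "CHRONIC_ILLNESS", "EMPLOYMENT_LOSS", "NONE"},
    "help_seeking": {"NONE", "INFORMAL", "TRADITIONAL", "FORMAL_HEALTH"},
    "clinician_label": {"NONE", "MILD", "MODERATE", "SEVERE"},
    "administration_mode": {"SELF_DIGITAL", "LAY_COUNSELLOR", "PHONE"},
}
SCORE_RANGES = {"phq9_score": (0, 27), "gad7_score": (0, 21), "srq20_score": (0, 20)}
BOOLEANS = ("prior_diagnosis", "is_longitudinal")
NULLABLE = {"srq20_score", "primary_stressor", "prior_diagnosis", "clinician_label"}

# Assumption: conventional PHQ-9 cut-offs, as (upper bound, band) pairs.
PHQ9_BANDS = [(4, "MINIMAL"), (9, "MILD"), (14, "MODERATE"),
              (19, "MOD_SEVERE"), (27, "SEVERE")]


def phq9_band(score):
    """Map a PHQ-9 total (0-27) to its severity band."""
    if not 0 <= score <= 27:
        raise ValueError(f"PHQ-9 total out of range: {score}")
    for upper, band in PHQ9_BANDS:
        if score <= upper:
            return band


def validate(record):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    def check(field, is_ok, expected):
        value = record.get(field)
        if value is None:
            if field not in NULLABLE:
                problems.append(f"{field}: required but missing")
        elif not is_ok(value):
            problems.append(f"{field}: {value!r} is not {expected}")

    for field, allowed in ENUMS.items():
        check(field, lambda v: v in allowed, "an allowed value")
    for field, (low, high) in SCORE_RANGES.items():
        check(field, lambda v: isinstance(v, int) and low <= v <= high,
              f"an integer in {low}..{high}")
    for field in BOOLEANS:
        check(field, lambda v: isinstance(v, bool), "a boolean")

    # Cross-field check: the stored band should match the band implied by the score.
    score, band = record.get("phq9_score"), record.get("phq9_severity")
    if isinstance(score, int) and 0 <= score <= 27 and band in ENUMS["phq9_severity"]:
        if phq9_band(score) != band:
            problems.append(f"phq9_severity: {band} does not match score {score}")
    return problems


# Record assembled from the Example column of the schema table.
example = {
    "respondent_id": "RSP-NGA-0094821", "country_code": "NG",
    "survey_wave": "BASELINE", "age_group": "25_34", "gender": "FEMALE",
    "residence_type": "URBAN", "phq9_score": 11, "gad7_score": 8,
    "srq20_score": 7, "phq9_severity": "MODERATE",
    "primary_stressor": "FINANCIAL", "help_seeking": "NONE",
    "prior_diagnosis": False, "clinician_label": "MODERATE",
    "is_longitudinal": False, "administration_mode": "LAY_COUNSELLOR",
}
print(validate(example))
```

Returning a list of problems instead of raising on the first one makes it easier to profile a whole file and see which fields fail most often. The format of `respondent_id` is not checked, because the table gives only a single example and no pattern.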
Sample Records
Four representative respondent records spanning countries, severity levels, stressors, and help-seeking behaviours.