Crop Disease & Pest Detection Image Dataset
80K+ field-captured images of crop diseases and pest damage across cassava, maize, and cocoa — each frame expert-annotated with bounding boxes, disease class labels, and severity scores for training computer vision early-warning systems.
This is a synthetic dataset generated from high-quality expert-labelled seed data. All records are algorithmically derived — statistical distributions, inter-field correlations, and annotation characteristics faithfully replicate real-world patterns from the source data, while ensuring no real individual, organisation, or transaction can be identified or reconstructed.
The African Crop Disease & Pest Detection Image Dataset contains 80K+ high-resolution field photographs collected across Nigeria, Ghana, Cameroon, and Côte d'Ivoire, four countries that together account for a significant share of West and Central Africa's cassava, maize, and cocoa production. Images were captured under varying lighting and weather conditions using standardised smartphone protocols, giving a realistic distribution of image quality for field-deployment models.
Each image is annotated by plant pathologists and certified agronomists with: a primary disease or pest class (29 distinct classes across the three crop types), a severity score on a 0–4 scale aligned with CABI severity standards, bounding boxes or, where available, pixel-level segmentation masks, and the crop growth stage at time of capture. The class taxonomy covers fungal, bacterial, and viral diseases as well as major pest species including fall armyworm, cassava mealybug, and cocoa mirids.
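Bounding boxes are commonly stored in one of three coordinate conventions, depending on the annotation format. The sketch below converts between them, assuming the usual definitions: COCO uses `[x_min, y_min, width, height]` in pixels, Pascal VOC uses `[x_min, y_min, x_max, y_max]` in pixels, and YOLO uses `[x_center, y_center, width, height]` normalised by the image size. The function names are illustrative and are not part of the dataset tooling.

```python
def coco_to_voc(box):
    """COCO [x_min, y_min, width, height] -> VOC [x_min, y_min, x_max, y_max]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def coco_to_yolo(box, image_width, image_height):
    """COCO pixel box -> YOLO [x_center, y_center, width, height], normalised to 0-1."""
    x, y, w, h = box
    return [
        (x + w / 2) / image_width,
        (y + h / 2) / image_height,
        w / image_width,
        h / image_height,
    ]


def yolo_to_coco(box, image_width, image_height):
    """YOLO normalised box -> COCO pixel box [x_min, y_min, width, height]."""
    xc, yc, w, h = box
    return [
        (xc - w / 2) * image_width,
        (yc - h / 2) * image_height,
        w * image_width,
        h * image_height,
    ]


# A hypothetical box centred in a 1920x1080 image.
coco_box = [480, 270, 960, 540]
voc_box = coco_to_voc(coco_box)
yolo_box = coco_to_yolo(coco_box, 1920, 1080)
```

The image dimensions needed for the YOLO conversion correspond to the `image_width_px` and `image_height_px` metadata fields.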
The dataset is split into train / validation / test partitions stratified by crop type, country, and severity distribution. A held-out synthetic augmentation set, generated via diffusion-model upsampling, is included for regularisation experiments. All metadata fields are stored in JSON sidecar files compatible with standard annotation formats (COCO, YOLO, Pascal VOC).
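One way to sanity-check the stratification is to load the sidecar metadata and compare per-split proportions. The following is a minimal sketch under two assumptions that the description does not confirm: that there is one flat JSON file per image, and that its keys match the metadata field names (`split`, `crop_type`, `country_code`, `severity_score`). The directory name `metadata/` in the usage comment is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def load_sidecars(metadata_dir):
    """Read every *.json sidecar in a directory into a list of dicts."""
    records = []
    for path in sorted(Path(metadata_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            records.append(json.load(fh))
    return records


def strata_counts(records):
    """Count images per (split, crop_type, country_code, severity_score)."""
    return Counter(
        (r["split"], r["crop_type"], r["country_code"], r["severity_score"])
        for r in records
    )


def split_shares(records, key):
    """For each split, return the share of images per value of `key`."""
    totals = Counter(r["split"] for r in records)
    pairs = Counter((r["split"], r[key]) for r in records)
    return {
        (split, value): count / totals[split]
        for (split, value), count in pairs.items()
    }


# Usage (hypothetical path):
# records = load_sidecars("metadata/")
# print(split_shares(records, "crop_type"))
```

If the partitions are stratified as described, the shares for a given crop type, country, or severity score should be roughly equal across TRAIN, VAL, and TEST.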
Key Use Cases
Dataset Highlights
Compatible Frameworks & Formats
Geographic Coverage
Dataset Schema
Each record represents one annotated image. Metadata fields cover image provenance, annotation details, crop and disease taxonomy, severity, and dataset split assignment.
| Field Name | Type | Description | Nullable | Example |
|---|---|---|---|---|
| image_id | STRING | Unique image identifier | No | IMG-NGA-0048213 |
| country_code | STRING | ISO 3166-1 alpha-2 country of capture | No | NG |
| capture_date | DATE | Date of field capture (YYYY-MM-DD) | Yes | 2023-08-14 |
| crop_type | ENUM | Crop photographed: CASSAVA, MAIZE, COCOA | No | CASSAVA |
| growth_stage | ENUM | Crop growth stage: SEEDLING, VEGETATIVE, FLOWERING, MATURITY | Yes | VEGETATIVE |
| disease_class | STRING | Primary annotated disease or pest class (29 total classes) | No | Cassava Brown Streak Disease |
| disease_category | ENUM | Disease origin: FUNGAL, BACTERIAL, VIRAL, PEST, HEALTHY | No | VIRAL |
| severity_score | INTEGER | CABI-aligned severity 0 (healthy) – 4 (severe) | No | 2 |
| bbox_count | INTEGER | Number of bounding box annotations in the image | No | 3 |
| has_segmentation | BOOLEAN | True if pixel-level segmentation mask is available | No | false |
| image_width_px | INTEGER | Image width in pixels | No | 1920 |
| image_height_px | INTEGER | Image height in pixels | No | 1080 |
| capture_device | ENUM | Capture method: SMARTPHONE, DRONE, DSLR | Yes | SMARTPHONE |
| annotator_id | STRING | Anonymised annotator identifier | No | ANN-047 |
| annotation_confidence | FLOAT | Inter-annotator agreement score for this image (0–1) | Yes | 0.92 |
| is_synthetic | BOOLEAN | True if image was generated by diffusion-model augmentation | No | false |
| split | ENUM | Dataset partition: TRAIN, VAL, TEST, SYNTHETIC | No | TRAIN |
| annotation_format | ENUM | Annotation format available: COCO, YOLO, VOC | No | COCO |
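
The schema above can be turned into a lightweight record check. The sketch below encodes the types, enum values, nullability, and value ranges from the table. The Python type mapping (for example, DATE as an ISO `YYYY-MM-DD` string) and the `validate_record` helper are assumptions for illustration, and the example record is assembled from the Example column rather than taken from the dataset.

```python
from datetime import date

FIELD_TYPES = {
    "image_id": str, "country_code": str, "capture_date": str,
    "crop_type": str, "growth_stage": str, "disease_class": str,
    "disease_category": str, "severity_score": int, "bbox_count": int,
    "has_segmentation": bool, "image_width_px": int, "image_height_px": int,
    "capture_device": str, "annotator_id": str,
    "annotation_confidence": (int, float), "is_synthetic": bool,
    "split": str, "annotation_format": str,
}
NULLABLE = {"capture_date", "growth_stage", "capture_device", "annotation_confidence"}
ENUMS = {
    "crop_type": {"CASSAVA", "MAIZE", "COCOA"},
    "growth_stage": {"SEEDLING", "VEGETATIVE", "FLOWERING", "MATURITY"},
    "disease_category": {"FUNGAL", "BACTERIAL", "VIRAL", "PEST", "HEALTHY"},
    "capture_device": {"SMARTPHONE", "DRONE", "DSLR"},
    "split": {"TRAIN", "VAL", "TEST", "SYNTHETIC"},
    "annotation_format": {"COCO", "YOLO", "VOC"},
}


def validate_record(record):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field, expected in FIELD_TYPES.items():
        value = record.get(field)
        if value is None:
            if field not in NULLABLE:
                errors.append(f"{field}: missing or null")
            continue
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        if (expected is not bool and isinstance(value, bool)) or not isinstance(value, expected):
            errors.append(f"{field}: unexpected type {type(value).__name__}")
            continue
        if field in ENUMS and value not in ENUMS[field]:
            errors.append(f"{field}: {value!r} is not an allowed value")
        if field == "severity_score" and not 0 <= value <= 4:
            errors.append("severity_score: must be between 0 and 4")
        if field == "annotation_confidence" and not 0 <= value <= 1:
            errors.append("annotation_confidence: must be between 0 and 1")
        if field == "bbox_count" and value < 0:
            errors.append("bbox_count: must be non-negative")
        if field == "capture_date":
            try:
                date.fromisoformat(value)
            except ValueError:
                errors.append("capture_date: not a valid YYYY-MM-DD date")
    return errors


# Record assembled from the Example column of the schema table.
example = {
    "image_id": "IMG-NGA-0048213", "country_code": "NG",
    "capture_date": "2023-08-14", "crop_type": "CASSAVA",
    "growth_stage": "VEGETATIVE",
    "disease_class": "Cassava Brown Streak Disease",
    "disease_category": "VIRAL", "severity_score": 2, "bbox_count": 3,
    "has_segmentation": False, "image_width_px": 1920,
    "image_height_px": 1080, "capture_device": "SMARTPHONE",
    "annotator_id": "ANN-047", "annotation_confidence": 0.92,
    "is_synthetic": False, "split": "TRAIN", "annotation_format": "COCO",
}
problems = validate_record(example)
```

Running a check like this over every sidecar before training catches out-of-range severity scores or unexpected enum values early, without pulling in a schema library.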
Sample Records
Four representative metadata records spanning crop types, disease categories, severity levels, and capture devices.