Brain Tumor Segmentation & Classification
Brain tumor detection from MRI is traditionally a slow, manual process requiring specialist radiologists. This project automates the full pipeline: a U-Net with a MobileNetV2 encoder produces a pixel-level segmentation of the tumor region, and an EfficientNetB1 classifier then identifies the tumor type from three categories (glioma, meningioma, pituitary). Both models use transfer learning from ImageNet-pretrained weights to compensate for the limited size of medical imaging datasets. The project was built iteratively: a baseline U-Net, followed by two rounds of improvements that added data augmentation and the classification stage. A Streamlit app wraps the full pipeline for non-technical users.
The Problem
Brain tumor diagnosis requires radiologists to manually inspect hundreds of MRI slices, a process that is slow, expensive, and subject to human error. Early and accurate detection is critical — survival rates for gliomas drop sharply with delayed diagnosis. Automating segmentation (finding where the tumor is) and classification (identifying what type it is) from a single MRI upload could meaningfully accelerate clinical workflows, especially in resource-constrained settings without specialist access.
Key Engineering Decisions
U-Net for Segmentation over Simpler CNNs
U-Net's encoder-decoder architecture with skip connections was specifically designed for biomedical image segmentation — the skip connections pass fine-grained spatial features from encoder to decoder, enabling precise pixel-level tumor boundaries rather than coarse bounding boxes.
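The role of a skip connection can be sketched in a few lines of NumPy (an illustration of the concept, not the project's actual model code): one decoder step upsamples a coarse feature map and concatenates the encoder's feature map from the same resolution level along the channel axis.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of an (H, W, C) feature map."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def decoder_step(decoder_feat, encoder_feat):
    """Upsample decoder features, then fuse them with the matching
    encoder features via channel-wise concatenation (the skip connection)."""
    up = upsample2x(decoder_feat)
    assert up.shape[:2] == encoder_feat.shape[:2], "spatial dims must match"
    return np.concatenate([up, encoder_feat], axis=-1)

# Coarse 8x8x64 decoder features fused with the encoder's finer
# 16x16x32 features from the same level of the contracting path.
dec = np.random.rand(8, 8, 64)
enc = np.random.rand(16, 16, 32)
fused = decoder_step(dec, enc)
print(fused.shape)  # (16, 16, 96)
```

Without the concatenation, the decoder would have to reconstruct boundary detail from the coarse 8x8 map alone, which is exactly why masks degrade when skip connections are removed.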
MobileNetV2 as Encoder Backbone
MobileNetV2 is lightweight enough for real-time inference while still providing strong ImageNet-pretrained features for the encoder. Its depthwise separable convolutions reduce parameters significantly compared to VGG or ResNet encoders without sacrificing segmentation quality.
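The parameter saving is easy to verify with a back-of-the-envelope count (illustrative arithmetic, not project code): a depthwise separable convolution replaces one k×k×C_in×C_out kernel with a k×k depthwise filter per input channel plus a 1×1 pointwise projection.

```python
def standard_conv_params(k, c_in, c_out):
    """Parameters in a standard k x k convolution."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k per input channel + 1x1 pointwise projection."""
    return k * k * c_in + c_in * c_out

# A typical mid-network layer: 3x3 conv, 256 channels in and out.
std = standard_conv_params(3, 256, 256)        # 589,824
sep = depthwise_separable_params(3, 256, 256)  # 67,840
print(std, sep, round(std / sep, 1))           # ~8.7x fewer parameters
```

The saving compounds over every layer of the encoder, which is what makes MobileNetV2 practical for near-real-time inference.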
EfficientNetB1 for Classification over Custom Head
EfficientNetB1's compound scaling (width, depth, resolution simultaneously) delivers better accuracy-per-parameter than scaling a single dimension. Pre-trained on ImageNet, it converges faster on the small MRI dataset than training a custom classifier from scratch.
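Compound scaling can be written out explicitly. The sketch below uses the α, β, γ values published in the EfficientNet paper (Tan & Le, 2019); a single coefficient φ scales depth, width, and input resolution together rather than growing one dimension in isolation.

```python
# Grid-searched constants from the EfficientNet paper, chosen so that
# alpha * beta^2 * gamma^2 ~= 2 (each unit of phi roughly doubles FLOPs).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi):
    depth = ALPHA ** phi        # more layers
    width = BETA ** phi         # more channels per layer
    resolution = GAMMA ** phi   # larger input images
    return depth, width, resolution

flops_factor = ALPHA * BETA**2 * GAMMA**2
print(compound_scale(1), round(flops_factor, 2))
```

Scaling all three dimensions in this fixed ratio is what yields the better accuracy-per-parameter trade-off compared with, say, only deepening the network.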
Transfer Learning to Address Small Dataset Size
Medical imaging datasets are inherently small compared to natural image benchmarks — the Figshare dataset used here has only 3,064 images across three classes. Freezing the early layers of an ImageNet-pretrained network and fine-tuning only the later layers is the standard approach to prevent overfitting at this scale.
Two-Model Pipeline over Unified Detection
Separating segmentation and classification into two dedicated models makes each task cleaner — the segmentation model outputs a mask, which is then used to crop and classify the region of interest. A unified one-stage model would need to learn both tasks simultaneously, making debugging and improvement harder.
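The hand-off between the two models is a simple mask-to-crop step. A minimal sketch (function name and margin are illustrative, not the project's actual code): take the bounding box of the predicted mask, pad it slightly, and pass the crop to the classifier.

```python
import numpy as np

def crop_from_mask(image, mask, margin=8):
    """Crop `image` to the bounding box of the binary `mask`,
    padded by `margin` pixels and clipped to the image bounds."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return image  # no tumor found: fall back to the full image
    y0 = max(ys.min() - margin, 0)
    y1 = min(ys.max() + margin + 1, image.shape[0])
    x0 = max(xs.min() - margin, 0)
    x1 = min(xs.max() + margin + 1, image.shape[1])
    return image[y0:y1, x0:x1]

img = np.zeros((128, 128))
msk = np.zeros((128, 128), dtype=bool)
msk[40:60, 50:80] = True          # simulated 20x30 tumor mask
roi = crop_from_mask(img, msk)
print(roi.shape)  # (36, 46)
```

Because the classifier only ever sees the cropped region, a segmentation failure is immediately visible as a bad crop — one reason the two-stage design is easier to debug than a unified model.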
Iterative Notebook Structure
Splitting the work into three notebooks (baseline, improvement 1, improvement 2) provided clear ablation checkpoints — each step's performance gain could be measured and reported independently, making the research contribution transparent.
Key Highlights
- U-Net architecture with MobileNetV2 encoder for tumor segmentation — encoder-decoder design with skip connections preserves spatial detail critical for precise tumor boundary delineation.
- EfficientNetB1 classification head identifies tumor type from three classes: glioma, meningioma, and pituitary tumor — compound scaling balances accuracy and inference cost.
- Transfer learning from ImageNet pre-trained weights on both models — critical for overcoming the small dataset size inherent in medical imaging tasks.
- Data augmentation pipeline (rotation, flipping, zoom, brightness) applied during training to reduce overfitting on the limited MRI dataset.
- Iterative development: baseline U-Net → improvement 1 (data augmentation + transfer learning) → improvement 2 (EfficientNetB1 classification) with measurable performance gains at each stage.
- End-to-end Streamlit application: upload an MRI scan, receive the segmented tumor mask overlaid on the original image, and get the classified tumor type — accessible to non-technical medical staff.
- Trained on the publicly available Figshare brain tumor dataset (Cheng et al., 2015) covering 3,064 T1-weighted contrast-enhanced MRI images.
Tech Stack
Key Takeaways
Skip connections in U-Net are not just a residual trick — they are the primary mechanism that prevents the decoder from losing spatial resolution, and their importance becomes obvious when you compare segmentation masks with and without them.
Transfer learning on medical images from ImageNet weights works better than expected, despite the domain gap — low-level edge and texture features learned on natural images transfer well to MRI, and only the later task-specific layers need significant fine-tuning.
Data augmentation is disproportionately important for medical imaging compared to natural image tasks — with only ~1,000 images per class, augmentation is the difference between a model that generalises and one that memorises the training set.
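A minimal augmentation sketch with NumPy (the project presumably used a framework pipeline; the parameters here are illustrative): each call yields a randomly flipped, rotated, brightness-jittered copy of a scan, so the model rarely sees the same pixels twice.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image):
    """Return a randomly transformed copy of a [0, 1]-valued square scan."""
    out = image
    if rng.random() < 0.5:                      # horizontal flip
        out = out[:, ::-1]
    out = np.rot90(out, k=int(rng.integers(0, 4)))  # random 90-degree rotation
    out = np.clip(out * rng.uniform(0.8, 1.2), 0.0, 1.0)  # brightness jitter
    return out

scan = rng.random((64, 64))
aug = augment(scan)
print(aug.shape)  # (64, 64)
```

One subtlety specific to segmentation: geometric transforms (flips, rotations) must be applied identically to the image and its mask, while intensity transforms (brightness) apply to the image only.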
Evaluating segmentation requires Dice coefficient and IoU, not accuracy — a model that predicts all pixels as background scores 95%+ accuracy on a scan where the tumor occupies 5% of the image, which is completely useless clinically.
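The pathology is easy to demonstrate numerically. The sketch below computes accuracy, Dice, and IoU for an all-background prediction on a scan whose tumor covers 5% of the pixels:

```python
import numpy as np

def dice(pred, true, eps=1e-7):
    """Dice coefficient for binary masks: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, true).sum()
    return (2 * inter + eps) / (pred.sum() + true.sum() + eps)

def iou(pred, true, eps=1e-7):
    """Intersection over union for binary masks: |A∩B| / |A∪B|."""
    inter = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    return (inter + eps) / (union + eps)

true = np.zeros((100, 100), dtype=bool)
true[:5, :] = True                 # tumor occupies 5% of the scan
pred = np.zeros_like(true)         # model predicts "no tumor" everywhere

accuracy = (pred == true).mean()
print(accuracy, dice(pred, true), iou(pred, true))
# accuracy = 0.95, while Dice and IoU are ~0 — the clinically honest scores
```

Both overlap metrics ignore the (huge) true-negative background region, which is exactly why they are the standard for segmentation evaluation.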
Building the Streamlit app exposed preprocessing gaps that weren't visible in the notebooks — the inference pipeline needed explicit handling of grayscale vs. RGB MRI formats, image resizing, and mask post-processing that the training notebooks assumed would always be correct.
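The grayscale-vs-RGB gap in particular is a one-function fix. A hypothetical sketch of that normalisation step (the function name and exact conventions are illustrative, not the app's actual code): an ImageNet-pretrained backbone expects three channels, but uploads may arrive as (H, W), (H, W, 1), or (H, W, 3).

```python
import numpy as np

def to_rgb(image):
    """Normalise any supported MRI layout to (H, W, 3) float32 in [0, 1]."""
    arr = np.asarray(image, dtype=np.float32)
    if arr.max() > 1.0:               # assume 8-bit input if values exceed 1
        arr = arr / 255.0
    if arr.ndim == 2:                 # (H, W)    -> (H, W, 3)
        arr = np.stack([arr] * 3, axis=-1)
    elif arr.shape[-1] == 1:          # (H, W, 1) -> (H, W, 3)
        arr = np.repeat(arr, 3, axis=-1)
    return arr

gray = np.random.randint(0, 256, (256, 256), dtype=np.uint8)
print(to_rgb(gray).shape)  # (256, 256, 3)
```

Pulling steps like this into a single inference-time preprocessing function is what closed the gap between notebook behaviour and app behaviour.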
Iterative development with measurable checkpoints (baseline → augmentation → classification) is far more informative than building the full system at once — each improvement stage taught something specific about the limiting factor at that point.