Brain Tumour Segmentation using Pyramid Scene Parsing (PSPNet)

Doing cool things with data!


Cancer has been one of the deadliest diseases faced by mankind since ancient times, and it remains a leading cause of death worldwide. If a tumour is detected at an early stage, the chances of survival increase considerably. Deep learning, and convolutional neural networks (CNNs) in particular, has transformed computer vision, including diagnosis from medical images. In this post we will harness the power of CNNs to detect and segment tumours in brain MRI images.

Below is an example of a brain MRI image with a tumour, together with the segmentation result on it. The left image is the brain MRI scan with the tumour marked in green, and the right image shows the machine prediction of the tumour in red. It is remarkably accurate!

Tumour prediction example

We have worked with startups to build various applications using semantic segmentation. Contact us to find out more.

Semantic Segmentation

Semantic segmentation means assigning a class label to every pixel in an image. In the image above, every pixel is labelled as either tumour or background. Many efficient deep learning based semantic segmentation methods have been published, such as (in chronological order):

  1. FCN
  2. UNet
  3. SegNet
  4. Dilated Convolutions
  5. DeepLab (v1 & v2)
  6. RefineNet
  7. PSPNet (Pyramid Scene Parsing Network)
  8. DeepLab v3
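To make the idea of per-pixel labelling concrete, here is a minimal sketch in NumPy. It assumes a segmentation network that outputs one score per class for every pixel; the `scores_to_labels` helper and the tiny 2x3 "image" are hypothetical and only serve as an illustration.

```python
import numpy as np


def scores_to_labels(scores: np.ndarray) -> np.ndarray:
    """Turn per-pixel class scores of shape (H, W, n_classes)
    into a label map of shape (H, W) by picking the best class."""
    return scores.argmax(axis=-1)


# Hypothetical scores for a 2x3 image with two classes:
# index 0 = background, index 1 = tumour.
scores = np.array([
    [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]],
    [[0.3, 0.7], [0.55, 0.45], [0.1, 0.9]],
])

labels = scores_to_labels(scores)
print(labels)
```

Each entry of `labels` is the class of one pixel, which is exactly the format of the binary tumour masks used later in this post.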

For this blog, we chose PSPNet since it is efficient and has been reported to outperform many earlier state-of-the-art approaches such as UNet, FCN, DeepLab (v1, v2), and Dilated Convolutions.

DeepLabV3 is another popular and powerful model. I recently wrote a blog post on how to do semantic segmentation at 30 FPS using DeepLabV3.

To learn more about the different segmentation architectures listed above, please refer to this post.

Pyramid Scene Parsing Network

State-of-the-art scene parsing frameworks are mostly based on the fully convolutional network (FCN). Deep convolutional neural network (CNN) based methods boost dynamic object understanding, yet they still face challenges with diverse scenes and unrestricted vocabulary. One example is a boat being mistaken for a car. Such errors arise because the objects look similar. But given the contextual prior that the scene is a boathouse near a river, the correct prediction should follow.

Accurate scene classification relies on this prior knowledge of the global scene category. The pyramid pooling module helps capture this information by applying pooling layers with large kernels. ResNet is modified with dilated convolutions (ref: dilated convolutions paper), and a pyramid pooling module is added to it. This module concatenates the feature maps from ResNet with the upsampled outputs of parallel pooling layers whose kernels cover the entire image, half of the image, and smaller portions of it.
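The pooling-and-concatenation step can be sketched in a few lines of NumPy. This is a simplified illustration rather than the actual PSPNet implementation: it uses the bin sizes 1, 2, 3 and 6 from the PSPNet paper, but it assumes the feature map size is divisible by each bin size, uses nearest-neighbour upsampling instead of bilinear interpolation, and leaves out the 1x1 convolutions that reduce the channel count. All function names are hypothetical.

```python
import numpy as np


def average_pool_to_bins(features: np.ndarray, bins: int) -> np.ndarray:
    """Average-pool an (H, W, C) feature map down to (bins, bins, C).
    Assumes H and W are divisible by bins."""
    h, w, c = features.shape
    blocks = features.reshape(bins, h // bins, bins, w // bins, c)
    return blocks.mean(axis=(1, 3))


def upsample_nearest(pooled: np.ndarray, height: int, width: int) -> np.ndarray:
    """Upsample a (bins, bins, C) map back to (height, width, C)
    by repeating each pooled cell (nearest-neighbour)."""
    bins = pooled.shape[0]
    return pooled.repeat(height // bins, axis=0).repeat(width // bins, axis=1)


def pyramid_pooling(features: np.ndarray, bin_sizes=(1, 2, 3, 6)) -> np.ndarray:
    """Concatenate the input features with pooled-and-upsampled context
    from every pyramid level along the channel axis."""
    h, w, _ = features.shape
    branches = [features]
    for bins in bin_sizes:
        pooled = average_pool_to_bins(features, bins)
        branches.append(upsample_nearest(pooled, h, w))
    return np.concatenate(branches, axis=-1)


# Hypothetical 12x12 feature map with 4 channels.
rng = np.random.default_rng(0)
features = rng.normal(size=(12, 12, 4))
fused = pyramid_pooling(features)
print(fused.shape)  # (12, 12, 20): 4 original channels + 4 levels x 4 channels
```

The level with a single bin carries the same global average at every pixel, which is how each location gets access to whole-image context such as "this is a river scene".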

The PSPNet architecture is shown in the image below. You can read more about PSPNet in the original paper.

PSPNet Architecture

Building Brain Image Segmentation Model using PSPNet


The dataset was obtained from Kaggle. It was chosen because the labelled data comes as binary mask images, which are easy to process and use for training and testing. Alternatively, the useful web-based annotation tool from the VGG group can be used to label custom datasets.

The dataset uses the following folder hierarchy:


|_images: RGB images in PNG format

|_masks: RGB mask images in PNG format, with each region filled with its label value.

Our labels are: 1 for tumour, 0 otherwise.

For example:

If the pixel at (10, 10) belongs to the tumour, the mask holds the value 1 at that position.
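Binary masks are often stored with values 0 and 255 rather than 0 and 1. The sketch below shows one way to convert such a mask to the 0/1 label format described above. It is only an illustration: the `to_label_mask` helper is hypothetical, and whether your masks need this conversion is an assumption you should check against your own files.

```python
import numpy as np


def to_label_mask(mask: np.ndarray) -> np.ndarray:
    """Map every non-zero pixel to label 1 (tumour) and keep 0 as background.
    Accepts a greyscale (H, W) or RGB (H, W, 3) mask."""
    if mask.ndim == 3:
        # Collapse the colour channels: a pixel counts as tumour
        # if any channel is non-zero.
        mask = mask.max(axis=-1)
    return (mask > 0).astype(np.uint8)


# Hypothetical 32x32 RGB mask with a white square marking the tumour.
raw_mask = np.zeros((32, 32, 3), dtype=np.uint8)
raw_mask[8:14, 8:14] = 255

label_mask = to_label_mask(raw_mask)
print(label_mask[10, 10], label_mask[0, 0], np.unique(label_mask))
```

After the conversion, the pixel at (10, 10) holds 1 and every background pixel holds 0, so the mask contains only the two label values used for training.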

Training framework

While many good frameworks exist for training and evaluating semantic segmentation models with Keras, the image-segmentation-keras repo stands out for its ease of use, the number of different models it supports, and its up-to-date documentation.

We chose "vgg_pspnet", which is a PSPNet implemented on top of a pretrained VGG backbone.

Steps to be followed are :

Once the repo is installed, training can begin!

# Navigate to the Semantic_segmentation/image-segmentation-keras folder
from keras_segmentation.models.pspnet import vgg_pspnet

# Initialise the model on its pretrained VGG backbone.
# Note that the input height and width need not match the image height and width,
# since the images are resized to the network's input size. Pass input_height and
# input_width here to override the defaults.
model = vgg_pspnet(n_classes=2)

# Training
model.train(
    train_images="datasets/brain/images",
    train_annotations="datasets/brain/masks",
    checkpoints_path="ckpts/brain",
    epochs=50,
    auto_resume_checkpoint=True,
    steps_per_epoch=50,
)

Running Inference through the trained model

from keras_segmentation.predict import model_from_checkpoint_path, predict

# Load the trained network from the latest checkpoint
model = model_from_checkpoint_path("ckpts/brain")
# The predicted output is a mask similar to the mask images used for training
pred_mask = predict(model=model, inp="image.png")

Below are the results we obtained on a small subset of the dataset. Though the dataset is quite easy to overfit, the highly accurate results show the potential of this method.

Image order: Raw image (Left), Predicted mask (Center), Overlaid mask boundary (Right)
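Visual comparison can be backed up with a number. The sketch below computes two overlap scores commonly used for segmentation, intersection over union (IoU) and the Dice coefficient, between a predicted mask and a ground-truth mask. These metrics were not part of the original experiment; the `iou_and_dice` helper and the toy masks are hypothetical and only show how such a check could look.

```python
import numpy as np


def iou_and_dice(pred: np.ndarray, truth: np.ndarray) -> tuple:
    """Return (IoU, Dice) for two binary masks of the same shape.
    Two empty masks are treated as a perfect match."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    total = pred.sum() + truth.sum()
    iou = intersection / union if union else 1.0
    dice = 2 * intersection / total if total else 1.0
    return float(iou), float(dice)


# Toy 8x8 masks: the prediction is the true square shifted down by one row.
truth_mask = np.zeros((8, 8), dtype=np.uint8)
truth_mask[2:6, 2:6] = 1
pred_mask_toy = np.zeros((8, 8), dtype=np.uint8)
pred_mask_toy[3:7, 2:6] = 1

iou, dice = iou_and_dice(pred_mask_toy, truth_mask)
print(f"IoU = {iou:.2f}, Dice = {dice:.2f}")
```

Both scores range from 0 (no overlap) to 1 (perfect overlap). Computing them on held-out images gives a more reliable picture than the training subset, especially for a dataset that is easy to overfit.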


I hope you liked the blog and will try the code for yourself. It shows that pretrained models can give good segmentation results in about half a day of work, which demonstrates the power of deep learning based computer vision.

We can extend this code to any kind of medical image that has features to be segmented. Examples include different kinds of cancer tumours, microbes, fractures, holes, etc.

I have my own deep learning consultancy and love to work on interesting problems. I have helped many startups deploy innovative AI based solutions. If you have a project that we can collaborate on, please contact me through my website.


