# Image Segmentation Tutorial

This was originally material for a presentation and blog post. You can get the slides online.

Let us imagine you are trying to compare two image segmentation algorithms based on human-segmented images. This is a completely real-world example as it was one of the projects where I first used jug [1].

The code depends on mahotas for image processing and, for the comparison step, on milk.

We are going to build this up piece by piece.

First a few imports:

    from jug import TaskGenerator
    import mahotas as mh
    from glob import glob


Here, we test two thresholding-based segmentation methods, called method1 and method2. They both (i) read the image, (ii) blur it with a Gaussian, and (iii) threshold it [2]:

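    @TaskGenerator
    def method1(image):
        # `image` is a file path: read it as a greyscale array
        image = mh.imread(image, as_grey=True)
        # Blur, then threshold at the mean intensity
        image = mh.gaussian_filter(image, 2)
        binimage = (image > image.mean())
        # Label connected components; the object count is not needed
        labeled, _ = mh.label(binimage)
        return labeled

    @TaskGenerator
    def method2(image):
        image = mh.imread(image, as_grey=True)
        # Stronger blur, then stretch to 8 bits so that Otsu can be applied
        image = mh.gaussian_filter(image, 4)
        image = mh.stretch(image)
        binimage = (image > mh.otsu(image))
        labeled, _ = mh.label(binimage)
        return labeled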


We need a way to compare these. We will use the Adjusted Rand Index [3]:

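    @TaskGenerator
    def compare(labeled, ref):
        from milk.measures.cluster_agreement import rand_arand_jaccard
        # `ref` is a file path; this assumes the reference PNG is a
        # single-channel labeled image of the same shape as `labeled`
        ref = mh.imread(ref)
        # The function returns (Rand, adjusted Rand, Jaccard); keep the second
        return rand_arand_jaccard(labeled.ravel(), ref.ravel())[1]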

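To make the measure less of a black box, here is a minimal pure-Python sketch of the Adjusted Rand Index, written from the standard textbook formula. It is not the milk implementation, and the name `adjusted_rand_index` is made up for this illustration. It takes two flat label sequences, which is what `labeled.ravel()` and `ref.ravel()` provide.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index of two equal-length flat label sequences.

    Hypothetical helper for illustration only (not part of milk).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n < 2:
        # With fewer than two items there are no pairs to disagree on.
        return 1.0

    # Contingency counts: items with label i in A and label j in B.
    pair_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Number of item pairs that are grouped together in each view.
    together_both = sum(comb(c, 2) for c in pair_counts.values())
    together_a = sum(comb(c, 2) for c in a_counts.values())
    together_b = sum(comb(c, 2) for c in b_counts.values())

    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        # Degenerate case, e.g. both labelings use a single cluster.
        return 1.0
    return (together_both - expected) / (maximum - expected)
```

The index is 1 when the two labelings agree up to a renaming of the labels, and it is close to 0 (possibly negative) when they agree no better than chance.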

Running over all the images looks exactly like Python:

    results = []
    for im in glob('images/*.jpg'):
        m1 = method1(im)
        m2 = method2(im)
        ref = im.replace('images', 'references').replace('jpg', 'png')
        v1 = compare(m1, ref)
        v2 = compare(m2, ref)
        results.append( (v1,v2) )


But how do we get the results out?

A simple solution is to write a function which writes to an output file:

    @TaskGenerator
    def print_results(results):
        import numpy as np
        r1, r2 = np.mean(results, 0)
        with open('output.txt', 'w') as out:
            out.write('Result method1: {}\nResult method2: {}\n'.format(r1, r2))

    print_results(results)
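The call `np.mean(results, 0)` averages over the first axis, so a list of (method1, method2) pairs turns into one mean per method. The same computation without numpy, on made-up scores, looks roughly like this:

```python
# Made-up scores for two images: (method1 score, method2 score)
results = [(0.5, 0.75), (0.25, 0.25)]

# zip(*results) regroups the pairs into one column per method,
# which is what averaging over axis 0 does for a list of pairs.
r1, r2 = [sum(column) / len(column) for column in zip(*results)]
```

Here `r1` is the mean score of method1 over all images and `r2` the mean score of method2.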


§

Except for the TaskGenerator this would be a pure Python file!

With TaskGenerator, we get jugginess!

We can call:

    jug execute &
    jug execute &
    jug execute &
    jug execute &


to get 4 processes going at once.
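If you would rather start the workers from Python than from the shell, a sketch along these lines should do the same thing. The helper names are invented for this example, and it assumes the script is saved as `jugfile.py` (the name `jug execute` looks for when none is given) and that the `jug` command is on your PATH.

```python
import subprocess


def worker_commands(n_workers, jugfile="jugfile.py"):
    """Build one `jug execute` command line per worker (hypothetical helper)."""
    return [["jug", "execute", jugfile] for _ in range(n_workers)]


def launch_workers(n_workers, jugfile="jugfile.py"):
    """Start the workers in parallel and wait for all of them to finish."""
    procs = [subprocess.Popen(cmd) for cmd in worker_commands(n_workers, jugfile)]
    # Return the exit code of each worker once it is done.
    return [proc.wait() for proc in procs]


# Usage (requires jug to be installed): launch_workers(4)
```

Each worker is an ordinary `jug execute` process, so this is only a convenience over typing the command four times.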

§

Note also the line:

    print_results(results)


results is a list of pairs of Task objects, not of numbers. Passing it to print_results is how you define a dependency: jug picks up that, to call print_results, it needs all the values in results, and it runs print_results only after they have been computed.
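The idea can be illustrated with a toy stand-in written in a few lines of plain Python. This is not how jug is implemented (jug also stores results on disk and coordinates several processes); `ToyTask` and `toy_task_generator` are names invented for this sketch. It only shows how calling a decorated function can record a task instead of running it, and how nested arguments reveal the dependencies.

```python
class ToyTask:
    """Records a function call instead of running it (toy, not jug's Task)."""

    def __init__(self, func, args):
        self.func = func
        self.args = args

    def dependencies(self):
        """Collect every ToyTask found in the (possibly nested) arguments."""
        found = []

        def walk(value):
            if isinstance(value, ToyTask):
                found.append(value)
            elif isinstance(value, (list, tuple)):
                for item in value:
                    walk(item)

        walk(self.args)
        return found

    def run(self):
        """Run the dependencies first, then the function itself."""
        return self.func(*resolve(self.args))


def resolve(value):
    """Replace every ToyTask inside `value` by its computed result."""
    if isinstance(value, ToyTask):
        return value.run()
    if isinstance(value, (list, tuple)):
        return type(value)(resolve(item) for item in value)
    return value


def toy_task_generator(func):
    """Decorator: calling the function builds a ToyTask, it does not run it."""
    def make_task(*args):
        return ToyTask(func, args)
    return make_task


@toy_task_generator
def score(x):
    return x * 2


@toy_task_generator
def summarize(results):
    return sum(a + b for a, b in results)


results = [(score(1), score(2)), (score(3), score(4))]
final = summarize(results)       # nothing has been computed yet
n_deps = len(final.dependencies())
total = final.run()              # computes the four scores, then the sum
```

In this toy version, `final` depends on four `score` tasks simply because they appear inside its argument, which mirrors how `print_results(results)` depends on every `compare` task.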

Easy as Py.

§

The full script above, including the data, is available from GitHub.

 [1] The code in that repository still uses a pretty old version of jug (this was 2009, after all). TaskGenerator had not been invented yet.
 [2] This is for demonstration purposes; the paper had better methods, of course.
 [3] Again, you can do better than Adjusted Rand, as we show in the paper, but this is a demo. This way, we can just call a function in milk.