# Image Segmentation Tutorial

This was originally material for a presentation and blog post. You can get the slides online.

Let us imagine you are trying to compare two image segmentation algorithms based on human-segmented images. This is a completely real-world example as it was one of the projects where I first used jug [1].

The example depends on mahotas for image processing.

We are going to build this up piece by piece.

First a few imports:

```python
import mahotas as mh
from glob import glob
from jug import TaskGenerator
```

Here, we test two thresholding-based segmentation methods, called `method1` and `method2`. They both (i) read the image, (ii) blur it with a Gaussian, and (iii) threshold it [2]:

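```python
@TaskGenerator
def method1(image):
    # `image` arrives as a file path; read it as greyscale
    # (as_grey=True is an assumption about the input images)
    image = mh.imread(image, as_grey=True)
    image = mh.gaussian_filter(image, 2)
    # Threshold at the mean intensity
    binimage = (image > image.mean())
    labeled, _ = mh.label(binimage)
    return labeled

@TaskGenerator
def method2(image):
    image = mh.imread(image, as_grey=True)
    image = mh.gaussian_filter(image, 4)
    # Stretch to the 8-bit range so that Otsu gets an integer image
    image = mh.stretch(image)
    binimage = (image > mh.otsu(image))
    labeled, _ = mh.label(binimage)
    return labeled
```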
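To make the threshold-then-label step concrete, here is a minimal pure-Python sketch on a tiny grid. It is only an illustration: it skips the Gaussian blur, the helper names `threshold_mean` and `label_regions` are made up for this example, and the 4-connectivity used here is an assumption of the sketch rather than a statement about how mahotas labels regions.

```python
from collections import deque

def threshold_mean(image):
    """Return a binary grid: True where a pixel is above the image mean."""
    pixels = [v for row in image for v in row]
    mean = sum(pixels) / len(pixels)
    return [[v > mean for v in row] for row in image]

def label_regions(binary):
    """Label 4-connected foreground regions 1, 2, ...; background stays 0."""
    height, width = len(binary), len(binary[0])
    labeled = [[0] * width for _ in range(height)]
    n_regions = 0
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or labeled[y][x]:
                continue
            # Found an unlabeled foreground pixel: flood-fill its region
            n_regions += 1
            labeled[y][x] = n_regions
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny][nx] and not labeled[ny][nx]:
                        labeled[ny][nx] = n_regions
                        queue.append((ny, nx))
    return labeled, n_regions

toy = [
    [9, 9, 0, 0],
    [9, 0, 0, 8],
    [0, 0, 8, 8],
]
labeled, n_regions = label_regions(threshold_mean(toy))
print(n_regions)  # 2
for row in labeled:
    print(row)
```

The two bright corners end up as two separate labeled regions, which is the kind of output `method1` and `method2` return for a real image.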

We need a way to compare these. We will use the Adjusted Rand Index [3]:

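```python
@TaskGenerator
def compare(labeled, ref):
    from milk.measures.cluster_agreement import rand_arand_jaccard
    # `ref` arrives as a file path; load the human-segmented reference
    ref = mh.imread(ref)
    # Index 1 of the (Rand, adjusted Rand, Jaccard) result is the Adjusted Rand Index
    return rand_arand_jaccard(labeled.ravel(), ref.ravel())[1]
```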
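For readers who have not met the Adjusted Rand Index before, the following is a small self-contained sketch of the standard pair-counting formula. It is not milk's implementation, and the function name `adjusted_rand_index` is chosen for this example only. In the image setting, the two label sequences would be the flattened label arrays.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError('both labelings must cover the same items')
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0

    # Contingency counts: how many items carry label i in A and label j in B
    joint = Counter(zip(labels_a, labels_b))
    sum_joint = sum(comb(c, 2) for c in joint.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / total_pairs  # agreement expected by chance
    maximum = (sum_a + sum_b) / 2           # agreement of a perfect match
    if maximum == expected:
        # Both labelings are trivial (one cluster, or all singletons)
        return 1.0
    return (sum_joint - expected) / (maximum - expected)

# Label names do not matter, only which items are grouped together
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# One cluster of the second labeling is split in the first
partial = adjusted_rand_index([0, 0, 1, 2], [0, 0, 1, 1])
print(same, round(partial, 4))
```

A value of 1 means the two segmentations group pixels identically, while values near 0 mean the agreement is about what chance would give.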

Running over all the images looks exactly like Python:

```python
results = []
for im in glob('images/*.jpg'):
    m1 = method1(im)
    m2 = method2(im)
    ref = im.replace('images', 'references').replace('jpg', 'png')
    v1 = compare(m1, ref)
    v2 = compare(m2, ref)
    results.append((v1, v2))
```
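The chained `str.replace` calls in the loop work for the file layout used here, but they rewrite every occurrence of `images` and `jpg` in the path. If your file names might contain those strings, a slightly stricter helper could look like the sketch below. The name `reference_path` is hypothetical, and the sketch assumes each image sits inside a directory, with the references in a sibling directory.

```python
from pathlib import PurePosixPath

def reference_path(image_path, ref_dir='references'):
    """Map 'images/x.jpg' to 'references/x.png', touching only the
    last directory component and the file extension."""
    path = PurePosixPath(image_path)
    ref_parent = path.parent.with_name(ref_dir)
    return str(ref_parent / path.with_suffix('.png').name)

print(reference_path('images/cell01.jpg'))
# A name that contains 'jpg' keeps its stem intact
print(reference_path('data/images/jpg_scan.jpg'))
```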

But how do we get the results out?

A simple solution is to write a function which writes to an output file:

```python
@TaskGenerator
def print_results(results):
    import numpy as np
    # Average over images: one mean score per method
    r1, r2 = np.mean(results, 0)
    with open('output.txt', 'w') as out:
        out.write('Result method1: {}\nResult method2: {}\n'.format(r1, r2))

print_results(results)
```

§

Except for the `TaskGenerator` decorators, this would be a plain Python file!

With `TaskGenerator`, we get jugginess!

We can call:

```bash
jug execute &
jug execute &
jug execute &
jug execute &
```

to get 4 processes going at once.

§

Note also the line:

```python
print_results(results)
```

`results` is a list of `Task` objects. This is how you define a dependency. Jug picks up that to call `print_results`, it needs all the `results` values and behaves accordingly.
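The idea of a call that is recorded rather than executed can be illustrated with a toy model. The sketch below is not how jug is implemented (jug also stores results on disk and coordinates processes with locks); `ToyTask`, `resolve`, and `toy_task_generator` are names invented for this illustration. It only shows how passing task objects as arguments is enough to define a dependency graph.

```python
class ToyTask:
    """Records a function call without running it."""

    def __init__(self, func, args):
        self.func = func
        self.args = args
        self._done = False
        self._result = None

    def dependencies(self):
        """Yield every ToyTask in the arguments, looking inside lists and tuples."""
        stack = list(self.args)
        while stack:
            item = stack.pop()
            if isinstance(item, ToyTask):
                yield item
            elif isinstance(item, (list, tuple)):
                stack.extend(item)

    def run(self):
        """Run dependencies first, then this task; cache the result."""
        if not self._done:
            self._result = self.func(*[resolve(a) for a in self.args])
            self._done = True
        return self._result


def resolve(value):
    """Replace tasks (also inside lists and tuples) by their results."""
    if isinstance(value, ToyTask):
        return value.run()
    if isinstance(value, list):
        return [resolve(v) for v in value]
    if isinstance(value, tuple):
        return tuple(resolve(v) for v in value)
    return value


def toy_task_generator(func):
    """Decorator: calling the function now builds a ToyTask instead of running it."""
    def make_task(*args):
        return ToyTask(func, args)
    return make_task


@toy_task_generator
def square(x):
    return x * x

@toy_task_generator
def total(values):
    return sum(values)

squares = [square(i) for i in range(4)]   # nothing has been computed yet
summary = total(squares)                  # depends on all four tasks
n_deps = len(list(summary.dependencies()))
answer = summary.run()
print(n_deps, answer)  # 4 14
```

Here `total(squares)` plays the role of `print_results(results)`: the list of tasks in its arguments is what tells the system which results must exist first.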

Easy as Py.

§

The full script above, including data, is available from GitHub.