parallel-locate¶
Faster Feature Location Through Parallel Computation¶
Feature-finding can easily be parallelized: each frame is an independent task, and the tasks can be divided among the multiple CPU cores found in most modern computers. Instead of running in a single process as usual, your code is spread across multiple "worker" processes, each running on its own CPU core.
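To make the idea concrete, here is a rough sketch of that pattern (not how trackpy implements it) using Python's built-in multiprocessing module. find_features and frame_sequence are placeholders introduced for illustration only:
from multiprocessing import Pool
import trackpy as tp

def find_features(frame):
    # hypothetical per-frame task: locate features in a single image
    return tp.locate(frame, 13, invert=True)

if __name__ == '__main__':  # needed when worker processes are spawned from a script
    frame_sequence = []  # placeholder: any sequence of 2D images
    with Pool() as pool:  # by default, one worker process per CPU core
        per_frame_results = pool.map(find_features, frame_sequence)
In practice you rarely need to write this yourself: trackpy.batch can do the multiprocessing for you, as shown below.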
First, let's set up the movie to track:
import pims
import trackpy as tp

@pims.pipeline
def gray(image):
    # keep a single color channel so each frame is a 2D grayscale image
    return image[:, :, 1]

frames = gray(pims.ImageSequence('../sample_data/bulk_water/*.png'))
tp.quiet() # Disabling progress reports makes this a fairer comparison
Using trackpy.batch¶
Beginning with trackpy v0.4.2, use the "processes" argument to have trackpy.batch run on multiple CPU cores at once (using Python's built-in multiprocessing module). Give the number of cores you want to use, or specify 'auto' to let trackpy detect how many cores your computer has.
Let's compare the time required to process the first 100 frames:
%%timeit
features = tp.batch(frames[:100], 13, invert=True, processes='auto')
2.33 s ± 55.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
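Note that %%timeit runs the cell repeatedly and discards its variables; in ordinary use you would call batch once and keep the DataFrame of features it returns, for example:
features = tp.batch(frames[:100], 13, invert=True, processes='auto')
features.head()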
For comparison, here's the same thing running in a single process. This was run on a laptop with only 2 cores, so we should expect batch to take roughly twice as long as the parallel version:
%%timeit
features = tp.batch(frames[:100], 13, invert=True)
4.93 s ± 110 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Using IPython Parallel¶
Using IPython parallel is a little more involved, but it gives you a lot of flexibility if you need to go beyond batch, for example by having the parallel workers run your own custom image processing (a sketch of this appears at the end of this section). It also works with all versions of trackpy.
Install ipyparallel and start a cluster¶
As of IPython 6.2 (November 2017), IPython parallel is a separate package. If you are not using a comprehensive distribution like Anaconda, you may need to install this package at the command prompt using pip install ipyparallel or conda install ipyparallel.
It is simplest to start a cluster on the CPUs of your local machine. To start one, go to a terminal and type:
ipcluster start
This automatically uses all available CPU cores, but you can also use the -n option to specify how many workers to start. Now you are running a cluster — it's that easy! More information on IPython parallel is available in the IPython parallel documentation.
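For example, to start four worker engines (four is just an illustrative number):
ipcluster start -n 4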
from ipyparallel import Client
client = Client()
view = client.load_balanced_view()
We can see that there are four cores available.
client[:]
<DirectView [0, 1, 2, 3]>
Use a little magic, %%px, to import trackpy on all cores.
%%px
import trackpy as tp
tp.quiet()
Use the workers to locate features¶
Define a function from locate with all the parameters specified, so the function's only argument is the image to be analyzed. We can map this function directly onto our collection of images. (This is called "currying" a function, hence the choice of name.)
curried_locate = lambda image: tp.locate(image, 13, invert=True)
view.map(curried_locate, frames[:4]) # Optionally, prime each engine: make it set up numba.
<AsyncMapResult: <lambda>>
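An equivalent way to build such a single-argument function, without a lambda, is partial application with functools.partial from the standard library (a small sketch of the same idea):
from functools import partial
# same effect as the lambda above: fix diameter and invert, leave the image free
curried_locate = partial(tp.locate, diameter=13, invert=True)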
Compare the time it takes to locate features in the first 100 images with and without parallelization.
%%timeit
amr = view.map_async(curried_locate, frames[:100])
amr.wait_interactive()
results = amr.get()
100/100 tasks finished after 2 s
done
2.9 s ± 195 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%%timeit
serial_result = list(map(curried_locate, frames[:100]))
3.9 s ± 58.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Finally, if we want to get output similar to batch, we collect the results into a single DataFrame:
import pandas as pd
amr = view.map_async(curried_locate, frames[:100])
amr.wait_interactive()
results = amr.get()
features_ipy = pd.concat(results, ignore_index=True)
features_ipy.head()
100/100 tasks finished after 2 s
done
|   | y | x | mass | size | ecc | signal | raw_mass | ep | frame |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 5.728435 | 295.067222 | 297.073839 | 2.499673 | 0.230136 | 16.877187 | 14760.0 | 0.081197 | 0 |
| 1 | 5.918431 | 339.195418 | 254.571603 | 2.979975 | 0.300296 | 13.077611 | 14693.0 | 0.089778 | 0 |
| 2 | 6.782609 | 309.578502 | 219.491795 | 3.551496 | 0.137154 | 4.506474 | 14508.0 | 0.126768 | 0 |
| 3 | 7.380101 | 431.548351 | 474.240123 | 2.852436 | 0.358819 | 16.877187 | 15011.0 | 0.059789 | 0 |
| 4 | 8.202306 | 36.250343 | 321.903627 | 2.882596 | 0.173362 | 10.603468 | 15401.0 | 0.042414 | 0 |
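As mentioned earlier, the same map pattern lets the workers run your own custom image processing instead of plain locate. Below is a rough sketch under a few assumptions: locate_smoothed is a hypothetical function (not part of trackpy), and scipy is assumed to be installed on the engines.
def locate_smoothed(image):
    # hypothetical per-frame pipeline: smooth the image, then locate features
    # imports are inside the function so every engine can resolve them
    import trackpy as tp
    from scipy.ndimage import gaussian_filter
    return tp.locate(gaussian_filter(image, sigma=1), 13, invert=True)

amr = view.map_async(locate_smoothed, frames[:100])
amr.wait_interactive()
features_custom = pd.concat(amr.get(), ignore_index=True)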