Adaptive Search: Adjusting search_range on the Fly

As discussed in the tutorial on “Advanced Linking,” a basic limitation of the tracking algorithm involves the selection of search_range. If it is too small, many valid trajectories will be unlinked or linked incorrectly. If it is too large, trackpy will become overwhelmed with candidate features, and will either be bogged down by those possibilities or will give up entirely, raising a SubnetOversizeException.
The tutorial on “Advanced Linking” identifies some ways to ease this dilemma, including prediction, which lets trackpy select candidate features more intelligently. Still, “perfect” tracking is often computationally impossible. One is then faced with the task of finding a search_range that allows linking to finish in a reasonable amount of time while merely minimizing linking errors. The usual method for optimizing search_range is to repeatedly adjust it and run the linker, searching for the largest value that does not halt or intolerably slow the algorithm. This search can be time-consuming and frustrating, especially for long movies. Perhaps worse, one must choose the search_range for all particles in all frames, even if the failure is being caused by a single large subnet spanning just two successive frames.
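As a minimal sketch of this trial-and-error loop, assuming a hypothetical DataFrame of located features named features and an arbitrary list of candidate values (note that this catches only outright failure, not intolerable slowness):

import trackpy

# Hypothetical manual search: try candidates from largest to smallest and
# keep the first search_range that links without an oversize subnet.
for sr in [1.0, 0.9, 0.8, 0.7]:
    try:
        tracks = trackpy.link_df(features, sr)  # 'features' is hypothetical
        break  # success: this is the largest workable candidate
    except trackpy.SubnetOversizeException:
        continue  # subnet too large: retry with a smaller value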
This difficulty is addressed by the “adaptive search” functionality that is built into trackpy starting with version 0.3. Instead of specifying a single search_range for all particles in all frames, one gives trackpy a maximum value. trackpy then automatically reduces search_range only when and where it encounters difficulty. When it encounters a large subnet, the algorithm repeatedly attempts to link only those particles, using smaller and smaller values of search_range. As search_range is reduced, some of the possible track assignments are eliminated from consideration. The original large subnet is thereby broken up into smaller pieces, some of which may be small enough to solve directly (by considering all possible track assignments). This process continues until no oversize subnet(s) remain, or a minimum search_range is reached. In the latter case, a SubnetOversizeException is raised and linking is aborted.
To any of the link family of functions, supply the search_range argument as usual, along with an adaptive_stop argument, and optionally an adaptive_step. When trackpy encounters a large subnet, it will locally reduce search_range by multiplying it by adaptive_step. (adaptive_step must be greater than 0 and less than 1; the default is 0.95.) This multiplication will be applied repeatedly if necessary. adaptive_stop (note the single-letter difference!) tells trackpy when to give up. When search_range has become less than or equal to adaptive_stop, trackpy aborts linking by raising a SubnetOversizeException.
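Conceptually, the reduction applied to a single oversize subnet works roughly like the following sketch. This is not trackpy’s actual implementation; solvable is a hypothetical stand-in for the real test of whether the subnet has become small enough:

import trackpy

def shrink_search_range(search_range, adaptive_stop, adaptive_step, solvable):
    # Repeatedly shrink the local search range until the subnet is solvable.
    sr = search_range
    while not solvable(sr):
        sr *= adaptive_step  # e.g. 0.95 shrinks sr by 5% per iteration
        if sr <= adaptive_stop:
            # Give up: even the smallest allowed range leaves the subnet oversize.
            raise trackpy.SubnetOversizeException('reached adaptive_stop')
    return sr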
For example, at the high end, you might consider 10.0 to be a generous but reasonable search_range for your movie. At the low end, you feel that a search_range of 2.0 — the radius of one of your particles — is unreasonably small and would be a sure sign of trouble. Finally, you decide that search_range should be reduced carefully — just 5% at a time. The values you should specify are then

{'search_range': 10.0,
 'adaptive_stop': 2.0,
 'adaptive_step': 0.95}
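In practice, these are simply arguments to the linking call; for instance, with a hypothetical DataFrame of located features named features:

tracks = trackpy.link_df(features, search_range=10.0,
                         adaptive_stop=2.0, adaptive_step=0.95)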
There is one more parameter that you may wish to adjust, but the reason is a little technical. Under normal circumstances, if you have a movie that does not cry out for adaptive search, large subnets will be rare, and to get the most accurate linking you will want the computer to solve them all, even ones that take a few minutes. This favors a default maximum subnet size, stored in trackpy.linking.Linker.MAX_SUB_NET_SIZE, that is rather generous.
This situation changes when adaptive search is enabled. As it reduces search_range, trackpy is searching for the largest computable subnet(s) embedded within the original oversize, non-computable subnet. If you are lazy (and who isn’t?), you will want to choose a generous (but reasonable) maximum search_range and let trackpy do the hard work of optimizing it downward. The result is that trackpy will frequently be solving subnets that are just below the maximum size. This favors a small maximum size, corresponding to a subnet that can be solved in well under a second.
Because a different behavior is desired in each case, there is a different default subnet limit for adaptive search, stored in trackpy.linking.Linker.MAX_SUB_NET_SIZE_ADAPTIVE. As with the regular default, you may wish to adjust this value depending on your patience and the speed of your computer.
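Both limits are ordinary class attributes, so adjusting one is a single assignment. For example, to make adaptive search even more impatient (the value 10 here is arbitrary):

trackpy.linking.Linker.MAX_SUB_NET_SIZE_ADAPTIVE = 10  # arbitrary example value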
The default values in the current version of trackpy are
import trackpy
import pandas as pd
import numpy as np
%matplotlib notebook
from matplotlib.pyplot import * # not recommended usage, but we use it for brevity here
trackpy.linking.Linker.MAX_SUB_NET_SIZE
30
trackpy.linking.Linker.MAX_SUB_NET_SIZE_ADAPTIVE
15
Since subnet computation time scales very roughly as \(N!\), this is actually a huge difference in speed:
import math
float(math.factorial(trackpy.linking.Linker.MAX_SUB_NET_SIZE) / \
math.factorial(trackpy.linking.Linker.MAX_SUB_NET_SIZE_ADAPTIVE))
2.0284320493172736e+20
The difference in maximal subnet size is the primary drawback of adaptive search. Solving a large subnet may be slow, but at least you are getting the best possible set of trajectories, ones that minimize the total displacement over all particles. (For details, see the original paper by Crocker & Grier, referenced in the introduction to the trackpy documentation.) Depending on the details of your movie, if you insist on solving smaller subnets you may instead get results that are less than fully correct.
In other words, adaptive search can abolish a lot of headaches, tedium, and gut-wrenching decisions, but it cannot magically clean up problematic data, and it cannot abolish the possibility of shooting yourself in the foot. Proceed accordingly.
This is a highly contrived example, but it illustrates how adaptive search can greatly ease the problem of choosing search_range.

As example data, we define a grid of points that has contracted in the second frame. It is visualized below, using a function we define.
def contracting_grid(scale=0.95):
    """Two frames with a grid of 441 points.

    In the second frame, the points contract, so that the outermost set
    coincides with the second-outermost set in the previous frame.
    This is a way to challenge (and/or stump) a subnet solver.
    """
    pts0x, pts0y = np.mgrid[-10:11, -10:11]
    pts0 = pd.DataFrame(dict(x=pts0x.flatten(), y=pts0y.flatten(),
                             frame=0))
    pts1 = pts0.copy()
    pts1.frame = 1
    pts1.x = pts1.x * scale
    pts1.y = pts1.y * scale
    return pd.concat([pts0, pts1], ignore_index=True)

cg = contracting_grid()
cg_original = cg.copy()
def trshow(tr, first_style='bo', last_style='g.', style='b.'):
    """Plot a DataFrame of features, with distinct styles for the first
    and last frames, and a red line along each particle's trajectory."""
    frames = list(tr.groupby('frame'))
    nframes = len(frames)
    for i, (fnum, pts) in enumerate(frames):
        if i == 0:
            sty = first_style
        elif i == nframes - 1:
            sty = last_style
        else:
            sty = style
        plot(pts.x, pts.y, sty)
    if 'particle' in tr.columns:
        # Connect each particle's positions with a red line.
        for pid in tr.particle.unique():
            plot(tr.x[tr.particle == pid],
                 tr.y[tr.particle == pid], 'r-')
    axis('equal'); ylim(-11, 11)
    xlabel('x')
    ylabel('y')

figure(figsize=(6, 6))
trshow(cg_original)
In the preceding plot, the large blue dots are the first frame, and the small green dots are the second frame.
Next, we perform linking with a “small” search_range of 0.65. The computation finishes quickly but fails to link the particles with the largest displacements, in the corners.
%%time
tracks_toosmall = trackpy.link_df(cg, 0.65)
Frame 1: 441 trajectories present
CPU times: user 308 ms, sys: 6.78 ms, total: 315 ms
Wall time: 318 ms
figure(figsize=(6, 6))
trshow(tracks_toosmall)
Seeing this problem, we increase search_range to 0.8. This computation also finishes quickly — by raising a SubnetOversizeException!
%%time
try:
    tracks_toobig = trackpy.link_df(cg, 0.8)
except trackpy.SubnetOversizeException:
    print('Failed!!')
Frame 0: 441 trajectories present
Failed!!
CPU times: user 22.6 ms, sys: 1.95 ms, total: 24.6 ms
Wall time: 23.3 ms
By trial and error (or some trigonometry), we can find a value of search_range that works for this movie. But it requires more than a little patience:
%%time
# Warning: takes several minutes!
tracks_regular = trackpy.link_df(cg, 0.75)
Frame 1: 441 trajectories present
CPU times: user 6min 11s, sys: 1.44 s, total: 6min 13s
Wall time: 6min 13s
figure(figsize=(6, 6))
trshow(tracks_regular)
As printed above, it took several minutes to track just 441 particles in 2 frames. However, the result is correct, as shown by the plot, and by checking the duration of all the computed tracks:
tracks_regular.groupby('particle').particle.count().value_counts()
2    441
Name: particle, dtype: int64
This result indicates that there are 441 tracks that last for 2 frames, and no tracks that last for 1 frame, which would represent unlinked particles. So linking was successful, but at great cost.
Since we know this movie is tricky, we choose a very fine adaptive_step — just 1% per iteration. But we can choose an irresponsibly large search_range of 0.95 — almost the spacing between particles!
%%time
tracks_adaptive = trackpy.link_df(cg, 0.95, adaptive_stop=0.56, adaptive_step=0.99)
Frame 1: 441 trajectories present
CPU times: user 155 ms, sys: 2.46 ms, total: 158 ms
Wall time: 156 ms
With adaptive search enabled, we can obtain the correct results more than an order of magnitude faster than we could with the single value of search_range we found by hand. Almost as importantly, this computation was not preceded by rounds of trial and error to select search_range.
figure(figsize=(6, 6))
trshow(tracks_adaptive)
Both the plot and our check of track lengths show that the result is correct.
tracks_adaptive.groupby('particle').particle.count().value_counts()
2    441
Name: particle, dtype: int64
We can use linking diagnostics, also new in trackpy v0.3, to see what adaptive search did. (See the separate tutorial on diagnostics for a more complete overview.) Let’s re-run linking with diagnostics, and look at the search_range that was used to link each particle in the second frame. (It’s not recorded for the first frame because those particles have no preceding frame.)
tracks_adaptive = trackpy.link_df(cg, 0.95, adaptive_stop=0.56, adaptive_step=0.99, diagnostics=True)
Frame 1: 441 trajectories present
figure()
scatter(tracks_adaptive.x, tracks_adaptive.y, c=tracks_adaptive.diag_search_range.astype(float),
        s=20, linewidth=0, cmap=cm.GnBu, vmin=0.7)
colorbar(label='diag_search_range')
axis('image')
(-10.454959021724294, 10.454959021724292, -10.456405403060545, 10.456405403060545)
As we might have guessed, adaptive search worked by reducing search_range in the trouble spots at the corners.
As discussed above, adaptive search is not magic, and it may create the false impression that bad data are trackable. Until you are confident that you’re using adaptive search properly, it’s a good idea to pay attention to diagnostic output, especially diag_search_range, to get a sense of where and how often it’s used.
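If you prefer a numerical summary to a plot, tabulating the diagnostic column directly works too; for example:

# Count how many particles were linked at each (possibly reduced) search range.
tracks_adaptive.diag_search_range.value_counts().sort_index()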