Basic Operations

Operations On Points

You will most commonly operate on points (singly or in small sets) in order to construct trajectories or while manipulating trajectories to construct more trajectories.

The second most common use case for operations on points is to to compute point-to-point quantities like speed, bearing, distance, and turn angle. These can be used as features in analysis or as annotations to decorate trajectories during rendering.

All of our mathematical operations on trajectory points are in the module tracktable.core.geomath. These include concepts like distance or speed between two points, the bearing from one point to another, the turn angle at a point, and the geometric mean or median of set of points. Please refer to the geomath module for details.

Examples of Per-Point Features

Todo

Write this section

Adding Per-Point Features To Trajectories

Once we have points or trajectories in memory we may want to annotate them with derived quantities for analysis or rendering. For example, we might want to color an airplane’s trajectory using its climb rate to indicate takeoff, landing, ascent and descent. We might want to use acceleration, deceleration and rates of turning to help classify moving objects.

In order to accomplish this, we add features to the per-point properties of TrajectoryPoint objects as annotations. The tracktable.feature.annotations module contains functions for this: calculators to compute a feature and accessors to retrieve the feature later for analysis and rendering. Calculators and accessors are deliberately simple to make it easier for you to add your own. There is no limit to the number of features you can add to each point.

The simplest feature is progress. This has a value of zero at the beginning of the trajectory and one at the end. It is useful for color- coding trajectories for visualization so that their beginnings and ends are easy to distinguish.

Annotations Example

Adding Progress Indicator To Trajectories
1
2
3
4
5
6
7
8
9
from tracktable.feature import annotations

# Suppose that my_trajectories is a list of already-
# compiled trajectories. We want to add the "progress"
# annotation to all the points in each trajectory.

annotated_trajectories = [
    annotations.progress(t) for t in my_trajectories
]
Retrieving Accessor For Given Annotation
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
from tracktable.feature import annotations

# Retrieve the color type of the trajectory
if trajectory_color_type == 'scalar':
    annotator = annotations.retrieve_feature_function(trajectory_color)

    def annotation_generator(traj_source):
        for trajectory in traj_source:
            yield(annotator(trajectory))

    trajectories_to_render = annotation_generator(trajectory_source)
    scalar_generator = annotations.retrieve_feature_accessor(trajectory_color)
    colormap = trajectory_colormap

Todo

This second code snippet is confusing.

Assembling Trajectories from Points

Creating trajectories from a set of points is at heart a simple operation. Sort a set of input points by non-decreasing timestamp, then group them by object ID. Each different group can then be viewed as the vertices of a polyline (connected series of line segments). This is our representation for a trajectory.

The task becomes more nuanced when we consider the following question:

If a trajectory contains a large gap in either time or distance between two successive points, is it still a single trajectory?

The answer to this question changes for every different data set. The trajectory assembler in Tracktable allows you to specify your own values for the distance and time separation thresholds. Here are the details.

Tracktable includes a filter, tracktable.analysis.assemble_trajectories.AssembleTrajectoryFromPoints, to create a sequence of trajectories from a sequence of trajectory points sorted by increasing timestamp. The caller is responsible for ensuring that the points are sorted.

This filter is present in both C++ and Python. In Python, the input point sequence only needs to be an iterable and will only be traversed once. The output (sequence of trajectories) is also an iterable and can only be traversed once. In practice, we almost always save the assembled trajectories in a list for later use.

AssembleTrajectoryFromPoints has three parameters in addition to the point sequence:

  1. separation_time (datetime.timedelta) If the timestamps of two successive points with the same object ID differ by more than this amount, the points before the gap will be packaged up as a finished trajectory. A new trajectory will begin with the first point after the gap. The default separation time is 30 minutes.

  2. separation_distance (float): If two successive points with the same object ID are more than this distance apart, the points before the gap will be packaged up as a finished trajectory. A new trajectory will begin with the first point after the gap. The units of measurement for the separation distance depend on the point domain: kilometers for Terrestrial, no units for 2D and 3D Cartesian points. The default separation distance is infinite; that is, as long as two points are close enough together in time, the trajectory will continue.

  3. minimum_length (integer): Finished trajectories will be discarded unless they contain at least this many points. The default is 2 points.

Note

The name “minimum_length” is confusing because length can refer to distance as well as number of points. We will provide a better name in Tracktable 1.6, deprecate the existing name, and remove it in some future release.

Trajectory Assembly Example

Trajectory Assembly
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
from tracktable.domain.terrestrial import TrajectoryPointReader

with open('SampleFlight.csv', 'rb') as infile:
    reader = TrajectoryPointReader()
    reader.input = infile
    reader.delimiter = ','

    # Columns 0 and 1 are the object ID and timestamp
    reader.object_id_column = 0
    reader.timestamp_column = 1

    # Columns 2 and 3 are the longitude and
    # latitude (coordinates 0 and 1)
    reader.coordinates[0] = 2
    reader.coordinates[1] = 3

    # Column 4 is the altitude
    reader.set_real_field_column("altitude", 4)

    trajectory_assembler = AssembleTrajectoryFromPoints()
    trajectory_assembler.input = reader

    trajectory_assembler.separation_time = datetime.timedelta(minutes=30)
    trajectory_assembler.separation_distance = 100
    trajectory_assembler.minimum_length = 10

    trajectories = list(trajectory_assembler)
    # process trajectories here or add to a list

Operations On Trajectories

Some common use cases for operating on trajectories:

  1. Interpolate between points to find an approximate position at a

    specified time or distance traveled

  2. Extract a subset of the trajectory with endpoints specified by

    time or distance traveled

  3. Compute a scalar feature that describes some aspect of the entire

    trajectory

  4. Compute a vector of distance geometry values that collectively describe

    the trajectory’s shape

Interpolation and Subsets

The module tracktable.core.geomath contains several functions for interpolation along trajectories and extracting subsets between interpolated points. The first two will produce a TrajectoryPoint at some specified fraction along the trajectory, parameterized between 0 and 1 by time elapsed or by distance traveled.

  1. tracktable.core.geomath.point_at_time_fraction()

  2. tracktable.core.geomath.point_at_length_fraction()

These functions interpolate coordinates, timestamps, and all of the additional features present at points. We provide two separate parameterizations because indexing by time can lead to division by zero in later algorithms when a trajectory includes a stretch where the underlying vehicle stopped. Indexing by distance avoids this problem by ignoring veloity.

To extract a subset of trajectory instead of individual points, use subset_during_interval(). This function takes its endpoints as fractions between 0 and 1 (parameterized by time). We will add analogous functions to extract a subset by distance traveled, time fraction, and distance fraction for Tracktable 1.6.

Computing Scalar-Valued Trajectory Features

A scalar-valued trajectory feature is a single number that describes some aspect of the trajectory. A collection of these features can characterize a trajectory well enough to establish similarity and difference in a collection.

Here are a few examples along with code snippets to compute them. There are many other possible features.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
import tracktable.core.geomath

def total_travel_distance(trajectory):
    return trajectory[-1].current_length

def end_to_end_distance(trajectory):
    return tracktable.core.geomath.distance(
        trajectory[0], trajectory[-1]
    )

def straightness_ratio(trajectory):
    return end_to_end_distance(trajectory) / total_travel_distance(trajectory)

def total_winding(trajectory):
    t = trajectory
    return sum([
        tracktable.core.geomath.signed_turn_angle(t[i], t[i+1], t[i+2])
        for i in range(0, len(trajectory) - 3)
    ])

def total_turning(trajectory):
    t = trajectory
    return sum([
    tracktable.core.geomath.unsigned_turn_angle(t[i], t[i+1], t[i+2])
    for i in range(0, len(trajectory) - 3)
    ])

Computing Distance Geometry Features

Distance geometry is a family of methods for analyzing sets of points based only on the distances between pairs of members. In Tracktable, we use distance geometry to compute a multiscale description (called a signature) of a trajectory’s shape that can be used to search for similar trajectories independent of translation, uniform scale, rotation, or reflection.

The tracktable.analysis.distance_geometry module is responsible for computing the multilevel distance geometry signature of a given trajectory. As with extracting points and subsets, we provide functions to compute this signature with points sampled by length or time. If your data includes trajectories of objects that stop in one place, we recommend that you use the parameterization over length to avoid division by zero.

How Distance Geometry Works

When computing the distance geometry feature values for a trajectory, we first choose a depth d. For each level L = 1 ... d, we place L+1 points along the trajectory, equally spaced in either distance or time. Then, for that level, we compute the straightness of the L line segments that connect those points from beginning to end. A straightness value of 1 means that the trajectory is perfectly straight between two sample points. A straightness value of 0 means that the trajectory ends at the same point it began for a given segment regardless of its meandering along the way.

We collect these straightness values for all d levels to assemble a signature, which can be used as a feature vector. A distance geometry signature with depth d will have (d * (d+1)) / 2 values.

Distance Geometry Example

Distance Geometry by Distance and Time
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
from tracktable.analysis.distance_geometry import distance_geometry_by_distance
from tracktable.analysis.distance_geometry import distance_geometry_by_time
from tracktable.domain.terrestrial import TrajectoryPointReader

with open('point_data.csv', 'rb') as infile:
    reader = TrajectoryPointReader()
    reader.input = infile
    reader.delimiter = ','

    # Columns 0 and 1 are the object ID and timestamp
    reader.object_id_column = 0
    reader.timestamp_column = 1

    # Columns 2 and 3 are the longitude and
    # latitude (coordinates 0 and 1)
    reader.coordinates[0] = 2
    reader.coordinates[1] = 3

    # Column 4 is the altitude
    reader.set_real_field_column("altitude", 4)

    trajectory_assembler = AssembleTrajectoryFromPoints()
    trajectory_assembler.input = reader

    trajectory_assembler.separation_time = datetime.timedelta(minutes=30)
    trajectory_assembler.separation_distance = 100
    trajectory_assembler.minimum_length = 10

    distance_geometry_length_values = distance_geometry_by_distance(trajectory_assembler.trajectories(), 4)
    distance_geometry_time_values = distance_geometry_by_time(trajectory_assembler.trajectories(), 4)

Note

Refer to “Clustering with Distance Geometry” example Mention what we could do with these distance geometry values after computing them

Analyzing Trajectories Using Feature Vectors

The goal of feature creation is to represent each data point (in this case, each trajectory) with a feature vector. then to use those feature vectors as the inputs for further analysis.

In this section we will show you how to create a feature vector from a collection of features and how to feed those features to DBSCAN for clustering and an R-tree for finding items similar to an example.

Creating Feature Vectors

Tracktable has a specific point domain for feature vectors just as it has domains for geographic and Cartesian coordinates. In our current release we support feature vectors with 1 to 30 components. The function tracktable.domain.feature_vectors.convert_to_feature_vector() will convert a list or array of values into a feature vector:

Creating a Feature Vector
1
2
3
4
5
6
from tracktable.domain.feature_vectors import convert_to_feature_vector

# Suppose that the array 'my_feature_values' contains all of the features
# for a single trajectory.

my_feature_vector = convert_to_feature_vector(my_feature_values)

Like other Tracktable point types, the caller can read and write the individual values in a feature vector using the [] operator. In other words, just treat it like an ordinary list or array.

  • The tracktable.analysis.distance_geometry submodule will compute the multilevel distance geometry for a trajectory based on either length or time.

  • The tracktable.analysis.dbscan submodule will perform box density-based spatial clustering of applications with noise analysis to determine the clustering of the feature vector points.

  • The tracktable.analysis.rtree submodule will generate an R-tree that can efficiently compute the nearest neighbors of a given point or set of points.

DBSCAN Clustering

DBSCAN is a density-based clustering method that does not need to know the number of clusters in advance. It operates instead on a notion of when two points are close together. You must supply two parameters:

  1. Closeness: How close must two points be along each axis

    in order to belong to the same cluster?

  2. Minimum cluster size: How many points must be close to one another

    in order to be considered a cluster instead of coincidence?

As originally described, DBSCAN uses a single value to define “closeness”. This value is used as the radius of a sphere. For any given point, all other points within that sphere are close by.

In Tracktable, we specify closeness as a list of values, one per feature. This allows different values of closeness depending on the properties of each feature.

Suppose that you have maximum altitude and maximum speed as two of your features. In clustering, you might want to identify trajectories that have similar combinations of altitude and speed. In this situation you need a neighborhood defined with a box and a sphere because of the ranges of the variables involved. Maximum altitude is measured in feet above sea level and ranges from 0 to around 40,000. Maximum speed is measured in kilometers per hour and ranges from 0 to around 1000. Since these ranges are so different, any value that encompasses “close enough” for altitude will be too large to distinguish different classes of speeds. Conversely, any value that can divide speeds into different classes will be too small to group altitudes together.

Mathematically, a single radius is equivalent to clustering on the L2 norm. A vector of distances is conceptually equivalent to the L-infinity norm.

Note

An upcoming release of Tracktable will add back in the ability to specify a single radius. We also hope to extend DBSCAN to arbitrary metrics.

Todo

Modify this example to use max altitude / max speed as our features. Run on an example data set that has a mix of different classes of aircraft.

Our implementation of DBSCAN is in the tracktable.analysis.dbscan module. Here is an example of how to invoke it.

DBSCAN Clustering
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
from tracktable.analysis.dbscan import compute_cluster_labels
import tracktable.core.geomath

# Assume that 'all_trajectories' is a list of trajectories from some
# data source

# First we need features.
def end_to_end_distance(trajectory):
    return tracktable.core.geomath.distance(trajectory[0], trajectory[-1])

def total_length(trajectory):
    return trajectory[-1].current_length

feature_values = [
   [end_to_end_distance(t), total_length(t)] for t in all_trajectories
]

# Now we can create feature vectors.
feature_vectors = [convert_to_feature_vector(fv) for fv in feature_values]

# Let's say that two flights are "similar" if they have end-to-end distances
# within 5km of one another (suggesting that they flew between the same two
# airports) and total lengths within 100km of one another (to allow for
# minor diversions and holding patterns).

closeness = [5, 100]

minimum_cluster_size = 10

# And now we can run DBSCAN.

cluster_labels = compute_cluster_labels(
                     feature_vectors,
                     closeness,
                     minimum_cluster_size
                 )

# Done -- conduct further analysis or visualization based on the cluster labels.

R-Tree

The R-tree is a data structure that provides a fast way to find all points near a given search position. We use it to find all feature vectors within some specified distance of a sample feature vector. This, in turn, allows us to identify trajectories that have similar features.

Note

This may sound very familiar to the description of how DBSCAN identifies points that are close together. DBSCAN uses an R-tree internally.

As in our last example, we will use end-to-end distance and total travel distance as our two features.

R-Tree Search
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
from tracktable.analysis.rtree import RTree
from tracktable.domain.feature_vectors import convert_to_feature_vector
import tracktable.core.geomath

# Assume that 'all_trajectories' is a list of trajectories from some
# data source

# First we need features.
def end_to_end_distance(trajectory):
    return tracktable.core.geomath.distance(trajectory[0], trajectory[-1])

def total_length(trajectory):
    return trajectory[-1].current_length

feature_values = [
   [end_to_end_distance(t), total_length(t)] for t in all_trajectories
]

# Now we can create feature vectors.
feature_vectors = [convert_to_feature_vector(fv) for fv in feature_values]

# Now we create an R-tree from those feature vectors.
my_tree = RTree(feature_vectors)

# Suppose that we have an interesting trajectory whose end-to-end distance
# is 1000 km but traveled a total of 2000 km -- that is, there was some
# significant wandering involved. We want to find similar trajectories.

interesting_feature_vector = convert_to_feature_vector([1000, 2000])

# Case 1: We want the 10 nearest neighbors.
nearest_neighbor_indices = my_tree.find_nearest_neighbors(
                             interesting_feature_vector, 10
                             )

# Case 2: We want all the points with end-to-end distance between
# 950 and 1050 km but total distance between 1900 and 5000 km.

search_box_min = convert_to_feature_vector([950, 1900])
search_box_max = convert_to_feature_vector([1050, 5000])

similar_indices = my_tree.find_points_in_box(
                                 search_box_min,
                                 search_box_max
                                 )

# The contents of nearest_neighbor_indices and similar_indices are
# indices into the list of feature vectors. Because the feature
# vectors are stored in the same order as the list of input
# trajectories, we can also use them as indices back into the
# list of trajectories.