Basic Operations

Operations On Points

You will most commonly operate on points (singly or in small sets) in order to construct trajectories or while manipulating trajectories to construct more trajectories.

The second most common use case for operations on points is to compute point-to-point quantities like speed, bearing, distance, and turn angle. These can be used as features in analysis or as annotations to decorate trajectories during rendering.

All of our mathematical operations on trajectory points are in the module tracktable.core.geomath. These include concepts like distance or speed between two points, the bearing from one point to another, the turn angle at a point, and the geometric mean or median of a set of points. Please refer to the geomath module for details.
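
To make the point-to-point quantities concrete, here is a minimal pure-Python sketch of great-circle distance, initial bearing and speed between two longitude/latitude points. It does not use Tracktable: the function names, the haversine formula and the mean Earth radius are assumptions chosen for illustration, and the geomath module may compute these quantities differently.

```python
import math

# Mean Earth radius in kilometers (an assumption of this sketch).
EARTH_RADIUS_KM = 6371.0


def great_circle_distance_km(lon1, lat1, lon2, lat2):
    """Haversine distance in km between two (longitude, latitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def initial_bearing_deg(lon1, lat1, lon2, lat2):
    """Initial bearing from point 1 to point 2, degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    return math.degrees(math.atan2(y, x)) % 360.0


def speed_kmh(distance_km, elapsed_seconds):
    """Average speed in km/h over a segment of known length and duration."""
    return distance_km / (elapsed_seconds / 3600.0)
```

Each function takes plain numbers, so the results can be attached to points as features or used directly in analysis.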

Examples of Per-Point Features

Todo

Write this section

Adding Per-Point Features To Trajectories

Once we have points or trajectories in memory we may want to annotate them with derived quantities for analysis or rendering. For example, we might want to color an airplane’s trajectory using its climb rate to indicate takeoff, landing, ascent and descent. We might want to use acceleration, deceleration and rates of turning to help classify moving objects.

In order to accomplish this, we add features to the per-point properties of TrajectoryPoint objects as annotations. The tracktable.feature.annotations module contains functions for this: calculators to compute a feature and accessors to retrieve the feature later for analysis and rendering. Calculators and accessors are deliberately simple to make it easier for you to add your own. There is no limit to the number of features you can add to each point.

The simplest feature is progress. This has a value of zero at the beginning of the trajectory and one at the end. It is useful for color-coding trajectories for visualization so that their beginnings and ends are easy to distinguish.
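
As an illustration of what a progress feature looks like, the following pure-Python sketch computes one value per point for a list of (x, y) tuples. It assumes progress is the fraction of cumulative path length reached at each point; that definition, the function name and the tuple representation are assumptions of this sketch, and Tracktable's own calculator may define progress differently.

```python
import math


def progress_values(points):
    """Return one progress value per point: 0.0 at the start, 1.0 at the end.

    `points` is a list of (x, y) tuples. Progress here is the fraction of
    the total path length reached at each point (an assumed definition).
    """
    cumulative = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cumulative.append(cumulative[-1] + math.hypot(x1 - x0, y1 - y0))

    total = cumulative[-1]
    if total == 0.0:
        # Degenerate case: a single point or no movement at all.
        return [0.0] * len(points)
    return [c / total for c in cumulative]
```

For the three points (0, 0), (1, 0) and (4, 0) this yields 0.0, 0.25 and 1.0, which can be mapped straight onto a color scale.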

Annotations Example

Adding Progress Indicator To Trajectories
from tracktable.feature import annotations

# Suppose that my_trajectories is a list of already-
# compiled trajectories. We want to add the "progress"
# annotation to all the points in each trajectory.

annotated_trajectories = [
    annotations.progress(t) for t in my_trajectories
]
Retrieving Accessor For Given Annotation
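from tracktable.feature import annotations

# This snippet assumes that trajectory_color_type, trajectory_color,
# trajectory_colormap and trajectory_source are already defined
# elsewhere. trajectory_color is the name of the annotation to use.
if trajectory_color_type == 'scalar':
    # Calculator: computes the named feature for a trajectory
    annotator = annotations.retrieve_feature_function(trajectory_color)

    def annotation_generator(traj_source):
        for trajectory in traj_source:
            yield annotator(trajectory)

    trajectories_to_render = annotation_generator(trajectory_source)
    # Accessor: retrieves the stored feature later for rendering
    scalar_generator = annotations.retrieve_feature_accessor(trajectory_color)
    colormap = trajectory_colormap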

Todo

This second code snippet is confusing.

Assembling Trajectories from Points

Creating trajectories from a set of points is at heart a simple operation. Sort a set of input points by non-decreasing timestamp, then group them by object ID. Each different group can then be viewed as the vertices of a polyline (connected series of line segments). This is our representation for a trajectory.

The task becomes more nuanced when we consider the following question:

If a trajectory contains a large gap in either time or distance between two successive points, is it still a single trajectory?

The answer to this question changes for every different data set. The trajectory assembler in Tracktable allows you to specify your own values for the distance and time separation thresholds. Here are the details.

Tracktable includes a filter, tracktable.applications.assemble_trajectories.AssembleTrajectoryFromPoints, to create a sequence of trajectories from a sequence of trajectory points sorted by increasing timestamp. The caller is responsible for ensuring that the points are sorted.

This filter is present in both C++ and Python. In Python, the input point sequence only needs to be an iterable and will only be traversed once. The output (sequence of trajectories) is also an iterable and can only be traversed once. In practice, we almost always save the assembled trajectories in a list for later use.

AssembleTrajectoryFromPoints has three parameters in addition to the point sequence:

  1. separation_time (datetime.timedelta): If the timestamps of two successive points with the same object ID differ by more than this amount, the points before the gap will be packaged up as a finished trajectory. A new trajectory will begin with the first point after the gap. The default separation time is 30 minutes.

  2. separation_distance (float): If two successive points with the same object ID are more than this distance apart, the points before the gap will be packaged up as a finished trajectory. A new trajectory will begin with the first point after the gap. The units of measurement for the separation distance depend on the point domain: kilometers for Terrestrial, no units for 2D and 3D Cartesian points. The default separation distance is infinite; that is, as long as two points are close enough together in time, the trajectory will continue.

  3. minimum_length (integer): Finished trajectories will be discarded unless they contain at least this many points. The default is 2 points.
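
The three parameters above can be sketched in a few lines of pure Python. This is not Tracktable's implementation: the function name, the (object_id, timestamp, x, y) tuple layout and the use of Euclidean distance are assumptions made for illustration. The input is assumed to be sorted by timestamp already, as the real filter also requires.

```python
import datetime
import math
from collections import defaultdict


def assemble_trajectories(points,
                          separation_time=datetime.timedelta(minutes=30),
                          separation_distance=math.inf,
                          minimum_length=2):
    """Split time-sorted points into trajectories, tracked per object ID.

    Each point is a hypothetical (object_id, timestamp, x, y) tuple.
    """
    in_progress = defaultdict(list)  # object_id -> points of the open trajectory
    finished = []

    def close(object_id):
        # Package up the open trajectory, discarding it if it is too short.
        candidate = in_progress.pop(object_id, [])
        if len(candidate) >= minimum_length:
            finished.append(candidate)

    for point in points:
        object_id, timestamp, x, y = point
        current = in_progress.get(object_id)
        if current:
            _, last_time, last_x, last_y = current[-1]
            time_gap = timestamp - last_time
            distance_gap = math.hypot(x - last_x, y - last_y)
            if time_gap > separation_time or distance_gap > separation_distance:
                close(object_id)
        in_progress[object_id].append(point)

    # Flush whatever is still open when the input ends.
    for object_id in list(in_progress):
        close(object_id)
    return finished
```

With the defaults, only a time gap longer than 30 minutes splits a trajectory; lowering separation_distance adds a second, independent splitting criterion.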

Note

The name “minimum_length” is confusing because length can refer to distance as well as number of points. We will provide a better name in Tracktable 1.6, deprecate the existing name, and remove it in some future release.

Trajectory Assembly Example

Note

As of Tracktable 1.7, there is a generalized trajectory loader that will automatically load CSV, TSV or TRAJ files and, if desired, automatically assemble the points into trajectories.

General Trajectory Assembly
from tracktable_data.data import retrieve
from tracktable.rw.load import load_trajectories

trajectories = load_trajectories(retrieve('SampleFlight.csv'),
                                 real_fields={"altitude": 4},
                                 separation_time=30,
                                 separation_distance=100,
                                 minimum_length=10)

# process the trajectories here

Note

For posterity, the example for creating a reader and assembler by hand has been preserved below for reference.

Trajectory Assembly
import datetime

from tracktable.applications.assemble_trajectories import AssembleTrajectoryFromPoints
from tracktable.domain.terrestrial import TrajectoryPointReader
from tracktable_data.data import retrieve

with open(retrieve('SampleFlight.csv'), 'rb') as infile:
    reader = TrajectoryPointReader()
    reader.input = infile
    reader.delimiter = ','

    # Columns 0 and 1 are the object ID and timestamp
    reader.object_id_column = 0
    reader.timestamp_column = 1

    # Columns 2 and 3 are the longitude and
    # latitude (coordinates 0 and 1)
    reader.coordinates[0] = 2
    reader.coordinates[1] = 3

    # Column 4 is the altitude
    reader.set_real_field_column("altitude", 4)

    trajectory_assembler = AssembleTrajectoryFromPoints()
    trajectory_assembler.input = reader

    trajectory_assembler.separation_time = datetime.timedelta(minutes=30)
    trajectory_assembler.separation_distance = 100
    trajectory_assembler.minimum_length = 10

    # Consume the assembler while the input file is still open
    trajectories = list(trajectory_assembler)

    # process the trajectories here

Operations On Trajectories

Some common use cases for operating on trajectories:

  1. Interpolate between points to find an approximate position at a specified time or distance traveled

  2. Extract a subset of the trajectory with endpoints specified by time or distance traveled

  3. Compute a scalar feature that describes some aspect of the entire trajectory

  4. Compute a vector of distance geometry values that collectively describe the trajectory’s shape

Interpolation and Subsets

The module tracktable.core.geomath contains several functions for interpolation along trajectories and extracting subsets between interpolated points. The first two will produce a TrajectoryPoint at some specified fraction along the trajectory, parameterized between 0 and 1 by time elapsed or by distance traveled.

  1. tracktable.core.geomath.point_at_time_fraction()

  2. tracktable.core.geomath.point_at_length_fraction()

These functions interpolate coordinates, timestamps, and all of the additional features present at points. We provide two separate parameterizations because indexing by time can lead to division by zero in later algorithms when a trajectory includes a stretch where the underlying vehicle stopped. Indexing by distance avoids this problem by ignoring velocity.
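
The idea behind interpolating at a time fraction can be sketched in pure Python as follows. This is only an illustration: the function name, the (timestamp, x, y) tuple layout and the use of linear interpolation on Cartesian coordinates are assumptions of this sketch, not a description of how the geomath functions are implemented.

```python
import bisect
import datetime


def interpolate_at_time_fraction(points, fraction):
    """Linearly interpolate a (timestamp, x, y) point at a fraction of elapsed time.

    `points` is a time-sorted list of hypothetical (timestamp, x, y) tuples.
    `fraction` is clamped to the range [0, 1].
    """
    fraction = min(max(fraction, 0.0), 1.0)
    start, end = points[0][0], points[-1][0]
    target = start + (end - start) * fraction

    timestamps = [p[0] for p in points]
    # Index of the first point whose timestamp is at or after the target.
    upper = bisect.bisect_left(timestamps, target)
    if upper == 0:
        return points[0]

    t0, x0, y0 = points[upper - 1]
    t1, x1, y1 = points[upper]
    # bisect_left guarantees t0 < target <= t1, so (t1 - t0) is nonzero.
    weight = (target - t0) / (t1 - t0)
    return (target, x0 + weight * (x1 - x0), y0 + weight * (y1 - y0))
```

A length-fraction version would follow the same pattern, with cumulative distance traveled taking the place of the timestamps.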

To extract a subset of a trajectory instead of individual points, use subset_during_interval(). This function takes its endpoints as fractions between 0 and 1 (parameterized by time). We will add analogous functions to extract a subset by distance traveled, time fraction, and distance fraction for Tracktable 1.6.

Computing Scalar-Valued Trajectory Features

A scalar-valued trajectory feature is a single number that describes some aspect of the trajectory. A collection of these features can characterize a trajectory well enough to establish similarity and difference in a collection.

Here are a few examples along with code snippets to compute them. There are many other possible features.

import tracktable.core.geomath

def total_travel_distance(trajectory):
    return trajectory[-1].current_length

def end_to_end_distance(trajectory):
    return tracktable.core.geomath.distance(
        trajectory[0], trajectory[-1]
    )

def straightness_ratio(trajectory):
    return end_to_end_distance(trajectory) / total_travel_distance(trajectory)

# The turn-angle sums below visit every triple of consecutive points
# (i, i+1, i+2), so i runs from 0 to len(trajectory) - 3 inclusive.

def total_winding(trajectory):
    t = trajectory
    return sum([
        tracktable.core.geomath.signed_turn_angle(t[i], t[i+1], t[i+2])
        for i in range(0, len(trajectory) - 2)
    ])

def total_turning(trajectory):
    t = trajectory
    return sum([
        tracktable.core.geomath.unsigned_turn_angle(t[i], t[i+1], t[i+2])
        for i in range(0, len(trajectory) - 2)
    ])

Computing Distance Geometry Features

Distance geometry is a family of methods for analyzing sets of points based only on the distances between pairs of members. In Tracktable, we use distance geometry to compute a multiscale description (called a signature) of a trajectory’s shape that can be used to search for similar trajectories independent of translation, uniform scale, rotation, or reflection.

The tracktable.algorithms.distance_geometry module is responsible for computing the multilevel distance geometry signature of a given trajectory. As with extracting points and subsets, we provide functions to compute this signature with points sampled by length or time. If your data includes trajectories of objects that stop in one place, we recommend that you use the parameterization over length to avoid division by zero.

How Distance Geometry Works

When computing the distance geometry feature values for a trajectory, we first choose a depth d. For each level L = 1 ... d, we place L+1 points along the trajectory, equally spaced in either distance or time. Then, for that level, we compute the straightness of the L line segments that connect those points from beginning to end. A straightness value of 1 means that the trajectory is perfectly straight between two sample points. A straightness value of 0 means that the trajectory ends at the same point it began for a given segment regardless of its meandering along the way.

We collect these straightness values for all d levels to assemble a signature, which can be used as a feature vector. A distance geometry signature with depth d will have (d * (d+1)) / 2 values.
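
The procedure described above can be sketched in pure Python for 2D Cartesian points sampled by distance. This is an illustration built only from the description in this section: the function names and the (x, y) tuple representation are assumptions, and the real module may differ in details such as normalization.

```python
import math


def _cumulative_lengths(points):
    """Distance traveled from the first point to each point."""
    lengths = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        lengths.append(lengths[-1] + math.hypot(x1 - x0, y1 - y0))
    return lengths


def _point_at_length(points, lengths, target):
    """Interpolate the position `target` units of distance along the polyline."""
    for i in range(1, len(points)):
        if target <= lengths[i]:
            span = lengths[i] - lengths[i - 1]
            w = 0.0 if span == 0.0 else (target - lengths[i - 1]) / span
            (x0, y0), (x1, y1) = points[i - 1], points[i]
            return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))
    return points[-1]


def distance_geometry_signature(points, depth):
    """Return the d * (d + 1) / 2 straightness values for levels 1..depth."""
    lengths = _cumulative_lengths(points)
    total = lengths[-1]
    if total == 0.0:
        raise ValueError("trajectory has zero length")

    signature = []
    for level in range(1, depth + 1):
        # level + 1 sample points, equally spaced by distance traveled.
        samples = [
            _point_at_length(points, lengths, total * k / level)
            for k in range(level + 1)
        ]
        traveled = total / level  # path length between adjacent samples
        for (ax, ay), (bx, by) in zip(samples, samples[1:]):
            signature.append(math.hypot(bx - ax, by - ay) / traveled)
    return signature
```

A perfectly straight path gives a signature of all ones, while a closed loop gives 0 at level 1 because the single segment starts and ends at the same place.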

Distance Geometry Example

Distance Geometry by Distance and Time
from tracktable.algorithms.distance_geometry import distance_geometry_by_distance
from tracktable.algorithms.distance_geometry import distance_geometry_by_time
from tracktable_data.data import retrieve
from tracktable.rw.load import load_trajectories

trajectories = load_trajectories(retrieve('SampleFlightsUS.csv'),
                                 real_fields={"altitude": 4},
                                 separation_time=30,
                                 separation_distance=100,
                                 minimum_length=10)

# Compute a depth-4 signature for each trajectory in turn
for trajectory in trajectories:
    distance_geometry_length_values = distance_geometry_by_distance(trajectory, 4)
    distance_geometry_time_values = distance_geometry_by_time(trajectory, 4)

    # Process or store distance geometry values

Note

For posterity, the example for creating a reader and assembler by hand has been preserved below for reference.

Distance Geometry by Distance and Time
import datetime

from tracktable.algorithms.distance_geometry import distance_geometry_by_distance
from tracktable.algorithms.distance_geometry import distance_geometry_by_time
from tracktable.applications.assemble_trajectories import AssembleTrajectoryFromPoints
from tracktable.domain.terrestrial import TrajectoryPointReader
from tracktable_data.data import retrieve

with open(retrieve('SampleFlightsUS.csv'), 'rb') as infile:
    reader = TrajectoryPointReader()
    reader.input = infile
    reader.delimiter = ','

    # Columns 0 and 1 are the object ID and timestamp
    reader.object_id_column = 0
    reader.timestamp_column = 1

    # Columns 2 and 3 are the longitude and
    # latitude (coordinates 0 and 1)
    reader.coordinates[0] = 2
    reader.coordinates[1] = 3

    # Column 4 is the altitude
    reader.set_real_field_column("altitude", 4)

    trajectory_assembler = AssembleTrajectoryFromPoints()
    trajectory_assembler.input = reader

    trajectory_assembler.separation_time = datetime.timedelta(minutes=30)
    trajectory_assembler.separation_distance = 100
    trajectory_assembler.minimum_length = 10

    # The assembler output can only be traversed once, so compute both
    # signatures for each trajectory inside a single loop.
    for trajectory in trajectory_assembler.trajectories():
        distance_geometry_length_values = distance_geometry_by_distance(trajectory, 4)
        distance_geometry_time_values = distance_geometry_by_time(trajectory, 4)

        # Process or store distance geometry values

Analyzing Trajectories Using Feature Vectors

The goal of feature creation is to represent each data point (in this case, each trajectory) with a feature vector, then to use those feature vectors as the inputs for further analysis.

In this section we will show you how to create a feature vector from a collection of features and how to feed those features to DBSCAN for clustering and an R-tree for finding items similar to an example.

Creating Feature Vectors

Tracktable has a specific point domain for feature vectors just as it has domains for geographic and Cartesian coordinates. In our current release we support feature vectors with 1 to 30 components. The function tracktable.domain.feature_vectors.convert_to_feature_vector() will convert a list or array of values into a feature vector:

Creating a Feature Vector
from tracktable.domain.feature_vectors import convert_to_feature_vector

# Suppose that the array 'my_feature_values' contains all of the features
# for a single trajectory.

my_feature_vector = convert_to_feature_vector(my_feature_values)

Like other Tracktable point types, the caller can read and write the individual values in a feature vector using the [] operator. In other words, just treat it like an ordinary list or array.

  • The tracktable.algorithms.distance_geometry submodule will compute the multilevel distance geometry for a trajectory based on either length or time.

  • The tracktable.algorithms.dbscan submodule will perform DBSCAN (density-based spatial clustering of applications with noise) using box-shaped neighborhoods to determine how the feature vectors cluster.

  • The tracktable.domain.rtree submodule will generate an R-tree that can efficiently compute the nearest neighbors of a given point or set of points.

DBSCAN Clustering

DBSCAN is a density-based clustering method that does not need to know the number of clusters in advance. It operates instead on a notion of when two points are close together. You must supply two parameters:

  1. Closeness: How close must two points be along each axis in order to belong to the same cluster?

  2. Minimum cluster size: How many points must be close to one another in order to be considered a cluster instead of coincidence?

As originally described, DBSCAN uses a single value to define “closeness”. This value is used as the radius of a sphere. For any given point, all other points within that sphere are close by.

In Tracktable, we specify closeness as a list of values, one per feature. This allows different values of closeness depending on the properties of each feature.

Suppose that you have maximum altitude and maximum speed as two of your features. In clustering, you might want to identify trajectories that have similar combinations of altitude and speed. In this situation you need a neighborhood defined with a box rather than a sphere because of the ranges of the variables involved. Maximum altitude is measured in feet above sea level and ranges from 0 to around 40,000. Maximum speed is measured in kilometers per hour and ranges from 0 to around 1000. Since these ranges are so different, any value that encompasses “close enough” for altitude will be too large to distinguish different classes of speeds. Conversely, any value that can divide speeds into different classes will be too small to group altitudes together.

Mathematically, a single radius is equivalent to clustering on the L2 norm. A vector of distances is conceptually equivalent to the L-infinity norm.
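
The difference between the two neighborhood shapes can be shown with a small pure-Python sketch using (maximum altitude in feet, maximum speed in km/h) as features. The function names and sample values are made up for illustration, and the sketch assumes each closeness value is the largest allowed difference along its axis; Tracktable's exact convention may differ.

```python
import math


def within_sphere(a, b, radius):
    """True if b lies within a single L2 radius of a."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))) <= radius


def within_box(a, b, closeness):
    """True if b is within the per-axis closeness of a on every axis."""
    return all(abs(x - y) <= c for x, y, c in zip(a, b, closeness))


# Hypothetical feature vectors: (max altitude in feet, max speed in km/h)
jet_a = (35000, 850)
jet_b = (35800, 870)     # similar altitude, similar speed
slow_high = (35300, 300)  # similar altitude, much slower

# A radius large enough to group the two jets also admits the slow aircraft.
sphere_groups_jets = within_sphere(jet_a, jet_b, 1000)
sphere_admits_slow = within_sphere(jet_a, slow_high, 1000)

# A radius tuned for speed is too small to group the jets at all.
small_sphere_groups_jets = within_sphere(jet_a, jet_b, 50)

# Per-axis closeness handles both scales at once.
box_groups_jets = within_box(jet_a, jet_b, [1000, 50])
box_admits_slow = within_box(jet_a, slow_high, [1000, 50])
```

Here the box groups the two jets and rejects the slow aircraft, which no single radius manages to do for these values.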

Note

An upcoming release of Tracktable will add back in the ability to specify a single radius. We also hope to extend DBSCAN to arbitrary metrics.

Todo

Modify this example to use max altitude / max speed as our features. Run on an example data set that has a mix of different classes of aircraft.

Our implementation of DBSCAN is in the tracktable.algorithms.dbscan module. Here is an example of how to invoke it.

DBSCAN Clustering
from tracktable.algorithms.dbscan import compute_cluster_labels
from tracktable.domain.feature_vectors import convert_to_feature_vector
import tracktable.core.geomath

# Assume that 'all_trajectories' is a list of trajectories from some
# data source

# First we need features.
def end_to_end_distance(trajectory):
    return tracktable.core.geomath.distance(trajectory[0], trajectory[-1])

def total_length(trajectory):
    return trajectory[-1].current_length

feature_values = [
   [end_to_end_distance(t), total_length(t)] for t in all_trajectories
]

# Now we can create feature vectors.
feature_vectors = [convert_to_feature_vector(fv) for fv in feature_values]

# Let's say that two flights are "similar" if they have end-to-end distances
# within 5km of one another (suggesting that they flew between the same two
# airports) and total lengths within 100km of one another (to allow for
# minor diversions and holding patterns).

closeness = [5, 100]

minimum_cluster_size = 10

# And now we can run DBSCAN.

cluster_labels = compute_cluster_labels(
                     feature_vectors,
                     closeness,
                     minimum_cluster_size
                 )

# Done -- conduct further analysis or visualization based on the cluster labels.

R-Tree

The R-tree is a data structure that provides a fast way to find all points near a given search position. We use it to find all feature vectors within some specified distance of a sample feature vector. This, in turn, allows us to identify trajectories that have similar features.
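
For exact queries, an R-tree returns the same answers a linear scan would, only faster on large collections. The brute-force pure-Python sketch below shows what the two kinds of query compute, which can be handy for checking results on small data. The function names and sample values are assumptions of this sketch, not part of Tracktable.

```python
import heapq
import math


def points_in_box(points, box_min, box_max):
    """Indices of points lying inside the axis-aligned box, corners included."""
    return [
        i for i, p in enumerate(points)
        if all(lo <= v <= hi for v, lo, hi in zip(p, box_min, box_max))
    ]


def nearest_neighbors(points, query, k):
    """Indices of the k points closest to `query` by Euclidean distance."""
    def distance_to_query(i):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(points[i], query)))

    return heapq.nsmallest(k, range(len(points)), key=distance_to_query)


# Hypothetical features: (end-to-end distance, total travel distance) in km
features = [(1000, 2000), (990, 2100), (400, 450), (1020, 4800), (1500, 1600)]

in_box = points_in_box(features, (950, 1900), (1050, 5000))
closest = nearest_neighbors(features, (1000, 2000), 2)
```

Both functions return indices into the input list, so the results can be used to look up the corresponding trajectories.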

Note

This may sound very similar to the description of how DBSCAN identifies points that are close together. DBSCAN uses an R-tree internally.

As in our last example, we will use end-to-end distance and total travel distance as our two features.

R-Tree Search
from tracktable.domain.rtree import RTree
from tracktable.domain.feature_vectors import convert_to_feature_vector
import tracktable.core.geomath

# Assume that 'all_trajectories' is a list of trajectories from some
# data source

# First we need features.
def end_to_end_distance(trajectory):
    return tracktable.core.geomath.distance(trajectory[0], trajectory[-1])

def total_length(trajectory):
    return trajectory[-1].current_length

feature_values = [
   [end_to_end_distance(t), total_length(t)] for t in all_trajectories
]

# Now we can create feature vectors.
feature_vectors = [convert_to_feature_vector(fv) for fv in feature_values]

# Now we create an R-tree from those feature vectors.
my_tree = RTree(feature_vectors)

# Suppose that we have an interesting trajectory whose end-to-end distance
# is 1000 km but traveled a total of 2000 km -- that is, there was some
# significant wandering involved. We want to find similar trajectories.

interesting_feature_vector = convert_to_feature_vector([1000, 2000])

# Case 1: We want the 10 nearest neighbors.
nearest_neighbor_indices = my_tree.find_nearest_neighbors(
                             interesting_feature_vector, 10
                             )

# Case 2: We want all the points with end-to-end distance between
# 950 and 1050 km but total distance between 1900 and 5000 km.

search_box_min = convert_to_feature_vector([950, 1900])
search_box_max = convert_to_feature_vector([1050, 5000])

similar_indices = my_tree.find_points_in_box(
                                 search_box_min,
                                 search_box_max
                                 )

# The contents of nearest_neighbor_indices and similar_indices are
# indices into the list of feature vectors. Because the feature
# vectors are stored in the same order as the list of input
# trajectories, we can also use them as indices back into the
# list of trajectories.

Retrieving Airport and Port Information

Tracktable includes databases of worldwide airports and maritime ports that can be used for rendering, data generation and analytics. Rendering guides can be found on the Rendering page, while data generation guides can be found on the Data Generation page. Both the airport and port modules have convenient functions for retrieving information from their respective databases; these are outlined below.

Airports

Retrieve All Airports In Database
from tracktable.info import airports
all_airports = airports.all_airports()
Airport Information Retrieval By Name
from tracktable.info import airports
abq_airport = airports.airport_information("ABQ")
Airport Information Retrieval By Rank
from tracktable.info import airports
abq_size_rank = airports.airport_size_rank("ABQ")

Ports

Retrieve All Ports In Database
from tracktable.info import ports
all_ports = ports.all_ports()
Port Information Retrieval By Name
from tracktable.info import ports
alexandria_port = ports.port_information("Alexandria")
Port Information Retrieval By Name And Specific Country
from tracktable.info import ports
newport_port = ports.port_information("Newport", country='United Kingdom')
Port Information Retrieval By A Port’s Alternate Name
from tracktable.info import ports
new_shoreham_port = ports.port_information("New Shoreham")
Port Information Retrieval By A Port’s World Port Index Number
from tracktable.info import ports

# WPI number can be str or int
newcastle_port = ports.port_information("53610")
newcastle_port = ports.port_information(53610)
Retrieve All Ports For A Specific Country
from tracktable.info import ports
united_states_ports = ports.all_ports_by_country("United States")
Retrieve All Ports For A Specific Body Of Water
from tracktable.info import ports
pacific_ocean_ports = ports.all_ports_by_water_body("Pacific Ocean")
Retrieve All Ports For A Specific World Port Index Region
from tracktable.info import ports

# Any of the following will work when retrieving ports by WPI region
wpi_region_wales_ports = ports.all_ports_by_wpi_region("Wales -- 34710")

wpi_region_wales_ports = ports.all_ports_by_wpi_region("Wales")

wpi_region_wales_ports = ports.all_ports_by_wpi_region("34710")

wpi_region_wales_ports = ports.all_ports_by_wpi_region(34710)
Retrieve All Ports Within A Specified Bounding Box
from tracktable.domain.terrestrial import BoundingBox
from tracktable.info import ports

# Ports around Florida
bbox = BoundingBox((-88, 24), (-79.5, 31))
bounding_box_ports = ports.all_ports_within_bounding_box(bbox)