Basic Operations¶
Operations On Points¶
You will most commonly operate on points (singly or in small sets) in order to construct trajectories or while manipulating trajectories to construct more trajectories.
The second most common use case for operations on points is to compute point-to-point quantities like speed, bearing, distance, and turn angle. These can be used as features in analysis or as annotations to decorate trajectories during rendering.
All of our mathematical operations on trajectory points are in the module tracktable.core.geomath. These include concepts like distance or speed between two points, the bearing from one point to another, the turn angle at a point, and the geometric mean or median of a set of points. Please refer to the geomath module for details.
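To make these point-to-point quantities concrete, here is a minimal plain-Python sketch. It does not use Tracktable; the function names, the (longitude, latitude, timestamp) tuple layout, and the spherical Earth radius are assumptions made for illustration, and the geomath module may use different conventions.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (an assumption of this sketch)


def distance_km(p1, p2):
    """Great-circle (haversine) distance between two (lon, lat, time) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p1[0], p1[1], p2[0], p2[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def bearing_deg(p1, p2):
    """Initial bearing from p1 to p2, degrees clockwise from north, in [0, 360)."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p1[0], p1[1], p2[0], p2[1]))
    y = math.sin(lon2 - lon1) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))
    return math.degrees(math.atan2(y, x)) % 360.0


def speed_kmh(p1, p2):
    """Average speed between two points; the timestamps must differ."""
    hours = (p2[2] - p1[2]).total_seconds() / 3600.0
    return distance_km(p1, p2) / hours


start = (0.0, 0.0, datetime(2024, 1, 1, 12, 0))
north = (0.0, 1.0, datetime(2024, 1, 1, 13, 0))  # one degree north, one hour later
print(distance_km(start, north), bearing_deg(start, north), speed_kmh(start, north))
```

One degree of latitude is roughly 111 km, so the example point pair yields a speed of roughly 111 km/h on a due-north bearing.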
Examples of Per-Point Features¶
Todo
Write this section
Adding Per-Point Features To Trajectories¶
Once we have points or trajectories in memory we may want to annotate them with derived quantities for analysis or rendering. For example, we might want to color an airplane’s trajectory using its climb rate to indicate takeoff, landing, ascent and descent. We might want to use acceleration, deceleration and rates of turning to help classify moving objects.
In order to accomplish this, we add features to the per-point properties of TrajectoryPoint objects as annotations. The tracktable.feature.annotations module contains functions for this: calculators to compute a feature and accessors to retrieve the feature later for analysis and rendering. Calculators and accessors are deliberately simple to make it easier for you to add your own. There is no limit to the number of features you can add to each point.
The simplest feature is progress. This has a value of zero at the beginning of the trajectory and one at the end. It is useful for color-coding trajectories for visualization so that their beginnings and ends are easy to distinguish.
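As an illustration of what a calculator does, the following plain-Python sketch computes a progress value for each point. It assumes progress is measured by elapsed time and that points are dictionaries with a "timestamp" key; both are assumptions of this sketch, not a description of how tracktable.feature.annotations is implemented.

```python
from datetime import datetime, timedelta


def add_progress(points):
    """Annotate time-sorted points with a 'progress' value in [0, 1]."""
    start = points[0]["timestamp"]
    duration = (points[-1]["timestamp"] - start).total_seconds()
    for point in points:
        if duration == 0:
            # Degenerate trajectory with no elapsed time: pick 0.0 by convention.
            point["progress"] = 0.0
        else:
            elapsed = (point["timestamp"] - start).total_seconds()
            point["progress"] = elapsed / duration
    return points


t0 = datetime(2024, 1, 1, 8, 0)
flight = [{"timestamp": t0 + timedelta(minutes=m)} for m in (0, 10, 40)]
print([p["progress"] for p in add_progress(flight)])  # [0.0, 0.25, 1.0]
```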
Annotations Example¶
from tracktable.feature import annotations

# Suppose that my_trajectories is a list of already-
# compiled trajectories. We want to add the "progress"
# annotation to all the points in each trajectory.

annotated_trajectories = [
    annotations.progress(t) for t in my_trajectories
]
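from tracktable.feature import annotations

# trajectory_color_type, trajectory_color, trajectory_source and
# trajectory_colormap are assumed to be defined by the caller,
# for example from rendering options.

# Retrieve the color type of the trajectory
if trajectory_color_type == 'scalar':
    # Calculator that adds the named feature to every point
    annotator = annotations.retrieve_feature_function(trajectory_color)

    def annotation_generator(traj_source):
        for trajectory in traj_source:
            yield annotator(trajectory)

    trajectories_to_render = annotation_generator(trajectory_source)
    # Accessor that reads the feature back when rendering
    scalar_generator = annotations.retrieve_feature_accessor(trajectory_color)
    colormap = trajectory_colormap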
Todo
This second code snippet is confusing.
Assembling Trajectories from Points¶
Creating trajectories from a set of points is at heart a simple operation. Sort a set of input points by non-decreasing timestamp, then group them by object ID. Each different group can then be viewed as the vertices of a polyline (connected series of line segments). This is our representation for a trajectory.
The task becomes more nuanced when we consider the following question:
If a trajectory contains a large gap in either time or distance between two successive points, is it still a single trajectory?
The answer to this question changes for every different data set. The trajectory assembler in Tracktable allows you to specify your own values for the distance and time separation thresholds. Here are the details.
Tracktable includes a filter, tracktable.applications.assemble_trajectories.AssembleTrajectoryFromPoints, to create a sequence of trajectories from a sequence of trajectory points sorted by increasing timestamp. The caller is responsible for ensuring that the points are sorted.
This filter is present in both C++ and Python. In Python, the input point sequence only needs to be an iterable and will only be traversed once. The output (sequence of trajectories) is also an iterable and can only be traversed once. In practice, we almost always save the assembled trajectories in a list for later use.
AssembleTrajectoryFromPoints has three parameters in addition to the point sequence:
- separation_time (datetime.timedelta): If the timestamps of two successive points with the same object ID differ by more than this amount, the points before the gap will be packaged up as a finished trajectory. A new trajectory will begin with the first point after the gap. The default separation time is 30 minutes.
- separation_distance (float): If two successive points with the same object ID are more than this distance apart, the points before the gap will be packaged up as a finished trajectory. A new trajectory will begin with the first point after the gap. The units of measurement for the separation distance depend on the point domain: kilometers for Terrestrial, no units for 2D and 3D Cartesian points. The default separation distance is infinite; that is, as long as two points are close enough together in time, the trajectory will continue.
- minimum_length (integer): Finished trajectories will be discarded unless they contain at least this many points. The default is 2 points.
Note
The name “minimum_length” is confusing because length can refer to distance as well as number of points. We will provide a better name in Tracktable 1.6, deprecate the existing name, and remove it in some future release.
Trajectory Assembly Example¶
Note
As of Tracktable 1.7, there is a generalized trajectory loader that will automatically load CSV, TSV or TRAJ files and, if desired, automatically assemble the points into trajectories.
from tracktable_data.data import retrieve
from tracktable.rw.load import load_trajectories

trajectories = load_trajectories(retrieve('SampleFlight.csv'),
                                 real_fields={"altitude": 4},
                                 separation_time=30,
                                 separation_distance=100,
                                 minimum_length=10)

# process the trajectories here
Note
For posterity, the example for creating a reader and assembler by hand has been preserved below for reference.
import datetime

from tracktable.applications.assemble_trajectories import AssembleTrajectoryFromPoints
from tracktable.domain.terrestrial import TrajectoryPointReader
from tracktable_data.data import retrieve

with open(retrieve('SampleFlight.csv'), 'rb') as infile:
    reader = TrajectoryPointReader()
    reader.input = infile
    reader.delimiter = ','

    # Columns 0 and 1 are the object ID and timestamp
    reader.object_id_column = 0
    reader.timestamp_column = 1

    # Columns 2 and 3 are the longitude and
    # latitude (coordinates 0 and 1)
    reader.coordinates[0] = 2
    reader.coordinates[1] = 3

    # Column 4 is the altitude
    reader.set_real_field_column("altitude", 4)

    trajectory_assembler = AssembleTrajectoryFromPoints()
    trajectory_assembler.input = reader

    trajectory_assembler.separation_time = datetime.timedelta(minutes=30)
    trajectory_assembler.separation_distance = 100
    trajectory_assembler.minimum_length = 10

    # Consume the assembler while the input file is still open
    trajectories = list(trajectory_assembler)

# process the trajectories here
Operations On Trajectories¶
Some common use cases for operating on trajectories:
- Interpolate between points to find an approximate position at a
specified time or distance traveled
- Extract a subset of the trajectory with endpoints specified by
time or distance traveled
- Compute a scalar feature that describes some aspect of the entire
trajectory
- Compute a vector of distance geometry values that collectively describe
the trajectory’s shape
Interpolation and Subsets¶
The module tracktable.core.geomath contains several functions for interpolation along trajectories and extracting subsets between interpolated points. Two of them produce a TrajectoryPoint at some specified fraction along the trajectory, parameterized between 0 and 1 by time elapsed or by distance traveled.
These functions interpolate coordinates, timestamps, and all of the additional features present at points. We provide two separate parameterizations because indexing by time can lead to division by zero in later algorithms when a trajectory includes a stretch where the underlying vehicle stopped. Indexing by distance avoids this problem by ignoring velocity.
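The difference between the two parameterizations is easiest to see on a trajectory that contains a stop. The sketch below is plain Python, not Tracktable code: the helper names and the (time_in_seconds, x, y) tuple layout are hypothetical, and it assumes strictly increasing timestamps.

```python
import math
from bisect import bisect_right


def interpolate(points, keys, target):
    """Linearly interpolate every component of a point at `target` along `keys`."""
    if target <= keys[0]:
        return points[0]
    if target >= keys[-1]:
        return points[-1]
    i = bisect_right(keys, target) - 1  # keys[i] <= target < keys[i + 1]
    u = (target - keys[i]) / (keys[i + 1] - keys[i])
    return tuple(a + (b - a) * u for a, b in zip(points[i], points[i + 1]))


def at_time_fraction(points, fraction):
    times = [p[0] for p in points]
    return interpolate(points, times, times[0] + fraction * (times[-1] - times[0]))


def at_length_fraction(points, fraction):
    lengths = [0.0]  # cumulative distance traveled at each point
    for a, b in zip(points, points[1:]):
        lengths.append(lengths[-1] + math.dist(a[1:], b[1:]))
    return interpolate(points, lengths, fraction * lengths[-1])


# Moves east for 10 s, stops for 20 s, then moves north for 10 s.
track = [(0, 0.0, 0.0), (10, 10.0, 0.0), (30, 10.0, 0.0), (40, 10.0, 10.0)]
print(at_time_fraction(track, 0.5))     # (20.0, 10.0, 0.0): halfway through the stop
print(at_length_fraction(track, 0.75))  # (35.0, 10.0, 5.0): halfway up the last leg
```

Half of the elapsed time falls inside the stop, where no distance is covered, while fractions of distance traveled skip over the stop entirely.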
To extract a subset of a trajectory instead of individual points, use subset_during_interval(). This function takes its endpoints as fractions between 0 and 1 (parameterized by time). We will add analogous functions to extract a subset by distance traveled, time fraction, and distance fraction for Tracktable 1.6.
Computing Scalar-Valued Trajectory Features¶
A scalar-valued trajectory feature is a single number that describes some aspect of the trajectory. A collection of these features can characterize a trajectory well enough to establish similarity and difference in a collection.
Here are a few examples along with code snippets to compute them. There are many other possible features.
import tracktable.core.geomath

def total_travel_distance(trajectory):
    return trajectory[-1].current_length

def end_to_end_distance(trajectory):
    return tracktable.core.geomath.distance(
        trajectory[0], trajectory[-1]
    )

def straightness_ratio(trajectory):
    return end_to_end_distance(trajectory) / total_travel_distance(trajectory)

def total_winding(trajectory):
    t = trajectory
    # One turn angle per interior point: triples (i, i+1, i+2)
    return sum([
        tracktable.core.geomath.signed_turn_angle(t[i], t[i+1], t[i+2])
        for i in range(0, len(trajectory) - 2)
    ])

def total_turning(trajectory):
    t = trajectory
    return sum([
        tracktable.core.geomath.unsigned_turn_angle(t[i], t[i+1], t[i+2])
        for i in range(0, len(trajectory) - 2)
    ])
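To show what these features measure without needing a data set, here is a plain-Python sketch that computes the same quantities on 2D Cartesian points. The helper names and the sign convention (positive for a counterclockwise turn) are assumptions of this sketch and may not match the geomath functions.

```python
import math


def path_length(points):
    """Total distance traveled along the polyline."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def straightness(points):
    """End-to-end distance divided by distance traveled (0 if nothing moved)."""
    total = path_length(points)
    return math.dist(points[0], points[-1]) / total if total > 0 else 0.0


def signed_turn_deg(a, b, c):
    """Turn at b between headings a->b and b->c, in degrees within [-180, 180)."""
    heading_in = math.atan2(b[1] - a[1], b[0] - a[0])
    heading_out = math.atan2(c[1] - b[1], c[0] - b[0])
    turn = math.degrees(heading_out - heading_in)
    return (turn + 180.0) % 360.0 - 180.0


def total_winding(points):
    """Sum of signed turns: left and right turns cancel."""
    return sum(signed_turn_deg(*points[i:i + 3]) for i in range(len(points) - 2))


def total_turning(points):
    """Sum of absolute turns: every turn counts."""
    return sum(abs(signed_turn_deg(*points[i:i + 3])) for i in range(len(points) - 2))


zigzag = [(0, 0), (1, 1), (2, 0), (3, 1)]
print(straightness(zigzag), total_winding(zigzag), total_turning(zigzag))
```

For the zigzag, the right turn and the left turn cancel in the winding total but both count toward the turning total, which is why the two features distinguish wandering paths from spiraling ones.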
Computing Distance Geometry Features¶
Distance geometry is a family of methods for analyzing sets of points based only on the distances between pairs of members. In Tracktable, we use distance geometry to compute a multiscale description (called a signature) of a trajectory’s shape that can be used to search for similar trajectories independent of translation, uniform scale, rotation, or reflection.
The tracktable.algorithms.distance_geometry module is responsible for computing the multilevel distance geometry signature of a given trajectory. As with extracting points and subsets, we provide functions to compute this signature with points sampled by length or time. If your data includes trajectories of objects that stop in one place, we recommend that you use the parameterization over length to avoid division by zero.
How Distance Geometry Works¶
When computing the distance geometry feature values for a trajectory, we first choose a depth d. For each level L = 1 ... d, we place L+1 points along the trajectory, equally spaced in either distance or time. Then, for that level, we compute the straightness of the L line segments that connect those points from beginning to end.
A straightness value of 1 means that the trajectory is perfectly straight between two sample points. A straightness value of 0 means that the trajectory ends at the same point it began for a given segment regardless of its meandering along the way.
We collect these straightness values for all d levels to assemble a signature, which can be used as a feature vector. A distance geometry signature with depth d will have (d * (d+1)) / 2 values.
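The procedure above can be sketched in plain Python for 2D Cartesian points sampled by distance. This is an illustration under stated assumptions, not the code in tracktable.algorithms.distance_geometry: the function names are hypothetical, and straightness is taken here to be the chord length between two samples divided by the distance traveled between them, which may differ from the library's normalization.

```python
import math
from bisect import bisect_right


def cumulative_lengths(points):
    """Distance traveled from the first point to each point."""
    lengths = [0.0]
    for a, b in zip(points, points[1:]):
        lengths.append(lengths[-1] + math.dist(a, b))
    return lengths


def point_at_length(points, lengths, target):
    """Linearly interpolate the position after `target` distance traveled."""
    if target <= 0.0:
        return points[0]
    if target >= lengths[-1]:
        return points[-1]
    i = bisect_right(lengths, target) - 1  # lengths[i] <= target < lengths[i + 1]
    u = (target - lengths[i]) / (lengths[i + 1] - lengths[i])
    return tuple(a + (b - a) * u for a, b in zip(points[i], points[i + 1]))


def signature_by_distance(points, depth):
    """Straightness of L equal-length pieces for each level L = 1 ... depth."""
    lengths = cumulative_lengths(points)
    total = lengths[-1]
    signature = []
    for level in range(1, depth + 1):
        samples = [point_at_length(points, lengths, total * k / level)
                   for k in range(level + 1)]
        traveled = total / level  # distance along the trajectory per piece
        for a, b in zip(samples, samples[1:]):
            signature.append(math.dist(a, b) / traveled if traveled > 0 else 0.0)
    return signature


square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]  # closed loop
print(signature_by_distance(square, 2))
```

For the closed square, level 1 yields a straightness of 0 because the path ends where it began, and level 2 yields two values of about 0.71, one for each half of the loop. Depth 2 therefore produces 1 + 2 = 3 values, matching (d * (d+1)) / 2.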
Distance Geometry Example¶
from tracktable.algorithms.distance_geometry import distance_geometry_by_distance
from tracktable.algorithms.distance_geometry import distance_geometry_by_time
from tracktable_data.data import retrieve
from tracktable.rw.load import load_trajectories

trajectories = load_trajectories(retrieve('SampleFlightsUS.csv'),
                                 real_fields={"altitude": 4},
                                 separation_time=30,
                                 separation_distance=100,
                                 minimum_length=10)

for trajectory in trajectories:
    # Compute a depth-4 signature for each individual trajectory
    distance_geometry_length_values = distance_geometry_by_distance(trajectory, 4)
    distance_geometry_time_values = distance_geometry_by_time(trajectory, 4)

    # Process or store distance geometry values
Note
For posterity, the example for creating a reader and assembler by hand has been preserved below for reference.
import datetime

from tracktable.algorithms.distance_geometry import distance_geometry_by_distance
from tracktable.algorithms.distance_geometry import distance_geometry_by_time
from tracktable.applications.assemble_trajectories import AssembleTrajectoryFromPoints
from tracktable.domain.terrestrial import TrajectoryPointReader
from tracktable_data.data import retrieve

with open(retrieve('SampleFlightsUS.csv')) as infile:
    reader = TrajectoryPointReader()
    reader.input = infile
    reader.delimiter = ','

    # Columns 0 and 1 are the object ID and timestamp
    reader.object_id_column = 0
    reader.timestamp_column = 1

    # Columns 2 and 3 are the longitude and
    # latitude (coordinates 0 and 1)
    reader.coordinates[0] = 2
    reader.coordinates[1] = 3

    # Column 4 is the altitude
    reader.set_real_field_column("altitude", 4)

    trajectory_assembler = AssembleTrajectoryFromPoints()
    trajectory_assembler.input = reader

    trajectory_assembler.separation_time = datetime.timedelta(minutes=30)
    trajectory_assembler.separation_distance = 100
    trajectory_assembler.minimum_length = 10

    # The assembler can only be traversed once, so compute both
    # signatures for each trajectory in a single pass.
    for trajectory in trajectory_assembler:
        distance_geometry_length_values = distance_geometry_by_distance(trajectory, 4)
        distance_geometry_time_values = distance_geometry_by_time(trajectory, 4)

        # Process or store distance geometry values
Analyzing Trajectories Using Feature Vectors¶
The goal of feature creation is to represent each data point (in this case, each trajectory) with a feature vector, and then to use those feature vectors as the inputs for further analysis.
In this section we will show you how to create a feature vector from a collection of features and how to feed those features to DBSCAN for clustering and an R-tree for finding items similar to an example.
Creating Feature Vectors¶
Tracktable has a specific point domain for feature vectors just as it has domains for geographic and Cartesian coordinates. In our current release we support feature vectors with 1 to 30 components. The function tracktable.domain.feature_vectors.convert_to_feature_vector() will convert a list or array of values into a feature vector:
from tracktable.domain.feature_vectors import convert_to_feature_vector

# Suppose that the array 'my_feature_values' contains all of the features
# for a single trajectory.

my_feature_vector = convert_to_feature_vector(my_feature_values)
Like other Tracktable point types, the caller can read and write the individual values in a feature vector using the [] operator. In other words, just treat it like an ordinary list or array.
- The tracktable.algorithms.distance_geometry submodule computes the multilevel distance geometry signature for a trajectory, sampled by either length or time.
- The tracktable.algorithms.dbscan submodule performs box DBSCAN (density-based spatial clustering of applications with noise) to cluster the feature vectors.
- The tracktable.domain.rtree submodule builds an R-tree that can efficiently find the nearest neighbors of a given point or set of points.
DBSCAN Clustering¶
DBSCAN is a density-based clustering method that does not need to know the number of clusters in advance. It operates instead on a notion of when two points are close together. You must supply two parameters:
- Closeness: How close must two points be along each axis
in order to belong to the same cluster?
- Minimum cluster size: How many points must be close to one another
in order to be considered a cluster instead of coincidence?
As originally described, DBSCAN uses a single value to define “closeness”. This value is used as the radius of a sphere. For any given point, all other points within that sphere are close by.
In Tracktable, we specify closeness as a list of values, one per feature. This allows different values of closeness depending on the properties of each feature.
Suppose that you have maximum altitude and maximum speed as two of your features. In clustering, you might want to identify trajectories that have similar combinations of altitude and speed. In this situation you need a neighborhood defined with a box rather than a sphere because of the ranges of the variables involved. Maximum altitude is measured in feet above sea level and ranges from 0 to around 40,000. Maximum speed is measured in kilometers per hour and ranges from 0 to around 1000. Since these ranges are so different, any value that encompasses “close enough” for altitude will be too large to distinguish different classes of speeds. Conversely, any value that can divide speeds into different classes will be too small to group altitudes together.
Mathematically, a single radius is equivalent to clustering on the L2 norm. A vector of distances is conceptually equivalent to the L-infinity norm.
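The altitude and speed example can be checked with a few lines of plain Python. The function names and the sample values below are made up for illustration; the sketch only contrasts a single-radius test with a per-axis test and is not how tracktable.algorithms.dbscan is implemented.

```python
import math


def close_in_sphere(p, q, radius):
    """Single-radius closeness: Euclidean (L2) distance within `radius`."""
    return math.dist(p, q) <= radius


def close_in_box(p, q, closeness):
    """Per-axis closeness: every coordinate differs by at most its own tolerance."""
    return all(abs(a - b) <= w for a, b, w in zip(p, q, closeness))


def scaled_linf(p, q, closeness):
    """L-infinity distance after dividing each axis by its tolerance."""
    return max(abs(a - b) / w for a, b, w in zip(p, q, closeness))


# (maximum altitude in feet, maximum speed in km/h); values are hypothetical
jet_a = (35000.0, 850.0)
jet_b = (36000.0, 860.0)
slow_c = (35000.0, 250.0)

radius = 1500.0            # large enough to group altitudes 1000 ft apart
closeness = [1500.0, 50.0]  # 1500 ft in altitude, 50 km/h in speed

print(close_in_sphere(jet_a, jet_b, radius), close_in_sphere(jet_a, slow_c, radius))
print(close_in_box(jet_a, jet_b, closeness), close_in_box(jet_a, slow_c, closeness))
```

With the single radius, the slow aircraft counts as close to the jet even though their speeds differ by 600 km/h. The box test separates them, and it is the same test as checking that the scaled L-infinity distance is at most 1.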
Note
An upcoming release of Tracktable will add back in the ability to specify a single radius. We also hope to extend DBSCAN to arbitrary metrics.
Todo
Modify this example to use max altitude / max speed as our features. Run on an example data set that has a mix of different classes of aircraft.
Our implementation of DBSCAN is in the tracktable.algorithms.dbscan module. Here is an example of how to invoke it.
from tracktable.algorithms.dbscan import compute_cluster_labels
from tracktable.domain.feature_vectors import convert_to_feature_vector
import tracktable.core.geomath

# Assume that 'all_trajectories' is a list of trajectories from some
# data source

# First we need features.
def end_to_end_distance(trajectory):
    return tracktable.core.geomath.distance(trajectory[0], trajectory[-1])

def total_length(trajectory):
    return trajectory[-1].current_length

feature_values = [
    [end_to_end_distance(t), total_length(t)] for t in all_trajectories
]

# Now we can create feature vectors.
feature_vectors = [convert_to_feature_vector(fv) for fv in feature_values]

# Let's say that two flights are "similar" if they have end-to-end distances
# within 5km of one another (suggesting that they flew between the same two
# airports) and total lengths within 100km of one another (to allow for
# minor diversions and holding patterns).

closeness = [5, 100]

minimum_cluster_size = 10

# And now we can run DBSCAN.

cluster_labels = compute_cluster_labels(
    feature_vectors,
    closeness,
    minimum_cluster_size
)

# Done -- conduct further analysis or visualization based on the cluster labels.
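To make the roles of the closeness vector and the minimum cluster size concrete, here is a brute-force sketch of DBSCAN with a box neighborhood in plain Python. It is not Tracktable's implementation: the function name box_dbscan, the label convention (0 for noise, 1 and up for clusters), and the quadratic neighbor search are all assumptions of this sketch.

```python
def box_dbscan(points, closeness, minimum_cluster_size):
    """Label each point with a cluster number (1, 2, ...) or 0 for noise."""
    noise = 0
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # All points inside the box centered on points[i], including i itself
        return [j for j, q in enumerate(points)
                if all(abs(a - b) <= w for a, b, w in zip(points[i], q, closeness))]

    next_label = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < minimum_cluster_size:
            labels[i] = noise  # may later be claimed as a border point
            continue
        next_label += 1
        labels[i] = next_label
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = next_label  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = next_label
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= minimum_cluster_size:
                queue.extend(j_neighbors)  # core point: keep growing the cluster
    return labels


# (end-to-end distance, total length) in km; values are hypothetical
features = [(5, 100), (6, 110), (7, 105),
            (500, 900), (505, 950), (502, 920),
            (2000, 50)]
print(box_dbscan(features, closeness=[5, 100], minimum_cluster_size=3))
# [1, 1, 1, 2, 2, 2, 0]
```

The last feature vector has no neighbors within the box, so it is labeled as noise rather than forced into a cluster.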
R-Tree¶
The R-tree is a data structure that provides a fast way to find all points near a given search position. We use it to find all feature vectors within some specified distance of a sample feature vector. This, in turn, allows us to identify trajectories that have similar features.
Note
This may sound very similar to the description of how DBSCAN identifies points that are close together. DBSCAN uses an R-tree internally.
As in our last example, we will use end-to-end distance and total travel distance as our two features.
from tracktable.domain.rtree import RTree
from tracktable.domain.feature_vectors import convert_to_feature_vector
import tracktable.core.geomath

# Assume that 'all_trajectories' is a list of trajectories from some
# data source

# First we need features.
def end_to_end_distance(trajectory):
    return tracktable.core.geomath.distance(trajectory[0], trajectory[-1])

def total_length(trajectory):
    return trajectory[-1].current_length

feature_values = [
    [end_to_end_distance(t), total_length(t)] for t in all_trajectories
]

# Now we can create feature vectors.
feature_vectors = [convert_to_feature_vector(fv) for fv in feature_values]

# Now we create an R-tree from those feature vectors.
my_tree = RTree(feature_vectors)

# Suppose that we have an interesting trajectory whose end-to-end distance
# is 1000 km but traveled a total of 2000 km -- that is, there was some
# significant wandering involved. We want to find similar trajectories.

interesting_feature_vector = convert_to_feature_vector([1000, 2000])

# Case 1: We want the 10 nearest neighbors.
nearest_neighbor_indices = my_tree.find_nearest_neighbors(
    interesting_feature_vector, 10
)

# Case 2: We want all the points with end-to-end distance between
# 950 and 1050 km but total distance between 1900 and 5000 km.

search_box_min = convert_to_feature_vector([950, 1900])
search_box_max = convert_to_feature_vector([1050, 5000])

similar_indices = my_tree.find_points_in_box(
    search_box_min,
    search_box_max
)

# The contents of nearest_neighbor_indices and similar_indices are
# indices into the list of feature vectors. Because the feature
# vectors are stored in the same order as the list of input
# trajectories, we can also use them as indices back into the
# list of trajectories.
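If it helps to see what the two queries return, the following plain-Python sketch answers them by brute force. It is a stand-in for illustration only: it scans every vector instead of using an R-tree, the function names are hypothetical, and it assumes Euclidean distance for the nearest-neighbor search.

```python
import math


def nearest_neighbor_indices(vectors, query, k):
    """Indices of the k vectors closest to `query`, nearest first."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], query))
    return order[:k]


def indices_in_box(vectors, box_min, box_max):
    """Indices of all vectors inside the axis-aligned box [box_min, box_max]."""
    return [i for i, v in enumerate(vectors)
            if all(lo <= x <= hi for lo, x, hi in zip(box_min, v, box_max))]


# (end-to-end distance, total length) in km; values are hypothetical
vectors = [(1000, 2100), (400, 450), (980, 3000), (1040, 1950), (1200, 1250)]
trajectory_names = ["t0", "t1", "t2", "t3", "t4"]  # same order as the vectors

nearest = nearest_neighbor_indices(vectors, (1000, 2000), 2)
in_box = indices_in_box(vectors, (950, 1900), (1050, 5000))

print(nearest, [trajectory_names[i] for i in nearest])  # [3, 0] ['t3', 't0']
print(in_box, [trajectory_names[i] for i in in_box])    # [0, 2, 3] ['t0', 't2', 't3']
```

Both queries return positions in the input list, which is why the results can be used directly to look up the matching trajectories.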
Retrieving Airport and Port Information¶
Tracktable includes databases of worldwide airports and maritime ports that can be used for rendering, data generation, and analytics. Rendering guides can be found on the Rendering page, and data generation guides can be found on the Data Generation page. Both the airport and port modules have convenient functions for retrieving information from their respective databases; these are outlined below.
Airports¶
from tracktable.info import airports

# Retrieve every airport in the database
all_airports = airports.all_airports()

# Look up a single airport by its code
abq_airport = airports.airport_information("ABQ")

# Look up the size rank of an airport
abq_size_rank = airports.airport_size_rank("ABQ")
Ports¶
from tracktable.info import ports
all_ports = ports.all_ports()
from tracktable.domain.terrestrial import BoundingBox
from tracktable.info import ports

# Look up a port by name
alexandria_port = ports.port_information("Alexandria")

# Look up a port by name and country
newport_port = ports.port_information("Newport", country='United Kingdom')

new_shoreham_port = ports.port_information("New Shoreham")

# WPI number can be str or int
newcastle_port = ports.port_information("53610")
newcastle_port = ports.port_information(53610)

# All ports in a country or on a body of water
united_states_ports = ports.all_ports_by_country("United States")
pacific_ocean_ports = ports.all_ports_by_water_body("Pacific Ocean")

# Any of the following will work when retrieving ports by WPI region
wpi_region_wales_ports = ports.all_ports_by_wpi_region("Wales -- 34710")
wpi_region_wales_ports = ports.all_ports_by_wpi_region("Wales")
wpi_region_wales_ports = ports.all_ports_by_wpi_region("34710")
wpi_region_wales_ports = ports.all_ports_by_wpi_region(34710)

# Ports around Florida
bbox = BoundingBox((-88, 24), (-79.5, 31))
bounding_box_ports = ports.all_ports_within_bounding_box(bbox)