tracktable.applications.cluster module

Module contents

tracktable.applications.cluster - Determine if trajectories are clustered together.

cluster_trajectories(), cluster_trajectories_rendezvous() and cluster_trajectories_shape() are the main driver functions for clustering.

tracktable.applications.cluster.cluster_name(cluster_id)[source]

Retrieve the cluster name for the given cluster ID.

Parameters

cluster_id (int) – Cluster ID to retrieve the name for.

Returns

The name of the cluster with the given ID.

tracktable.applications.cluster.cluster_trajectories(trajectories, feature_vector_function, search_box_span, *args, min_cluster_size=2, **kwargs)[source]

Create a cotravel feature vector for each trajectory and use box-DBSCAN to cluster the trajectories.

Parameters
  • trajectories (list) – Trajectories to cluster.

  • feature_vector_function (function) – Function to use when generating the feature vector.

  • search_box_span (int) – Size of the search box used by box-DBSCAN; it defines which points count as “nearby”.

Keyword Arguments

min_cluster_size (int) – The minimum number of points that you’re willing to call a cluster. (Default: 2)

Returns

list of ordered pairs. The first value of each pair is the trajectory index (matching the order of the given list of trajectories), and the second value is the number of the cluster that the trajectory belongs to.
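
As a rough illustration of how a search box can define “nearby” points, the sketch below tests whether two feature vectors fall inside the same axis-aligned box. It is plain Python, not Tracktable code: within_box and the sample vectors are hypothetical, and the sketch assumes the span is given per dimension as the largest allowed difference in that component.

```python
def within_box(a, b, span):
    """Return True if every component of a and b differs by at most span[i].

    Hypothetical helper for illustration; not part of Tracktable.
    """
    return all(abs(x - y) <= s for x, y, s in zip(a, b, span))


# Hypothetical feature vectors: (longitude, latitude, seconds).
span = (0.02, 0.02, 3000)
p = (-106.60, 35.10, 1000)
q = (-106.61, 35.11, 2500)  # close to p in every component
r = (-106.90, 35.10, 1000)  # far from p in longitude

print(within_box(p, q, span))
print(within_box(p, r, span))
```

Under these assumptions p and q would be neighbors while p and r would not; a box-based DBSCAN then grows clusters from points that have at least min_cluster_size such neighbors.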

tracktable.applications.cluster.cluster_trajectories_rendezvous(trajectories, start_fraction=0, end_fraction=1, num_control_points=10, epsilon_longitude=0.02, epsilon_latitude=0.02, epsilon_timestamp=3000, min_cluster_size=2)[source]

Create a cotravel feature vector for each trajectory and use box-DBSCAN to cluster the trajectories.

Parameters

trajectories (list) – Trajectories to cluster.

Keyword Arguments
  • start_fraction (float) – The fraction along the trajectory where you want to start sampling control points when looking for passersby. (Default: 0)

  • end_fraction (float) – The fraction along the trajectory where you want to stop sampling control points when looking for passersby. (Default: 1)

  • num_control_points (int) – The number of equally-spaced points to sample along each trajectory when clustering. (Default: 10)

  • epsilon_longitude (float) – The longitude in degrees to bound the DBSCAN clustering. (Default: 0.02)

  • epsilon_latitude (float) – The latitude in degrees to bound the DBSCAN clustering. (Default: 0.02)

  • epsilon_timestamp (int) – The timestamp in seconds to bound the DBSCAN clustering. (Default: 3000)

  • min_cluster_size (int) – The minimum number of points that you’re willing to call a cluster. (Default: 2)

Returns

list of ordered pairs. The first value of each pair is the trajectory index (matching the order of the given list of trajectories), and the second value is the number of the cluster that the trajectory belongs to.
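
To make the start_fraction, end_fraction and num_control_points arguments more concrete, here is a minimal sketch of how equally spaced sampling fractions could be derived from them. The helper control_point_fractions is hypothetical, and whether Tracktable includes both endpoints when it samples is an assumption of this sketch.

```python
def control_point_fractions(start_fraction=0, end_fraction=1, num_control_points=10):
    """Equally spaced fractions from start_fraction to end_fraction, inclusive.

    Hypothetical helper for illustration; not part of Tracktable.
    """
    if num_control_points < 2:
        # A single control point has no spacing; fall back to the start.
        return [start_fraction]
    step = (end_fraction - start_fraction) / (num_control_points - 1)
    return [start_fraction + i * step for i in range(num_control_points)]


fractions = control_point_fractions(0, 1, 5)
print(fractions)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Each fraction would then be mapped to a point along the trajectory, and the sampled longitude, latitude and timestamp values combined into the feature vector that is clustered.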

tracktable.applications.cluster.cluster_trajectories_shape(trajectories, depth=4, epsilon=0.05, min_cluster_size=2)[source]

Create a cotravel feature vector for each trajectory and use box-DBSCAN to cluster the trajectories.

Parameters

trajectories (list) – Trajectories to cluster.

Keyword Arguments
  • depth (int) – How many levels to compute. Must be greater than zero. (Default: 4)

  • epsilon (float) – The epsilon value to generate the search box span. (Default: 0.05)

  • min_cluster_size (int) – The minimum number of points that you’re willing to call a cluster. (Default: 2)

Returns

list of ordered pairs. The first value of each pair is the trajectory index (matching the order of the given list of trajectories), and the second value is the number of the cluster that the trajectory belongs to.

tracktable.applications.cluster.group_clusters(cluster_labels, trajectories)[source]

Group clusters together with labels corresponding to trajectories.

Parameters
  • cluster_labels (list) – Labels for the cluster.

  • trajectories (list) – Trajectories to group together.

Returns

The clusters of trajectories.
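
The grouping step can be pictured with the following sketch, which collects trajectories by cluster number. It is a stand-in written in plain Python, not the library's implementation: group_by_cluster is a hypothetical name, the labels are assumed to be the (trajectory index, cluster number) pairs described for the clustering functions, the result is assumed to be a dict, and the strings stand in for real trajectory objects.

```python
from collections import defaultdict


def group_by_cluster(cluster_labels, trajectories):
    """Map each cluster number to the list of trajectories assigned to it.

    Hypothetical helper for illustration; not part of Tracktable.
    """
    clusters = defaultdict(list)
    for trajectory_index, cluster_id in cluster_labels:
        clusters[cluster_id].append(trajectories[trajectory_index])
    return dict(clusters)


# Hypothetical labels and placeholder trajectories.
labels = [(0, 1), (1, 1), (2, 0), (3, 2), (4, 2)]
trajectories = ["traj_a", "traj_b", "traj_c", "traj_d", "traj_e"]

clusters = group_by_cluster(labels, trajectories)
for cluster_id, members in sorted(clusters.items()):
    print(cluster_id, len(members))
```

A dict keyed by cluster number makes it straightforward to report the size of each cluster afterwards.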

tracktable.applications.cluster.print_cluster_sizes(clusters)[source]

Print the cluster id and the number of trajectories in the cluster.

Parameters

clusters (dict) – Clusters to print information about.