POPAyI: Muscling Ordinal Patterns for Low-Complex and Usability-Aware Transportation Mode Detection

Detecting transportation modes’ usability in spatiotemporal urban trajectories can provide valuable insights into the mobility preferences of urban populations, helping epidemic prevention and urban quality-of-life improvement. With this goal, we introduce polar ordinal patterns with amplitude information (POPAyI), a strategy that bases its design on the ordinal pattern (OP) transformation applied to mobility-related time series. POPAyI can quantify time-series dynamics at a low computational cost, muscling time series’ characteristics without the high computational and methodological complexity of the current machine learning (ML) and deep learning (DL) literature. POPAyI uses polar representation and captures amplitude information in time series, bringing multivariate capability to the standard 1-D OP transformation. Our experiments show that POPAyI: 1) adapts well to multidimensional mobility time series and natural nonlinear mobility behavior and 2) presents consistent detection results for every considered number of transportation mode classes, with efficiency in terms of storage and computation complexity, using fewer features than ML approaches and fewer computational resources than DL methods, e.g., reaching 10 000 fewer parameters than a lightweight DL approach while increasing the F1-score by 3%.

I. INTRODUCTION
Transportation mode detection (TMD) involves classifying mobility traces to identify the corresponding transport mode. It can provide valuable insights into the mobility preferences of urban populations and, in a broader sense, help tackle the consequences of urbanization, for example by supporting epidemic prevention, traffic management, and quality-of-life improvements (e.g., road safety and carbon footprint control) [1], [2].
Most TMD literature relies on hand-crafted features (reaching hundreds of features [1], [3], [4]) or computationally and data-intensive deep learning (DL) methods [5], [6], which pose challenges for resource-constrained Internet of Things (IoT) scenarios [2], [7]. We tackle TMD differently by leveraging mobility analysis with ordinal patterns (OPs). OP provides a symbolic representation of time series dynamics, capturing intrinsic characteristics without relying on predefined models or significant computational resources. This lightweight solution is well-suited for edge computing applications, allowing on-device analytics and preserving privacy by processing data locally [7].
However, OP has its limitations. Initially designed for 1-D time series, it does not directly adapt to multidimensional data such as mobility. Previous proposals [8], [9], [10], [11] converted 2-D into 1-D data using dimensionality reduction or distance-based projections, which are often time consuming and complex and do not capture nonlinear movement. Another limitation is that OP does not register amplitude, rendering it oblivious to magnitude variations; e.g., the points (2, 5) and (2, 500) form the same OP symbol. This oversight may lead to lower performance, as amplitude carries valuable information, particularly in mobility, where it reflects transport displacement. Including amplitude can significantly enhance the ability to distinguish between transports and provide insights into urban conditions over time. However, the literature incorporates it only as a correcting factor in features extracted from OP, limiting its applicability to other OP-based features and representations [12], [13].
In this context, we propose polar OPs with amplitude information (POPAyI), a multivariate OP approach. It extracts linear and nonlinear motion using the polar form of 2-D coordinates and incorporates amplitude information by inferring entity displacement through coordinate distances. POPAyI is the first study that includes amplitude in a multivariate OP transformation. These strategies make POPAyI effective in capturing transportation behaviors, muscling OP and bringing low-complexity and usability-aware TMD.
In summary, our contributions are listed in the following.
1) We design POP, a multivariate OP approach for TMD that uses polar coordinates, being more suitable for mobility data by considering movement aspects.
2) We incorporate amplitude information into POP to obtain POPAyI, boosting TMD results.
3) We validate POPAyI using two well-known mobility data sets: a) Cologne and b) GeoLife. We compare POPAyI with seven machine learning (ML)- and DL-based literature proposals, showing that it is a lightweight alternative with competitive results and lower complexity. POPAyI can achieve equivalent results to DL approaches using $10^3$ to $10^6$ times fewer parameters, e.g., we can increase the F1-score by 3% using 10 000 fewer parameters than a lightweight DL approach. We can also perform similar classification to ML methods with about 90 fewer features.
We organize this work as follows. Section II provides essential preliminary definitions (e.g., TMD problem statement) and OP theory. Section III reviews relevant prior work in TMD. Section IV introduces POPAyI. Section V discusses our results. Section VI concludes this work and explains future directions.

II. RATIONALE
Section II-A presents essential preliminary definitions of TMD. Sections II-B and II-C, respectively, introduce the OP transformation and the OP limitations for TMD, which we tackle with our approach.

A. Preliminary Definitions
Our technique relies on the ability to capture intricate details and changes in transportation modes from 2-D spatiotemporal data. Therefore, we prioritize data sets that offer high precision and are sampled every few seconds. Specifically, we use data sets that provide samples of 2-D geodesic coordinates.
In this scenario, a trajectory is the mobility record in chronological order, $T(t) = \{p_1, \dots, p_n\}$. Each point $p_i = (x_i, y_i, t_i)$, $i \in \mathbb{N}$, represents a geospatial coordinate set and a corresponding timestamp [1], indicating the location of a mobile entity at a particular time. Since $x$ and $y$ vary together over time, we consider a trajectory as a multivariate time series $T = \{X, Y\}$, where $X(t) = \{x_1, \dots, x_n\}$ and $Y(t) = \{y_1, \dots, y_n\}$. We also consider trajectories of varying lengths $n$ generated by distinct transportation modes.
TMD involves two fundamental steps: 1) segmentation and 2) classification. In segmentation, a trajectory is divided into distinct nonoverlapping segments $S(t) = \{x_1, \dots, x_k\}$, where $k \le n$. Each segment corresponds to a consecutive sequence of points associated with a single transportation mode, and a point belongs to only one segment. To ensure comparability with existing literature, we split trajectories as described in [1], at either a stationary point (where the interval between two consecutive points exceeds 20 min) or the end of the trajectory. We exclusively consider trajectories (and segments) with a single associated transportation mode, relying on ground-truth data for this determination. Various other segmentation methods can be employed to extract these segments, such as fixed-length segments [5] or change-point detection [1].
The second step, classification, focuses on predicting the transportation mode for each segment. Preprocessing techniques are employed to enhance distinctive characteristics, facilitating accurate identification. Again, for consistency with existing literature and to uphold feature quality, we adhere to the same preprocessing strategies used in [1], [3], and [5]: we exclude segments with fewer than ten points due to their limited information, and we remove out-of-range coordinate points and outliers in speed and acceleration to mitigate inaccurate measurements that could impact TMD results.
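The segmentation and minimum-length rules above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in the experiments: the function name `split_segments`, the `(x, y, t)` tuple layout with `t` in seconds, and the default arguments are hypothetical, and the removal of out-of-range points and speed or acceleration outliers is left out.

```python
from typing import List, Tuple

# Assumed point layout: (x, y, t), with the timestamp t given in seconds.
Point = Tuple[float, float, float]


def split_segments(trajectory: List[Point],
                   max_gap_s: float = 20 * 60,
                   min_points: int = 10) -> List[List[Point]]:
    """Split a chronologically ordered trajectory at time gaps longer than
    max_gap_s, then drop segments with fewer than min_points points."""
    segments: List[List[Point]] = []
    current: List[Point] = []
    for point in trajectory:
        # A gap above the threshold closes the current segment.
        if current and point[2] - current[-1][2] > max_gap_s:
            segments.append(current)
            current = []
        current.append(point)
    if current:  # the end of the trajectory closes the last segment
        segments.append(current)
    return [seg for seg in segments if len(seg) >= min_points]
```

Every point ends up in at most one segment, which matches the nonoverlapping requirement stated above.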

B. Ordinal Patterns Transformation
OP can be applied to any time series, as it does not rely on specific model assumptions, merely replacing values in the same neighborhood with patterns based on their sequence [14]. It needs two parameters: 1) embedding dimension $D \in \mathbb{N}$ to determine the length of the patterns and 2) embedding delay $\tau \in \mathbb{N}$ to define the interval between consecutive data points [14]. Formally, OP is defined as follows. Considering a time series $X(t)$, at each time instant $t \in \{1, \dots, n - (D-1)\tau\}$, there is a sliding window $s_t \subseteq X$ of size $D$, such that $s_t = \{x_t, x_{t+\tau}, \dots, x_{t+(D-2)\tau}, x_{t+(D-1)\tau}\}$. In other words, OP obtains the elements of the sliding window at times $t, \dots, t + (D-1)\tau$ by sampling the time series at evenly spaced intervals of size $\tau$. In each $t$, OP determines an ordinal relationship between the points within the sliding window: the reordering necessary to sort these points in ascending order. Hence, the time series is converted to a set of OPs, $\Pi = \{\pi_1, \dots, \pi_m\}$, where $m = n - (D-1)\tau$ and each $\pi_i$ is one of the $D!$ possible permutations [15].
Fig. 1 depicts how OP works. In the highlight, we see the parameters: each sliding window contains $D$ points with an interval of $\tau$. For instance, for $D = 3$, we have the first sliding window in $t = 1$, that is, $s_1 = \{1, 5, 2\}$, forming the pattern $\pi_1 = 021$. The choice of $D$ must satisfy the condition $n \gg D!$ for the sake of reliability of the statistics estimated by the technique [15], [16]. For practical purposes, the authors of [14] recommend values in the range $3 \le D \le 7$.
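As an illustration of the definition above, the standard 1-D OP transformation can be sketched in a few lines of Python. The function name `ordinal_patterns` and the sample series are hypothetical; each symbol is encoded as the indices that sort the window in ascending order, and ties (which the definition does not cover) are assumed to be broken by position.

```python
from typing import List, Sequence


def ordinal_patterns(series: Sequence[float], D: int = 3, tau: int = 1) -> List[str]:
    """Return the OP symbol of every sliding window of the series."""
    m = len(series) - (D - 1) * tau  # number of sliding windows
    patterns = []
    for t in range(max(m, 0)):
        window = [series[t + k * tau] for k in range(D)]
        # Indices that sort the window in ascending order (stable for ties).
        order = sorted(range(D), key=lambda k: window[k])
        patterns.append("".join(str(k) for k in order))
    return patterns


# The first window {1, 5, 2} reproduces the pattern 021 of the example above.
print(ordinal_patterns([1, 5, 2, 4, 3]))  # ['021', '120', '021']
```

The cost per window is that of sorting $D$ values, so the whole transformation is linear in the series length for a fixed $D$.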
1) Ordinal Patterns Probability Distribution: Other representations can be obtained after extracting the OPs from the time series, such as a probability distribution. It is built from the number of appearances of each pattern $\pi_i$ in the symbolic time series. Thus, the histogram of the probability distribution gives the relative frequency of each pattern, i.e., the number of occurrences of $\pi_i$ divided by the number of extracted patterns $m$.
2) Ordinal Patterns Transition Network: Another representation derived from the OP transformation is the OPs transition network (OPTN). It is defined as a weighted directed graph $G = (V, E)$, in which each vertex $v_{\pi_i} \in V$ corresponds to a pattern and each edge $(v_{\pi_i}, v_{\pi_j}) \in E$ connects the sequential patterns $\pi_i$ and $\pi_j$, respectively. The edge weights $w : E \to \mathbb{R}$ are the probability of a specific transition occurring, given by $w(v_{\pi_i}, v_{\pi_j}) = |\Pi_{\pi_i,\pi_j}| / (m - 1)$, where $|\Pi_{\pi_i,\pi_j}| \in \{0, \dots, m-1\}$ is the number of transitions between $\pi_i$ and $\pi_j$. Additionally, it satisfies $\sum_{v_{\pi_i}, v_{\pi_j}} w(v_{\pi_i}, v_{\pi_j}) = 1$.
From these new representations, it is possible to extract features, such as Information Theory quantifiers, which we can use to characterize the time series dynamics [16].
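Both representations can be computed directly from the sequence of symbols. The sketch below assumes the symbols are already available as a list of strings; the function names are hypothetical, and the edge weights are normalized over all $m - 1$ transitions so that they sum to one, as stated above.

```python
from collections import Counter
from typing import Dict, List, Tuple


def pattern_distribution(symbols: List[str]) -> Dict[str, float]:
    """Relative frequency of each pattern in the symbolic series."""
    if not symbols:
        return {}
    m = len(symbols)
    return {s: c / m for s, c in Counter(symbols).items()}


def transition_weights(symbols: List[str]) -> Dict[Tuple[str, str], float]:
    """Edge weights of the transition network: each count of consecutive
    pattern pairs divided by the total number of transitions (m - 1)."""
    total = len(symbols) - 1
    if total <= 0:
        return {}
    transitions = Counter(zip(symbols, symbols[1:]))
    return {edge: c / total for edge, c in transitions.items()}


symbols = ["021", "120", "021", "021"]
print(pattern_distribution(symbols))  # {'021': 0.75, '120': 0.25}
print(transition_weights(symbols))    # three edges, each with weight 1/3
```

Only the patterns and transitions that actually occur are stored, so unobserved symbols implicitly have probability zero.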

C. OPs' Limitations for TMD
Considering TMD, OP exhibits linear complexity, making it suitable for resource-limited scenarios (e.g., online detection and IoT applications) and large time series processing. OP is also robust to observational and dynamic noise, making it resilient to measurement errors. Furthermore, OP is invariant to nonlinear monotonic transformations, i.e., it is insensitive to amplitude variation [15], [16], [17]. Despite these benefits, OP has certain disadvantages for TMD.
1) Absence of Amplitude Information: The invariance to nonlinear monotonic transformations is helpful against noise, but it can be harmful when amplitude contains essential information. For instance, applying the standard OP transformation (D = 3 and τ = 1) to two vehicle segments with speeds (in m/s) $v_1 = \{2, 5, 8\}$ and $v_2 = \{2, 10, 19\}$ results in the same pattern $\pi_m = 012$ (i.e., speed increases over time), despite their noticeable amplitude differences. Hence, the absence of amplitude may impact TMD since speed gradient diversity and spatial movement dynamism are not captured.
2) Originally Defined for Univariate Time Series: The standard OP works for univariate time series. However, many phenomena have more than one temporal component, and we cannot observe these components in isolation. In mobility, we have latitude and longitude, which depend on time and on each other. Hence, when considering two points $x_i$ and $x_{i+1}$, where each $x_i = (\mathrm{lat}_i, \mathrm{long}_i)$, there is no intuitive way to establish an ordinal relationship between the components, even though their information changes in time. So, OP cannot be trivially generalized to this new scenario.
POPAyI overcomes these limitations by using a multivariate approach and incorporating amplitude information, resulting in improved performance and a more comprehensive representation of spatiotemporal mobility behavior. We will examine POPAyI in detail in Section IV.
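The amplitude limitation in item 1) can be checked numerically with the two speed windows given there. The helper `op_symbol` is a hypothetical name for a single-window version of the standard OP transformation.

```python
def op_symbol(window):
    """OP symbol of one window: the indices that sort it in ascending order."""
    order = sorted(range(len(window)), key=lambda k: window[k])
    return "".join(str(k) for k in order)


v1 = [2, 5, 8]    # speeds in m/s
v2 = [2, 10, 19]  # speeds in m/s

# Both windows collapse to the same symbol ...
print(op_symbol(v1), op_symbol(v2))    # 012 012
# ... although the speed gain over the window differs (6 vs. 17 m/s).
print(v1[-1] - v1[0], v2[-1] - v2[0])  # 6 17
```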

III. RELATED WORK
While some studies have explored incorporating external information into sensor-sourced data to enhance TMD accuracy, this approach is often impractical because it requires frequently collecting additional knowledge [18], [19]. Instead, POPAyI uses high-resolution spatiotemporal mobility data, such as GPS traces, to extract features that capture the distinctive characteristics of transportation modes. Consequently, in our literature review, we prioritize works that align with our noninvasive approach, use the same kind of data, and perform feature extraction as a crucial component of their methodology.
We also discuss 1-D and 2-D OP approaches, evaluating their effectiveness in analyzing multivariate spatiotemporal data.
1) Transportation Mode Detection: Most TMD literature focuses on feature extraction for ML. Traditional ML methods, such as tree-based ensemble algorithms, have demonstrated high accuracy, reaching up to 90% [3], [4]. However, these approaches often involve extracting numerous statistical and domain-specific features, ranging from dozens [1] to hundreds [3], which can be time consuming, requires domain knowledge, and suffers from the curse of dimensionality.
To avoid using these hand-crafted features, some studies employ supervised [2], [5], [6], [20], [21], [22], [23] and semi-supervised [24], [25], [26] DL techniques to extract multiple layers of features automatically, often yielding results comparable or even superior to those of ML. Still, they demand significant computational resources and large volumes of training segments with equal length, which require interpolation or padding in real-world data. Additionally, their extracted features are highly abstract and nonintuitive, making interpretation challenging. Recently, DL with image-based features emerged in TMD, aiding the capture of spatial information, local patterns, and global context [27], [28], [29], [30], [31], but entailing temporal information loss, increased computational requirements, and more data preprocessing complexity. Therefore, we aim for a method that achieves high detection results using minimal features and computational resources, striking a balance between effectiveness and practicality.
2) Ordinal Patterns: OP [14] is a vital contribution to studying time series dynamics in several domains [15], [16], [17], including TMD. For instance, Zhang et al. [32] extracted permutation entropy (PE) [14] from OP and used it along with statistical features to identify the transports that generated GPS trajectories, qualifying PE as a great feature in this task due to its low computational complexity. Our previous work [11] extended this investigation to other features, such as statistical complexity (SC) and probability of self-transition, extracted from OPTN. We showed that features extracted from OP and OPTN help identify transports in scenarios with fewer data, such as IoT contexts. However, as OP was originally defined for univariate time series, these studies need to transform mobility into a 1-D space, which can be time consuming and complex.
Regarding the multivariate OP transformations, studies suggested joining the OP representation of each temporal component in a matrix [8], [33], which increases OP time and space complexity and demands more extensive time series to extract reliable statistics, making it impractical in certain domains. Another proposal combined dimensions into a 1-D projection with PCA [8], creating abstract time series that are hard to interpret. Alternatively, researchers calculated Euclidean and Manhattan distances between time series points and a reference point [9], which may lose essential information by incorrectly modeling mobility (e.g., not considering turns).
Furthermore, amplitude information is crucial in various fields, including mobility, where it represents the displacement of entities and plays a vital role in TMD. Neglecting this information can significantly degrade classification performance [34]. Some studies incorporated amplitude directly in PE [12], [13], [35], limiting their application in other features and OP representations. Sun et al. [36] created amplitude symbols by dividing the time series plane into equal regions, which can lead to large distributions and affect the representativeness of extracted statistics. To the best of our knowledge, no studies consider amplitude information in multivariate OP.
3) POPAyI Positioning: In the POP transformation, we use polar coordinates instead of the original 2-D geodesic coordinates, effectively capturing the nonlinear aspects of spatiotemporal mobility without increasing OP complexity, unlike previous studies. We then enhance POP with amplitude information to obtain POPAyI, enabling its integration into any feature or OP-based transformation with low computational costs. To the best of our knowledge, POPAyI is the first work to leverage amplitude information in a multivariate OP approach. We will discuss POPAyI's design in the following section.

IV. POPAYI DESIGN
Section IV-A introduces POPAyI, an extension of the standard 1-D OP transformation for 2-D data, specifically focused on TMD. The components of POPAyI, i.e., the polar-like multivariate OP transformation and amplitude information, are detailed in Sections IV-B and IV-C, respectively.

A. Methodology
Fig. 2 presents our methodology, described as follows.
1) Data Preprocessing: This stage aims to transform the raw trajectory into a better format for further analysis, as follows.
a) Segmentation: To segment the trajectories, we detect stationary points with a minimum standing time of 20 min (i.e., the interval between two consecutive points is greater than 20 min) or the end of the trajectory.
b) Data Handling: Data sampling is commonly influenced by various context conditions (e.g., weather), leading to inaccurate measurements that affect TMD results. To prevent this issue, we remove coordinate points with out-of-range values and discard trajectories with fewer than ten points to avoid generating low-quality traces.
2) POPAyI Transformation: OP extracts a symbolic pattern from a sliding window of a size determined by the parameters D and τ. For instance, in Fig. 2, we use D = 3 and τ = 1, indicating a sliding window with three coordinate points and a time interval of 1 (details in Fig. 1). Generalizing OP to 2-D, in each sliding window, we order the points based on their polar angle (cf. Section IV-B) and calculate the amplitude by the distance from the first to the last point (cf. Section IV-C).
Algorithm 1 presents the pseudo-code for the POPAyI transformation. Lines 2-13 iterate through all segment points and perform the following operations: First, we sample a sliding window from the segment that will undergo the POPAyI transformation, with cost O(1). Lines 3-5 check if the segment contains only one dimension, indicating a 1-D time series. If so, a new dimension with zero values is added, transforming it from $S(t) = \{x_1, \dots, x_n\}$ to $S(t) = \{(0, x_1), \dots, (0, x_n)\}$. It allows us to use the POPAyI transformation for both 1-D and 2-D data due to POP characteristics, detailed in Section IV-B. The time complexity to perform the POP transformation in line 6 is O(D log D), but since D is typically small, it can be treated as O(1). Finally, the amplitude variation is calculated from lines 7 to 11, with a time complexity of O(1). This step is explained in detail in Section IV-C. Therefore, the overall time complexity of the POPAyI transformation is O(n).

Algorithm 1 POPAyI Transformation
 1: procedure POPAyI(segment, q)
 2:   for each sliding window in segment do
 3:     if sliding window contains only one dimension then
 4:       add new dimension with zero values (each x is now (0, x))
 5:     end if
 6:     π_m ← apply POP on 2-D sliding window
 7:     if distance from first to last point in sliding window ≥ q then
 8:       a_m ← 1
 9:     else
10:       a_m ← 0
11:     end if
12:     add the tuple (π_m, a_m) to a list
13:   end for
14:   return list containing all tuples (π_m, a_m) that represent the segment
15: end procedure
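The steps of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the implementation used in the experiments: the names `pop_symbol` and `popayi` are hypothetical, the polar angle is computed as `atan2(y, x)` with (0, 0) as the reference point (the argument order is an assumption), and the amplitude uses the Euclidean distance with the `≥ q` comparison of line 7.

```python
import math
from typing import List, Sequence, Tuple


def pop_symbol(window: Sequence[Tuple[float, float]]) -> str:
    """POP symbol of one 2-D window: indices sorted by polar angle,
    with ties resolved by the radial distance."""
    polar = []
    for x, y in window:
        r = math.hypot(x, y)
        theta = math.atan2(y, x)  # assumed argument order
        if theta < 0:
            theta += 2 * math.pi  # keep angles non-negative
        polar.append((theta, r))
    order = sorted(range(len(polar)), key=lambda k: polar[k])
    return "".join(str(k) for k in order)


def popayi(segment, q: float, D: int = 3, tau: int = 1) -> List[Tuple[str, int]]:
    """Transform a segment into a list of (pattern, amplitude) tuples."""
    # Lines 3-5: a 1-D value x becomes the 2-D point (0, x).
    points = [(0.0, float(p)) if isinstance(p, (int, float))
              else (float(p[0]), float(p[1])) for p in segment]
    symbols = []
    m = len(points) - (D - 1) * tau
    for t in range(max(m, 0)):
        window = [points[t + k * tau] for k in range(D)]
        pi = pop_symbol(window)                                # line 6
        (x0, y0), (x1, y1) = window[0], window[-1]
        a = 1 if math.hypot(x1 - x0, y1 - y0) >= q else 0      # lines 7-11
        symbols.append((pi, a))                                # line 12
    return symbols


print(popayi([(1, 0), (1, 1), (0, 1), (-1, 1)], q=1.5))
# [('012', 0), ('012', 1)]
```

In the usage example, both windows share the pattern 012, and only the amplitude bit separates the short displacement (about 1.41) from the long one (2.0).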
3) Representations Derived From POPAyI: Although POPAyI symbols are tuples, it is possible to extract the same representations described in Sections II-B1 and II-B2, namely, the OP's probability distribution and transition network. In the former, we count the number of occurrences of each tuple composed of POP and amplitude. In the latter, the nodes in the transition network are the tuples, and the edges are the transitions between patterns that appear sequentially.
4) Feature Extraction: From each representation, we extract different features. For the probability distribution, the features are listed in the following.
1) Shannon PE: This is a variation of Shannon's classical entropy and quantifies the probability distribution's randomness. Hence, its maximum occurs when all possible permutations of D! have the same probability of occurring (i.e., a uniform distribution), indicating a completely random time series. In contrast, low PE values represent a deterministic time series [16].
2) SC: This feature computes the degree of regularity present in time series by comparing the analyzed probability distribution with the uniform distribution. Hence, SC captures the relationship between dynamical components (such as determinism and randomness) while measuring their disequilibrium [16], [37].
3) Fisher Information (FI): It measures the amount of information one observation carries about an unknown parameter, e.g., the probability of observing multiple occurrences of patterns in a trajectory. In other words, FI captures how dispersed the distribution values are, usually evidenced by the curve's shape, wideness, or skewness. Hence, the higher the FI, the more sharply peaked the curve describing the distribution values and the easier it is to find "real" representative patterns. Contrary to the population-like focus of PE, FI presents a locality property since it reflects the differences among consecutive probabilities of distributions [37].
The extracted OPTN features are listed in the following.
1) Avg. and Standard Deviation of Edge Weights: Transition probabilities' central tendency and dispersion, respectively.
2) Probability of Self-Transition: Self-transitions, or loops, are the edges from a vertex to itself, meaning the consecutive occurrences of a pattern. Borges et al. [17] showed that it is a valuable indicator of time series' main characteristics.
3) Number of Edges: It is the density of a graph in edge connectivity, i.e., the ratio between the existing edges and the maximum number of edges a graph can contain. It can be a vital indicator of temporal dynamics. For instance, as the randomness of a process increases, there are more chances for all possible transitions to occur. Hence, a high edge density means more randomness in the system, while low values occur in more deterministic time series.
4) Number of Nodes: Previous studies have shown that deterministic time series (regardless of size) may have forbidden OPs. Consequently, their graphs have forbidden nodes. In contrast, stochastic time series contain all possible patterns (and possible nodes) if long enough [17]. Therefore, we measure the node density (using the maximum number of patterns as the upper bound of nodes).
In summary, we consider 11 features: three extracted from the POPAyI probability distribution and eight extracted from the transition network (i.e., PE, SC, FI, average and standard deviation of edge weights, probability of self-transition, and number of edges and nodes). In our experiments, these features show promising results in discriminating transports, but they are still not enough to reach state-of-the-art discrimination. For this reason, we go further by muscling POPAyI with statistical metrics (i.e., mean, variance, maximum, and minimum) extracted from two motion-related features, i.e., distance and speed. Hence, a total of 19 features are considered in our classifier.

Algorithm 2 POP Transformation
 1: procedure POP(2D sliding window)
 2:   for x, y in 2D sliding window do
 3:     r ← distance from (x, y) to the reference point (0, 0)
 4:     θ ← polar angle of (x, y), computed with atan2
 5:     if θ < 0 then
 6:       θ ← θ + 2π
 7:     end if
 8:   end for
 9:   π_m ← ordering by θ, ties are solved by r
10:   return π_m
11: end procedure
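Two of the features listed above, PE and the probability of self-transition, can be sketched directly from a symbol sequence. The function names are hypothetical, and two choices are assumptions made for illustration: PE is normalized by the logarithm of the number of possible symbols (D! for OP, 2 · D! for POPAyI) so that the uniform distribution yields 1, and the probability of self-transition is taken as the fraction of all transitions that are loops.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def permutation_entropy(symbols: Sequence[Hashable], n_possible: int) -> float:
    """Shannon entropy of the symbol distribution, normalized to [0, 1]."""
    m = len(symbols)
    if m == 0 or n_possible < 2:
        return 0.0
    counts = Counter(symbols)
    h = -sum((c / m) * math.log(c / m) for c in counts.values())
    return h / math.log(n_possible)


def self_transition_probability(symbols: Sequence[Hashable]) -> float:
    """Fraction of consecutive symbol pairs in which the symbol repeats."""
    pairs = list(zip(symbols, symbols[1:]))
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


all_patterns = ["012", "021", "102", "120", "201", "210"]
print(permutation_entropy(all_patterns, n_possible=6))   # close to 1 (uniform)
print(permutation_entropy(["012"] * 5, n_possible=6))    # zero entropy (deterministic)
print(self_transition_probability(["012", "021", "021"]))  # 0.5
```

Because the symbols only need to be hashable, the same functions accept POPAyI tuples of pattern and amplitude.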
5) Classification: Next, the extracted features are classified using the extreme gradient boosting (XGBoost), a tree-based traditional ML approach, which achieved the best result in our experiments, as shown in Section V-D5.

B. Polar Ordinal Patterns: POP
Extending standard 1-D OP to higher dimensional time series poses challenges in preserving its interpretability while maintaining the relationship among points (cf. Section II). In this context, POPAyI introduces POP (Polar OPs), which uses polar coordinates to extract OPs from 2-D data points while retaining their meaningful relationship, effectively addressing previous methods' limitations and enabling the capture of mobility dynamics.
1) POP Definition: POP is an adaptation of the Graham scan algorithm [38], designed for planar convex hull computation. It uses polar coordinates to determine the collinearity and relative orientation of points to a reference point. In POP, we leverage the fundamental concept of the Graham scan but modify it to calculate the ordinal relationship between points. This adaptation enriches Graham's analysis by incorporating ordinal ordering in addition to collinearity, clockwise, and counterclockwise orientation.
Algorithm 2 presents the POP transformation. First, we transform from geodesic (lat, long) to polar coordinates. To this end, we compute the radial distance $r = \sqrt{(\mathrm{lat} - a)^2 + (\mathrm{long} - b)^2}$ and the polar angle $\theta$ of each point relative to $(a, b)$, where $(a, b)$ denotes the reference point. Since we use $(0, 0)$ as our reference point, we can apply the equations as shown in lines 3 and 4.
To obtain the angle $\theta$, we use the atan2 function, which returns results in the range $-\pi$ to $\pi$. However, we must adjust the calculated angle when it is less than 0, since negative angles can introduce ambiguity when ordering points based on their polar angles. Thus, in lines 5-7 of the algorithm, we add $2\pi$ to the angle when it is negative, maintaining ordering consistency by using only non-negative angles.
Once each geodesic coordinate is transformed into polar coordinates, we calculate in line 9 the indices that would sort the array based on $\theta$, providing the ordinal relationship between the points. In tie cases, i.e., when points have the same $\theta$, we use the distance $r$ to order. Sorting the points based on their polar angles allows us to preserve the relative order of the points, enabling subsequent analysis and interpretation.
For instance, as illustrated in Fig. 3, for D = 3, $\pi_m = 021$ signifies a turn in a specific direction or along a particular route, which is captured by POP. Relying solely on a single 1-D coordinate (lat. or long.) limits our ability to extract precise motion details, obtaining only linear behavior, such as north/south and back/forward.
Moreover, the patterns $\pi_m = 012$ and $\pi_m = 210$ represent collinear points. Considering the ordinal ordering, we can distinguish between them based on the sequential relationship of the points, indicating increasing or decreasing trends. Similarly, $\pi_m = 021$ (illustrated in Fig. 3) and $\pi_m = 120$ indicate clockwise orientation, while $\pi_m = 201$ and $\pi_m = 102$ (also in Fig. 3) signify counterclockwise orientation.
Hence, by introducing ordinal ordering into the Graham scan idea, POP enhances the interpretability and analysis of mobility patterns in 2-D time series data, capturing essential aspects, such as the direction of movement and changes in intensity along the direction. Therefore, POP brings several advantages while inheriting OP benefits (shown in Section IV-D) and avoiding the limitations of linear projections that fail to capture the relationships between mobility points, e.g., turns made in travel.
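As an illustration of the POP steps described in this section, the following Python sketch converts each point to polar coordinates relative to a reference point and returns the indices that sort the window by angle, with ties resolved by radial distance. The names `to_polar` and `pop` and the `ref` parameter are hypothetical, and the `atan2(dy, dx)` argument order is an assumption, so the symbols produced for real (lat, long) data depend on that choice.

```python
import math
from typing import Sequence, Tuple


def to_polar(x: float, y: float,
             ref: Tuple[float, float] = (0.0, 0.0)) -> Tuple[float, float]:
    """Return (r, theta) of the point relative to ref, with theta in [0, 2*pi)."""
    dx, dy = x - ref[0], y - ref[1]
    r = math.hypot(dx, dy)
    theta = math.atan2(dy, dx)  # range -pi..pi; argument order is assumed
    if theta < 0:
        theta += 2 * math.pi    # shift negative angles, as in lines 5-7
    return r, theta


def pop(window: Sequence[Tuple[float, float]],
        ref: Tuple[float, float] = (0.0, 0.0)) -> str:
    """Indices that sort the window by theta, ties resolved by r."""
    polar = [to_polar(x, y, ref) for x, y in window]
    order = sorted(range(len(polar)), key=lambda k: (polar[k][1], polar[k][0]))
    return "".join(str(k) for k in order)


print(pop([(1, 0), (0, 1), (1, 1)]))   # 021: angles 0, pi/2, pi/4
print(pop([(1, 0), (2, 0), (3, 0)]))   # 012: same angle, increasing r
print(pop([(3, 0), (2, 0), (1, 0)]))   # 210: same angle, decreasing r
print(pop([(0, -1), (1, 0), (0, 1)]))  # 120: -pi/2 is shifted to 3*pi/2
```

The second and third windows lie on a single ray from the reference point, so only the radial distance separates the increasing case (012) from the decreasing one (210).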
Finally, we claim that POP is a suitable generalization of the standard OP transformation to 2-D data. To see this, consider a 1-D sliding window $s_m = \{3, 1, 3\}$. Since we cannot directly extract polar coordinates from it, we employ POP as follows.
1) A new dimension with zero values is added (lines 3-5 of Algorithm 1), so that each value $x$ becomes the point $(0, x)$.
2) The polar angle $\theta$ is calculated, yielding the same value for all points [all of them are $\theta = \tan^{-1}(0)$]. The distance $r$ (i.e., the original value) resolves these ties.
By consistently preserving the values and ordinal relationships from the 1-D to the 2-D representation, POP effectively captures the essential characteristics and ordinal relationships of the 1-D and 2-D time series. This capability allows for the robust analysis of mobility data.

C. Amplitude-Enhanced Ordinal Patterns
While speed and amplitude provide similar information in equally spaced trajectory samples, real-world data often presents complex and irregular sampling patterns. To tackle this challenge, POPAyI introduces a mechanism to calculate amplitude, capturing the displacement of entities over specific timeframes and encompassing variations in movement speed and direction. By incorporating amplitude, POPAyI facilitates more comprehensive analysis, particularly in scenarios with nonuniform sampling, ultimately enhancing accuracy and enabling a deeper understanding of the underlying dynamics.
We capture the amplitude by the distance between the first and last geodesic coordinates within each sliding window and then binarize it using a user-defined threshold value $q$, as shown in Algorithm 1. If the Euclidean distance between the window's last and first points exceeds $q$, $a_m = 1$; otherwise, $a_m = 0$. This way, in POPAyI, the time series is transformed into a set of patterns with their corresponding amplitudes, $\Pi = \{(\pi_1, a_1), \dots, (\pi_m, a_m)\}$, where each $(\pi_m, a_m)$ combines one of the $D!$ possible permutations with an amplitude level.
One could expand this approach to more amplitude values, but this would come at the cost of increased complexity and the need for additional parameters to determine amplitude thresholds. Instead, in this work, we adopt binarization to compare amplitude values and prevent the occurrence of infinite or massive amplitude distributions. Furthermore, binarization makes amplitude more reliable regarding outliers since a pattern in POPAyI with an abnormal value will not appear much, thus having minor importance in the corresponding distribution.
Using POPAyI, we extract features and representations from tuples $(\pi_m, a_m)$ as symbol patterns, preserving the linear time complexity of the standard OP transformation and slightly increasing space complexity to $O(2 \cdot D!)$.
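The size of the symbol alphabet behind this space bound follows from a simple count: each of the $D!$ permutations can be paired with one of the two amplitude levels. The worked values below are an illustration, with $S_D$ used here as an assumed notation for the set of permutations of $D$ elements.

```latex
\[
  \bigl|\{(\pi, a) : \pi \in S_D,\; a \in \{0, 1\}\}\bigr| = 2 \cdot D!,
  \qquad\text{e.g., } D = 3 \Rightarrow 2 \cdot 3! = 12,
  \quad D = 7 \Rightarrow 2 \cdot 7! = 10\,080 .
\]
```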

D. POPAyI Advantages to TMD
POPAyI, like the representations derived from it, inherits the benefits of the OP transformation and introduces several advantages for TMD. These are included in the following.
3) ... to trajectories of different sizes, ranging from small to large. Hence, it adapts to a person's real life, in which mobility behavior can describe trajectories of different trip sizes.
4) Robustness: POPAyI is robust to observational and dynamic noise (e.g., GPS measurement errors) [15], [16].
5) Enhanced Representation: POPAyI captures the rich 2-D temporal dynamics of time series, providing a more comprehensive representation than standard OP. By incorporating amplitude information, POPAyI enables better discrimination and characterization of different spatiotemporal mobility behaviors, enhancing the accuracy of TMD.

V. EXPERIMENTS
This section shows the validation results of POPAyI. We present the data sets used in Section V-A. Section V-B shows how we select the POPAyI hyperparameters used in the experiments. We analyze the Cologne data set in Section V-C, demonstrating that amplitude information aids in capturing mobility behavior. In Section V-D, we use the GeoLife data set to evaluate POPAyI in TMD tasks with different levels of detection complexity, from distinct to closely associated transports. We compare it to existing methods, evaluating their performance across various metrics, such as accuracy, F1 score, and computational complexity.

A. Data Sets
In Section V-C, we use the realistic large-scale Cologne data set describing vehicle trajectories generated by [39] with a temporal resolution of one second. It contains 24 h of weekday traffic comprising 700 000 individual vehicle trips in a 400-km² region in Köln, Germany. Each row of the data set reports the timestamp, the vehicle identification, the corresponding 2-D position (geodesic coordinates) x and y in meters, and the speed (m/s).
Section V-D uses the real-world GeoLife data set [1] containing 182 users' trajectories over five years (from April 2007 to August 2012), of which 73 have transportation mode information besides the latitude and longitude. Fig. 4 shows boxplots of trajectory size, average distance, and average speed extracted from the transports of these 73 users employed in our evaluation. This extracted set contains the transportation modes required to ensure comparability with related studies.

B. POPAyI's Hyperparameters Selection
In our experiments, we tailored the systematic search for the best hyperparameter values to the specific goal of each experiment. In the first experiment (Section V-C), values were determined based on cluster visualization, while in Section V-D, the average F1 score on the training set was considered (as further detailed in Section V-D5). This section discusses the explored values.
1) Amplitude Threshold (q): Setting q too high or too low can hinder displacement differentiation (i.e., vehicles moving similarly but with different speeds), as it results in a single amplitude level, $a_m = 0$ or $a_m = 1$, respectively. Because the POPAyI probability distribution expects two distinct amplitude levels, such an imbalanced threshold leaves half of the possible symbols unused, which could adversely affect the accuracy of our results. In addition, q is expressed in the same unit as the data set. In Section V-C, q is measured in meters, while in Section V-D, q is measured in kilometers. Thus, we evaluated different values for each experiment: in the first, we used q ∈ {0.5, 1, 2, 3}, and in the second, we tested q ∈ {0.0005, 0.005, 0.05, 0.1, 0.5, 1, 2, 3}.
2) Embedding Dimension (D): To balance performance and computational efficiency, we consider values of D ranging from 3 to 5, since D = 6 requires a much larger number of parameters (i.e., on the order of $10^6$ and $10^5$ in POPAyI and POP, respectively), which conflicts with that objective. When several values of D presented similar results, we selected the smallest one, provided it satisfied the condition $n \gg D!$.
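To make the effect of the amplitude threshold concrete, the sketch below computes the share of sliding windows whose first-to-last displacement exceeds q. The helper name `amplitude_share` and the toy trajectory are assumptions made for illustration; they are not part of the evaluated pipeline.

```python
import math

def amplitude_share(points, D, tau, q):
    """Fraction of sliding windows whose first-to-last displacement exceeds q."""
    span = (D - 1) * tau
    n_windows = len(points) - span
    if n_windows <= 0:
        raise ValueError("trajectory shorter than one window")
    hits = sum(
        1 for s in range(n_windows)
        if math.dist(points[s], points[s + span]) > q
    )
    return hits / n_windows

# Toy trajectory (hypothetical): ten 1 m steps followed by ten 0.1 m steps
xs = [float(i) for i in range(11)] + [10 + 0.1 * k for k in range(1, 11)]
points = [(x, 0.0) for x in xs]

too_low = amplitude_share(points, D=3, tau=1, q=0.05)   # every window gets a = 1
too_high = amplitude_share(points, D=3, tau=1, q=5.0)   # every window gets a = 0
balanced = amplitude_share(points, D=3, tau=1, q=1.0)   # both levels occur
```

A share of exactly 0 or 1 signals a degenerate threshold: only one amplitude level is ever produced, so half of the symbol alphabet can never occur.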

C. Amplitude Benefits From POPAyI
Here, we aim to examine how amplitude relates to speed gradient and its ability to differentiate vehicles under varying speed-like conditions. The purpose is to showcase the usefulness of amplitude rather than evaluate POPAyI performance. Hence, to conduct this experiment in a controlled environment, we use the Cologne data set (cf. Section V-A), which consists of data exclusively related to one type of transportation.
The data set encompasses 700 000 trips from about 100 000 distinct vehicles. For this specific experiment, we sought vehicles with diverse traffic characteristics, categorizing them into two groups based on their average speeds: 1) less than 5 m/s, ranging from 2 to 5 m/s (referred to as the slow group); and 2) equal to or exceeding 5 m/s, spanning from 6 to 30 m/s (termed the fast group). We chose these speed limits based on travel time: the slow group travels during rush hour (6:25 A.M. to 9 A.M.), while the fast group commutes in the early morning (1 A.M. to 5:30 A.M.). To ensure representative feature extraction that adheres to the condition $n \gg D!$, we specifically opted for trajectories of more than 500 points from unique vehicles. This criterion led us to randomly select 3000 vehicles (1500 per group), the maximum count meeting our standards in the slow group.
1) Evaluation Results: Fig. 5 compares results without [Fig. 5(a)] and with [Fig. 5(b)] amplitude. It shows the PE and FI features extracted from the POPAyI probability distribution. For a better cluster visualization and exclusively for this evaluation, the best parameter configuration is D = 3, τ = 10, and q = 2, selected as described in Section V-B. The τ value indicates the discretization of the data set in time intervals of 10 s, which gives the time distance between points in the OPs, i.e., inside the sliding window. It allows obtaining more representative patterns. Indeed, there are not enough changes in vehicle position for a temporal granularity lower than 10 s.
Fig. 5(a) shows that, without amplitude, it is hard to distinguish between vehicles having different mobility behaviors, even though they present evident differences in average speed. As discussed in Section IV-C, mobility behaviors are reflected in their speed dynamism over time and their spatiotemporal mobility dynamic during a timeframe. This dynamic is set aside if no amplitude information is considered in the OP transformation. Hence, as shown in Fig. 5(b), amplitude information helps capture displacement behaviors, revealing two distinct groups of vehicle behaviors.
Note that both figures show that the two groups contain vehicles with speed dynamics ranging from random to deterministic occurrences, i.e., from high to low PE. Nevertheless, without the amplitude information [cf. Fig. 5(a)], it is hard to quantify how dispersed the values of the related probability distribution are, which is what the FI feature usually captures. On the other hand, thanks to the amplitude information [cf. Fig. 5(b)], more information is extracted from vehicle patterns, revealing distinct spatiotemporal differences in mobility behavior, which are reflected in how the probability values of pattern occurrences are scattered. This locality property is captured in the FI results when the amplitude is added to the OP transformation.
Those observations are reinforced by [40], who showed that, in regular processes, normalized PE is close to zero and FI is close to one. On the other hand, stochastic systems have PE close to one and FI close to zero. In this context, the FI and PE results in Fig. 5 show that adding the amplitude information to the OP transformation: 1) makes the pattern distribution more deterministic (i.e., a few concentrated patterns) in the fast group, which culminates in an FI increase and 2) increases the randomness of the pattern distribution in the slow group, resulting in a stochastic behavior that is harder to predict (high PE) and in an FI decrease. Indeed, 87.27% of vehicles in the fast group present a single pattern responsible for more than half of the time-series symbols. Among those 1309 vehicles, 672 have the dominant pattern π = (210, 1), and 637 have the pattern π = (012, 1).
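The two limiting cases above (regular versus stochastic) can be checked numerically with a short sketch. The normalized PE below is the usual Shannon entropy divided by the logarithm of the number of bins. The FI uses one common discrete form with a normalization constant $F_0$; this is an assumption, since the exact definition adopted in Section IV-A is not reproduced here and may differ. The function names are illustrative.

```python
import math

def normalized_pe(probs):
    """Shannon entropy of the pattern distribution divided by log(number of bins)."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def fisher_information(probs):
    """Discrete Fisher information over consecutive bins (assumed form).

    F = F0 * sum_i (sqrt(p[i+1]) - sqrt(p[i]))**2, which depends on bin ordering.
    """
    roots = [math.sqrt(p) for p in probs]
    total = sum((roots[i + 1] - roots[i]) ** 2 for i in range(len(probs) - 1))
    # F0 is 1 when all mass sits in the first or last bin, otherwise 1/2
    f0 = 1.0 if probs[0] == 1.0 or probs[-1] == 1.0 else 0.5
    return f0 * total

n_bins = 12  # 2 * D! symbols for D = 3
uniform = [1.0 / n_bins] * n_bins       # fully stochastic limit
single = [0.0] * n_bins
single[4] = 1.0                         # fully regular limit: one pattern only

pe_uniform, fi_uniform = normalized_pe(uniform), fisher_information(uniform)
pe_single, fi_single = normalized_pe(single), fisher_information(single)
```

Under these assumptions the uniform distribution gives PE of one and FI of zero, and the single-pattern distribution gives PE of zero and FI of one, matching the behavior attributed to stochastic and regular processes.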
In fact, [1] discussed that trajectories suffering from congestion or heavy traffic are harder to infer since their temporal dynamics are uncertain, as verified in our current evaluation. In our case, 78.27% of vehicles in the slow group contain nine patterns in their distribution without any concentration. Among these 1174 vehicles, 1094 present all possible patterns except for π = (021, 0) and π = (102, 0), endorsing their increase in random behavior.
2) POPAyI Transition Network: Furthermore, the number of patterns in the probability distribution affects the number of edges and nodes in the POPAyI transition network. We observe a density of 0.1447 for edges (standard deviation of 0.036) and 0.6461 (0.121) for nodes in the fast group. This means that these networks contain, on average, approximately eight nodes (out of $2 \cdot D! = 12$ possibilities) and ten edges (out of 66 possibilities). In contrast, the slow group presents a density of 0.2010 (0.054) for edges and 0.7843 (0.135) for nodes: about ten nodes and 13 edges on average. Without amplitude, the slow group has edge and node densities of 0.528 (0.121) and 0.957 (0.115), respectively, whereas the fast group has 0.497 (0.102) and 0.973 (0.101). This means five nodes (out of six possible) and nine edges on average for both groups. These similar values increase the challenge of identifying the different behaviors needed to precisely detect modes of transportation, which explains the indistinguishability of the two groups in Fig. 5(a).
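The node and edge densities discussed above can be computed from a symbol sequence as in the sketch below. It assumes undirected edges without self-loops, which is consistent with the 66 possible edges quoted for 12 nodes; the function name and the toy sequence are hypothetical.

```python
import math

def network_density(symbols, D, amplitude_levels=2):
    """Node and edge densities of the transition network built from `symbols`.

    Assumption: edges are undirected and self-loops are ignored, so the
    number of possible edges is n * (n - 1) / 2 for n possible nodes.
    """
    n_nodes = amplitude_levels * math.factorial(D)
    n_edges = n_nodes * (n_nodes - 1) // 2
    nodes = set(symbols)
    edges = {frozenset((a, b)) for a, b in zip(symbols, symbols[1:]) if a != b}
    return len(nodes) / n_nodes, len(edges) / n_edges

# Toy symbol sequence (made up): three distinct symbols, two distinct transitions
a = ((0, 1, 2), 1)
b = ((2, 1, 0), 1)
c = ((0, 2, 1), 0)
node_density, edge_density = network_density([a, b, a, c], D=3)
```

Setting `amplitude_levels=1` gives the case without amplitude, where D = 3 yields six possible nodes and 15 possible edges.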
Finally, the results show the benefits of amplitude information in distinguishing various spatiotemporal mobility behaviors in the challenging context of a single type of vehicle. Therefore, we claim that the amplitude information leveraged by POPAyI brings strong detection capability to scenarios involving heterogeneous transports. The following section demonstrates this claim by comparing POPAyI with seven related literature proposals.

D. Transportation Mode Detection
Different transportation modes exhibit distinct traffic behaviors. Taxis, for example, display more randomness influenced by passenger destinations, while buses follow predictable routes with regular stops. OP's probability distribution captures and reflects both random and regular behaviors: deterministic behaviors concentrate on specific patterns, while random behaviors encompass a wider range of possible patterns. Hence, this section demonstrates the effectiveness of features extracted from POPAyI transformations (i.e., probability distribution and transition network) in distinguishing traffic behaviors. These features achieve state-of-the-art results in TMD with fewer parameters than DL approaches. Evaluation is performed on the GeoLife data set (Section V-A), which offers diverse transports for a comprehensive assessment of POPAyI's performance compared to existing methods.

1) Evaluation Metrics:
We compared POPAyI with the state-of-the-art methods on four different transportation sets. Because the original studies lack the experiment descriptions needed for replication, we use the results reported in each study; hence, some metrics are missing where the authors did not report them. We report accuracy, F1 score, recall, and precision. Accuracy is the fraction of correct predictions. The F1 score is the harmonic mean of precision (pre) and recall (rec), defined as F1 = 2 × [(pre × rec)/(pre + rec)], where pre expresses the proportion of positive predictions that were true positives and rec expresses the proportion of actual positives that were correctly identified.
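As a worked instance of the F1 definition, the sketch below derives precision, recall, and F1 from raw counts for a single class. The counts are made up for illustration, and how per-class scores are averaged in the multi-class setting is not specified here.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive, false-negative counts."""
    pre = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * (pre * rec) / (pre + rec) if pre + rec else 0.0
    return pre, rec, f1

# Illustrative counts only: 80 correct detections, 20 false alarms, 10 misses
pre, rec, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
```

With these counts, pre = 80/100, rec = 80/90, and F1 = 160/190, which lies between the two and is pulled toward the lower value, as expected of a harmonic mean.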
In addition, we included other interesting metrics, e.g., the trace size needed for training, the number of features, and the number of parameters for each model. These metrics provide insights into the practicality of each method, especially in resource-constrained environments, such as IoT and federated learning (FL). The ideal method in such scenarios should achieve high scores while using minimal features, parameters, and data size to perform well on limited devices with few computational resources. Therefore, these metrics are crucial for assessing methods' feasibility in real-world applications.
Tables I-III provide the confidence intervals (at a 95% confidence level), indicated in parentheses. These intervals were computed from 1000 bootstrap rounds, with stratified classes and sampling with replacement from the test set.
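A stratified bootstrap of this kind can be sketched as follows. The percentile method used to read off the interval is an assumption, since the text does not state how the bounds were derived from the bootstrap rounds; the function names and toy labels are likewise illustrative.

```python
import random
from collections import defaultdict

def stratified_bootstrap_ci(y_true, y_pred, metric, rounds=1000, level=0.95, seed=0):
    """Percentile confidence interval from class-stratified resampling of the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y_true):
        by_class[label].append(i)
    scores = []
    for _ in range(rounds):
        idx = []
        # Resample with replacement inside each class to keep class sizes fixed
        for members in by_class.values():
            idx.extend(rng.choices(members, k=len(members)))
        scores.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    scores.sort()
    lower = scores[int((1 - level) / 2 * rounds)]
    upper = scores[min(rounds - 1, int((1 + level) / 2 * rounds))]
    return lower, upper

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy labels (made up): two classes, a few misclassifications
y_true = [0] * 50 + [1] * 50
y_pred = [0] * 45 + [1] * 5 + [1] * 40 + [0] * 10
low, high = stratified_bootstrap_ci(y_true, y_pred, accuracy)
```

Stratifying the resampling keeps every bootstrap replicate at the same class proportions as the test set, so rare classes cannot vanish from a replicate.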
2) Data Preprocessing: We apply specific data cleaning and training selection in this experiment to ensure consistency and comparability with the literature. For fixed-length data segments smaller than a threshold, we apply wrapping padding as in [2], duplicating the segment until the desired length is reached. Data cleaning differs across transportation mode sets. For the first set (walk, bike, car&taxi, bus), we remove segments with fewer than ten data points or merge them if consecutive and belonging to the same transport, as in [1]. In the second set (walk, bike, car, bus&taxi, subway, train), we remove abnormal fixed-length segments that exceed the maximum values in the average speed distribution: 10 m/s for walking, 25 m/s for bike, 35 m/s for bus and taxi, 35 m/s for car, 55 m/s for train, and 25 m/s for subway. The third set (walk, bike, car&taxi, bus, train) and fourth set (walk, bike, car&taxi, bus, subway, train) employ thresholds based on speed and acceleration to remove GPS points that exceed these values, as in [5], respectively: 7 m/s and 3 m/s² for walk, 12 m/s and 3 m/s² for bike, 34 m/s and 2 m/s² for bus, 50 m/s and 10 m/s² for car&taxi, and 34 m/s and 3 m/s² for train. Moreover, in the third set, train refers to all railway-based transports, i.e., train, subway, and railways [5].
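Two of the preprocessing steps above lend themselves to a short sketch: wrapping padding and the speed/acceleration filter. The function names are hypothetical, the limits are the speed and acceleration pairs quoted in the text, and comparing the absolute value of the acceleration is an assumption.

```python
def wrap_pad(segment, length):
    """Repeat a short segment from its start until it reaches `length` points."""
    if not segment:
        raise ValueError("cannot pad an empty segment")
    if len(segment) >= length:
        return list(segment)  # already long enough, nothing to pad
    repeats = -(-length // len(segment))  # ceiling division
    return (list(segment) * repeats)[:length]

# (max speed in m/s, max acceleration in m/s^2) per mode, as quoted in the text
LIMITS = {
    "walk": (7.0, 3.0),
    "bike": (12.0, 3.0),
    "bus": (34.0, 2.0),
    "car&taxi": (50.0, 10.0),
    "train": (34.0, 3.0),
}

def within_limits(mode, speed, acceleration):
    """True if a GPS point respects the mode's limits (abs() on acceleration is assumed)."""
    max_speed, max_acc = LIMITS[mode]
    return speed <= max_speed and abs(acceleration) <= max_acc

padded = wrap_pad([1, 2, 3], 7)
```

Wrapping keeps the original ordering of points inside each repetition, so the ordinal patterns inside a repetition are unchanged; only the windows that straddle a seam are artificial.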
3) Number of Features: For DL approaches, the number of features corresponds to the input size. Regarding traditional ML methods, [1] used 13 features related to vehicle motion, e.g., speed change rate and stop rate. Xiao et al. [3] extracted 111 features, including global and local ones. Global features are descriptive statistics for the entire trajectory, such as average and skewness. The local features generated by profile decomposition focus on movement behavior (e.g., percentage of each decomposition class).
POPAyI uses the features described in Section IV-A. We extract 11 features from the probability distribution and transition network and incorporate eight statistical features related to motion: average, standard deviation, minimum, and maximum values of distance and speed time series. Therefore, we use 19 features in POPAyI for all transportation mode sets.
4) Number of Parameters: For DL approaches, the number of parameters is the number of weights. In POPAyI, it is the number of bins in the probability distribution plus the number of edges in the transition network assuming a complete graph, totaling $2D! + \frac{2D!\,(2D! - 1)}{2}$. Traditional ML methods have too few parameters; thus, we do not show them in Table III.
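The parameter count is easy to evaluate for the embedding dimensions considered in Section V-B. The sketch below counts the distribution bins plus the edges of a complete undirected graph over those bins; the function name and the `amplitude_levels` argument are illustrative, with `amplitude_levels=1` standing for POP without amplitude.

```python
import math

def n_parameters(D, amplitude_levels=2):
    """Distribution bins plus edges of a complete undirected graph over the bins."""
    bins = amplitude_levels * math.factorial(D)
    return bins + bins * (bins - 1) // 2

popayi_counts = {D: n_parameters(D) for D in range(3, 7)}
pop_counts = {D: n_parameters(D, amplitude_levels=1) for D in range(3, 7)}
```

For D = 3 this gives 12 + 66 = 78 parameters for POPAyI and 6 + 15 = 21 for POP. For D = 6 the counts reach 1 037 520 and 259 560, i.e., on the order of $10^6$ and $10^5$, which is why D is kept between 3 and 5.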

5) ML Method and Hyperparameter Selection:
While our primary focus is on extracting features from POPAyI, selecting the most suitable ML method is crucial to prevent biases that could impact classification results [41]. Hence, to determine the best POPAyI hyperparameters and select the top-performing ML method for our experiment, we employed a simplified version of Successive Halving [42]. First, we randomly divided the data into training (70%) and test (30%) sets. We evaluated all POPAyI hyperparameters in each ML method using increasing percentages of the training set (20%, 40%, and 80%), excluding the least effective classifier (i.e., the one with the lowest average F1 score) at each step until the optimal ML method was identified. The chosen ML method was then thoroughly evaluated on the entire training set, considering all possible combinations of selected values for D, τ, and q to pinpoint the best hyperparameter combination. All rounds used cross-validation with five folds and stratified classes, ensuring that each fold had the same class distribution and, thus, that none of the classes was over-represented, which could lead to unrealistically high results. This comprehensive approach ensured the robustness of our methodology and optimized experimentation time, especially given the substantial number of hyperparameter combinations involved. For ML methods, we employed support vector machines (SVMs) with a radial basis function (RBF) kernel, decision trees (DTs), XGBoost with 300 trees, and random forest (RF) with 500 trees. Both tree-based ensembles allow unlimited depth until all nodes/leaves contain fewer than two samples, each subtree using a maximum of four features. The POPAyI hyperparameter values are detailed in Section V-B. We used the fourth set since it contains the largest number of transports.
TABLE III. QUANTITATIVE COMPARISON. POPAyI WITH D = 3, τ = 2, AND q = 0.0005
In this process, SVM was the first to be excluded, with an average F1 score approximately 15% lower than the other methods in all hyperparameter combinations. The second round eliminated DT, which achieved an F1 score about 8% lower. In the last round, XGBoost outperformed RF by about 2%.
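The elimination procedure can be sketched as a small loop that drops the worst classifier after each round. This is a simplified illustration, not the authors' code: `score` is a hypothetical callback standing for the average cross-validated F1 over all POPAyI hyperparameter combinations, and the toy scores are placeholders chosen only to reproduce the elimination order reported above.

```python
def successive_halving(classifiers, fractions, score):
    """Drop the lowest-scoring classifier after each round; return the survivor.

    score(name, fraction) -> average F1 of `name` trained on `fraction`
    of the training set (hypothetical callback).
    """
    alive = list(classifiers)
    eliminated = []
    for fraction in fractions:
        if len(alive) == 1:
            break
        ranked = sorted(alive, key=lambda name: score(name, fraction))
        eliminated.append((fraction, ranked[0]))
        alive = ranked[1:]
    return alive[0], eliminated

# Placeholder scores, NOT results from the paper
toy_f1 = {"SVM": 0.60, "DT": 0.70, "RF": 0.80, "XGBoost": 0.82}
winner, eliminated = successive_halving(
    ["SVM", "DT", "RF", "XGBoost"],
    fractions=[0.2, 0.4, 0.8],
    score=lambda name, fraction: toy_f1[name],
)
```

With four candidates and three rounds, exactly one classifier survives, and each round spends a larger share of the training data on fewer candidates.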
Fig. 6 illustrates the systematic search using XGBoost on the entire training set, revealing that two configurations achieved similar results: D = 3, τ = 2, and q = 0.0005, and D = 4, τ = 2, and q = 0.05. Considering the lower number of parameters, we opted for the former.
The unit of q depends on the data set, with the current experiment using kilometers. Therefore, q = 0.0005 represents a threshold of 0.5 m, providing a low but meaningful value that may capture nuances in traffic dynamics, such as stationary or accelerating/decelerating transports or even traffic congestion for road-based transportation. Additionally, the consistent use of τ = 2 in achieving the best results suggests that trajectories exhibit more representative transport dynamics when considering patterns between nonconsecutive points. Yet, the figure also indicates the difficulty of establishing a straightforward correlation between hyperparameter values.
Table I highlights the importance of choosing an appropriate ML method, as different algorithms yield distinct performance levels. SVM performs poorly, with scores about 50% lower than the best methods. DT improves the scores significantly, suggesting a better fit for the task. RF further enhances performance, achieving higher accuracy and F1 score than SVM and DT. Notably, XGBoost emerges as the top-performing ML method, surpassing all other algorithms in all metrics and outperforming RF, in particular, by about 2% in F1 score, precision, and recall.
6) POPAyI's Contribution to Classification: Table II provides valuable insights into the individual contributions of POP and amplitude in TMD for the fourth set, which contains the largest number of transports. We prepared this experiment as described in Section V-D1. Additionally, we evaluated the performance of 1-D OP applied separately to latitude and longitude. For this, we extracted the features of Section IV-A from latitude and longitude separately and applied the XGBoost feature importance method to select the 19 most relevant features, the same number used in POPAyI.
The results presented in the table demonstrate significant performance improvements achieved by both POP and POPAyI compared to 1-D OP. Notably, POP surpasses 1-D OP in all metrics by about 3%. This suggests that adopting the polar form captures mobility behavior more accurately than standard OP while preserving the exact parameter count. Additionally, using POP eliminates the need for time-consuming feature selection, expediting the TMD process.
Furthermore, POPAyI achieves superior performance compared to POP, with about a 2% improvement across all evaluated metrics. Therefore, amplitude information further enhances the accuracy of identifying and distinguishing different patterns of mobility behavior, contributing to a lower false alarm rate and enhancing the detection of positive instances. However, it comes at a higher computational cost since it requires one order of magnitude more parameters. In resource-constrained scenarios where computational efficiency is a priority, using only POP can still yield satisfactory results, albeit with slightly less power than POPAyI.
7) Comparison Results: The advantages of traditional ML methods are their ability to handle data of many sizes and their small number of parameters, but they need more features to achieve better metric results. In the first set, POPAyI surpasses the results reported by [1] by a significant margin: with only six more features, POPAyI achieves an improvement of over 10% in all evaluated metrics. In the second set, POPAyI demonstrates consistency in delivering solid metric results using significantly fewer features than [3], which extracted 111 features. Despite having five times fewer features, POPAyI achieves comparable F1-score and accuracy results, with a difference of around 4%. Moreover, these sets highlight POPAyI's capability to extract meaningful insights from mobility with a compact feature set, outperforming even DL approaches.
In the third set, POPAyI outperforms both a lightweight approach and an ensemble of CNNs regarding parameter efficiency, making it more suitable for resource-constrained scenarios. Despite using a smaller trace size for training, POPAyI achieves comparable classification results with significantly fewer parameters. In addition, compared to only the best CNN in the ensemble, POPAyI improves F1 score and accuracy by about 10%. Furthermore, increasing the trace size enhances the performance of POPAyI, allowing it to achieve similar metric results to [6] and [21] without increasing the number of parameters, which is impossible with DL methods. The ability of POPAyI to utilize longer segments provides richer features, enhancing its effectiveness in TMD.
Finally, POPAyI has a significant advantage over DL approaches regarding parameter efficiency. It consistently achieves competitive results with $10^3$ times fewer parameters compared to the Lightweight CNN method [2]. In the first set, POPAyI achieves similar results to the Lightweight CNN with the same trace size for training (500 points). However, as the trained trace size increases, POPAyI outperforms the Lightweight CNN in metric results, still using $10^3$ times fewer parameters. This trend continues in the fourth set with six different transportation modes, where POPAyI surpasses the F1-score of the Lightweight CNN, even when using 1) the same trace size; 2) $10^3$ times fewer parameters; and 3) 96% fewer features.
8) Confusion Matrix: Fig. 7 illustrates the confusion matrix for the fourth set, containing the largest number of transports. The main diagonal represents true positives, showcasing POPAyI's accuracy in correctly identifying transports and capturing the expected differences in temporal dynamics.
The confusion matrix unveils intriguing misclassification patterns, which could be attributed to various hypotheses. Imbalanced classes might significantly impact this scenario, where having more examples could enhance the classifier's ability to identify traffic dynamics and distinguish between different transports. Generating synthetic trajectories could be a potential solution, although it poses challenges.
Fig. 7. Confusion matrix using XGBoost with 300 trees, for D = 3, τ = 2, and q = 0.0005. Trajectory size of 500 points.
Moreover, particular transportation modes are more prone to inaccurate predictions when compared to others, potentially due to shared traffic characteristics leading to similar temporal dynamics. Instances include bus and car&taxi, as well as subway and bus. Conversely, the confusion matrix indicates that distinctions between motor and nonmotor transports are more evident in their classifications (e.g., walking and biking compared to car&taxi and train) owing to their distinct behaviors in speed and traveled distances.
Additionally, some misclassifications may derive from data limitations, such as subway and walking, which might occur due to a stronger GPS signal at subway stations when the subway stops, creating the impression of walking dynamics (i.e., lower speed).
Addressing these challenges through preprocessing measures, such as cleaning, gap filling, and transfer learning, could improve classification results. However, such approaches may demand more extensive efforts and could conflict with our goal of maintaining a lightweight approach.
Finally, the impact of false negatives and positives varies across intelligent transportation systems (ITSs) applications. For instance, in urban planning studies, combining walking and biking classifications might be acceptable, as both modes share similarities in not following traffic directions strictly. Mixing buses and car&taxi might also be adequate in this scenario. However, safety-critical applications demand minimal false negatives. Failing to detect pedestrians and bicycles among autonomous vehicles could lead to accidents, compromising safety. In such cases, minimizing false negatives between motorized and nonmotorized vehicles is essential.
For POPAyI, we observe that false negatives between motorized and nonmotorized transports are less frequent, which is advantageous for safety-critical applications. Moreover, it produces fewer false positives, providing precise and reliable detection. This is essential in applications seeking to avoid unnecessary interventions, improve efficiency, and enhance user experiences, such as traffic management and planning.
9) Latency: We used a machine with the following configuration: Ubuntu 20.04 OS, 12 × Intel Core i7-10750H CPU @ 2.60 GHz, and 15-GB RAM. In this context, POPAyI takes about 40 s to train and 0.01 s to classify segments containing 1000 points, and, respectively, 20 and 0.005 s for the ones with 500 points. Because of the lack of publicly available reproducible implementations of the other methods, we could not reassess their time complexity. However, as shown in the last column in Table III, for different scenarios, POPAyI needs fewer parameters than DL methods (from 3 to 6 orders of magnitude) but still achieves competitive results. Traditional ML strategies have the disadvantage of needing more features than POPAyI to classify, and extracting them is a burdensome step that requires more effort.

VI. CONCLUSION AND FUTURE DIRECTIONS
We introduced POPAyI, a novel TMD approach based on the OP transformation applied to mobility time series. POPAyI uses polar coordinates in the OP extraction, enabling multivariate analysis while preserving the natural nonlinear mobility aspects. It is also the first work to incorporate amplitude information in a multivariate OP transformation.
Experimental results on two popular mobility data sets showed that POPAyI achieves excellent performance while balancing accuracy, complexity, and the number of features used. Compared to traditional ML methods, POPAyI achieved similar results with significantly fewer features, reducing the feature count by approximately 90 while maintaining comparable performance. Similarly, POPAyI outperformed DL approaches while using 1000 to 10 000 times fewer parameters.
POPAyI Applicability: With its ability to handle diverse trip sizes and its robustness against observational and dynamic noise, POPAyI is a highly suitable choice for real-world transportation scenarios characterized by the complex mobility behavior of individuals, such as urban management and planning. Its lightweight and efficient design makes it particularly well-suited for resource-constrained environments, such as ITS and Smart City applications. By processing data locally on edge devices, POPAyI minimizes data transmission and communication costs, making it ideal for energy-efficient IoT transportation systems by optimizing resource usage. The integration with edge computing enables adaptive decision-making, dynamic resource allocation, and low-latency local processing, ensuring scalability and responsiveness for ITS solutions. Furthermore, when combined with FL frameworks, POPAyI empowers the development of personalized and privacy-preserving services, making it valuable in various business-to-consumer (B2C) FL applications, including personalized healthcare and virtual personal assistants.
Transfer POPAyI's Learning: We recognize the need to enhance POPAyI's generalizability by validating it with mobility data from various cities. We also plan to explore transfer learning techniques to adapt models to cities with distinct transportation characteristics. Additionally, we aim to investigate strategies that can reduce the need for manual labeling of trajectories, such as semi-supervised and weakly supervised learning, as well as FL, which preserves privacy while aggregating information from multiple devices and can achieve state-of-the-art performance with limited labeled data.
OP More Amplitude Levels: We intend to incorporate more amplitude levels into POPAyI to further enhance the distinction between transports by capturing a broader range of speed dynamics. Hence, we aim to investigate the tradeoff between amplitude levels and the algorithm's complexity to find an optimal balance that maximizes the discriminative power of the OP transformation while maintaining computational efficiency.
Data Sampling Resolution: In our future work, we aim to enhance POPAyI by adapting the parameter τ to appropriate data sampling. This approach will address several challenges associated with determining a representative temporal sampling, including the curse of dimensionality, the risk of overlooking important information, increased preprocessing time, and the occurrence of rare patterns in OP transformations. Therefore, this strategy can help enhance the robustness and effectiveness of the OP transformation, leading to more accurate feature extraction and improved classification results.
Leveraging Laws in Human Mobility Behavior: For instance, while the literature often treats private cars and taxis as the same mode, their behaviors differ significantly. Taxis follow routes determined by passenger pick-ups, resulting in mainly random origin-destination patterns. In contrast, private cars follow human mobility laws influenced by drivers' daily circadian habits. Thus, we plan to incorporate new features based on human mobility laws (e.g., few important places and itineraries adapted to traffic conditions) as well as scheduled transportation properties (e.g., fixed and unchangeable itineraries and highly regular stay points) [43]. We anticipate these features will improve accuracy and robustness in TMD.
representing the observed patterns $\pi_m$ and the transitions between two

TABLE I. COMPARISON FOR DIFFERENT ML METHODS

TABLE II. RESULTS FOR DIFFERENT D VALUES FOR POPAyI WITH AND WITHOUT AMPLITUDE (I.E., POP). TRAJECTORY SIZE OF 500 POINTS
walking, 25 m/s for bike, 35 m/s for bus and taxi, 35 m/s for car, 55 m/s for train, and 25 m/s for subway. The third set (walk, bike, car&taxi, bus, train) and fourth set (walk, bike, car&taxi, bus, subway, train) employ thresholds based on speed and acceleration to remove GPS points that exceed these values, as