GPS, Accelerometers, and Tracking Systems: Principles, Accuracy, and Limitations
Prerequisites: This article assumes familiarity with the distinction between external and internal training load. If that distinction is new to you, review an introduction to training load monitoring first.
Learning Objectives
- Explain the working principles, advantages, and disadvantages of the four major player tracking technologies (GPS/GNSS, optical tracking, RFID/UWB, IMU).
- Understand how key metrics — speed, distance, acceleration, Player Load — are derived from GPS and accelerometers, and the caveats in their interpretation.
- Understand validity and reliability as a continuum and assess how measurement error affects practical interpretation.
- Explain the limitations of comparing data across different tracking systems and the need for calibration equations.
- Understand how data quality management (HDoP, filtering, software updates) affects tracking data reliability and apply this knowledge in practice.
Evolution of Player Tracking: From Manual Notation to GNSS
The first objective attempt to quantify player movement in football dates back to 1976, when Tom Reilly used manual coding systems to record the activity profiles of players during matches (Clubb & Murray, 2022). The process was labour-intensive: a single observer tracked a single player for an entire game. Today, entire squads, officials, and the ball can be tracked simultaneously and continuously throughout a match.
This progression was made possible by four distinct tracking technologies. Each operates on a fundamentally different physical principle, and each carries its own set of trade-offs.
| Technology | Operating Principle | Sampling Rate | Key Advantage | Key Limitation |
|---|---|---|---|---|
| OTS | Cameras infer x-y coordinates via computer vision | 25 Hz | No wearable required; ball tracking | Fixed infrastructure; high cost; no vertical data |
| RFID / UWB | Radio signals between anchors and player tags | Up to 100+ Hz | Indoor and outdoor; high precision | Dedicated antenna infrastructure; signal interference |
| GPS / GNSS | Satellite signal timing trilaterates receiver position | 10–18 Hz | Portable; any outdoor venue | Signal affected by environment; outdoor only |
| IMU | Inertial sensors detect tri-axial movement | 1,000+ Hz | Non-locomotor activity; stride detail | Not a position tracker; arbitrary units |
Optical Tracking Systems (OTS) use cameras mounted around the stadium to reconstruct two-dimensional movement from video. Machine learning and computer vision algorithms identify each player and derive positional coordinates. TRACAB Gen5 has demonstrated position accuracy of 0.08 m RMSE against a VICON reference system (Linke et al., 2020). The limitation is mobility: cameras are fixed, making OTS impractical for training venues. Processing time can also reach 24–36 hours for semi-automated systems (Clubb & Murray, 2022).
Radio Frequency Identification (RFID) and Ultra-Wideband (UWB) systems calculate position from signal transit times between fixed anchor nodes and player-worn transmitters. UWB uses a bandwidth exceeding 500 MHz, offering low power consumption, high precision, and resistance to interference. Position estimation error ranges from 11.9 to 23.4 cm, though error increases at higher speeds (Murray & Clubb, 2022). These systems also require dedicated infrastructure and are not portable.
Global Navigation Satellite Systems (GNSS) — including the US GPS constellation (24 satellites) and Russia’s GLONASS (24 satellites) — determine position by comparing the arrival time of low-power radio signals from at least four satellites. Devices are typically worn in a vest between the shoulder blades. Early sport-specific GPS units sampled at 1 Hz; current devices operate at 10 Hz or higher (Clubb & Murray, 2022). The advantage is portability. The disadvantage is environmental sensitivity — signal quality degrades near tall buildings, under stadium roofs, and in adverse weather.
Inertial Measurement Units (IMU) contain Microelectromechanical Systems (MEMS) housing a tri-axial accelerometer, gyroscope, and magnetometer. These sensors measure movement along three axes at rates exceeding 1,000 Hz, providing data on stride characteristics, asymmetry, and impact forces that positional systems alone cannot detect (Clubb & Murray, 2022). IMUs are typically embedded in the same wearable device as the GPS receiver.
The choice of system depends on the environment, budget, and the type of information required. In professional football, it is common for a club to use OTS during matches (often mandated by the league) and GPS during training. The same player may be tracked by two or three different systems within a single week (Buchheit & Simpson, 2017). This routine practice raises important questions about data compatibility, addressed in later sections.
How GPS Works: From Satellite Signals to Speed and Distance
Understanding how GPS converts satellite signals into usable metrics is essential for interpreting the data it produces. Two methods are used to derive speed and distance, and they differ in accuracy.
The Doppler-shift method estimates speed by measuring the frequency change of the satellite signal caused by the receiver’s movement. This method offers higher precision than the alternative (Clubb & Murray, 2022).
Positional differentiation calculates distance by differencing successive position fixes. Because each position fix carries its own error, cumulative distance measurement inherits that error. In practice, team sport GPS devices typically use the Doppler-shift method for speed and positional differentiation for distance (Clubb & Murray, 2022).
Acceleration is computed as the first derivative of speed — and therefore the second derivative of displacement. Each differentiation step amplifies the noise in the underlying signal. This compounding of error is a foundational limitation: acceleration and deceleration measurements carry substantially more error than speed or distance (Murray & Clubb, 2022).
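The compounding of error can be demonstrated with a minimal one-dimensional simulation. The noise level, speed, and sampling rate below are illustrative values, not measurements from any real device:

```python
import math
import random

def differentiate(series, dt):
    """First-order difference approximation of the derivative."""
    return [(b - a) / dt for a, b in zip(series, series[1:])]

def sd(xs):
    """Population standard deviation."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

random.seed(1)
dt = 0.1  # 10 Hz sampling
# Simulated 1-D position trace: constant 5 m/s plus 0.05 m of position noise
true_pos = [5.0 * i * dt for i in range(200)]
noisy_pos = [p + random.gauss(0, 0.05) for p in true_pos]

speed = differentiate(noisy_pos, dt)  # first derivative of position
accel = differentiate(speed, dt)      # second derivative of position

# The true speed is constant (5 m/s) and true acceleration is zero, so any
# spread is pure noise; it grows by an order of magnitude per differentiation.
print(sd(speed), sd(accel))
```

Differencing a position trace with a 0.05 m noise floor at 10 Hz yields roughly 0.7 m/s of speed noise, and differencing again yields acceleration noise in the tens of m/s², which is why acceleration metrics demand aggressive filtering.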
Sampling Frequency
Higher sampling rates improve accuracy, particularly for rapid changes in speed and direction. A minimum of 10 Hz is standard practice. However, some manufacturers sample at a lower native frequency and interpolate to a higher reported frequency using accelerometer data. Practitioners should verify the native sampling rate before purchasing (Clubb & Murray, 2022).
An 18 Hz device accessing only US GPS satellites and a 10 Hz multi-constellation GNSS device accessing both GPS and GLONASS are not directly comparable on frequency alone (Murray & Clubb, 2022).
Signal Quality: HDoP and Environmental Factors
Horizontal Dilution of Precision (HDoP) quantifies the geometric quality of satellite positions relative to the receiver. An HDoP value of 1 is ideal; values below 2 are excellent; values above 20 indicate poor signal quality (Murray & Clubb, 2022). The number of connected satellites and environmental obstructions — tall stands, roofing, dense tree cover — directly affect HDoP.
Even when HDoP remains below exclusion thresholds, momentary satellite signal losses can introduce irregularities in speed and acceleration traces. Practitioners should monitor HDoP and satellite count for every session and document any data exclusion criteria applied (Murray & Clubb, 2022).
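A session-screening routine might apply the HDoP bands above together with an in-house satellite-count criterion. The minimum satellite count and the sample log below are illustrative choices, not published standards:

```python
# Hypothetical per-sample quality log: (HDoP, connected satellite count)
quality_log = [(0.9, 12), (1.4, 11), (2.8, 8), (21.3, 4), (1.1, 10)]

HDOP_EXCLUDE = 20   # above this, positional quality is considered poor
MIN_SATELLITES = 6  # example in-house criterion

def flag_sample(hdop, sats):
    """Return a documented inclusion decision for one sample."""
    if hdop > HDOP_EXCLUDE or sats < MIN_SATELLITES:
        return "exclude"
    if hdop < 2:
        return "excellent"
    return "usable"

flags = [flag_sample(h, s) for h, s in quality_log]
print(flags)  # one auditable decision per sample
```

Logging the flag alongside the raw HDoP and satellite count preserves the audit trail that the data-exclusion documentation requires.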
Accelerometers and IMU: Measuring the Invisible Movements
While GPS captures locomotor activity across the pitch, accelerometers within the IMU capture a different dimension of movement. A tri-axial accelerometer measures acceleration along three orthogonal axes by detecting the displacement of a seismic mass between electrodes via capacitance changes (Clubb & Murray, 2022).
Player Load
The most widely reported accelerometer-derived metric is Player Load — a scaled vector magnitude representing the instantaneous rate of change of acceleration across all three axes. It is expressed in arbitrary units and is proprietary: different manufacturers label equivalent constructs as Body Load or Accumulation Load (Clubb & Murray, 2022).
The term “load” in this context does not conform to mechanical definitions (force applied to a body). Staunton et al. (2022) argued that this naming convention violates fundamental mechanical principles, and the lack of algorithm transparency compounds the problem. The FITT-VP framework (frequency, intensity, time, type, volume, progression) has been proposed as a more principled alternative for describing training demands.
Within-device and between-device reliability for accelerometer-based load metrics has been reported as acceptable (CV = 0.91–1.9%) in both laboratory and football simulation contexts (Murray & Clubb, 2022). However, standardisation across manufacturers is absent. Player Load values from one system cannot be compared to those from another.
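Although vendor algorithms are proprietary, one commonly published variant of the calculation accumulates the vector magnitude of sample-to-sample acceleration changes, scaled by an arbitrary divisor. Treat the divisor of 100 below as an assumption about one vendor's scaling choice, not a universal definition:

```python
import math

def player_load(ax, ay, az):
    """Accumulated rate of change of tri-axial acceleration (arbitrary units).

    One published variant: sum over samples of
    sqrt(dax^2 + day^2 + daz^2) / 100, where the divisor is a scaling choice.
    """
    total = 0.0
    for t in range(1, len(ax)):
        dax = ax[t] - ax[t - 1]
        day = ay[t] - ay[t - 1]
        daz = az[t] - az[t - 1]
        total += math.sqrt(dax * dax + day * day + daz * daz) / 100
    return total

# Constant acceleration contributes nothing; only change accumulates
print(player_load([1, 1, 1], [0, 0, 0], [0, 0, 0]))  # → 0.0
```

The structure makes the interpretation caveat concrete: the metric rewards any change in acceleration on any axis equally, regardless of its mechanical cost to the athlete.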
Placement and Postural Limitations
Wearable devices sit between the scapulae, not at the body’s centre of mass. This positioning introduces measurement bias that varies with posture. In field hockey, players spend approximately 89% of match time in a flexed trunk position of 20–90 degrees, altering accelerometer orientation relative to the direction of movement (Murray & Clubb, 2022).
Beyond Load: The Three-Level Metric Classification
Buchheit & Simpson (2017) proposed a three-level classification of tracking metrics that clarifies the relationship between data source and measurement quality.
Level 1 variables are distances covered in various speed zones. They are derivable from any positional technology and carry the highest validity and reliability.
Level 2 variables involve acceleration, deceleration, and change-of-direction events. They are available from most technologies but with lower reliability. Acceleration values are sensitive to the time window (0.2–0.8 s) and signal filtering technique applied, with no consensus on optimal settings (Buchheit & Simpson, 2017).
Level 3 variables are derived exclusively from inertial sensors — stride characteristics, ground reaction force estimates (Force Load), and bilateral asymmetry. These variables hold particular promise because they are independent of tactical context. A player’s stride profile does not change with formation or scoreline, making Level 3 metrics more sensitive to fitness and fatigue changes than Level 1 or 2 variables (Buchheit & Simpson, 2017).
Key Metrics Derived from Tracking Data
Speed-Zone Distances
Total distance and distance covered in predefined speed zones are the most frequently reported external load metrics. A systematic review of 82 studies found that speed-zone distances appeared in 50 studies and total distance in 47 (Miguel et al., 2021).
Speed-zone definitions are inconsistent across the literature. High-Speed Running (HSR) thresholds in football range from 19.8 to 25.1 km/h, while sprint distance is more consistently defined at >25.2 km/h (Rice et al., 2023). Units also vary by region — m/s, km/h, mph — and manufacturers set different default thresholds in their software. Two practitioners using different brands could classify the same movement differently (Clubb & Murray, 2022; Miguel et al., 2021).
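The classification problem can be made concrete with two hypothetical zone configurations spanning the HSR threshold range reported above (the configurations are illustrative, not actual vendor defaults):

```python
# Two hypothetical zone configurations (km/h) reflecting the reported
# 19.8-25.1 km/h spread in HSR thresholds
ZONES_A = {"HSR": 19.8, "sprint": 25.2}
ZONES_B = {"HSR": 25.1, "sprint": 25.2}

def classify(speed_kmh, zones):
    """Assign a speed sample to the highest zone whose threshold it meets."""
    if speed_kmh >= zones["sprint"]:
        return "sprint"
    if speed_kmh >= zones["HSR"]:
        return "HSR"
    return "low"

# The same 22 km/h effort is HSR under one configuration, low under another
print(classify(22.0, ZONES_A), classify(22.0, ZONES_B))
```

Any distance a player accumulates between 19.8 and 25.1 km/h counts as high-speed running in one report and low-intensity running in the other, which is why zone definitions must travel with the data.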
Acceleration and Deceleration: The Individualisation Problem
Acceleration and deceleration events are typically classified using Arbitrary Absolute Thresholds (AAT), most commonly above 3 m/s² for acceleration and below −3 m/s² for deceleration. These fixed thresholds do not account for individual differences in capacity.
The average maximal acceleration capacity (ACCmax) and maximal deceleration capacity (DECmax) of football players exceed the commonly used 3 m/s² and -3 m/s² thresholds by approximately 67% and 133%, respectively (Pimenta et al., 2026). Acceleration capacity is inversely related to starting speed: at 14.4 km/h, a player may still produce 4–5 m/s², but at 23 km/h, maximal acceleration drops to barely above 2 m/s². An AAT of 3 m/s² classifies the latter effort as low intensity despite it being near-maximal for the player at that speed (Pimenta et al., 2026).
DECmax is typically greater than ACCmax and less dependent on initial speed, which makes symmetric thresholds physiologically inappropriate (Pimenta et al., 2026). A percentage-based normalisation — for example, high intensity at >75% of individual maximum — produces more physiologically valid classifications. GPS device variability of up to 56% between units further exacerbates misclassification when using AAT, whereas individualised thresholds, which set higher absolute values, are inherently more robust to this device noise (Pimenta et al., 2026).
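The difference between the two classification approaches can be sketched directly. The effort and capacity values below are invented for illustration, and the 75% cut-off follows the percentage-based example in the text:

```python
def classify_accel_aat(accel, threshold=3.0):
    """Arbitrary Absolute Threshold: fixed cut-off for every player."""
    return "high" if accel >= threshold else "low"

def classify_accel_individual(accel, acc_max, high_pct=0.75):
    """Percentage-based cut-off relative to individual capacity at that speed."""
    return "high" if accel >= high_pct * acc_max else "low"

# A 2.4 m/s^2 effort from a player whose maximal acceleration at that
# starting speed is only 2.6 m/s^2 (i.e. a near-maximal effort):
effort, acc_max = 2.4, 2.6
print(classify_accel_aat(effort))                      # fixed threshold
print(classify_accel_individual(effort, acc_max))      # individualised
```

The fixed threshold labels the effort low intensity; the individualised threshold recognises it as over 90% of the player's capacity at that speed.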
Metabolic Power
Metabolic power attempts to estimate instantaneous energy expenditure (W/kg) by combining speed and acceleration. The concept is appealing because it could unify locomotor demands into a single energy-based metric. However, GPS-derived metabolic power diverges substantially from actual metabolic demand measured via indirect calorimetry: it overestimates cost during walking and underestimates cost during shuttle runs and sport-specific circuits (Buchheit & Simpson, 2017). In football, where a significant proportion of high-intensity activity is non-locomotor — tackles, jumps, contested duels — metabolic power systematically underestimates true energy cost (Clubb & Murray, 2022).
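The underlying calculation can be sketched using the equivalent-slope idea from the metabolic power literature: accelerated running on flat ground is treated as constant-speed running up an equivalent slope. The fifth-order polynomial for energy cost below follows a widely cited formulation, but treat its exact coefficients as an assumption rather than any specific vendor's implementation:

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def metabolic_power(speed, accel):
    """Estimate instantaneous metabolic power (W/kg) from speed and acceleration.

    Equivalent-slope approach: the polynomial coefficients are the commonly
    cited formulation and should be treated as an assumption.
    """
    es = accel / G                 # equivalent slope
    em = math.sqrt(es ** 2 + 1)    # equivalent mass factor
    # Energy cost of running (J/kg/m) as a polynomial function of slope
    ec = (155.4 * es**5 - 30.4 * es**4 - 43.3 * es**3
          + 46.3 * es**2 + 19.5 * es + 3.6) * em
    return ec * speed              # power = cost per metre * metres per second

# Constant-speed running at 4 m/s costs about 3.6 J/kg/m (14.4 W/kg);
# the same speed with 2 m/s^2 of acceleration costs far more
print(metabolic_power(4.0, 0.0), metabolic_power(4.0, 2.0))
```

Note what the formula cannot see: a tackle or jump at low speed contributes almost nothing to the estimate, which is the mechanism behind the systematic underestimation described above.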
The Paradox of Metric Importance
A recurring theme is what Buchheit & Simpson (2017) described as a fundamental paradox: “the variables deemed most important are likely the least useful.” Total distance and low-speed metrics are measured with high validity and reliability. High-speed running, acceleration, deceleration, and metabolic power — the variables practitioners prioritise — carry the highest measurement error. This does not mean these metrics should be abandoned, but they must be interpreted with an appropriate margin of uncertainty.
How Accurate Are Tracking Systems?
Validity and reliability are not binary properties. A system is not simply “valid” or “invalid.” Both exist on a continuum that varies by metric, speed, context, and device (Murray & Clubb, 2022; Varley et al., 2022).
Validity Across Technologies
When TRACAB Gen5 was validated against VICON motion capture, position RMSE was 0.08 m, speed RMSE was 0.08 m/s, and all key performance indicators showed trivial deviations from the criterion (Linke et al., 2020). Testing occurred under optimal conditions — a professional stadium with controlled lighting and no dense player clusters — so real match error rates may be higher.
For GPS, distance and average speed measurements are generally accurate, but error increases as the rate of speed change rises. This pattern is consistent across all positional technologies (Murray & Clubb, 2022).
Reliability: The Signal-to-Noise Problem
Reliability determines whether an observed change reflects a real change in the athlete or merely measurement noise. A comparison of three 10 Hz GPS manufacturers found distance and speed measurements highly reliable (CV: 0.2–5.5%), while deceleration was the least reliable (CV: 2.5–72.8%) (Murray & Clubb, 2022). The range of that deceleration CV illustrates how dramatically reliability differs between manufacturers for the same variable.
GPS inter-unit variability (between devices of the same brand) can reach 50%, reinforcing the best practice of assigning the same device to the same player across all sessions (Buchheit & Simpson, 2017). When comparing data between players, larger thresholds for meaningful change must be applied to account for this inter-unit noise.
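A simple noise-band check shows how CV feeds into practical interpretation. The multiplier k and the example CVs below are illustrative choices a practitioner would set from their own validation work, not published constants:

```python
def noise_band(value, cv_pct, k=1.5):
    """Half-width of a noise band around a measurement.

    cv_pct is the device CV for this metric; k is an arbitrary multiplier
    reflecting how conservative the practitioner chooses to be.
    """
    return value * (cv_pct / 100) * k

def is_meaningful(old, new, cv_pct, k=1.5):
    """Does the observed change exceed the measurement noise band?"""
    return abs(new - old) > noise_band(old, cv_pct, k)

# 500 m of HSR one week vs 540 m the next, assuming a 5% CV for HSR distance:
print(is_meaningful(500, 540, cv_pct=5))   # 40 m change vs 37.5 m band
# The same relative change in deceleration count with a 30% CV:
print(is_meaningful(50, 54, cv_pct=30))    # 4 events vs a 22.5-event band
```

With a 30% CV, even large swings in deceleration counts are indistinguishable from device noise, which is exactly the signal-to-noise problem described above.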
Between-Systems Agreement
Professional clubs frequently use different systems for training and matches. When GPS and a multi-camera optical system were compared during official matches, total distance and low-speed metrics showed strong agreement (ICC > 0.90, CV < 5%), but high-speed metrics diverged: CV reached 13.5% for low-intensity sprinting and 14.9% for high-intensity sprinting (Pons et al., 2019).
Regression equations were derived to convert between systems, but they are specific to the manufacturer, model, and software version tested (Pons et al., 2019). Aggregated metrics from different tracking systems cannot be directly compared. Combining training GPS data with match OTS data without calibration introduces systematic bias that may exceed the magnitude of the change practitioners are trying to detect (Murray & Clubb, 2022).
When system integration is necessary, practitioners should derive their own calibration equations using simultaneous data collection in their own environment.
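Deriving an in-house calibration equation amounts to fitting a regression on paired values collected simultaneously by both systems. A minimal ordinary-least-squares sketch, with invented paired HSR distances standing in for real simultaneous data:

```python
def fit_calibration(system_a, system_b):
    """Ordinary least-squares line mapping system A values onto system B."""
    n = len(system_a)
    mean_a = sum(system_a) / n
    mean_b = sum(system_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(system_a, system_b))
    var = sum((a - mean_a) ** 2 for a in system_a)
    slope = cov / var
    intercept = mean_b - slope * mean_a
    return slope, intercept

# Hypothetical paired HSR distances (m): GPS (A) vs optical tracking (B),
# collected during the same sessions
gps = [420, 510, 380, 600, 455]
ots = [455, 549, 417, 638, 489]

slope, intercept = fit_calibration(gps, ots)

def convert(gps_value):
    """Express a training (GPS) value on the match (OTS) scale."""
    return slope * gps_value + intercept
```

The resulting equation is valid only for the metric, devices, and software versions used in the collection, mirroring the specificity caveat from Pons et al. (2019).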
Technology Outpaces Validation
Technology adoption consistently outpaces validation research. Manufacturers release new chipsets, firmware, and software faster than independent researchers can evaluate them (Murray & Clubb, 2022). This gap places responsibility on practitioners to conduct in-house validation — replicating published methods in their own setting to establish context-specific confidence in their data.
Trusting Your Data: Quality Management and Workflow
The path from raw satellite signal to a reported metric involves multiple processing steps. Understanding and controlling these steps separates trustworthy data from misleading numbers.
Filtering: The Invisible Processing Layer
Raw GPS data contains high-frequency noise. Filtering algorithms — Kalman, Butterworth, exponential, median, Fourier — smooth this noise to produce usable speed and acceleration traces. The choice of filter, its parameters, and the order of application all influence the output (Varley et al., 2022).
Manufacturers typically treat their filtering algorithms as intellectual property and do not disclose them. In some cases, practitioners cannot access the raw data at all. This opacity limits independent verification and reproducibility (Varley et al., 2022).
Software Updates Change Your Data
A documented case illustrates the impact: after a GPS software update, the number of acceleration events dropped from 251 to 177 and deceleration events from 181 to 151 — for the same session data (Varley et al., 2022). The athletes did not change. The training did not change. The filter changed.
Longitudinal monitoring becomes unreliable if a software update occurs between measurement points. Practitioners should document every update, compare pre- and post-update outputs on the same data where possible, and avoid updating mid-season unless necessary (Varley et al., 2022).
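The mechanism behind such shifts can be reproduced with a toy smoothing filter: change one filter parameter and the event count changes for identical input. The trace, smoothing constants, and threshold below are all invented for illustration:

```python
def ewma(xs, alpha):
    """Exponential smoothing; alpha plays the role of a vendor filter setting."""
    out = [xs[0]]
    for x in xs[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out

def count_events(accel, threshold=3.0):
    """Count samples crossing the threshold from below (event onsets)."""
    return sum(1 for a, b in zip(accel, accel[1:]) if a < threshold <= b)

# A synthetic noisy acceleration trace (m/s^2); values invented
trace = [0.5, 2.9, 3.4, 2.8, 3.6, 1.0, 0.4, 3.1, 2.9, 3.2, 0.8]

light = count_events(ewma(trace, alpha=0.9))  # light smoothing
heavy = count_events(ewma(trace, alpha=0.3))  # heavy smoothing

# Same athletes, same session, different filter: different event counts
print(light, heavy)
```

Heavier smoothing flattens brief threshold crossings, so the same session reports fewer acceleration events, which is precisely what a silent filter update does to a longitudinal dataset.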
Building a Reproducible Workflow
A reproducible workflow allows a different person, given the same input data, to arrive at the same output. This requires documenting every step: which data were included, which were excluded (and why), what filtering was applied, what thresholds were used, and how outliers were handled (Varley et al., 2022).
A data dictionary — defining every variable, its units, naming conventions, and coding rules — should be established before data collection begins. This is especially important when multiple staff members collect or process data (Varley et al., 2022).
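A data dictionary can be enforced in code rather than living only in a document. The field names, units, and coding rules below are illustrative examples of the kind of entries a club might define, not a published standard:

```python
# Minimal data dictionary established before collection begins
DATA_DICTIONARY = {
    "total_distance": {"unit": "m", "type": float,
                       "rule": "Doppler-derived, per session"},
    "hsr_distance":   {"unit": "m", "type": float,
                       "rule": "speed >= 19.8 km/h"},
    "hdop_mean":      {"unit": "dimensionless", "type": float,
                       "rule": "exclude session if > 20"},
    "device_id":      {"unit": None, "type": str,
                       "rule": "same unit per player all season"},
}

def validate(record):
    """Reject records whose fields are undeclared or of the wrong type."""
    for key, value in record.items():
        if key not in DATA_DICTIONARY:
            raise KeyError(f"undocumented variable: {key}")
        expected = DATA_DICTIONARY[key]["type"]
        if not isinstance(value, expected):
            raise TypeError(f"{key} must be {expected.__name__}")
    return True

print(validate({"total_distance": 10234.5, "device_id": "unit-07"}))
```

Routing every incoming record through a validator like this catches undocumented variables at the point of entry, which matters most when multiple staff members handle the data.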
Three Conditions for Trustworthy Information
Torres Ronda (2022) proposed that information provided by technology must satisfy three conditions before it can support decision-making. It must be reliable (measurement error is understood and acceptable), manageable (the practitioner can maintain data quality), and it must contribute to the decision-making process (it changes what a practitioner would do). If any condition is not met, the technology adds cost without benefit.
This framing returns to the core thesis. The foundation of successful monitoring is not the technology itself but the practitioner’s understanding of how data are collected, what each variable’s limitations are, and how information is reported and applied (Buchheit & Simpson, 2017). Tracking systems measure only external training load. Integrating these data with internal training load measures is necessary for a complete picture of training stimulus and response (Impellizzeri et al., 2019).
Key Takeaways
- GPS/GNSS, optical tracking, RFID/UWB, and IMU each operate on distinct principles with distinct trade-offs; the appropriate system depends on the environment, budget, and required information level.
- In GPS, speed is derived via Doppler shift and distance via positional differentiation; acceleration is a derivative of speed that compounds error — the variables deemed most important are measured least reliably.
- Validity and reliability exist on a continuum rather than as binary categories; technology adoption always outpaces validation research, making in-house validation in the practitioner’s own environment essential.
- Aggregated metrics from different tracking systems cannot be directly compared; calibration equations are specific to manufacturer, model, and software version.
- Software and firmware updates, filtering techniques, and HDoP directly influence metric output; managing update timing and building reproducible, documented workflows are essential for trustworthy longitudinal monitoring.
References
- Buchheit, M., & Simpson, B. M. (2017). Player-Tracking Technology: Half-Full or Half-Empty Glass? International Journal of Sports Physiology and Performance, 12(Suppl 2), S2-35-S2-41. https://doi.org/10.1123/ijspp.2016-0499
- Clubb, J., & Murray, A. M. (2022). Characteristics of tracking systems and load monitoring. In D. N. French & L. Torres Ronda (Eds.), NSCA’s Essentials of Sport Science. Human Kinetics.
- Impellizzeri, F. M., Marcora, S. M., & Coutts, A. J. (2019). Internal and External Training Load: 15 Years On. International Journal of Sports Physiology and Performance, 14(2), 270-273. https://doi.org/10.1123/ijspp.2018-0935
- Linke, D., Link, D., & Lames, M. (2020). Football-specific validity of TRACAB’s optical video tracking systems. PLOS ONE, 15(3), e0230179. https://doi.org/10.1371/journal.pone.0230179
- Miguel, M., Oliveira, R., Loureiro, N., García-Rubio, J., & Ibáñez, S. J. (2021). Load Measures in Training/Match Monitoring in Soccer: A Systematic Review. International Journal of Environmental Research and Public Health, 18(5), 2721. https://doi.org/10.3390/ijerph18052721
- Murray, A. M., & Clubb, J. (2022). Analysis of tracking systems and load monitoring. In D. N. French & L. Torres Ronda (Eds.), NSCA’s Essentials of Sport Science. Human Kinetics.
- Pimenta, R., Antunes, H., Silva, H., Ribeiro, J., & Nakamura, F. Y. (2026). The Need for GPS Data to be Normalized for Performance and Fatigue Monitoring in Soccer: Considerations for Accelerations and Decelerations. Strength & Conditioning Journal. https://doi.org/10.1519/ssc.0000000000000958
- Pons, E., García-Calvo, T., Resta, R., Blanco, H., López Del Campo, R., Díaz García, J., & Pulido, J. J. (2019). A comparison of a GPS device and a multi-camera video technology during official soccer matches: Agreement between systems. PLOS ONE, 14(8), e0220729. https://doi.org/10.1371/journal.pone.0220729
- Rice, J., Kovacevic, D., Calder, A., & Carter, J. (2023). Wearable technology. In A. Calder & A. Centofanti (Eds.), Peak performance for soccer: The elite coaching and training manual. Routledge.
- Staunton, C. A., Abt, G., Weaving, D., & Wundersitz, D. W. (2022). Misuse of the term ‘load’ in sport and exercise science. Journal of Science and Medicine in Sport, 25(5), 439-444. https://doi.org/10.1016/j.jsams.2021.08.013
- Torres Ronda, L. (2022). Technological implementation. In D. N. French & L. Torres Ronda (Eds.), NSCA’s Essentials of Sport Science. Human Kinetics.
- Varley, M. C., Lovell, R., & Carey, D. (2022). Data hygiene. In D. N. French & L. Torres Ronda (Eds.), NSCA’s Essentials of Sport Science. Human Kinetics.