Autonomous vehicles generate unprecedented amounts of data, with sensors such as cameras and LiDAR producing up to 14 terabytes per day. This influx poses a critical challenge: current vehicle storage systems are inadequate, often limited to small, purpose-specific loggers that cannot support the flexible, queryable storage needed for emerging applications like traffic analysis, safety forensics, and predictive maintenance. Without efficient onboard storage, the vision of vehicles as mobile computing platforms stalls, because continuous data offload over cellular networks is impractical given cost and bandwidth constraints. A new approach is needed to unlock the potential of vehicle data without overwhelming hardware or budgets.
The researchers developed AVS, an Autonomous Vehicle Storage system that sharply reduces storage footprint while maintaining data utility. Voxel downsampling at a 0.2-meter resolution cuts LiDAR data to 24% of its original size while preserving enough geometric fidelity for tasks like odometry. For images, perceptual-hash deduplication removes visually similar frames and retains 72% of them, with minimal impact on object tracking accuracy. Compression schemes such as LAZ for LiDAR and JPEG at quality 95 for images shrink the data further, achieving compression ratios of 6.56 and 4.06 respectively. Overall, compared to raw logging baselines, AVS reduces storage by an average of 8.4 times, making continuous data retention feasible on constrained edge devices.
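To make the LiDAR reduction step concrete, here is a minimal sketch of voxel downsampling in plain Python. It assumes each occupied voxel is replaced by the centroid of its points; the summary does not say whether AVS keeps centroids, voxel centers, or a representative point, and the function name `voxel_downsample` is hypothetical.

```python
import math
from typing import Iterable

Point = tuple[float, float, float]


def voxel_downsample(points: Iterable[Point], voxel_size: float = 0.2) -> list[Point]:
    """Collapse all points that share a voxel into their centroid.

    Assumption: centroid-per-voxel. The 0.2 m default mirrors the
    resolution quoted in the article; everything else is illustrative.
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")

    # Map integer voxel coordinates -> [sum_x, sum_y, sum_z, count].
    cells: dict[tuple[int, int, int], list[float]] = {}
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        acc = cells.setdefault(key, [0.0, 0.0, 0.0, 0])
        acc[0] += x
        acc[1] += y
        acc[2] += z
        acc[3] += 1

    # One output point per occupied voxel, in first-seen order.
    return [(sx / n, sy / n, sz / n) for sx, sy, sz, n in cells.values()]


if __name__ == "__main__":
    cloud = [(0.01, 0.01, 0.01), (0.05, 0.05, 0.05), (1.0, 1.0, 1.0)]
    print(voxel_downsample(cloud))
```

A larger voxel size discards more points and saves more space, at the cost of geometric detail; that trade-off is what the fixed 0.2-meter setting pins down.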
The methodology behind AVS combines in-line computation with a hierarchical storage architecture, designed together. The system operates as a storage sidecar alongside the vehicle's autonomy stack, avoiding interference with safety-critical functions. It processes data in real time through modality-aware reduction and compression, then organizes it into hot and cold tiers: an SSD for recent, frequently accessed data and an HDD for long-term archival. Key components include lightweight metadata indexing using SQLite for efficient queries, and filesystem optimization with XFS to minimize fragmentation and improve throughput. The prototype was implemented on a Raspberry Pi 5 with real L4 autonomous driving traces, validating the design under realistic embedded constraints.
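The metadata index and hot/cold tiers described above can be sketched with Python's built-in sqlite3 module. The schema, table name, column names, and helper functions below are all hypothetical, chosen for illustration; the summary only states that SQLite is used for metadata indexing, not how AVS lays out its tables.

```python
import sqlite3

# Hypothetical schema: one row per stored sample, with its current tier.
SCHEMA = """
CREATE TABLE IF NOT EXISTS samples (
    id     INTEGER PRIMARY KEY,
    sensor TEXT    NOT NULL,
    ts_ns  INTEGER NOT NULL,
    tier   TEXT    NOT NULL CHECK (tier IN ('hot', 'cold')),
    path   TEXT    NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_samples_sensor_ts ON samples (sensor, ts_ns);
"""


def open_index(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the metadata index."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def record_sample(conn: sqlite3.Connection, sensor: str, ts_ns: int, path: str) -> None:
    """Register a newly written sample; new data always starts in the hot tier."""
    with conn:
        conn.execute(
            "INSERT INTO samples (sensor, ts_ns, tier, path) VALUES (?, ?, 'hot', ?)",
            (sensor, ts_ns, path),
        )


def demote_older_than(conn: sqlite3.Connection, cutoff_ns: int) -> int:
    """Mark hot samples older than cutoff_ns as cold and return how many changed.

    Only the index is updated here; moving the files from SSD to HDD is
    left out of this sketch.
    """
    with conn:
        cur = conn.execute(
            "UPDATE samples SET tier = 'cold' WHERE tier = 'hot' AND ts_ns < ?",
            (cutoff_ns,),
        )
    return cur.rowcount


def query_range(conn: sqlite3.Connection, sensor: str, start_ns: int, end_ns: int):
    """Return (ts_ns, tier, path) for one sensor within an inclusive time range."""
    return conn.execute(
        "SELECT ts_ns, tier, path FROM samples "
        "WHERE sensor = ? AND ts_ns BETWEEN ? AND ? ORDER BY ts_ns",
        (sensor, start_ns, end_ns),
    ).fetchall()
```

The composite index on sensor and timestamp is what keeps time-range lookups cheap, since an application typically asks for one sensor over one time window.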
Results from the prototype demonstrate robust performance across multiple metrics. Ingest latency remains within real-time budgets, with p99 latencies of 29.71 ms for images, 60.99 ms for LiDAR, and 1.58 ms for GPS, inside the 100 ms and 20 ms sample periods implied by the 10 Hz and 50 Hz sensor rates. Retrieval is fast, with time-to-first-byte around 29 ms for images and 23 ms for LiDAR, enabling quick access for applications. Storage efficiency is highlighted by comparisons with ros2bag baselines: AVS uses only 4.0 GB on average for three days of data, versus 33.8 GB for raw logging, while maintaining downstream task accuracy, as shown in the paper's Table 2 for LiDAR odometry and Table 4 for image tracking. The hierarchical design also ensures efficient archival, with XFS on HDD achieving zero fragmentation and high sequential throughput.
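The arithmetic behind these figures is easy to check. The sketch below converts sensor rates into per-sample time budgets and computes the storage reduction from the quoted totals. The pairing of sensors to rates (images and LiDAR at 10 Hz, GPS at 50 Hz) is an assumption made here so the numbers line up; the summary does not state it explicitly.

```python
def period_ms(rate_hz: float) -> float:
    """Time between consecutive samples, in milliseconds."""
    return 1000.0 / rate_hz


def reduction_factor(raw_gb: float, stored_gb: float) -> float:
    """How many times smaller the stored data is than the raw log."""
    return raw_gb / stored_gb


# p99 ingest latencies quoted in the article (ms), paired with ASSUMED rates (Hz).
P99_MS = {"image": 29.71, "lidar": 60.99, "gps": 1.58}
ASSUMED_RATE_HZ = {"image": 10, "lidar": 10, "gps": 50}

if __name__ == "__main__":
    for sensor, latency in P99_MS.items():
        budget = period_ms(ASSUMED_RATE_HZ[sensor])
        print(f"{sensor}: p99 {latency} ms, budget {budget} ms, headroom {budget - latency:.2f} ms")
    print(f"storage reduction: {reduction_factor(33.8, 4.0):.2f}x")
```

Under that assumption every stream finishes ingest before its next sample arrives, and 33.8 GB over 4.0 GB works out to about 8.45, in line with the reported 8.4-times average.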
The implications of this work are significant for the future of autonomous vehicles and broader edge computing. By making onboard storage a first-class component, AVS enables vehicles to support third-party applications that rely on historical data, such as infrastructure analysis, safety investigations, and machine learning model training. This reduces dependency on cloud offload, which improves privacy and lowers costs. For everyday readers, it means smarter vehicles that can learn from past drives, improve safety over time, and offer new services without constant connectivity. The system's resource-aware design also addresses practical concerns like power and thermal budgets, making it deployable in real-world scenarios.
Despite its advances, the paper acknowledges limitations and open questions. The reduction techniques rely on fixed thresholds, such as a 0.2-meter voxel size for LiDAR or a Hamming-distance cutoff for image deduplication, which may not adapt well to varying driving conditions like pedestrian-heavy scenes. Memory pressure could increase with more sensor streams, and SSD endurance may be affected by frequent small writes, though the prototype shows manageable wear. Future work could explore adaptive algorithms, better scheduling for compression tasks, and more detailed device-level monitoring to further optimize performance and longevity in longer deployments.
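To show what a fixed threshold means for image deduplication, here is a minimal sketch using a difference hash (dHash) and a Hamming-distance cutoff. The choice of dHash, the cutoff value, and the function names are assumptions for illustration; the summary says only that AVS uses perceptual hashing with a Hamming distance threshold. Frames are assumed to be already converted to small grayscale grids.

```python
from typing import Sequence


def dhash(gray: Sequence[Sequence[int]]) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair.

    `gray` is a small grayscale grid (rows of intensities); a grid with
    W columns yields W - 1 bits per row.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def deduplicate(hashes: Sequence[int], max_distance: int = 4) -> list[int]:
    """Return indices of frames to keep.

    A frame is dropped when its hash is within max_distance bits of the
    most recently kept frame. max_distance is a hypothetical fixed cutoff.
    """
    kept: list[int] = []
    last = None
    for i, h in enumerate(hashes):
        if last is None or hamming(h, last) > max_distance:
            kept.append(i)
            last = h
    return kept
```

Because the cutoff is a constant, a slow-moving but safety-relevant change, such as a pedestrian entering the frame, can fall under it and be dropped, which is the adaptivity concern the paper raises.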
About the Author
Guilherme A.
Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.