From LiDAR Data to Road Insights: pctFusion Makes It Easy for Autonomous Vehicles

Imagine cruising down a bustling street, not behind the wheel, but as a passenger in a self-driving car. Every inch of the environment is a blur of information: lane markings, traffic signs, vehicles weaving in and out, pedestrians crossing intersections. This constant data stream is crucial for safe navigation, but how do Autonomous Vehicles (AVs) decipher it all?

The answer lies in sensor data perception, a technology akin to human senses. LiDAR, one of the most prominent sensors, paints a 3D picture of the world using laser pulses, providing accurate information even in challenging conditions, day and night. But raw LiDAR data needs interpretation. Just as our brain processes sensory inputs, an AV's perception system analyzes LiDAR data and interprets the world. This is where pctFusion, a cutting-edge 3D deep learning architecture co-developed by SimDaaS, steps in.

Think of pctFusion as a highly trained translator. It takes the LiDAR point cloud and transforms it into a rich understanding of the environment. It identifies objects such as cars and vegetation, even when faded or obscured, and differentiates erratic two-wheelers from pedestrians. This intricate understanding is crucial for decision-making and for building a real-time map for safe navigation.

pctFusion (see Figure 1) has multiple features that make it efficient and effective specifically for point cloud segmentation:

Dual convolutions and positional encodings: pctFusion captures local spatial relationships within point neighborhoods. Extracting features from multiple neighborhoods (like overlapping patches in images) is beneficial for high-level feature extraction. However, geometric information about objects is often lost when features are learned through convolutions alone. This is because, when a neighborhood is formed around a representative point, only relative encodings are learned: the aggregating representative point knows the relative positions of its neighbors, but the neighbors are not aware of each other's positions. As features are transformed into higher dimensions, relying on relative encodings alone leads to a loss of geometric information. pctFusion addresses this by also incorporating positional encodings, so that each point is aware of its neighbors' positions, leading to better geometric learning in higher dimensions, as shown in the sketch below.
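To make the distinction concrete, here is a minimal PyTorch sketch (our illustration, not the authors' code) of a point-convolution layer that combines relative offset encodings with absolute positional encodings. The layer names, sizes, and the max-pooling aggregation are assumptions made for clarity:

```python
# Sketch: plain point convolutions encode only (neighbor - center) offsets,
# so neighbors are blind to one another; adding an absolute positional
# encoding lets each point carry its own location into higher dimensions.
import torch
import torch.nn as nn

def knn(xyz: torch.Tensor, k: int) -> torch.Tensor:
    """Indices of the k nearest neighbors of each point. xyz: (N, 3)."""
    dist = torch.cdist(xyz, xyz)                 # (N, N) pairwise distances
    return dist.topk(k, largest=False).indices   # (N, k)

class PointConvWithPosEnc(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, k: int = 16):
        super().__init__()
        self.k = k
        # Relative encoding: offsets (neighbor - center), center-aware only.
        self.rel_mlp = nn.Sequential(nn.Linear(3, out_dim), nn.ReLU())
        # Positional encoding: absolute coordinates of every neighbor.
        self.pos_mlp = nn.Sequential(nn.Linear(3, out_dim), nn.ReLU())
        self.feat_mlp = nn.Linear(in_dim, out_dim)

    def forward(self, xyz: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        idx = knn(xyz, self.k)                       # (N, k)
        nbr_xyz = xyz[idx]                           # (N, k, 3)
        rel = self.rel_mlp(nbr_xyz - xyz[:, None])   # relative encoding
        pos = self.pos_mlp(nbr_xyz)                  # absolute positional encoding
        f = self.feat_mlp(feats)[idx]                # neighbor features
        # Fuse both encodings so geometry survives the lift to higher
        # dimensions, then aggregate over the neighborhood (max pooling).
        return (f + rel + pos).max(dim=1).values     # (N, out_dim)

xyz = torch.rand(1024, 3)          # toy LiDAR point cloud
feats = torch.rand(1024, 8)
layer = PointConvWithPosEnc(8, 64)
print(layer(xyz, feats).shape)     # torch.Size([1024, 64])
```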

Figure 1. Overall network architecture of pctFusion, along with the two proposed encoder architectures (Encoder V1 and Encoder V2). Encoder V1 uses vector self-attention operating on a local neighborhood of K-nearest neighbors. Encoder V2 uses standard global dot-product attention operating on the entire point cloud; this remains computationally feasible because the last encoder layer has low point density.

Hierarchical local & global attention: Self-attention mechanisms at different levels capture long-range dependencies and global context across the entire point cloud. The local and global embeddings are then fused into more informative feature vectors, enhancing representational ability. Notably, global attention is applied only after multiple convolutions, for computational efficiency, once the point cloud has been downsampled. A sketch of this fusion follows.
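The following is a minimal sketch (illustrative, not the paper's implementation) of fusing the two attention branches: vector self-attention over K-nearest neighbors, as in Encoder V1, and standard dot-product attention over all points, as in Encoder V2, applied where the point count is already small. Layer names and the concatenation-based fusion are our assumptions:

```python
# Sketch: fuse local (KNN vector attention) and global (dot-product
# attention) embeddings into one feature vector per point.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalGlobalFusion(nn.Module):
    def __init__(self, dim: int, k: int = 16):
        super().__init__()
        self.k = k
        self.qkv = nn.Linear(dim, 3 * dim)
        self.vec_attn = nn.Linear(dim, dim)   # per-channel vector-attention weights
        self.fuse = nn.Linear(2 * dim, dim)   # merge local + global embeddings

    def forward(self, xyz: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        q, k_, v = self.qkv(feats).chunk(3, dim=-1)                   # each (N, D)
        idx = torch.cdist(xyz, xyz).topk(self.k, largest=False).indices
        # Local vector self-attention: per-channel weights over K neighbors.
        w = F.softmax(self.vec_attn(q[:, None] - k_[idx]), dim=1)    # (N, k, D)
        local = (w * v[idx]).sum(dim=1)                              # (N, D)
        # Global dot-product attention over the whole (downsampled) cloud.
        attn = F.softmax(q @ k_.t() / k_.shape[-1] ** 0.5, dim=-1)   # (N, N)
        global_ = attn @ v                                           # (N, D)
        # Fuse both embeddings into a more informative feature vector.
        return self.fuse(torch.cat([local, global_], dim=-1))

xyz = torch.rand(256, 3)      # low point density at the last encoder stage
feats = torch.rand(256, 64)
print(LocalGlobalFusion(64)(xyz, feats).shape)   # torch.Size([256, 64])
```

The global branch costs O(N²) in the number of points, which is why running it only on the heavily downsampled final encoder stage keeps it tractable.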

Pointwise Geometric Anisotropy (PGA) Loss: pctFusion introduces an attention-based loss function that assigns weights based on the semantic distribution of points in a neighborhood. This overcomes a limitation of existing loss functions, which neglect semantic and positional importance, and improves accuracy, especially at sharp class boundaries. The impact of the PGA loss is visualized in Figure 2 below, followed by a sketch of the weighting idea:

Figure 2. Visual comparison of frameworks trained with and without the PGA loss. The red circle highlights the area of interest where the PGA-enabled framework provides better boundary-level performance.
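Here is a toy sketch of the weighting idea, under our simplifying assumption that boundary points are those whose K-nearest neighbors carry mixed labels; such points receive a larger weight in the cross-entropy. The function name, the `alpha` scaling, and the disagreement measure are illustrative, not the paper's exact formulation:

```python
# Sketch: up-weight the per-point loss where the local semantic
# distribution is mixed, i.e., near sharp class boundaries.
import torch
import torch.nn.functional as F

def pga_style_loss(logits, labels, xyz, k: int = 8, alpha: float = 2.0):
    """logits: (N, C), labels: (N,), xyz: (N, 3)."""
    idx = torch.cdist(xyz, xyz).topk(k, largest=False).indices   # (N, k)
    # Fraction of neighbors whose label disagrees with the center point:
    # ~0 deep inside a semantic region, up to ~1 at a sharp boundary.
    disagree = (labels[idx] != labels[:, None]).float().mean(dim=1)
    weights = 1.0 + alpha * disagree          # boundary points weigh more
    per_point = F.cross_entropy(logits, labels, reduction="none")
    return (weights * per_point).mean()

xyz = torch.rand(512, 3)
labels = torch.randint(0, 5, (512,))
logits = torch.randn(512, 5)
print(pga_style_loss(logits, labels, xyz))
```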

pctFusion's ability to exploit the characteristics of LiDAR point clouds is a significant leap forward in segmentation. An AV's perception engine benefits by developing a better understanding of its surroundings and, thus, making better decisions. Whether on bustling city streets or winding country roads, LiDAR and pctFusion pave the way for a future where autonomous transportation becomes a reality. The full paper and code can be accessed at: https://link.springer.com/article/10.1007/s42979-024-02627-5