Traffic monitoring sensors are crucial to intelligent transportation systems (ITS). Deployed at intersections, mid-blocks, and sidewalks, the sensors detect the presence of road users, such as vehicles and pedestrians, and the resulting data are used to improve signal timing. Combined with the latest Internet-capable devices, or edge devices, they can make real-time traffic data accessible over the Internet, enabling a wide range of applications, such as traffic analysis, navigation, and smart vehicle communication.

In an article published in the IEEE Canadian Journal of Electrical and Computer Engineering, researchers present a camera edge device that runs on an efficient, low-cost Jetson Nano processor and performs deep learning inference. After providing an overview of existing studies concerning ITS and image processing, the article introduces the hardware components selected for the device and elaborates on the development and deployment of the deep learning algorithm. The proposed device successfully deploys the original full YOLOv4 and achieves real-time detection and tracking with decent accuracy, as outlined in the evaluation results section.

Device Selection

The main components of the proposed camera system are the camera lens, sensor, and computational unit. The researchers chose the Jetson Nano because they considered it a leading platform for image-processing and deep-learning applications. The module is small yet offers high performance and power efficiency, making it well suited to running modern AI workloads while simultaneously processing data from high-resolution sensors.

As noted by the authors, a fisheye camera was selected as the single sensor for the device because its 180° field of view (FoV) provides full-intersection coverage. Fisheye lenses cost more than rectilinear lenses. However, when a large FoV must be covered, a single fisheye camera is cheaper than setting up multiple regular cameras.

Object Detection Algorithm

Object detection in intelligent transportation systems applications involves detecting pedestrians, cyclists, and various types of vehicles. In this study, both efficiency and accuracy are important considerations. The researchers explain the selection of deep learning detection models, and how performance improvement is achieved through transfer learning.

The You Only Look Once (YOLO) algorithm is a generic object detection method. Generic object detection methods locate and classify objects in an image, labeling them with rectangular bounding boxes. The YOLO algorithm has been shown to run faster than many other algorithms, especially two-stage or region proposal methods. The YOLOv4 model used in this study was trained on the COCO dataset. For the training process in this study, four classes are used: car, bus, truck, and pedestrian.

Example of COCO dataset containing person, bus, truck, and car.
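
To make the four-class setup concrete, the sketch below keeps only the traffic-relevant labels from a list of detections and renames the COCO label "person" to "pedestrian". It is a minimal illustration, not the authors' code: the study restricts the classes during training, whereas this example filters detector output after the fact. The `Detection` type, the `keep_traffic_classes` helper, and the 0.5 confidence threshold are all hypothetical.

```python
from dataclasses import dataclass

# COCO label -> class name used in the article (assumed mapping for illustration).
TARGET_CLASSES = {"person": "pedestrian", "car": "car", "bus": "bus", "truck": "truck"}


@dataclass(frozen=True)
class Detection:
    label: str          # class name reported by the detector
    confidence: float   # detector score in [0, 1]
    box: tuple          # (x, y, width, height) in pixels


def keep_traffic_classes(detections, min_confidence=0.5):
    """Keep detections of the four traffic classes above a score threshold."""
    kept = []
    for det in detections:
        name = TARGET_CLASSES.get(det.label)
        if name is not None and det.confidence >= min_confidence:
            kept.append(Detection(name, det.confidence, det.box))
    return kept


# Hypothetical detector output for one frame.
raw = [
    Detection("person", 0.81, (40, 60, 20, 50)),
    Detection("car", 0.92, (200, 120, 80, 40)),
    Detection("traffic light", 0.88, (300, 10, 10, 30)),  # not a target class
    Detection("truck", 0.35, (400, 150, 120, 60)),        # below the threshold
]
filtered = keep_traffic_classes(raw)
```

With the hypothetical input above, only the pedestrian and the car survive the filter.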

According to the researchers, the regions of interest (ROIs) in the scene below include the intersection, the pedestrian crosswalk, and the vehicle lanes, which are critical for monitoring activities on the road. The article presents an ROI cropping technique to further improve detection performance within the ROI. Cropping effectively zooms in on the ROI, so the YOLOv4 detection model receives more image features for the objects that matter. The result is a significant improvement in detection and tracking performance at the same computational overhead.

Camera view of the intersection and ROI configuration.
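
One way to see why cropping helps is to compare how large an object appears at the network input with and without the crop, and to map detections from the crop back to the full frame. The sketch below is an interpretation under stated assumptions, not the authors' implementation: it assumes the network input size stays fixed (which is why the compute cost would not change), and the frame size, ROI rectangle, 608-pixel input, and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x: float  # left edge in pixels
    y: float  # top edge in pixels
    w: float  # width in pixels
    h: float  # height in pixels


NET_SIZE = 608  # hypothetical square network input, in pixels


def input_scale(region_w, region_h, net_size=NET_SIZE):
    """Scale applied when a region is resized (aspect preserved) to fit the input."""
    return net_size / max(region_w, region_h)


def to_full_frame(box, roi, scale):
    """Map a box found in the resized ROI crop back to full-frame coordinates."""
    return Box(
        roi.x + box.x / scale,
        roi.y + box.y / scale,
        box.w / scale,
        box.h / scale,
    )


# Hypothetical full frame and ROI rectangle.
frame_w, frame_h = 1920, 1080
roi = Box(480, 270, 960, 540)

full_scale = input_scale(frame_w, frame_h)  # whole frame squeezed into the input
roi_scale = input_scale(roi.w, roi.h)       # only the ROI squeezed into the input
zoom = roi_scale / full_scale               # how much larger objects appear

# A box reported in the coordinates of the resized crop, mapped back.
detected_in_crop = Box(304, 152, 38, 19)
restored = to_full_frame(detected_in_crop, roi, roi_scale)
```

In this example the ROI is half the frame width, so objects inside it reach the detector at twice the linear size, while the input tensor, and therefore the inference cost, is unchanged.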

Evaluation

The DeepStream pipeline edge device was evaluated at a local four-way intersection in Hamilton, ON, Canada. Three common use cases are identified: 1) vehicle detection and counting at an intersection; 2) vehicle detection and counting inside lanes; and 3) pedestrian detection at a crosswalk. The evaluation demonstrated the feasibility of running large deep learning models for traffic monitoring services, even on resource-constrained AI edge devices.
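
The lane-counting use case can be approximated by testing whether each tracked vehicle's reference point falls inside a lane polygon and counting the distinct track IDs that do. The sketch below uses a standard ray-casting point-in-polygon test. It is a generic illustration rather than the authors' DeepStream implementation; the lane polygon, the `(track_id, (cx, cy))` track format, and the function names are hypothetical.

```python
def point_in_polygon(px, py, polygon):
    """Ray-casting test: True if (px, py) lies inside the polygon.

    `polygon` is a list of (x, y) vertices in order. Points exactly on an
    edge may be classified either way, which is acceptable for counting.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def count_in_zone(tracks, polygon):
    """Count distinct track IDs whose reference point falls inside the zone."""
    return len({tid for tid, (cx, cy) in tracks if point_in_polygon(cx, cy, polygon)})


# Hypothetical lane polygon (pixels) and tracked box centres over a few frames.
lane = [(100, 100), (300, 100), (300, 400), (100, 400)]
tracks = [
    (1, (150, 200)),  # inside
    (2, (350, 200)),  # outside, to the right of the lane
    (1, (160, 250)),  # same vehicle seen again, counted once
    (3, (299, 399)),  # inside, near a corner
]
vehicles_in_lane = count_in_zone(tracks, lane)
```

Counting by track ID rather than by detection keeps a vehicle that stays in the lane for several frames from being counted more than once.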

The evaluation results show that this edge device runs detection and tracking steadily at 7.8 fps and detects vehicles with 90% accuracy and zero false detections. This frame rate is sufficient for many potential use cases, such as lane occupancy detection, vehicle counting, and identifying illegal maneuvers at an intersection. Pedestrian detection accuracy is relatively low, possibly because the subjects appear at low resolution.
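
A quick calculation gives a feel for what 7.8 fps means at an intersection: the time between processed frames and how far a vehicle travels in that time. The 7.8 fps figure comes from the article; the 50 km/h vehicle speed is an assumed value chosen only for illustration, and the helper names are hypothetical.

```python
def frame_interval_ms(fps):
    """Time between consecutive processed frames, in milliseconds."""
    return 1000.0 / fps


def metres_per_frame(speed_kmh, fps):
    """Distance a vehicle covers between consecutive processed frames."""
    speed_ms = speed_kmh / 3.6  # km/h -> m/s
    return speed_ms / fps


REPORTED_FPS = 7.8        # frame rate reported in the article
ASSUMED_SPEED_KMH = 50.0  # assumption: a typical urban speed, not from the article

interval = frame_interval_ms(REPORTED_FPS)                    # roughly 128 ms
travel = metres_per_frame(ASSUMED_SPEED_KMH, REPORTED_FPS)    # roughly 1.8 m
```

Under that assumption a vehicle moves less than two metres between frames, which is short relative to the length of a lane or an intersection approach.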

According to the researchers, future improvements will focus on computational efficiency while maintaining or improving detection performance. One direction for future work is to investigate neural network structures tailored to traffic monitoring applications. Another is to explore optimization methods for deployment, such as model pruning.
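
For readers unfamiliar with model pruning, the sketch below shows one common variant, unstructured magnitude pruning, on a flat list of weights: the smallest-magnitude fraction of weights is set to zero. The article does not say which pruning method the researchers have in mind, so this is a generic illustration only; the function name, the sample weights, and the 50% sparsity level are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude.

    `weights` is a flat list of floats; `sparsity` is a fraction in [0, 1].
    Returns a new list and leaves the input unchanged.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    # Indices of the k weights closest to zero.
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k]
    drop = set(smallest)
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


# Hypothetical layer weights, pruned to 50% sparsity.
layer = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(layer, 0.5)
```

In practice a pruned model is usually fine-tuned afterwards to recover accuracy, and the speed-up on a given device depends on whether its runtime can exploit the resulting sparsity.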
