Invented by Yifang Xu, Xin Liu, Chia-Chih Chen, Carolina Parada, Davide Onofrio, Minwoo Park, Mehdi Sajjadi Mohammadabadi, Vijay Chintalapudi, Ozan Tonkal, John Zedlewski, Pekka Janis, Jan Nikolaus Fritsch, Gordon Grigor, Zuoguan Wang, I-Kuei Chen, Miguel Sainz, Nvidia Corp
The Market for Autonomous Vehicles
Markets are venues where parties can exchange goods or services. They may be physical, such as a retail outlet, or virtual like an online marketplace.
Autonomous vehicles have the capacity to detect lanes and boundaries in real-time, increasing road safety worldwide. This is expected to lead to an increase in demand for autonomous cars over the coming years.
Real-time Lane Detection
Autonomous vehicles possess the capability of detecting lanes and boundaries in real-time, which can help reduce accidents and enhance safety. Unfortunately, lane detection and tracking is not an easy feat due to factors like inconsistent road lanes, different line patterns (solid, broken, single, double), merging/splitting lines, lack of distinctive features that distinguish one lane from another, etc.
Researchers have devised a variety of algorithms that can accurately recognize and track lane markings in real-time. These include deep convolutional neural networks, lane marking generation, lane grouping, and model fitting.
The lane detection algorithm can sense lanes in the surrounding environment and distinguish lane markings from non-lane markings. It can also handle changing environmental conditions and maintain the lane for short periods without camera input.
Some issues associated with lane detection and tracking methods involve occlusion, poor line painting, high speed flow, and complex background interference. These elements can make it difficult to accurately detect lane markings and may lead to false alarms.
These challenges must be overcome in order for autonomous vehicles to reach full autonomy. Furthermore, researchers need to create new databases for testing their algorithms.
This will allow researchers to gauge how well an algorithm adapts to various road scenarios and weather conditions. Furthermore, it allows them to compare the performances of various algorithms.
Other research in this area has utilized environmental detection of lanes instead of just looking at actual markings. This approach is more reliable and can be employed in various conditions such as inclement weather or low light levels.
This technique can be implemented using a sensor that captures video and converts it into grayscale images. A Gaussian filter is applied to remove any unnecessary noise from the image, followed by Canny edge detection for extracting lane markers. Finally, a Q-function approximator is utilized to make decisions.
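The preprocessing steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the research described: the helper names (`to_grayscale`, `filter3x3`, `edge_map`) and the threshold are hypothetical, a thresholded Sobel gradient stands in for full Canny edge detection, and the Q-function decision step is left out.

```python
# Hypothetical sketch: grayscale -> Gaussian smoothing -> gradient-based edges.
# A thresholded Sobel gradient is used as a simplified stand-in for Canny.

GAUSSIAN = [[1 / 16, 2 / 16, 1 / 16],
            [2 / 16, 4 / 16, 2 / 16],
            [1 / 16, 2 / 16, 1 / 16]]
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples to luminance values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def filter3x3(img, kernel):
    """Apply a 3x3 kernel to a 2D list, replicating pixels at the border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += img[yy][xx] * kernel[dy + 1][dx + 1]
            out[y][x] = acc
    return out


def edge_map(rgb, threshold=300.0):
    """Return a boolean map that is True where the gradient magnitude is high."""
    gray = to_grayscale(rgb)
    smooth = filter3x3(gray, GAUSSIAN)      # suppress noise before edge detection
    gx = filter3x3(smooth, SOBEL_X)         # horizontal intensity change
    gy = filter3x3(smooth, SOBEL_Y)         # vertical intensity change
    return [[(a * a + b * b) ** 0.5 > threshold for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]
```

On an image whose left half is black and whose right half is white, only the pixels next to the vertical boundary are marked as edges.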
Real-time Traffic Light Detection
Real-time traffic light detection and recognition is a prerequisite for autonomous vehicles, enabling them to safely share the road with human drivers while avoiding accidents at intersections.
Different approaches have been proposed to detect and recognize traffic lights. Some use visual information combined with high-speed cameras, while others rely on machine learning algorithms for the task.
A study recently proposed a hybrid approach that combined frequency analysis and visual information to detect blinking lights. This technique proved more accurate than other techniques that solely used visual cues.
A deep learning neural network was trained to detect and classify traffic lights using a dataset of 11179 frames recorded at 25 frames per second in Paris under various weather conditions and lighting situations such as daylight and night.
The state of traffic lights was also taken into account. To do this, prior maps that contain annotations about lights must be integrated. Doing so increases the robustness of traffic light recognition (TLR) by increasing the number of relevant traffic lights.
These techniques offer numerous advantages, but require a substantial amount of training data and computational capacity. Nonetheless, they have high performance levels and remarkable robustness against variation.
Another advantage of these systems is their adaptability; they can be implemented on various types of devices such as mobile phones and smartwatches, increasing their efficiency in urban settings where speeds and weather conditions may differ.
In this work, we propose a system for autonomous vehicles that utilizes both a deep learning model and prior maps to detect traffic lights. The model performs detection and classification in one step, then prior maps are utilized to select only relevant traffic lights from among all proposed detections. Our method has proven successful at accurately detecting and recognizing traffic lights in real-time.
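The map-based selection step can be illustrated with a short sketch. It assumes that the prior map has already been projected into the image, so that each relevant light has an expected pixel position. The `Detection` type, the `select_relevant` function, and the 40-pixel radius are hypothetical and are not taken from the work described.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Detection:
    x: float       # box center in pixels (hypothetical representation)
    y: float
    state: str     # e.g. "red" or "green", from the one-step detector/classifier
    score: float   # detector confidence


def select_relevant(detections, expected_positions, max_dist=40.0):
    """For each mapped light, keep the best-scoring detection near its expected position."""
    relevant = []
    for ex, ey in expected_positions:
        near = [d for d in detections if hypot(d.x - ex, d.y - ey) <= max_dist]
        if near:
            relevant.append(max(near, key=lambda d: d.score))
    return relevant
```

Detections that are far from every mapped position, such as a light facing a different road, are discarded even when their score is high.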
This paper presents an experimental investigation of the performance of a neural network-based traffic light detection and recognition system at various times of day (day and night) and under various weather conditions. The results demonstrated that the proposed approach successfully recognized traffic signals in both daytime and nighttime, determined their area over time, and decided whether they were bright or dark.
Real-time Pedestrian Detection
Real-time pedestrian detection has become a critical concern for autonomous vehicles. It calls for an advanced approach to detecting pedestrians and preventing collisions between cars and pedestrians. Furthermore, this technology must be able to operate in various visibility conditions (night, headlight glare, tunnel entrances/exits, smoke or fog), thus reducing the risk of collisions.
While various approaches have been developed to assist AVs in detecting pedestrians, there remain several challenges that must be overcome. One such issue is occlusions – when a pedestrian’s head becomes hidden within an image. Another difficulty lies in distinguishing pedestrians of different sizes and locations within an image.
Many techniques for handling occlusion in pedestrian detection have been developed, such as human head detectors and occlusion-resistant object detection algorithms. Unfortunately, these technologies have yet to demonstrate reliable accuracy in realistic scenarios.
Furthermore, they require high computational power which could cause a delay in the system’s reaction time, particularly when there are many pedestrians present.
In this scenario, a solution that utilizes UAVs and their cameras to detect pedestrians could be beneficial. Many systems have been implemented on microcomputers to recognize pedestrians in an image and determine their position at pedestrian crossings. The purpose of this paper is to explore these techniques in detail and create a real-time pedestrian detection system that can be run on a Raspberry Pi microcomputer.
The initial objective of the research is to identify the optimal algorithm for pedestrian detection. This involves analyzing various existing systems and comparing their speed, accuracy in determination, and portability on microcomputers. Subsequently, researchers will create a new pedestrian detection system that can be implemented on e-ink displays and Raspberry Pi microcomputers.
In the concluding stage of this research, researchers tested a pedestrian detection system in various scenes. Results demonstrated that it could accurately determine pedestrian locations at pedestrian crossings and monitor their social distance using only a small microcomputer.
Real-time Parking Detection
Real-time parking detection is the ability to locate an available spot in a car park or garage within a short amount of time. This helps drivers locate a parking spot quickly and efficiently, saving them both time and money.
This technology can be applied to a range of applications, such as smart cities and parking garages. It also provides an automated method for tracking and predicting the availability of parking spaces in cities – which could potentially generate revenue for municipal authorities.
One way to accomplish this goal is through cameras and sensors installed at various levels of a parking structure. This enables a computer program to monitor vehicle movement and occupancy status in real time, determining if there is an open space or not.
Another solution is using artificial intelligence (AI) technology such as neural networks. With this advanced setup, one can analyze historical data and forecast parking space availability in various parts of a city.
A parking detection system may include a sensor array, a parking availability determiner, and storage. In one embodiment, sensor array 104 includes one-third of a sensor element per parking space 804a-804c. In alternative embodiments, sensor array 104 could include any number of sensor elements per space, depending on the ratio of parking spaces 804 to the overall number of spaces in parking lot 812.
Sensor data output signal 110 from sensor array 104 is received by parking availability determiner 106, which then uses this information to determine whether each of parking spaces 804a-804c is occupied or available for parking.
Once the status of each parking space 804 has been determined, parking availability information 112 is generated by the availability determiner 106 and stored in storage 108. This can then be communicated to a user device 202 (e.g., mobile computing device 510 or phone 512) via an appropriate interface of user device 202.
In addition to sensor data output signal 110, parking availability determiner 106 may also receive an image of parking lot 812 from an image capturing device 1302 (e.g., a camera). Captured images of parking lot 812 are then received by parking availability determiner 106 as captured image signal 1306.
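The role of the availability determiner can be sketched as a simple mapping from sensor readings to per-space status. The sketch assumes one normalized reading between 0 and 1 per space and a fixed threshold. Both assumptions, along with the function names, are hypothetical and are not details of the system described.

```python
def determine_availability(sensor_readings, occupied_threshold=0.5):
    """Map each space id to True (available) or False (occupied).

    sensor_readings: dict of space id -> normalized reading in [0, 1],
    where a higher value means a vehicle is more likely present (assumption).
    """
    return {space: reading < occupied_threshold
            for space, reading in sensor_readings.items()}


def available_spaces(availability):
    """List the ids of the spaces that are currently free."""
    return sorted(space for space, free in availability.items() if free)


readings = {"804a": 0.9, "804b": 0.1, "804c": 0.2}
availability = determine_availability(readings)
print(available_spaces(availability))  # ['804b', '804c']
```

In a fuller system the resulting dictionary would be what is written to storage and sent to the user device.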
The Nvidia Corp invention works as follows
In various examples, sensor data representing an image of the field of view of a vehicle sensor may be received. The sensor data can then be applied to a machine learning model. The machine learning model may compute a segmentation mask that represents the portions of the image corresponding to lane markings on the vehicle’s driving surface. The segmentation mask can be analyzed to identify lane marking types. Lane boundaries may then be generated by performing curve fitting on the lane markings of each type. Data representative of the lane boundaries can then be sent to a component of the vehicle for use in navigating the driving surface.
Background for real-time detection of lanes and boundaries by autonomous vehicles
To operate safely, autonomous vehicles must be capable of performing vehicle maneuvers such as lane keeping, changing lanes, turning, and stopping at intersections and crosswalks. An autonomous vehicle must be able to navigate on surface streets (e.g. city streets, side streets and neighborhood streets) and on highways (e.g. multi-lane roads), which requires moving, often quickly, among one or more divisions of the road (e.g. lanes, intersections, crosswalks, boundaries etc.). These divisions are not always clearly defined and can be difficult to spot in some conditions. An autonomous vehicle must be the functional equivalent of a human driver. It must have a perception system and an action system that can detect and respond to static and moving obstacles in complex environments, so that it does not collide with other objects or structures.
Conventional methods for detecting road and lane boundaries involve processing images from one or several cameras and trying to interpolate the road and lane boundaries using visual indicators that are identified during processing (e.g. using computer vision or another machine learning technique). These methods have either been too computationally intensive to run in real time, or they have been inaccurate because of shortcuts taken to lower computing requirements. In other words, these systems either run in real time with unacceptable accuracy, or achieve acceptable accuracy but not in real time. Even in systems that reach the accuracy required to operate autonomous vehicles safely and effectively, that accuracy is limited to ideal road and weather conditions. As a result, autonomous vehicles that rely on these conventional approaches may be unable to operate accurately in real time across all road and weather conditions.
The present disclosure relates to using machine learning models to detect lanes and road boundaries in real time for autonomous vehicles and advanced driver assistance systems. The disclosed systems and methods allow for the accurate detection and identification of lanes and road boundaries in real time using a trained deep neural network. The network can be trained, for example, with low-resolution images, region of interest images, and a variety of ground truth masks so that it detects lanes and boundaries under a range of conditions, including poor weather and roads that are less than ideal.
In contrast to conventional systems such as those described above, the present system may employ one or more machine learning models that are computationally inexpensive and can be deployed in real time to detect lanes and boundaries. The machine learning model(s) can be trained using a variety of annotations and transformed images so that they are capable of detecting lanes and boundaries quickly and accurately, particularly at longer distances. Low-resolution images, region of interest images (e.g. cropped images), and transformed images (e.g. spatially augmented or color augmented) can all be used to train the machine learning model(s), together with ground truth labels and masks and/or transformed ground truth labels and masks (e.g. augmented to match the augmentation of the transformed images to which they relate). The machine learning model(s) can be trained with both binary and multi-class segmentation masks to increase their accuracy. To identify and label the contours of the lane markings and boundaries more precisely, the outputs of the machine learning model(s) may undergo post-processing. Post-processing may produce lane curves and labels that can be used by one or more layers of an autonomous driving stack, such as a perception layer, a world model management layer, a planning layer, a control layer, and/or an obstacle avoidance layer.
As a result, autonomous vehicles may be capable of detecting the lanes and road boundaries of a driving surface, which allows them to navigate safely and effectively within the current lane. They can also detect lane merges and lane splits. The architecture of the machine learning model(s), the training methods, and the post-processing methods that convert the model outputs into lane curves and labels may make lane and boundary detection more computationally efficient than conventional methods, requiring less processing power, energy, and bandwidth.
Systems, methods, and devices are disclosed that use one or more machine learning models to detect lanes and road boundaries in real time for autonomous vehicles and/or advanced driver assistance systems (ADAS). The present disclosure may be described with respect to an example autonomous vehicle 800 (alternatively referred to herein as "vehicle 800"), shown in FIGS. 8A-8D. This is not meant to be restrictive. The systems and methods described in this document may also be used in augmented reality, virtual reality, robotics, or other technology areas, for purposes such as localization and calibration. The detections described herein are not limited to lanes, road boundaries, lane splits, lane merges, intersections or crosswalks. The processes described herein can also be used to detect other features or objects, such as poles, trees or barriers. The functionality and features described herein in relation to detecting lanes or road boundaries could also be applied to detecting lane splits and lane merges. Alternatively, the functionality and features described in relation to detecting lane merges and/or lane splits could also be applied to detecting lanes and/or road boundaries.
Lane and Road Boundary Detection System
Conventional systems, as described above, rely on processing real-time images with various computer vision or machine learning techniques (e.g. using visual indicators identified through image processing) in order to detect lanes and road boundaries. These methods are too computationally expensive and/or too inaccurate to perform the task in real time. As a result, conventional systems either provide inaccurate information or provide it too late for an autonomous vehicle to use in navigating the roads safely.
The present systems, in contrast, allow an autonomous vehicle to detect lanes and road boundaries in real time using a deep neural network (DNN) that has a comparatively small footprint (e.g. fewer layers than conventional approaches) and greater processing capability. To increase the DNN’s ability to detect lanes and road boundaries, particularly at longer distances, a variety of images and ground truth masks may be used in training. Together, the DNN architecture, the DNN training process, and the post-processing of the DNN output may allow the present systems to detect lanes and road boundaries in real time for autonomous vehicles.
Real-time sensor data may include images and/or video, LIDAR data, RADAR data, or other data. The data may be received from sensors of an autonomous vehicle (e.g. one or more cameras, one or more LIDAR sensors, or one or more RADAR sensors). The sensor data can be applied to a machine learning model (e.g. the DNN) that is trained to identify, from the sensor data, areas of interest related to road markings, road boundaries and intersections (e.g. raised pavement markers, rumble strips, colored lane dividers, sidewalks, crosswalks, turnoffs, etc.).
More specifically, the machine learning model(s) may be a DNN that is designed to infer lane markings and boundaries and to generate one or more segmentation masks (e.g. binary and/or multi-class) that identify where in the representations of the sensor data (e.g. images) potential lanes and road boundaries are located. The segmentation masks may contain points, denoted by pixels in the image, that indicate where the DNN has determined lanes or boundaries to be. In some embodiments, the segmentation masks generated may include a binary mask with a first representation for background elements and a second representation for foreground elements. The DNN can also be trained to create a multi-class segmentation mask, with different classes relating to different lane markings or boundaries. The classes could include a class for background elements, a class for road boundaries, a class for solid lane markings, a class for dashed lane markings, a class for intersections, a class for crosswalks, a class for lane splits, and/or other classes.
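The relationship between per-pixel class scores, a multi-class mask, and a binary mask can be shown with a small sketch. The class order in `CLASSES`, the score layout, and the function names are assumptions made for illustration and are not specified in the patent.

```python
# Hypothetical class order; the patent lists these classes but not their indices.
CLASSES = ["background", "road_boundary", "solid_lane", "dashed_lane",
           "intersection", "crosswalk", "lane_split"]


def multiclass_mask(scores):
    """scores[y][x] is a list of per-class scores; return the best class index per pixel."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in scores]


def binary_mask(class_mask, background=0):
    """Collapse a multi-class mask to foreground (1) versus background (0)."""
    return [[int(c != background) for c in row] for row in class_mask]


scores = [[
    [0.90, 0.05, 0.05, 0.0, 0.0, 0.0, 0.0],   # mostly background
    [0.10, 0.10, 0.70, 0.1, 0.0, 0.0, 0.0],   # solid lane marking
    [0.20, 0.00, 0.00, 0.8, 0.0, 0.0, 0.0],   # dashed lane marking
]]
mask = multiclass_mask(scores)
print([CLASSES[c] for c in mask[0]])  # ['background', 'solid_lane', 'dashed_lane']
print(binary_mask(mask))              # [[0, 1, 1]]
```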
The DNN can contain any number of layers, but some examples have 14 or fewer layers to reduce data storage requirements and increase processing speed. The DNN may include one or more convolutional layers. The convolutional layers may progressively downsample the spatial resolution of the input image, e.g. until the output layer(s) or one or more deconvolutional layers are reached. Each layer may generate a higher-level extraction than the one before it. Because the input resolution is reduced at each layer, the DNN may be able to process sensor data (e.g. image data, LIDAR data, RADAR data, etc.) faster than conventional systems. The DNN may also include one or more deconvolutional layers, which in some cases may be the output layer(s). The deconvolutional layer(s) may upsample the data to produce an output image with a comparatively higher spatial resolution than the convolutional layers before them. The DNN output (e.g. the segmentation mask) may indicate the likelihood that each spatial grid cell (e.g. a pixel) belongs to a particular class of lane or boundary.
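To make the downsampling and upsampling concrete, the sketch below traces spatial resolution through a stack of layers using the standard output-size formulas for convolution and transposed convolution (without dilation or output padding). The layer configuration and the 240 x 480 input size are hypothetical and are not the architecture disclosed in the patent.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel, stride, padding):
    """Output length of a transposed convolution (deconvolution) along one dimension."""
    return (size - 1) * stride - 2 * padding + kernel


def trace_resolution(height, width, layers):
    """Return the (height, width) after the input and after each layer."""
    shapes = [(height, width)]
    for kind, kernel, stride, padding in layers:
        f = conv_out if kind == "conv" else deconv_out
        height = f(height, kernel, stride, padding)
        width = f(width, kernel, stride, padding)
        shapes.append((height, width))
    return shapes


# Hypothetical network: three stride-2 convolutions, then one 8x deconvolution.
layers = [("conv", 3, 2, 1), ("conv", 3, 2, 1), ("conv", 3, 2, 1), ("deconv", 8, 8, 0)]
print(trace_resolution(240, 480, layers))
# [(240, 480), (120, 240), (60, 120), (30, 60), (240, 480)]
```

Each stride-2 convolution halves the resolution, so later layers operate on far fewer grid cells, and the final deconvolution restores a mask at the input resolution.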
The DNN can be trained on labeled images over multiple iterations until the value of the loss function falls below a threshold value. The DNN may perform forward pass computations on the training images to generate feature extractions for each transformation. In other words, the DNN may extract features from the images and then predict the probability that the features correspond to a particular lane or boundary class. One or more ground truth masks may be used with a loss function to measure the error in the DNN’s predictions. In one example, a binary cross-entropy function may be used as the loss function.
Backward pass computations may then be performed to recursively calculate the gradients of the loss function with respect to the training parameters. In some cases, these gradients are computed with respect to the weights and biases of the DNN. A region-based weighted loss may be added to the loss function, which increases the penalty for loss at greater distances from the bottom of the image (i.e. at points representing physical locations farther from the autonomous vehicle). This may improve the detection of lanes and boundaries at longer distances compared with conventional systems, because the DNN is trained to detect more accurately at those distances. An optimizer can be used in some cases to adjust the training parameters (e.g. weights, biases etc.). In one example, an Adam optimizer may be used; in others, stochastic gradient descent, or stochastic gradient descent with a momentum term, may be used. The training process (e.g. forward pass computations, backward pass computations, parameter updates) may be repeated until the trained parameters converge to optimum, desired, or acceptable values.
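One way to picture a region-based weighted loss is a binary cross-entropy in which each image row carries its own weight, with rows nearer the top of the image (farther from the vehicle) weighted more heavily. The linear weighting scheme, the `far_weight` parameter, and the function names below are assumptions for illustration. The patent does not specify the exact form of the weighting.

```python
from math import log


def row_weights(height, far_weight=3.0):
    """Weight 1.0 for the bottom row, rising linearly to far_weight for the top row."""
    if height == 1:
        return [1.0]
    return [far_weight - (far_weight - 1.0) * y / (height - 1) for y in range(height)]


def weighted_bce(pred, target, far_weight=3.0, eps=1e-7):
    """Mean binary cross-entropy with per-row weights.

    pred: 2D list of probabilities, row 0 being the top of the image.
    target: 2D list of 0/1 ground truth values with the same shape.
    """
    weights = row_weights(len(pred), far_weight)
    total, count = 0.0, 0
    for w, p_row, t_row in zip(weights, pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)   # avoid log(0)
            total += -w * (t * log(p) + (1 - t) * log(1.0 - p))
            count += 1
    return total / count
```

With this weighting, the same prediction error costs more in the top row than in the bottom row, which pushes training toward better accuracy at longer distances.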
In certain non-limiting cases, after the segmentation mask has been output by the DNN, any number of post-processing steps can be performed to determine lane marking types and curves. In some cases, connected component (CC) labeling may be used. In other examples, directional connected components (DCC) may be used to group pixels (or points) in the segmentation mask based on pixel values and lane type connectivity while moving through the image from bottom to top. DCC may take advantage of the perspective view (e.g. from the sensor(s) of the vehicle) of the lane markings and road boundaries on the driving surface. DCC can also use lane appearance type (e.g. based on the classes in the multi-class segmentation mask) to determine which pixels or points are connected.
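The grouping idea can be sketched with ordinary connected component labeling in which two neighboring pixels are connected only if they share the same lane class. This is a simplified stand-in, not the patent's DCC procedure: it scans seed pixels from the bottom row upward but otherwise uses plain 8-connectivity, and the function name is hypothetical.

```python
from collections import deque


def label_components(class_mask, background=0):
    """Label 8-connected groups of pixels that share the same non-background class.

    Returns (labels, count), where labels has the same shape as class_mask
    and 0 means unlabeled background.
    """
    h, w = len(class_mask), len(class_mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    # Seed from the bottom row upward, i.e. from near the vehicle toward the horizon.
    for y in range(h - 1, -1, -1):
        for x in range(w):
            if class_mask[y][x] == background or labels[y][x]:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w and not labels[ny][nx]
                                and class_mask[ny][nx] == class_mask[cy][cx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

Because connectivity requires matching classes, a solid marking and a dashed marking that touch in the mask end up in separate groups.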
Dynamic programming can be used to identify significant peak points, represented as 2D locations, and to determine the associated confidence values. Connectivity may be evaluated for each pair of significant peak points. A set of peaks and edges with corresponding connectivity scores can then be generated (e.g. based on the confidence values). To identify candidate lane edges, a shortest path algorithm, a longest path algorithm, and/or an all-pairs shortest path (APSP) algorithm can be used. An additional curvature smoothness term may be added to the APSP cost function to bias it toward smooth curves over zig-zagging candidate lane edges. A clustering algorithm can then be used to create a set of final lane edges by merging similar sub-paths (e.g. sub-paths identified as corresponding to the same candidate lane edge) into one group.
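The all-pairs shortest path step can be illustrated with the Floyd-Warshall algorithm over a small graph of peak points. The sketch assumes directed edges with connectivity scores in (0, 1] and converts each score to a cost of -log(score), so that the cheapest path is the one with the highest product of scores. That cost choice, the function names, and the omission of the curvature smoothness term are simplifications for illustration and are not the patent's formulation.

```python
from math import inf, log


def all_pairs_shortest_paths(n, scored_edges):
    """Floyd-Warshall over n peak points.

    scored_edges maps (i, j) -> connectivity score in (0, 1] for a directed edge.
    Returns (dist, nxt), where nxt[i][j] is the next node on the best path from i to j.
    """
    dist = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    nxt = [[None] * n for _ in range(n)]
    for (i, j), score in scored_edges.items():
        dist[i][j] = -log(score)   # high connectivity gives low cost
        nxt[i][j] = j
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def reconstruct_path(nxt, i, j):
    """Return the node sequence from i to j, or [] if j is unreachable from i."""
    if i != j and nxt[i][j] is None:
        return []
    path = [i]
    while i != j:
        i = nxt[i][j]
        path.append(i)
    return path


edges = {(0, 1): 0.9, (1, 3): 0.9, (0, 2): 0.9, (2, 3): 0.3, (0, 3): 0.5}
dist, nxt = all_pairs_shortest_paths(4, edges)
print(reconstruct_path(nxt, 0, 3))  # [0, 1, 3]
```

Here the two-step route through point 1 beats the direct edge because its combined connectivity (0.9 x 0.9) is higher than 0.5.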
The final lane edges can then be assigned lane types, which may be determined relative to the vehicle’s position. Possible lane types include the left boundary of the vehicle lane (e.g., the ego-lane), the right boundary of the vehicle lane, and the outer boundaries of the lanes adjacent to the vehicle lane.
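A simple way to assign types relative to the vehicle is to sort lane edges by their lateral offset from the vehicle and name them outward from the center. The sketch assumes each edge has already been reduced to a single signed offset in meters (negative meaning left of the vehicle). The type names and the function are hypothetical.

```python
def assign_lane_types(lane_offsets):
    """Assign a lane type to each lane edge from its signed lateral offset.

    lane_offsets: list of offsets in meters at the vehicle's position,
    negative for edges left of the vehicle (assumed convention).
    Returns a dict of edge index -> type name.
    """
    left_names = ["ego_left", "left_adjacent_outer"]
    right_names = ["ego_right", "right_adjacent_outer"]
    # Order each side from nearest to farthest from the vehicle.
    left = sorted((i for i, x in enumerate(lane_offsets) if x < 0),
                  key=lambda i: -lane_offsets[i])
    right = sorted((i for i, x in enumerate(lane_offsets) if x >= 0),
                   key=lambda i: lane_offsets[i])
    types = {}
    for name, i in zip(left_names, left):
        types[i] = name
    for name, i in zip(right_names, right):
        types[i] = name
    return types


print(assign_lane_types([-5.4, 1.7, -1.8, 5.3]))
# {2: 'ego_left', 0: 'left_adjacent_outer', 1: 'ego_right', 3: 'right_adjacent_outer'}
```

Edges beyond the two nearest on each side are left untyped in this sketch.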
Curve fitting can be used in some cases to create final shapes that best reflect the natural curves of road and lane boundaries. Polyline fitting, polynomial fitting, clothoid fitting, or any other type of curve-fitting algorithm can be used. In some examples, lane curves can be determined by sampling segmentation points within a region of interest in the segmentation mask.
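Polynomial fitting, one of the options named above, can be sketched as an ordinary least-squares fit of the lateral position x as a function of the image row y, since lane edges run roughly vertically in the image. The parameterization x = f(y), the default degree of 2, and the function names are assumptions made for illustration.

```python
def fit_polynomial(ys, xs, degree=2):
    """Least-squares fit of x = c0 + c1*y + ... + cd*y**d.

    Solves the normal equations with Gauss-Jordan elimination.
    Needs at least degree + 1 points with distinct y values.
    """
    n = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T x].
    m = [[sum(y ** (r + c) for y in ys) for c in range(n)]
         + [sum(x * y ** r for x, y in zip(xs, ys))]
         for r in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def evaluate_polynomial(coeffs, y):
    """Evaluate the fitted curve at row y."""
    return sum(c * y ** k for k, c in enumerate(coeffs))


# Points sampled from a lane edge (hypothetical values on x = 1 + 0.5*y + 0.25*y^2).
ys = [0, 1, 2, 3, 4]
xs = [1 + 0.5 * y + 0.25 * y * y for y in ys]
coeffs = fit_polynomial(ys, xs)
```

The fitted coefficients can then be evaluated at any row to draw a smooth lane boundary, including rows where the segmentation mask had gaps.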
Ultimately, data representing the lane markings and lane boundaries may be compiled and sent to a perception layer, a world model management layer, and/or a planning layer. This will aid the autonomous vehicle in safely and effectively navigating the driving surface.
Now referring to FIG. 1A, FIG. 1A illustrates an example process 100 for lane and road boundary detection. However, this is only an example and is not meant to be limiting.
The process 100 for lane and road boundary detection may include generating and/or receiving sensor data 102 from one or more sensors of the autonomous vehicle 800. The sensor data 102 could include data from any sensor of the vehicle 800 and/or other objects, such as robotic devices, VR systems, AR systems, etc. Referring to FIGS. 8A-8C, the sensor data 102 could include data generated by, for instance, global navigation satellite system (GNSS) sensor(s) 858, RADAR sensor(s) 860, ultrasonic sensor(s) 862, LIDAR sensor(s) 864, IMU sensor(s) 866 (e.g. accelerometer(s), GPS sensor(s), gyroscope(s), magnetic compass(es), magnetometer(s), and others), microphone(s) 896, stereo camera(s) 868, camera(s) 870 (e.g. fisheye cameras), surround camera(s) 874 (e.g. 360 degree cameras), speed sensor(s) 844, vibration sensor(s) 842, steering sensor(s) 840, brake sensor(s) (e.g. as part of brake sensor system 846), or other types of sensor.