The goals / steps of this project are the following:
- Compute the camera calibration matrix and distortion coefficients given a set of chessboard images.
- Apply the distortion correction to the raw image.
- Use color transforms, gradients, etc., to create a thresholded binary image.
- Apply a perspective transform to rectify binary image ("birds-eye view").
- Detect lane pixels and fit to find lane boundary.
- Determine curvature of the lane and vehicle position with respect to center.
- Warp the detected lane boundaries back onto the original image.
- Output visual display of the lane boundaries and numerical estimation of lane curvature and vehicle position.
The images for camera calibration are stored in the folder called camera_cal
.
The images in test_images
are for testing your pipeline on single frames.
The video called project_video.mp4
is the video your pipeline should work well on.
challenge_video.mp4
is an extra (and optional) challenge for you if you want to
test your pipeline.
If you're feeling ambitious (totally optional though), don't stop there!
We encourage you to go out and take video of your own, calibrate your camera and
show us how you would implement this project from scratch!
The entire project is implemented in jupyter notebook called lane-detection-pipeline-final.ipynb
First part of the notebook loads the camera calibration images of a chessboard pattern
and uses cv2
functions drawChessboardCorners
and calibrateCamera
to find the calibration
matrix and distortion coefficients.
These parameters are saved in calibration_pickle.p
file.
Then I define functions to undistort
image and function thresholded_binary
to create a binary image from a color image which is most suitable for lane detection later.
It combines function calls for sobel transforms, gradients/gradient magnitudes
and S channel from HLS color space in a manner
that I empirically found to produce good results on various given test images.
The notebook shows intermediate results calculated in this function to arrive at final binary image.
Next is perspective_transform
function that has a shape of transform area defined to
cover the part of the road in front of the car, that is rectangular and covers both lane lines
and goes some distance in front. This area is visualised in the notebook.
It gets transformed into a 'bird-eye' view (where lines should be parallel)
for subsequent line detection.
Then I define Line
class to detect one lane line in a binary image, given input bottom
x coordinate
of its most probable position. The instances of this class hold the information about
detection results between calls and re-use it if there are difficulties in line detection
in subsequent calls.
Next LaneDetection
class is defined to detect both lane lines given unprocessed color images
(individual or from video stream).
It keeps track of both detected lines using Line
objects. It also checks for lines to
be parallel and reasonable distance apart.
It relies on Line objects holding the state from image to image and using it for successful
detection in subsequent images.
If detection fails in few consequtive frames
it tries to detect lines afresh using histogram window.
The lane with low confidence of right detection is annotated with lines shown in red.
Frames with high confidence detection have lines shown in blue.
OpenCV functions or other methods were used to calculate the correct camera matrix and distortion coefficients using the calibration chessboard images provided in the repository. The distortion matrix should be used to un-distort the test calibration image provided as a demonstration that the calibration is correct
Alexey Simonov:
cv2
functions drawChessboardCorners
and calibrateCamera
used. See first part of the notebook.
Distortion correction that was calculated via camera calibration has been correctly applied to each image.
Alexey Simonov:
Function undistort
is called as part of pipeline in LaneDetector.process_image
for each processed image.
At least two methods (i.e., color transforms, gradients) have been combined to create a binary image containing likely lane pixels. There is no "ground truth" here, just visual verification that the pixels identified as part of the lane lines are, in fact, part of the lines.
Alexey Simonov:
The function that creates binary image is threshold_binary
.
I have combined:
- S channel in HLS color space (see
hls_s_thresholds
), - sobel transforms for X and Y (
sobel_abs_thresholds
), - sobel gradient magnitude (
sobel_magnitude_thresholds
), - sobel gradient direction (
sobel_gradidir_thresholds
)
They are combined as:
binary = (sx & smag) || ( (sy & sgrad) || ( (smag & sgrad) || (sx || s) ) )
where:
- sx is sobel x,
- smag is sobel gradient magnitude,
- sy is sobel y,
- sgrad is sobel gradient,
- s is S channel in HLS space
I have found this emprically superior to other combinations I tried. The resulting image has quite distinct lane lines that are good input for later stages in the pipeline.
OpenCV function or other method has been used to correctly rectify each image to a "birds-eye view"
Alexey Simonov:
The function that creates 'top-down' image is perspective_transform
.
It calls getPerspectiveTransform
and warpPerspective
functions from OpenCV
.
I have found the shape of the area to be transformed using empirical tests -- looking for resulting top down image to have lane lines as parallel as possible on provided 6 test images.
Methods have been used to identify lane line pixels in the rectified binary image. The left and right line have been identified and fit with a curved functional form (e.g., spine or polynomial).
Alexey Simonov:
Two functions from Line
class do this.
find_left_right_x
(static) method uses histogram to find most probably locations for left and right lines in the 'top-down' view of the road in front. It does some basic checks to estimate the confidence that the lines are detected correctly. If returns two x coordinates -- one for each line. If no good candidates found one or both x returned are None. This is a static method so it does not change state of Line object on which it is called.
fit_from_x_on_image
method is given initial x coordinate at the bottom of the image to then apply sliding window procedure to find other pixels along the potential line. It then fits second order polynomial through these points.
use_last_good_fit
method is similar to fit_from_x_on_image
but is NOT given initial x coordinate. It uses the last fit results to draw line on new image. It is called from the outside by Detector
class when two detected lines are either too close or not parallel. The lines shown using this method are denoted in red by Detector
to signify low confidence of detection.
Here the idea is to take the measurements of where the lane lines are and estimate how much the road is curving and where the vehicle is located with respect to the center of the lane. The radius of curvature may be given in meters assuming the curve of the road follows a circle and the position of the vehicle within the lane may be given as meters off of center.
Alexey Simonov:
Function calc_radius_of_curvature
calculates radius of curvature using polynomial coefficients. It calculates that in both pixel and meter coordinates. The position of the line compared to vehicle center is calculated inside
The fit from the rectified image has been warped back onto the original image and plotted to identify the lane boundaries. This should demonstrate that the lane boundaries were correctly identified.
Alexey Simonov:
LaneDetector.annotate_undistorted_image
function warps fitted polynomials and the region between them back from 'top-down' image into original undistorted image.
LaneDetector
class also has the ability to show 'diagnostic' view, combining the final annotated image with binary thresholded image, top-down view and annotated top-down view to easier identify where pipeline struggles.
The image processing pipeline that was established to find the lane lines in images successfully processes the video. The output here should be a new video where the lanes are identified in every frame, and outputs are generated regarding the radius of curvature of the lane and vehicle position within the lane. The identification and estimation don't need to be perfect, but they should not be wildly off in any case. The pipeline should correctly map out curved lines and not fail when shadows or pavement color changes are present.
Alexey Simonov:
The notebook produces project_video_annotated.mp4
video, using the provided video of the driving. We can see in the resulting video that detection is achieved after first 5 frames. And it is maintained for the duration of the video. In couple of moments the confidence in detected lines drops and they are show in red. The detection results from previous frames are used in this case, up to 5 failed frames. At which point the pipeline is trying to detect initial x coordinates afresh using histogram method.
In the first few frames of video, the algorithm should perform a search without prior assumptions about where the lines are (i.e., no hard coded values to start with). Once a high-confidence detection is achieved, that positional knowledge may be used in future iterations as a starting point to find the lines.
Alexey Simonov:
LaneDetector.process_image
(main pipeline function) counts the frames as part of the processing. Once it reaches the specified number of frames (passed to class constructor or defaulted to class constant _initial_images_number
) it starts to annotate images with detected lane lines and the space between them. For the first few frames it uses the histogram search method from Line
class. The subsequent frames use the prior knowledge and start detecting the lines from information found in previous frames.
As soon as a high confidence detection of the lane lines has been achieved, that information should be propagated to the detection step for the next frame of the video, both as a means of saving time on detection and in order to reject outliers (anomalous detections).
Alexey Simonov:
The Line
class is persisting the information from previous successful line detection. LaneDetector
class has two instances of Line
objects -- one for left and right lines.
The Readme file submitted with this project includes a detailed description of what steps were taken to achieve the result, what techniques were used to arrive at a successful result, what could be improved about their algorithm/pipeline, and what hypothetical cases would cause their pipeline to fail.
Alexey Simonov:
This is the README file.
The steps taken for the project were:
- camera calibration implemented and used for to undistort input images
- computer vision techniques implemented to transform color image into binary image with as distinct pixels of the lane lines as possible
- perspective transform implemented to get the 'top-down' view of the road for easy line detection
Line
class implemented to hold the state of line detection and implement initial search using histogram methods, as well as detailed line pixel detection using sliding window method. It then fits the lines using parabolic functions and calculates few details like line curvature and position with respect to the vehicle.LaneDetector
class implemented to combine all stages of the individual image pipeline and extend it to video processing.
What can be improved:
- the individual images pipeline can be improved to work better on images from
challenge_video
andharder_challenge_video
. The current version does not work really well there.
- It fails on images with shadows and sun glare on the screen. May be fine-tuning thresholding parameters will help here. May be different way of combining different filters will also improve the results.
- It also fails when curvature of the lane is too steep. May be decreasing size of sliding window size can help here.
- the video pipeline can be improved with
- more confidence tests of successfull line detection and
- fine-tune detection by calling existing
Line
methods with slightly better guesses - could use more involved logic given different combinations of detection confidence measures, line positions and curvature
NB: there is no output_images
folder showing examples for each stage of the pipeline. Instead the provided jupyter notebook displays intermediate images as the pipeline is defined.