As I was saying, I started the project in early June, and since then I have been working hard to finish it as soon as possible. The first stage of my FYP consisted of reading a series of articles related to my project, as well as familiarizing myself with the open source libraries that I would use to implement it.
Some of the articles I have read, and on which I am basing my project, are:
- Simultaneous Localization and Mapping (SLAM): Part I, by Hugh Durrant-Whyte and Tim Bailey.
- Simultaneous Localization and Mapping (SLAM): Part II, by Tim Bailey and Hugh Durrant-Whyte.
- RGB-D Mapping: Using Depth Cameras for Dense 3D Modeling of Indoor Environments, by Peter Henry, Michael Krainin, Evan Herbst, Xiaofeng Ren and Dieter Fox.
- Generalized-ICP, by Aleksandr V. Segal, Dirk Haehnel and Sebastian Thrun.
- Real-time 3D visual SLAM with a hand-held RGB-D camera, by Nikolas Engelhard, Felix Endres, Jürgen Hess, Jürgen Sturm and Wolfram Burgard.
- Scene reconstruction from Kinect motion, by Marek Šolony.
- Realtime Visual and Point Cloud SLAM, by Nicola Fioraio and Kurt Konolige.
The open source libraries I have been getting familiar with are:
- OpenCV: http://opencv.willowgarage.com/wiki/Welcome
- PCL: http://pointclouds.org/
- MRPT: http://www.mrpt.org/
- CUDA: http://developer.nvidia.com/cuda-toolkit-40
The first developments were some tests, such as 2D feature matching and 3D point cloud alignment; a sketch of the matching test follows the captions below.
[Embedded videos: 2D feature matching | ICP alignment (only)]
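For illustration, the core of the 2D feature matching test could look like the sketch below (OpenCV 2.x API; the SURF threshold and the brute-force matching strategy are assumptions, not necessarily the exact parameters I used):

```cpp
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/nonfree/features2d.hpp> // SURF lives here in OpenCV 2.4+

// Detect SURF keypoints in both frames, extract descriptors and match
// them by brute force on the L2 distance.
void matchFrames(const cv::Mat &img1, const cv::Mat &img2,
                 std::vector<cv::DMatch> &matches)
{
    cv::SURF surf(400.0); // Hessian threshold: an assumed value
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat desc1, desc2;
    surf(img1, cv::Mat(), kp1, desc1);
    surf(img2, cv::Mat(), kp2, desc2);

    cv::BFMatcher matcher(cv::NORM_L2);
    matcher.match(desc1, desc2, matches);
}
```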
The next step consisted of developing the first SLAM application using point clouds only. In this first approach I used the PCL ICP implementation to align consecutive point clouds. The main problem with this version was that ICP wasn't initialized with any pose estimate, so both the convergence time and the estimated pose were terrible.
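A minimal sketch of this ICP-only alignment, assuming PCL's IterativeClosestPoint with illustrative parameter values:

```cpp
#include <pcl/point_types.h>
#include <pcl/registration/icp.h>

// Align a new cloud against the previous one with PCL's ICP. With no
// initial guess, ICP starts from the identity transform, which is why
// convergence was so slow in this first version.
Eigen::Matrix4f alignICP(const pcl::PointCloud<pcl::PointXYZ>::Ptr &source,
                         const pcl::PointCloud<pcl::PointXYZ>::Ptr &target)
{
    pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;
    icp.setInputSource(source); // setInputCloud in early PCL releases
    icp.setInputTarget(target);
    icp.setMaximumIterations(50);          // assumed value
    icp.setMaxCorrespondenceDistance(0.1); // assumed value, in meters

    pcl::PointCloud<pcl::PointXYZ> aligned;
    icp.align(aligned); // no initial guess: starts from the identity
    return icp.getFinalTransformation();
}
```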
To reduce ICP's execution time and get better convergence, I started working on the visual pose approximation. The first thing I did was to obtain the 3D points corresponding to the 2D features of the RGB image.
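Assuming the Kinect driver delivers an organized point cloud registered to the RGB image (one 3D point per pixel), the lookup can be as simple as the sketch below (the function name and bounds handling are my own illustration):

```cpp
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/common/point_tests.h> // pcl::isFinite

// Look up the 3D point behind a 2D keypoint in an organized point cloud.
bool get3DPoint(const pcl::PointCloud<pcl::PointXYZ> &cloud,
                const cv::KeyPoint &kp, pcl::PointXYZ &p)
{
    const int u = cvRound(kp.pt.x);
    const int v = cvRound(kp.pt.y);
    if (u < 0 || v < 0 || u >= (int)cloud.width || v >= (int)cloud.height)
        return false;
    p = cloud(u, v);         // (column, row) access on organized clouds
    return pcl::isFinite(p); // Kinect depth is NaN where it has no reading
}
```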
Getting a visual pose approximation good enough to let ICP converge to good solutions took me quite some time. In late August I got a first version that could align two consecutive RGB-D frames using the RGB images and the point clouds.
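In outline, the two-stage alignment could look like this (a sketch only, assuming PCL's SVD-based rigid transformation estimation for the visual part; function and variable names are illustrative):

```cpp
#include <pcl/registration/icp.h>
#include <pcl/registration/transformation_estimation_svd.h>

// Two-stage alignment: a closed-form rigid transform from the matched 3D
// feature pairs gives the visual pose approximation, which then seeds ICP.
Eigen::Matrix4f alignWithVisualGuess(
    const pcl::PointCloud<pcl::PointXYZ>::Ptr &srcFeat, // matched 3D features
    const pcl::PointCloud<pcl::PointXYZ>::Ptr &tgtFeat, // same order as srcFeat
    const pcl::PointCloud<pcl::PointXYZ>::Ptr &srcCloud,
    const pcl::PointCloud<pcl::PointXYZ>::Ptr &tgtCloud)
{
    // Visual pose approximation: SVD-based rigid transform estimation.
    Eigen::Matrix4f guess;
    pcl::registration::TransformationEstimationSVD<pcl::PointXYZ, pcl::PointXYZ> svd;
    svd.estimateRigidTransformation(*srcFeat, *tgtFeat, guess);

    // ICP refinement seeded with the visual estimate.
    pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;
    icp.setInputSource(srcCloud);
    icp.setInputTarget(tgtCloud);
    pcl::PointCloud<pcl::PointXYZ> aligned;
    icp.align(aligned, guess); // overload that takes an initial guess
    return icp.getFinalTransformation();
}
```

Seeding ICP this way is what cuts the convergence time: the optimizer starts near the true pose instead of at the identity.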
Over the last month I have been making multiple optimizations to my code. I introduced GPU SURF for faster feature detection and descriptor extraction, integrated the original implementation of GICP to improve the alignment between point clouds, replaced the MRPT frame capture functionality with the PCL functions to avoid copies between data structures, and much more.
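As an example of one of these optimizations, GPU SURF through OpenCV's gpu module looks roughly like the following (a sketch against the OpenCV 2.x API; the Hessian threshold is an assumed value):

```cpp
#include <opencv2/gpu/gpu.hpp> // SURF_GPU (moved to opencv2/nonfree/gpu.hpp in 2.4)

// Detect SURF keypoints and extract descriptors on the GPU, then download
// the results to the CPU for matching.
void gpuSurf(const cv::Mat &gray,
             std::vector<cv::KeyPoint> &keypoints,
             std::vector<float> &descriptors)
{
    cv::gpu::GpuMat img(gray);     // upload the grayscale frame to the GPU
    cv::gpu::SURF_GPU surf(400.0); // Hessian threshold: assumed value

    cv::gpu::GpuMat kpGpu, descGpu;
    surf(img, cv::gpu::GpuMat(), kpGpu, descGpu);

    surf.downloadKeypoints(kpGpu, keypoints);
    surf.downloadDescriptors(descGpu, descriptors);
}
```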
The video that follows shows a map generated with the latest version I have implemented. It is still under development, but this is one of the first functional versions. Future releases will introduce loop detection and graph optimization to avoid cumulative error.
Have you looked at Furukawa's interior mapping and Manhattan-world papers? He changes some of the assumptions based upon the lack of texture and geometry. http://grail.cs.washington.edu/projects/interior/ (uses Manhattan-world)
http://grail.cs.washington.edu/projects/manhattan/
I saw some of his videos some time ago, but I hadn't read his papers yet. I'm going to read them because they look truly spectacular. Thanks a lot for the links!
Hi,
I am a newbie, and I feel the pointers you gave are very helpful. May I know the application of your project? RGB-D SLAM is already available, so building a 3D map is a part of it; is there anything else you are aiming at? Apart from the PCL documentation and pcl_ros, do you have any links to tutorials where I can start off with the Kinect and PCL? I am a Master's student in robotics and would like to use the Kinect as my primary sensor.
Thanks,
Karthik
Thanks a lot! You may also want to check out my newer entries ;)
@arumugam Thanks a lot! I have added the "Follow by Email" feature that you requested. Hope you like it :)