The Computer Vision: October 2011

Sunday, October 30, 2011

Better design and organization are good news:

As I said in my previous post, I decided to restructure my entire project so that I could isolate problems. Over the last two weeks I have been working hard to get back to where I was before. Yesterday I finally got there, but now with one advantage: the project is better organized and problems will be easier to detect.

I haven't integrated the loop detection and graph optimization functionality yet, but I now have the classes needed to perform visual odometry. Over the past few days I have also made many optimizations and added the option of using ORB (Oriented FAST and Rotated BRIEF) for the feature detection and descriptor extraction step.

I have reconstructed a room so you can get an idea of the results I am getting. This room was a challenge because it was poorly lit and lacked visual features; nevertheless, the latest implementation of my project was able to reconstruct it quite accurately. The video below shows the process:

Reconstruction of a room using a handheld Kinect (visual odometry). This approach is based on pairwise alignment and uses SURF-GPU for 2D feature matching and ICP for pose refinement.
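To make "pairwise alignment" a bit more concrete, here is a minimal sketch of how frame-to-frame transforms can be chained into global camera poses. It is not the code of my project: the helper names (`rot_z`, `compose`, `accumulate`) are made up for illustration, and I am assuming that each relative transform expresses the pose of the new frame in the coordinates of the previous one.

```python
import math

def rot_z(angle_rad, tx=0.0, ty=0.0, tz=0.0):
    """4x4 homogeneous transform: rotation about Z plus a translation."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def compose(a, b):
    """Matrix product a * b of two 4x4 transforms."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def accumulate(relative_transforms):
    """Chain frame-to-frame transforms into a list of global poses."""
    pose = rot_z(0.0)  # identity: the first frame defines the world origin
    poses = [pose]
    for step in relative_transforms:
        # Assumption: 'step' is the pose of frame k expressed in frame k-1.
        pose = compose(pose, step)
        poses.append(pose)
    return poses

# Hypothetical trajectory: move 1 m and turn 90 degrees, four times (a square).
steps = [rot_z(math.pi / 2, tx=1.0) for _ in range(4)]
poses = accumulate(steps)
```

With perfect pairwise estimates the last pose lands back on the first one. In practice every step carries a small error, and because the poses are chained, those errors add up along the trajectory.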

In my experiments, ORB has proven to be considerably faster than SURF-GPU at feature detection and descriptor extraction. At first I thought it would be an excellent alternative to SURF-GPU, since it could significantly reduce the computation time. The problem with ORB is that it detects few 2D features when there are not many "corners" in the image. This lack of features makes the visual pose approximation less accurate and, as a result, ICP converges to worse solutions. SURF-GPU, on the other hand, is considerably slower than ORB, but it produces a large set of features ("blobs") in many situations, leading to good pose approximations. Hence, SURF-GPU+ICP converges to good solutions even when there are few "corners" in the image.

In this version I have also added the ability to use the original Stanford GICP implementation. It gives better results than ICP when the point clouds are relatively far apart, but similar results when they are close enough. Since the visual pose approximation is usually relatively good, GICP and ICP produce very similar results, so I decided to use ICP instead of GICP because the former takes less computation time.
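To illustrate why a good initial guess matters for ICP, here is a small point-to-point ICP sketch. It is a simplification and not the implementation I use, nor the Stanford GICP one: it works in 2D, uses brute-force nearest-neighbour search, and all function names are hypothetical.

```python
import math

def apply(theta, tx, ty, points):
    """Rotate 2D points by theta and translate them by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def best_rigid_2d(src, dst):
    """Closed-form least-squares rigid transform mapping paired src onto dst."""
    n = len(src)
    sx, sy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    dx, dy = sum(q[0] for q in dst) / n, sum(q[1] for q in dst) / n
    num = den = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy  # centred source point
        bx, by = qx - dx, qy - dy  # centred target point
        num += ax * by - ay * bx   # accumulates the sine component
        den += ax * bx + ay * by   # accumulates the cosine component
    theta = math.atan2(num, den)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)

def icp_2d(src, dst, init=(0.0, 0.0, 0.0), iterations=20):
    """Refine an initial (theta, tx, ty) guess with point-to-point ICP."""
    theta, tx, ty = init
    for _ in range(iterations):
        moved = apply(theta, tx, ty, src)
        # Pair every moved source point with its closest target point.
        matched = [min(dst, key=lambda q: (q[0] - m[0]) ** 2 + (q[1] - m[1]) ** 2)
                   for m in moved]
        theta, tx, ty = best_rigid_2d(src, matched)
    return theta, tx, ty

# Hypothetical data: an L-shaped cloud and a slightly moved copy of it.
src = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
true_motion = (0.05, 0.1, -0.05)
dst = apply(*true_motion, src)
estimate = icp_2d(src, dst)
```

Here the two clouds start close together, so the nearest-neighbour pairing is already correct and the estimate matches the true motion. With a poor initial guess the pairing can be wrong and ICP may settle in a worse local minimum, which is why the quality of the visual pose approximation matters so much.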

[Updated]

This is another video, using SURF-GPU for 2D feature matching and Generalized ICP for pose refinement:

Wednesday, October 12, 2011

Take three steps backwards to take a leap forward:

It has been more than two weeks since my last entry, so I have decided to write a new one to summarize what I have been doing during that time.

As I mentioned in my previous entry, the next milestone of my FYP consists of building a graph with keyframe poses as nodes and the rigid transformations between them as edges. The main objective is to optimize that graph so that the error accumulated while performing visual odometry is corrected.
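The idea of the graph can be sketched as follows. This is only an illustration and not the g2o or MRPT API: it uses 2D poses `(x, y, theta)` instead of full 3D poses, and the `Edge`, `PoseGraph`, `compose`, `relative` and `edge_error` names are made up for this example.

```python
import math
from dataclasses import dataclass, field

def wrap(angle):
    """Wrap an angle to the range [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def compose(pose, delta):
    """Apply a relative motion 'delta' (given in the frame of 'pose')."""
    x, y, th = pose
    dx, dy, dth = delta
    c, s = math.cos(th), math.sin(th)
    return (x + c * dx - s * dy, y + s * dx + c * dy, wrap(th + dth))

def relative(a, b):
    """Pose of b expressed in the frame of a."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    c, s = math.cos(a[2]), math.sin(a[2])
    return (c * dx + s * dy, -s * dx + c * dy, wrap(b[2] - a[2]))

@dataclass
class Edge:
    src: int
    dst: int
    measurement: tuple  # measured pose of dst in the frame of src

@dataclass
class PoseGraph:
    nodes: dict = field(default_factory=dict)  # keyframe id -> (x, y, theta)
    edges: list = field(default_factory=list)

def edge_error(graph, edge):
    """Difference between what the node poses imply and what was measured."""
    px, py, pth = relative(graph.nodes[edge.src], graph.nodes[edge.dst])
    mx, my, mth = edge.measurement
    return (px - mx, py - my, wrap(pth - mth))

# Hypothetical odometry around a square, with a small heading bias per step.
odometry = (1.0, 0.0, math.pi / 2 + 0.02)
graph = PoseGraph(nodes={0: (0.0, 0.0, 0.0)})
for k in range(4):
    graph.nodes[k + 1] = compose(graph.nodes[k], odometry)
    graph.edges.append(Edge(k, k + 1, odometry))

# Loop closure: keyframe 4 is recognised as the same place as keyframe 0.
loop = Edge(4, 0, (0.0, 0.0, 0.0))
graph.edges.append(loop)
odometry_error = edge_error(graph, graph.edges[0])
loop_error = edge_error(graph, loop)
```

The odometry edges have zero error because the node poses were initialized from them, while the loop closure edge exposes the accumulated drift. A graph optimizer adjusts the node poses to minimize the sum of squared (weighted) edge errors, spreading that drift over the whole trajectory.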

The first task was to integrate the g2o library into my project so that I could generate and optimize the graph. This task was not too difficult, although I must confess that it took me more time than originally expected.

Once this part was integrated, I started testing the application by comparing the global map produced by the optimization process with the unoptimized global map. The bad news was that the results were pretty bad, and the fault obviously lies with me rather than with the developers of g2o.

After a few days of trying to fix those problems with the g2o library, I decided to do the graph optimization with the MRPT library instead, to see whether I could get better results that way. This also took me some days and unfortunately didn't work as I expected. Curiously, the global maps obtained by optimizing small graphs with the MRPT library seemed to be slightly better than the unoptimized maps. However, when I tried to optimize the graph of a full room reconstruction, the results were much worse with optimization than with odometry alone.

Global map without graph optimization:

Global map with graph optimization (small MRPT graph):

At this point, I decided that the best thing I could do was to clean up the code and restructure it into classes to help me find the problem. This is what I have been doing for the last four days, and I think refactoring the whole project will take at least one or two more weeks.

Therefore, in the coming weeks my work will not consist of adding new functionality, but of improving what I have and rebuilding the project on a new base. Perhaps this way I will be able to find the problem and fix it. In any case, this won't be work in vain. I'll take three steps backwards to take a big leap forward!