I haven't integrated the loop detection and graph optimization functionality yet, but I now have the classes needed to perform visual odometry. Over the past few days I have also made many optimizations and added the option of using ORB (Oriented FAST and Rotated BRIEF) for the feature detection and descriptor extraction step.
I have reconstructed a room so you can get an idea of the results I am getting. This room was a challenge because it was poorly lit and lacked visual features, but the latest implementation of my project was able to reconstruct it quite accurately. Here is a video of the process:
Reconstruction of a room using a handheld Kinect (visual odometry). This approach is based on pairwise alignment and uses SURF-GPU for 2D feature matching and ICP for pose refinement.
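To make the idea of pairwise alignment more concrete, here is a minimal sketch of how the relative transforms between consecutive frames can be chained into a global camera trajectory. It is not the code of this project: the function names (`matmul4`, `accumulate_poses`, and so on) are hypothetical, and it assumes that each relative transform maps points from frame k into the coordinates of frame k-1, so that the global pose of frame k is the global pose of frame k-1 multiplied by that transform.

```python
import math


def identity4():
    """Return a 4x4 identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def matmul4(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(x, y, z):
    """Homogeneous transform for a pure translation."""
    t = identity4()
    t[0][3], t[1][3], t[2][3] = x, y, z
    return t


def rotation_z(angle):
    """Homogeneous transform for a rotation of `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    r = identity4()
    r[0][0], r[0][1] = c, -s
    r[1][0], r[1][1] = s, c
    return r


def accumulate_poses(relative_transforms):
    """Chain pairwise transforms into global poses.

    Assumption: relative_transforms[k] maps frame k+1 into frame k.
    The first frame defines the world origin (identity pose).
    """
    poses = [identity4()]
    for t in relative_transforms:
        poses.append(matmul4(poses[-1], t))
    return poses


# Example: the camera turns 90 degrees about z, then moves 1 unit along its own x axis.
trajectory = accumulate_poses([rotation_z(math.pi / 2), translation(1.0, 0.0, 0.0)])
```

Because every global pose is a product of all the previous pairwise estimates, any error in one alignment propagates to all later frames. This drift is the reason loop detection and graph optimization are still worth adding.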
In my experiments, ORB has proven to be considerably faster than SURF-GPU in the feature detection and descriptor extraction process. At first I thought it would be an excellent alternative to SURF-GPU, since it could significantly reduce the computation time. The problem with ORB is that it detects few 2D features when there are not many "corners" in the image. This lack of features makes the visual pose approximation less accurate and, as a result, ICP converges to worse solutions. SURF-GPU, on the other hand, is considerably slower than ORB, but it produces a large set of features ("blobs") in many situations, which leads to good pose approximations. Hence, SURF-GPU+ICP converges to good solutions even when there are few "corners" in the image.
In this version I have also added the ability to use the original Stanford GICP implementation. This implementation gives better results than ICP when the point clouds are relatively far apart, and similar results to ICP when the point clouds are close enough. Since the visual pose approximation is usually quite good, GICP and ICP produce very similar results, so I decided to use ICP instead of GICP because it takes less computation time.
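The role of the initial pose in ICP can be illustrated with a toy example. The following is a minimal 2D point-to-point ICP sketch, not the 3D implementation used in this project and not GICP. All names (`best_rigid_2d`, `icp_2d`, `nn_error`) are hypothetical, poses are represented as `(theta, tx, ty)` tuples, and correspondences are found by brute-force nearest neighbour search, which is only practical for tiny point sets.

```python
import math


def apply_pose(pose, points):
    """Apply a 2D rigid pose (theta, tx, ty) to a list of (x, y) points."""
    theta, tx, ty = pose
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def best_rigid_2d(src, dst):
    """Closed-form least-squares pose mapping paired points src[i] onto dst[i]."""
    n = len(src)
    cxs = sum(p[0] for p in src) / n
    cys = sum(p[1] for p in src) / n
    cxd = sum(p[0] for p in dst) / n
    cyd = sum(p[1] for p in dst) / n
    s_cos = 0.0
    s_sin = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        ax, ay = xs - cxs, ys - cys      # centred source point
        bx, by = xd - cxd, yd - cyd      # centred destination point
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    tx = cxd - (c * cxs - s * cys)
    ty = cyd - (s * cxs + c * cys)
    return theta, tx, ty


def nearest(point, candidates):
    """Brute-force nearest neighbour of `point` among `candidates`."""
    px, py = point
    return min(candidates, key=lambda q: (q[0] - px) ** 2 + (q[1] - py) ** 2)


def nn_error(pose, source, target):
    """Sum of squared distances from each transformed source point to its nearest target point."""
    total = 0.0
    for p in apply_pose(pose, source):
        q = nearest(p, target)
        total += (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2
    return total


def icp_2d(source, target, init_pose=(0.0, 0.0, 0.0), iterations=20):
    """Point-to-point ICP: alternate nearest-neighbour matching and closed-form alignment."""
    pose = init_pose
    for _ in range(iterations):
        moved = apply_pose(pose, source)
        matched = [nearest(p, target) for p in moved]
        pose = best_rigid_2d(source, matched)
    return pose


# Example: refine a rough initial guess (standing in for the visual pose approximation).
target = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
source = apply_pose((-0.05, 0.1, 0.05), target)
refined = icp_2d(source, target, init_pose=(0.04, -0.08, -0.04))
```

In this sketch the quality of the result depends on the nearest-neighbour matches made in the first iteration, which in turn depend on `init_pose`. This matches what I observe in practice: when the feature-based approximation is already close, plain ICP is enough, and a poor approximation leads it to a worse local solution.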
[Updated]
This is another video, using SURF-GPU for 2D feature matching and Generalized ICP for pose refinement: