The first day of CVPR was packed with good talks that show where computer vision research is heading right now. Much of day one was devoted to object detection, especially with convolutional neural networks (CNNs, a.k.a. the deep learning approach).
Here I just report some interesting work:
Matching and Alignment:
- Learning to Assign Orientations to Feature Points: Learning orientations implicitly with a CNN and using them in 3D reconstruction helps recover the parts that alignment would otherwise miss, so the result has fewer holes. It sounds like the orientation of an image patch can play a key role in image alignment.
- Learning Dense Correspondence via 3D-Guided Cycle Consistency: Applied directly to cars, this paper is about finding matches between two images. The similarity needs to hold at the component level. That way, you can reconstruct image B using pixels from image A while still maintaining the structure and orientation of image B. It also shows how to align a 3D model to a 2D image. To handle possible occlusion, they learn matchability. A possible extension is to grow the patch to cover the entire target, so that even under occlusion we can get a fully recovered image.
- The Global Patch Collider: Finds patches that match across different images by forest voting.
- Joint Probabilistic Matching Using m-Best Solutions: A small optimization that uses a weighted sampling function to choose several sub-optimal solutions.
- Face Alignment Across Large Poses: A 3D Solution: Traditionally, face alignment relies on all the tracking points being visible, which is too strong an assumption. To train on large-pose tracking data, we normally do not have this kind of labeled data. In this paper, the authors synthesize training data by aligning a morphable model to a face at any pose with known pose information, which gives the 3D positions and the corresponding 2D intensities plus the pose. A CNN can then be trained to locate the correspondences.
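
To make the pixel-transfer idea from the dense correspondence paper above a bit more concrete, here is a minimal sketch of my own (not the authors' code). It assumes the correspondence has already been estimated and is given as a per-pixel lookup table; the function name, the toy images, and the use of None for occluded pixels are all made up for illustration.

```python
def reconstruct_from_correspondence(image_a, correspondence):
    """Rebuild an image on B's pixel grid using only pixel values taken from A.

    image_a: 2D list of pixel values (rows of image A).
    correspondence: 2D list shaped like image B; entry [r][c] is either a
        (row, col) position in image A, or None when that pixel of B has
        no visible match in A (for example because of occlusion).
    """
    reconstructed = []
    for row in correspondence:
        out_row = []
        for match in row:
            if match is None:
                out_row.append(None)  # unmatched pixel stays a hole
            else:
                a_r, a_c = match
                out_row.append(image_a[a_r][a_c])
        reconstructed.append(out_row)
    return reconstructed


# Toy 2x2 "image A" (hypothetical values).
image_a = [[10, 20],
           [30, 40]]
# Hypothetical map: B looks like A mirrored left-to-right, one pixel occluded.
correspondence = [[(0, 1), (0, 0)],
                  [(1, 1), None]]

image_b_hat = reconstruct_from_correspondence(image_a, correspondence)
print(image_b_hat)  # [[20, 10], [40, None]]
```

The output keeps B's layout (it is indexed on B's grid) but every value comes from A, and the None entry is the kind of hole a matchability prediction is supposed to flag.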
The spotlight session covered Segmentation and Contour Detection.
- Affinity CNN: Learning Pixel-Centric Pairwise Relations for Figure/Ground Embedding: I should look into this one.
After that I basically went to the poster session and took photos of the posters I was interested in. One talk about a real-time (80 fps) CNN, at the cost of lower accuracy, got my attention. With low memory bandwidth, full code, and a “How to run” tutorial, this could be a very good way to try out some cool ideas. The details can be found at Pjreddie.com/yolo.
Here are some poster photos:
At the end of the first day, the best paper award and related honors were announced. MSRA’s new deep learning model, “Deep Residual Learning for Image Recognition”, shows Microsoft’s position in this deep learning battle. Having won all the major competitions in 2015, the model does not sound very elegant, but it works. Giving the best paper award to this paper sets the tone of this CVPR: it is still “Deep Learning”. Later during CVPR we heard that one of the paper’s authors, Jian Sun, had been recruited away from Microsoft Research Asia to Face++ with a very high salary (something like eight figures in Chinese yuan). As far as I know, a good PhD student focusing on deep learning normally doesn’t need to worry about jobs or salary at all right now. They are scarce in the market because there is so much data but so few people who know how to mine it.