1. Handle the 'n-to-1' cases (several features matched to the same feature): keep only the most important match
2. Define an attention area in both the Kinect frame and the hand-held camera frame:
- keep the information which is 60% of the original frame (... just use the central 70% of the information)
- Reason:
a. A person's visual attention is usually on the most central part of a view (... in the hand-held frame)
b. Information close to the boundary is not important (... in the Kinect frame)
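A minimal sketch of the attention area in item 2, assuming the kept fraction is applied to each image axis (the notes do not say whether the percentage refers to side length or area). The function name `attention_window` and the default `keep=0.7` are illustrative choices, not part of the original pipeline.

```python
from typing import Tuple


def attention_window(width: int, height: int, keep: float = 0.7) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a centred window.

    `keep` is the fraction of each axis that is retained (an assumption:
    the notes do not state whether the percentage is per axis or per area).
    """
    if not 0.0 < keep <= 1.0:
        raise ValueError("keep must be in (0, 1]")
    win_w = max(1, round(width * keep))
    win_h = max(1, round(height * keep))
    # Centre the window so the same margin is dropped on opposite sides.
    left = (width - win_w) // 2
    top = (height - win_h) // 2
    return left, top, left + win_w, top + win_h
```

With a NumPy-style image array the window would be applied as `frame[top:bottom, left:right]`, and the same call can be used for both the Kinect frame and the hand-held frame so that features near the boundary are never extracted.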
3. Re-confirm the metric for similarity computation:
- Use SAD (sum of absolute differences) + NearestNeighborRatio
- 128-length SURF descriptors extracted at each MSER feature location
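The matching step in items 1 and 3 can be sketched roughly as below. This is a plain-Python illustration, not the toolbox routine actually used: descriptors are taken as plain sequences of numbers, the ratio threshold `max_ratio=0.6` is a hypothetical value, and "most important" in the n-to-1 case is assumed to mean "smallest SAD distance".

```python
from typing import Dict, List, Sequence, Tuple

Descriptor = Sequence[float]
Match = Tuple[int, int, float]  # (query index, target index, SAD distance)


def sad(a: Descriptor, b: Descriptor) -> float:
    """Sum of absolute differences between two equal-length descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def ratio_match(query: Sequence[Descriptor],
                target: Sequence[Descriptor],
                max_ratio: float = 0.6) -> List[Match]:
    """Nearest-neighbour-ratio matching under the SAD metric."""
    matches: List[Match] = []
    if len(target) < 2:
        return matches  # the ratio test needs a second-nearest neighbour
    for qi, q in enumerate(query):
        dists = sorted((sad(q, t), ti) for ti, t in enumerate(target))
        (best, best_ti), (second, _) = dists[0], dists[1]
        # Accept only if the best match is clearly better than the runner-up.
        if best < max_ratio * second:
            matches.append((qi, best_ti, best))
    return matches


def keep_best_per_target(matches: Sequence[Match]) -> List[Match]:
    """Resolve n-to-1 cases: keep one match per target feature.

    Assumption: the 'most important' match is the one with the lowest SAD.
    """
    best: Dict[int, Match] = {}
    for m in matches:
        ti = m[1]
        if ti not in best or m[2] < best[ti][2]:
            best[ti] = m
    return sorted(best.values())
```

For example, if two query descriptors both pass the ratio test against the same target descriptor, `keep_best_per_target` retains only the pair with the smaller distance, so the final number of matches counts each target feature at most once.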
4. Partial results for the scaling cases (... the basic cases): the number of matches is around 20 to 90
Scaling 1: before improvement
Scaling 1: after improvement
Scaling 2: before improvement
Scaling 2: after improvement
Scaling 5: before improvement
Scaling 5: after improvement
5. Partial results for the negative cases: the number of matches is lower than 10
II. Partial results for changing viewpoint:
- the number of matches follows a similar trend to the two cases above
III. Next: consider introducing the color and depth information from the RGB-D camera (i.e. the Kinect) to see whether it makes a difference in distinguishing positive and negative cases