Computer vision – KURAZUME AND KAWAMURA LABORATORY : REAL-WORLD INFORMATIVE ROBOTICS Faculty of Information Science and Electrical Engineering, Kyushu University

High-speed 3D geometrical modeling using Fast Level Set Method

The level set method, introduced by S. Osher and J. A. Sethian, has garnered significant attention as a topology-free approach to active contour modeling. This method employs an implicit representation of the contour to be tracked, inherently managing the contour's topological changes. Various applications based on the level set method include motion tracking, 3D geometric modeling, and simulations of crystallization or semiconductor growth. However, the computational cost of reinitializing and updating the implicit function is considerably higher than that of conventional active contour models like "Snakes." We propose an efficient algorithm for the level set method, called the Fast Level Set Method (FLSM). The key features of the proposed FLSM are: i) Utilization of extension velocity and the rapid construction of the extension velocity field using the Fast Narrow Band Method. ii) Frequent execution of the reinitialization process for the implicit function, which incurs minimal computational cost. The efficiency of the proposed method is validated through computer simulations and two typical applications: real-time tracking of moving objects in video images and fast 3D surface reconstruction from scattered point data.
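The implicit representation described above can be illustrated with a minimal sketch of the plain level set update (not the proposed FLSM: there is no narrow band, extension velocity, or reinitialization here). It assumes a constant outward speed and a first-order upwind scheme; the grid size, circle positions, and function names are hypothetical choices made for illustration. Two separate circles are expanded until they merge, which the implicit function handles without any explicit bookkeeping of the contour topology.

```python
import numpy as np

def signed_distance_circle(n, center, radius):
    """Signed distance to a circle on an n x n grid (negative inside)."""
    y, x = np.mgrid[0:n, 0:n]
    return np.hypot(x - center[0], y - center[1]) - radius

def evolve(phi, speed, dt, steps):
    """Advance phi_t + speed * |grad phi| = 0 for speed >= 0 (upwind scheme)."""
    phi = phi.copy()
    for _ in range(steps):
        # One-sided differences with grid spacing 1; borders are replicated.
        p = np.pad(phi, 1, mode="edge")
        dxm = p[1:-1, 1:-1] - p[1:-1, :-2]   # backward difference in x
        dxp = p[1:-1, 2:] - p[1:-1, 1:-1]    # forward difference in x
        dym = p[1:-1, 1:-1] - p[:-2, 1:-1]   # backward difference in y
        dyp = p[2:, 1:-1] - p[1:-1, 1:-1]    # forward difference in y
        grad = np.sqrt(np.maximum(dxm, 0) ** 2 + np.minimum(dxp, 0) ** 2
                       + np.maximum(dym, 0) ** 2 + np.minimum(dyp, 0) ** 2)
        phi -= dt * speed * grad
    return phi

# Two disjoint circles represented by one implicit function (hypothetical setup).
n = 64
phi0 = np.minimum(signed_distance_circle(n, (20, 32), 8.0),
                  signed_distance_circle(n, (44, 32), 8.0))
# Expand both fronts; the zero level set merges into a single contour.
phi1 = evolve(phi0, speed=1.0, dt=0.5, steps=12)
```

The point midway between the two circles starts outside the contour (positive phi0) and ends inside it (negative phi1), so the two fronts have merged. A full-grid update like this one is exactly the cost that narrow-band approaches are designed to avoid.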

Bunny (Stanford Univ.)

Wired basket

Real-time tracking of multiple objects using Fast Level Set Method

Simultaneous tracking of moving objects	Fast detection of moving objects
Skeleton extraction	Labeling

Development of robust motion capture system using FLSM and stereo cameras

We are developing a new motion capture system using the Fast Level Set Method and multiple stereo cameras. This system can capture motion data of several people simultaneously, even when they occlude one another. Experiments have been conducted to capture traditional Japanese dance and clothing in 3D.

Motion	Captured data
After texture mapping	Motion capture

Papers

Yumi Iwashita, Ryo Kurazume, Kenji Hara, Seiichi Uchida, Ken'ichi Morooka, and Tsutomu Hasegawa, Fast 3D Reconstruction of Human Shape and Motion Tracking by Parallel Fast Level Set Method, in Proc. IEEE International Conference on Robotics and Automation, pp.980-986, April 2008.
Yumi Iwashita, Ryo Kurazume, Kenji Hara, and Tsutomu Hasegawa, Robust 3D Shape Reconstruction against Target Occlusion using Fast Level Set Method, Proc. The Second Joint Workshop on Machine Perception and Robotics, CD-ROM, 2006.
Yumi Iwashita, Ryo Kurazume, Kenji Hara, and Tsutomu Hasegawa, Robust Motion Capture System against Target Occlusion using Fast Level Set Method, in Proc. IEEE International Conference on Robotics and Automation, pp.168-174, 2006.
Yumi Iwashita, Ryo Kurazume, Tokuo Tsuji, Kenji Hara, and Tsutomu Hasegawa, Fast implementation of level set method and its realtime applications, in Proc. IEEE International Conference on Systems, Man and Cybernetics 2004, pp.6302-6307, 2004.

2D-3D alignment based on geometrical consistency

We have proposed a new registration algorithm for aligning 2D images with 3D geometrical models to reconstruct realistic 3D models of indoor scenes. A common technique for estimating the pose of a 3D model in a 2D image is to match 2D photometric edges with 3D geometrical edges projected onto the 2D image. In indoor settings, however, few features can be robustly extracted from 2D images, and the jump edges of geometrical models are also limited. This makes it difficult to find accurate edge correspondences between the 2D image and the 3D model, so the relative position often has to be set manually close to the correct position beforehand. To overcome this issue, the proposed method first roughly estimates the relative pose by utilizing the geometric consistency of back-projected 2D photometric edges on the 3D model. Once this coarse estimate has converged, an edge-based method is applied for precise pose estimation. The performance of the proposed method is demonstrated through experiments using simulated models of indoor scenes and actual environments measured by range and image sensors.
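The conventional edge-matching step mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the proposed geometric-consistency method: it assumes a pinhole camera with intrinsic matrix K, a pose given by a rotation R and translation t, and edges represented as sampled points. The function names `project` and `edge_alignment_error` are hypothetical.

```python
import numpy as np

def project(points_3d, K, R, t):
    """Project Nx3 model points into the image with a pinhole camera."""
    cam = points_3d @ R.T + t            # model frame -> camera frame
    uv = cam @ K.T                       # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]        # perspective division

def edge_alignment_error(points_3d, edge_pixels, K, R, t):
    """Mean distance from each projected 3D edge point to its nearest 2D edge pixel."""
    proj = project(points_3d, K, R, t)
    dists = np.linalg.norm(proj[:, None, :] - edge_pixels[None, :, :], axis=2)
    return dists.min(axis=1).mean()

# Hypothetical example: four model edge points and their image observations.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
model = np.array([[-0.5, -0.5, 2.0], [0.5, -0.5, 2.0],
                  [0.5, 0.5, 2.0], [-0.5, 0.5, 2.0]])
R_true, t_true = np.eye(3), np.zeros(3)
observed = project(model, K, R_true, t_true)

err_true = edge_alignment_error(model, observed, K, R_true, t_true)
err_off = edge_alignment_error(model, observed, K, R_true, np.array([0.1, 0.0, 0.0]))
```

A pose estimator would minimize this error over R and t. Because the nearest-pixel assignment is only reliable near the correct pose, such a cost needs a good initial guess, which is the limitation the coarse estimation stage addresses.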

2D-3D alignment

Alignment result

Papers

Yumi Iwashita, Kenji Hara, Yuuki Kabashima, Ryo Kurazume, Tsutomu Hasegawa, Robust 2D-3D alignment based on geometrical consistency, The 6th International Conference on 3-D Digital Imaging and Modeling (3DIM2007), August 2007.

Visual servo of mobile manipulator using redundancy

We have proposed a new technique for visual servoing using the concept of "redundancy." The key idea is the use of a "virtual link" that connects the camera and the target positions. This virtual link can be treated as a virtual mechanical link, allowing the null-space operation, which was developed for controlling a redundant manipulator, to be applied in the same manner.
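The null-space operation referred to above is the standard redundancy resolution for manipulators, and a minimal sketch of it follows. It does not model the virtual link itself; the Jacobian values and the function name `redundant_velocity` are hypothetical. The joint velocity is the pseudoinverse solution for the primary task plus a secondary motion projected into the null space of the Jacobian, so the secondary motion does not disturb the primary task.

```python
import numpy as np

def redundant_velocity(J, x_dot, z):
    """Joint velocity: q_dot = J^+ x_dot + (I - J^+ J) z."""
    J_pinv = np.linalg.pinv(J)
    null_projector = np.eye(J.shape[1]) - J_pinv @ J
    return J_pinv @ x_dot + null_projector @ z

# Hypothetical 2D task with 3 joints (one redundant degree of freedom).
J = np.array([[1.0, 0.5, 0.2],
              [0.0, 1.0, 0.3]])
x_dot = np.array([0.1, -0.2])        # desired task-space velocity
z = np.array([1.0, 1.0, 1.0])        # arbitrary secondary joint motion

q_dot = redundant_velocity(J, x_dot, z)
q_dot_min_norm = np.linalg.pinv(J) @ x_dot
```

Both `q_dot` and `q_dot_min_norm` produce the same task-space velocity, but they differ in joint space: the difference lies entirely in the null space of J.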

Tracking using redundancy

Visual servo using redundancy

Place recognition using RGB-D camera and laser scanner

Categorizing places in indoor and outdoor environments is crucial for service robots to work effectively and interact with humans. In this study, we present a method for categorizing different areas using a mobile robot equipped with an RGB-D camera (Microsoft Kinect) or a laser scanner (FARO/Velodyne). Our approach converts depth and color images taken at each location into histograms of local binary patterns (LBPs), whose dimensionality is further reduced using a uniform criterion. These histograms are then combined into a single feature vector, which is categorized using a supervised method. For indoor environments, our technique distinguishes between five place categories: corridors, laboratories, offices, kitchens, and study rooms. Experimental results show that our approach can accurately categorize these places. We also apply the proposed technique to outdoor environments such as parking areas, residential areas, and urban areas. Outdoor place categorization of this kind is potentially useful for autonomous driving.
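The feature construction described above can be sketched as follows, assuming the basic 8-neighbor LBP operator and the usual uniform-pattern reduction (codes with at most two 0/1 transitions keep their own bin, and all others share one bin, giving 59 bins). The exact operator and parameters used in the papers may differ, the function names are hypothetical, and the supervised classifier is omitted.

```python
import numpy as np

def lbp_codes(img):
    """8-neighbor LBP code for every interior pixel of a 2D array."""
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbors listed in circular order around the center pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes += (neighbor >= center).astype(np.int64) << bit
    return codes

def uniform_lookup():
    """Map the 256 LBP codes to 59 bins (58 uniform patterns + 1 shared bin)."""
    table = np.full(256, 58, dtype=np.int64)   # bin 58: non-uniform patterns
    next_bin = 0
    for code in range(256):
        bits = [(code >> i) & 1 for i in range(8)]
        transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
        if transitions <= 2:
            table[code] = next_bin
            next_bin += 1
    return table

def lbp_histogram(img):
    """Normalized 59-bin uniform LBP histogram of one image."""
    bins = uniform_lookup()[lbp_codes(img)]
    hist = np.bincount(bins.ravel(), minlength=59).astype(float)
    return hist / hist.sum()

def place_feature(depth_img, gray_img):
    """Concatenate the depth and intensity histograms into one feature vector."""
    return np.concatenate([lbp_histogram(depth_img), lbp_histogram(gray_img)])
```

For example, `place_feature(depth, gray)` on two images of the same size returns a 118-dimensional vector that could be passed to any supervised classifier. The uniform reduction shrinks each histogram from 256 to 59 bins while keeping the patterns that correspond to edges, corners, and flat regions.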

Place recognition using Kinect sensor

Database

corridors (255 Mbyte, 5 categories)	genkiclub_f3_corridor_01, genkiclub_f4_corridor_01, w2_10f_corridor_01, w2_7f_corridor_01, w2_9f_corridor_02
kitchens (204 Mbyte, 8 categories)	genkiclub_f3_kitchen_01, genkiclub_f3_kitchen_02, w2_10f_kitchen_01, w2_10f_kitchen_09, w2_9f_kitchen_01, w2_9f_kitchen_02, w2_9f_kitchen_10, w4_6f_kitchen_01
labs (583 Mbyte, 4 categories)	hasegawa_lab, kurazume_lab, taniguchi_lab, uchida_lab
offices (95 Mbyte, 3 categories)	hasegawa_office, kurazume_office, morooka_office
studyrooms (328 Mbyte, 8 categories)	w2_2f_studyroom_01, w2_2f_studyroom_02, w2_2f_tatamiroom_01, w2_2f_tatamiroom_02, w4_2f_studyroom_01, w4_2f_studyroom_02, w4_2f_tatamiroom_01, w4_2f_tatamiroom_02
toilets (116 Mbyte, 3 categories)	w2_10f_toilet_01, w2_2f_toilet_01, w2_9f_toilet_01

Papers

Hojung Jung, Oscar Martinez Mozos, Yumi Iwashita, Ryo Kurazume, Local N-ary Patterns: a local multi-modal descriptor for place categorization, Advanced Robotics, Vol. 30, No. 6, pp.402-415, 2016, doi:10.1080/01691864.2015.1120242
Hojung Jung, Oscar Martinez Mozos, Yumi Iwashita, Ryo Kurazume, The Outdoor LiDAR Dataset for Semantic Place Labeling, The 2015 JSME/RMD International Conference on Advanced Mechatronics (ICAM2015), Tokyo, Dec. 5-8, 2015
Oscar Martinez Mozos, Hitoshi Mizutani, Hojung Jung, Ryo Kurazume, Tsutomu Hasegawa, Categorization of Indoor Places by Combining Local Binary Pattern Histograms of Range and Reflectance Data from Laser Range Finders, Advanced Robotics, Vol. 27, No. 18, pp.1455-1464, 2013
Oscar Martinez Mozos, Hitoshi Mizutani, Ryo Kurazume, Tsutomu Hasegawa, Categorization of Indoor Places Using the Kinect Sensor, Sensors, Vol. 12, No. 5, pp.6695-6711, 2012
Hojung Jung, Ryo Kurazume, Yumi Iwashita, Outdoor Scene Classification Using Laser Scanner, Proc. The Ninth Joint Workshop on Machine Perception and Robotics (MPR13), K-P-06, Kyoto, 2012.10.31-11.1 (Best Poster Session Award)

Previewed Reality - Near-future perception system -

This research develops a near-future perception system named "Previewed Reality." The system consists of an informationally structured environment (ISE), an immersive VR display, a stereo camera, an optical tracking system, and a dynamic simulator. In an ISE, numerous sensors are embedded to sense and store information about the position of furniture, objects, humans, and robots in a database. The position and orientation of the immersive VR display are also tracked by an optical tracking system. Consequently, the system can forecast the next possible events using a dynamic simulator and synthesize virtual images of what users will see in the near future from their own viewpoint. These synthesized images, overlaid on a real scene using augmented reality technology, are presented to the user. The proposed system can allow humans and robots to coexist more safely by intuitively showing possible hazardous situations to the human in advance.

Previewed Reality	Previewed Reality
Previewed Reality 1.0 and 2.0
Smart Previewed Reality

Papers

Asuka Egashira, Yuta Horikawa, Takuma Hayashi, Akihiro Kawamura, and Ryo Kurazume, Near-future perception system: Previewed Reality, Advanced Robotics, Vol., No., pp.-, 2020, DOI:
Yuta Horikawa, Asuka Egashira, Kazuto Nakashima, Akihiro Kawamura, Ryo Kurazume, Previewed Reality: Near-future perception system, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017), Vancouver, Canada, 2017.9.24-28, pp.370-375, 2017
Yuta Horikawa, Asuka Egashira, Kazuto Nakashima, Akihiro Kawamura, Ryo Kurazume, Previewed Reality: Near-future perception system, Proc. The 13th Joint Workshop on Machine Perception and Robotics (MPR17), Peking, 2017.10.16-17

Fourth person sensing / Fourth person captioning

"Fourth person sensing" and "fourth person captioning" are concepts for accurately recognizing the circumstances surrounding a user by integrating multimodal information from multiple viewpoints. Information sources are categorized by viewpoint: first-person (a wearable camera), second-person (a camera on a robot), and third-person (a camera embedded in the environment). The fourth-person viewpoint combines all of this information to recognize the current situation correctly. It is analogous to the position of the reader of a novel, who can understand everything in the story, including the emotions of the hero, the supporting characters, and the other characters. This is akin to a "god's viewpoint," and this research aims to achieve such a comprehensive perspective.

Fourth person sensing

Fourth person captioning

Papers

Kazuto Nakashima, Yumi Iwashita, Akihiro Kawamura, Ryo Kurazume, Fourth-person Captioning: Describing Daily Events by Uni-supervised and Tri-regularized Training, The 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2018), Miyazaki, 2018.10.7-10
Kazuto Nakashima, Yumi Iwashita, Yoonseok Pyo, Asamichi Takamine, Ryo Kurazume, Fourth-Person Sensing for a Service Robot, Proc. of IEEE International Conference on Sensors 2015, pp.1110-1113, 2015
Yumi Iwashita, Kazuto Nakashima, Yoonseok Pyo, Ryo Kurazume, Fourth-person sensing for pro-active services, Fifth International Conference on Emerging Security Technologies (EST-2014), pp.113-117, 2014
Kazuto Nakashima and Ryo Kurazume, Describing Daily Events in Intelligent Space via Fourth-person Perspective Images, Proc. The 14th Joint Workshop on Machine Perception and Robotics (MPR18), PS2-1, Fukuoka, 2018.10.16-17