Guangming Wang

I am a Research Associate (2023.07-) in the Construction Information Technology (CIT) Laboratory at the University of Cambridge advised by Prof. Ioannis Brilakis.

I obtained my Ph.D. degree in the Intelligent Robotics and Machine Vision (IRMV) Lab at Shanghai Jiao Tong University, advised by Prof. Hesheng Wang.

Since 2021.09, I have also been working with Prof. Masayoshi Tomizuka from UC Berkeley. From 2022.06 to 2023.06, I was a visiting researcher in the Computer Vision and Geometry Group (CVG) at ETH Zurich, advised by Prof. Marc Pollefeys.

I am an Associate Editor (AE) for ICRA 2024 and will also serve as an AE for IROS 2024. I was awarded a DAAD AI and Robotics fellowship. I was awarded the SJTU "Academic Star" Nomination Award (top 0.2% at Shanghai Jiao Tong University).

Most of the undergraduates I have co-mentored have gone on to UC Berkeley, Princeton, HKU, Columbia, Georgia Tech, UCSD, UCLA, TUM, SJTU, and elsewhere; some received full scholarships for direct-entry PhD programs. You are welcome to contact me about joining the University of Cambridge as a visiting PhD student or about remote collaboration!

Email / Google Scholar / GitHub

Research

My research focuses on developing robust robot perception, localization, and mapping methods that enable intelligent understanding of the real world while ensuring safety and efficiency. These three themes, intelligence, safety, and efficiency, make the resulting technology widely and practically applicable in areas such as autonomous mobile robots, robot manipulation, and digital twin construction. My long-term vision is for robots to fully understand and use real-world scenes in terms of geometry, semantics, concepts, and logic, enabling AI robots to collaborate closely with humans.

My research interests include robot perception, localization, mapping, planning, and construction automation based on deep learning and reinforcement learning, specifically topics such as:

  • Robot perception: depth estimation, optical/scene flow estimation, semantic segmentation, object detection/tracking, 3D point cloud learning.

  • Robot localization: visual/LiDAR odometry, 2D/3D registration, robot relocalization.

  • Robot mapping: dynamic mapping, implicit neural field-based mapping.

  • Robot planning: reinforcement learning for manipulator tasks.

  • Construction automation: building mapping, BIM generation, digital twins.


News

[2024.03.18] My doctoral dissertation "Image-Point Cloud Soft Correspondence based Robot Multi-Modal Localization in Dynamic and Complex Environment" received the Excellent Doctoral Dissertation Award from Shanghai Jiao Tong University (15 recipients university-wide each year)!

[2024.02.27] Three co-first-author or second-author papers (with master's or PhD students I co-supervise as first authors) on 3D scene flow, auto-labelling, and NeRF-SLAM are accepted by the top conference CVPR 2024!

[2024.01.23] I will serve as an Associate Editor (AE) for the top robotics conference IROS 2024!

[2023.09.11] I will serve as an Associate Editor (AE) for the top robotics conference ICRA 2024!

[2023.07.20] Joined the Construction Information Technology (CIT) Laboratory at the University of Cambridge as a Research Associate!

[2023.07.14] Three co-first-author papers (with undergraduates and junior graduate students I co-supervise as first authors) on 3D scene flow, point cloud registration, and robust estimation are accepted by the top conference ICCV 2023!

[2023.01.17] One co-first-author paper (with an undergraduate I co-supervise as first author) on self-supervised learning of depth and pose with pseudo-LiDAR point clouds is accepted by the top conference ICRA 2023!

[2022.12.08] Received the Shanghai Jiao Tong University "Academic Star" Nomination Award (top 0.2%)! News: (Shanghai Jiao Tong University, School of Electronic Information and Electrical Engineering)

[2022.11.19] One co-first-author paper (with an undergraduate I co-supervise as first author) on learning LiDAR odometry with Transformers is accepted by the top conference AAAI 2023!

[2022.10.11] One co-first-author paper on unsupervised learning of depth and pose is accepted by the top journal TCSVT (IF=5.859)!

[2022.09.26] Received the National Scholarship for Doctoral Students again (top 2%)!

[2022.09.23] One paper (with a master's student I co-supervise as first author) on pseudo-LiDAR 3D scene flow estimation is accepted by IEEE Transactions on Industrial Informatics (T-II, IF=11.648)!

[2022.09.13] One co-first-author paper on efficient 3D deep LiDAR odometry is accepted by the top journal TPAMI (IF=24.314)!

Publications / Preprints

RegFormer: An Efficient Projection-Aware Transformer Network for Large-Scale Point Cloud Registration
J. Liu*, G. Wang*, Z. Liu, C. Jiang, M. Pollefeys, H. Wang
2023 International Conference on Computer Vision (ICCV), 2023
arXiv/ Code
(* indicates equal contributions)

We propose an efficient end-to-end registration method for point clouds at the 100,000-point scale.

DELFlow: Dense Efficient Learning of Scene Flow for Large-Scale Point Clouds
C. Peng*, G. Wang*, X. W. Lo, X. Wu, C. Xu, M. Tomizuka, W. Zhan, H. Wang
2023 International Conference on Computer Vision (ICCV), 2023
arXiv/ Code
(* indicates equal contributions)

We propose an efficient and high-precision scene flow learning method for large-scale point clouds, achieving the efficiency of 2D methods and the high accuracy of 3D methods.

RLSAC: Reinforcement Learning enhanced Sample Consensus for End-to-End Robust Estimation
C. Nie*, G. Wang*, Z. Liu, L. Cavalli, M. Pollefeys, H. Wang
2023 International Conference on Computer Vision (ICCV), 2023
arXiv/ Code
(* indicates equal contributions)

We model RANSAC sample consensus as a reinforcement learning process, achieving fully end-to-end learning of sample-consensus-based robust estimation.

Learning of Long-Horizon Sparse-Reward Robotic Manipulator Tasks With Base Controllers
G. Wang*, M. Xin*, Z. Liu, and H. Wang
IEEE Transactions on Neural Networks and Learning Systems (T-NNLS), 2022 (IF=19.118)
arXiv/ IEEE Xplore/ Code
(* indicates equal contributions)

We introduce a method for learning challenging sparse-reward tasks by utilizing existing controllers. Compared to previous work on learning from demonstrations, our method improves sample efficiency by orders of magnitude and can learn online safely.

Efficient 3D Deep LiDAR Odometry
G. Wang*, X. Wu*, S. Jiang, Z. Liu, and H. Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022 (IF=24.314)
arXiv/ IEEE Xplore/ Code
(* indicates equal contributions)

We propose a new efficient 3D point cloud learning method, specially designed for the frame-by-frame processing required by real-time robot perception and localization. It accelerates the deep LiDAR odometry from our previous CVPR work to real time while improving accuracy.

What Matters for 3D Scene Flow Network
G. Wang*, Y. Hu*, Z. Liu, Y. Zhou, W. Zhan, M. Tomizuka, and H. Wang
European Conference on Computer Vision (ECCV), 2022
arXiv/ ECCV 2022/ Code
(* indicates equal contributions)

We introduce a novel flow embedding layer with an all-to-all mechanism and a reverse verification mechanism. In addition, we investigate and compare several design choices in key components of the 3D scene flow network and achieve SOTA performance.

PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization
G. Wang*, X. Wu*, Z. Liu, and H. Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
arXiv/ CVPR 2021/ Code
(* indicates equal contributions)

We introduce a novel 3D point cloud learning model for deep LiDAR odometry, named PWCLO-Net, using hierarchical embedding mask optimization. It outperforms all recent learning-based methods and the geometry-based approach, LOAM with mapping optimization, on most sequences of the KITTI odometry dataset.

FFPA-Net: Efficient Feature Fusion with Projection Awareness for 3D Object Detection
C. Jiang*, G. Wang*, J. Wu*, Y. Miao, and H. Wang
arXiv
(* indicates equal contributions)

We propose an efficient feature fusion framework with projection awareness for 3D object detection.

Interactive Multi-scale Fusion of 2D and 3D Features for Multi-object Tracking
G. Wang*, C. Peng*, J. Zhang, and H. Wang
arXiv
(* indicates equal contributions)

We propose interactive feature fusion between multi-scale features of images and point clouds. In addition, we explore the effectiveness of pre-training on each single modality and then fine-tuning the fusion-based model.

DetFlowTrack: 3D Multi-object Tracking based on Simultaneous Optimization of Object Detection and Scene Flow Estimation
Y. Shen, G. Wang, and H. Wang
arXiv

We propose a new joint learning method for 3D object detection and 3D multi-object tracking based on 3D scene flow.

Residual 3D Scene Flow Learning with Context-Aware Feature Extraction
G. Wang*, Y. Hu*, X. Wu, and H. Wang
IEEE Transactions on Instrumentation and Measurement (TIM), 2022 (IF=5.332)
arXiv/ IEEE Xplore/ Code
(* indicates equal contributions)

We propose a novel context-aware set conv layer to cope with repetitive patterns in the learning of 3D scene flow. We also propose an explicit residual flow learning structure in the residual flow refinement layer to cope with long-distance movement.

3D Hierarchical Refinement and Augmentation for Unsupervised Learning of Depth and Pose from Monocular Video
G. Wang*, J. Zhong*, S. Zhao, W. Wu, Z. Liu, and H. Wang
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2022 (IF=5.859)
arXiv/ IEEE Xplore/ Code
(* indicates equal contributions)

We propose a novel unsupervised training framework of depth and pose with 3D hierarchical refinement and augmentation using explicit 3D geometry.

FusionNet: Coarse-to-Fine Extrinsic Calibration Network of LiDAR and Camera with Hierarchical Point-pixel Fusion
G. Wang*, J. Qiu*, Y. Guo*, and H. Wang
International Conference on Robotics and Automation (ICRA), Xi'an, China, 2021.
IEEE Xplore
(* indicates equal contributions)

We propose FusionNet, an online and end-to-end solution that can automatically detect and correct the extrinsic calibration matrix between LiDAR and a monocular RGB camera without any specially designed targets or environments.

SFGAN: Unsupervised Generative Adversarial Learning of 3D Scene Flow from the 3D Scene Self
G. Wang, C. Jiang, Z. Shen, Y. Miao, and H. Wang
Advanced Intelligent Systems (AIS), 2021 (IF=7.298)
Authorea/ Wiley Online Library

We utilize generative adversarial networks (GANs) to learn 3D scene flow in a self-supervised manner without ground truth.

Unsupervised Learning of Scene Flow from Monocular Camera
G. Wang*, X. Tian*, R. Ding, and H. Wang
International Conference on Robotics and Automation (ICRA), Xi'an, China, 2021.
arXiv/ IEEE Xplore
(* indicates equal contributions)

We present a framework to realize the unsupervised learning of scene flow from a monocular camera.

Anchor-Based Spatio-Temporal Attention 3D Convolutional Networks for Dynamic 3D Point Cloud Sequences
G. Wang, H. Liu, M. Chen, Y. Yang, Z. Liu, and H. Wang
IEEE Transactions on Instrumentation and Measurement (TIM), 2021 (IF=5.332)
arXiv/ IEEE Xplore/ Code

We introduce an Anchor-based Spatial-Temporal Attention Convolution operation (ASTAConv) to process dynamic 3D point cloud sequences. It makes better use of the structured information within the local region and learns spatial-temporal embedding features from dynamic 3D point cloud sequences.

Hierarchical Attention Learning of Scene Flow in 3D Point Clouds
G. Wang*, X. Wu*, Z. Liu, and H. Wang
IEEE Transactions on Image Processing (TIP), 2021 (IF=11.041)
arXiv/ IEEE Xplore/ Code
(* indicates equal contributions)

We introduce a novel hierarchical neural network with double attention for learning the correlation of point features in adjacent frames and refining scene flow from coarse to fine, layer by layer. It has a new more-for-less hierarchical architecture. The proposed network achieves state-of-the-art performance for 3D scene flow estimation on the FlyingThings3D and KITTI Scene Flow 2015 datasets.

NccFlow: Unsupervised Learning of Optical Flow With Non-occlusion from Geometry
G. Wang*, S. Ren*, and H. Wang
IEEE Transactions on Intelligent Transportation Systems (T-ITS), 2022 (IF=9.551)
arXiv/ IEEE Xplore/ Code
(* indicates equal contributions)

We introduce a novel unsupervised learning method of optical flow by considering the constraints in non-occlusion regions with geometry analysis.

Motion Projection Consistency Based 3D Human Pose Estimation with Virtual Bones from Monocular Videos
G. Wang*, H. Zeng*, Z. Wang, Z. Liu, and H. Wang
IEEE Transactions on Cognitive and Developmental Systems (T-CDS), 2022 (IF=4.546)
arXiv/ IEEE Xplore
(* indicates equal contributions)

We introduce a novel unsupervised learning method of the 3D human pose by considering the loop constraints from real/virtual bones and the joint motion constraints in consecutive frames.

Spherical Interpolated Convolutional Network with Distance-Feature Density for 3D Semantic Segmentation of Point Clouds
G. Wang, Y. Yang, Z. Liu, and H. Wang
IEEE Transactions on Cybernetics (T-Cyb), 2021 (IF=19.118)
arXiv/ IEEE Xplore

We introduce a spherical interpolated convolution operator to replace the traditional grid-shaped 3D convolution operator. It improves accuracy and reduces the number of network parameters.

Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion From 3D Geometry
G. Wang, C. Zhang, H. Wang, J. Wang, Y. Wang, and X. Wang
IEEE Transactions on Intelligent Transportation Systems (T-ITS), 2020 (IF=9.551)
arXiv/ IEEE Xplore/ News: (DeepBlue Technology (深兰科技), The First International Forum on 3D Optical Sensing and Applications (iFOSA 2020), Computer Vision Life (计算机视觉life))/ Video/ Code

We propose a method to explicitly handle occlusion, introducing a less-than-mean mask, maximum normalization, and consistency of depth-pose and optical flow in occluded regions.

Unsupervised Learning of Monocular Depth and Ego-Motion Using Multiple Masks
G. Wang, H. Wang, Y. Liu, and W. Chen
International Conference on Robotics and Automation (ICRA), Montreal, Canada, 2019
arXiv/ IEEE Xplore / News: (PaoPao Robot (泡泡机器人), SJTU Graduate Education (上海交大研究生教育))/ Video/ Code

We propose a new unsupervised learning method for depth and ego-motion using multiple masks to handle the occlusion problem.

Academic Services

Associate Editor (AE) for the Conference Editorial Board (CEB) of the IEEE Robotics and Automation Society for ICRA

IEEE Member

IEEE Robotics and Automation Society Member

IEEE Young Professionals Member

Reviewer for journals: IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), International Journal of Computer Vision (IJCV), IEEE Transactions on Image Processing (T-IP), IEEE Transactions on Intelligent Transportation Systems (T-ITS), IEEE Transactions on Cybernetics (T-Cyb), IEEE Transactions on Systems, Man, and Cybernetics: Systems (T-SMC), IEEE Transactions on Neural Networks and Learning Systems (T-NNLS), IEEE/ASME Transactions on Mechatronics (T-Mech), IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), IEEE Transactions on Medical Imaging (T-MI), IEEE Transactions on Automation Science and Engineering (T-ASE), IEEE Transactions on Industrial Informatics (T-II), IEEE Transactions on Industrial Electronics (T-IE), IEEE Transactions on Intelligent Vehicles (T-IV), IEEE Transactions on Vehicular Technology (T-VT), IEEE Transactions on Broadcasting (T-BC), IEEE Transactions on Cognitive and Developmental Systems (T-CDS), IEEE Transactions on Artificial Intelligence (T-AI), IEEE Transactions on Emerging Topics in Computational Intelligence (T-ETCI), IEEE Transactions on Medical Robotics and Bionics (T-MRB), IEEE Robotics and Automation Letters (RAL), Pattern Recognition (PR), Machine Intelligence Research (MIR).

Reviewer for conferences: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision (ICCV), European Conference on Computer Vision (ECCV), AAAI Conference on Artificial Intelligence (AAAI), International Conference on Robotics and Automation (ICRA), International Conference on Intelligent Robots and Systems (IROS), European Conference on Mobile Robots (ECMR).

Talks & Posters

[2022.10] Conference Poster on “What Matters for 3D Scene Flow Network” at ECCV 2022 in Tel Aviv, Israel, October 23-27, 2022.

[2022.08] Zhidx Open Talk on “3D Scene Flow and LiDAR Point Cloud Odometry in Autonomous Driving”, Online.

[2021.06] Conference Talk on “FusionNet: Coarse-to-Fine Extrinsic Calibration Network of LiDAR and Camera with Hierarchical Point-pixel Fusion” at ICRA 2022, May 23-27, 2022 (online talk; the conference was held in Philadelphia, PA, USA).

[2021.06] Conference Poster on “PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization” at CVPR 2021 (virtual), June 19-25, 2021.

[2021.05] Conference Talk on “Unsupervised Learning of Scene Flow from Monocular Camera” at ICRA 2021 in Xi’an, China, May 30 - June 5, 2021.

[2020.10] Invited Talk on “Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion From 3D Geometry” at the First International Forum on 3D Optical Sensing and Applications (iFOSA 2020), held in Beijing, China, October 17-18, 2020.

[2019.07] Seminar Talk on “Unsupervised Learning of Monocular Depth and Ego-Motion Using Multiple Masks” at the Sino-European Engineering Education Platform (SEEEP) Doctoral Summer School at Instituto Superior Técnico (IST), Lisbon, Portugal, July 22-25, 2019.

[2019.05] Conference Poster on “Unsupervised Learning of Monocular Depth and Ego-Motion Using Multiple Masks” at ICRA 2019 in Montreal, Canada, May 20-24, 2019.

Teaching

[2022.10] Teaching Assistant (gave a frontier-topic lecture on robot perception and localization), Lectures on Frontier Academics at China University of Mining and Technology, Autumn 2022.

[2023.03-06] Teaching Assistant (project supervisor for master's students), 3D Vision at ETH Zurich, Spring 2023.

[2023.10] Lecture on academic career planning for Shanghai Jiao Tong University's "Academic Navigation" training program for MPhil and PhD students, Autumn 2023.

Mentoring

Past Masters:
Yueling Shen:    Shanghai Jiao Tong University (SJTU), now working at PlusAI, Inc.
Chaokang Jiang:    CUMT and SJTU, now working at PhiGent Robotics.
Huiying Deng:    CUMT and SJTU, now working at China Mobile Communications Group Co., Ltd.

Past Undergrad Interns:
Minjian Xin:    SJTU, gone to the University of California, San Diego (UCSD) to pursue a master's.
Xinrui Wu:    SJTU, gone to our lab to pursue a master's.
Yehui Yang:    SJTU, gone to SJTU to pursue a master's.
Ruiqi Ding:    SJTU, gone to SJTU to pursue a master's.
Hanwen Liu:    SJTU, gone to Technische Universität München (TUM) to pursue a master's.
Xiaoyu Tian:    SJTU, gone to Columbia University to pursue a master's.
Chi Zhang:    SJTU, gone to SJTU to pursue a master's.
Huixin Zhang:    SJTU, gone to our lab to pursue a master's.
Zehang Shen:    SJTU, gone to our lab to pursue a master's.
Zhiheng Feng:    SJTU, gone to our lab to pursue a master's.
Shuaiqi Ren:    SJTU, gone to SJTU to pursue a master's.
Zike Cheng:    SJTU, gone to SJTU to pursue a master's.
Jiquan Zhong:    SJTU, gone to Xiamen University (XMU) to pursue a master's.
Yanfeng Guo:    SJTU, gone to the University of California, Los Angeles (UCLA) to pursue a master's.
Yunzhe Hu:    SJTU, gone to The University of Hong Kong (HKU) to directly pursue a PhD with a full scholarship.
Ziliang Wang:    SJTU, gone to SJTU to pursue a PhD.
Honghao Zeng:    SJTU, now working at Shanghai Baosight Software Co., Ltd.
Muyao Chen:    SJTU, now working at ByteDance.
Haolin Song:    SJTU, gone to SJTU to pursue a master's.
Jiuming Liu:    HIT, gone to our lab to pursue a master's.
Shuyang Jiang:    SJTU, gone to Fudan University to pursue a PhD.
Wenlong Yi:    SJTU, gone to the University of California, Los Angeles (UCLA) to pursue a master's.
Chensheng Peng:    SJTU, gone to UC Berkeley to directly pursue a PhD with a full scholarship.
Wenhua Wu:    SJTU, gone to our lab to pursue a PhD.
Yu Zheng:    SJTU, gone to our lab to directly pursue a joint PhD at ETH Zurich and SJTU.
Jiahao Qiu:    SJTU, gone to Princeton University to directly pursue a PhD with a full scholarship.
Shijie Zhao:    SJTU, gone to the Georgia Institute of Technology (Georgia Tech) to pursue a master's.

About Me

In my free time, I like reading books on psychology and literature. I like travelling. I also enjoy sports and meditation.


Last update: 2023.07. Thanks to Jon Barron's and Bo Yang's websites for the template.