Jiahao Zhang

Ph.D. Candidate

The Australian National University

Bio

He is currently a final-year Ph.D. student in the Research School of Computing, The Australian National University. On the one hand, he is interested in many deep learning topics, particularly video understanding and generation. On the other hand, he is an active full-stack web developer. He is currently doing a research project supervised by Professor Stephen Gould, Dr. Anoop Cherian, Dr. Yizhak Ben-Shabat, and Dr. Cristian Rodriguez. Before that, in 2021, he received his bachelor’s degree in Advanced Computing (honours) and Computer Science and Technology from The Australian National University and Shandong University respectively.

Interests

Video Understanding
Video Generation
Web Development

Education

Ph.D. of Computer Science, 2022 - Present
The Australian National University
Bachelor of Advanced Computing (Honours), 2019 - 2021
The Australian National University
Bachelor of Computer Science and Technology, 2017 - 2019
Shandong University

Selected Publications

Jiahao Zhang, Anoop Cherian, Cristian Rodriguez, Weijian Deng, Stephen Gould

June, 2025 In ICCV 2025

Manual-PA: Learning 3D Part Assembly from Instruction Diagrams

We introduce Manual-PA, a transformer-based framework that leverages diagrammatic assembly manuals to guide both the selection and 6D pose estimation of furniture parts, enabling efficient and realistic 3D assembly by aligning parts with instructional illustrations.

Jiahao Zhang, Frederic Zhang, Cristian Rodriguez, Yizhak Ben-Shabat, Anoop Cherian, Stephen Gould

September, 2024 In WACV 2025

Temporally Grounding Instructional Diagrams in Unconstrained Videos

We introduce a new approach to simultaneously localize a sequence of instructional diagrams in videos by modeling their mutual relationships and temporal order, rather than handling each step independently.

Jiahao Zhang, Anoop Cherian, Yanbin Liu, Yizhak Ben-Shabat, Cristian Rodriguez, Stephen Gould

February, 2023 In CVPR 2023

Aligning Step-by-Step Instructional Diagrams to Video Demonstrations

We introduce a new framework and dataset (IAW) for aligning assembly diagrams from instruction manuals with real-world video segments, enabling cross-modal retrieval and step-level alignment between illustrated instructions and assembly actions in videos.

Publications

Quickly discover relevant content by filtering publications.

Jiahao Zhang, Anoop Cherian, Cristian Rodriguez, Weijian Deng, Stephen Gould (2025). Manual-PA: Learning 3D Part Assembly from Instruction Diagrams. In ICCV 2025.

PDF Cite Code Dataset ArXiv

Weijian Deng, Dylan Campbell, Chunyi Sun, Jiahao Zhang, Shubham Kanitkar, Matthew E. Shaffer, Stephen Gould (2025). Pos3R: 6D Pose Estimation for Unseen Objects Made Easy. In CVPR 2025.

PDF Cite Project Poster Slides Video

Jiahao Zhang, Frederic Zhang, Cristian Rodriguez, Yizhak Ben-Shabat, Anoop Cherian, Stephen Gould (2024). Temporally Grounding Instructional Diagrams in Unconstrained Videos. In WACV 2025.

PDF Cite Code Dataset Poster Slides DOI ArXiv

Jiahao Zhang, Anoop Cherian, Yanbin Liu, Yizhak Ben-Shabat, Cristian Rodriguez, Stephen Gould (2023). Aligning Step-by-Step Instructional Diagrams to Video Demonstrations. In CVPR 2023.

PDF Cite Code Dataset Poster Slides Video DOI ArXiv Supplementary

Zheyu Zhuang, Yizhak Ben-Shabat, Jiahao Zhang, Stephen Gould, Robert Mahony (2022). GoferBot: A Visual Guided Human-Robot Collaborative Assembly System. In IROS 2022.

PDF Cite Video DOI ArXiv

Projects

ArXiv Toolkit

A Chrome/Edge extension to help you quickly scan through the flood of daily ArXiv papers.

Manual-PA

Official implementation of Manual-PA: Learning 3D Part Assembly from Instruction Diagrams.

TDGV

Official implementation of WACV 2025 Temporal Instructional Diagram Grounding in Unconstrained Videos.

Assembly Video Manual Alignment

Official implementation of CVPR 2023 Aligning Step-by-Step Instructional Diagrams to Video Demonstrations.

Homepage for Ikea Assembly In the Wild (IAW) dataset

A frontend project to introduce the IAW dataset.

Pytorch Lightning Template

A template for simple deep learning projects using Lightning.

SDU Research Management System

SDU Research CMS.

CSMetrics

Institutional Publication Metrics for Computer Science.

Influence Flower

Influence flower visualizes citation influences among academic entities, including papers, authors, institutions, and research topics.

SlimeVerse

Official website of SlimeVerse.

Vidat

An in-browser video annotation tool developed by ANU CVML.

Image Caption Generator

An encoder(Resnet152)-decoder(LSTM) implementation of image caption model.

Plugin Manager

A plugin Manager based on Cpp Micro Service.

Weihai Construction Consulting System

An enterprise-level business projects(CMS for construction companies).

School Booking System

A booking system designed for school.

Virtual Judge

Virtual Judge for ACM practice.

Experience

Research Intern

Mitsubishi Electric Research Laboratories (MERL)

October 2025 – February 2026 Boston, USA

Research Intern

Roblox

July 2025 – October 2025 San Mateo, USA

Teaching Assistant

The Australian National University

July 2024 – November 2024 Canberra, Australia

Tutor for COMP8536 - Advanced Topics in Deep Learning for Computer Vision.

Teaching Assistant

The Australian National University

February 2024 – May 2024 Canberra, Australia

Tutor for COMP4528/COMP6528 - Computer Vision.

Teaching Assistant

The Australian National University

February 2023 – June 2023 Canberra, Australia

Tutor for COMP2420/COMP6420 - Introduction to Data Management, Analysis and Security.

Teaching Assistant

The Australian National University

July 2022 – November 2022 Canberra, Australia

Tutor for COMP3670/COMP6670 - Introduction to Machine Learning and COMP4650 - Document Analysis.

Research Assistant

The Australian National University

May 2022 – February 2023 Canberra, Australia

Web maintainer for InfluenceMap and CSMetrics.

Intern Software Development Engineer

Inspur

November 2019 – January 2020 Jinan, China

I was an intern at the Inspur, which is a Server and Cloud company in China. I helped to build a plugin management system based on Cpp Micro Service.

Developer of School Booking System

Shandong University, Weihai, School of Mechanical, Electrical & Information Engineering

May 2019 – July 2019 Weihai, China

I was an independent developer of the Booking System designed to simplify the appointment process. The system is in-use till now.

Major Developer of Weihai Construction Consulting System

Shandong University, Weihai, School of Mechanical, Electrical & Information Engineering

March 2019 – July 2019 Weihai, China

I was a major developer(one of six) of the Weihai Construction Consulting System, which is an enterprise-level business projects.

Major Maintainer for VJ

Shandong University, Weihai, School of Mechanical, Electrical & Information Engineering

February 2018 – July 2019 Weihai, China

I was the major maintainer(one of three) of the VJ(Virtual Judge for ACM practice) system developed by a senior student. The system has more than 10K submissions since first published in 2017.

Awards

Postgraduate Research Scholarship

ANU Mar 2022

Chancellor’s Letter of Commendation

ANU May 2021

First Scholarship of Studying Abroad

SDU Oct 2020

First Scholarship of Abroad Special

SDU Jul 2019

Third Scholarship of University

SDU Jul 2019

Second Scholarship of Research and Innovation

SDU Jul 2018

Third Scholarship of University

SDU Jul 2018

LanQiao Programming Competition Province-Level Second Prize

LanQiao Jun 2018