Publications

This page is automatically generated from the White Rose database using name-string queries. It contains known inaccuracies; please contact the authors directly to confirm any details.

T. Shao, Y. Yang, Y. Weng, Q. Hou, and K. Zhou, H-CNN: Spatial Hashing Based CNN for 3D Shape Analysis, IEEE Transactions on Visualization and Computer Graphics, 2018.

We present a novel spatial hashing based data structure to facilitate 3D shape analysis using convolutional neural networks (CNNs). Our method builds hierarchical hash tables for an input model under different resolutions that leverage the sparse occupancy of 3D shape boundary. Based on this data structure, we design two efficient GPU algorithms namely hash2col and col2hash so that the CNN operations like convolution and pooling can be efficiently parallelized. The perfect spatial hashing is employed as our spatial hashing scheme, which is not only free of hash collision but also nearly minimal so that our data structure is almost of the same size as the raw input. Compared with existing 3D CNN methods, our data structure significantly reduces the memory footprint during the CNN training. As the input geometry features are more compactly packed, CNN operations also run faster with our data structure. The experiment shows that, under the same network structure, our method yields comparable or better benchmark results compared with the state-of-the-art while it has only one-third memory consumption when under high resolutions (i.e. 256³).

@article{wrro140897,
month = {December},
title = {H-CNN: Spatial Hashing Based CNN for 3D Shape Analysis},
author = {T Shao and Y Yang and Y Weng and Q Hou and K Zhou},
publisher = {Institute of Electrical and Electronics Engineers},
year = {2018},
note = {{\copyright} 2018 IEEE. This is an author produced version of a paper published in IEEE Transactions on Visualization and Computer Graphics. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Uploaded in accordance with the publisher's self-archiving policy.},
journal = {IEEE Transactions on Visualization and Computer Graphics},
keywords = {perfect hashing , convolutional neural network , shape classification , shape retrieval , shape segmentation},
url = {http://eprints.whiterose.ac.uk/140897/},
abstract = {We present a novel spatial hashing based data structure to facilitate 3D shape analysis using convolutional neural networks (CNNs). Our method builds hierarchical hash tables for an input model under different resolutions that leverage the sparse occupancy of 3D shape boundary. Based on this data structure, we design two efficient GPU algorithms namely hash2col and col2hash so that the CNN operations like convolution and pooling can be efficiently parallelized. The perfect spatial hashing is employed as our spatial hashing scheme, which is not only free of hash collision but also nearly minimal so that our data structure is almost of the same size as the raw input. Compared with existing 3D CNN methods, our data structure significantly reduces the memory footprint during the CNN training. As the input geometry features are more compactly packed, CNN operations also run faster with our data structure. The experiment shows that, under the same network structure, our method yields comparable or better benchmark results compared with the state-of-the-art while it has only one-third memory consumption when under high resolutions (i.e. 256$^{3}$).}
}
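
To make the hash2col gather described in the abstract above concrete, here is a minimal Python sketch. It replaces the paper's perfect spatial hashing with an ordinary dictionary over occupied boundary voxels (collision-free but not memory-minimal) and gathers 3×3×3 neighbour features im2col-style; voxel coordinates, channel counts and function names are illustrative assumptions.

import numpy as np

def build_spatial_hash(occupied_voxels, features):
    """Map integer voxel coordinates -> row index into a dense feature matrix.

    Stand-in for the paper's perfect spatial hashing: a plain dict, so it is
    collision-free by construction but not memory-minimal.
    """
    return {tuple(v): i for i, v in enumerate(occupied_voxels)}, np.asarray(features)

def hash2col(voxel_hash, features, kernel=3):
    """Gather neighbour features for every occupied voxel (im2col-style).

    Empty neighbours contribute zeros, mimicking sparse 3D convolution input.
    """
    r = kernel // 2
    offsets = [(dx, dy, dz)
               for dx in range(-r, r + 1)
               for dy in range(-r, r + 1)
               for dz in range(-r, r + 1)]
    n, c = len(voxel_hash), features.shape[1]
    cols = np.zeros((n, len(offsets) * c), dtype=features.dtype)
    for (x, y, z), row in voxel_hash.items():
        for k, (dx, dy, dz) in enumerate(offsets):
            j = voxel_hash.get((x + dx, y + dy, z + dz))
            if j is not None:
                cols[row, k * c:(k + 1) * c] = features[j]
    return cols

# Toy usage: three occupied boundary voxels with 2-channel features.
voxels = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
h, f = build_spatial_hash(voxels, feats)
print(hash2col(h, f).shape)   # (3, 54): 27 neighbours x 2 channels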

R. Luo, T. Shao, H. Wang, W. Xu, X. Chen, K. Zhou, and Y. Yang, NNWarp: Neural Network-based Nonlinear Deformation, IEEE Transactions on Visualization and Computer Graphics, 2018.

NNWarp is a highly re-usable and efficient neural network (NN) based nonlinear deformable simulation framework. Unlike other machine learning applications such as image recognition, where different inputs have a uniform and consistent format (e.g. an array of all the pixels in an image), the input for deformable simulation is quite variable, high-dimensional, and parametrization-unfriendly. Consequently, even though the neural network is known for its rich expressivity of nonlinear functions, directly using an NN to reconstruct the force-displacement relation for general deformable simulation is nearly impossible. NNWarp obviates this difficulty by partially restoring the force-displacement relation via warping the nodal displacement simulated using a simplistic constitutive model – the linear elasticity. In other words, NNWarp yields an incremental displacement fix per mesh node based on a simplified (therefore incorrect) simulation result rather than synthesizing the unknown displacement directly. We introduce a compact yet effective feature vector including geodesic, potential and digression to sort training pairs of per-node linear and nonlinear displacement. NNWarp is robust under different model shapes and tessellations. With the assistance of deformation substructuring, one NN training is able to handle a wide range of 3D models of various geometries. Thanks to the linear elasticity and its constant system matrix, the underlying simulator only needs to perform one pre-factorized matrix solve at each time step, which allows NNWarp to simulate large models in real time.

@article{wrro140899,
month = {November},
title = {NNWarp: Neural Network-based Nonlinear Deformation},
author = {R Luo and T Shao and H Wang and W Xu and X Chen and K Zhou and Y Yang},
publisher = {Institute of Electrical and Electronics Engineers},
year = {2018},
note = {{\copyright} 2018 IEEE. This is an author produced version of a paper published in IEEE Transactions on Visualization and Computer Graphics. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Uploaded in accordance with the publisher's self-archiving policy.},
journal = {IEEE Transactions on Visualization and Computer Graphics},
keywords = {neural network , machine learning , data-driven animation , nonlinear regression , deformable model , physics-based simulation},
url = {http://eprints.whiterose.ac.uk/140899/},
abstract = {NNWarp is a highly re-usable and efficient neural network (NN) based nonlinear deformable simulation framework. Unlike other machine learning applications such as image recognition, where different inputs have a uniform and consistent format (e.g. an array of all the pixels in an image), the input for deformable simulation is quite variable, high-dimensional, and parametrization-unfriendly. Consequently, even though the neural network is known for its rich expressivity of nonlinear functions, directly using an NN to reconstruct the force-displacement relation for general deformable simulation is nearly impossible. NNWarp obviates this difficulty by partially restoring the force-displacement relation via warping the nodal displacement simulated using a simplistic constitutive model - the linear elasticity. In other words, NNWarp yields an incremental displacement fix per mesh node based on a simplified (therefore incorrect) simulation result rather than synthesizing the unknown displacement directly. We introduce a compact yet effective feature vector including geodesic, potential and digression to sort training pairs of per-node linear and nonlinear displacement. NNWarp is robust under different model shapes and tessellations. With the assistance of deformation substructuring, one NN training is able to handle a wide range of 3D models of various geometries. Thanks to the linear elasticity and its constant system matrix, the underlying simulator only needs to perform one pre-factorized matrix solve at each time step, which allows NNWarp to simulate large models in real time.}
}
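
The core idea of the abstract above, a network that maps per-node features to an incremental displacement fix on top of a linear-elastic solve, can be sketched in a few lines of numpy. Layer sizes, feature ordering and the (untrained) random weights below are illustrative assumptions, not the paper's architecture.

import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights for a small fully connected network."""
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n)) for m, n in zip(sizes[:-1], sizes[1:])]

def warp_correction(per_node_features, params):
    """Forward pass: per-node feature vector -> incremental displacement fix (3D)."""
    h = per_node_features
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)        # ReLU on hidden layers
    return h

# Illustrative feature: [geodesic, potential, digression, linear dx, dy, dz] per node.
features = rng.normal(size=(1000, 6))          # 1000 mesh nodes
params = init_mlp([6, 32, 32, 3])              # assumed layer sizes
u_linear = features[:, 3:]                     # displacement from the linear solve
u_nonlinear = u_linear + warp_correction(features, params)
print(u_nonlinear.shape)                       # (1000, 3)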

J. Geng, T. Shao, Y. Zheng, Y. Weng, and K. Zhou, Warp-Guided GANs for Single-Photo Facial Animation, ACM Transactions on Graphics, vol. 37, iss. 6, 2018.

This paper introduces a novel method for realtime portrait animation in a single photo. Our method requires only a single portrait photo and a set of facial landmarks derived from a driving source (e.g., a photo or a video sequence), and generates an animated image with rich facial details. The core of our method is a warp-guided generative model that instantly fuses various fine facial details (e.g., creases and wrinkles), which are necessary to generate a high-fidelity facial expression, onto a pre-warped image. Our method factorizes out the nonlinear geometric transformations exhibited in facial expressions by lightweight 2D warps and leaves the appearance detail synthesis to conditional generative neural networks for high-fidelity facial animation generation. We show such a factorization of geometric transformation and appearance synthesis largely helps the network better learn the high nonlinearity of the facial expression functions and also facilitates the design of the network architecture. Through extensive experiments on various portrait photos from the Internet, we show the significant efficacy of our method compared with prior arts.

@article{wrro138578,
volume = {37},
number = {6},
month = {November},
author = {J Geng and T Shao and Y Zheng and Y Weng and K Zhou},
note = {{\copyright} 2018 Copyright held by the owner/author(s). Publication rights licensed to ACM. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM Transactions on Graphics, https://doi.org/10.1145/3272127.3275043.},
title = {Warp-Guided GANs for Single-Photo Facial Animation},
publisher = {Association for Computing Machinery},
year = {2018},
journal = {ACM Transactions on Graphics},
url = {http://eprints.whiterose.ac.uk/138578/},
abstract = {This paper introduces a novel method for realtime portrait animation in a single photo. Our method requires only a single portrait photo and a set of facial landmarks derived from a driving source (e.g., a photo or a video sequence), and generates an animated image with rich facial details. The core of our method is a warp-guided generative model that instantly fuses various fine facial details (e.g., creases and wrinkles), which are necessary to generate a high-fidelity facial expression, onto a pre-warped image. Our method factorizes out the nonlinear geometric transformations exhibited in facial expressions by lightweight 2D warps and leaves the appearance detail synthesis to conditional generative neural networks for high-fidelity facial animation generation. We show such a factorization of geometric transformation and appearance synthesis largely helps the network better learn the high nonlinearity of the facial expression functions and also facilitates the design of the network architecture. Through extensive experiments on various portrait photos from the Internet, we show the significant efficacy of our method compared with prior arts.}
}
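
The geometric half of the pipeline above can be approximated very coarsely by a single global affine warp estimated from landmark correspondences; the paper uses richer lightweight 2D warps plus a conditional GAN for detail synthesis, neither of which is reproduced here. The numpy/scipy sketch below fits the affine by least squares and resamples the portrait so its landmarks move toward the driving landmarks (all variable names and sizes are illustrative).

import numpy as np
from scipy.ndimage import map_coordinates

def fit_affine(src_pts, dst_pts):
    """Least-squares 2D affine mapping src landmarks -> dst landmarks.

    Points are (row, col). Returns A (2x2) and t (2,) with dst ~ A @ src + t.
    """
    src = np.hstack([src_pts, np.ones((len(src_pts), 1))])    # (n, 3)
    M, *_ = np.linalg.lstsq(src, dst_pts, rcond=None)         # (3, 2)
    return M[:2].T, M[2]

def warp_image(image, A, t):
    """Resample image so that input point p appears at A @ p + t in the output."""
    A_inv = np.linalg.inv(A)
    rows, cols = np.indices(image.shape[:2])
    out_coords = np.stack([rows.ravel(), cols.ravel()], axis=0).astype(float)  # (2, N)
    in_coords = A_inv @ (out_coords - t[:, None])             # output -> input lookup
    warped = map_coordinates(image, in_coords, order=1, mode='nearest')
    return warped.reshape(image.shape[:2])

# Toy usage with a random single-channel 'portrait' and hypothetical landmarks.
rng = np.random.default_rng(0)
portrait = rng.random((128, 128))
source_lm = rng.uniform(20, 108, (68, 2))                      # landmarks in the photo
driving_lm = source_lm + rng.normal(0, 2.0, source_lm.shape)   # landmarks from the driver
A, t = fit_affine(source_lm, driving_lm)
print(warp_image(portrait, A, t).shape)                        # (128, 128)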

Y. Zhang, S. Garcia, W. Xu, T. Shao, and Y. Yang, Efficient voxelization using projected optimal scanline, Graphical Models, vol. 100, p. 61–70, 2018.

In the paper, we propose an efficient algorithm for the surface voxelization of 3D geometrically complex models. Unlike recent techniques relying on triangle-voxel intersection tests, our algorithm exploits the conventional parallel-scanline strategy. Observing that there does not exist an optimal scanline interval in general 3D cases if one wants to use parallel voxelized scanlines to cover the interior of a triangle, we subdivide a triangle into multiple axis-aligned slices and carry out the scanning within each polygonal slice. The theoretical optimal scanline interval can be obtained to maximize the efficiency of the algorithm without missing any voxels on the triangle. Once the collection of scanlines are determined and voxelized, we obtain the surface voxelization. We fine tune the algorithm so that it only involves a few operations of integer additions and comparisons for each voxel generated. Finally, we comprehensively compare our method with the state-of-the-art method in terms of theoretical complexity, runtime performance and the quality of the voxelization on both CPU and GPU of a regular desktop PC, as well as on a mobile device. The results show that our method outperforms the existing method, especially when the resolution of the voxelization is high.

@article{wrro134272,
volume = {100},
month = {November},
author = {Y Zhang and S Garcia and W Xu and T Shao and Y Yang},
note = {{\copyright} 2017 Elsevier Inc. All rights reserved. This is an author produced version of a paper published in Graphical Models. Uploaded in accordance with the publisher's self-archiving policy},
title = {Efficient voxelization using projected optimal scanline},
publisher = {Elsevier},
journal = {Graphical Models},
pages = {61--70},
year = {2018},
keywords = {3D voxelization; Scanline; Integer arithmetic; Bresenham's algorithm},
url = {http://eprints.whiterose.ac.uk/134272/},
abstract = {In the paper, we propose an efficient algorithm for the surface voxelization of 3D geometrically complex models. Unlike recent techniques relying on triangle-voxel intersection tests, our algorithm exploits the conventional parallel-scanline strategy. Observing that there does not exist an optimal scanline interval in general 3D cases if one wants to use parallel voxelized scanlines to cover the interior of a triangle, we subdivide a triangle into multiple axis-aligned slices and carry out the scanning within each polygonal slice. The theoretical optimal scanline interval can be obtained to maximize the efficiency of the algorithm without missing any voxels on the triangle. Once the collection of scanlines are determined and voxelized, we obtain the surface voxelization. We fine tune the algorithm so that it only involves a few operations of integer additions and comparisons for each voxel generated. Finally, we comprehensively compare our method with the state-of-the-art method in terms of theoretical complexity, runtime performance and the quality of the voxelization on both CPU and GPU of a regular desktop PC, as well as on a mobile device. The results show that our method outperforms the existing method, especially when the resolution of the voxelization is high.}
}
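
For contrast with the optimal-scanline method above, a deliberately naive surface voxelizer is easy to write: densely sample each triangle in barycentric coordinates at sub-voxel spacing and record the voxels hit. The sketch below does exactly that (sampling step and resolution are illustrative); the paper's contribution is to replace this kind of oversampling with integer-arithmetic scanlines at a provably optimal interval.

import numpy as np

def voxelize_triangle(v0, v1, v2, voxel_size):
    """Naive surface voxelization of one triangle by dense barycentric sampling.

    Returns the set of integer voxel indices touched. Oversamples at half the
    voxel size so that voxels crossed by the triangle are not missed in practice.
    """
    v0, v1, v2 = (np.asarray(v, dtype=float) for v in (v0, v1, v2))
    step = voxel_size * 0.5
    longest = max(np.linalg.norm(v1 - v0), np.linalg.norm(v2 - v0), np.linalg.norm(v2 - v1))
    n = max(2, int(np.ceil(longest / step)) + 1)
    voxels = set()
    for i in range(n + 1):
        for j in range(n + 1 - i):
            a, b = i / n, j / n
            p = v0 + a * (v1 - v0) + b * (v2 - v0)     # barycentric sample on the triangle
            voxels.add(tuple(np.floor(p / voxel_size).astype(int)))
    return voxels

# Toy usage: one triangle voxelized at resolution 0.1.
tri = ([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(len(voxelize_triangle(*tri, voxel_size=0.1)))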

M. Lin, T. Shao, Y. Zheng, Z. Ren, Y. Weng, and Y. Yang, Automatic Mechanism Modeling from a Single Image with CNNs, Computer Graphics Forum, vol. 37, iss. 7, p. 337–348, 2018.

This paper presents a novel system that enables a fully automatic modeling of both 3D geometry and functionality of a mechanism assembly from a single RGB image. The resulting 3D mechanism model highly resembles the one in the input image with the geometry, mechanical attributes, connectivity, and functionality of all the mechanical parts prescribed in a physically valid way. This challenging task is realized by combining various deep convolutional neural networks to provide high-quality and automatic part detection, segmentation, camera pose estimation and mechanical attributes retrieval for each individual part component. On top of this, we use a local/global optimization algorithm to establish geometric interdependencies among all the parts while retaining their desired spatial arrangement. We use an interaction graph to abstract the inter-part connection in the resulting mechanism system. If an isolated component is identified in the graph, our system enumerates all the possible solutions to restore the graph connectivity, and outputs the one with the smallest residual error. We have extensively tested our system with a wide range of classic mechanism photos, and experimental results show that the proposed system is able to build high-quality 3D mechanism models without user guidance.

@article{wrro138539,
volume = {37},
number = {7},
month = {October},
author = {M Lin and T Shao and Y Zheng and Z Ren and Y Weng and Y Yang},
note = {{\copyright} 2018 The Author(s) Computer Graphics Forum {\copyright} 2018 The Eurographics Association and John Wiley \& Sons Ltd. Published by John Wiley \& Sons Ltd. This is the peer reviewed version of the following article: Automatic Mechanism Modeling from a Single Image with CNNs, which has been published in final form at https://doi.org/10.1111/cgf.13572. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions.},
title = {Automatic Mechanism Modeling from a Single Image with CNNs},
publisher = {Wiley},
year = {2018},
journal = {Computer Graphics Forum},
pages = {337--348},
keywords = {CCS Concepts; Computing methodologies {$\rightarrow$} Image processing; Shape modeling; Neural networks},
url = {http://eprints.whiterose.ac.uk/138539/},
abstract = {This paper presents a novel system that enables a fully automatic modeling of both 3D geometry and functionality of a mechanism assembly from a single RGB image. The resulting 3D mechanism model highly resembles the one in the input image with the geometry, mechanical attributes, connectivity, and functionality of all the mechanical parts prescribed in a physically valid way. This challenging task is realized by combining various deep convolutional neural networks to provide high-quality and automatic part detection, segmentation, camera pose estimation and mechanical attributes retrieval for each individual part component. On top of this, we use a local/global optimization algorithm to establish geometric interdependencies among all the parts while retaining their desired spatial arrangement. We use an interaction graph to abstract the inter-part connection in the resulting mechanism system. If an isolated component is identified in the graph, our system enumerates all the possible solutions to restore the graph connectivity, and outputs the one with the smallest residual error. We have extensively tested our system with a wide range of classic mechanism photos, and experimental results show that the proposed system is able to build high-quality 3D mechanism models without user guidance.}
}
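
The connectivity-repair step described above lends itself to a compact illustration: treat parts as nodes of an interaction graph, detect isolated components, and add the candidate connection with the smallest residual. The sketch below uses plain Python and a made-up residual function; it is not the paper's local/global optimizer.

def connected_components(nodes, edges):
    """Connected components of an undirected interaction graph."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for n in nodes:
        if n in seen:
            continue
        comp, stack = set(), [n]
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def repair_connectivity(nodes, edges, residual):
    """Greedily add the lowest-residual edge between components until connected.

    residual(a, b) stands in for the geometric/mechanical cost of connecting
    parts a and b (e.g. the gap between their fitted primitives).
    """
    edges = list(edges)
    while len(connected_components(nodes, edges)) > 1:
        comps = connected_components(nodes, edges)
        candidates = [(residual(a, b), a, b)
                      for a in comps[0] for c in comps[1:] for b in c]
        _, a, b = min(candidates)
        edges.append((a, b))
    return edges

# Toy usage: the part 'handle' is isolated; connect it where the placeholder cost is lowest.
parts = ['gear1', 'gear2', 'axle', 'handle']
links = [('gear1', 'gear2'), ('gear2', 'axle')]
fake_residual = lambda a, b: abs(len(a) - len(b))    # placeholder cost, not a real metric
print(repair_connectivity(parts, links, fake_residual))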

X. Chen, Y. Li, X. Luo, T. Shao, J. Yu, K. Zhou, and Y. Zheng, AutoSweep: Recovering 3D Editable Objects from a Single Photograph, IEEE Transactions on Visualization and Computer Graphics, 2018.

This paper presents a fully automatic framework for extracting editable 3D objects directly from a single photograph. Unlike previous methods which recover either depth maps, point clouds, or mesh surfaces, we aim to recover 3D objects with semantic parts that can be directly edited. We base our work on the assumption that most human-made objects are constituted by parts and these parts can be well represented by generalized primitives. Our work makes an attempt towards recovering two types of primitive-shaped objects, namely, generalized cuboids and generalized cylinders. To this end, we build up a novel instance-aware segmentation network for accurate part separation. Our GeoNet outputs a set of smooth part-level masks labeled as profiles and bodies. Then in a key stage, we simultaneously identify profile-body relations and recover 3D parts by sweeping the recognized profiles along their body contours and jointly optimizing the geometry to align with the recovered masks. Qualitative and quantitative experiments show that our algorithm can recover high-quality 3D models and outperforms existing methods in both instance segmentation and 3D reconstruction.

@article{wrro138568,
month = {September},
title = {AutoSweep: Recovering 3D Editable Objects from a Single Photograph},
author = {X Chen and Y Li and X Luo and T Shao and J Yu and K Zhou and Y Zheng},
publisher = {Institute of Electrical and Electronics Engineers},
year = {2018},
note = { {\copyright} 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.},
journal = {IEEE Transactions on Visualization and Computer Graphics},
keywords = {Three-dimensional displays; Solid modeling; Image segmentation; Shape; Trajectory; Semantics; Geometry; Editable objects; Instance-aware segmentation; Sweep surfaces},
url = {http://eprints.whiterose.ac.uk/138568/},
abstract = {This paper presents a fully automatic framework for extracting editable 3D objects directly from a single photograph. Unlike previous methods which recover either depth maps, point clouds, or mesh surfaces, we aim to recover 3D objects with semantic parts that can be directly edited. We base our work on the assumption that most human-made objects are constituted by parts and these parts can be well represented by generalized primitives. Our work makes an attempt towards recovering two types of primitive-shaped objects, namely, generalized cuboids and generalized cylinders. To this end, we build up a novel instance-aware segmentation network for accurate part separation. Our GeoNet outputs a set of smooth part-level masks labeled as profiles and bodies. Then in a key stage, we simultaneously identify profile-body relations and recover 3D parts by sweeping the recognized profiles along their body contours and jointly optimizing the geometry to align with the recovered masks. Qualitative and quantitative experiments show that our algorithm can recover high-quality 3D models and outperforms existing methods in both instance segmentation and 3D reconstruction.}
}
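
The final sweep step can be illustrated in isolation: place a circular profile at samples along a body axis and collect the resulting surface points, i.e. a plain generalized-cylinder sweep. The frames, radius and axis below are naive illustrative choices, not the paper's jointly optimized reconstruction.

import numpy as np

def sweep_circle(axis_points, radius, segments=16):
    """Sweep a circular profile along a 3D polyline; returns (n*segments, 3) points."""
    axis_points = np.asarray(axis_points, dtype=float)
    tangents = np.gradient(axis_points, axis=0)          # finite-difference tangents
    tangents /= np.linalg.norm(tangents, axis=1, keepdims=True)
    surface = []
    for center, t in zip(axis_points, tangents):
        # Build an orthonormal frame around the tangent (naive, may twist along the axis).
        helper = np.array([0.0, 0.0, 1.0]) if abs(t[2]) < 0.9 else np.array([1.0, 0.0, 0.0])
        u = np.cross(t, helper)
        u /= np.linalg.norm(u)
        v = np.cross(t, u)
        for k in range(segments):
            ang = 2.0 * np.pi * k / segments
            surface.append(center + radius * (np.cos(ang) * u + np.sin(ang) * v))
    return np.asarray(surface)

# Toy usage: a gently curved body contour swept with radius 0.2.
axis = [[x, 0.1 * x * x, 0.0] for x in np.linspace(0.0, 2.0, 20)]
print(sweep_circle(axis, radius=0.2).shape)     # (320, 3) = 20 axis samples x 16 segments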

M. Lin, T. Shao, Y. Zheng, N. Mitra, and K. Zhou, Recovering Functional Mechanical Assemblies from Raw Scans, IEEE Transactions on Visualization and Computer Graphics, vol. 24, iss. 3, p. 1354–1367, 2018.

This paper presents a method to reconstruct a functional mechanical assembly from raw scans. Given multiple input scans of a mechanical assembly, our method first extracts the functional mechanical parts using a motion-guided, patch-based hierarchical registration and labeling algorithm. The extracted functional parts are then parameterized from the segments and their internal mechanical relations are encoded by a graph. We use a joint optimization to solve for the best geometry, placement, and orientation of each part, to obtain a final workable mechanical assembly. We demonstrated our algorithm on various types of mechanical assemblies with diverse settings and validated our output using physical fabrication.

@article{wrro134214,
volume = {24},
number = {3},
month = {March},
author = {M Lin and T Shao and Y Zheng and NJ Mitra and K Zhou},
note = {{\copyright} 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.},
title = {Recovering Functional Mechanical Assemblies from Raw Scans},
publisher = {IEEE},
year = {2018},
journal = {IEEE Transactions on Visualization and Computer Graphics},
pages = {1354--1367},
keywords = {3D scanning; mechanical assembly; functionality; mechanical constraints; motion},
url = {http://eprints.whiterose.ac.uk/134214/},
abstract = {This paper presents a method to reconstruct a functional mechanical assembly from raw scans. Given multiple input scans of a mechanical assembly, our method first extracts the functional mechanical parts using a motion-guided, patch-based hierarchical registration and labeling algorithm. The extracted functional parts are then parameterized from the segments and their internal mechanical relations are encoded by a graph. We use a joint optimization to solve for the best geometry, placement, and orientation of each part, to obtain a final workable mechanical assembly. We demonstrated our algorithm on various types of mechanical assemblies with diverse settings and validated our output using physical fabrication.}
}
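
The graph of internal mechanical relations mentioned above can be made concrete with a small kinematic example: once parts are parameterized (here, gear radii) and the relation graph is known, quantities propagate along its edges, e.g. omega_b = -omega_a * r_a / r_b for meshing gears. The parts, radii and driver speed below are made-up values, not data from the paper.

def propagate_gear_speeds(radii, meshes, driver, driver_omega):
    """Propagate angular velocities through a graph of meshing gears.

    radii:  {part: radius}, meshes: list of (a, b) meshing pairs,
    driver: the part whose speed is known. Meshing gears counter-rotate,
    so omega_b = -omega_a * r_a / r_b.
    """
    adj = {p: [] for p in radii}
    for a, b in meshes:
        adj[a].append(b)
        adj[b].append(a)
    omega = {driver: driver_omega}
    stack = [driver]
    while stack:
        a = stack.pop()
        for b in adj[a]:
            if b not in omega:
                omega[b] = -omega[a] * radii[a] / radii[b]
                stack.append(b)
    return omega

# Toy gear train: a 2 cm driver, a 4 cm idler and a 1 cm output gear.
radii = {'driver': 2.0, 'idler': 4.0, 'output': 1.0}
meshes = [('driver', 'idler'), ('idler', 'output')]
print(propagate_gear_speeds(radii, meshes, 'driver', driver_omega=10.0))
# {'driver': 10.0, 'idler': -5.0, 'output': 20.0}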

D. Li, T. Shao, H. Wu, and K. Zhou, Shape Completion from a Single RGBD Image, IEEE Transactions on Visualization and Computer Graphics, vol. 23, iss. 7, p. 1809–1822, 2017.

We present a novel approach for constructing a complete 3D model for an object from a single RGBD image. Given an image of an object segmented from the background, a collection of 3D models of the same category are non-rigidly aligned with the input depth, to compute a rough initial result. A volumetric-patch-based optimization algorithm is then performed to refine the initial result to generate a 3D model that not only is globally consistent with the overall shape expected from the input image but also possesses geometric details similar to those in the input image. The optimization with a set of high-level constraints, such as visibility, surface confidence and symmetry, can achieve more robust and accurate completion over state-of-the-art techniques. We demonstrate the efficiency and robustness of our approach with multiple categories of objects with various geometries and details, including busts, chairs, bikes, toys, vases and tables.

@article{wrro134259,
volume = {23},
number = {7},
month = {July},
author = {D Li and T Shao and H Wu and K Zhou},
note = {{\copyright} 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.},
title = {Shape Completion from a Single RGBD Image},
publisher = {IEEE},
year = {2017},
journal = {IEEE Transactions on Visualization and Computer Graphics},
pages = {1809--1822},
keywords = {RGBD camera; shape completion; single RGBD image},
url = {http://eprints.whiterose.ac.uk/134259/},
abstract = {We present a novel approach for constructing a complete 3D model for an object from a single RGBD image. Given an image of an object segmented from the background, a collection of 3D models of the same category are non-rigidly aligned with the input depth, to compute a rough initial result. A volumetric-patch-based optimization algorithm is then performed to refine the initial result to generate a 3D model that not only is globally consistent with the overall shape expected from the input image but also possesses geometric details similar to those in the input image. The optimization with a set of high-level constraints, such as visibility, surface confidence and symmetry, can achieve more robust and accurate completion over state-of-the-art techniques. We demonstrate the efficiency and robustness of our approach with multiple categories of objects with various geometries and details, including busts, chairs, bikes, toys, vases and tables.}
}
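
Of the high-level constraints listed above, symmetry is the easiest to illustrate on its own: reflect the observed points across a symmetry plane to hallucinate geometry in occluded regions. The sketch below only performs that reflection for a given plane; plane estimation, visibility and surface-confidence terms from the paper are not reproduced.

import numpy as np

def reflect_points(points, plane_normal, plane_point):
    """Mirror a point cloud across the plane through plane_point with the given normal."""
    pts = np.asarray(points, dtype=float)
    n = np.asarray(plane_normal, dtype=float)
    n /= np.linalg.norm(n)
    d = (pts - np.asarray(plane_point, dtype=float)) @ n    # signed distance to the plane
    return pts - 2.0 * d[:, None] * n                        # reflected copies

# Toy usage: complete the unseen half of a roughly left-right symmetric scan.
rng = np.random.default_rng(0)
observed = rng.normal(size=(500, 3))
observed = observed[observed[:, 0] > 0]                      # pretend only x > 0 was seen
mirrored = reflect_points(observed, plane_normal=[1, 0, 0], plane_point=[0, 0, 0])
completed = np.vstack([observed, mirrored])
print(observed.shape, completed.shape)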

T. Shao, D. Li, Y. Rong, C. Zheng, and K. Zhou, Dynamic Furniture Modeling Through Assembly Instructions, ACM Transactions on Graphics, vol. 35, iss. 6, 2016.

We present a technique for parsing widely used furniture assembly instructions, and reconstructing the 3D models of furniture components and their dynamic assembly process. Our technique takes as input a multi-step assembly instruction in a vector graphic format and starts to group the vector graphic primitives into semantic elements representing individual furniture parts, mechanical connectors (e.g., screws, bolts and hinges), arrows, visual highlights, and numbers. To reconstruct the dynamic assembly process depicted over multiple steps, our system identifies previously built 3D furniture components when parsing a new step, and uses them to address the challenge of occlusions while generating new 3D components incrementally. With a wide range of examples covering a variety of furniture types, we demonstrate the use of our system to animate the 3D furniture assembly process and, beyond that, semantic-aware furniture editing as well as the fabrication of personalized furniture.

@article{wrro134260,
volume = {35},
number = {6},
month = {November},
author = {T Shao and D Li and Y Rong and C Zheng and K Zhou},
note = {{\copyright} ACM, 2016. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Graphics VOL 35, ISS 6, November 2016: http://dx.doi.org/10.1145/2980179.2982416},
title = {Dynamic Furniture Modeling Through Assembly Instructions},
publisher = {Association for Computing Machinery},
year = {2016},
journal = {ACM Transactions on Graphics},
keywords = {Assembly instructions; furniture modeling; supervised learning; personalized fabrication},
url = {http://eprints.whiterose.ac.uk/134260/},
abstract = {We present a technique for parsing widely used furniture assembly instructions, and reconstructing the 3D models of furniture components and their dynamic assembly process. Our technique takes as input a multi-step assembly instruction in a vector graphic format and starts to group the vector graphic primitives into semantic elements representing individual furniture parts, mechanical connectors (e.g., screws, bolts and hinges), arrows, visual highlights, and numbers. To reconstruct the dynamic assembly process depicted over multiple steps, our system identifies previously built 3D furniture components when parsing a new step, and uses them to address the challenge of occlusions while generating new 3D components incrementally. With a wide range of examples covering a variety of furniture types, we demonstrate the use of our system to animate the 3D furniture assembly process and, beyond that, semantic-aware furniture editing as well as the fabrication of personalized furniture.}
}
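
The first stage above, grouping vector-graphic primitives into semantic elements, can be caricatured with a union-find over line segments whose endpoints nearly touch. The tolerance and toy strokes below are illustrative; the paper's grouping is learned and far richer.

import numpy as np

def group_primitives(segments, tol=1.0):
    """Group 2D line segments whose endpoints (nearly) touch, via union-find.

    segments: list of ((x1, y1), (x2, y2)). Returns a group id per segment.
    A crude stand-in for grouping vector-graphic primitives into parts.
    """
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    endpoints = [np.asarray(s, dtype=float) for s in segments]   # each is (2, 2)
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            gap = min(np.linalg.norm(p - q) for p in endpoints[i] for q in endpoints[j])
            if gap <= tol:
                union(i, j)
    return [find(i) for i in range(len(segments))]

# Toy usage: two touching strokes form one group, a distant stroke another.
strokes = [((0, 0), (10, 0)), ((10, 0), (10, 10)), ((50, 50), (60, 50))]
print(group_primitives(strokes))    # [1, 1, 2]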

C. Cao, H. Wu, Y. Weng, T. Shao, and K. Zhou, Real-time Facial Animation with Image-based Dynamic Avatars, ACM Transactions on Graphics, vol. 35, iss. 4, 2016.

We present a novel image-based representation for dynamic 3D avatars, which allows effective handling of various hairstyles and headwear, and can generate expressive facial animations with fine-scale details in real-time. We develop algorithms for creating an image-based avatar from a set of sparsely captured images of a user, using an off-the-shelf web camera at home. An optimization method is proposed to construct a topologically consistent morphable model that approximates the dynamic hair geometry in the captured images. We also design a real-time algorithm for synthesizing novel views of an image-based avatar, so that the avatar follows the facial motions of an arbitrary actor. Compelling results from our pipeline are demonstrated on a variety of cases.

@article{wrro134265,
volume = {35},
number = {4},
month = {July},
author = {C Cao and H Wu and Y Weng and T Shao and K Zhou},
note = {{\copyright} ACM, 2016. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Graphics, VOL 35, ISS 4, July 2016. http://doi.acm.org/10.1145/2897824.2925873.},
title = {Real-time Facial Animation with Image-based Dynamic Avatars},
publisher = {Association for Computing Machinery},
year = {2016},
journal = {ACM Transactions on Graphics},
keywords = {facial animation; face tracking; virtual avatar; image-based rendering; hair modeling},
url = {http://eprints.whiterose.ac.uk/134265/},
abstract = {We present a novel image-based representation for dynamic 3D avatars, which allows effective handling of various hairstyles and headwear, and can generate expressive facial animations with fine-scale details in real-time. We develop algorithms for creating an image-based avatar from a set of sparsely captured images of a user, using an off-the-shelf web camera at home. An optimization method is proposed to construct a topologically consistent morphable model that approximates the dynamic hair geometry in the captured images. We also design a real-time algorithm for synthesizing novel views of an image-based avatar, so that the avatar follows the facial motions of an arbitrary actor. Compelling results from our pipeline are demonstrated on a variety of cases.}
}
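
The morphable-model component above relies on the standard linear blend, which is what keeps evaluation real-time: a shape is the mean geometry plus a weighted sum of basis deformations. The sketch below shows only that blend with made-up dimensions; the image-based rendering and view synthesis of the paper are not covered.

import numpy as np

def blend_morphable_model(mean_shape, basis, weights):
    """Linear morphable model: shape = mean + sum_i w_i * basis_i.

    mean_shape: (V, 3) vertices, basis: (K, V, 3) deformation modes,
    weights: (K,) coefficients tracked per video frame.
    """
    return mean_shape + np.tensordot(weights, basis, axes=1)

# Toy usage with assumed sizes: 2000 vertices, 10 blend modes.
rng = np.random.default_rng(0)
mean = rng.normal(size=(2000, 3))
modes = rng.normal(scale=0.01, size=(10, 2000, 3))
w = rng.uniform(-1, 1, size=10)
print(blend_morphable_model(mean, modes, w).shape)   # (2000, 3)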

M. Chai, T. Shao, H. Wu, Y. Weng, and K. Zhou, AutoHair: Fully Automatic Hair Modeling from A Single Image, ACM Transactions on Graphics, vol. 35, iss. 4, 2016.

We introduce AutoHair, the first fully automatic method for 3D hair modeling from a single portrait image, with no user interaction or parameter tuning. Our method efficiently generates complete and high-quality hair geometries, which are comparable to those generated by the state-of-the-art methods, where user interaction is required. The core components of our method are: a novel hierarchical deep neural network for automatic hair segmentation and hair growth direction estimation, trained over an annotated hair image database; and an efficient and automatic data-driven hair matching and modeling algorithm, based on a large set of 3D hair exemplars. We demonstrate the efficacy and robustness of our method on Internet photos, resulting in a database of around 50K 3D hair models and a corresponding hairstyle space that covers a wide variety of real-world hairstyles. We also show novel applications enabled by our method, including 3D hairstyle space navigation and hair-aware image retrieval.

@article{wrro134268,
volume = {35},
number = {4},
month = {July},
author = {M Chai and T Shao and H Wu and Y Weng and K Zhou},
note = {{\copyright} ACM, 2016. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Graphics, VOL 35, ISS 4, July 2016. http://doi.acm.org/10.1145/2897824.2925961.},
title = {AutoHair: Fully Automatic Hair Modeling from A Single Image},
publisher = {Association for Computing Machinery},
year = {2016},
journal = {ACM Transactions on Graphics},
keywords = {hair modeling; image segmentation; data-driven modeling; deep neural network},
url = {http://eprints.whiterose.ac.uk/134268/},
abstract = {We introduce AutoHair, the first fully automatic method for 3D hair modeling from a single portrait image, with no user interaction or parameter tuning. Our method efficiently generates complete and high-quality hair geometries, which are comparable to those generated by the state-of-the-art methods, where user interaction is required. The core components of our method are: a novel hierarchical deep neural network for automatic hair segmentation and hair growth direction estimation, trained over an annotated hair image database; and an efficient and automatic data-driven hair matching and modeling algorithm, based on a large set of 3D hair exemplars. We demonstrate the efficacy and robustness of our method on Internet photos, resulting in a database of around 50K 3D hair models and a corresponding hairstyle space that covers a wide variety of real-world hairstyles. We also show novel applications enabled by our method, including 3D hairstyle space navigation and hair-aware image retrieval.}
}
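
The data-driven matching stage above can be reduced to a toy retrieval loop: compare the segmented hair mask of the input against rendered masks of 3D hair exemplars and keep the best-scoring ones. The sketch below ranks exemplars by mask IoU over random placeholder masks; the paper additionally uses growth-direction estimates and further refinement.

import numpy as np

def mask_iou(a, b):
    """Intersection-over-union of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def rank_exemplars(query_mask, exemplar_masks, top_k=3):
    """Return indices of the exemplars whose masks best overlap the query."""
    scores = [mask_iou(query_mask, m) for m in exemplar_masks]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]

# Toy usage: a random query mask against a small fake exemplar database.
rng = np.random.default_rng(0)
query = rng.random((64, 64)) > 0.5
database = [rng.random((64, 64)) > 0.5 for _ in range(20)]
print(rank_exemplars(query, database))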

Y. Rong, Y. Zheng, T. Shao, Y. Yang, and K. Zhou, An Interactive Approach for Functional Prototype Recovery from a Single RGBD Image, Computational Visual Media, vol. 2, iss. 1, p. 87–96, 2016.

Inferring the functionality of an object from a single RGBD image is difficult for two reasons: lack of semantic information about the object, and missing data due to occlusion. In this paper, we present an interactive framework to recover a 3D functional prototype from a single RGBD image. Instead of precisely reconstructing the object geometry for the prototype, we mainly focus on recovering the object's functionality along with its geometry. Our system allows users to scribble on the image to create initial rough proxies for the parts. After user annotation of high-level relations between parts, our system automatically jointly optimizes detailed joint parameters (axis and position) and part geometry parameters (size, orientation, and position). Such prototype recovery enables a better understanding of the underlying image geometry and allows for further physically plausible manipulation. We demonstrate our framework on various indoor objects with simple or hybrid functions.

@article{wrro134217,
volume = {2},
number = {1},
month = {March},
author = {Y Rong and Y Zheng and T Shao and Y Yang and K Zhou},
note = {{\copyright} The Author(s) 2016. The articles published in this journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits
unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.},
title = {An Interactive Approach for Functional Prototype Recovery from a Single RGBD Image},
publisher = {Springer},
year = {2016},
journal = {Computational Visual Media},
pages = {87--96},
keywords = {functionality; cuboid proxy; prototype; part relations; shape analysis},
url = {http://eprints.whiterose.ac.uk/134217/},
abstract = {Inferring the functionality of an object from a single RGBD image is difficult for two reasons: lack of semantic information about the object, and missing data due to occlusion. In this paper, we present an interactive framework to recover a 3D functional prototype from a single RGBD image. Instead of precisely reconstructing the object geometry for the prototype, we mainly focus on recovering the object's functionality along with its geometry. Our system allows users to scribble on the image to create initial rough proxies for the parts. After user annotation of high-level relations between parts, our system automatically jointly optimizes detailed joint parameters (axis and position) and part geometry parameters (size, orientation, and position). Such prototype recovery enables a better understanding of the underlying image geometry and allows for further physically plausible manipulation. We demonstrate our framework on various indoor objects with simple or hybrid functions.}
}
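
The joint parameters mentioned above (axis and position) have a simple special case worth illustrating: for a revolute joint, two observed orientations of the moving part determine the hinge axis as the axis of their relative rotation. The sketch below extracts that axis with plain numpy; the rotation-about-z example poses are hypothetical, and the paper's joint optimization over axis, position and part geometry is not reproduced.

import numpy as np

def rotation_about_z(angle):
    """3x3 rotation matrix about the z axis (used only to fabricate example poses)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def hinge_axis(R_a, R_b):
    """Axis of the relative rotation between two observed part orientations.

    For a revolute joint, successive poses of the moving part differ by a
    rotation about the hinge axis; extract it from the skew part of R_rel.
    """
    R_rel = R_b @ R_a.T
    v = np.array([R_rel[2, 1] - R_rel[1, 2],
                  R_rel[0, 2] - R_rel[2, 0],
                  R_rel[1, 0] - R_rel[0, 1]])
    return v / np.linalg.norm(v)

# Toy usage: a lid rotated by 30 degrees about z between two annotated frames.
R0 = rotation_about_z(0.0)
R1 = rotation_about_z(np.deg2rad(30.0))
print(hinge_axis(R0, R1))     # approximately [0, 0, 1]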

T. Shao, A. Monszpart, Y. Zheng, B. Koo, W. Xu, K. Zhou, and N. Mitra, Imagining the unseen: stability-based cuboid arrangements for scene understanding, ACM Transactions on Graphics, vol. 33, iss. 6, 2014.

Missing data due to occlusion is a key challenge in 3D acquisition, particularly in cluttered man-made scenes. Such partial information about the scenes limits our ability to analyze and understand them. In this work we abstract such environments as collections of cuboids and hallucinate geometry in the occluded regions by globally analyzing the physical stability of the resultant arrangements of the cuboids. Our algorithm extrapolates the cuboids into the unseen regions to infer both their corresponding geometric attributes (e.g., size, orientation) and how the cuboids topologically interact with each other (e.g., touch or fixed). The resultant arrangement provides an abstraction for the underlying structure of the scene that can then be used for a range of common geometry processing tasks. We evaluate our algorithm on a large number of test scenes with varying complexity, validate the results on existing benchmark datasets, and demonstrate the use of the recovered cuboid-based structures towards object retrieval, scene completion, etc.

@article{wrro134270,
volume = {33},
number = {6},
month = {November},
author = {T Shao and A Monszpart and Y Zheng and B Koo and W Xu and K Zhou and NJ Mitra},
note = {{\copyright} 2014, Association for Computing Machinery, Inc. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM Transactions on Graphics, https://doi.org/10.1145/2661229.2661288. Uploaded in accordance with the publisher's self-archiving policy.},
title = {Imagining the unseen: stability-based cuboid arrangements for scene understanding},
publisher = {Association for Computing Machinery},
year = {2014},
journal = {ACM Transactions on Graphics},
keywords = {box world; proxy arrangements; physical stability; shape analysis},
url = {http://eprints.whiterose.ac.uk/134270/},
abstract = {Missing data due to occlusion is a key challenge in 3D acquisition, particularly in cluttered man-made scenes. Such partial information about the scenes limits our ability to analyze and understand them. In this work we abstract such environments as collections of cuboids and hallucinate geometry in the occluded regions by globally analyzing the physical stability of the resultant arrangements of the cuboids. Our algorithm extrapolates the cuboids into the unseen regions to infer both their corresponding geometric attributes (e.g., size, orientation) and how the cuboids topologically interact with each other (e.g., touch or fixed). The resultant arrangement provides an abstraction for the underlying structure of the scene that can then be used for a range of common geometry processing tasks. We evaluate our algorithm on a large number of test scenes with varying complexity, validate the results on existing benchmark datasets, and demonstrate the use of the recovered cuboid-based structures towards object retrieval, scene completion, etc.}
}
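
The stability reasoning above has a compact special case: in an axis-aligned stack, the arrangement is stable only if, for every cuboid, the combined centre of mass of everything it carries projects inside its footprint. The sketch below checks exactly that; the paper handles general arrangements, contact types and fixed joints, which this does not.

import numpy as np

def stack_is_stable(cuboids):
    """Check a bottom-to-top axis-aligned stack of cuboids for static stability.

    Each cuboid is (center_xyz, size_xyz, mass). Cuboid i is judged stable if the
    combined centre of mass of cuboids i+1..n projects inside cuboid i's footprint.
    """
    for i in range(len(cuboids) - 1):
        above = cuboids[i + 1:]
        total_mass = sum(m for _, _, m in above)
        com = sum(np.asarray(c, dtype=float) * m for c, _, m in above) / total_mass
        center, size, _ = cuboids[i]
        half = np.asarray(size, dtype=float)[:2] / 2.0
        if np.any(np.abs(com[:2] - np.asarray(center, dtype=float)[:2]) > half):
            return False
    return True

# Toy usage: a box whose load hangs too far over the edge is flagged as unstable.
stable_stack = [([0, 0, 0.5], [1, 1, 1], 4.0), ([0.2, 0, 1.5], [1, 1, 1], 1.0)]
toppling_stack = [([0, 0, 0.5], [1, 1, 1], 4.0), ([0.9, 0, 1.5], [1, 1, 1], 1.0)]
print(stack_is_stable(stable_stack), stack_is_stable(toppling_stack))   # True False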