Parsing-Conditioned Anime Translation

Z. Li; Y. Xu; N. Zhao; Y. Zhou; Y. Liu; D. Lin; S. He

doi:10.1145/3585002

Parsing-Conditioned Anime Translation: A New Dataset and Method

Authors	Z. Li; Y. Xu; N. Zhao; Y. Zhou; Y. Liu; D. Lin; S. He
Publication date	06-2023
Journal	ACM Transactions on Graphics
Article number	30
Volume | Issue number	42 | 3
Number of pages	14
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	Anime is an abstract art form that is substantially different from the human portrait, leading to a challenging misaligned image translation problem that is beyond the capability of existing methods. This can be boiled down to a highly ambiguous unconstrained translation between two domains. To this end, we design a new anime translation framework by deriving the prior knowledge of a pre-trained StyleGAN model. We introduce disentangled encoders to separately embed structure and appearance information into the same latent code, governed by four tailored losses. Moreover, we develop a FaceBank aggregation method that leverages the generated data of the StyleGAN, anchoring the prediction to produce in-domain anime images. To empower our model and promote the research of anime translation, we propose the first anime portrait parsing dataset, Danbooru-Parsing, containing 4,921 densely labeled images across 17 classes. This dataset connects the face semantics with appearances, enabling our new constrained translation setting. We further show the editability of our results, and extend our method to manga images by generating the first manga parsing pseudo data. Extensive experiments demonstrate the value of our new dataset and method, resulting in the first feasible solution to anime translation.
Document type	Article
Note	With Supplementary Material
Language	English
Published at	https://doi.org/10.1145/3585002 (Final published version)