CISI-net: Explicit Latent Content Inference and Imitated Style Rendering for Image Inpainting
Convolutional neural networks (CNNs) have presented their potential in filling large missing areas with plausible contents. To address the blurriness issue commonly existing in the CNN-based inpainting, a typical approach is to conduct texture refinement on the initially completed images by replacing the neural patch in the predicted region using the closest one in the known region. However, such a processing might introduce undesired content change in the predicted region, especially when the desired content does not exist in the known region. To avoid generating such incorrect content, in this paper, we propose a content inference and style imitation network (CISI-net), which explicitly separate the image data into content code and style code. The content inference is realized by performing inference in the latent space to infer the content code of the corrupted images similar to the one from the original images. It can produce more detailed content than a similar inference procedure in the pixel domain, due to the dimensional distribution of content being lower than that of the entire image. On the other hand, the style code is used to represent the rendering of content, which will be consistent over the entire image. The style code is then integrated with the inferred content code to generate the complete image. Experiments on multiple datasets including structural and natural images demonstrate that our proposed approach out-performs the existing ones in terms of content accuracy as well as texture details.