Learning continuity for image and video recognition

J. Zhao

Authors	J. Zhao
Supervisors	C.G.M. Snoek
Cosupervisors	P.S.M. Mettes
Award date	14-04-2022
Number of pages	148
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	This thesis studies learning continuity for visual recognition. Continuity is a natural property of images and videos and is important for many computer vision tasks. The thesis strives to answer the research question "What is the benefit of continuity for image and video recognition?" To that end, it has two parts, which look into the spatial continuity of images and the spatio-temporal continuity of videos, respectively. Part I addresses learning continuity for image recognition. In Chapter 2, we explore spatial continuity for image colorization. Chapter 3 presents a new pooling method that better maintains spatial continuity. Part II addresses learning continuity for video recognition. The goal of Chapter 4 is to utilize temporal continuity for action detection. Chapter 5 aims to endow a 3D-Convnet with spatio-temporal continuity. In Chapter 6, we propose TubeR, a simple solution for spatio-temporal video action detection. To summarize, in depth, each part starts with the benefit of learning continuity for images or videos and then digs into technical innovations for exploiting continuity in various network architectures. In breadth, the thesis explores spatial continuity for images and spatio-temporal continuity for videos, covering image colorization, image classification, semantic segmentation, video action detection, video action recognition, and video object segmentation. We hope our journey stimulates more research on image and video continuity.
Document type	PhD thesis
Language	English
Downloads	Thesis

UvA-DARE

Digital Academic Repository