:: ICCAS 2021 ::

TC6		Deep Learning and Machine Vision Applications
Time : October 14 (Thu) 14:50-16:20 Room : Room 6 (Online 2F Byang)		Chair : Prof. Sungho Kim (Yeungnam University, Korea)



14:50-15:05	TC6-1

HybridNet: Indoor Segmentation with Range and Voxel Fusion
Sangsik Yeom(Seoul National University of Science and Technology, Korea), Jongeun Ha(Institution, Korea)

In this paper, we propose HybridNet, which improves segmentation performance by fusing 2D and 3D features. It combines a voxel-based method and a projection-based method to produce results from a single scan. The approach consists of two parallel networks that extract features in their respective dimensions, and it merges those features in a fusion network. In the fusion network, the voxel blocks and 2D feature maps extracted by each branch are fused into the voxel grid and then trained through convolution. To train the 2D network effectively, we use data augmentation based on rotation transformations of the coordinate system.
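
The two geometric operations the abstract mentions, rotating the coordinate system for augmentation and assigning points to a voxel grid, can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names `rotate_z` and `voxel_index`, the choice of the vertical (z) axis, and the cubic voxel size are hypothetical and are not taken from the paper.

```python
import math

def rotate_z(points, angle_rad):
    """Rotate (x, y, z) points about the z axis (assumed vertical) by angle_rad."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def voxel_index(point, voxel_size):
    """Return the integer grid index of the cubic voxel that contains the point."""
    return tuple(math.floor(v / voxel_size) for v in point)

# Augment a tiny scan with a 90-degree rotation, then voxelize it.
scan = [(1.0, 0.0, 0.2), (0.25, -0.1, 1.9)]
rotated = rotate_z(scan, math.pi / 2)
voxels = [voxel_index(p, 0.5) for p in rotated]
```

Rotating before voxelization means the same scene lands in different voxels on each pass, which is one plausible reason such an augmentation could help training.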



15:05-15:20	TC6-2

Visual Surveillance Transformer
Keoung Hun Choi, Jong Eun Ha(Seoul National University of Science and Technology, Korea)

In unmanned surveillance systems, the detection result for the same object can differ depending on the state of the object and the configuration of the surrounding environment. Artificial intelligence for unmanned surveillance therefore needs to understand the environment in the image, the state of each object within the image, and the relationship between them. To this end, this study presents a modified transformer structure that takes a single 2D image as input, rather than splitting the image into fixed-size pieces, and the effect between neighboring pixels is considered



15:20-15:35	TC6-3

Segmentation Applying TAG Type Label Data and Transformer
Keoung Hun Choi, Jong-Eun Ha(Seoul National University of Science and Technology, Korea)

In this paper, a modified transformer structure is applied to improve segmentation performance, and label data in a format different from existing label data is proposed. Because a single image is used as input, no location information is lost, and the model is lighter because it produces the segmentation image without a separate process. In addition, to improve generalization performance, the label data were composed by assigning one label to each characteristic rather than one label to each object, and the resulting difference in generalization ability was compared.



15:35-15:50	TC6-4

Scene Text Recognition with Multi-decoders
Wang Yao(Seoul National University of Science and Technology, Korea), Jong-Eun Ha(Seoul National University of Science and Technology, Korea)

In this article, we focus on scene text recognition, one of the challenging subfields of computer vision because scene text appears in unpredictable ways. Recently, scene text recognition has reached state-of-the-art performance thanks to advances in deep learning. On the decoder side, connectionist temporal classification (CTC), the attention mechanism, and the transformer (self-attention) are the three main approaches in recent research. A novel decoder mechanism is introduced in our study.
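
For readers unfamiliar with the first of the three decoder families named above, the following is a minimal sketch of standard greedy (best-path) CTC decoding. It is not the novel decoder of the paper, which the abstract does not describe. The function name `ctc_greedy_decode` and the use of index 0 as the blank symbol are assumptions made for illustration.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a per-frame label sequence into an output sequence.

    Consecutive repeats are merged first, then blank symbols are dropped,
    so a blank between two identical labels keeps both of them.
    """
    decoded = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded

# Per-frame argmax labels for an eight-frame input (0 is the assumed blank).
result = ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0])
```

In this example the two runs of label 1 are separated by a blank, so both survive the collapse, while the repeated 2 is merged into one.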



15:50-16:05	TC6-5

Hyperspectral Imaging Labelling Tool using Preprocessing and Edge Detection
Sangho Jo, Sungho Kim(Yeungnam University, Korea)

Hyperspectral imaging data contains hundreds of values for each pixel. The data in each pixel is independent of the other pixels, so ground truth for deep learning training takes a segmentation format. A class must therefore be assigned to every pixel, which requires fine-grained labelling. We saw the need for a labelling tool suited to hyperspectral data because current segmentation labelling tools are designed for humans to specify areas directly. For this reason, in this paper we considered the light-reflection relationship between light sources and objects and created a hyperspectral labelling tool.
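
The abstract does not say how edges are detected, so the following is only a hypothetical sketch of one common way to find boundaries in per-pixel spectra: thresholding the spectral angle between horizontally adjacent pixels. The spectral angle ignores uniform brightness scaling, which loosely matches the abstract's concern with illumination, but this measure, the names `spectral_angle` and `horizontal_edges`, and the threshold are all assumptions rather than the authors' method.

```python
import math

def spectral_angle(a, b):
    """Angle in radians between two spectra; 0 means the same spectral shape."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)

def horizontal_edges(cube, threshold):
    """Mark each pixel whose right-hand neighbor has a different spectrum.

    cube[row][col] is a list of band values for one pixel.
    """
    return [
        [spectral_angle(row[c], row[c + 1]) > threshold for c in range(len(row) - 1)]
        for row in cube
    ]

# One row of three pixels with three bands each; the second pixel is a
# brighter copy of the first, and the third has a different spectrum.
cube = [[[1, 2, 3], [2, 4, 6], [3, 2, 1]]]
edges = horizontal_edges(cube, 0.1)
```

Here only the boundary between the second and third pixels is flagged, because the first two pixels differ in brightness but not in spectral shape.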