This project belongs to the 17th batch of college student innovation projects at Northeastern University.
The project lead and core members are Tianshuo Yuan and Wenyan Jiang.
To address information loss and the lack of datasets for some special scenarios in multi-modal imaging, we propose a high-resolution imaging method based on an improved residual generator, supported by a purpose-built virtual dataset that assists training. First, images of different modalities are cropped and padded, and then Gaussian blur is applied. The ResNet module in the CycleGAN residual generator is replaced by a Res2Net module, and a TripletAttention attention module is embedded in the last layer of the encoder. The improved CycleGAN serves as the backbone network that extracts and fuses features from images of different modalities. The Vaihingen dataset, together with the virtual dataset, is used to train the model with a small-sample learning method. Finally, the method is compared with common algorithms on the SEN1-2 dataset. Experimental results show that using the TripletAttention mechanism and adding the virtual dataset to assist training improves the accuracy of the model and makes it more robust.
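The preprocessing step (crop, pad, then Gaussian blur) can be sketched as follows. This is a minimal NumPy illustration, not the project's actual code: the function names, the centre crop, the zero padding, the reflect border handling, and the value of `sigma` are all assumptions made for the example.

```python
import numpy as np

def crop_or_pad(img, size):
    """Centre-crop or zero-pad a 2-D image to (size, size). Hypothetical helper."""
    h, w = img.shape
    top, left = max((h - size) // 2, 0), max((w - size) // 2, 0)
    img = img[top:top + size, left:left + size]   # crop any side that is too long
    h, w = img.shape
    pad_h, pad_w = size - h, size - w             # fill any side that is too short
    return np.pad(img, ((pad_h // 2, pad_h - pad_h // 2),
                        (pad_w // 2, pad_w - pad_w // 2)))

def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = int(3 * sigma + 0.5)
    xs = np.arange(-radius, radius + 1)
    k = np.exp(-(xs ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian blur: filter the rows, then the columns."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(img.astype(float), r, mode="reflect")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)
```

For example, `gaussian_blur(crop_or_pad(img, 256), sigma=1.0)` would bring a single-channel image to a fixed 256 x 256 size before blurring; the size 256 is only a placeholder.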
The model is trained using small-sample learning, and the target image is obtained after the input passes through the enhanced Res2Net feature extraction module and the embedded TripletAttention feature enhancement module.
The following diagram is the
3.2 Res2Net
Because synthetic aperture radar sensors do not provide visual information as rich as that of optical sensors, we replace the original ResNet blocks in the generator with Res2Net. This enlarges the receptive field, so the network can better extract global information from local images and local information from large images.
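To make the receptive-field claim concrete, the sketch below computes the receptive field of each output group inside one Res2Net block. It assumes the standard Res2Net layout (the features are split into `scale` groups, with y_1 = x_1, y_2 = K_2(x_2), and y_i = K_i(x_i + y_{i-1}) for i > 2, where each K_i is a stride-1 convolution). The function name and the default values are illustrative and are not taken from this project.

```python
def res2net_receptive_fields(scale=4, kernel=3):
    """Receptive field (pixels per side) of each output group y_1..y_s,
    measured relative to the split input of a single Res2Net block."""
    fields = [1]  # y_1 = x_1 is an identity path and sees one pixel
    for i in range(2, scale + 1):
        # The input to K_i is x_i (field 1), plus y_{i-1} when i > 2.
        incoming = 1 if i == 2 else max(1, fields[-1])
        # A stride-1 k x k convolution widens the field by k - 1.
        fields.append(incoming + kernel - 1)
    return fields

print(res2net_receptive_fields(scale=4))  # [1, 3, 5, 7]
```

A plain bottleneck with a single 3 x 3 convolution has a receptive field of 3 at this stage, whereas the concatenated Res2Net output mixes fields of 1, 3, 5 and 7, which is the multi-scale behaviour the paragraph above relies on.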
For an input tensor, triplet attention uses rotation operations and residual transformations to establish interdependencies between dimensions, and it encodes inter-channel and spatial information with negligible computational overhead. Compared with other attention mechanisms, triplet attention is more efficient and has fewer parameters.
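The three-branch structure can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions: it handles a single (C, H, W) tensor, uses a naive convolution, and omits the batch normalisation and learned weights of a real implementation. All function names are hypothetical, and in practice the kernels would be trained rather than supplied by hand.

```python
import numpy as np

def z_pool(x):
    """Stack max- and mean-pooling over the leading axis: (D, A, B) -> (2, A, B)."""
    return np.stack([x.max(axis=0), x.mean(axis=0)])

def conv2d_same(x, kernel):
    """Naive zero-padded convolution: (2, A, B) input, (2, k, k) kernel -> (A, B)."""
    k = kernel.shape[-1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    a, b = x.shape[1:]
    out = np.empty((a, b))
    for i in range(a):
        for j in range(b):
            out[i, j] = np.sum(xp[:, i:i + k, j:j + k] * kernel)
    return out

def attention_gate(x, kernel):
    """Z-pool over the leading axis, convolve, and gate x with a sigmoid map."""
    weights = 1.0 / (1.0 + np.exp(-conv2d_same(z_pool(x), kernel)))
    return x * weights[None, :, :]

def triplet_attention(x, kernels):
    """x: (C, H, W); kernels: three (2, k, k) arrays, one per branch."""
    # Branch 1: rotate so H leads (captures C-W interaction), then rotate back.
    b1 = attention_gate(x.transpose(1, 0, 2), kernels[0]).transpose(1, 0, 2)
    # Branch 2: rotate so W leads (captures C-H interaction), then rotate back.
    b2 = attention_gate(x.transpose(2, 1, 0), kernels[1]).transpose(2, 1, 0)
    # Branch 3: no rotation, ordinary spatial attention over (H, W).
    b3 = attention_gate(x, kernels[2])
    return (b1 + b2 + b3) / 3.0  # average the three branches
```

With 7 x 7 kernels the three branches hold 3 x 2 x 7 x 7 = 294 convolution weights in total, independent of the channel count C, which is where the small parameter footprint comes from.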
To compensate for the shortage of datasets, a virtual scene dataset was constructed in Unity. Experimental results show that adding the virtual dataset helps improve the accuracy of the model.
Left: the Vaihingen dataset. Right: the virtual dataset.