Global-to-Local Feature Mining Network for RGB-Infrared Person Re-Identification.- Semantic Transition Detection for Self-Supervised Video Scene Segmentation.- Multi-Task Collaborative Network for Image-text Retrieval.- FGENet: Fine-Grained Extraction Network for Congested Crowd Counting.- MSMV-UNet: A 2.5D Stroke Lesion Segmentation Method based on Multi-slice Feature Fusion.- Non-Local Spatial-Wise and Global Channel-Wise Transformer for
Efficient Image Super-Resolution.- MobileViT-FocR: MobileViT with Fixed-One-Centre Loss and Gradient Reversal for Generalised Fake Face Detection.- ASF-Conformer: Audio Scoring Conformer with FFC for Speaker Verification in Noisy Environments.- Prior-Knowledge-Free Video Frame Interpolation with Bidirectional Regularized Implicit Neural Representations.- Two-Stage Reasoning Network with Modality Decomposition for Text
VQA.- Localization and Local Motion Magnification of Pulsatile Regions in Endoscopic Surgery Videos.- Co-speech Gesture Generation with Variational Auto Encoder.- Differentiable Neural Architecture Search Based on Efficient Architecture for Lightweight Image Super-Resolution.- Learning Collaborative Reinforcement Attention for 3D Face Reconstruction and Dense Alignment.- Exploring Multi-Modal Fusion for Image Manipulation Detection and Localization.- Object-based Spatio-Temporal Heterogeneous Network for VideoQA.- Adaptive Token Selection and Fusion Network for Multimodal Sentiment Analysis.- Exploring Imperceptible Adversarial Examples in YCbCr Color Space.- Fractional-order image moments and applications.- Time-Quality Tradeoff of MuseHash Query Processing Performance.- Dual-Fisheye Image Stitching via Unsupervised Deep Learning.- CA-GAN: Conditional Adaptive Generative Adversarial Network for Text-to-Image Synthesis.- RDC-YOLOv5: Improved Safety Helmet Detection in Adverse Weather.- Sustainable Commercial Fishery Control using Multimedia Forensics Data from Non-trusted, Mobile Edge Nodes.- MC-TCMNER: A Multi-Modal Fusion Model Combining Contrast Learning Method for Traditional Chinese Medicine NER.- C3-PO: A Convolutional Neural Network for COVID Onset Prediction
from Cough Sounds.- Pseudo-label based Unsupervised Momentum Representation Learning for Multi-domain Image Retrieval.- DFGait: Decomposition Fusion Representation Learning for Multimodal Gait Recognition.- MoPE: Mixture of Pooling Experts Framework for Image-Text Retrieval.- Multi-Modal Video Topic Segmentation with Dual-Contrastive Domain Adaptation.- Unsupervised Multi-Collaborative Learning Network for 3D Face Reconstruction.- A Region Based Non-overlapping Reference Speech Estimation Method for Speaker Extraction.- Self-Supervised Edge Structure Learning for Multi-View Stereo and Parallel Optimization.- Prototype-Enhanced Hypergraph Learning for Heterogeneous Information Networks.- A Language-based solution to enable Metaverse Retrieval.- Part-aware Prompt Tuning For Weakly Supervised Referring Expression Grounding.- Adversarially Robust Deepfake Detection via Adversarial Feature Similarity Learning.- A Multidimensional Taxonomy Model for Music Tangible User Interfaces.