2024-01-18

Title: Temporal Embeddings: Scalable Self-Supervised Temporal Representation Learning from Spatiotemporal Data for Multimodal Computer Vision

Title: Improved Pothole Detection Using YOLOv7 and ESRGAN

Title: Online Anomaly Detection over Live Social Video Streaming

Title: One-Step Diffusion Distillation via Deep Equilibrium Models

Title: SAiD: Speech-driven Blendshape Facial Animation with Diffusion

Title: Attention Modules Improve Modern Image-Level Anomaly Detection: A DifferNet Case Study

Title: NODI: Out-Of-Distribution Detection with Noise from Diffusion

Title: Contrastive Learning with Negative Sampling Correction

Title: Unsupervised Pre-Training for 3D Leaf Instance Segmentation

Title: Revealing Vulnerabilities in Stable Diffusion via Targeted Attacks

Title: SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers

Title: Fixed Point Diffusion Models

Title: HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance

Title: Segment Anything Model Can Not Segment Anything: Assessing AI Foundation Model's Generalizability in Permafrost Mapping

Title: Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive

Title: Cross-Level Multi-Instance Distillation for Self-Supervised Fine-Grained Visual Categorization

Title: Robust Localization of Key Fob Using Channel Impulse Response of Ultra Wide Band Sensors for Keyless Entry Systems

Title: 3D Human Pose Analysis via Diffusion Synthesis

Title: COCO is "ALL" You Need for Visual Instruction Fine-tuning

Title: Hearing Loss Detection from Facial Expressions in One-on-one Conversations

Title: ACT-GAN: Radio map construction based on generative adversarial networks with ACT blocks

Title: A GAN-based data poisoning framework against anomaly detection in vertical federated learning

Title: Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR

Title: Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation

Title: VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Title: Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Title: Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior

Title: CrossVideo: Self-supervised Cross-modal Contrastive Learning for Point Cloud Video Understanding

Title: Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models

Title: UniVG: Towards UNIfied-modal Video Generation

Title: Machine Learning for Healthcare-IoT Security: A Review and Risk Mitigation

Title: SM$^3$: Self-Supervised Multi-task Modeling with Multi-view 2D Images for Articulated Objects

Title: Unsupervised Multiple Domain Translation through Controlled Disentanglement in Variational Autoencoder

Title: Training-Free Semantic Video Composition via Pre-trained Diffusion Model

Title: Siamese Meets Diffusion Network: SMDNet for Enhanced Change Detection in High-Resolution RS Imagery

Title: POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images

Title: Vlogger: Make Your Dream A Vlog

Title: TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion

Title: Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model