2024-12-13

Title: ChatDyn: Language-Driven Multi-Actor Dynamics Generation in Street Scenes

Title: ProtoOcc: Accurate, Efficient 3D Occupancy Prediction Using Dual Branch Encoder-Prototype Query Decoder

Title: Generative Modeling with Explicit Memory

Title: DALI: Domain Adaptive LiDAR Object Detection via Distribution-level and Instance-level Pseudo Label Denoising

Title: ViUniT: Visual Unit Tests for More Robust Visual Programming

Title: Radiology Report Generation via Multi-objective Preference Optimization

Title: Reversing the Damage: A QP-Aware Transformer-Diffusion Approach for 8K Video Restoration under Codec Compression

Title: Dynamic Contrastive Knowledge Distillation for Efficient Image Restoration

Title: Selective Visual Prompting in Vision Mamba

Title: Mojito: Motion Trajectory and Intensity Control for Video Generation

Title: Elevating Flow-Guided Video Inpainting with Reference Generation

Title: Enhancing Facial Consistency in Conditional Video Generation via Facial Landmark Transformation

Title: MS2Mesh-XR: Multi-modal Sketch-to-Mesh Generation in XR Environments

Title: Arbitrary-steps Image Super-resolution via Diffusion Inversion

Title: Video Anomaly Detection with Motion and Appearance Guided Patch Diffusion Model

Title: An Efficient Framework for Enhancing Discriminative Models via Diffusion Techniques

Title: Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method

Title: LVMark: Robust Watermark for latent video diffusion models

Title: Pinpoint Counterfactuals: Reducing social bias in foundation models via localized counterfactual generation

Title: RAD: Region-Aware Diffusion Models for Image Inpainting

Title: ExpRDiff: Short-exposure Guided Diffusion Model for Realistic Local Motion Deblurring

Title: eCARLA-scenes: A synthetically generated dataset for event-based optical flow prediction

Title: LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync

Title: InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption

Title: Transfer Learning of RSSI to Improve Indoor Localisation Performance

Title: GoHD: Gaze-oriented and Highly Disentangled Portrait Animation with Rhythmic Poses and Realistic Expression

Title: T-SVG: Text-Driven Stereoscopic Video Generation

Title: Are Conditional Latent Diffusion Models Effective for Image Restoration?

Title: Auto-Regressive Moving Diffusion Models for Time Series Forecasting

Title: Causal Graphical Models for Vision-Language Compositional Understanding

Title: UFO: Enhancing Diffusion-Based Video Generation with a Uniform Frame Organizer

Title: Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation

Title: Search Strategy Generation for Branch and Bound Using Genetic Programming

Title: OFTSR: One-Step Flow for Image Super-Resolution with Tunable Fidelity-Realism Trade-offs

Title: SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing

Title: Video Creation by Demonstration

Title: LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors

Title: Owl-1: Omni World Model for Consistent Long Video Generation

Title: SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding

Title: Spectral Image Tokenizer

Title: FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers

Title: Olympus: A Universal Task Router for Computer Vision Tasks

Title: Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG

Title: EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM

Title: SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

Title: LoRACLR: Contrastive Adaptation for Customization of Diffusion Models

Title: OmniDrag: Enabling Motion Control for Omnidirectional Image-to-Video Generation

Title: GenEx: Generating an Explorable World

Title: Illusion3D: 3D Multiview Illusion with 2D Diffusion Priors

Title: FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion

Title: Doe-1: Closed-Loop Autonomous Driving with Large World Model