2024-12-02

Title: Scene Co-pilot: Procedural Text to Video Generation with Human in the Loop

Title: RoMo: Robust Motion Segmentation Improves Structure from Motion

Title: AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward

Title: OOD-HOI: Text-Driven 3D Whole-Body Human-Object Interactions Generation Beyond Training Domains

Title: HoliSDiP: Image Super-Resolution via Holistic Semantics and Diffusion Prior

Title: Towards Chunk-Wise Generation for Long Videos

Title: FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models

Title: AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers

Title: Active Data Curation Effectively Distills Large-Scale Multimodal Models

Title: MatchDiffusion: Training-free Generation of Match-cuts

Title: Random Walks with Tweedie: A Unified Framework for Diffusion Models

Title: Generative Visual Communication in the Era of Vision-Language Models

Title: DiffMVR: Diffusion-based Automated Multi-Guidance Video Restoration

Title: Enhancing Compositional Text-to-Image Generation with Reliable Random Seeds

Title: FaithDiff: Unleashing Diffusion Priors for Faithful Image Super-resolution

Title: CrossTracker: Robust Multi-modal 3D Multi-Object Tracking via Cross Correction

Title: RIGI: Rectifying Image-to-3D Generation Inconsistency via Uncertainty-aware Learning

Title: T2SG: Traffic Topology Scene Graph for Topology Reasoning in Autonomous Driving

Title: Textured As-Is BIM via GIS-informed Point Cloud Segmentation

Title: Data Augmentation with Diffusion Models for Colon Polyp Localization on the Low Data Regime: How much real data is enough?

Title: VIPaint: Image Inpainting with Pre-Trained Diffusion Models via Variational Inference

Title: ICLERB: In-Context Learning Embedding and Reranker Benchmark

Title: Random Sampling for Diffusion-based Adversarial Purification

Title: SPAgent: Adaptive Task Decomposition and Model Selection for General Video Generation and Editing

Title: Locally-Focused Face Representation for Sketch-to-Image Generation Using Noise-Induced Refinement

Title: 3D-WAG: Hierarchical Wavelet-Guided Autoregressive Generation for High-Fidelity 3D Shapes

Title: I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for Text-Guided Multi-Mask Inpainting

Title: VARCO-VISION: Expanding Frontiers in Korean Vision-Language Models

Title: Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Title: Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models

Title: MSG score: A Comprehensive Evaluation for Multi-Scene Video Generation

Title: Deep Learning for GWP Prediction: A Framework Using PCA, Quantile Transformation, and Ensemble Modeling

Title: LoRA of Change: Learning to Generate LoRA for the Editing Instruction from A Single Before-After Image Pair

Title: SOWing Information: Cultivating Contextual Coherence with MLLMs in Image Generation

Title: Automatic Prompt Generation and Grounding Object Detection for Zero-Shot Image Anomaly Detection

Title: Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG

Title: Z-STAR+: A Zero-shot Style Transfer Method via Adjusting Style Distribution

Title: Gaussians-to-Life: Text-Driven Animation of 3D Gaussian Splatting Scenes

Title: Face2QR: A Unified Framework for Aesthetic, Face-Preserving, and Scannable QR Code Generation

Title: Improving Multi-Subject Consistency in Open-Domain Image Generation with Isolation and Reposition Attention

Title: Trajectory Attention for Fine-grained Video Motion Control

Title: Libra: Leveraging Temporal Images for Biomedical Radiology Analysis

Title: DreamBlend: Advancing Personalized Fine-tuning of Text-to-Image Diffusion Models

Title: AMO Sampler: Enhancing Text Rendering with Overshooting

Title: Any-Resolution AI-Generated Image Detection by Spectral Learning

Title: Fleximo: Towards Flexible Text-to-Human Motion Video Generation

Title: V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified Flow

Title: Interleaved-Modal Chain-of-Thought

Title: Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis

Title: DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding

Title: RAGDiffusion: Faithful Cloth Generation via External Knowledge Assimilation

Title: QUOTA: Quantifying Objects with Text-to-Image Models for Any Domain

Title: Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook

Title: ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration

Title: Tortho-Gaussian: Splatting True Digital Orthophoto Maps

Title: FairDD: Fair Dataset Distillation via Synchronized Matching

Title: Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing

Title: TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting

Title: JetFormer: An Autoregressive Generative Model of Raw Images and Text

Title: A Note on Small Percolating Sets on Hypercubes via Generative AI

Title: MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks

Title: $C^{3}$-NeRF: Modeling Multiple Scenes via Conditional-cum-Continual Neural Radiance Fields

Title: SIMS: Simulating Human-Scene Interactions with Real World Script Planning

Title: Free-form Generation Enhances Challenging Clothed Human Modeling