2025-03-20

Title: Synthetic Data Generation of Body Motion Data by Neural Gas Network for Emotion Recognition

Title: Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control

Title: Salient Temporal Encoding for Dynamic Scene Graph Generation

Title: Sampling Decisions

Title: Potential Score Matching: Debiasing Molecular Structure Sampling with Potential Energy Guidance

Title: Robust Weight Imprinting: Insights from Neural Collapse and Proxy-Based Aggregation

Title: A Simple Combination of Diffusion Models for Better Quality Trade-Offs in Image Denoising

Title: Elevating Visual Question Answering through Implicitly Learned Reasoning Pathways in LVLMs

Title: ShapeShift: Towards Text-to-Shape Arrangement Synthesis with Content-Aware Geometric Constraints

Title: Decompositional Neural Scene Reconstruction with Generative Diffusion Prior

Title: SemanticFlow: A Self-Supervised Framework for Joint Scene Flow Prediction and Instance Segmentation in Dynamic Environments

Title: LogLLaMA: Transformer-based log anomaly detection with LLaMA

Title: Temporal-Consistent Video Restoration with Pre-trained Diffusion Models

Title: Efficient Personalization of Quantized Diffusion Model without Backpropagation

Title: DPFlow: Adaptive Optical Flow Estimation with a Dual-Pyramid Framework

Title: Exploring the Limits of KV Cache Compression in Visual Autoregressive Transformers

Title: GenM$^3$: Generative Pretrained Multi-path Motion Model for Text Conditional Human Motion Generation

Title: Shushing! Let's Imagine an Authentic Speech from the Silent Video

Title: 3D Engine-ready Photorealistic Avatars via Dynamic Textures

Title: MMAIF: Multi-task and Multi-degradation All-in-One for Image Fusion with Language Guidance

Title: Generating Multimodal Driving Scenes via Next-Scene Prediction

Title: Neuro Symbolic Knowledge Reasoning for Procedural Video Question Answering

Title: Universal Scene Graph Generation

Title: Exploiting Diffusion Prior for Real-World Image Dehazing with Unpaired Training

Title: Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene

Title: Multivariate Gaussian Topic Modelling: A novel approach to discover topics with greater semantic coherence

Title: Single-Step Bidirectional Unpaired Image Translation Using Implicit Bridge Consistency Distillation

Title: Conjuring Positive Pairs for Efficient Unification of Representation Learning and Image Synthesis

Title: DeCaFlow: A Deconfounding Causal Generative Model

Title: VideoGen-of-Thought: Step-by-step generating multi-shot video with minimal manual intervention

Title: Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization

Title: DiST-4D: Disentangled Spatiotemporal Diffusion with Metric Depth for 4D Driving Scene Generation

Title: LEGION: Learning to Ground and Explain for Synthetic Image Detection

Title: DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning

Title: TF-TI2I: Training-Free Text-and-Image-to-Image Generation via Multi-Modal Implicit-Context Learning in Text-to-Image Models

Title: Learn Your Scales: Towards Scale-Consistent Generative Novel View Synthesis

Title: Temporal Regularization Makes Your Video Generator Stronger

Title: LIFT: Latent Implicit Functions for Task- and Data-Agnostic Encoding

Title: MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space

Title: Di$\mathtt{[M]}$O: Distilling Masked Diffusion Models into One-step Generator

Title: FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers

Title: Toward task-driven satellite image super-resolution

Title: Cube: A Roblox View of 3D Intelligence

Title: TULIP: Towards Unified Language-Image Pretraining