2025-10-09

Title: CML-Bench: A Framework for Evaluating and Enhancing LLM-Powered Movie Scripts Generation

Title: RareGraph-Synth: Knowledge-Guided Diffusion Models for Generating Privacy-Preserving Synthetic Patient Trajectories in Ultra-Rare Diseases

Title: General and Efficient Visual Goal-Conditioned Reinforcement Learning using Object-Agnostic Masks

Title: Traj-Transformer: Diffusion Models with Transformer for GPS Trajectory Generation

Title: BlockGPT: Spatio-Temporal Modelling of Rainfall via Frame-Level Autoregression

Title: RGBD Gaze Tracking Using Transformer for Feature Fusion

Title: SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation

Title: Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding

Title: TransFIRA: Transfer Learning for Face Image Recognizability Assessment

Title: TDiff: Thermal Plug-And-Play Prior with Patch-Based Diffusion

Title: SIGMA-GEN: Structure and Identity Guided Multi-subject Assembly for Image Generation

Title: Attention Sinks and Compression Valleys in LLMs are Two Sides of the Same Coin

Title: Valid Stopping for LLM Generation via Empirical Dynamic Formal Lift

Title: Text2Interact: High-Fidelity and Diverse Text-to-Two-Person Interaction Generation

Title: Text-to-Image Models Leave Identifiable Signatures: Implications for Leaderboard Security

Title: VUGEN: Visual Understanding priors for GENeration

Title: Incoherence in goal-conditioned autoregressive models

Title: HSNet: Heterogeneous Subgraph Network for Single Image Super-resolution

Title: Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer

Title: SDQM: Synthetic Data Quality Metric for Object Detection Dataset Evaluation

Title: AIM 2025 Challenge on Real-World RAW Image Denoising

Title: POME: Post Optimization Model Edit via Muon-style Projection

Title: Three Forms of Stochastic Injection for Improved Distribution-to-Distribution Generative Modeling

Title: StaR-KVQA: Structured Reasoning Traces for Implicit-Knowledge Visual Question Answering

Title: The False Promise of Zero-Shot Super-Resolution in Machine-Learned Operators

Title: Heptapod: Language Modeling on Visual Signals

Title: DreamOmni2: Multimodal Instruction-based Editing and Generation

Title: A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking

Title: Evaluating LLMs for Historical Document OCR: A Methodological Framework for Digital Humanities

Title: Extreme Amodal Face Detection

Title: StyleKeeper: Prevent Content Leakage using Negative Visual Query Guidance

Title: Continual Action Quality Assessment via Adaptive Manifold-Aligned Graph Regularization

Title: SaFeR-VLM: Toward Safety-aware Fine-grained Reasoning in Multimodal Models

Title: Utilizing Large Language Models for Machine Learning Explainability

Title: DecompGAIL: Learning Realistic Traffic Behaviors with Decomposed Multi-Agent Generative Adversarial Imitation Learning

Title: Label-frugal satellite image change detection with generative virtual exemplar learning

Title: IAR2: Improving Autoregressive Visual Generation with Semantic-Detail Associated Token Prediction

Title: OBJVanish: Physically Realizable Text-to-3D Adv. Generation of LiDAR-Invisible Objects

Title: Generating Surface for Text-to-3D using 2D Gaussian Splatting

Title: Addressing the ID-Matching Challenge in Long Video Captioning

Title: No MoCap Needed: Post-Training Motion Diffusion Models with Reinforcement Learning using Only Textual Prompts

Title: Sharpness-Aware Data Generation for Zero-shot Quantization

Title: Generative World Modelling for Humanoids: 1X World Model Challenge Technical Report

Title: Enhancing Concept Localization in CLIP-based Concept Bottleneck Models

Title: Graph Conditioned Diffusion for Controllable Histopathology Image Generation

Title: A Multi-Agent Framework for Stateful Inference-Time Search

Title: MV-Performer: Taming Video Diffusion Model for Faithful and Synchronized Multi-view Performer Synthesis

Title: EigenScore: OOD Detection using Covariance in Diffusion Models

Title: GenPilot: A Multi-Agent System for Test-Time Prompt Optimization in Image Generation

Title: TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation

Title: SpecGuard: Spectral Projection-based Advanced Invisible Watermarking

Title: MATRIX: Mask Track Alignment for Interaction-aware Video Generation

Title: WristWorld: Generating Wrist-Views via 4D World Models for Robotic Manipulation

Title: Pixel-Perfect Depth with Semantics-Prompted Diffusion Transformers

Title: Temporal Prompting Matters: Rethinking Referring Video Object Segmentation