2025-06-13

Title: Multimodal Cinematic Video Synthesis Using Text-to-Image and Audio Generation Models

Title: NOCL: Node-Oriented Conceptualization LLM for Graph Tasks without Message Passing

Title: A Survey of Automatic Evaluation Methods on Text, Visual and Speech Generations

Title: GenBreak: Red Teaming Text-to-Image Generators Using Large Language Models

Title: TaskCraft: Automated Generation of Agentic Tasks

Title: LoRA-Edit: Controllable First-Frame-Guided Video Editing via Mask-Aware LoRA Fine-Tuning

Title: Test-Time Adaptation for Generalizable Task Progress Estimation

Title: Optimizing Latent Dimension Allocation in Hierarchical VAEs: Balancing Attenuation and Information Retention for OOD Detection

Title: EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models

Title: NnD: Diffusion-based Generation of Physically-Nonnegative Objects

Title: SPARKE: Scalable Prompt-Aware Diversity Guidance in Diffusion Models via RKE Score

Title: Geometric Regularity in Deterministic Sampling of Diffusion-based Generative Models

Title: Attention, Please! Revisiting Attentive Probing for Masked Image Modeling

Title: Scalable Non-Equivariant 3D Molecule Generation via Rotational Alignment

Title: ScoreMix: Improving Face Recognition via Score Composition in Diffusion Generators

Title: Graph-MLLM: Harnessing Multimodal Large Language Models for Multimodal Graph Learning

Title: PointGS: Point Attention-Aware Sparse View Synthesis with Gaussian Splatting

Title: Can We Infer Confidential Properties of Training Data from LLMs?

Title: Leveraging 6DoF Pose Foundation Models For Mapping Marine Sediment Burial

Title: EQA-RM: A Generative Embodied Reward Model with Test-time Scaling

Title: ReconMOST: Multi-Layer Sea Temperature Reconstruction with Observations-Guided Diffusion

Title: Pisces: An Auto-regressive Foundation Model for Image Understanding and Generation

Title: Generative Algorithms for Wildfire Progression Reconstruction from Multi-Modal Satellite Active Fire Measurements and Terrain Height

Title: PAG: Multi-Turn Reinforced LLM Self-Correction with Policy as Generative Verifier

Title: Rethinking Generative Human Video Coding with Implicit Motion Transformation

Title: A Crack in the Bark: Leveraging Public Knowledge to Remove Tree-Ring Watermarks

Title: Equivariant Neural Diffusion for Molecule Generation

Title: DreamActor-H1: High-Fidelity Human-Product Demonstration Video Generation via Motion-designed Diffusion Transformers

Title: DanceChat: Large Language Model-Guided Music-to-Dance Generation

Title: Harmonizing Geometry and Uncertainty: Diffusion with Hyperspheres

Title: High-resolution efficient image generation from WiFi CSI using a pretrained latent diffusion model

Title: TexTailor: Customized Text-aligned Texturing via Effective Resampling

Title: Hessian Geometry of Latent Space in Generative Models

Title: Anatomy-Grounded Weakly Supervised Prompt Tuning for Chest X-ray Latent Diffusion Models

Title: Symmetrical Flow Matching: Unified Image Generation, Segmentation, and Classification with Score-Based Generative Models

Title: GigaVideo-1: Advancing Video Generation via Automatic Feedback with 4 GPU-Hours Fine-Tuning

Title: Unsourced Adversarial CAPTCHA: A Bi-Phase Adversarial CAPTCHA Framework

Title: ConTextTab: A Semantics-Aware Tabular In-Context Learner

Title: Uncertainty-Masked Bernoulli Diffusion for Camouflaged Object Detection Refinement

Title: IQE-CLIP: Instance-aware Query Embedding for Zero-/Few-shot Anomaly Detection in Medical Domain

Title: ME: Trigger Element Combination Backdoor Attack on Copyright Infringement

Title: Dense Associative Memory with Epanechnikov Energy

Title: Accelerating Diffusion Large Language Models with SlowFast: The Three Golden Principles

Title: Analyzing the relationships between pretraining language, phonetic, tonal, and speaker information in self-supervised speech models

Title: The Diffusion Duality

Title: AIR: Zero-shot Generative Model Adaptation with Iterative Refinement

Title: Foundation Models for Causal Inference via Prior-Data Fitted Networks

Title: M4V: Multi-Modal Mamba for Text-to-Video Generation

Title: VINCIE: Unlocking In-context Image Editing from Video

Title: ReGuidance: A Simple Diffusion Wrapper for Boosting Sample Quality on Hard Inverse Problems

Title: Understanding In-Context Learning on Structured Manifolds: Bridging Attention to Kernel Methods

Title: SpectralAR: Spectral Autoregressive Visual Generation

Title: MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning

Title: Fine-Grained Perturbation Guidance via Attention Head Selection

Title: InstaInpaint: Instant 3D-Scene Inpainting with Masked Large Reconstruction Model

Title: SceneCompleter: Dense 3D Scene Completion for Generative Novel View Synthesis

Title: Rethinking Losses for Diffusion Bridge Samplers