2025-06-10

Title: Wine Quality Prediction with Ensemble Trees: A Unified, Leak-Free Comparative Study

Title: ExplainBench: A Benchmark Framework for Local Model Explanations in Fairness-Critical Applications

Title: From Transformers to Large Language Models: A systematic review of AI applications in the energy sector towards Agentic Digital Twins

Title: Beyond the Norm: A Survey of Synthetic Data Generation for Rare Events

Title: Exploring Adversarial Watermarking in Transformer-Based Models: Transferability and Robustness Against Defense Mechanism for Medical Images

Title: Unlocking Chemical Insights: Superior Molecular Representations from Intermediate Encoder Layers

Title: Synthetic Problem Generation for Reasoning via Quality-Diversity Algorithms

Title: Hierarchical and Collaborative LLM-Based Control for Multi-UAV Motion and Communication in Integrated Terrestrial and Non-Terrestrial Networks

Title: A Deep Learning Approach for Facial Attribute Manipulation and Reconstruction in Surveillance and Reconnaissance

Title: Towards Efficient Multi-LLM Inference: Characterization and Analysis of LLM Routing and Hierarchical Techniques

Title: Breaking Data Silos: Towards Open and Scalable Mobility Foundation Models via Generative Continual Learning

Title: A Systematic Investigation on Deep Learning-Based Omnidirectional Image and Video Super-Resolution

Title: RecipeGen: A Step-Aligned Multimodal Benchmark for Real-World Recipe Generation

Title: Training-Free Identity Preservation in Stylized Image Generation Using Diffusion Models

Title: IMPA-HGAE: Intra-Meta-Path Augmented Heterogeneous Graph Autoencoder

Title: Controllable Coupled Image Generation via Diffusion Models

Title: Face recognition on point cloud with cgan-top for denoising

Title: LaTtE-Flow: Layerwise Timestep-Expert Flow-based Transformer

Title: Task-driven real-world super-resolution of document scans

Title: AR-RAG: Autoregressive Retrieval Augmentation for Image Generation

Title: DM$^3$Net: Dual-Camera Super-Resolution via Domain Modulation and Multi-scale Matching

Title: Towards Physics-informed Diffusion for Anomaly Detection in Trajectories

Title: MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks

Title: Interpretable and Reliable Detection of AI-Generated Images via Grounded Reasoning in MLLMs

Title: D2R: dual regularization loss with collaborative adversarial generation for model robustness

Title: SceneLCM: End-to-End Layout-Guided Interactive Indoor Scene Generation with Latent Consistency Model

Title: Quality-Diversity Red-Teaming: Automated Generation of High-Quality and Diverse Attackers for Large Language Models

Title: Hi-VAE: Efficient Video Autoencoding with Global and Detailed Motion

Title: AMoPO: Adaptive Multi-objective Preference Optimization without Reward Models and Reference Models

Title: Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models

Title: GGBall: Graph Generative Model on Poincaré Ball

Title: TV-LiVE: Training-Free, Text-Guided Video Editing via Layer Informed Vitality Exploitation

Title: Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning

Title: Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification

Title: From Generation to Generalization: Emergent Few-Shot Learning in Video Diffusion Models

Title: Multi-Step Guided Diffusion for Image Restoration on Edge Devices: Toward Lightweight Perception in Embodied AI

Title: FANVID: A Benchmark for Face and License Plate Recognition in Low-Resolution Videos

Title: Generative Modeling of Networked Time-Series via Transformer Architectures

Title: Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models

Title: Generative Models at the Frontier of Compression: A Survey on Generative Face Video Coding

Title: ARGUS: Hallucination and Omission Evaluation in Video-LLMs

Title: MrM: Black-Box Membership Inference Attacks against Multimodal RAG Systems

Title: InverseScope: Scalable Activation Inversion for Interpreting Large Language Models

Title: Compressed Feature Quality Assessment: Dataset and Baselines

Title: LiteVLM: A Low-Latency Vision-Language Model Inference Pipeline for Resource-Constrained Environments

Title: PhysiInter: Integrating Physical Mapping for High-Fidelity Human Interaction Generation

Title: ProteinZero: Self-Improving Protein Generation via Online Reinforcement Learning

Title: GLOS: Sign Language Generation with Temporally Aligned Gloss-Level Conditioning

Title: Drive Any Mesh: 4D Latent Diffusion for Mesh Deformation from Video

Title: Genesis: Multimodal Driving Scene Generation with Spatio-Temporal and Cross-Modal Consistency

Title: Addressing Correlated Latent Exogenous Variables in Debiased Recommender Systems

Title: Domain Randomization for Object Detection in Manufacturing Applications using Synthetic Data: A Comprehensive Study

Title: APTOS-2024 challenge report: Generation of synthetic 3D OCT images from fundus photographs

Title: Synthesize Privacy-Preserving High-Resolution Images via Private Textual Intermediaries

Title: OpenDance: Multimodal Controllable 3D Dance Generation Using Large-scale Internet Data

Title: LLM-driven Indoor Scene Layout Generation via Scaled Human-aligned Data Synthesis and Multi-Stage Preference Optimization

Title: Explore the vulnerability of black-box models via diffusion models

Title: TwinBreak: Jailbreaking LLM Security Alignments based on Twin Prompts

Title: SceneRAG: Scene-level Retrieval-Augmented Generation for Video Understanding

Title: Scaling Human Activity Recognition: A Comparative Evaluation of Synthetic Data Generation and Augmentation Techniques

Title: Synthetic Visual Genome

Title: NOVA3D: Normal Aligned Video Diffusion Model for Single Image to 3D Generation

Title: Adaptive Blind Super-Resolution Network for Spatial-Specific and Spatial-Agnostic Degradations

Title: Evaluating Robustness in Latent Diffusion Models via Embedding Level Augmentation

Title: Consistent Video Editing as Flow-Driven Image-to-Video Generation

Title: AssetDropper: Asset Extraction via Diffusion Models with Reward-Driven Optimization

Title: Flow-Anything: Learning Real-World Optical Flow Estimation from Large-Scale Single-view Images

Title: Difference Inversion: Interpolate and Isolate the Difference with Token Consistency for Image Analogy Generation

Title: Comparing Credit Risk Estimates in the Gen-AI Era

Title: Language-Vision Planner and Executor for Text-to-Visual Reasoning

Title: Re-ranking Reasoning Context with Tree Search Makes Large Vision-Language Models Stronger

Title: Incorporating Uncertainty-Guided and Top-k Codebook Matching for Real-World Blind Image Super-Resolution

Title: Self-Cascaded Diffusion Models for Arbitrary-Scale Image Super-Resolution

Title: M2Restore: Mixture-of-Experts-based Mamba-CNN Fusion Framework for All-in-One Image Restoration

Title: Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation

Title: R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation

Title: Diffusion models under low-noise regime

Title: Jarzynski Reweighting and Sampling Dynamics for Training Energy-Based Models: Theoretical Analysis of Different Transition Kernels

Title: PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement

Title: SAM2Auto: Auto Annotation Using FLASH

Title: VIVAT: Virtuous Improving VAE Training through Artifact Mitigation

Title: Diffusion Counterfactual Generation with Semantic Abduction

Title: EgoM2P: Egocentric Multimodal Multitask Pretraining

Title: Video Unlearning via Low-Rank Refusal Vector

Title: FunDiff: Diffusion Models over Function Spaces for Physics-Informed Generative Modeling

Title: Diffuse Everything: Multimodal Diffusion Models on Arbitrary State Spaces

Title: W4S4: WaLRUS Meets S4 for Long-Range Sequence Modeling

Title: A Generative Physics-Informed Reinforcement Learning-Based Approach for Construction of Representative Drive Cycle

Title: TokenBreak: Bypassing Text Classification Models Through Token Manipulation

Title: Cost-Optimal Active AI Model Evaluation

Title: SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design

Title: OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Title: Realistic Urban Traffic Generator using Decentralized Federated Learning for the SUMO simulator

Title: CXR-LT 2024: A MICCAI challenge on long-tailed, multi-label, and zero-shot disease classification from chest X-ray

Title: Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Title: Generative Modeling of Weights: Generalization or Memorization?

Title: MADFormer: Mixed Autoregressive and Diffusion Transformers for Continuous Image Generation

Title: Audio-Sync Video Generation with Multi-Stream Temporal Control

Title: Dreamland: Controllable World Creation with Simulator and Generative Models

Title: Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion