2025-06-24

Title: Mechanistic Interpretability of Diffusion Models: Circuit-Level Analysis and Causal Validation

Title: Recursive Learning-Based Virtual Buffering for Analytical Global Placement

Title: Origins of Creativity in Attention-Based Diffusion Models

Title: A Novel Multi-layer Task-centric and Data Quality Framework for Autonomous Driving

Title: Efficient Feedback Gate Network for Hyperspectral Image Super-Resolution

Title: SAFEx: Analyzing Vulnerabilities of MoE-Based LLMs via Stable Safety-critical Expert Identification

Title: Aha Moment Revisited: Are VLMs Truly Capable of Self Verification in Inference-time Scaling?

Title: VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models

Title: LLM-driven Medical Report Generation via Communication-efficient Heterogeneous Federated Learning

Title: LFR-PINO: A Layered Fourier Reduced Physics-Informed Neural Operator for Parametric PDEs

Title: OpenMAP-BrainAge: Generalizable and Interpretable Brain Age Predictor

Title: HIRE: Lightweight High-Resolution Image Feature Enrichment for Multimodal LLMs

Title: Optimization-Free Patch Attack on Stereo Depth Estimation

Title: Histopathology Image Report Generation by Vision Language Model with Multimodal In-Context Learning

Title: DreamJourney: Perpetual View Generation with Video Diffusion Models

Title: Programmable-Room: Interactive Textured 3D Room Meshes Generation Empowered by Large Language Models

Title: PhysID: Physics-based Interactive Dynamics from a Single-view Image

Title: PhysiX: A Foundation Model for Physics Simulations

Title: Toward Autonomous UI Exploration: The UIExplorer Benchmark

Title: Beyond instruction-conditioning, MoTE: Mixture of Task Experts for Multi-task Embedding Models

Title: Reimagining Parameter Space Exploration with Diffusion Models

Title: Aligning Frozen LLMs by Reinforcement Learning: An Iterative Reweight-then-Optimize Approach

Title: A Comparative Study of Open-Source Libraries for Synthetic Tabular Data Generation: SDV vs. SynthCity

Title: PlanMoGPT: Flow-Enhanced Progressive Planning for Text to Motion Synthesis

Title: GEMeX-ThinkVG: Towards Thinking with Visual Grounding in Medical VQA via Reinforcement Learning

Title: Adapting Vision-Language Models for Evaluating World Models

Title: BPCLIP: A Bottom-up Image Quality Assessment from Distortion to Semantics Based on CLIP

Title: Enabling PSO-Secure Synthetic Data Sharing Using Diversity-Aware Diffusion Models

Title: Imputation of Longitudinal Data Using GANs: Challenges and Implications for Classification

Title: ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation

Title: RL for Reasoning by Adaptively Revealing Rationales

Title: Targeted False Positive Synthesis via Detector-guided Adversarial Diffusion Attacker for Robust Polyp Detection

Title: Pattern-Based Phase-Separation of Tracer and Dispersed Phase Particles in Two-Phase Defocusing Particle Tracking Velocimetry

Title: Make It Efficient: Dynamic Sparse Attention for Autoregressive Image Generation

Title: Semantic Structure-Aware Generative Attacks for Enhanced Adversarial Transferability

Title: Improving Weakly Supervised Temporal Action Localization by Exploiting Multi-resolution Information in Temporal Domain

Title: Adaptive Mask-guided K-space Diffusion for Accelerated MRI Reconstruction

Title: Instability in Diffusion ODEs: An Explanation for Inaccurate Image Reconstruction

Title: Rapeseed population point cloud completion network (RP-PCN) with dynamic graph convolution for 3D reconstruction of crop canopy occlusion architecture

Title: NSFW-Classifier Guided Prompt Sanitization for Safe Text-to-Image Generation

Title: Geometry-Aware Preference Learning for 3D Texture Generation

Title: Rethinking Decoder Design: Improving Biomarker Segmentation Using Depth-to-Space Restoration and Residual Linear Attention

Title: Controlled Generation with Equivariant Variational Flow Matching

Title: BSMamba: Brightness and Semantic Modeling for Long-Range Interaction in Low-Light Image Enhancement

Title: RePIC: Reinforced Post-Training for Personalizing Multi-Modal Language Models

Title: CPAM: Context-Preserving Adaptive Manipulation for Zero-Shot Real Image Editing

Title: GANs vs. Diffusion Models for virtual staining with the HER2match dataset

Title: ShowFlow: From Robust Single Concept to Condition-Free Multi-Concept Generation

Title: PuckTrick: A Library for Making Synthetic Data More Realistic

Title: MedTVT-R1: A Multimodal LLM Empowering Medical Reasoning and Diagnosis

Title: Enhancing Image Restoration Transformer via Adaptive Translation Equivariance

Title: Auto-Regressively Generating Multi-View Consistent Images

Title: VQ-Insight: Teaching VLMs for AI-Generated Video Quality Understanding via Progressive Visual Reinforcement Learning

Title: VisualChef: Generating Visual Aids in Cooking via Mask Inpainting

Title: No Training Wheels: Steering Vectors for Bias Correction at Inference Time

Title: Simulation-Free Differential Dynamics through Neural Conservation Laws

Title: On Union-Closedness of Language Generation

Title: RDPO: Real Data Preference Optimization for Physics Consistency Video Generation

Title: Historical Report Guided Bi-modal Concurrent Learning for Pathology Report Generation

Title: SIM-Net: A Multimodal Fusion Network Using Inferred 3D Object Shape Point Clouds from RGB Images for 2D Classification

Title: Matrix-Game: Interactive World Foundation Model

Title: USVTrack: USV-Based 4D Radar-Camera Tracking Dataset for Autonomous Driving in Inland Waterways

Title: ContinualFlow: Learning and Unlearning with Neural Flow Matching

Title: 3D Arena: An Open Platform for Generative 3D Evaluation

Title: 4Real-Video-V2: Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation

Title: Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset

Title: TAMMs: Temporal-Aware Multimodal Model for Satellite Image Change Understanding and Forecasting

Title: OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation

Title: OmniGen2: Exploration to Advanced Multimodal Generation

Title: Let Your Video Listen to Your Music!

Title: Universal Video Temporal Grounding with Generative Multi-modal Large Language Models

Title: 4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time

Title: Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

Title: FilMaster: Bridging Cinematic Principles and Generative AI for Automated Film Generation

Title: From Virtual Games to Real-World Play

Title: VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory