2025-08-19

Title: Sparse Attention across Multiple-context KV Cache

Title: FusionFM: Fusing Eye-specific Foundational Models for Optimized Ophthalmic Diagnosis

Title: Scalable Geospatial Data Generation Using AlphaEarth Foundations Model

Title: FairTabGen: Unifying Counterfactual and Causal Fairness in Synthetic Tabular Data Generation

Title: Large Kernel Modulation Network for Efficient Image Super-Resolution

Title: SafeCtrl: Region-Based Safety Control for Text-to-Image Diffusion via Detect-Then-Suppress

Title: Assessment of Using Synthetic Data in Brain Tumor Segmentation

Title: Deep Learning For Point Cloud Denoising: A Survey

Title: Extending Straight-Through Estimation for Robust Neural Networks on Analog CIM Hardware

Title: UniUGG: Unified 3D Understanding and Generation via Geometric-Semantic Encoding

Title: MOON: Generative MLLM-based Multimodal Representation Learning for E-commerce Product Understanding

Title: Q-FSRU: Quantum-Augmented Frequency-Spectral Fusion for Medical Visual Question Answering

Title: Content Accuracy and Quality Aware Resource Allocation Based on LP-Guided DRL for ISAC-Driven AIGC Networks

Title: VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models

Title: Generic Event Boundary Detection via Denoising Diffusion

Title: Error Propagation Mechanisms and Compensation Strategies for Quantized Diffusion

Title: Generative Medical Event Models Improve with Scale

Title: VELVET-Med: Vision and Efficient Language Pre-training for Volumetric Imaging Tasks in Medicine

Title: Demystifying Foreground-Background Memorization in Diffusion Models

Title: Scalable RF Simulation in Generative 4D Worlds

Title: Distribution Matching via Generalized Consistency Models

Title: In vivo 3D ultrasound computed tomography of musculoskeletal tissues with generative neural physics

Title: SNNSIR: A Simple Spiking Neural Network for Stereo Image Restoration

Title: Semantic Discrepancy-aware Detector for Image Forgery Identification

Title: Navigating the Exploration-Exploitation Tradeoff in Inference-Time Scaling of Diffusion Models

Title: DeCoT: Decomposing Complex Instructions for Enhanced Text-to-Image Generation with Large Language Models

Title: Federated Cross-Modal Style-Aware Prompt Generation

Title: MPCAR: Multi-Perspective Contextual Augmentation for Enhanced Visual Reasoning in Large Vision-Language Models

Title: TiP4GEN: Text to Immersive Panorama 4D Scene Generation

Title: Adversarial Attacks on VQA-NLE: Exposing and Alleviating Inconsistencies in Visual Question Answering Explanations

Title: X-Ray-CoT: Interpretable Chest X-ray Diagnosis with Vision-Language Models via Chain-of-Thought Reasoning

Title: Standardization of Neuromuscular Reflex Analysis -- Role of Fine-Tuned Vision-Language Model Consortium and OpenAI gpt-oss Reasoning LLM Enabled Decision Support System

Title: Design and Validation of a Responsible Artificial Intelligence-based System for the Referral of Diabetic Retinopathy Patients

Title: LangVision-LoRA-NAS: Neural Architecture Search for Variable LoRA Rank in Vision Language Models

Title: An Initial Study of Bird's-Eye View Generation for Autonomous Vehicles using Cross-View Transformers

Title: Toward Architecture-Agnostic Local Control of Posterior Collapse in VAEs

Title: REVEAL -- Reasoning and Evaluation of Visual Evidence through Aligned Language

Title: Illuminating LLM Coding Agents: Visual Analytics for Deeper Understanding and Enhancement

Title: Physics-informed deep operator network for traffic state estimation

Title: A Hybrid Surrogate for Electric Vehicle Parameter Estimation and Power Consumption via Physics-Informed Neural Operators

Title: ViLaD: A Large Vision Language Diffusion Framework for End-to-End Autonomous Driving

Title: ViDA-UGC: Detailed Image Quality Analysis via Visual Distortion Assessment for UGC Images

Title: Creative4U: MLLMs-based Advertising Creative Image Selector with Comparative Reasoning

Title: FlowMol3: Flow Matching for 3D De Novo Small-Molecule Generation

Title: Score-informed Neural Operator for Enhancing Ordering-based Causal Discovery

Title: Stable Diffusion-Based Approach for Human De-Occlusion

Title: BUILDA: A Thermal Building Data Generation Framework for Transfer Learning

Title: Drifting Away from Truth: GenAI-Driven News Diversity Challenges LVLM-Based Misinformation Detection

Title: Single-Reference Text-to-Image Manipulation with Dual Contrastive Denoising Score

Title: A Multi-Resolution Benchmark Framework for Spatial Reasoning Assessment in Neural Networks

Title: D2-Mamba: Dual-Scale Fusion and Dual-Path Scanning with SSMs for Shadow Removal

Title: Next Visual Granularity Generation

Title: DEEP-SEA: Deep-Learning Enhancement for Environmental Perception in Submerged Aquatics

Title: S^2-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models

Title: CTFlow: Video-Inspired Latent Flow Matching for 3D CT Synthesis

Title: CMF-IoU: Multi-Stage Cross-Modal Fusion 3D Object Detection with IoU Joint Prediction

Title: 7Bench: a Comprehensive Benchmark for Layout-guided Text-to-image Models

Title: Lumen: Consistent Video Relighting and Harmonious Background Replacement with Video Generative Models

Title: Compact Attention: Exploiting Structured Spatio-Temporal Sparsity for Fast Video Generation

Title: Omni Survey for Multimodality Analysis in Visual Object Tracking

Title: Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model

Title: EgoTwin: Dreaming Body and View in First Person

Title: Eyes on the Image: Gaze Supervised Multimodal Learning for Chest X-ray Diagnosis and Report Generation

Title: ID-Card Synthetic Generation: Toward a Simulated Bona fide Dataset

Title: DMS: Diffusion-Based Multi-Baseline Stereo Generation for Improving Self-Supervised Depth Estimation

Title: Precise Action-to-Video Generation Through Visual Action Prompts

Title: MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models

Title: 4DNeX: Feed-Forward 4D Generative Modeling Made Easy