2025-07-01

Title: Counting with Confidence: Accurate Pest Monitoring in Water Traps

Title: Hierarchical Adversarially-Resilient Multi-Agent Reinforcement Learning for Cyber-Physical Systems Security

Title: Modulated Diffusion: Accelerating Generative Modeling with Modulated Quantization

Title: Visual-Semantic Knowledge Conflicts in Operating Rooms: Synthetic Data Curation for Surgical Risk Perception in Multimodal Large Language Models

Title: Weakly Supervised Object Segmentation by Background Conditional Divergence

Title: Lightning the Night with Generative Artificial Intelligence

Title: Preserve Anything: Controllable Image Synthesis with Object Preservation

Title: ReCo: Reminder Composition Mitigates Hallucinations in Vision-Language Models

Title: 3D Shape Generation: A Survey

Title: Mitigating Semantic Collapse in Generative Personalization with a Surprisingly Simple Test-Time Embedding Adjustment

Title: LightBSR: Towards Lightweight Blind Super-Resolution via Discriminative Implicit Degradation Representation Learning

Title: UniFuse: A Unified All-in-One Framework for Multi-Modal Medical Image Fusion Under Diverse Degradations and Misalignments

Title: Degradation-Modeled Multipath Diffusion for Tunable Metalens Photography

Title: VSRM: A Robust Mamba-Based Framework for Video Super-Resolution

Title: Multimodal Atmospheric Super-Resolution With Deep Generative Models

Title: PhonemeFake: Redefining Deepfake Realism with Language-Driven Segmental Manipulation and Adaptive Bilevel Detection

Title: RGE-GS: Reward-Guided Expansive Driving Scene Reconstruction via Diffusion Priors

Title: Riemannian-Geometric Fingerprints of Generative Models

Title: Listener-Rewarded Thinking in VLMs for Image Preferences

Title: SemFaceEdit: Semantic Face Editing on Generative Radiance Manifolds

Title: Neural Cellular Automata: From Cells to Pixels

Title: MOTOR: Multimodal Optimal Transport via Grounded Retrieval in Medical Visual Question Answering

Title: Point Cloud Compression and Objective Quality Assessment: A Survey

Title: Towards Time Series Generation Conditioned on Unstructured Natural Language

Title: Infinite Sampling: Efficient and Stable Grouped RL Training for Large Language Models

Title: Peccavi: Visual Paraphrase Attack Safe and Distortion Free Image Watermarking Technique for AI-Generated Images

Title: A Reinforcement Learning Approach for Optimal Control in Microgrids

Title: Inpainting is All You Need: A Diffusion-based Augmentation Method for Semi-supervised Medical Image Segmentation

Title: Ovis-U1 Technical Report

Title: Double-Diffusion: Diffusion Conditioned Diffusion Probabilistic Model For Air Quality Prediction

Title: Learning Counterfactually Decoupled Attention for Open-World Model Attribution

Title: Dare to Plagiarize? Plagiarized Painting Recognition and Retrieval

Title: RoboScape: Physics-informed Embodied World Model

Title: VisualPrompter: Prompt Optimization with Visual Feedback for Text-to-Image Synthesis

Title: AlignCVC: Aligning Cross-View Consistency for Single-Image-to-3D Generation

Title: Data Can Speak for Itself: Quality-guided Utilization of Wireless Synthetic Data

Title: Attribution assignment for deep-generative sequence models enables interpretability analysis using positive-only data

Title: BridgeShape: Latent Diffusion Schrödinger Bridge for 3D Shape Completion

Title: Single Image Inpainting and Super-Resolution with Simultaneous Uncertainty Guarantees by Universal Reproducing Kernels

Title: High-quality Pseudo-labeling for Point Cloud Segmentation with Scene-level Annotation

Title: PixelBoost: Leveraging Brownian Motion for Realistic-Image Super-Resolution

Title: Causal-Entity Reflected Egocentric Traffic Accident Video Synthesis

Title: Why Settle for One? Text-to-ImageSet Generation and Evaluation

Title: Autoregressive Denoising Score Matching is a Good Video Anomaly Detector

Title: Hierarchical Quantized Diffusion Based Tree Generation Method for Hierarchical Representation and Lineage Analysis

Title: DDL: A Dataset for Interpretable Deepfake Detection and Localization in Real-World Scenarios

Title: DiffFit: Disentangled Garment Warping and Texture Refinement for Virtual Try-On

Title: Endo-4DGX: Robust Endoscopic Scene Reconstruction and Illumination Correction with Gaussian Splatting

Title: IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering

Title: VALID-Mol: a Systematic Framework for Validated LLM-Assisted Molecular Design

Title: CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation

Title: Federated Timeline Synthesis: Scalable and Private Methodology For Model Training and Deployment

Title: OmniVCus: Feedforward Subject-driven Video Customization with Multimodal Control Conditions

Title: Why Settle for Mid: A Probabilistic Viewpoint to Spatial Relationship Alignment in Text-to-image Models

Title: PathDiff: Histopathology Image Synthesis with Unpaired Text and Mask Conditions

Title: Contrastive Learning with Diffusion Features for Weakly Supervised Medical Image Segmentation

Title: Time-variant Image Inpainting via Interactive Distribution Transition Estimation

Title: MTADiffusion: Mask Text Alignment Diffusion Model for Object Inpainting

Title: ViewPoint: Panoramic Video Generation with Pretrained Diffusion Models

Title: WAVE: Warp-Based View Guidance for Consistent Novel View Synthesis Using a Single Image

Title: Pyramidal Patchification Flow for Visual Generation

Title: JAM-Flow: Joint Audio-Motion Synthesis with Flow Matching

Title: Metadata, Wavelet, and Time Aware Diffusion Models for Satellite Image Super Resolution

Title: Transition Matching: Scalable and Flexible Generative Modeling

Title: CAI: Caption-Sensitive Attention Intervention for Mitigating Object Hallucination in Large Vision-Language Models

Title: AI-Generated Lecture Slides for Improving Slide Element Detection and Retrieval

Title: SG-LDM: Semantic-Guided LiDAR Generation via Latent-Aligned Diffusion

Title: TurboVSR: Fantastic Video Upscalers and Where to Find Them

Title: Revisiting Audio-Visual Segmentation with Vision-Centric Transformer

Title: Blending Concepts with Text-to-Image Diffusion Models

Title: VAP-Diffusion: Enriching Descriptions with MLLMs for Enhanced Medical Image Generation

Title: A Unified Framework for Stealthy Adversarial Generation via Latent Optimization and Transferability Enhancement

Title: SynMotion: Semantic-Visual Adaptation for Motion Customized Video Generation

Title: Subjective Camera: Bridging Human Cognition and Visual Reconstruction through Sequence-Aware Sketch-Guided Diffusion

Title: System-Embedded Diffusion Bridge Models

Title: Radioactive Watermarks in Diffusion and Autoregressive Image Generative Models

Title: Controllable Reference-Based Real-World Remote Sensing Image Super-Resolution with Generative Diffusion Priors

Title: Refine Any Object in Any Scene

Title: RGC-VQA: An Exploration Database for Robotic-Generated Video Quality Assessment

Title: VMoBA: Mixture-of-Block Attention for Video Diffusion Models

Title: Bridging the Gap with Retrieval-Augmented Generation: Making Prosthetic Device User Manuals Available in Marginalised Languages

Title: Continual Adaptation: Environment-Conditional Parameter Generation for Object Detection in Dynamic Scenarios

Title: Imagine for Me: Creative Conceptual Blending of Real Images and Text via Blended Attention

Title: MotionGPT3: Human Motion as a Second Modality

Title: WaRA: Wavelet Low Rank Adaptation

Title: DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World

Title: Epona: Autoregressive Diffusion World Model for Autonomous Driving

Title: TextMesh4D: High-Quality Text-to-4D Mesh Generation

Title: Calligrapher: Freestyle Text Image Customization

Title: FADRM: Fast and Accurate Data Residual Matching for Dataset Distillation