2025-07-01

Title: Robust Perspective Correction for Real-World Crack Evolution Tracking in Image-Based Structural Health Monitoring

Title: Modulated Diffusion: Accelerating Generative Modeling with Modulated Quantization

Title: Visual-Semantic Knowledge Conflicts in Operating Rooms: Synthetic Data Curation for Surgical Risk Perception in Multimodal Large Language Models

Title: Weakly Supervised Object Segmentation by Background Conditional Divergence

Title: SABRE-FL: Selective and Accurate Backdoor Rejection for Federated Prompt Learning

Title: AgentStealth: Reinforcing Large Language Model for Anonymizing User-generated Text

Title: FreeDNA: Endowing Domain Adaptation of Diffusion-Based Dense Prediction with Training-Free Domain Noise Alignment

Title: Towards Text-free Graph Foundation Models: Rethinking Multi-Domain Graph Contrastive Learning

Title: Lightning the Night with Generative Artificial Intelligence

Title: In-context learning for the classification of manipulation techniques in phishing emails

Title: A Survey on Model Extraction Attacks and Defenses for Large Language Models

Title: Unifying Biomedical Vision-Language Expertise: Towards a Generalist Foundation Model via Multi-CLIP Knowledge Distillation

Title: CaO$_2$: Rectifying Inconsistencies in Diffusion-Based Dataset Distillation

Title: 3D Shape Generation: A Survey

Title: Mitigating Semantic Collapse in Generative Personalization with a Surprisingly Simple Test-Time Embedding Adjustment

Title: Degradation-Modeled Multipath Diffusion for Tunable Metalens Photography

Title: Multimodal Atmospheric Super-Resolution With Deep Generative Models

Title: PhonemeFake: Redefining Deepfake Realism with Language-Driven Segmental Manipulation and Adaptive Bilevel Detection

Title: RGE-GS: Reward-Guided Expansive Driving Scene Reconstruction via Diffusion Priors

Title: Riemannian-Geometric Fingerprints of Generative Models

Title: Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Gate

Title: Prompting without Panic: Attribute-aware, Zero-shot, Test-Time Calibration

Title: Listener-Rewarded Thinking in VLMs for Image Preferences

Title: SemFaceEdit: Semantic Face Editing on Generative Radiance Manifolds

Title: xLSTMAD: A Powerful xLSTM-based Method for Anomaly Detection

Title: STR-Match: Matching SpatioTemporal Relevance Score for Training-Free Video Editing

Title: Towards Time Series Generation Conditioned on Unstructured Natural Language

Title: Peccavi: Visual Paraphrase Attack Safe and Distortion Free Image Watermarking Technique for AI-Generated Images

Title: On the Generalizability of "Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals"

Title: Cybersecurity-Focused Anomaly Detection in Connected Autonomous Vehicles Using Machine Learning

Title: Kernel Outlier Detection

Title: Inpainting is All You Need: A Diffusion-based Augmentation Method for Semi-supervised Medical Image Segmentation

Title: Ovis-U1 Technical Report

Title: Double-Diffusion: Diffusion Conditioned Diffusion Probabilistic Model For Air Quality Prediction

Title: Dare to Plagiarize? Plagiarized Painting Recognition and Retrieval

Title: VisualPrompter: Prompt Optimization with Visual Feedback for Text-to-Image Synthesis

Title: Learning-to-Context Slope: Evaluating In-Context Learning Effectiveness Beyond Performance Illusions

Title: V-SYNTHESIS: Task-Agnostic Synthesis of Consistent and Diverse In-Context Demonstrations from Scratch via V-Entropy

Title: Self-Supervised Contrastive Learning for Multi-Label Images

Title: Data Can Speak for Itself: Quality-guided Utilization of Wireless Synthetic Data

Title: Attribution assignment for deep-generative sequence models enables interpretability analysis using positive-only data

Title: BridgeShape: Latent Diffusion Schrödinger Bridge for 3D Shape Completion

Title: PixelBoost: Leveraging Brownian Motion for Realistic-Image Super-Resolution

Title: Causal-Entity Reflected Egocentric Traffic Accident Video Synthesis

Title: Why Settle for One? Text-to-ImageSet Generation and Evaluation

Title: Autoregressive Denoising Score Matching is a Good Video Anomaly Detector

Title: MoMa: Modulating Mamba for Adapting Image Foundation Models to Video Recognition

Title: Hierarchical Quantized Diffusion Based Tree Generation Method for Hierarchical Representation and Lineage Analysis

Title: Objective-Free Local Learning and Emergent Language Structure in Thinking Machines

Title: DiffFit: Disentangled Garment Warping and Texture Refinement for Virtual Try-On

Title: Securing AI Systems: A Guide to Known Attacks and Impacts

Title: FastSeg: Efficient Training-Free Open-Vocabulary Segmentation via Hierarchical Attention Refinement Method

Title: IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering

Title: Federated Timeline Synthesis: Scalable and Private Methodology For Model Training and Deployment

Title: OmniVCus: Feedforward Subject-driven Video Customization with Multimodal Control Conditions

Title: When Additive Noise Meets Unobserved Mediators: Bivariate Denoising Diffusion for Causal Discovery

Title: Pipelined Decoder for Efficient Context-Aware Text Generation

Title: PathDiff: Histopathology Image Synthesis with Unpaired Text and Mask Conditions

Title: Enhancing Insider Threat Detection Using User-Based Sequencing and Transformer Encoders

Title: Contrastive Learning with Diffusion Features for Weakly Supervised Medical Image Segmentation

Title: Time-variant Image Inpainting via Interactive Distribution Transition Estimation

Title: Reconciling Attribute and Structural Anomalies for Improved Graph Anomaly Detection

Title: MTADiffusion: Mask Text Alignment Diffusion Model for Object Inpainting

Title: ViewPoint: Panoramic Video Generation with Pretrained Diffusion Models

Title: WAVE: Warp-Based View Guidance for Consistent Novel View Synthesis Using a Single Image

Title: When Test-Time Adaptation Meets Self-Supervised Models

Title: Uncertainty-aware Diffusion and Reinforcement Learning for Joint Plane Localization and Anomaly Diagnosis in 3D Ultrasound

Title: Pyramidal Patchification Flow for Visual Generation

Title: JAM-Flow: Joint Audio-Motion Synthesis with Flow Matching

Title: Metadata, Wavelet, and Time Aware Diffusion Models for Satellite Image Super Resolution

Title: StackCLIP: Clustering-Driven Stacked Prompt in Zero-Shot Industrial Anomaly Detection

Title: Transition Matching: Scalable and Flexible Generative Modeling

Title: CAI: Caption-Sensitive Attention Intervention for Mitigating Object Hallucination in Large Vision-Language Models

Title: When Will It Fail?: Anomaly to Prompt for Forecasting Future Anomalies in Time Series

Title: SG-LDM: Semantic-Guided LiDAR Generation via Latent-Aligned Diffusion

Title: PGOV3D: Open-Vocabulary 3D Semantic Segmentation with Partial-to-Global Curriculum

Title: TurboVSR: Fantastic Video Upscalers and Where to Find Them

Title: Blending Concepts with Text-to-Image Diffusion Models

Title: Unified Multimodal Understanding via Byte-Pair Visual Encoding

Title: VAP-Diffusion: Enriching Descriptions with MLLMs for Enhanced Medical Image Generation

Title: MReg: A Novel Regression Model with MoE-based Video Feature Mining for Mitral Regurgitation Diagnosis

Title: On the Domain Robustness of Contrastive Vision-Language Models

Title: A Unified Framework for Stealthy Adversarial Generation via Latent Optimization and Transferability Enhancement

Title: SynMotion: Semantic-Visual Adaptation for Motion Customized Video Generation

Title: Subjective Camera: Bridging Human Cognition and Visual Reconstruction through Sequence-Aware Sketch-Guided Diffusion

Title: System-Embedded Diffusion Bridge Models

Title: Proteus-ID: ID-Consistent and Motion-Coherent Video Customization

Title: Radioactive Watermarks in Diffusion and Autoregressive Image Generative Models

Title: Can We Challenge Open-Vocabulary Object Detectors with Generated Content in Street Scenes?

Title: Controllable Reference-Based Real-World Remote Sensing Image Super-Resolution with Generative Diffusion Priors

Title: Adaptive Out-of-Control Point Pattern Detection in Sequential Random Finite Set Observations

Title: MadCLIP: Few-shot Medical Anomaly Detection with CLIP

Title: Refine Any Object in Any Scene

Title: VMoBA: Mixture-of-Block Attention for Video Diffusion Models

Title: Bridging the Gap with Retrieval-Augmented Generation: Making Prosthetic Device User Manuals Available in Marginalised Languages

Title: Visual and Memory Dual Adapter for Multi-Modal Object Tracking

Title: Ella: Embodied Social Agents with Lifelong Memory

Title: Foundation Models for Zero-Shot Segmentation of Scientific Images without AI-Ready Data

Title: Faster Diffusion Models via Higher-Order Approximation

Title: Continual Adaptation: Environment-Conditional Parameter Generation for Object Detection in Dynamic Scenarios

Title: Imagine for Me: Creative Conceptual Blending of Real Images and Text via Blended Attention

Title: MotionGPT3: Human Motion as a Second Modality

Title: Epona: Autoregressive Diffusion World Model for Autonomous Driving

Title: TextMesh4D: High-Quality Text-to-4D Mesh Generation

Title: Calligrapher: Freestyle Text Image Customization