2025-08-07

Title: Text2VR: Automated Instruction Generation in Virtual Reality using Large Language Models for Assembly Task

Title: CX-Mind: A Pioneering Multimodal Large Language Model for Interleaved Reasoning in Chest X-ray via Curriculum-Guided Reinforcement Learning

Title: StorySync: Training-Free Subject Consistency in Text-to-Image Generation via Region Harmonization

Title: Refine-IQA: Multi-Stage Reinforcement Finetuning for Perceptual Image Quality Assessment

Title: LLM-Prior: A Framework for Knowledge-Driven Prior Elicitation and Aggregation

Title: Provably Near-Optimal Distributionally Robust Reinforcement Learning in Online Settings

Title: 4D-PreNet: A Unified Preprocessing Framework for 4D-STEM Data Analysis

Title: HPSv3: Towards Wide-Spectrum Human Preference Score

Title: Point-Based Shape Representation Generation with a Correspondence-Preserving Diffusion Model

Title: Next Generation Equation-Free Multiscale Modelling of Crowd Dynamics via Machine Learning

Title: RAVID: Retrieval-Augmented Visual Detection: A Knowledge-Driven Approach for AI-Generated Image Identification

Title: Dynamic User-controllable Privacy-preserving Few-shot Sensing Framework

Title: CAD-Judge: Toward Efficient Morphological Grading and Verification for Text-to-CAD Generation

Title: $\text{S}^2$Q-VDiT: Accurate Quantized Video Diffusion Transformer with Salient Data and Sparse Token Distillation

Title: SPJFNet: Self-Mining Prior-Guided Joint Frequency Enhancement for Ultra-Efficient Dark Image Restoration

Title: VisualTrans: A Benchmark for Real-World Visual Transformation Reasoning

Title: Motion is the Choreographer: Learning Latent Pose Dynamics for Seamless Sign Language Generation

Title: DOMR: Establishing Cross-View Segmentation via Dense Object Matching

Title: Uni-DocDiff: A Unified Document Restoration Model Based on Diffusion

Title: Bridging Diffusion Models and 3D Representations: A 3D Consistent Super-Resolution Framework

Title: Model Inversion Attacks on Vision-Language Models: Do They Leak What They Learn?

Title: Unlocking the Potential of MLLMs in Referring Expression Segmentation via a Light-weight Mask Decoder

Title: Conditional Latent Diffusion Models for Zero-Shot Instance Segmentation

Title: COPO: Consistency-Aware Policy Optimization

Title: IDCNet: Guided Video Diffusion for Metric-Consistent RGBD Scene Generation with Precise Camera Control

Title: ICM-Fusion: In-Context Meta-Optimized LoRA Fusion for Multi-Task Adaptation

Title: Audio-Assisted Face Video Restoration with Temporal and Identity Complementary Learning

Title: Semi-Supervised Deep Domain Adaptation for Predicting Solar Power Across Different Locations

Title: ToxicTAGS: Decoding Toxic Memes with Rich Tag Annotations

Title: One Small Step with Fingerprints, One Giant Leap for \emph{De Novo} Molecule Generation from Mass Spectra

Title: Deeper Inside Deep ViT

Title: RPCANet++: Deep Interpretable Robust PCA for Sparse Object Segmentation

Title: From Learning to Unlearning: Biomedical Security Protection in Multimodal Large Language Models

Title: LayerT2V: Interactive Multi-Object Trajectory Layering for Video Generation

Title: Intention Enhanced Diffusion Model for Multimodal Pedestrian Trajectory Prediction

Title: DocVCE: Diffusion-based Visual Counterfactual Explanations for Document Image Classification

Title: TempFlow-GRPO: When Timing Matters for GRPO in Flow Models

Title: TSPO: Temporal Sampling Policy Optimization for Long-form Video Language Understanding

Title: Cloud Model Characteristic Function Auto-Encoder: Integrating Cloud Model Theory with MMD Regularization for Enhanced Generative Modeling

Title: Automatic LLM Red Teaming

Title: Small transformer architectures for task switching

Title: CARD: Cache-Assisted Parallel Speculative Decoding for Efficient Large Language Model Inference

Title: 4DVD: Cascaded Dense-view Video Diffusion Model for High-quality 4D Content Generation

Title: Zero-Residual Concept Erasure via Progressive Alignment in Text-to-Image Model

Title: Emotion Detection Using Conditional Generative Adversarial Networks (cGAN): A Deep Learning Approach

Title: QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution

Title: RAIDX: A Retrieval-Augmented Generation and GRPO Reinforcement Learning Framework for Explainable Deepfake Detection

Title: MSC: A Marine Wildlife Video Dataset with Grounded Segmentation and Clip-Level Captioning

Title: Drone Detection with Event Cameras

Title: Analyzing and Mitigating Object Hallucination: A Training Bias Perspective

Title: DDTracking: A Deep Generative Framework for Diffusion MRI Tractography with Streamline Local-Global Spatiotemporal Modeling

Title: Improved Training Strategies for Physics-Informed Neural Networks using Real Experimental Data in Aluminum Spot Welding

Title: Multitask Learning with Stochastic Interpolants

Title: CaPulse: Detecting Anomalies by Tuning in to the Causal Rhythms of Time Series