2025-08-08

Title: LumiGen: An LVLM-Enhanced Iterative Framework for Fine-Grained Text-to-Image Generation

Title: Edge-Assisted Collaborative Fine-Tuning for Multi-User Personalized Artificial Intelligence Generated Content (AIGC)

Title: Uncertainty-aware Predict-Then-Optimize Framework for Equitable Post-Disaster Power Restoration

Title: RetinexDual: Retinex-based Dual Nature Approach for Generalized Ultra-High-Definition Image Restoration

Title: Single-Step Reconstruction-Free Anomaly Detection and Segmentation via Diffusion Models

Title: Unified Flow Matching for Long Horizon Event Forecasting

Title: Accelerating Conditional Prompt Learning via Masked Image Modeling for Vision-Language Models

Title: TRKT: Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced Relation-aware Knowledge Transferring

Title: Steering One-Step Diffusion Model with Fidelity-Rich Decoder for Fast Image Compression

Title: AU-IQA: A Benchmark Dataset for Perceptual Quality Assessment of AI-Enhanced User-Generated Content

Title: A Novel Image Similarity Metric for Scene Composition Structure

Title: Automatic Image Colorization with Convolutional Neural Networks and Generative Adversarial Networks

Title: FLUX-Makeup: High-Fidelity, Identity-Consistent, and Robust Makeup Transfer via Diffusion Transformer

Title: Integrated Influence: Data Attribution with Baseline

Title: PoseGen: In-Context LoRA Finetuning for Pose-Controllable Long Human Video Generation

Title: Exploring Superior Function Calls via Reinforcement Learning

Title: Latent Expression Generation for Referring Image Segmentation and Grounding

Title: Rotation Equivariant Arbitrary-scale Image Super-Resolution

Title: X-MoGen: Unified Motion Generation across Humans and Animals

Title: Multi-tracklet Tracking for Generic Targets with Adaptive Detection Clustering

Title: FAITH: A Framework for Assessing Intrinsic Tabular Hallucinations in finance

Title: ReasoningTrack: Chain-of-Thought Reasoning for Long-term Vision-Language Tracking

Title: ArbiViewGen: Controllable Arbitrary Viewpoint Camera Data Generation for Autonomous Driving via Stable Diffusion Models

Title: SGDFuse: SAM-Guided Diffusion for High-Fidelity Infrared and Visible Image Fusion

Title: B4DL: A Benchmark for 4D LiDAR LLM in Spatio-Temporal Understanding

Title: mKG-RAG: Multimodal Knowledge Graph-Enhanced RAG for Visual Question Answering

Title: PriorRG: Prior-Guided Contrastive Pre-training and Coarse-to-Fine Decoding for Chest X-ray Report Generation

Title: CT-GRAPH: Hierarchical Graph Attention Network for Anatomy-Guided CT Report Generation

Title: Echo: Decoupling Inference and Training for Large-Scale RL Alignment on Heterogeneous Swarms

Title: UNCAGE: Contrastive Attention Guidance for Masked Generative Transformers in Text-to-Image Generation

Title: MolSnap: Snap-Fast Molecular Generation with Latent Variational Mean Flow

Title: Discovering Interpretable Programmatic Policies via Multimodal LLM-assisted Evolutionary Search

Title: EnergyPatchTST: Multi-scale Time Series Transformers with Uncertainty Estimation for Energy Forecasting

Title: FS-IQA: Certified Feature Smoothing for Robust Image Quality Assessment

Title: When Deepfake Detection Meets Graph Neural Network: A Unified and Lightweight Learning Framework

Title: Tractable Sharpness-Aware Learning of Probabilistic Circuits

Title: Follow-Your-Instruction: A Comprehensive MLLM Agent for World Data Synthesis

Title: WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction

Title: LLaVA-RE: Binary Image-Text Relevancy Evaluation with Multimodal Large Language Model

Title: Uni-cot: Towards Unified Chain-of-Thought Reasoning Across Text and Vision

Title: Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity

Title: TrajEvo: Trajectory Prediction Heuristics Design via LLM-driven Evolution

Title: GAP: Gaussianize Any Point Clouds with Text Guidance

Title: FaceAnonyMixer: Cancelable Faces via Identity Consistent Latent Space Mixing