2025-04-15

Title: Reward Generation via Large Vision-Language Model in Offline Reinforcement Learning

Title: Embedding Hidden Adversarial Capabilities in Pre-Trained Diffusion Models

Title: PriM: Principle-Inspired Material Discovery through Multi-Agent Collaboration

Title: Analogical Learning for Cross-Scenario Generalization: Framework and Application to Intelligent Localization

Title: Datum-wise Transformer for Synthetic Tabular Data Detection in the Wild

Title: ML For Hardware Design Interpretability: Challenges and Opportunities

Title: Knowledge Graph-extended Retrieval Augmented Generation for Question Answering

Title: Position: Beyond Euclidean -- Foundation Models Should Embrace Non-Euclidean Geometries

Title: LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping

Title: An Adaptive Vector Index Partitioning Scheme for Low-Latency RAG Pipeline

Title: MotionDreamer: One-to-Many Motion Synthesis with Localized Generative Masked Transformer

Title: AGENT: An Aerial Vehicle Generation and Design Tool Using Large Language Models

Title: Sculpting Memory: Multi-Concept Forgetting in Diffusion Models via Dynamic Mask and Concept-Aware Optimization

Title: UniFlowRestore: A General Video Restoration Framework via Flow Matching and Prompt Guidance

Title: Synthetic Aircraft Trajectory Generation Using Time-Based VQ-VAE

Title: Multi-modal and Multi-view Fundus Image Fusion for Retinopathy Diagnosis via Multi-scale Cross-attention and Shifted Window Self-attention

Title: MASH: Masked Anchored SpHerical Distances for 3D Shape Representation and Generation

Title: MatWheel: Addressing Data Scarcity in Materials Science Through Synthetic Data

Title: Type-Constrained Code Generation with Language Models

Title: FVQ: A Large-Scale Dataset and A LMM-based Method for Face Video Quality Assessment

Title: Head-Aware KV Cache Compression for Efficient Visual Autoregressive Modeling

Title: Towards Explainable Partial-AIGC Image Quality Assessment

Title: MedIL: Implicit Latent Spaces for Generating Heterogeneous Medical Images at Arbitrary Resolutions

Title: Text To 3D Object Generation For Scalable Room Assembly

Title: REMEMBER: Retrieval-based Explainable Multimodal Evidence-guided Modeling for Brain Evaluation and Reasoning in Zero- and Few-shot Neurodegenerative Diagnosis

Title: Beyond Degradation Conditions: All-in-One Image Restoration via HOG Transformers

Title: Structure-Accurate Medical Image Translation based on Dynamic Frequency Balance and Knowledge Guidance

Title: FractalForensics: Proactive Deepfake Detection and Localization via Fractal Watermarks

Title: D$^2$iT: Dynamic Diffusion Transformer for Accurate Image Generation

Title: Comorbidity-Informed Transfer Learning for Neuro-developmental Disorder Diagnosis

Title: CamMimic: Zero-Shot Image To Camera Motion Personalized Video Generation Using Diffusion Models

Title: GenEDA: Unleashing Generative Reasoning on Netlist via Multimodal Encoder-Decoder Aligned Foundation Model

Title: PCM-SAR: Physics-Driven Contrastive Mutual Learning for SAR Classification

Title: DiffuMural: Restoring Dunhuang Murals with Multi-scale Diffusion

Title: 3D CoCa: Contrastive Learners are 3D Captioners

Title: AeroLite: Tag-Guided Lightweight Generation of Aerial Image Captions

Title: Trajectory-guided Motion Perception for Facial Expression Quality Assessment in Neurological Disorders

Title: FastRSR: Efficient and Accurate Road Surface Reconstruction from Bird's Eye View

Title: SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification

Title: Mitigating Long-tail Distribution in Oracle Bone Inscriptions: Dataset, Model, and Benchmark

Title: DualPrompt-MedCap: A Dual-Prompt Enhanced Approach for Medical Image Captioning

Title: Early-Bird Diffusion: Investigating and Leveraging Timestep-Aware Early-Bird Tickets in Diffusion Models for Efficient Training

Title: KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation

Title: Computer-Aided Layout Generation for Building Design: A Review

Title: Transformer-Based Representation Learning for Robust Gene Expression Modeling and Cancer Prognosis

Title: Dynamical symmetries in the fluctuation-driven regime: an application of Noether's theorem to noisy dynamical systems

Title: EquiVDM: Equivariant Video Diffusion Models with Temporally Consistent Noise

Title: ST-Booster: An Iterative SpatioTemporal Perception Booster for Vision-and-Language Navigation in Continuous Environments

Title: Focus on Local: Finding Reliable Discriminative Regions for Visual Place Recognition

Title: Enhanced Semantic Extraction and Guidance for UGC Image Super Resolution

Title: KeepKV: Eliminating Output Perturbation in KV Cache Compression for Efficient LLMs Inference

Title: Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes

Title: Beyond Degradation Redundancy: Contrastive Prompt Learning for All-in-One Image Restoration

Title: Metric-Guided Synthesis of Class Activation Mapping

Title: GaussVideoDreamer: 3D Scene Generation with Video Diffusion and Inconsistency-Aware Gaussian Splatting

Title: Masked Autoencoder Self Pre-Training for Defect Detection in Microelectronics

Title: Aligning Anime Video Generation with Human Feedback

Title: Global and Local Mamba Network for Multi-Modality Medical Image Super-Resolution

Title: SoccerNet-v3D: Leveraging Sports Broadcast Replays for 3D Scene Understanding

Title: The Impact of Model Zoo Size and Composition on Weight Space Learning

Title: GeoUni: A Unified Model for Generating Geometry Diagrams, Problems and Problem Solutions

Title: Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers

Title: Efficient Generative Model Training via Embedded Representation Warmup

Title: VibrantLeaves: A principled parametric image generator for training deep restoration models

Title: ROSFD: Robust Online Streaming Fraud Detection with Resilience to Concept Drift in Data Streams

Title: A Model Zoo of Vision Transformers

Title: XY-Cut++: Advanced Layout Ordering via Hierarchical Mask Mechanism on a Novel Benchmark

Title: $α$-Flow: A Unified Framework for Continuous-State Discrete Flow Matching Models

Title: ESCT3D: Efficient and Selectively Controllable Text-Driven 3D Content Generation with Gaussian Splatting

Title: SlowFastVAD: Video Anomaly Detection via Integrating Simple Detector and RAG-Enhanced Vision-Language Model

Title: InstructEngine: Instruction-driven Text-to-Image Alignment

Title: FingER: Content Aware Fine-grained Evaluation with Reasoning for AI-Generated Videos

Title: PG-DPIR: An efficient plug-and-play method for high-count Poisson-Gaussian inverse problems

Title: HUMOTO: A 4D Dataset of Mocap Human Object Interactions

Title: MonoDiff9D: Monocular Category-Level 9D Object Pose Estimation via Diffusion Model

Title: Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing

Title: M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Title: Art3D: Training-Free 3D Generation from Flat-Colored Illustration

Title: Weight Ensembling Improves Reasoning in Language Models

Title: InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Title: REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers

Title: Decoupled Diffusion Sparks Adaptive Scene Generation