2025-02-24

Title: KKA: Improving Vision Anomaly Detection through Anomaly-related Knowledge from Large Language Models

Title: A Comprehensive Survey on Concept Erasure in Text-to-Image Diffusion Models

Title: KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding

Title: LAVID: An Agentic LVLM Framework for Diffusion-Generated Video Detection

Title: Generative Modeling of Individual Behavior at Scale

Title: Synth It Like KITTI: Synthetic Data Generation for Object Detection in Driving Scenarios

Title: Hardware-Friendly Static Quantization Method for Video Diffusion Transformers

Title: M3-AGIQA: Multimodal, Multi-Round, Multi-Aspect AI-Generated Image Quality Assessment

Title: Methods and Trends in Detecting Generated Images: A Comprehensive Review

Title: FlipConcept: Tuning-Free Multi-Concept Personalization for Text-to-Image Generation

Title: Omnidirectional Image Quality Captioning: A Large-scale Database and A New Model

Title: Efficiently Solving Discounted MDPs with Predictions on Transition Matrices

Title: Weakly Supervised Video Scene Graph Generation via Natural Language Supervision

Title: LongCaptioning: Unlocking the Power of Long Caption Generation in Large Multimodal Models

Title: Evaluate with the Inverse: Efficient Approximation of Latent Explanation Quality Distribution

Title: MVIP -- A Dataset and Methods for Application Oriented Multi-View and Multi-Modal Industrial Part Recognition

Title: Mitigating Data Scarcity in Time Series Analysis: A Foundation Model with Series-Symbol Data Generation

Title: Decoding for Punctured Convolutional and Turbo Codes: A Deep Learning Solution for Protocols Compliance

Title: CondiQuant: Condition Number Based Low-Bit Quantization for Image Super-Resolution

Title: Network Resource Optimization for ML-Based UAV Condition Monitoring with Vibration Analysis

Title: Activation Steering in Neural Theorem Provers

Title: SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning

Title: Bridging vision language model (VLM) evaluation gaps with a framework for scalable and cost-effective benchmark generation

Title: Improving the Scaling Laws of Synthetic Data with Deliberate Practice

Title: WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents

Title: The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer

Title: VaViM and VaVAM: Autonomous Driving through Video Generative Modeling

Title: One-step Diffusion Models with $f$-Divergence Distribution Matching