2024-12-10

Title: FodFoM: Fake Outlier Data by Foundation Models Creates Stronger Visual Out-of-Distribution Detector

Title: Self-Supervised Learning for Graph-Structured Data in Healthcare Applications: A Comprehensive Review

Title: Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models

Title: Generative Model-Based Fusion for Improved Few-Shot Semantic Segmentation of Infrared Images

Title: MotionShop: Zero-Shot Motion Transfer in Video Diffusion Models with Mixture of Score Guidance

Title: Tabular data generation with tensor contraction layers and transformers

Title: DART-Eval: A Comprehensive DNA Language Model Evaluation Benchmark on Regulatory DNA

Title: COOOL: Challenge Of Out-Of-Label A Novel Benchmark for Autonomous Driving

Title: Multi-Armed Bandit Approach for Optimizing Training on Synthetic Data

Title: Enhancing Sample Generation of Diffusion Models using Noise Level Correction

Title: A New Perspective on Time Series Anomaly Detection: Faster Patch-based Broad Learning System

Title: Street Gaussians without 3D Object Tracker

Title: Text-to-3D Gaussian Splatting with Physics-Grounded Motion Generation

Title: Dif4FF: Leveraging Multimodal Diffusion Models and Graph Neural Networks for Accurate New Fashion Product Performance Forecasting

Title: Do We Need to Design Specific Diffusion Models for Different Tasks? Try ONE-PIC

Title: Remix-DiT: Mixing Diffusion Transformers for Multi-Expert Denoising

Title: Biological Brain Age Estimation using Sex-Aware Adversarial Variational Autoencoder with Multimodal Neuroimages

Title: Efficient Continuous Video Flow Model for Video Prediction

Title: Hyperedge Anomaly Detection with Hypergraph Neural Network

Title: WATER-GS: Toward Copyright Protection for 3D Gaussian Splatting via Universal Watermarking

Title: Segment-Level Road Obstacle Detection Using Visual Foundation Model Priors and Likelihood Ratios

Title: On the effective transfer of knowledge from English to Hindi Wikipedia

Title: PromptRefine: Enhancing Few-Shot Performance on Low-Resource Indic Languages with Example Selection from Related Example Banks

Title: Evaluating Hallucination in Text-to-Image Diffusion Models with Scene-Graph based Question-Answering Agent

Title: A Tiered GAN Approach for Monet-Style Image Generation

Title: Black Swan: Abductive and Defeasible Video Reasoning in Unpredictable Events

Title: BudgetFusion: Perceptually-Guided Adaptive Diffusion Models

Title: Open-Source Acceleration of Stable-Diffusion.cpp

Title: Language-Guided Image Tokenization for Generation

Title: Self-Supervised Learning with Probabilistic Density Labeling for Rainfall Probability Estimation

Title: Self-Guidance: Boosting Flow and Diffusion Generation on Their Own

Title: CSG: A Context-Semantic Guided Diffusion Approach in De Novo Musculoskeletal Ultrasound Image Generation

Title: MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation

Title: 3D-Consistent Image Inpainting with Diffusion Models

Title: MCP-MedSAM: A Powerful Lightweight Medical Segment Anything Model Trained with a Single GPU in Just One Day

Title: XKV: Personalized KV Cache Memory Reduction for Long-Context LLM Inference

Title: Accelerating Video Diffusion Models via Distribution Matching

Title: GBR: Generative Bundle Refinement for High-fidelity Gaussian Splatting and Meshing

Title: BiDM: Pushing the Limit of Quantization for Diffusion Models

Title: Enhanced 3D Generation by 2D Editing

Title: Anti-Reference: Universal and Immediate Defense Against Reference-Based Generation

Title: Nested Diffusion Models Using Hierarchical Latent Priors

Title: Enhancing Content Representation for AR Image Quality Assessment Using Knowledge Distillation

Title: Post-hoc Probabilistic Vision-Language Models

Title: siForest: Detecting Network Anomalies with Set-Structured Isolation Forest

Title: Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation

Title: FlexDiT: Dynamic Token Density Control for Diffusion Transformer

Title: Latent-Reframe: Enabling Camera Control for Video Diffusion Model without Training

Title: Perceptual Hash Inversion Attacks on Image-Based Sexual Abuse Removal Tools

Title: Are foundation models for computer vision good conformal predictors?

Title: GraPE: A Generate-Plan-Edit Framework for Compositional T2I Synthesis

Title: SGIA: Enhancing Fine-Grained Visual Classification with Sequence Generative Image Augmentation

Title: Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters

Title: ASGDiffusion: Parallel High-Resolution Generation with Asynchronous Structure Guidance

Title: Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity

Title: Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction

Title: VariFace: Fair and Diverse Synthetic Dataset Generation for Face Recognition

Title: U-Know-DiffPAN: An Uncertainty-aware Knowledge Distillation Diffusion Framework with Details Enhancement for PAN-Sharpening

Title: A Comparative Study of Learning Paradigms in Large Language Models via Intrinsic Dimension

Title: Rendering-Refined Stable Diffusion for Privacy Compliant Synthetic Data

Title: Flow Matching Guide and Code

Title: Omni-Scene: Omni-Gaussian Representation for Ego-Centric Sparse-View Scene Reconstruction

Title: No Annotations for Object Detection in Art through Stable Diffusion

Title: See Further When Clear: Curriculum Consistency Model

Title: HAIFAI: Human-AI Collaboration for Mental Face Reconstruction

Title: Normalizing Flows are Capable Generative Models

Title: TriDi: Trilateral Diffusion of 3D Humans, Objects, and Interactions

Title: UniPaint: Unified Space-time Video Inpainting via Mixture-of-Experts

Title: Is Self-Supervision Enough? Benchmarking Foundation Models Against End-to-End Training for Mitotic Figure Classification

Title: Measuring Pre-training Data Quality without Labels for Time Series Foundation Models

Title: Exploring Memorization and Copyright Violation in Frontier LLMs: A Study of the New York Times v. OpenAI 2023 Lawsuit

Title: Exploring the Impact of Synthetic Data on Human Gesture Recognition Tasks Using GANs

Title: Generative Lines Matching Models

Title: Can foundation models actively gather information in interactive environments to test hypotheses?

Title: Pruning All-Rounder: Rethinking and Improving Inference Efficiency for Large Vision Language Models

Title: Gated Delta Networks: Improving Mamba2 with Delta Rule

Title: Small Languages, Big Models: A Study of Continual Training on Languages of Norway

Title: AnomalyControl: Learning Cross-modal Semantic Features for Controllable Anomaly Synthesis

Title: MoViE: Mobile Diffusion for Video Editing

Title: Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey

Title: MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences

Title: MAVias: Mitigate any Visual Bias

Title: Beyond Scalars: Concept-Based Alignment Analysis in Vision Transformers

Title: Detecting Facial Image Manipulations with Multi-Layer CNN Models

Title: Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion

Title: Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone

Title: Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy

Title: You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale

Title: Facade: High-Precision Insider Threat Detection Using Deep Contextual Anomaly Detection

Title: Parkinson's Disease Diagnosis Through Deep Learning: A Novel LSTM-Based Approach for Freezing of Gait Detection

Title: How to Merge Your Multimodal Models Over Time?

Title: Take Fake as Real: Realistic-like Robust Black-box Adversarial Attack to Evade AIGC Detection

Title: ContRail: A Framework for Realistic Railway Image Synthesis using ControlNet

Title: ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities

Title: InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention

Title: Visual Lexicon: Rich Image Features in Language Space

Title: Diverse Score Distillation

Title: Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation

Title: Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation

Title: Retrieving Semantics from the Deep: an RAG Solution for Gesture Synthesis

Title: [MASK] is All You Need