2025-04-16

Title: Beyond the Generative Learning Trilemma: Generative Model Assessment in Data Scarcity Domains

Title: VAE-based Feature Disentanglement for Data Augmentation and Compression in Generalized GNSS Interference Classification

Title: H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models

Title: Demo: ViolentUTF as An Accessible Platform for Generative AI Red Teaming

Title: Energy Matching: Unifying Flow Matching and Energy-Based Models for Generative Modeling

Title: Improving In-Context Learning with Reasoning Distillation

Title: H-MoRe: Learning Human-centric Motion Representation for Action Analysis

Title: Achieving Optimal Tissue Repair Through MARL with Reward Shaping and Curriculum Learning

Title: SpinMeRound: Consistent Multi-View Identity Generation Using Diffusion Models

Title: Foundation Models for Remote Sensing: An Analysis of MLLMs for Object Localization

Title: Power-scaled Bayesian Inference with Score-based Generative Models

Title: Tabular foundation model to detect empathy from visual cues

Title: GaSLight: Gaussian Splats for Spatially-Varying Lighting in HDR

Title: IlluSign: Illustrating Sign Language Videos by Leveraging the Attention Mechanism

Title: OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding

Title: LayoutCoT: Unleashing the Deep Reasoning Potential of Large Language Models for Layout Generation

Title: Moving Beyond Next-Token Prediction: Transformers are Context-Sensitive Language Generators

Title: How to Enhance Downstream Adversarial Robustness (almost) without Touching the Pre-Trained Foundation Model?

Title: Enhancing Features in Long-tailed Data Using Large Vision Model

Title: PT-Mark: Invisible Watermarking for Text-to-image Diffusion Models via Semantic-aware Pivotal Tuning

Title: LVLM_CSP: Accelerating Large Vision Language Models via Clustering, Scattering, and Pruning for Reasoning Segmentation

Title: Bringing together invertible UNets with invertible attention modules for memory-efficient diffusion models

Title: Bridging Distribution Gaps in Time Series Foundation Model Pretraining with Prototype-Guided Normalization

Title: InterAnimate: Taming Region-aware Diffusion Model for Realistic Human Interaction Animation

Title: Towards A Universal Graph Structural Encoder

Title: Transfer Learning for Temporal Link Prediction

Title: AFiRe: Anatomy-Driven Self-Supervised Learning for Fine-Grained Representation in Radiographic Images

Title: Self-Supervised Enhancement of Forward-Looking Sonar Images: Bridging Cross-Modal Degradation Gaps through Feature Space Transformation and Multi-Frame Fusion

Title: ProtFlow: Fast Protein Sequence Design via Flow Matching on Compressed Protein Language Model Embeddings

Title: TMCIR: Token Merge Benefits Composed Image Retrieval

Title: AnimeDL-2M: Million-Scale AI-Generated Anime Image Detection and Localization in Diffusion Era

Title: Defending Against Frequency-Based Attacks with Diffusion Models

Title: Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models

Title: Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections

Title: UKDM: Underwater keypoint detection and matching using underwater image enhancement techniques

Title: Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting

Title: Using LLMs as prompt modifier to avoid biases in AI image generators

Title: Token-Level Constraint Boundary Search for Jailbreaking Text-to-Image Models

Title: Taming Consistency Distillation for Accelerated Human Image Animation

Title: SAR-to-RGB Translation with Latent Diffusion for Earth Observation

Title: TerraMind: Large-Scale Generative Multimodality for Earth Observation

Title: TerraMesh: A Planetary Mosaic of Multimodal Earth Observation Data

Title: R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning

Title: Single-Input Multi-Output Model Merging: Leveraging Foundation Models for Dense Multi-Task Learning

Title: UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer

Title: Autoregressive Distillation of Diffusion Transformers

Title: Seedream 3.0 Technical Report

Title: DeepWheel: Generating a 3D Synthetic Wheel Dataset for Design and Performance Evaluation

Title: OpenTuringBench: An Open-Model-based Benchmark and Framework for Machine-Generated Text Detection and Attribution

Title: ADT: Tuning Diffusion Models with Adversarial Supervision

Title: NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors

Title: Diffusion Distillation With Direct Preference Optimization For Efficient 3D LiDAR Scene Completion

Title: Elucidating the Design Space of Multimodal Protein Language Models

Title: Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception