2025-09-30

Title: Pathological Truth Bias in Vision-Language Models

Title: GZSL-MoE: Apprentissage Généralisé Zéro-Shot basé sur le Mélange d'Experts pour la Segmentation Sémantique de Nuages de Points 3D Appliqué à un Jeu de Données d'Environnement de Collaboration Humain-Robot

Title: LayoutAgent: A Vision-Language Agent Guided Compositional Diffusion for Spatial Layout Planning

Title: Responsible Diffusion: A Comprehensive Survey on Safety, Ethics, and Trust in Diffusion Models

Title: Enabling Approximate Joint Sampling in Diffusion LMs

Title: Painless Activation Steering: An Automated, Lightweight Approach for Post-Training Large Language Models

Title: Red Teaming Quantum-Resistant Cryptographic Standards: A Penetration Testing Framework Integrating AI and Quantum Security

Title: In-Context Learning can Perform Continual Learning Like Humans

Title: DEFT: Decompositional Efficient Fine-Tuning for Text-to-Image Models

Title: VideoScore2: Think before You Score in Generative Video Evaluation

Title: ArFake: A Multi-Dialect Benchmark and Baselines for Arabic Spoof-Speech Detection

Title: Seeing Isn't Believing: Context-Aware Adversarial Patch Synthesis via Conditional GAN

Title: Adaptive Margin RLHF via Preference over Preferences

Title: Towards Generalizable Implicit In-Context Learning with Attention Routing

Title: ControlEvents: Controllable Synthesis of Event Camera Data with Foundational Prior from Image Diffusion Models

Title: From Noise to Knowledge: A Comparative Study of Acoustic Anomaly Detection Models in Pumped-storage Hydropower Plants

Title: Convolutional Set Transformer

Title: Soft-Di[M]O: Improving One-Step Discrete Image Generation with Soft Embeddings

Title: FishAI 2.0: Marine Fish Image Classification with Multi-modal Few-shot Learning

Title: LLMs Behind the Scenes: Enabling Narrative Scene Illustration

Title: What Matters More For In-Context Learning under Matched Compute Budgets: Pretraining on Natural Text or Incorporating Targeted Synthetic Examples?

Title: GDR-learners: Orthogonal Learning of Generative Models for Potential Outcomes

Title: Doubly-Robust LLM-as-a-Judge: Externally Valid Estimation with Imperfect Personas

Title: Reinforcement Learning with Discrete Diffusion Policies for Combinatorial Action Spaces

Title: Emergent morpho-phonological representations in self-supervised speech models

Title: ARSS: Taming Decoder-only Autoregressive Visual Generation for View Synthesis From Single View

Title: Planning with Unified Multimodal Models

Title: On the Sheafification of Higher-Order Message Passing

Title: Copyright Infringement Detection in Text-to-Image Diffusion Models via Differential Privacy

Title: Dynamics of Learning: Generative Schedules from Latent ODEs

Title: Mask What Matters: Controllable Text-Guided Masking for Self-Supervised Medical Image Analysis

Title: CLAD-Net: Continual Activity Recognition in Multi-Sensor Wearable Systems

Title: Follow-Your-Preference: Towards Preference-Aligned Image Inpainting

Title: Demystifying Network Foundation Models

Title: Sensitivity Analysis for Diffusion Models

Title: d$^2$Cache: Accelerating Diffusion-Based LLMs via Dual Adaptive Caching

Title: Streamline pathology foundation model by cross-magnification distillation

Title: How to Make Large Language Models Generate 100% Valid Molecules?

Title: Stochastic Interpolants via Conditional Dependent Coupling

Title: Impute-MACFM: Imputation based on Mask-Aware Flow Matching

Title: Benchmarking DINOv3 for Multi-Task Stroke Analysis on Non-Contrast CT

Title: Tree Reward-Aligned Search for TReASURe in Masked Diffusion Language Models

Title: CrystalGym: A New Benchmark for Materials Discovery Using Reinforcement Learning

Title: Dense associative memory on the Bures-Wasserstein space

Title: Sparse2Dense: A Keypoint-driven Generative Framework for Human Video Compression and Vertex Prediction

Title: From Harm to Help: Turning Reasoning In-Context Demos into Assets for Reasoning LMs

Title: Towards Monotonic Improvement in In-Context Reinforcement Learning

Title: More Data or Better Algorithms: Latent Diffusion Augmentation for Deep Imbalanced Regression

Title: OracleGS: Grounding Generative Priors for Sparse-View Gaussian Splatting

Title: CREPE: Controlling Diffusion with Replica Exchange

Title: SynDoc: A Hybrid Discriminative-Generative Framework for Enhancing Synthetic Domain-Adaptive Document Key Information Extraction

Title: A2D: Any-Order, Any-Step Safety Alignment for Diffusion Language Models

Title: Seeing Through the Blur: Unlocking Defocus Maps for Deepfake Detection

Title: Seeing the Unseen in Low-light Spike Streams

Title: Balanced Diffusion-Guided Fusion for Multimodal Remote Sensing Classification

Title: Learning to Reason in Structured In-context Environments with Reinforcement Learning

Title: Entering the Era of Discrete Diffusion Models: A Benchmark for Schrödinger Bridges and Entropic Optimal Transport

Title: Landing with the Score: Riemannian Optimization through Denoising

Title: UniPose: Unified Cross-modality Pose Prior Propagation towards RGB-D data for Weakly Supervised 3D Human Pose Estimation

Title: Generative Modeling of Shape-Dependent Self-Contact Human Poses

Title: WorldSplat: Gaussian-Centric Feed-Forward 4D Scene Generation for Autonomous Driving

Title: Planner Aware Path Learning in Diffusion Language Models Training

Title: Comparison of Scoring Rationales Between Large Language Models and Human Raters

Title: FoR-SALE: Frame of Reference-guided Spatial Adjustment in LLM-based Diffusion Editing

Title: 3DPCNet: Pose Canonicalization for Robust Viewpoint-Invariant 3D Kinematic Analysis from Monocular RGB cameras

Title: Generative Evolutionary Meta-Solver (GEMS): Scalable Surrogate-Free Multi-Agent Learning

Title: Memory-Efficient Fine-Tuning via Low-Rank Activation Compression

Title: RestoRect: Degraded Image Restoration via Latent Rectified Flow & Feature Distillation

Title: The Impact of Role Design in In-Context Learning for Large Language Models

Title: Disentanglement of Variations with Multimodal Generative Modeling

Title: Towards Interpretable Visual Decoding with Attention to Brain Representations

Title: RobuQ: Pushing DiTs to W1.58A2 via Robust Activation Quantization

Title: VividFace: High-Quality and Efficient One-Step Diffusion For Video Face Enhancement

Title: Avoid Catastrophic Forgetting with Rank-1 Fisher from Diffusion Models

Title: MAN: Latent Diffusion Enhanced Multistage Anti-Noise Network for Efficient and High-Quality Low-Dose CT Image Denoising

Title: VMDiff: Visual Mixing Diffusion for Limitless Cross-Object Synthesis

Title: BioVessel-Net and RetinaMix: Unsupervised Retinal Vessel Segmentation from OCTA Images

Title: DiffInk: Glyph- and Style-Aware Latent Diffusion Transformer for Text to Online Handwriting Generation

Title: RIV: Recursive Introspection Mask Diffusion Vision Language Model

Title: Efficient Domain-Adaptive Multi-Task Dense Prediction with Vision Foundation Models

Title: LightFair: Towards an Efficient Alternative for Fair T2I Diffusion via Debiasing Pre-trained Text Encoders

Title: Griffin: Generative Reference and Layout Guided Image Composition

Title: Don't Settle Too Early: Self-Reflective Remasking for Diffusion Language Models

Title: Beyond English-Centric Training: How Reinforcement Learning Improves Cross-Lingual Reasoning in LLMs

Title: QuantSparse: Comprehensively Compressing Video Diffusion Transformer with Model Quantization and Attention Sparsification

Title: Estimating Time Series Foundation Model Transferability via In-Context Learning

Title: CrimEdit: Controllable Editing for Counterfactual Object Removal, Insertion, and Movement

Title: PD-Diag-Net: Clinical-Priors guided Network on Brain MRI for Auxiliary Diagnosis of Parkinson's Disease

Title: DiffPCN: Latent Diffusion Model Based on Multi-view Depth Images for Point Cloud Completion

Title: M3DLayout: A Multi-Source Dataset of 3D Indoor Layouts and Structured Descriptions for 3D Generation

Title: ResAD++: Towards Class Agnostic Anomaly Detection via Residual Feature Learning

Title: UniAlignment: Semantic Alignment for Unified Image Generation, Understanding, Manipulation and Perception

Title: GenView++: Unifying Adaptive View Generation and Quality-Driven Supervision for Contrastive Representation Learning

Title: Texture Vector-Quantization and Reconstruction Aware Prediction for Generative Super-Resolution

Title: Trained Mamba Emulates Online Gradient Descent in In-Context Linear Regression

Title: Space Group Conditional Flow Matching

Title: Uni4D-LLM: A Unified SpatioTemporal-Aware VLM for 4D Understanding and Generation

Title: Towards Fine-Grained Text-to-3D Quality Assessment: A Benchmark and A Two-Stage Rank-Learning Metric

Title: Adversarial Diffusion for Robust Reinforcement Learning

Title: Tunable-Generalization Diffusion Powered by Self-Supervised Contextual Sub-Data for Low-Dose CT Reconstruction

Title: EWC-Guided Diffusion Replay for Exemplar-Free Continual Learning in Medical Imaging

Title: EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling

Title: MoReact: Generating Reactive Motion from Textual Descriptions

Title: Revisit the Imbalance Optimization in Multi-task Learning: An Experimental Analysis

Title: Token Painter: Training-Free Text-Guided Image Inpainting via Mask Autoregressive Models

Title: Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Step

Title: SAR-KnowLIP: Towards Multimodal Foundation Models for Remote Sensing

Title: Diffusion Models are Kelly Gamblers

Title: Brain-language fusion enables interactive neural readout and in-silico experimentation

Title: HunyuanImage 3.0 Technical Report

Title: Reinforcement Learning with Inverse Rewards for World Model Post-training

Title: VFSI: Validity First Spatial Intelligence for Constraint-Guided Traffic Diffusion

Title: Towards Redundancy Reduction in Diffusion Models for Efficient Video Super-Resolution

Title: RPG360: Robust 360 Depth Estimation with Perspective Foundation Models and Graph Optimization

Title: Advancing Multi-agent Traffic Simulation via R1-Style Reinforcement Fine-Tuning

Title: SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention

Title: Sequential Diffusion Language Models

Title: Pretraining Scaling Laws for Generative Evaluations of Language Models

Title: SparseD: Sparse Attention for Diffusion Language Models

Title: GPS-MTM: Capturing Pattern of Normalcy in GPS-Trajectories with self-supervised learning

Title: Automated Vulnerability Validation and Verification: A Large Language Model Approach

Title: In-Context Compositional Q-Learning for Offline Reinforcement Learning

Title: AQUAIR: A High-Resolution Indoor Environmental Quality Dataset for Smart Aquaculture Monitoring

Title: A Family of Kernelized Matrix Costs for Multiple-Output Mixture Neural Networks

Title: Unified Multi-Modal Interactive & Reactive 3D Motion Generation via Rectified Flow

Title: GeoFunFlow: Geometric Function Flow Matching for Inverse Operator Learning over Complex Geometries

Title: The Impossibility of Inverse Permutation Learning in Transformer Models

Title: GANji: A Framework for Introductory AI Image Generation

Title: Asymmetric VAE for One-Step Video Super-Resolution Acceleration

Title: Your thoughts tell who you are: Characterize the reasoning patterns of LRMs

Title: Localizing Task Recognition and Task Learning in In-Context Learning via Attention Head Analysis

Title: LatXGen: Towards Radiation-Free and Accurate Quantitative Analysis of Sagittal Spinal Alignment Via Cross-Modal Radiographic View Synthesis

Title: Task Vectors, Learned Not Extracted: Performance Gains and Mechanistic Insight

Title: FM-FoG: A Real-Time Foundation Model-based Wearable System for Freezing-of-Gait Mitigation

Title: Tumor Synthesis conditioned on Radiomics

Title: Retrieval-augmented GUI Agents with Generative Guidelines

Title: Simulating Post-Neoadjuvant Chemotherapy Breast Cancer MRI via Diffusion Model with Prompt Tuning

Title: PET: Preference Evolution Tracking with LLM-Generated Explainable Distribution

Title: An Efficient 3D Latent Diffusion Model for T1-contrast Enhanced MRI Generation

Title: UniVid: The Open-Source Unified Video Model

Title: BALR-SAM: Boundary-Aware Low-Rank Adaptation of SAM for Resource-Efficient Medical Image Segmentation

Title: Forge4D: Feed-Forward 4D Human Reconstruction and Interpolation from Uncalibrated Sparse-view Videos

Title: Scalable Audio-Visual Masked Autoencoders for Efficient Affective Video Facial Analysis

Title: Semantic Editing with Coupled Stochastic Differential Equations

Title: EVLF-FM: Explainable Vision Language Foundation Model for Medicine

Title: FreeAction: Training-Free Techniques for Enhanced Fidelity of Trajectory-to-Video Generation

Title: Graph Foundation Models: Bridging Language Model Paradigms and Graph Optimization

Title: Cycle Diffusion Model for Counterfactual Image Generation

Title: ASIA: Adaptive 3D Segmentation using Few Image Annotations

Title: Let LLMs Speak Embedding Languages: Generative Text Embeddings via Iterative Contrastive Refinement

Title: DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models

Title: ELASTIQ: EEG-Language Alignment with Semantic Task Instruction and Querying

Title: A study of Universal ODE approaches to predicting soil organic carbon

Title: Towards Foundation Models for Cryo-ET Subtomogram Analysis

Title: Hyperspherical Latents Improve Continuous-Token Autoregressive Generation

Title: Expanding Horizons of Level Diversity via Multi-objective Evolutionary Learning

Title: NeRV-Diffusion: Diffuse Implicit Neural Representations for Video Synthesis

Title: DRIFT: Divergent Response in Filtered Transformations for Robust Adversarial Defense

Title: Watermarking Diffusion Language Models

Title: From Satellite to Street: A Hybrid Framework Integrating Stable Diffusion and PanoGAN for Consistent Cross-View Synthesis

Title: DINOReg: Strong Point Cloud Registration with Vision Foundation Model

Title: AXIS: Explainable Time Series Anomaly Detection with Large Language Models

Title: REALIGN: Regularized Procedure Alignment with Matching Video Embeddings via Partial Gromov-Wasserstein Optimal Transport

Title: LLaDA-MoE: A Sparse MoE Diffusion Language Model

Title: RapidMV: Leveraging Spatio-Angular Representations for Efficient and Consistent Text-to-Multi-View Synthesis

Title: ScatterAD: Temporal-Topological Scattering Mechanism for Time Series Anomaly Detection

Title: CLQ: Cross-Layer Guided Orthogonal-based Quantization for Diffusion Transformers

Title: UI2V-Bench: An Understanding-based Image-to-video Generation Benchmark

Title: Alternatives To Next Token Prediction In Text Generation - A Survey

Title: Beyond Isolated Facts: Synthesizing Narrative and Grounded Supervision for VideoQA

Title: Generalist Multi-Class Anomaly Detection via Distillation to Two Heterogeneous Student Networks

Title: Interpretable Kernel Representation Learning at Scale: A Unified Framework Utilizing Nyström Approximation

Title: LaMoGen: Laban Movement-Guided Diffusion for Text-to-Motion Generation

Title: Specialization after Generalization: Towards Understanding Test-Time Training in Foundation Models

Title: CMT: Mid-Training for Efficient Learning of Consistency, Mean Flow, and Flow Map Models

Title: Diffusion Bridge or Flow Matching? A Unifying Framework and Comparative Analysis

Title: Training-Free Multimodal Guidance for Video to Audio Generation

Title: SCOPE: Semantic Conditioning for Sim2Real Category-Level Object Pose Estimation in Robotics

Title: SAIP: A Plug-and-Play Scale-adaptive Module in Diffusion-based Inverse Problems

Title: FreeRet: MLLMs as Training-Free Retrievers

Title: RIFLE: Removal of Image Flicker-Banding via Latent Diffusion Enhancement

Title: Learning Object-Centric Representations Based on Slots in Real World Scenarios

Title: SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer

Title: Enhancing Physical Plausibility in Video Generation by Reasoning the Implausibility

Title: MemGen: Weaving Generative Latent Memory for Self-Evolving Agents

Title: Toward a Vision-Language Foundation Model for Medical Data: Multimodal Dataset and Benchmarks for Vietnamese PET/CT Report Generation

Title: ExGS: Extreme 3D Gaussian Compression with Diffusion Priors

Title: In-Context Learning of Temporal Point Processes with Foundation Inference Models

Title: MarS-FM: Generative Modeling of Molecular Dynamics via Markov State Models

Title: SkyLink: Unifying Street-Satellite Geo-Localization via UAV-Mediated 3D Scene Alignment

Title: Assessing the risk of future Dunkelflaute events for Germany using generative deep learning

Title: Causal-Adapter: Taming Text-to-Image Diffusion for Faithful Counterfactual Generation

Title: Cell2Text: Multimodal LLM for Generating Single-Cell Descriptions from RNA-Seq Data

Title: DRIFT-Net: A Spectral-Coupled Neural Operator for PDEs Learning

Title: Environment-Aware Satellite Image Generation with Diffusion Models

Title: ThermalGen: Style-Disentangled Flow-Based Generative Models for RGB-to-Thermal Image Translation

Title: VAGUEGAN: Stealthy Poisoning and Backdoor Attacks on Image Generative Pipelines

Title: DAM: Dual Active Learning with Multimodal Foundation Model for Source-Free Domain Adaptation

Title: Attention Surgery: An Efficient Recipe to Linearize Your Video Diffusion Transformer

Title: BOE-XSUM: Extreme Summarization in Clear Language of Spanish Legal Decrees and Notifications

Title: Scalable GANs with Transformers

Title: OAT-FM: Optimal Acceleration Transport for Improved Flow Matching

Title: Event-based Facial Keypoint Alignment via Cross-Modal Fusion Attention and Self-Supervised Multi-Event Representation Learning

Title: On-the-Fly Data Augmentation for Brain Tumor Segmentation

Title: Double Descent as a Lens for Sample Efficiency in Autoregressive vs. Discrete Diffusion Models

Title: Wan-Alpha: High-Quality Text-to-Video Generation with Alpha Channel

Title: SDPose: Exploiting Diffusion Priors for Out-of-Domain and Robust Pose Estimation

Title: Generalized Correctness Models: Learning Calibrated and Model-Agnostic Correctness Predictors from Historical Patterns

Title: PanoWorld-X: Generating Explorable Panoramic Worlds via Sphere-Aware Video Diffusion

Title: Score-based Membership Inference on Diffusion Models

Title: Ultra-Fast Language Generation via Discrete Diffusion Divergence Instruct

Title: Advantage Weighted Matching: Aligning RL with Pretraining in Diffusion Models

Title: UniLat3D: Geometry-Appearance Unified Latents for Single-Stage 3D Generation

Title: Towards a Certificate of Trust: Task-Aware OOD Detection for Scientific AI

Title: MANI-Pure: Magnitude-Adaptive Noise Injection for Adversarial Purification

Title: jina-reranker-v3: Last but Not Late Interaction for Document Reranking

Title: Towards Trustworthy Lexical Simplification: Exploring Safety and Efficiency with Small LLMs

Title: Score Distillation of Flow Matching Models

Title: Chance-constrained Flow Matching for High-Fidelity Constraint-aware Generation

Title: Rolling Forcing: Autoregressive Long Video Diffusion in Real Time

Title: Aligning Visual Foundation Encoders to Tokenizers for Diffusion Models

Title: GLASS Flows: Transition Sampling for Alignment of Flow and Diffusion Models

Title: TR2-D2: Tree Search Guided Trajectory-Aware Fine-Tuning for Discrete Diffusion

Title: Personalized Vision via Visual In-Context Learning

Title: GHOST: Hallucination-Inducing Image Generation for Multimodal LLMs

Title: DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space

Title: DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder

Title: PAD3R: Pose-Aware Dynamic 3D Reconstruction from Casual Videos

Title: Learning to Parallel: Accelerating Diffusion Large Language Models via Adaptive Parallel Decoding

Title: Visual Jigsaw Post-Training Improves MLLMs

Title: VGGT-X: When VGGT Meets Dense Novel View Synthesis