2025-06-03

Title: Amadeus-Verbo Technical Report: The powerful Qwen2.5 family models trained in Portuguese

Title: Writing-Zero: Bridge the Gap Between Non-verifiable Problems and Verifiable Rewards

Title: On Designing Diffusion Autoencoders for Efficient Generation and Representation Learning

Title: Cluster-Aware Causal Mixer for Online Anomaly Detection in Multivariate Time Series

Title: MOFGPT: Generative Design of Metal-Organic Frameworks using Language Models

Title: Structuring Radiology Reports: Challenging LLMs with Lightweight Models

Title: Intercept Cancer: Cancer Pre-Screening with Large Scale Healthcare Foundation Models

Title: Ctrl-Crash: Controllable Diffusion for Realistic Car Crashes

Title: Entropic Risk Optimization in Discounted MDPs: Sample Complexity Bounds with a Generative Model

Title: Emergent Abilities of Large Language Models under Continued Pretraining for Language Adaptation

Title: DLM-One: Diffusion Language Models for One-Step Sequence Generation

Title: Inference-Time Alignment of Diffusion Models with Evolutionary Algorithms

Title: SkillVerse: Assessing and Enhancing LLMs with Tree Evaluation

Title: Towards Effective and Efficient Adversarial Defense with Diffusion Models for Robust Visual Tracking

Title: Latent Guidance in Diffusion Models for Perceptual Evaluations

Title: Foresight: Adaptive Layer Reuse for Accelerated and High-Quality Text-to-Video Generation

Title: OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning

Title: Scaling Textual Gradients via Sampling-Based Momentum

Title: JojoSCL: Shrinkage Contrastive Learning for single-cell RNA sequence Clustering

Title: Accelerating Diffusion LLMs via Adaptive Parallel Decoding

Title: Dual Debiasing for Noisy In-Context Learning for Text Generation

Title: A New Spatiotemporal Correlation Anomaly Detection Method that Integrates Contrastive Learning and Few-Shot Learning in Wireless Sensor Networks

Title: Channel Normalization for Time Series Channel Identification

Title: Latent Wavelet Diffusion: Enabling 4K Image Synthesis for Free

Title: G2S: A General-to-Specific Learning Framework for Temporal Knowledge Graph Forecasting with Large Language Models

Title: Comparing Traditional and Reinforcement-Learning Methods for Energy Storage Control

Title: Exploring In-context Example Generation for Machine Translation

Title: SSAM: Self-Supervised Association Modeling for Test-Time Adaption

Title: SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation

Title: Decoupling Reasoning and Knowledge Injection for In-Context Knowledge Editing

Title: Imputation of Missing Data in Smooth Pursuit Eye Movements Using a Self-Attention-based Deep Learning Approach

Title: SEED: A Benchmark Dataset for Sequential Facial Attribute Editing with Diffusion Models

Title: MR2US-Pro: Prostate MR to Ultrasound Image Translation and Registration Based on Diffusion Models

Title: Graph Evidential Learning for Anomaly Detection

Title: Seg2Any: Open-set Segmentation-Mask-to-Image Generation with Precise Shape and Semantic Control

Title: ABCDEFGH: An Adaptation-Based Convolutional Neural Network-CycleGAN Disease-Courses Evolution Framework Using Generative Models in Health Education

Title: Parallel Rescaling: Rebalancing Consistency Guidance for Personalized Diffusion Models

Title: Improving Dialogue State Tracking through Combinatorial Search for In-Context Examples

Title: Probabilistic Forecasting for Building Energy Systems using Time-Series Foundation Models

Title: Text-to-CT Generation via 3D Latent Diffusion Model with Contrastive Vision-Language Pretraining

Title: Video Signature: In-generation Watermarking for Latent Video Diffusion Models

Title: Differential Privacy for Deep Learning in Medicine

Title: CineMA: A Foundation Model for Cine Cardiac MRI

Title: Concept-Centric Token Interpretation for Vector-Quantized Generative Models

Title: RelDiff: Relational Data Generative Modeling with Graph-Based Diffusion Models

Title: QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training

Title: From Local Cues to Global Percepts: Emergent Gestalt Organization in Self-Supervised Vision Models

Title: Common Inpainted Objects In-N-Out of Context

Title: MoPINNEnKF: Iterative Model Inference using generic-PINN-based ensemble Kalman filter

Title: ArtiScene: Language-Driven Artistic 3D Scene Generation Through Image Intermediary

Title: Aiding Medical Diagnosis through Image Synthesis and Classification

Title: TIME: TabPFN-Integrated Multimodal Engine for Robust Tabular-Image Learning

Title: From Plain Text to Poetic Form: Generating Metrically-Constrained Sanskrit Verses

Title: QuantFace: Low-Bit Post-Training Quantization for One-Step Diffusion Face Restoration

Title: SafeGenes: Evaluating the Adversarial Robustness of Genomic Foundation Models

Title: Probing the Geometry of Truth: Consistency and Generalization of Truth Directions in LLMs Across Logical Transformations and Question Answering Tasks

Title: HERGC: Heterogeneous Experts Representation and Generative Completion for Multimodal Knowledge Graphs

Title: SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers

Title: A Large Language Model-Supported Threat Modeling Framework for Transportation Cyber-Physical Systems

Title: Toward Structured Knowledge Reasoning: Contrastive Retrieval-Augmented Generation on Experience

Title: Generalization in VAE and Diffusion Models: A Unified Information-Theoretic Analysis

Title: FourierFlow: Frequency-aware Flow Matching for Generative Turbulence Modeling

Title: Local Manifold Approximation and Projection for Manifold-Aware Diffusion Planning

Title: Towards Predicting Any Human Trajectory In Context

Title: Breaking Latent Prior Bias in Detectors for Generalizable AIGC Image Detection

Title: State-Covering Trajectory Stitching for Diffusion Planners

Title: DS-VTON: High-Quality Virtual Try-on via Disentangled Dual-Scale Generation

Title: 3D Skeleton-Based Action Recognition: A Review

Title: Position as Probability: Self-Supervised Transformers that Think Past Their Training for Length Extrapolation

Title: Deformable registration and generative modelling of aortic anatomies by auto-decoders and neural ODEs

Title: Continual-MEGA: A Large-scale Benchmark for Generalizable Continual Anomaly Detection

Title: From Objectives to Questions: A Planning-based Framework for Educational Mathematical Question Generation

Title: Pilot Contamination-Aware Graph Attention Network for Power Control in CFmMIMO

Title: NTPP: Generative Speech Language Modeling for Dual-Channel Spoken Dialogue via Next-Token-Pair Prediction

Title: Quantization-based Bounds on the Wasserstein Metric

Title: IVY-FAKE: A Unified Explainable Framework and Benchmark for Image and Video AIGC Detection

Title: What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training

Title: GOBench: Benchmarking Geometric Optics Generation and Understanding of MLLMs

Title: Temporal In-Context Fine-Tuning for Versatile Control of Video Diffusion Models

Title: Motion-Aware Concept Alignment for Consistent Video Editing

Title: Autoregressive Images Watermarking through Lexical Biasing: An Approach Resistant to Regeneration Attack

Title: AuralSAM2: Enabling SAM2 Hear Through Pyramid Audio-Visual Feature Prompting

Title: Modality Translation and Registration of MR and Ultrasound Images Using Diffusion Models

Title: Self-supervised ControlNet with Spatio-Temporal Mamba for Real-world Video Super-resolution

Title: ECP-Mamba: An Efficient Multi-scale Self-supervised Contrastive Learning Method with State Space Model for PolSAR Image Classification

Title: AceVFI: A Comprehensive Survey of Advances in Video Frame Interpolation

Title: A Large Convolutional Neural Network for Clinical Target and Multi-organ Segmentation in Gynecologic Brachytherapy with Multi-stage Learning

Title: Contextual Candor: Enhancing LLM Trustworthiness Through Hierarchical Unanswerability Detection

Title: Revolutionizing Radiology Workflow with Factual and Efficient CXR Report Generation

Title: Neuro-Symbolic Generative Diffusion Models for Physically Grounded, Robust, and Safe Generation

Title: From Words to Waves: Analyzing Concept Formation in Speech and Text-Based Foundation Models

Title: FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation

Title: FORT: Forward-Only Regression Training of Normalizing Flows

Title: Bridging Quantum and Classical Computing in Drug Design: Architecture Principles for Improved Molecule Generation

Title: Self-Supervised Multi-View Representation Learning using Vision-Language Model for 3D/4D Facial Expression Recognition

Title: Visual Sparse Steering: Improving Zero-shot Image Classification with Sparsity Guided Steering Vectors

Title: Beyond In-Context Learning: Aligning Long-form Generation of Large Language Models via Task-Inherent Attribute Guidelines

Title: Schema as Parameterized Tools for Universal Information Extraction

Title: TSRating: Rating Quality of Diverse Time Series Data by Meta-learning from LLM Judgment

Title: SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost

Title: $Ψ$-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models

Title: Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation

Title: NoiseAR: AutoRegressing Initial Noise Prior for Diffusion Models

Title: Unraveling Spatio-Temporal Foundation Models via the Pipeline Lens: A Comprehensive Review

Title: Synthetic Data Augmentation using Pre-trained Diffusion Models for Long-tailed Food Image Classification

Title: Playing with Transformer at 30+ FPS via Next-Frame Diffusion

Title: System Calls for Malware Detection and Classification: Methodologies and Applications

Title: Self-supervised Latent Space Optimization with Nebula Variational Coding

Title: DNAEdit: Direct Noise Alignment for Text-Guided Rectified Flow Editing

Title: Whale: Large-Scale multilingual ASR model with w2v-BERT and E-Branchformer with large speech data

Title: ShaTS: A Shapley-based Explainability Method for Time Series Artificial Intelligence Models applied to Anomaly Detection in Industrial Internet of Things

Title: DiffuseSlide: Training-Free High Frame Rate Video Generation Diffusion

Title: Towards Scalable Video Anomaly Retrieval: A Synthetic Video-Text Benchmark

Title: Feature-aware Hypergraph Generation via Next-Scale Prediction

Title: SemiVT-Surge: Semi-Supervised Video Transformer for Surgical Phase Recognition

Title: Automatic Stage Lighting Control: Is it a Rule-Driven Process or Generative Task?

Title: Efficiency without Compromise: CLIP-aided Text-to-Image GANs with Increased Diversity

Title: Continual Speech Learning with Fused Speech Features

Title: Analyzing the Importance of Blank for CTC-Based Knowledge Distillation

Title: Enhancing Diffusion-based Unrestricted Adversarial Attacks via Adversary Preferences Alignment

Title: FormFactory: An Interactive Benchmarking Suite for Multimodal Form-Filling Agents

Title: Beyond Diagonal Covariance: Flexible Posterior VAEs via Free-Form Injective Flows

Title: A Diffusion-Based Method for Learning the Multi-Outcome Distribution of Medical Treatments

Title: G4Seg: Generation for Inexact Segmentation Refinement with Diffusion Models

Title: Adaptive Destruction Processes for Diffusion Samplers

Title: LongDWM: Cross-Granularity Distillation for Building a Long-Term Driving World Model

Title: HOSIG: Full-Body Human-Object-Scene Interaction Generation with Hierarchical Scene Perception

Title: PMNO: A novel physics guided multi-step neural operator predictor for partial differential equations

Title: Minimal Impact ControlNet: Advancing Multi-ControlNet Integration

Title: mdok of KInIT: Robustly Fine-tuned LLM for Binary and Multiclass AI-Generated Text Detection

Title: Principled data augmentation for learning to solve quadratic programming problems

Title: Many-for-Many: Unify the Training of Multiple Video and Image Generation and Manipulation Tasks

Title: Federated Gaussian Mixture Models

Title: Human-Centric Evaluation for Foundation Models

Title: WorldExplorer: Towards Generating Fully Navigable 3D Scenes

Title: OmniV2V: Versatile Video Generation and Editing via Dynamic Content Manipulation

Title: SPACE: Your Genomic Profile Predictor is a Powerful DNA Foundation Model

Title: Learning to Explore: An In-Context Learning Approach for Pure Exploration

Title: SMOTE-DP: Improving Privacy-Utility Tradeoff with Synthetic Data

Title: Elucidating the representation of images within an unconditional diffusion model denoiser

Title: TaxaDiffusion: Progressively Trained Diffusion Model for Fine-Grained Species Generation

Title: Esoteric Language Models

Title: E3D-Bench: A Benchmark for End-to-End 3D Geometric Foundation Models

Title: Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control

Title: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding

Title: IMAGHarmony: Controllable Image Editing with Consistent Object Quantity and Layout

Title: Dual-Process Image Generation