2025-06-05

Title: Modular Diffusion Policy Training: Decoupling and Recombining Guidance and Diffusion for Offline RL

Title: Applying MambaAttention, TabPFN, and TabTransformers to Classify SAE Automation Levels in Crashes

Title: Dual Branch VideoMamba with Gated Class Token Fusion for Violence Detection

Title: Causal Discovery in Dynamic Fading Wireless Networks

Title: Test-Time Scaling of Diffusion Models via Noise Trajectory Search

Title: Farm-LightSeek: An Edge-centric Multimodal Agricultural IoT Data Analytics Framework with Lightweight LLMs

Title: Improvement of human health lifespan with hybrid group pose estimation methods

Title: PALADIN: Robust Neural Fingerprinting for Text-to-Image Diffusion Models

Title: EdgeVidSum: Real-Time Personalized Video Summarization at the Edge

Title: FOLIAGE: Towards Physical Intelligence World Models Via Unbounded Surface Evolution

Title: Multimodal Foundation Model for Cross-Modal Retrieval and Activity Recognition Tasks

Title: Vid-SME: Membership Inference Attacks against Large Video Understanding Models

Title: Continual Learning in Vision-Language Models via Aligned Model Merging

Title: Multimodal Generative AI with Autoregressive LLMs for Human Motion Understanding and Generation: A Way Forward

Title: HueManity: Probing Fine-Grained Visual Perception in MLLMs

Title: Unlabeled Data Improves Fine-Grained Image Zero-shot Classification with Multimodal LLMs

Title: Infinity Parser: Layout Aware Reinforcement Learning for Scanned Document Parsing

Title: Fingerprinting Deep Learning Models via Network Traffic Patterns in Federated Learning

Title: FuXi-Ocean: A Global Ocean Forecasting System with Sub-Daily Resolution

Title: Channel-adaptive Cross-modal Generative Semantic Communication for Point Cloud Transmission

Title: ConMamba: Contrastive Vision Mamba for Plant Disease Detection

Title: OpenCarbon: A Contrastive Learning-based Cross-Modality Neural Approach for High-Resolution Carbon Emission Prediction Using Open Data

Title: DiaBlo: Diagonal Blocks Are Sufficient For Finetuning

Title: BadReward: Clean-Label Poisoning of Reward Models in Text-to-Image RLHF

Title: Evaluating Large Language Models for Zero-Shot Disease Labeling in CT Radiology Reports Across Organ Systems

Title: Chipmunk: Training-Free Acceleration of Diffusion Transformers with Dynamic Column-Sparse Deltas

Title: FailureSensorIQ: A Multi-Choice QA Dataset for Understanding Sensor Relationships and Failure Modes

Title: Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem

Title: From Instructions to ODRL Usage Policies: An Ontology Guided Approach

Title: Multi-Exit Kolmogorov-Arnold Networks: enhancing accuracy and parsimony

Title: Hermes: High-Performance Homomorphically Encrypted Vector Databases

Title: The Future of Continual Learning in the Era of Foundation Models: Three Key Directions

Title: Mitigating Non-IID Drift in Zeroth-Order Federated LLM Fine-Tuning with Transferable Sparsity

Title: Semiconductor SEM Image Defect Classification Using Supervised and Semi-Supervised Learning with Vision Transformers

Title: Robustness in Both Domains: CLIP Needs a Robust Text Encoder

Title: Ask a Local: Detecting Hallucinations With Specialized Model Divergence

Title: A Multimodal, Multilingual, and Multidimensional Pipeline for Fine-grained Crowdsourcing Earthquake Damage Evaluation

Title: Comparison of different Unique hard attention transformer models by the formal languages they can recognize

Title: Toward Reliable VLM: A Fine-Grained Benchmark and Framework for Exposure, Bias, and Inference in Korean Street Views

Title: A Foundation Model for Spatial Proteomics

Title: Cross-Modal Urban Sensing: Evaluating Sound-Vision Alignment Across Street-Level and Aerial Imagery

Title: Trajectory Prediction Meets Large Language Models: A Survey

Title: Technical Options for Flexible Hardware-Enabled Guarantees

Title: DistRAG: Towards Distance-Based Spatial Reasoning in LLMs

Title: Adaptive Task Vectors for Large Language Models

Title: ViT-Split: Unleashing the Power of Vision Foundation Models via Efficient Splitting Heads

Title: Time Course MechInterp: Analyzing the Evolution of Components and Knowledge in Large Language Models

Title: Geometric Visual Fusion Graph Neural Networks for Multi-Person Human-Object Interaction Recognition in Videos

Title: RoNFA: Robust Neural Field-based Approach for Few-Shot Image Classification with Noisy Labels

Title: Delta-KNN: Improving Demonstration Selection in In-Context Learning for Alzheimer's Disease Detection

Title: APT: Improving Specialist LLM Performance with Weakness Case Acquisition and Iterative Preference Training

Title: Explainable AI: XAI-Guided Context-Aware Data Augmentation

Title: EpiCoDe: Boosting Model Performance Beyond Training with Extrapolation and Contrastive Decoding

Title: Beyond Memorization: A Rigorous Evaluation Framework for Medical Knowledge Editing

Title: Measuring Human Involvement in AI-Generated Text: A Case Study on Academic Writing

Title: CHIME: Conditional Hallucination and Integrated Multi-scale Enhancement for Time Series Diffusion Model

Title: Accurate Sublayer Pruning for Large Language Models by Exploiting Latency and Tunability Information

Title: DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models

Title: Target Semantics Clustering via Text Representations for Robust Universal Domain Adaptation

Title: Path Generation and Evaluation in Video Games: A Nonparametric Statistical Approach

Title: TokAlign: Efficient Vocabulary Adaptation via Token Alignment

Title: Seed-Coder: Let the Code Model Curate Data for Itself

Title: Robust Neural Rendering in the Wild with Asymmetric Dual 3D Gaussian Splatting

Title: Debate, Reflect, and Distill: Multi-Agent Feedback with Tree-Structured Preference Optimization for Efficient Language Model Enhancement

Title: Learning Monotonic Probabilities with a Generative Cost Model

Title: A Threat Intelligence Event Extraction Conceptual Model for Cyber Threat Intelligence Feeds

Title: WIFE-Fusion: Wavelet-aware Intra-inter Frequency Enhancement for Multi-model Image Fusion

Title: BPO: Revisiting Preference Modeling in Direct Preference Optimization

Title: ConsistentChat: Building Skeleton-Guided Consistent Dialogues for Large Language Models from Scratch

Title: POSS: Position Specialist Generates Better Draft for Speculative Decoding

Title: FreePRM: Training Process Reward Models Without Ground Truth Process Labels

Title: Exchange of Perspective Prompting Enhances Reasoning in Large Language Models

Title: KG-BiLM: Knowledge Graph Embedding via Bidirectional Language Models

Title: Automatically Suggesting Diverse Example Sentences for L2 Japanese Learners Using Pre-Trained Language Models

Title: ViTSGMM: A Robust Semi-Supervised Image Recognition Network Using Sparse Labels

Title: A Large-Scale Referring Remote Sensing Image Segmentation Dataset and Benchmark

Title: A Class Inference Scheme With Dempster-Shafer Theory for Learning Fuzzy-Classifier Systems

Title: Resolving Task Objective Conflicts in Unified Multimodal Understanding and Generation via Task-Aware Mixture-of-Experts

Title: From Understanding to Generation: An Efficient Shortcut for Evaluating Language Models

Title: Auto prompt sql: a resource-efficient architecture for text-to-sql translation in constrained environments

Title: Adapting Rule Representation With Four-Parameter Beta Distribution for Learning Classifier Systems

Title: Analyzing Transformer Models and Knowledge Distillation Approaches for Image Captioning on Edge AI

Title: VLMs Can Aggregate Scattered Training Patches

Title: Isharah: A Large-Scale Multi-Scene Dataset for Continuous Sign Language Recognition

Title: Learning to Insert [PAUSE] Tokens for Better Reasoning

Title: GCFL: A Gradient Correction-based Federated Learning Framework for Privacy-preserving CPSS

Title: Do Large Language Models Know Folktales? A Case Study of Yokai in Japanese Folktales

Title: Negative-Guided Subject Fidelity Optimization for Zero-Shot Subject-Driven Generation

Title: Robustness of Prompting: Enhancing Robustness of Large Language Models Against Prompting Attacks

Title: RewardAnything: Generalizable Principle-Following Reward Models

Title: Images are Worth Variable Length of Representations

Title: YOND: Practical Blind Raw Image Denoising Free from Camera-Specific Data Dependency

Title: Mono: Is Your "Clean" Vulnerability Dataset Really Solvable? Exposing and Trapping Undecidable Patches and Beyond

Title: EmoArt: A Multidimensional Dataset for Emotion-Aware Artistic Generation

Title: MambaNeXt-YOLO: A Hybrid State Space Model for Real-time Object Detection

Title: Client-Side Zero-Shot LLM Inference for Comprehensive In-Browser URL Analysis

Title: Trustworthy Medical Question Answering: An Evaluation-Centric Survey

Title: BiXFormer: A Robust Framework for Maximizing Modality Effectiveness in Multi-Modal Semantic Segmentation

Title: Efficient Data Selection for Domain Adaptation of ASR Using Pseudo-Labels and Multi-Stage Filtering

Title: PRJ: Perception-Retrieval-Judgement for Generated Images

Title: DSSAU-Net: U-Shaped Hybrid Network for Pubic Symphysis and Fetal Head Segmentation

Title: Robust Preference Optimization via Dynamic Target Margins

Title: Advancements in Artificial Intelligence Applications for Cardiovascular Disease Research

Title: AdaDecode: Accelerating LLM Decoding with Adaptive Layer Parallelism

Title: Learning-at-Criticality in Large Language Models for Quantum Field Theory and Beyond

Title: ScoreRAG: A Retrieval-Augmented Generation Framework with Consistency-Relevance Scoring and Structured Summarization for News Generation

Title: OV-COAST: Cost Aggregation with Optimal Transport for Open-Vocabulary Semantic Segmentation

Title: AetherVision-Bench: An Open-Vocabulary RGB-Infrared Benchmark for Multi-Angle Segmentation across Aerial and Ground Perspectives

Title: FSHNet: Fully Sparse Hybrid Network for 3D Object Detection

Title: On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity

Title: Verbalized Confidence Triggers Self-Verification: Emergent Behavior Without Explicit Reasoning Supervision

Title: Sign-SGD is the Golden Gate between Multi-Node to Single-Node Learning: Significant Boost via Parameter-Free Optimization

Title: ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices

Title: SAAT: Synergistic Alternating Aggregation Transformer for Image Super-Resolution

Title: Dropout-Robust Mechanisms for Differentially Private and Fully Decentralized Mean Estimation

Title: Scaling CrossQ with Weight Normalization

Title: Act-as-Pet: Benchmarking the Abilities of Large Language Models as E-Pets in Social Network Services

Title: AhaKV: Adaptive Holistic Attention-Driven KV Cache Eviction for Efficient Inference of Large Language Models

Title: ClozeMath: Improving Mathematical Reasoning in Language Models by Learning to Fill Equations

Title: Prediction Inconsistency Helps Achieve Generalizable Detection of Adversarial Examples

Title: FedFACT: A Provable Framework for Controllable Group-Fairness Calibration in Federated Learning

Title: Unifying Uniform and Binary-coding Quantization for Accurate Compression of Large Language Models

Title: Knockout LLM Assessment: Using Large Language Models for Evaluations through Iterative Pairwise Comparisons

Title: Attention-Only Transformers via Unrolled Subspace Denoising

Title: Mark My Words: A Robust Multilingual Model for Punctuation in Text and Speech Transcripts

Title: ConText: Driving In-context Learning for Text Removal and Segmentation

Title: Automatic Correction of Writing Anomalies in Hausa Texts

Title: CRAWLDoc: A Dataset for Robust Ranking of Bibliographic Documents

Title: Multi-objective Aligned Bidword Generation Model for E-commerce Search Advertising

Title: Vulnerability-Aware Alignment: Mitigating Uneven Forgetting in Harmful Fine-Tuning

Title: Prompt Candidates, then Distill: A Teacher-Student Framework for LLM-driven Data Annotation

Title: PulseReddit: A Novel Reddit Dataset for Benchmarking MAS in High-Frequency Cryptocurrency Trading

Title: EuroGEST: Investigating gender stereotypes in multilingual language models

Title: Evaluating Apple Intelligence's Writing Tools for Privacy Against Large Language Model-Based Inference Attacks: Insights from Early Datasets

Title: JointSplat: Probabilistic Joint Flow-Depth Optimization for Sparse-View Gaussian Splatting

Title: RadialRouter: Structured Representation for Efficient and Robust Large Language Models Routing

Title: Video, How Do Your Tokens Merge?

Title: Magic Mushroom: A Customizable Benchmark for Fine-grained Analysis of Retrieval Noise Erosion in RAG Systems

Title: Learning Fair And Effective Points-Based Rewards Programs

Title: When Fairness Isn't Statistical: The Limits of Machine Learning in Evaluating Legal Reasoning

Title: Learning equivariant models by discovering symmetries with learnable augmentations

Title: Learning from Noise: Enhancing DNNs for Event-Based Vision through Controlled Noise Injection

Title: HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models

Title: More or Less Wrong: A Benchmark for Directional Bias in LLM Comparative Reasoning

Title: Vision Remember: Alleviating Visual Forgetting in Efficient MLLM with Vision Feature Resample

Title: DiffCAP: Diffusion-based Cumulative Adversarial Purification for Vision Language Models

Title: Depermissioning Web3: a Permissionless Accountable RPC Protocol for Blockchain Networks

Title: Average Calibration Losses for Reliable Uncertainty in Medical Image Segmentation

Title: Lower Ricci Curvature for Hypergraphs

Title: HtFLlib: A Comprehensive Heterogeneous Federated Learning Library and Benchmark

Title: Causality-Aware Contrastive Learning for Robust Multivariate Time-Series Anomaly Detection

Title: From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding

Title: MS-YOLO: A Multi-Scale Model for Accurate and Efficient Blood Cell Detection

Title: Structured Pruning for Diverse Best-of-N Reasoning Optimization

Title: Solving Inverse Problems via Diffusion-Based Priors: An Approximation-Free Ensemble Sampling Approach

Title: RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors

Title: CARL: Causality-guided Architecture Representation Learning for an Interpretable Performance Predictor

Title: Vocabulary-free few-shot learning for Vision-Language Models

Title: Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning

Title: Privacy and Security Threat for OpenAI GPTs

Title: Mitigating Hallucinations in Large Vision-Language Models via Entity-Centric Multimodal Preference Optimization

Title: Unveiling and Eliminating the Shortcut Learning for Locate-Then-Edit Knowledge Editing via Both Subject and Relation Awareness

Title: Think Like a Person Before Responding: A Multi-Faceted Evaluation of Persona-Guided LLMs for Countering Hate

Title: Lacuna Inc. at SemEval-2025 Task 4: LoRA-Enhanced Influence-Based Unlearning for LLMs

Title: On Support Samples of Next Word Prediction

Title: EV-Flying: an Event-based Dataset for In-The-Wild Recognition of Flying Objects

Title: Explainability-Based Token Replacement on LLM-Generated Text

Title: High Accuracy, Less Talk (HALT): Reliable LLMs through Capability-Aligned Finetuning

Title: Curse of Slicing: Why Sliced Mutual Information is a Deceptive Measure of Statistical Dependence

Title: Progressive Mastery: Customized Curriculum Learning with Guided Prompting for Mathematical Reasoning

Title: Optimal Transport-based Domain Alignment as a Preprocessing Step for Federated Learning

Title: Controlling Difficulty of Generated Text for AI-Assisted Language Learning

Title: A Novel Data Augmentation Approach for Automatic Speaking Assessment on Opinion Expressions

Title: LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation

Title: EuroLLM-9B: Technical Report

Title: Multimodal Tabular Reasoning with Privileged Structured Information

Title: AmbiK: Dataset of Ambiguous Tasks in Kitchen Environment

Title: TextAtari: 100K Frames Game Playing with Language Agents

Title: Rectified Sparse Attention

Title: Multi-view Surface Reconstruction Using Normal and Reflectance Cues

Title: Guided Speculative Inference for Efficient Test-Time Alignment of LLMs

Title: CLAIM: An Intent-Driven Multi-Agent Framework for Analyzing Manipulation in Courtroom Dialogues

Title: MMR-V: What's Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos

Title: Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis

Title: A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization

Title: Image Editing As Programs with Diffusion Models

Title: N$^2$: A Unified Python Package and Test Bench for Nearest Neighbor-Based Matrix Completion

Title: Physics-Constrained Flow Matching: Sampling Generative Models with Hard Constraints

Title: Does Prompt Design Impact Quality of Data Imputation by LLMs?

Title: SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling

Title: SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models

Title: R-Search: Empowering LLM Reasoning with Search via Multi-Reward Reinforcement Learning

Title: TracLLM: A Generic Framework for Attributing Long Context LLMs

Title: EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation

Title: Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning

Title: Language-Image Alignment with Fixed Text Encoders

Title: Diffusion Domain Teacher: Diffusion Guided Domain Adaptive Object Detector

Title: FullDiT2: Efficient In-Context Conditioning for Video Diffusion Transformers

Title: Sounding that Object: Interactive Object-Aware Image to Audio Generation

Title: UNIC: Unified In-Context Video Editing

Title: Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation

Title: LayerFlow: A Unified Model for Layer-aware Video Generation