2025-05-22

Title: Addressing the Challenges of Planning Language Generation

Title: Automated Journalistic Questions: A New Method for Extracting 5W1H in French

Title: Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models

Title: Language Mixing in Reasoning Language Models: Patterns, Impact, and Internal Causes

Title: WebNovelBench: Placing LLM Novelists on the Web Novel Distribution

Title: Tracing Multilingual Factual Knowledge Acquisition in Pretraining

Title: Text Generation Beyond Discrete Token Sampling

Title: SEPS: A Separability Measure for Robust Unlearning in LLMs

Title: A Comparative Study of Large Language Models and Human Personality Traits

Title: MAATS: A Multi-Agent Automated Translation System Based on MQM Evaluation

Title: EasyMath: A 0-shot Math Benchmark for SLMs

Title: Saten: Sparse Augmented Tensor Networks for Post-Training Compression of Large Language Models

Title: Incorporating Token Usage into Prompting Strategy Evaluation

Title: Strategic Planning and Rationalizing on Trees Make LLMs Better Debaters

Title: In-Context Learning Boosts Speech Recognition via Human-like Adaptation to Speakers and Language Varieties

Title: Scaling Laws for State Dynamics in Large Language Models

Title: Concept Incongruence: An Exploration of Time and Death in Role Playing

Title: Understanding 6G through Language Models: A Case Study on LLM-aided Structured Entity Extraction in Telecom Domain

Title: ConspEmoLLM-v2: A robust and stable model to detect sentiment-transformed conspiracy theories

Title: Reliable Decision Support with LLMs: A Framework for Evaluating Consistency in Binary Text Classification Applications

Title: Too Long, Didn't Model: Decomposing LLM Long-Context Understanding With Novels

Title: MedBrowseComp: Benchmarking Medical Deep Research and Computer Use

Title: DECASTE: Unveiling Caste Stereotypes in Large Language Models through Multi-Dimensional Bias Analysis

Title: Multimodal Cultural Safety: Evaluation Frameworks and Alignment Strategies

Title: CRAFT: Training-Free Cascaded Retrieval for Tabular QA

Title: Language Specific Knowledge: Do Models Know Better in X than in English?

Title: Effective and Efficient Schema-aware Information Extraction Using On-Device Large Language Models

Title: Meta-Design Matters: A Self-Design Multi-Agent System

Title: Towards Spoken Mathematical Reasoning: Benchmarking Speech-based Models over Multi-faceted Math Problems

Title: Diagnosing our datasets: How does my language model learn clinical information?

Title: Denoising Concept Vectors with Sparse Autoencoders for Improved Language Model Steering

Title: Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Title: ChartCards: A Chart-Metadata Generation Framework for Multi-Task Chart Understanding

Title: Improving the fact-checking performance of language models by relying on their entailment ability

Title: MolLangBench: A Comprehensive Benchmark for Language-Prompted Molecular Structure Recognition, Editing, and Generation

Title: Lost in Benchmarks? Rethinking Large Language Model Benchmarking with Item Response Theory

Title: Self-GIVE: Associative Thinking from Limited Structured Knowledge for Enhanced Large Language Model Reasoning

Title: UrduFactCheck: An Agentic Fact-Checking Framework for Urdu with Evidence Boosting and Benchmarking

Title: The Pursuit of Empathy: Evaluating Small Language Models for PTSD Dialogue Support

Title: In-Domain African Languages Translation Using LLMs and Multi-armed Bandits

Title: Can Large Language Models Understand Internet Buzzwords Through User-Generated Content

Title: DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data

Title: Traveling Across Languages: Benchmarking Cross-Lingual Consistency in Multimodal LLMs

Title: DeFTX: Denoised Sparse Fine-Tuning for Zero-Shot Cross-Lingual Transfer

Title: SciCUEval: A Comprehensive Dataset for Evaluating Scientific Context Understanding in Large Language Models

Title: Nek Minit: Harnessing Pragmatic Metacognitive Prompting for Explainable Sarcasm Detection of Australian and Indian English

Title: Mechanistic evaluation of Transformers and state space models

Title: StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization

Title: A Risk Taxonomy for Evaluating AI-Powered Psychotherapy Agents

Title: RoT: Enhancing Table Reasoning with Iterative Row-Wise Traversals

Title: An Empirical Study on Reinforcement Learning for Reasoning-Search Interleaved LLM Agents

Title: Prolonged Reasoning Is Not All You Need: Certainty-Based Adaptive Routing for Efficient LLM/MLLM Reasoning

Title: ReflAct: World-Grounded Decision Making in LLM Agents via Goal-State Reflection

Title: EcomScriptBench: A Multi-task Benchmark for E-commerce Script Planning via Step-wise Intention-Driven Product Association

Title: DUSK: Do Not Unlearn Shared Knowledge

Title: Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs

Title: R-TOFU: Unlearning in Large Reasoning Models

Title: Multilingual Prompting for Improving LLM Generation Diversity

Title: Towards Explainable Temporal Reasoning in Large Language Models: A Structure-Aware Generative Framework

Title: Fooling the LVLM Judges: Visual Biases in LVLM-Based Evaluation

Title: MentalMAC: Enhancing Large Language Models for Detecting Mental Manipulation via Multi-Task Anti-Curriculum Distillation

Title: When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners

Title: AGENT-X: Adaptive Guideline-based Expert Network for Threshold-free AI-generated teXt detection

Title: Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

Title: Hallucinate at the Last in Long Response Generation: A Case Study on Long Document Summarization

Title: Chinese Toxic Language Mitigation via Sentiment Polarity Consistent Rewrites

Title: Emotional Supporters often Use Multiple Strategies in a Single Turn

Title: Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack

Title: Leveraging Unit Language Guidance to Advance Speech Modeling in Textless Speech-to-Speech Translation

Title: Your Language Model Can Secretly Write Like Humans: Contrastive Paraphrase Attacks on LLM-Generated Text Detectors

Title: FlowKV: Enhancing Multi-Turn Conversational Coherence in LLMs via Isolated Key-Value Cache Management

Title: Revealing Language Model Trajectories via Kullback-Leibler Divergence

Title: NL-Debugging: Exploiting Natural Language as an Intermediate Representation for Code Debugging

Title: X-WebAgentBench: A Multilingual Interactive Web Benchmark for Evaluating Global Agentic System

Title: RePPL: Recalibrating Perplexity by Uncertainty in Semantic Propagation and Language Generation for Explainable QA Hallucination Detection

Title: Are Vision-Language Models Safe in the Wild? A Meme-Based Benchmark Study

Title: An Empirical Study of the Anchoring Effect in LLMs: Existence, Mechanism, and Potential Mitigations

Title: How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study

Title: Trends and Challenges in Authorship Analysis: A Review of ML, DL, and LLM Approaches

Title: Gated Integration of Low-Rank Adaptation for Continual Learning of Language Models

Title: NeoN: A Tool for Automated Detection, Linguistic and LLM-Driven Analysis of Neologisms in Polish

Title: Responsible Diffusion Models via Constraining Text Embeddings within Safe Regions

Title: Likelihood Variance as Text Importance for Resampling Texts to Map Language Models

Title: Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought

Title: On the Generalization vs Fidelity Paradox in Knowledge Distillation

Title: AdUE: Improving uncertainty estimation head for LoRA adapters in LLMs

Title: Single LLM, Multiple Roles: A Unified Retrieval-Augmented Generation Framework Using Role-Specific Token Optimization

Title: Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment

Title: Joint Flashback Adaptation for Forgetting-Resistant Instruction Tuning

Title: CoLA: Collaborative Low-Rank Adaptation

Title: PhysicsArena: The First Multimodal Physics Reasoning Benchmark Exploring Variable, Process, and Solution Dimensions

Title: LFTF: Locating First and Then Fine-Tuning for Mitigating Gender Bias in Large Language Models

Title: KaFT: Knowledge-aware Fine-tuning for Boosting LLMs' Domain-specific Question-Answering Performance

Title: Collaborative Problem-Solving in an Optimization Game

Title: Protoknowledge Shapes Behaviour of LLMs in Downstream Tasks: Memorization and Generalization with Knowledge Graphs

Title: Multilingual Test-Time Scaling via Initial Thought Transfer

Title: Evaluate Bias without Manual Test Sets: A Concept Representation Perspective for LLMs

Title: Social Bias in Popular Question-Answering Benchmarks

Title: DayDreamer at CQs-Gen 2025: Generating Critical Questions through Argument Scheme Completion

Title: Do RAG Systems Suffer From Positional Bias?

Title: From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning

Title: Can LLMs $\textit{understand}$ Math? -- Exploring the Pitfalls in Mathematical Reasoning

Title: Listen to the Context: Towards Faithful Large Language Models for Retrieval Augmented Generation on Climate Questions

Title: Feature Extraction and Steering for Enhanced Chain-of-Thought Reasoning in Language Models

Title: Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen!

Title: Efficient and Direct Duplex Modeling for Speech-to-Speech Language Model

Title: UniErase: Unlearning Token as a Universal Erasure Primitive for Language Models

Title: The Representational Alignment between Humans and Language Models is implicitly driven by a Concreteness Effect

Title: A Federated Splitting Framework for LLMs: Security, Efficiency, and Adaptability

Title: ThinkLess: A Training-Free Inference-Efficient Method for Reducing Reasoning Redundancy

Title: Can Large Language Models be Effective Online Opinion Miners?

Title: LyapLock: Bounded Knowledge Preservation in Sequential Large Language Model Editing

Title: Advancing LLM Safe Alignment with Safety Representation Ranking

Title: TurnaboutLLM: A Deductive Reasoning Benchmark from Detective Games

Title: Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling

Title: Shared Path: Unraveling Memorization in Multilingual LLMs through Language Similarities

Title: VocalBench: Benchmarking the Vocal Conversational Abilities for Speech Interaction Models

Title: DEBATE, TRAIN, EVOLVE: Self Evolution of Language Model Reasoning

Title: Beyond Hard and Soft: Hybrid Context Compression for Balancing Local and Global Information Retention

Title: ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning

Title: Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space

Title: dKV-Cache: The Cache for Diffusion Language Models

Title: Long-Form Information Alignment Evaluation Beyond Atomic Facts

Title: Reverse Engineering Human Preferences with Reinforcement Learning

Title: VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models

Title: Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering

Title: The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation

Title: GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents

Title: Learning to Reason via Mixture-of-Thought for Logical Reasoning