2025-05-23

Title: BR-TaxQA-R: A Dataset for Question Answering with References for Brazilian Personal Income Tax Law, including case law

Title: Extracting Probabilistic Knowledge from Large Language Models for Bayesian Network Parameterization

Title: Aligning Dialogue Agents with Global Feedback via Large Language Model Reward Decomposition

Title: Citation Parsing and Analysis with Language Models

Title: Training Step-Level Reasoning Verifiers with Formal Verification Tools

Title: Pre-training Large Memory Language Models with Internal and External Knowledge

Title: Explaining Puzzle Solutions in Natural Language: An Exploratory Study on 6x6 Sudoku

Title: Leveraging Online Data to Enhance Medical Knowledge in a Small Persian Language Model

Title: Causal Interventions Reveal Shared Structure Across English Filler-Gap Constructions

Title: SLMEval: Entropy-Based Calibration for Human-Aligned Evaluation of Large Language Models

Title: Ranking Free RAG: Replacing Re-ranking with Selection in RAG for Sensitive Domains

Title: NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning

Title: Prototypical Human-AI Collaboration Behaviors from LLM-Assisted Writing in the Wild

Title: OpenEthics: A Comprehensive Ethical Evaluation of Open-Source Generative Large Language Models

Title: Internal and External Impacts of Natural Language Processing Papers

Title: Small Language Models in the Real World: Insights from Industrial Text Classification

Title: BiasLab: Toward Explainable Political Bias Detection with Dual-Axis Annotations and Rationale Indicators

Title: Date Fragments: A Hidden Bottleneck of Tokenization for Temporal Reasoning

Title: Continually Self-Improving Language Models for Bariatric Surgery Question-Answering

Title: Hierarchical Safety Realignment: Lightweight Restoration of Safety in Pruned Large Vision-Language Models

Title: MPL: Multiple Programming Languages with Large Language Models for Information Extraction

Title: Semiotic Reconstruction of Destination Expectation Constructs: An LLM-Driven Computational Paradigm for Social Media Tourism Analytics

Title: KoBALT: Korean Benchmark For Advanced Linguistic Tasks

Title: Veracity Bias and Beyond: Uncovering LLMs' Hidden Beliefs in Problem-Solving Reasoning

Title: LLMs Are Not Scorers: Rethinking MT Evaluation with Generation-Based Methods

Title: Position of Uncertainty: A Cross-Linguistic Study of Positional Bias in Large Language Models

Title: Distilling the Implicit Multi-Branch Structure in LLMs' Reasoning via Reinforcement Learning

Title: EduBench: A Comprehensive Benchmarking Dataset for Evaluating Large Language Models in Diverse Educational Scenarios

Title: KNN-SSD: Enabling Dynamic Self-Speculative Decoding via Nearest Neighbor Layer Set Optimization

Title: Can LLMs Simulate Human Behavioral Variability? A Case Study in the Phonemic Fluency Task

Title: When Do LLMs Admit Their Mistakes? Understanding the Role of Model Belief in Retraction

Title: Automated Feedback Loops to Protect Text Simplification with Generative AI from Information Loss

Title: Understanding Fact Recall in Language Models: Why Two-Stage Training Encourages Memorization but Mixed Training Teaches Knowledge

Title: SAE-SSV: Supervised Steering in Sparse Representation Spaces for Reliable Control of Language Models

Title: An Empirical Study on Configuring In-Context Learning Demonstrations for Unleashing MLLMs' Sentimental Perception Capability

Title: Large Language Models based ASR Error Correction for Child Conversations

Title: Memorization or Reasoning? Exploring the Idiom Understanding of LLMs

Title: Don't Judge Code by Its Cover: Exploring Biases in LLM Judges for Code Evaluation

Title: Explain Less, Understand More: Jargon Detection via Personalized Parameter-Efficient Fine-tuning

Title: MuseRAG: Idea Originality Scoring At Scale

Title: LIFEBench: Evaluating Length Instruction Following in Large Language Models

Title: Align-GRAG: Reasoning-Guided Dual Alignment for Graph Retrieval-Augmented Generation

Title: Three Minds, One Legend: Jailbreak Large Reasoning Model with Adaptive Stacked Ciphers

Title: Diverse, not Short: A Length-Controlled Self-Learning Framework for Improving Response Diversity of Language Models

Title: Does Localization Inform Unlearning? A Rigorous Examination of Local Parameter Attribution for Knowledge Unlearning in Language Models

Title: IRONIC: Coherence-Aware Reasoning Chains for Multi-Modal Sarcasm Detection

Title: Transformer Copilot: Learning from The Mistake Log in LLM Fine-tuning

Title: Spontaneous Speech Variables for Evaluating LLMs Cognitive Plausibility

Title: HiMATE: A Hierarchical Multi-Agent Framework for Machine Translation Evaluation

Title: Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA

Title: ToDi: Token-wise Distillation via Fine-Grained Divergence Control

Title: INFERENCEDYNAMICS: Efficient Routing Across LLMs through Structured Capability and Knowledge Profiling

Title: PMPO: Probabilistic Metric Prompt Optimization for Small and Large Language Models

Title: SC4ANM: Identifying Optimal Section Combinations for Automated Novelty Prediction in Academic Papers

Title: Embodied Agents Meet Personalization: Exploring Memory Utilization for Personalized Assistance

Title: Ask, Retrieve, Summarize: A Modular Pipeline for Scientific Literature Summarization

Title: PaTH Attention: Position Encoding via Accumulating Householder Transformations

Title: Semantic Pivots Enable Cross-Lingual Transfer in Large Language Models

Title: Resource for Error Analysis in Text Simplification: New Taxonomy and Test Collection

Title: From Surveys to Narratives: Rethinking Cultural Value Adaptation in LLMs

Title: Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning

Title: Attributing Response to Context: A Jensen-Shannon Divergence Driven Mechanistic Study of Context Attribution in Retrieval-Augmented Generation

Title: WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning

Title: Beyond Static Testbeds: An Interaction-Centric Agent Simulation Platform for Dynamic Recommender Systems

Title: University of Indonesia at SemEval-2025 Task 11: Evaluating State-of-the-Art Encoders for Multi-Label Emotion Detection

Title: Reading Between the Prompts: How Stereotypes Shape LLM's Implicit Personalization

Title: Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning

Title: LLaMAs Have Feelings Too: Unveiling Sentiment and Emotion Representations in LLaMA Models Through Probing

Title: Sparse Activation Editing for Reliable Instruction Following in Narratives

Title: CUB: Benchmarking Context Utilisation Techniques for Language Models

Title: Are the Hidden States Hiding Something? Testing the Limits of Factuality-Encoding Capabilities in LLMs

Title: Benchmarking and Pushing the Multi-Bias Elimination Boundary of LLMs via Causal Effect Estimation-guided Debiasing

Title: EnSToM: Enhancing Dialogue Systems with Entropy-Scaled Steering Vectors for Topic Maintenance

Title: Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models

Title: Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains

Title: ScholarBench: A Bilingual Benchmark for Abstraction, Comprehension, and Reasoning Evaluation in Academic Contexts

Title: URLs Help, Topics Guide: Understanding Metadata Utility in LLM Training

Title: EMULATE: A Multi-Agent Framework for Determining the Veracity of Atomic Claims by Emulating Human Actions

Title: O²-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering

Title: Evaluating Large Language Model with Knowledge Oriented Language Specific Simple Question Answering

Title: What Media Frames Reveal About Stance: A Dataset and Study about Memes in Climate Change Discourse

Title: From Generic Empathy to Personalized Emotional Support: A Self-Evolution Framework for User Preference Alignment

Title: Steering Large Language Models for Machine Translation Personalization

Title: SSR-Zero: Simple Self-Rewarding Reinforcement Learning for Machine Translation

Title: Collaboration among Multiple Large Language Models for Medical Question Answering

Title: A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP

Title: Beyond Induction Heads: In-Context Meta Learning Induces Multi-Phase Circuit Emergence

Title: Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMs

Title: Breaking mBad! Supervised Fine-tuning for Cross-Lingual Detoxification

Title: TRIM: Achieving Extreme Sparsity with Targeted Row-wise Iterative Metric-driven Pruning

Title: IFEval-Audio: Benchmarking Instruction-Following Capability in Audio-based Large Language Models

Title: Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning

Title: Accidental Misalignment: Fine-Tuning Language Models Induces Unexpected Vulnerability

Title: Learning Beyond Limits: Multitask Learning and Synthetic Data for Low-Resource Canonical Morpheme Segmentation

Title: Two-way Evidence self-Alignment based Dual-Gated Reasoning Enhancement

Title: Unlearning Isn't Deletion: Investigating Reversibility of Machine Unlearning in LLMs

Title: SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis

Title: R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search

Title: Understanding and Analyzing Inappropriately Targeting Language in Online Discourse: A Comparative Annotation Study

Title: MPO: Multilingual Safety Alignment via Reward Gap Optimization

Title: CASTILLO: Characterizing Response Length Distributions of Large Language Models

Title: Shadows in the Attention: Contextual Perturbation and Representation Drift in the Dynamics of Hallucination in LLMs

Title: Power-Law Decay Loss for Large Language Model Finetuning: Focusing on Information Sparsity to Enhance Generation Quality

Title: UNCLE: Uncertainty Expressions in Long-Form Generation

Title: Latent Principle Discovery for Language Model Self-Improvement

Title: In-Context Watermarks for Large Language Models

Title: On Multilingual Encoder Language Model Compression for Low-Resource Languages

Title: VeriFastScore: Speeding up long-form factuality evaluation

Title: LLM as Effective Streaming Processor: Bridging Streaming-Batch Mismatches with Group Position Encoding

Title: T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning

Title: MASLab: A Unified and Comprehensive Codebase for LLM-based Multi-Agent Systems

Title: DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization

Title: Do Large Language Models Excel in Complex Logical Reasoning with Formal Language?

Title: R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning