2026-03-20

Title: Do Large Language Models Possess a Theory of Mind? A Comparative Evaluation Using the Strange Stories Paradigm

Title: TherapyGym: Evaluating and Aligning Clinical Fidelity and Safety in Therapy Chatbots

Title: How Confident Is the First Token? An Uncertainty-Calibrated Prompt Optimization Framework for Large Language Model Classification and Understanding

Title: Agentic Framework for Political Biography Extraction

Title: DynaRAG: Bridging Static and Dynamic Knowledge in Retrieval-Augmented Generation

Title: Learned but Not Expressed: Capability-Expression Dissociation in Large Language Models

Title: Real-Time Trustworthiness Scoring for LLM Structured Outputs and Data Extraction

Title: MineDraft: A Framework for Batch Parallel Speculative Decoding

Title: An Agentic System for Schema Aware NL2SQL Generation

Title: BenchBrowser -- Collecting Evidence for Evaluating Benchmark Validity

Title: How LLMs Distort Our Written Language

Title: Modeling the human lexicon under temperature variations: linguistic factors, diversity and typicality in LLM word associations

Title: GRAFITE: Generative Regression Analysis Framework for Issue Tracking and Evaluation

Title: From Noise to Signal: When Outliers Seed New Topics

Title: Synthetic Data Generation for Training Diversified Commonsense Reasoning Models

Title: PowerFlow: Unlocking the Dual Nature of LLMs via Principled Distribution Matching

Title: AutoScreen-FW: An LLM-based Framework for Resume Screening

Title: TopoChunker: Topology-Aware Agentic Document Chunking Framework

Title: TARo: Token-level Adaptive Routing for LLM Test-time Alignment

Title: Multimodal Task Interference: A Benchmark and Analysis of History-Target Mismatch in Multimodal LLMs

Title: Adaptive Decoding via Test-Time Policy Learning for Self-Improving Generation

Title: UT-ACA: Uncertainty-Triggered Adaptive Context Allocation for Long-Context Inference

Title: GAIN: A Benchmark for Goal-Aligned Decision-Making of Large Language Models under Imperfect Norms

Title: WASD: Locating Critical Neurons as Sufficient Conditions for Explaining and Controlling LLM Behavior

Title: The Truncation Blind Spot: How Decoding Strategies Systematically Exclude Human-Like Token Choices

Title: EntropyCache: Decoded Token Entropy Guided KV Caching for Diffusion Language Models

Title: When Names Change Verdicts: Intervention Consistency Reveals Systematic Bias in LLM Decision-Making

Title: Cross-Lingual LLM-Judge Transfer via Evaluation Decomposition

Title: ICE: Intervention-Consistent Explanation Evaluation with Statistical Grounding for LLMs

Title: Language Model Maps for Prompt-Response Distributions via Log-Likelihood Vectors

Title: Learning to Self-Evolve

Title: A Comparative Empirical Study of Catastrophic Forgetting Mitigation in Sequential Task Adaptation for Continual Natural Language Processing Systems

Title: Automatic detection of Gen-AI texts: A comparative framework of neural models

Title: Implicit Grading Bias in Large Language Models: How Writing Style Affects Automated Assessment Across Math, Programming, and Essay Tasks

Title: Mi:dm K 2.5 Pro

Title: Detecting Basic Values in A Noisy Russian Social Media Text Data: A Multi-Stage Classification Framework

Title: Evaluating LLM-Generated Lessons from the Language Learning Students' Perspective: A Short Case Study on Duolingo

Title: A Human-in/on-the-Loop Framework for Accessible Text Generation

Title: Progressive Training for Explainable Citation-Grounded Dialogue: Reducing Hallucination to Zero in English-Hindi LLMs

Title: Entropy trajectory shape predicts LLM reasoning reliability: A diagnostic study of uncertainty dynamics in chain-of-thought

Title: RADIUS: Ranking, Distribution, and Significance - A Comprehensive Alignment Suite for Survey Simulation

Title: Hypothesis-Conditioned Query Rewriting for Decision-Useful Retrieval

Title: What Really Controls Temporal Reasoning in Large Language Models: Tokenisation or Representation of Time?

Title: MoRI: Learning Motivation-Grounded Reasoning for Scientific Ideation in Large Language Models

Title: Parallelograms Strike Back: LLMs Generate Better Analogies than People

Title: A Dataset and Resources for Identifying Patient Health Literacy Information from Clinical Notes

Title: DaPT: A Dual-Path Framework for Multilingual Multi-hop Question Answering

Title: UGID: Unified Graph Isomorphism for Debiasing Large Language Models

Title: Optimal Splitting of Language Models from Mixtures to Specialized Domains

Title: VEPO: Variable Entropy Policy Optimization for Low-Resource Language Foundation Models

Title: Evaluating Counterfactual Strategic Reasoning in Large Language Models

Title: Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation

Title: F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World