2026-03-05

Title: AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents

Title: One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

Title: From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

Title: SE-Search: Self-Evolving Search Agent via Memory and Dense Reward

Title: Fine-Tuning and Evaluating Conversational AI for Agricultural Advisory

Title: Language Model Goal Selection Differs from Humans' in an Open-Ended Task

Title: PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents

Title: TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement

Title: TATRA: Training-Free Instance-Adaptive Prompting Through Rephrasing and Aggregation

Title: How LLMs Cite and Why It Matters: A Cross-Model Audit of Reference Fabrication in AI-Assisted Academic Writing and Methods to Detect Phantom Citations

Title: Benchmarking Legal RAG: The Promise and Limits of AI Statutory Surveys

Title: From Exact Hits to Close Enough: Semantic Caching for LLM Embeddings

Title: Developing an AI Assistant for Knowledge Management and Workforce Training in State DOTs

Title: HumanLM: Simulating Users with State Alignment Beats Response Imitation

Title: Draft-Conditioned Constrained Decoding for Structured Generation in LLMs

Title: Token-Oriented Object Notation vs JSON: A Benchmark of Plain and Constrained Decoding Generation

Title: Old Habits Die Hard: How Conversational History Geometrically Traps LLMs

Title: Combating data scarcity in recommendation services: Integrating cognitive types of VARK and neural network technologies (LLM)

Title: Entropic-Time Inference: Self-Organizing Large Language Model Decoding Beyond Attention

Title: Escaping the BLEU Trap: A Signal-Grounded Framework with Decoupled Semantic Guidance for EEG-to-Text Decoding

Title: How does fine-tuning improve sensorimotor representations in large language models?

Title: Towards Self-Robust LLMs: Intrinsic Prompt Noise Resistance via CoIPO

Title: M-QUEST -- Meme Question-Understanding Evaluation on Semantics and Toxicity

Title: Retcon -- a Prompt-Based Technique for Precise Control of LLMs in Conversations

Title: Quantum-Inspired Self-Attention in a Large Language Model

Title: Automated Concept Discovery for LLM-as-a-Judge Preference Analysis

Title: From We to Me: Theory Informed Narrative Shift with Abductive Reasoning

Title: DIALEVAL: Automated Type-Theoretic Evaluation of LLM Instruction Following

Title: Can Large Language Models Derive New Knowledge? A Dynamic Benchmark for Biological Knowledge Discovery

Title: Discern Truth from Falsehood: Reducing Over-Refusal via Contrastive Refinement

Title: Controlling Chat Style in Language Models via Single-Direction Editing

Title: IntPro: A Proxy Agent for Context-Aware Intent Understanding via Retrieval-conditioned Inference

Title: Controllable and explainable personality sliders for LLMs at inference time

Title: StructLens: A Structural Lens for Language Models via Maximum Spanning Trees

Title: AutoHarness: improving LLM agents by automatically synthesizing a code harness

Title: Certainty robustness: Evaluating LLM stability under self-challenging prompts

Title: PulseLM: A Foundation Dataset and Benchmark for PPG-Text Learning

Title: Fragile Thoughts: How Large Language Models Handle Chain-of-Thought Perturbations

Title: Training-free Dropout Sampling for Semantic Token Acceptance in Speculative Decoding

Title: The CompMath-MCQ Dataset: Are LLMs Ready for Higher-Level Math?

Title: Compressed Sensing for Capability Localization in Large Language Models

Title: Prompt-Dependent Ranking of Large Language Models with Uncertainty Quantification

Title: Tracing Pharmacological Knowledge In Large Language Models

Title: Farther the Shift, Sparser the Representation: Analyzing OOD Mechanisms in LLMs

Title: Raising Bars, Not Parameters: LilMoo Compact Language Model for Hindi

Title: SafeCRS: Personalized Safety Alignment for LLM-Based Conversational Recommender Systems

Title: RAG-X: Systematic Diagnosis of Retrieval-Augmented Generation for Medical Question Answering

Title: Tucano 2 Cool: Better Open Source LLMs for Portuguese

Title: ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer

Title: Belief-Sim: Towards Belief-Driven Simulation of Demographic Misinformation Susceptibility

Title: A Neural Topic Method Using a Large-Language-Model-in-the-Loop for Business Research

Title: MIND: Unified Inquiry and Diagnosis RL with Criteria Grounded Clinical Supports for Psychiatric Consultation

Title: Order Is Not Layout: Order-to-Space Bias in Image Generation

Title: ErrorLLM: Modeling SQL Errors for Text-to-SQL Refinement

Title: Confidence-Calibrated Small-Large Language Model Collaboration for Cost-Efficient Reasoning

Title: T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning

Title: Benchmarking Motivational Interviewing Competence of Large Language Models

Title: Coupling Local Context and Global Semantic Prototypes via a Hierarchical Architecture for Rhetorical Roles Labeling

Title: Assessing the Effectiveness of LLMs in Delivering Cognitive Behavioral Therapy

Title: CzechTopic: A Benchmark for Zero-Shot Topic Localization in Historical Czech Documents

Title: Rethinking Role-Playing Evaluation: Anonymous Benchmarking and a Systematic Study of Personality Effects

Title: Who Judges the Judge? Evaluating LLM-as-a-Judge for French Medical open-ended QA

Title: Monitoring Emergent Reward Hacking During Generation via Internal Activations

Title: Hindsight Quality Prediction Experiments in Multi-Candidate Human-Post-Edited Machine Translation

Title: FINEST: Improving LLM Responses to Sensitive Topics Through Fine-Grained Evaluation

Title: Traces of Social Competence in Large Language Models

Title: Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit Quantization Methods for a Polish 11B Language Model

Title: When Do Language Models Endorse Limitations on Human Rights Principles?

Title: Retrieval or Representation? Reassessing Benchmark Gaps in Multilingual and Visually Rich RAG

Title: Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Title: Position: Vector Prompt Interfaces Should Be Exposed to Enable Customization of Large Language Models

Title: The Company You Keep: How LLMs Respond to Dark Triad Traits

Title: World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurrence Statistics in Static Word Embeddings

Title: AILS-NTUA at SemEval-2026 Task 12: Graph-Based Retrieval and Reflective Prompting for Abductive Event Reasoning

Title: AgentIR: Reasoning-Aware Retrieval for Deep Research Agents