2026-01-15

Title: DeliberationBench: When Do More Voices Hurt? A Controlled Study of Multi-LLM Deliberation Protocols

Title: From Adversarial Poetry to Adversarial Tales: An Interpretability Research Agenda

Title: Companion Agents: A Table-Information Mining Paradigm for Text-to-SQL

Title: Recursive Knowledge Synthesis for Multi-LLM Systems: Stability Analysis and Tri-Agent Audit Framework

Title: Consistency-Aware Editing for Entity-level Unlearning in Language Models

Title: Resisting Correction: How RLHF Makes Language Models Ignore External Safety Signals in Natural Conversation

Title: Rubric-Conditioned LLM Grading: Alignment, Uncertainty, and Robustness

Title: Emissions and Performance Trade-off Between Small and Large Language Models

Title: Directional Attractors in LLM Reasoning: How Similarity Retrieval Steers Iterative Summarization Based Reasoning

Title: Scalable and Reliable Evaluation of AI Knowledge Retrieval Systems: RIKER and the Coherent Simulated Universe

Title: PediaMind-R1: A Temperament-Aware Language Model for Personalized Early Childhood Care Reasoning via Cognitive Modeling and Preference Alignment

Title: Gaming the Answer Matcher: Examining the Impact of Text Manipulation on Automated Judgment

Title: NewsScope: Schema-Grounded Cross-Domain News Claim Extraction with Open Models

Title: Evaluating Role-Consistency in LLMs for Counselor Training

Title: Imagine-then-Plan: Agent Learning from Adaptive Lookahead with World Models

Title: Entropy Sentinel: Continuous LLM Accuracy Monitoring from Decoding Entropy Traces in STEM

Title: Multicultural Spyfall: Assessing LLMs through Dynamic Multilingual Social Deduction Game

Title: OpenDecoder: Open Large Language Model Decoding to Incorporate Document Quality in RAG

Title: SpectraQuery: A Hybrid Retrieval-Augmented Conversational Assistant for Battery Science

Title: Can LLMs interpret figurative language as humans do?: surface-level vs representational similarity

Title: Is Grokking Worthwhile? Functional Analysis and Transferability of Generalization Circuits in Transformers

Title: Efficient Multilingual Dialogue Processing via Translation Pipelines and Distilled Language Models

Title: Mi:dm 2.0 Korea-centric Bilingual Language Models

Title: From Symbolic to Natural-Language Relations: Rethinking Knowledge Graph Construction in the Era of Large Language Models

Title: How Many Human Judgments Are Enough? Feasibility Limits of Human Preference Evaluation

Title: SubTokenTest: A Practical Benchmark for Real-World Sub-token Understanding

Title: Contrastive Bi-Encoder Models for Multi-Label Skill Extraction: Enhancing ESCO Ontology Matching with BERT and Attention Mechanisms

Title: Adaptive Multi-Stage Patent Claim Generation with Unified Quality Assessment

Title: Identity-Robust Language Model Generation via Content Integrity Preservation

Title: OrthoGeoLoRA: Geometric Parameter-Efficient Fine-Tuning for Structured Social Science Concept Retrieval on the Web

Title: ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection

Title: A.X K1 Technical Report

Title: UserLM-R1: Modeling Human Reasoning in User Language Models with Multi-Reward Reinforcement Learning

Title: When to Trust: A Causality-Aware Calibration Framework for Accurate Knowledge Graph Retrieval-Augmented Generation

Title: TeachPro: Multi-Label Qualitative Teaching Evaluation via Cross-View Graph Synergy and Semantic Anchored Evidence Encoding

Title: When to Invoke: Refining LLM Fairness with Toxicity Assessment

Title: MCGA: A Multi-task Classical Chinese Literary Genre Audio Corpus

Title: ReGraM: Region-First Knowledge Graph Reasoning for Medical Question Answering

Title: Understanding or Memorizing? A Case Study of German Definite Articles in Language Models

Title: Improving Implicit Hate Speech Detection via a Community-Driven Multi-Agent Framework

Title: Frame of Reference: Addressing the Challenges of Common Ground Representation in Situational Dialogs

Title: Relation Extraction Capabilities of LLMs on Clinical Text: A Bilingual Evaluation for English and Turkish

Title: The Imperfective Paradox in Large Language Models

Title: Ability Transfer and Recovery via Modularized Parameters Localization

Title: Structured Knowledge Representation through Contextual Pages for Retrieval-Augmented Generation

Title: Bias Dynamics in BabyLMs: Towards a Compute-Efficient Sandbox for Democratising Pre-Training Debiasing

Title: Where Knowledge Collides: A Mechanistic Study of Intra-Memory Knowledge Conflict in Language Models

Title: Improving Symbolic Translation of Language Models for Logical Reasoning

Title: SlidesGen-Bench: Evaluating Slides Generation via Computational and Quantitative Metrics

Title: SERM: Self-Evolving Relevance Model with Agent-Driven Learning from Massive Query Streams

Title: Benchmarking Post-Training Quantization of Large Language Models under Microscaling Floating Point Formats

Title: Dialogue Telemetry: Turn-Level Instrumentation for Autonomous Information Gathering

Title: DPWriter: Reinforcement Learning with Diverse Planning Branching for Creative Writing

Title: LLMs Got Rhythm? Hybrid Phonological Filtering for Greek Poetry Rhyme Detection and Generation

Title: DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation

Title: Routing with Generated Data: Annotation-Free LLM Skill Estimation and Expert Selection

Title: LLMs can Compress LLMs: Adaptive Pruning by Agents

Title: Empathy Applicability Modeling for General Health Queries

Title: Value-Aware Numerical Representations for Transformer Language Models