2026-01-14

Title: EmbeddingRWKV: State-Centric Retrieval with Reusable States

Title: A Human-Centric Pipeline for Aligning Large Language Models with Chinese Medical Ethics

Title: Knowing But Not Doing: Convergent Morality and Divergent Action in LLMs

Title: Explaining Generalization of AI-Generated Text Detectors Through Linguistic Analysis

Title: Cross-Cultural Expert-Level Art Critique Evaluation with Vision-Language Models

Title: Multilingual, Multimodal Pipeline for Creating Authentic and Structured Fact-Checked Claim Dataset

Title: VULCA-Bench: A Multicultural Vision-Language Benchmark for Evaluating Cultural Understanding

Title: DYCP: Dynamic Context Pruning for Long-Form Dialogue with LLMs

Title: LLM Review: Enhancing Creative Writing via Blind Peer Review Feedback

Title: Reasoning Beyond Chain-of-Thought: A Latent Computational Mode in Large Language Models

Title: Universal computation is intrinsic to language model decoding

Title: Calibration Is Not Enough: Evaluating Confidence Estimation Under Language Variations

Title: AdaJudge: Adaptive Multi-Perspective Judging for Reward Modeling

Title: Query Suggestion for Retrieval-Augmented Generation via Dynamic In-Context Learning

Title: Debiasing Large Language Models via Adaptive Causal Prompting with Sketch-of-Thought

Title: Qalb: Largest State-of-the-Art Urdu Large Language Model for 230M Speakers with Systematic Continued Pre-training

Title: Mechanisms are Transferable: Data-Efficient Low-Resource Adaptation via Circuit-Targeted Supervised Fine-Tuning

Title: WISE-Flow: Workflow-Induced Structured Experience for Self-Evolving Conversational Service Agents

Title: SwiftMem: Fast Agentic Memory via Query-aware Indexing

Title: Relational Knowledge Distillation Using Fine-tuned Function Vectors

Title: Prompt-Based Clarity Evaluation and Topic Detection in Political Question Answering

Title: Evaluating Implicit Regulatory Compliance in LLM Tool Invocation via Logic-Guided Synthesis

Title: Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs

Title: Generation-Augmented Generation: A Plug-and-Play Framework for Private Knowledge Injection in Large Language Models

Title: Towards Principled Design of Mixture-of-Experts Language Models under Memory and Inference Constraints

Title: User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale

Title: Med-CoReasoner: Reducing Language Disparities in Medical Reasoning via Language-Informed Co-Reasoning

Title: Discovery and Reinforcement of Tool-Integrated Reasoning Chains via Rollout Trees

Title: D$^2$Plan: Dual-Agent Dynamic Global Planning for Complex Retrieval-Augmented Reasoning

Title: Enhancing Sentiment Classification and Irony Detection in Large Language Models through Advanced Prompt Engineering Techniques

Title: AgriAgent: Contract-Driven Planning and Capability-Aware Tool Orchestration in Real-World Agriculture

Title: CLaS-Bench: A Cross-Lingual Alignment and Steering Benchmark

Title: Detecting Mental Manipulation in Speech via Synthetic Multi-Speaker Dialogue

Title: PATS: Personality-Aware Teaching Strategies with Large Language Model Tutors

Title: Silence the Judge: Reinforcement Learning with Self-Verifier via Latent Geometric Clustering

Title: Fine-Mem: Fine-Grained Feedback Alignment for Long-Horizon Memory Management

Title: JudgeRLVR: Judge First, Generate Second for Efficient Reasoning

Title: sui-1: Grounded and Verifiable Long-Form Summarization

Title: Do You Understand How I Feel?: Towards Verified Empathy in Therapy Chatbots

Title: Surgical Refusal Ablation: Disentangling Safety from Intelligence via Concept-Guided Spectral Cleaning

Title: BenchOverflow: Measuring Overflow in Large Language Models via Plain-Text Prompts

Title: It's All About the Confidence: An Unsupervised Approach for Multilingual Historical Entity Linking using Large Language Models

Title: STAGE: A Benchmark for Knowledge Graph Construction, Question Answering, and In-Script Role-Playing over Movie Screenplays

Title: STAR: Detecting Inference-time Backdoors in LLM Reasoning via State-Transition Amplification Ratio

Title: DeepResearch Bench II: Diagnosing Deep Research Agents via Rubrics from Expert Report

Title: Ministral 3

Title: ExpSeek: Self-Triggered Experience Seeking for Web Agents

Title: GraphSearch: Agentic Search-Augmented Reasoning for Zero-Shot Graph Learning

Title: How Order-Sensitive Are LLMs? OrderProbe for Deterministic Structural Reconstruction

Title: Moral Lenses, Political Coordinates: Towards Ideological Positioning of Morally Conditioned LLMs

Title: RULERS: Locked Rubrics and Evidence-Anchored Scoring for Robust LLM Evaluation

Title: Analyzing Bias in False Refusal Behavior of Large Language Models for Hate Speech Detoxification

Title: Lessons from the Field: An Adaptable Lifecycle Approach to Applied Dialogue Summarization

Title: QuantEval: A Benchmark for Financial Quantitative Tasks in Large Language Models

Title: Nationality and Region Prediction from Names: A Comparative Study of Neural Models and Large Language Models

Title: RAGShaper: Eliciting Sophisticated Agentic RAG Skills via Automated Data Synthesis

Title: PrivGemo: Privacy-Preserving Dual-Tower Graph Retrieval for Empowering LLM Reasoning with Memory Augmentation

Title: From Rows to Reasoning: A Retrieval-Augmented Multimodal Framework for Spreadsheet Understanding

Title: Inferring Latent Intentions: Attributional Natural Language Inference in LLM Agents

Title: TableCache: Primary Foreign Key Guided KV Cache Precomputation for Low Latency Text-to-SQL

Title: To Retrieve or To Think? An Agentic Approach for Context Evolution

Title: Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge

Title: Modeling LLM Agent Reviewer Dynamics in Elo-Ranked Review System