2026-01-13

Title: TeleMem: Building Long-Term and Multimodal Memory for Agentic AI

Title: Operation Veja: Fixing Fundamental Concepts Missing from Modern Roleplaying Training Paradigms

Title: Reinforcement Learning for Chain of Thought Compression with One-Domain-to-All Generalization

Title: A Multi-Stage Workflow for the Review of Marketing Content with Reasoning Large Language Models

Title: AzeroS: Extending LLM to Speech with Self-Generated Instruction-Free Tuning

Title: Is Sanskrit the most token-efficient language? A quantitative study using GPT, Gemini, and SentencePiece

Title: Amory: Building Coherent Narrative-Driven Agent Memory through Agentic Reasoning

Title: How well can off-the-shelf LLMs elucidate molecular structures from mass spectra using chain-of-thought reasoning?

Title: AMEND++: Benchmarking Eligibility Criteria Amendments in Clinical Trials

Title: Why LoRA Fails to Forget: Regularized Low-Rank Adaptation Against Backdoors in Language Models

Title: A Rising Tide Lifts All Boats: MTQE Rewards for Idioms Improve General Translation Quality

Title: Annotating Dimensions of Social Perception in Text: The First Sentence-Level Dataset of Warmth and Competence

Title: On the Fallacy of Global Token Perplexity in Spoken Language Model Evaluation

Title: AfriqueLLM: How Data Mixing and Model Architecture Impact Continued Pre-training for African Languages

Title: MITRA: A Large-Scale Parallel Corpus and Multilingual Pretrained Language Model for Machine Translation and Semantic Retrieval for Pāli, Sanskrit, Buddhist Chinese, and Tibetan

Title: Steer Model beyond Assistant: Controlling System Prompt Strength via Contrastive Decoding

Title: Value of Information: A Framework for Human-Agent Communication

Title: Structured Episodic Event Memory

Title: Can a Unimodal Language Agent Provide Preferences to Tune a Multimodal Vision-Language Model?

Title: NC-Bench: An LLM Benchmark for Evaluating Conversational Competence

Title: Time Travel Engine: A Shared Latent Chronological Manifold Enables Historical Navigation in Large Language Models

Title: LitVISTA: A Benchmark for Narrative Orchestration in Literary Text

Title: PRISP: Privacy-Safe Few-Shot Personalization via Lightweight Adaptation

Title: IndRegBias: A Dataset for Studying Indian Regional Biases in English and Code-Mixed Social Media Comments

Title: Spec-o3: A Tool-Augmented Vision-Language Agent for Rare Celestial Object Candidate Vetting via Automated Spectral Inspection

Title: MedRAGChecker: Claim-Level Verification for Biomedical Retrieval-Augmented Generation

Title: Exposía: Academic Writing Assessment of Exposés and Peer Feedback

Title: SimLLM: Fine-Tuning Code LLMs for SimPy-Based Queueing System Simulation

Title: CSR-RAG: An Efficient Retrieval System for Text-to-SQL on the Enterprise Scale

Title: EVM-QuestBench: An Execution-Grounded Benchmark for Natural-Language Transaction Code Generation

Title: Are Emotions Arranged in a Circle? Geometric Analysis of Emotion Representations via Hyperspherical Contrastive Learning

Title: Stylistic Evolution and LLM Neutrality in Singlish Language

Title: Detecting LLM-Generated Text with Performance Guarantees

Title: How Context Shapes Truth: Geometric Transformations of Statement-level Truth Representations in LLMs

Title: Probing Multimodal Large Language Models on Cognitive Biases in Chinese Short-Video Misinformation

Title: N2N-GQA: Noise-to-Narrative for Graph-Based Table-Text Question Answering Using LLMs

Title: Pragya: An AI-Based Semantic Recommendation System for Sanskrit Subhasitas

Title: Labels have Human Values: Value Calibration of Subjective Tasks

Title: MedEinst: Benchmarking the Einstellung Effect in Medical LLMs through Counterfactual Differential Diagnosis

Title: Do Language Models Reason Across Languages?

Title: What makes for an enjoyable protagonist? An analysis of character warmth and competence

Title: InFi-Check: Interpretable and Fine-Grained Fact-Checking of LLMs

Title: Evaluating Cross-Lingual Unlearning in Multilingual Language Models

Title: IDRBench: Interactive Deep Research Benchmark

Title: Characterising Toxicity in Generative Large Language Models

Title: GRASP LoRA: GRPO Guided Adapter Sparsity Policy for Cross Lingual Transfer

Title: Evaluating Accounting Reasoning Capabilities of Large Language Models

Title: MTMCS-Bench: Evaluating Contextual Safety of Multimodal Large Language Models in Multi-Turn Dialogues

Title: GanitLLM: Difficulty-Aware Bengali Mathematical Reasoning through Curriculum-GRPO

Title: Multi-Stage Evolutionary Model Merging with Meta Data Driven Curriculum Learning for Sentiment-Specialized Large Language Modeling

Title: EpiCaR: Knowing What You Don't Know Matters for Better Reasoning in LLMs

Title: Garbage Attention in Large Language Models: BOS Sink Heads and Sink-aware Pruning

Title: CIRAG: Construction-Integration Retrieval and Adaptive Generation for Multi-hop Question Answering

Title: Forest Before Trees: Latent Superposition for Efficient Visual Reasoning

Title: AgentHallu: Benchmarking Automated Hallucination Attribution of LLM-based Agents

Title: PDR: A Plug-and-Play Positional Decay Framework for LLM Pre-training Data Detection

Title: Explainable Multimodal Aspect-Based Sentiment Analysis with Dependency-guided Large Language Model

Title: †DAGGER: Distractor-Aware Graph Generation for Executable Reasoning in Math Problems

Title: BiasLab: A Multilingual, Dual-Framing Framework for Robust Measurement of Output-Level Bias in Large Language Models

Title: Paraphrasing Adversarial Attack on LLM-as-a-Reviewer

Title: Distributional Clarity: The Hidden Driver of RL-Friendliness in Large Language Models

Title: TreePS-RAG: Tree-based Process Supervision for Reinforcement Learning in Agentic RAG

Title: X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests

Title: RealMem: Benchmarking LLMs in Real-World Memory-Driven Interaction

Title: Categorize Early, Integrate Late: Divergent Processing Strategies in Automatic Speech Recognition

Title: LLMs Can't Play Hangman: On the Necessity of a Private Working Memory for Language Agents

Title: MedTutor: A Retrieval-Augmented LLM System for Case-Based Medical Education

Title: TurkBench: A Benchmark for Evaluating Turkish Large Language Models

Title: Solar Open Technical Report

Title: Codified Foreshadowing-Payoff Text Generation

Title: Mid-Think: Training-Free Intermediate-Budget Reasoning via Token-Level Triggers

Title: When Abundance Conceals Weakness: Knowledge Conflict in Multilingual Models

Title: Engineering of Hallucination in Generative AI: It's not a Bug, it's a Feature

Title: Fine-Tuning vs. RAG for Multi-Hop Question Answering with Novel Knowledge

Title: The Need for a Socially-Grounded Persona Framework for User Simulation

Title: ReMIND: Orchestrating Modular Large Language Models for Controllable Serendipity: A REM-Inspired System Design for Emergent Creative Ideation

Title: Measuring Iterative Temporal Reasoning with TimePuzzles

Title: Can Large Language Models Understand, Reason About, and Generate Code-Switched Text?

Title: Structured Reasoning for Large Language Models

Title: Relink: Constructing Query-Driven Evidence Graph On-the-Fly for GraphRAG

Title: MI-PRUN: Optimize Large Language Model Pruning via Mutual Information

Title: The Roots of Performance Disparity in Multilingual Language Models: Intrinsic Modeling Difficulty or Design Choices?

Title: ActiShade: Activating Overshadowed Knowledge to Guide Multi-Hop Reasoning in Large Language Models

Title: The Confidence Dichotomy: Analyzing and Mitigating Miscalibration in Tool-Use Agents

Title: Document-Level Zero-Shot Relation Extraction with Entity Side Information

Title: Towards Comprehensive Semantic Speech Embeddings for Chinese Dialects

Title: ReasonTabQA: A Comprehensive Benchmark for Table Question Answering from Real World Industrial Scenarios

Title: PsyCLIENT: Client Simulation via Conversational Trajectory Modeling for Trainee Practice and Model Evaluation in Mental Health Counseling

Title: BayesRAG: Probabilistic Mutual Evidence Corroboration for Multimodal Retrieval-Augmented Generation

Title: Beyond Literal Mapping: Benchmarking and Improving Non-Literal Translation Evaluation

Title: DiffER: Diffusion Entity-Relation Modeling for Reversal Curse in Diffusion Large Language Models

Title: Controlled Self-Evolution for Algorithmic Code Optimization

Title: Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models

Title: TALON: Confidence-Aware Speculative Decoding with Adaptive Token Trees

Title: Semantic Compression of LLM Instructions via Symbolic Metalanguages

Title: Interpretable Text Classification Applied to the Detection of LLM-generated Creative Writing

Title: Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models

Title: GROKE: Vision-Free Navigation Instruction Evaluation via Graph Reasoning on OpenStreetMap

Title: Outcome-Grounded Advantage Reshaping for Fine-Grained Credit Assignment in Mathematical Reasoning

Title: Two Pathways to Truthfulness: On the Intrinsic Encoding of LLM Hallucinations

Title: KALE: Enhancing Knowledge Manipulation in Large Language Models via Knowledge-aware Learning

Title: Judging Against the Reference: Uncovering Knowledge-Driven Failures in LLM-Judges on QA Evaluation

Title: High-Rank Structured Modulation for Parameter-Efficient Fine-Tuning

Title: Controlling Multimodal Conversational Agents with Coverage-Enhanced Latent Actions

Title: Thinking Before Constraining: A Unified Decoding Framework for Large Language Models

Title: From RAG to Agentic RAG for Faithful Islamic Question Answering

Title: A Unified Framework for Emotion Recognition and Sentiment Analysis via Expert-Guided Multimodal Fusion with Large Language Models

Title: ES-Mem: Event Segmentation-Based Memory for Long-Term Dialogue Agents

Title: Proof of Time: A Benchmark for Evaluating Scientific Idea Judgments

Title: PlaM: Training-Free Plateau-Guided Model Merging for Better Visual Grounding in MLLMs

Title: Order in the Evaluation Court: A Critical Analysis of NLG Evaluation Trends

Title: Adaptive Layer Selection for Layer-Wise Token Pruning in LLM Inference

Title: Exploring the Meta-level Reasoning of Large Language Models via a Tool-based Multi-hop Tabular Question Answering Task

Title: Emotional Support Evaluation Framework via Controllable and Diverse Seeker Simulator

Title: Is Agentic RAG worth it? An experimental comparison of RAG approaches

Title: Structure First, Reason Next: Enhancing a Large Language Model using Knowledge Graph for Numerical Reasoning in Financial Documents

Title: Contrastive Learning with Narrative Twins for Modeling Story Salience

Title: Enhancing Self-Correction in Large Language Models through Multi-Perspective Reflection

Title: Beyond Single-Shot: Multi-step Tool Retrieval via Query Planning

Title: Kinship Data Benchmark for Multi-hop Reasoning

Title: Learning Through Dialogue: Unpacking the Dynamics of Human-LLM Conversations on Political Issues

Title: The Confidence Trap: Gender Bias and Predictive Certainty in LLMs

Title: Reference Games as a Testbed for the Alignment of Model Uncertainty and Clarification Requests