2026-01-26

Title: ChiEngMixBench: Evaluating Large Language Models on Spontaneous and Natural Chinese-English Code-Mixed Generation

Title: M3Kang: Evaluating Multilingual Multimodal Mathematical Reasoning in Vision-Language Models

Title: Domain Specific Specialization in Low-Resource Settings: The Efficacy of Offline Response-Based Knowledge Distillation in Large Language Models

Title: Towards Latent Diffusion Suitable For Text

Title: Limits of n-gram Style Control for LLMs via Logit-Space Injection

Title: GameTalk: Training LLMs for Strategic Conversation

Title: Better as Generators Than Classifiers: Leveraging LLMs and Synthetic Data for Low-Resource Multilingual Classification

Title: Generating Literature-Driven Scientific Theories at Scale

Title: Teaching and Evaluating LLMs to Reason About Polymer Design Related Tasks

Title: Machine-Assisted Grading of Nationwide School-Leaving Essay Exams with LLMs and Statistical NLP

Title: Regional Bias in Large Language Models

Title: Identity, Cooperation and Framing Effects within Groups of Real and Simulated Humans

Title: PolyAgent: Large Language Model Agent for Polymer Design

Title: Cross-Lingual Activation Steering for Multilingual Language Models

Title: Clarify or Answer: Reinforcement Learning for Agentic VQA with Context Under-specification

Title: Jacobian Scopes: token-level causal attributions in LLMs

Title: Learning Domain Knowledge in Multimodal Large Language Models through Reinforcement Fine-Tuning

Title: Exploring the Effects of Alignment on Numerical Bias in Large Language Models

Title: Mixing Expert Knowledge: Bring Human Thoughts Back To the Game of Go

Title: Graph-Anchored Knowledge Indexing for Retrieval-Augmented Generation

Title: Persona Jailbreaking in Large Language Models

Title: DeepEra: A Deep Evidence Reranking Agent for Scientific Retrieval-Augmented Generated Question Answering

Title: TL-GRPO: Turn-Level RL for Reasoning-Guided Iterative Optimization

Title: Timely Machine: Awareness of Time Makes Test-Time Scaling Agentic

Title: MRAG: Benchmarking Retrieval-Augmented Generation for Bio-medicine

Title: LOGICAL-COMMONSENSEQA: A Benchmark for Logical Commonsense Reasoning

Title: Is Length Really A Liability? An Evaluation of Multi-turn LLM Conversations using BoolQ

Title: SearchLLM: Detecting LLM Paraphrased Text by Measuring the Similarity with Regeneration of the Candidate Source via Search Engine

Title: Curate-Train-Refine: A Closed-Loop Agentic Framework for Zero Shot Classification

Title: Retrieve-Refine-Calibrate: A Framework for Complex Claim Fact-Checking

Title: Attention-MoA: Enhancing Mixture-of-Agents via Inter-Agent Semantic Attention and Deep Residual Synthesis

Title: AuroraEdge-V-2B: A Faster And Stronger Edge Visual Large Language Model

Title: PROST-LLM: Progressively Enhancing the Speech-to-Speech Translation Capability in LLMs

Title: How Does Personalized Memory Shape LLM Behavior? Benchmarking Rational Preference Utilization in Personalized Assistants

Title: MultiLexNorm++: A Unified Benchmark and a Generative Model for Lexical Normalization for Asian Languages

Title: Typologically Informed Parameter Aggregation

Title: Select or Project? Evaluating Lower-dimensional Vectors for LLM Training Data Explanations

Title: PLawBench: A Rubric-Based Benchmark for Evaluating LLMs in Real-World Legal Practice

Title: EMemBench: Interactive Benchmarking of Episodic Memory for VLM Agents

Title: Better Generalizing to Unseen Concepts: An Evaluation Framework and An LLM-Based Auto-Labeled Pipeline for Biomedical Concept Recognition

Title: Standardizing Longitudinal Radiology Report Evaluation via Large Language Model Annotation

Title: Do LLM hallucination detectors suffer from low-resource effect?

Title: Persuasion Tokens for Editing Factual Knowledge in LLMs

Title: Large Language Models as Automatic Annotators and Annotation Adjudicators for Fine-Grained Opinion Analysis

Title: SoS: Analysis of Surface over Semantics in Multilingual Text-To-Image Generation

Title: Trapped in the past? Disentangling fluid and crystallized intelligence of large language models using chess

Title: LLM-Based Adversarial Persuasion Attacks on Fact-Checking Systems

Title: Strategies for Span Labeling with Large Language Models