2026-01-09

Title: MedPI: Evaluating AI Systems in Medical Patient-facing Interactions

Title: RAGVUE: A Diagnostic View for Explainable and Automated Evaluation of Retrieval-Augmented Generation

Title: Automatic Construction of Chinese Verb Collostruction Database

Title: Attribute-Aware Controlled Product Generation with LLMs for E-commerce

Title: Collective Narrative Grounding: Community-Coordinated Data Contributions to Improve Local AI Systems

Title: TeleTables: A Benchmark for Large Language Models in Telecom Table Interpretation

Title: FronTalk: Benchmarking Front-End Development as Conversational Code Generation with Multi-Modal Feedback

Title: STDD: Spatio-Temporal Dynamics-Driven Token Refinement in Diffusion Language Models

Title: Enhancing Admission Inquiry Responses with Fine-Tuned Models and Retrieval-Augmented Generation

Title: Ideology as a Problem: Lightweight Logit Steering for Annotator-Specific Alignment in Social Media Analysis

Title: LLMs for Explainable Business Decision-Making: A Reinforcement Learning Fine-Tuning Approach

Title: Leveraging Language Models and RAG for Efficient Knowledge Discovery in Clinical Environments

Title: Complexity Agnostic Recursive Decomposition of Thoughts

Title: TrueBrief: Faithful Summarization through Small Language Models

Title: AnimatedLLM: Explaining LLMs with Interactive Visualizations

Title: From Domains to Instances: Dual-Granularity Data Synthesis for LLM Unlearning

Title: RIGOURATE: Quantifying Scientific Exaggeration with Evidence-Aligned Claim Evaluation

Title: Disco-RAG: Discourse-Aware Retrieval-Augmented Generation

Title: MiJaBench: Revealing Minority Biases in Large Language Models via Hate Speech Jailbreaking

Title: ARREST: Adversarial Resilient Regulation Enhancing Safety and Truth in Large Language Models

Title: Gavel: Agent Meets Checklist for Evaluating LLMs on Long-Context Legal Summarization

Title: Accommodation and Epistemic Vigilance: A Pragmatic Account of Why LLMs Fail to Challenge Harmful Beliefs

Title: Learning to Simulate Human Dialogue

Title: Merging Triggers, Breaking Backdoors: Defensive Poisoning for Instruction-Tuned Language Models

Title: Beyond Static Summarization: Proactive Memory Extraction for LLM Agents

Title: Concept Tokens: Learning Behavioral Embeddings Through Concept Definitions

Title: SampoNLP: A Self-Referential Toolkit for Morphological Analysis of Subword Tokenizers

Title: WESR: Scaling and Evaluating Word-level Event-Speech Recognition

Title: LinguaGame: A Linguistically Grounded Game-Theoretic Paradigm for Multi-Agent Dialogue Generation

Title: GRACE: Reinforcement Learning for Grounded Response and Abstention under Contextual Evidence

Title: BanglaLorica: Design and Evaluation of a Robust Watermarking Algorithm for Large Language Models in Bangla Text Generation

Title: Identifying Good and Bad Neurons for Task-Level Controllable LLMs

Title: FeedEval: Pedagogically Aligned Evaluation of LLM-Generated Essay Feedback

Title: Aligning Text, Code, and Vision: A Multi-Objective Reinforcement Learning Framework for Text-to-Visualization

Title: THaLLE-ThaiLLM: Domain-Specialized Small LLMs for Finance and Thai -- Technical Report

Title: When More Words Say Less: Decoupling Length and Specificity in Image Description Evaluation

Title: Character-R1: Enhancing Role-Aware Reasoning in Role-Playing Agents via RLVR

Title: From National Curricula to Cultural Awareness: Constructing Open-Ended Culture-Specific Question Answering Dataset

Title: MAGA-Bench: Machine-Augment-Generated Text via Alignment Detection Benchmark

Title: SpeechMedAssist: Efficiently and Effectively Adapting Speech Language Models for Medical Consultation

Title: CRANE: Causal Relevance Analysis of Language-Specific Neurons in Multilingual Large Language Models

Title: ToolGate: Contract-Grounded and Verified Tool Execution for LLMs

Title: See, Explain, and Intervene: A Few-Shot Multimodal Agent Framework for Hateful Meme Moderation

Title: Thunder-KoNUBench: A Corpus-Aligned Benchmark for Korean Negation Understanding

Title: PRISM: A Unified Framework for Post-Training LLMs Without Verifiable Rewards

Title: Prior-Informed Zeroth-Order Optimization with Adaptive Direction Alignment for Memory-Efficient LLM Fine-Tuning

Title: DSC2025 -- ViHallu Challenge: Detecting Hallucination in Vietnamese LLMs

Title: Fame Fades, Nature Remains: Disentangling the Character Identity of Role-Playing Agents

Title: Automatic Classifiers Underdetect Emotions Expressed by Men

Title: AM$^3$Safety: Towards Data Efficient Alignment of Multi-modal Multi-turn Safety for MLLMs

Title: RiskAtlas: Exposing Domain-Specific Risks in LLMs through Knowledge-Graph-Guided Harmful Prompt Generation

Title: Tool-MAD: A Multi-Agent Debate Framework for Fact Verification with Diverse Tool Augmentation and Adaptive Retrieval

Title: PILOT-Bench: A Benchmark for Legal Reasoning in the Patent Domain with IRAC-Aligned Classification Tasks

Title: Differential syntactic and semantic encoding in LLMs

Title: Revisiting Judge Decoding from First Principles via Training-Free Distributional Divergence

Title: NC2C: Automated Convexification of Generic Non-Convex Optimization Problems

Title: Belief in Authority: Impact of Authority in Multi-Agent Evaluation Framework

Title: RAAR: Retrieval Augmented Agentic Reasoning for Cross-Domain Misinformation Detection

Title: Token Maturation: Autoregressive Language Generation via Continuous Token Dynamics

Title: MisSpans: Fine-Grained False Span Identification in Cross-Domain Fake News

Title: A Navigational Approach for Comprehensive RAG via Traversal over Proposition Graphs

Title: EvolSQL: Structure-Aware Evolution for Scalable Text-to-SQL Data Synthesis

Title: Mind2Report: A Cognitive Deep Research Agent for Expert-Level Commercial Report Synthesis

Title: CuMA: Aligning LLMs with Sparse Cultural Values via Demographic-Aware Mixture of Adapters

Title: Faithful Summarisation under Disagreement via Belief-Level Aggregation

Title: V-FAT: Benchmarking Visual Fidelity Against Text-bias

Title: Can AI-Generated Persuasion Be Detected? Persuaficial Benchmark and AI vs. Human Linguistic Differences

Title: GenProve: Learning to Generate Text with Fine-Grained Provenance

Title: A Unified Spoken Language Model with Injected Emotional-Attribution Thinking for Human-like Interaction

Title: Text as a Universal Interface for Transferable Personalization

Title: Learning from Mistakes: Negative Reasoning Samples Enhance Out-of-Domain Generalization

Title: Can Large Language Models Resolve Semantic Discrepancy in Self-Destructive Subcultures? Evidence from Jirai Kei

Title: Hán Dān Xué Bù (Mimicry) or Qīng Chū Yú Lán (Mastery)? A Cognitive Perspective on Reasoning Distillation in Large Language Models

Title: ArcAligner: Adaptive Recursive Aligner for Compressed Context Embeddings in RAG

Title: Compositional Steering of Large Language Models with Steering Tokens

Title: SemPA: Improving Sentence Embeddings of Large Language Models through Semantic Preference Alignment

Title: How Human is AI? Examining the Impact of Emotional Prompts on Artificial and Human Responsiveness

Title: Agent-as-a-Judge

Title: DocDancer: Towards Agentic Document-Grounded Information Seeking

Title: RelayLLM: Efficient Reasoning via Collaborative Decoding

Title: Reverse-engineering NLI: A study of the meta-inferential properties of Natural Language Inference

Title: Inside Out: Evolving User-Centric Core Memory Trees for Long-Term Personalized Dialogue Systems

Title: LELA: an LLM-based Entity Linking Approach with Zero-Shot Domain Adaptation

Title: Measuring and Fostering Peace through Machine Learning and Artificial Intelligence

Title: GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization