2025-10-16

Title: Benchmarking Open-Source Large Language Models for Persian in Zero-Shot and Few-Shot Learning

Title: Cancer Diagnosis Categorization in Electronic Health Records Using Large Language Models and BioBERT: Model Performance Evaluation Study

Title: From Noise to Signal to Selbstzweck: Reframing Human Label Variation in the Era of Post-training in NLP

Title: MEDEQUALQA: Evaluating Biases in LLMs with Counterfactual Reasoning

Title: Classifier-Augmented Generation for Structured Workflow Prediction

Title: Scheming Ability in LLM-to-LLM Strategic Interactions

Title: Mathematics with large language models as provers and verifiers

Title: MTSQL-R1: Towards Long-Horizon Multi-Turn Text-to-SQL via Agentic Training

Title: Repurposing Annotation Guidelines to Instruct LLM Annotators: A Case Study

Title: A²FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning

Title: FaStFACT: Faster, Stronger Long-Form Factuality Evaluations in LLMs

Title: VLURes: Benchmarking VLM Visual and Linguistic Understanding in Low-Resource Languages

Title: EduDial: Constructing a Large-scale Multi-turn Teacher-Student Dialogue Corpus

Title: Who's Asking? Evaluating LLM Robustness to Inquiry Personas in Factual Question Answering

Title: The Curious Case of Curiosity across Human Cultures and LLMs

Title: 3-Model Speculative Decoding

Title: A Multilingual, Large-Scale Study of the Interplay between LLM Safeguards, Personalisation, and Disinformation

Title: OPLoRA: Orthogonal Projection LoRA Prevents Catastrophic Forgetting during Parameter-Efficient Fine-Tuning

Title: CurLL: A Developmental Framework to Evaluate Continual Learning in Language Models

Title: On the Role of Preference Variance in Preference Optimization

Title: GatePro: Parameter-Free Expert Selection Optimization for Mixture-of-Experts Models

Title: ESI: Epistemic Uncertainty Quantification via Semantic-preserving Intervention for Large Language Models

Title: Multi-Label Clinical Text Eligibility Classification and Summarization System

Title: Stable LLM Ensemble: Interaction between Example Representativeness and Diversity

Title: I Am Aligned, But With Whom? MENA Values Benchmark for Evaluating Cultural Alignment and Multilingual Bias in LLMs

Title: Mirror Speculative Decoding: Breaking the Serial Barrier in LLM Inference

Title: A Matter of Representation: Towards Graph-Based Abstract Code Generation

Title: CoT-Evo: Evolutionary Distillation of Chain-of-Thought for Scientific Reasoning

Title: Putting on the Thinking Hats: A Survey on Chain of Thought Fine-tuning from the Perspective of Human Reasoning Mechanism

Title: DSCD: Large Language Model Detoxification with Self-Constrained Decoding

Title: SHIELD: Classifier-Guided Prompting for Robust and Safer LVLMs

Title: Grounding Long-Context Reasoning with Contextual Normalization for Retrieval-Augmented Generation

Title: StressTransfer: Stress-Aware Speech-to-Speech Translation with Emphasis Preservation

Title: Text Anomaly Detection with Simplified Isolation Kernel

Title: LLM-Guided Synthetic Augmentation (LGSA) for Mitigating Bias in AI Systems

Title: Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain

Title: Do You Get the Hint? Benchmarking LLMs on the Board Game Concept

Title: Beyond Correctness: Rewarding Faithful Reasoning in Retrieval-Augmented Generation

Title: In-Distribution Steering: Balancing Control and Coherence in Language Model Generation

Title: Higher Satisfaction, Lower Cost: A Technical Report on How LLMs Revolutionize Meituan's Intelligent Interaction Systems

Title: Mismatch Aware Guidance for Robust Emotion Control in Auto-Regressive TTS Models

Title: LLM one-shot style transfer for Authorship Attribution and Verification

Title: ChatR1: Reinforcement Learning for Conversational Reasoning and Retrieval Augmented Question Answering

Title: Embedding-Based Context-Aware Reranker

Title: Taming the Fragility of KV Cache Eviction in LLM Inference

Title: Are Proverbs the New Pythian Oracles? Exploring Sentiment in Greek Sayings

Title: Protect: Towards Robust Guardrailing Stack for Trustworthy Enterprise LLM Systems

Title: D-SMART: Enhancing LLM Dialogue Consistency via Dynamic Structured Memory And Reasoning Tree

Title: Document Intelligence in the Era of Large Language Models: A Survey

Title: Make an Offer They Can't Refuse: Grounding Bayesian Persuasion in Real-World Dialogues without Pre-Commitment

Title: Doing Things with Words: Rethinking Theory of Mind Simulation in Large Language Models

Title: Evaluating Arabic Large Language Models: A Survey of Benchmarks, Methods, and Gaps

Title: Beyond Single-Reward: Multi-Pair, Multi-Perspective Preference Optimization for Machine Translation

Title: LiteraryQA: Towards Effective Evaluation of Long-document Narrative QA

Title: ConsintBench: Evaluating Language Models on Real-World Consumer Intent Understanding

Title: MedREK: Retrieval-Based Editing for Medical LLMs with Key-Aware Prompts

Title: Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization

Title: Sparse Subnetwork Enhancement for Underrepresented Languages in Large Language Models

Title: Deflanderization for Game Dialogue: Balancing Character Authenticity with Task Execution in LLM-based NPCs

Title: FreshTab: Sourcing Fresh Data for Table-to-Text Generation Evaluation

Title: NOSA: Native and Offloadable Sparse Attention

Title: MemoTime: Memory-Augmented Temporal Knowledge Graph Enhanced Large Language Model Reasoning

Title: Unlocking Public Catalogues: Instruction-Tuning LLMs for ICD Coding of German Tumor Diagnoses

Title: Closing the Gap Between Text and Speech Understanding in LLMs

Title: How Sampling Affects the Detectability of Machine-written texts: A Comprehensive Study

Title: GAPS: A Clinically Grounded, Automated Benchmark for Evaluating AI Clinicians

Title: Assessing Web Search Credibility and Response Groundedness in Chat Assistants

Title: Confidence-Based Response Abstinence: Improving LLM Trustworthiness via Activation-Based Uncertainty Estimation

Title: The Mechanistic Emergence of Symbol Grounding in Language Models

Title: Breadcrumbs Reasoning: Memory-Efficient Reasoning with Compression Beacons

Title: BRIEF-Pro: Universal Context Compression with Short-to-Long Synthesis for Fast and Accurate Multi-Hop Reasoning