2025-09-05

Title: Speech-Based Cognitive Screening: A Systematic Evaluation of LLM Adaptation Strategies

Title: Enhancing Speech Large Language Models through Reinforced Behavior Alignment

Title: Multilevel Analysis of Cryptocurrency News using RAG Approach with Fine-Tuned Mistral Large Language Model

Title: The ProLiFIC dataset: Leveraging LLMs to Unveil the Italian Lawmaking Process

Title: Real-Time Detection of Hallucinated Entities in Long-Form Generation

Title: Topic Identification in LLM Input-Output Pairs through the Lens of Information Bottleneck

Title: AR²: Adversarial Reinforcement Learning for Abstract Reasoning in Large Language Models

Title: Improving Factuality in LLMs via Inference-Time Knowledge Graph Construction

Title: ResearchPulse: Building Method-Experiment Chains through Multi-Document Scientific Inference

Title: NoteBar: An AI-Assisted Note-Taking System for Personal Knowledge Management

Title: E-ARMOR: Edge case Assessment and Review of Multilingual Optical Character Recognition

Title: Breaking the Mirror: Activation-Based Mitigation of Self-Preference in LLM Evaluators

Title: Measuring How (Not Just Whether) VLMs Build Common Ground

Title: Align-then-Slide: A complete evaluation framework for Ultra-Long Document-Level Machine Translation

Title: Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Title: A Comprehensive Survey on Trustworthiness in Reasoning with Large Language Models

Title: False Sense of Security: Why Probing-based Malicious Input Detection Fails to Generalize

Title: MobileRAG: Enhancing Mobile Agent with Retrieval-Augmented Generation

Title: MTQA: Matrix of Thought for Enhanced Reasoning in Complex Question Answering

Title: Decoding the Poetic Language of Emotion in Korean Modern Poetry: Insights from a Human-Labeled Dataset and AI Modeling

Title: SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-Alignment

Title: SPFT-SQL: Enhancing Large Language Model for Text-to-SQL Parsing by Self-Play Fine-Tuning

Title: VoxRole: A Comprehensive Benchmark for Evaluating Speech-Based Role-Playing Agents

Title: CANDY: Benchmarking LLMs' Limitations and Assistive Potential in Chinese Misinformation Fact-Checking

Title: Exploring NLP Benchmarks in an Extremely Low-Resource Setting

Title: Expanding Foundational Language Capabilities in Open-Source LLMs through a Korean Case Study

Title: RTQA: Recursive Thinking for Complex Temporal Knowledge Graph Question Answering with Large Language Models

Title: On Robustness and Reliability of Benchmark-Based Evaluation of LLMs

Title: What if I ask in alia lingua? Measuring Functional Similarity Across Languages

Title: Synthesizing Sheet Music Problems for Evaluation and Reinforcement Learning

Title: Arabic Chatbot Technologies in Education: An Overview

Title: Improving Narrative Classification and Explanation via Fine Tuned Language Models

Title: Towards Stable and Personalised Profiles for Lexical Alignment in Spoken Human-Agent Dialogue

Title: MultiWikiQA: A Reading Comprehension Benchmark in 300+ Languages

Title: MAGneT: Coordinated Multi-Agent Generation of Synthetic Multi-Turn Mental Health Counseling Sessions

Title: Explicit and Implicit Data Augmentation for Social Event Detection

Title: Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions?

Title: Facts Fade Fast: Evaluating Memorization of Outdated Medical Knowledge in Large Language Models

Title: Measuring Bias or Measuring the Task: Understanding the Brittle Nature of LLM Gender Biases

Title: Can Language Models Handle a Non-Gregorian Calendar?