2025-09-23

Title: On LLM-Based Scientific Inductive Reasoning Beyond Equations

Title: Gender and Political Bias in Large Language Models: A Demonstration Platform

Title: Language Modeling with Learned Meta-Tokens

Title: Overhearing LLM Agents: A Survey, Taxonomy, and Roadmap

Title: HARE: an entity and relation centric evaluation framework for histopathology reports

Title: RephQA: Evaluating Readability of Large Language Models in Public Health Question Answering

Title: Whisper-UT: A Unified Translation Framework for Speech and Text

Title: Evaluating Behavioral Alignment in Conflict Dialogue: A Multi-Dimensional Comparison of LLM Agents and Humans

Title: 'Rich Dad, Poor Lad': How do Large Language Models Contextualize Socioeconomic Factors in College Admission?

Title: Pico: A Modular Framework for Hypothesis-Driven Small Language Model Research

Title: Evaluating CxG Generalisation in LLMs via Construction-Based NLI Fine Tuning

Title: Implicit Behavioral Alignment of Language Agents in High-Stakes Crowd Simulations

Title: Intrinsic Meets Extrinsic Fairness: Assessing the Downstream Impact of Bias Mitigation in Large Language Models

Title: Computational Analysis of Conversation Dynamics through Participant Responsivity

Title: The Oracle Has Spoken: A Multi-Aspect Evaluation of Dialogue in Pythia

Title: Can an Individual Manipulate the Collective Decisions of Multi-Agents?

Title: AIPsychoBench: Understanding the Psychometric Differences between LLMs and Humans

Title: Challenging the Evaluator: LLM Sycophancy Under User Rebuttal

Title: InteGround: On the Evaluation of Verification and Retrieval Planning in Integrative Grounding

Title: ChemOrch: Empowering LLMs with Chemical Intelligence via Synthetic Instructions

Title: Rethinking the Role of Text Complexity in Language Model Pretraining

Title: MPCG: Multi-Round Persona-Conditioned Generation for Modeling the Evolution of Misinformation with LLMs

Title: From Scores to Steps: Diagnosing and Improving LLM Performance in Evidence-Based Medical Calculations

Title: Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data

Title: From Uniform to Heterogeneous: Tailoring Policy Optimization to Every Token's Nature

Title: Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels

Title: MCP: A Control-Theoretic Orchestration Framework for Synergistic Efficiency and Interpretability in Multimodal Large Language Models

Title: PruneCD: Contrasting Pruned Self Model to Improve Decoding Factuality

Title: LLMsPark: A Benchmark for Evaluating Large Language Models in Strategic Gaming Contexts

Title: Redefining Experts: Interpretable Decomposition of Language Models for Toxicity Mitigation

Title: Robust Native Language Identification through Agentic Decomposition

Title: Reinforcement Learning Meets Large Language Models: A Survey of Advancements and Applications Across the LLM Lifecycle

Title: EG-MLA: Embedding-Gated Multi-head Latent Attention for Scalable and Efficient LLMs

Title: Decoding Uncertainty: The Impact of Decoding Strategies for Uncertainty Estimation in Large Language Models

Title: OPEN-THEATRE: An Open-Source Toolkit for LLM-based Interactive Drama

Title: Semi-Supervised Synthetic Data Generation with Fine-Grained Relevance Control for Short Video Search Relevance Modeling

Title: Time to Revisit Exact Match

Title: A Multi-Level Benchmark for Causal Language Understanding in Social Media Discourse

Title: The Sound of Syntax: Finetuning and Comprehensive Evaluation of Language Models for Speech Pathology

Title: Domain-Adaptive Pre-Training for Arabic Aspect-Based Sentiment Analysis: A Comparative Study of Domain Adaptation and Fine-Tuning Strategies

Title: KuBERT: Central Kurdish BERT Model and Its Application for Sentiment Analysis

Title: Cognitive Linguistic Identity Fusion Score (CLIFS): A Scalable Cognition-Informed Approach to Quantifying Identity Fusion from Text

Title: Can GRPO Boost Complex Multimodal Table Understanding?

Title: CLaC at DISRPT 2025: Hierarchical Adapters for Cross-Framework Multi-lingual Discourse Relation Classification

Title: CUTE: A Multilingual Dataset for Enhancing Cross-Lingual Knowledge Transfer in Low-Resource Languages

Title: K-DeCore: Facilitating Knowledge Transfer in Continual Structured Knowledge Reasoning via Knowledge Decoupling

Title: AirQA: A Comprehensive QA Dataset for AI Research with Instance-Level Evaluation

Title: Preference Distillation via Value based Reinforcement Learning

Title: Advancing Speech Understanding in Speech-Aware Language Models with GRPO

Title: The Transfer Neurons Hypothesis: An Underlying Mechanism for Language Latent Space Transitions in Multilingual LLMs

Title: Modeling Bottom-up Information Quality during Language Processing

Title: TactfulToM: Do LLMs Have the Theory of Mind Ability to Understand White Lies?

Title: SFT-TA: Supervised Fine-Tuned Agents in Multi-Agent LLMs for Automated Inductive Thematic Analysis

Title: FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions

Title: Attention Consistency for LLMs Explanation

Title: LifeAlign: Lifelong Alignment for Large Language Models with Memory-Augmented Focalized Preference Optimization

Title: Evolution of Concepts in Language Model Pre-Training

Title: Prompt-Based Simplification for Plain Language using Spanish Language Models

Title: Extending Automatic Machine Translation Evaluation to Book-Length Documents

Title: Probabilistic Token Alignment for Large Language Model Fusion

Title: Automated Knowledge Graph Construction using Large Language Models and Sentence Complexity Modelling

Title: Multi-View Attention Multiple-Instance Learning Enhanced by LLM Reasoning for Cognitive Distortion Detection

Title: Scaling, Simplification, and Adaptation: Lessons from Pretraining on Machine-Translated Text

Title: AIMMerging: Adaptive Iterative Model Merging Using Training Trajectories for Language Model Continual Learning

Title: Scale-free Characteristics of Multilingual Legal Texts and the Limitations of LLMs

Title: Robustness of Neurosymbolic Reasoners on First-Order Logic Problems

Title: FinDebate: Multi-Agent Collaborative Intelligence for Financial Analysis

Title: EpiCache: Episodic KV Cache Management for Long Conversational Question Answering

Title: DIWALI - Diversity and Inclusivity aWare cuLture specific Items for India: Dataset and Assessment of LLMs for Cultural Text Adaptation in Indian Context

Title: Vision Language Models Are Not (Yet) Spelling Correctors

Title: RealBench: A Chinese Multi-image Understanding Benchmark Close to Real-world Scenarios

Title: QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models

Title: MedFact: A Large-scale Chinese Dataset for Evidence-based Medical Fact-checking of LLM Responses

Title: GeoPQA: Bridging the Visual Perception Gap in MLLMs for Geometric Reasoning

Title: Filling in the Clinical Gaps in Benchmark: Case for HealthBench for the Japanese medical system

Title: Semantic Reformulation Entropy for Robust Hallucination Detection in QA Tasks

Title: SLAyiNG: Towards Queer Language Processing

Title: PRINCIPLES: Synthetic Strategy Memory for Proactive Dialogue Agents

Title: Diagnosing Model Editing via Knowledge Spectrum

Title: AttnComp: Attention-Guided Adaptive Context Compression for Retrieval-Augmented Generation

Title: MapCoder-Lite: Squeezing Multi-Agent Coding into a Single Small LLM

Title: Enhancing Cross-Lingual Transfer through Reversible Transliteration: A Huffman-Based Approach for Low-Resource Languages

Title: CorefInst: Leveraging LLMs for Multilingual Coreference Resolution

Title: Can LLMs Reason Over Non-Text Modalities in a Training-Free Manner? A Case Study with In-Context Representation Learning

Title: Specification-Aware Machine Translation and Evaluation for Purpose Alignment

Title: Asking a Language Model for Diverse Responses

Title: MSCoRe: A Benchmark for Multi-Stage Collaborative Reasoning in LLM Agents

Title: AuditoryBench++: Can Language Models Understand Auditory Knowledge without Hearing?

Title: PG-CE: A Progressive Generation Dataset with Constraint Enhancement for Controllable Text Generation

Title: Turk-LettuceDetect: A Hallucination Detection Models for Turkish RAG Applications

Title: When TableQA Meets Noise: A Dual Denoising Framework for Complex Questions and Large-scale Tables

Title: Evaluating LLM-Generated Versus Human-Authored Responses in Role-Play Dialogues

Title: Investigating Bias: A Multilingual Pipeline for Generating, Solving, and Evaluating Math Problems with LLMs

Title: Breaking Token Into Concepts: Exploring Extreme Compression in Token Representation Via Compositional Shared Semantics

Title: Qwen3-Omni Technical Report

Title: A State-Update Prompting Strategy for Efficient and Robust Multi-turn Dialogue

Title: One Agent to Serve All: a Lite-Adaptive Stylized AI Assistant for Millions of Multi-Style Official Accounts

Title: Learning to vary: Teaching LMs to reproduce human linguistic variability in next-word prediction

Title: Findings of the Fourth Shared Task on Multilingual Coreference Resolution: Can LLMs Dethrone Traditional Approaches?

Title: Make Every Letter Count: Building Dialect Variation Dictionaries from Monolingual Corpora

Title: CorPipe at CRAC 2025: Evaluating Multilingual Encoders for Multilingual Coreference Resolution

Title: How Persuasive is Your Context?

Title: Training-free Truthfulness Detection via Value Vectors in LLMs

Title: D-REX: A Benchmark for Detecting Deceptive Reasoning in Large Language Models

Title: HICode: Hierarchical Inductive Coding with LLMs

Title: Variation in Verification: Understanding Verification Dynamics in Large Language Models

Title: RadEval: A framework for radiology text evaluation

Title: The PIMMUR Principles: Ensuring Validity in Collective Behavior of LLM Societies

Title: ARK-V1: An LLM-Agent for Knowledge Graph Question Answering Requiring Commonsense Reasoning

Title: SEQR: Secure and Efficient QR-based LoRA Routing