2025-09-08

Title: INSEva: A Comprehensive Chinese Benchmark for Large Language Models in Insurance

Title: Mentalic Net: Development of RAG-based Conversational AI and Evaluation Framework for Mental Health Support

Title: Do MLLMs Really Understand the Charts?

Title: Predicting Failures of LLMs to Link Biomedical Ontology Terms to Identifiers: Evidence Across Models and Ontologies

Title: Uncertainty-Aware Collaborative System of Large and Small Models for Multimodal Sentiment Analysis

Title: CoCoNUTS: Concentrating on Content while Neglecting Uninformative Textual Styles for AI-Generated Peer Review Detection

Title: From Post To Personality: Harnessing LLMs for MBTI Prediction in Social Media

Title: Benchmarking GPT-5 for biomedical natural language processing

Title: Can Multiple Responses from an LLM Reveal the Sources of Its Uncertainty?

Title: Emotionally-Aware Agents for Dispute Resolution

Title: Just-in-time and distributed task representations in language models

Title: Enhancing LLM Efficiency: Targeted Pruning for Prefill-Decode Disaggregation in Inference

Title: Evaluating Large Language Models for Financial Reasoning: A CFA-Based Benchmark Study

Title: Multi-Modal Vision vs. Text-Based Parsing: Benchmarking LLM Strategies for Invoice Processing

Title: COCORELI: Cooperative, Compositional Reconstitution & Execution of Language Instructions

Title: MOSAIC: A Multilingual, Taxonomy-Agnostic, and Computationally Efficient Approach for Radiological Report Classification

Title: RECAP: REwriting Conversations for Intent Understanding in Agentic Planning

Title: SpeechLLM: Unified Speech and Language Model for Enhanced Multi-Task Understanding in Low Resource Settings

Title: Scaling Up, Speeding Up: A Benchmark of Speculative Decoding for Efficient LLM Test-Time Scaling

Title: ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute

Title: Training Text-to-Molecule Models with Context-Aware Tokenization

Title: No Clustering, No Routing: How Transformers Actually Process Rare Tokens

Title: Discrete Prompt Tuning via Recursive Utilization of Black-box Multimodal Large Language Model for Personalized Visual Emotion Recognition

Title: Energy Landscapes Enable Reliable Abstention in Retrieval-Augmented Large Language Models for Healthcare

Title: DecMetrics: Structured Claim Decomposition Scoring for Factually Consistent LLM Outputs

Title: The Good, the Bad and the Constructive: Automatically Measuring Peer Review's Utility for Authors

Title: ASCENDgpt: A Phenotype-Aware Transformer Model for Cardiovascular Risk Prediction from Electronic Health Records

Title: Serialized Output Prompting for Large Language Model-based Multi-Talker Speech Recognition

Title: Refining Transcripts With TV Subtitles by Prompt-Based Weakly Supervised Training of ASR

Title: Learned Hallucination Detection in Black-Box LLMs using Token-level Entropy Production Rate

Title: Where Should I Study? Biased Language Models Decide! Evaluating Fairness in LMs for Academic Recommendations

Title: DeepTRACE: Auditing Deep Research AI Systems for Tracking Reliability Across Citations and Evidence

Title: Context Engineering for Trustworthiness: Rescorla Wagner Steering Under Mixed and Inappropriate Contexts

Title: Understanding Reinforcement Learning for Model Training, and future directions with GRAPE

Title: VaccineRAG: Boosting Multimodal Large Language Models' Immunity to Harmful RAG Samples

Title: Behavioral Fingerprinting of Large Language Models

Title: From Silent Signals to Natural Language: A Dual-Stage Transformer-LLM Approach

Title: ProST: Progressive Sub-task Training for Pareto-Optimal Multi-agent Systems Using Small Language Models

Title: Scaling behavior of large language models in emotional safety classification across sizes and tasks

Title: Mitigation of Gender and Ethnicity Bias in AI-Generated Stories through Model Explanations

Title: Artificially Fluent: Swahili AI Performance Benchmarks Between English-Trained and Natively-Trained Datasets

Title: Advancing SLM Tool-Use Capability using Reinforcement Learning

Title: Hierarchical Section Matching Prediction (HSMP) BERT for Fine-Grained Extraction of Structured Data from Hebrew Free-Text Radiology Reports in Crohn's Disease

Title: Using LLMs to create analytical datasets: A case study of reconstructing the historical memory of Colombia

Title: Quantized Large Language Models in Biomedical Natural Language Processing: Evaluation and Recommendation

Title: Manipulating Transformer-Based Models: Controllability, Steerability, and Robust Interventions

Title: Sample-efficient Integration of New Modalities into Large Language Models

Title: Breaking to Build: A Threat Model of Prompt-Based Attacks for Securing LLMs

Title: Polysemantic Dropout: Conformal OOD Detection for Specialized LLMs

Title: AraHalluEval: A Fine-grained Hallucination Evaluation Framework for Arabic LLMs

Title: Evaluating NL2SQL via SQL2NL

Title: Why Language Models Hallucinate

Title: ODKE+: Ontology-Guided Open-Domain Knowledge Extraction with LLMs

Title: KERAG: Knowledge-Enhanced Retrieval-Augmented Generation for Advanced Question Answering

Title: A Study of Large Language Models for Patient Information Extraction: Model Architecture, Fine-Tuning Strategy, and Multi-task Instruction Tuning

Title: Research on Multi-hop Inference Optimization of LLM Based on MQUAKE Framework

Title: Decoders Laugh as Loud as Encoders

Title: Enhancing Diversity in Large Language Models via Determinantal Point Processes

Title: Personality as a Probe for LLM Evaluation: Method Trade-offs and Downstream Effects

Title: Knowledge Collapse in LLMs: When Fluency Survives but Facts Fail under Recursive Synthetic Training

Title: Mind the Gap: Evaluating Model- and Agentic-Level Vulnerabilities in LLMs with Action Graphs

Title: AFD-SLU: Adaptive Feature Distillation for Spoken Language Understanding

Title: Memorization ≠ Understanding: Do Large Language Models Have the Ability of Scenario Cognition?

Title: Using LLMs for Multilingual Clinical Entity Linking to ICD-10

Title: L1RA: Dynamic Rank Assignment in LoRA Fine-Tuning

Title: PLaMo 2 Technical Report

Title: ACE-RL: Adaptive Constraint-Enhanced Reward for Long-form Generation Reinforcement Learning

Title: Classification of kinetic-related injury in hospital triage data using NLP

Title: Optimizing Small Transformer-Based Language Models for Multi-Label Sentiment Analysis in Short Texts

Title: Do Large Language Models Need Intent? Revisiting Response Generation Strategies for Service Assistant

Title: Masked Diffusion Language Models with Frequency-Informed Training

Title: Entropy2Vec: Crosslingual Language Modeling Entropy as End-to-End Learnable Language Representations

Title: ToM-SSI: Evaluating Theory of Mind in Situated Social Interactions

Title: Triadic Fusion of Cognitive, Functional, and Causal Dimensions for Explainable LLMs: The TAXAL Framework

Title: Hunyuan-MT Technical Report

Title: BEDTime: A Unified Benchmark for Automatically Describing Time Series

Title: HoPE: Hyperbolic Rotary Positional Encoding for Stable Long-Range Dependency Modeling in Large Language Models

Title: Less is More Tokens: Efficient Math Reasoning via Difficulty-Aware Chain-of-Thought Distillation

Title: CURE: Controlled Unlearning for Robust Embeddings -- Mitigating Conceptual Shortcuts in Pre-Trained Language Models

Title: Uniform Information Density and Syntactic Reduction: Revisiting that-Mentioning in English Complement Clauses

Title: Elucidating the Design Space of Decay in Linear Attention

Title: Crosscoding Through Time: Tracking Emergence & Consolidation Of Linguistic Representations Throughout LLM Pretraining