2025-08-05

Title: FECT: Factuality Evaluation of Interpretive AI-Generated Claims in Contact Center Conversation Transcripts

Title: XAutoLM: Efficient Fine-Tuning of Language Models via Meta-Learning and AutoML

Title: MAO-ARAG: Multi-Agent Orchestration for Adaptive Retrieval-Augmented Generation

Title: UrBLiMP: A Benchmark for Evaluating the Linguistic Competence of Large Language Models in Urdu

Title: Cross-Domain Web Information Extraction at Pinterest

Title: Asking the Right Questions: Benchmarking Large Language Models in the Development of Clinical Consultation Templates

Title: CSIRO-LT at SemEval-2025 Task 11: Adapting LLMs for Emotion Recognition for Multiple Languages

Title: Adaptive Content Restriction for Large Language Models via Suffix Optimization

Title: Show or Tell? Modeling the evolution of request-making in Human-LLM conversations

Title: WebDS: An End-to-End Benchmark for Web-based Data Science

Title: WarriorMath: Enhancing the Mathematical Ability of Large Language Models with a Defect-aware Framework

Title: Bridging LLMs and Symbolic Reasoning in Educational QA Systems: Insights from the XAI Challenge at IJCNN 2025

Title: Prompting Large Language Models with Partial Knowledge for Answering Questions with Unseen Entities

Title: KEDAS: Knowledge Editing Alignment with Diverse Augmentation and Self-adaptive Inference

Title: D-SCoRE: Document-Centric Segmentation and CoT Reasoning with Structured Export for QA-CoT Data Generation

Title: LinkQA: Synthesizing Diverse QA from Multiple Seeds Strongly Linked by Knowledge Points

Title: Large-Scale Diverse Synthesis for Mid-Training

Title: MaRGen: Multi-Agent LLM Approach for Self-Directed Market Research and Analysis

Title: ArzEn-MultiGenre: An aligned parallel dataset of Egyptian Arabic song lyrics, novels, and subtitles, with English translations

Title: Discovering Bias Associations through Open-Ended LLM Generations

Title: From Query to Logic: Ontology-Driven Multi-Hop Reasoning in LLMs

Title: Towards Efficient Medical Reasoning with Minimal Fine-Tuning Data

Title: TreeDiff: AST-Guided Code Generation with Diffusion LLMs

Title: Harnessing Collective Intelligence of LLMs for Robust Biomedical QA: A Multi-Model Approach

Title: The Homogenizing Effect of Large Language Models on Human Expression and Thought

Title: A Theory of Adaptive Scaffolding for LLM-Based Pedagogical Agents

Title: MOPrompt: Multi-objective Semantic Evolution for Prompt Optimization

Title: Are All Prompt Components Value-Neutral? Understanding the Heterogeneous Adversarial Robustness of Dissected Prompt in Large Language Models

Title: OpenMed NER: Open-Source, Domain-Adapted State-of-the-Art Transformers for Biomedical NER Across 12 Public Datasets

Title: Authorship Attribution in Multilingual Machine-Generated Texts

Title: CUPID: Evaluating Personalized and Contextualized Alignment of LLMs from Interactions

Title: The Bidirectional Process Reward Model

Title: Collaborative Chain-of-Agents for Parametric-Retrieved Knowledge Synergy

Title: Am I Blue or Is My Hobby Counting Teardrops? Expression Leakage in Large Language Models as a Symptom of Irrelevancy Disruption

Title: CultureGuard: Towards Culturally-Aware Dataset and Guard Model for Multilingual Safety Applications

Title: Enhancing the Preference Extractor in Multi-turn Dialogues: From Annotating Disasters to Accurate Preference Extraction

Title: A comprehensive taxonomy of hallucinations in Large Language Models

Title: AGENTICT$^2$S: Robust Text-to-SPARQL via Agentic Collaborative Reasoning over Heterogeneous Knowledge Graphs for the Circular Economy

Title: MLP Memory: Language Modeling with Retriever-pretrained External Memory

Title: Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web Agents

Title: Counterfactual Probing for Hallucination Detection and Mitigation in Large Language Models

Title: Quantum-RAG and PunGPT2: Advancing Low-Resource Language Generation and Retrieval for the Punjabi Language

Title: Word Overuse and Alignment in Large Language Models: The Influence of Learning from Human Feedback

Title: ROVER: Recursive Reasoning Over Videos with Vision-Language Models for Embodied Tasks

Title: SitEmb-v1.5: Improved Context-Aware Dense Retrieval for Semantic Association and Long Story Comprehension

Title: TIBSTC-CoT: A Multi-Domain Instruction Dataset for Chain-of-Thought Reasoning in Language Models

Title: Contextually Aware E-Commerce Product Question Answering using RAG

Title: Prompting Large Language Models to Detect Dementia Family Caregivers

Title: SpeechRole: A Large-Scale Dataset and Benchmark for Evaluating Speech Role-Playing Agents

Title: SpeechR: A Benchmark for Speech Reasoning in Large Audio-Language Models

Title: Diagnosing Memorization in Chain-of-Thought Reasoning, One Token at a Time

Title: Harnessing Temporal Databases for Systematic Evaluation of Factual Time-Sensitive Question-Answering in Large Language Models

Title: ProCut: LLM Prompt Compression via Attribution Estimation

Title: The SMeL Test: A simple benchmark for media literacy in language models

Title: When Truth Is Overridden: Uncovering the Internal Origins of Sycophancy in Large Language Models

Title: Learning Dynamics of Meta-Learning in Small Model Pretraining

Title: Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

Title: Proof2Hybrid: Automatic Mathematical Benchmark Synthesis for Proof-Centric Problems

Title: Isolating Culture Neurons in Multilingual Large Language Models

Title: Decomposing the Entropy-Performance Exchange: The Missing Keys to Unlocking Effective Reinforcement Learning

Title: SHAMI-MT: A Syrian Arabic Dialect to Modern Standard Arabic Bidirectional Machine Translation System

Title: Simple Methods Defend RAG Systems Well Against Real-World Attacks

Title: LaMPE: Length-aware Multi-grained Position Encoding for Adaptive Long-context Scaling Without Training

Title: VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Title: CAMERA: Multi-Matrix Joint Compression for MoE Models via Micro-Expert Redundancy Analysis

Title: Understanding and Mitigating Political Stance Cross-topic Generalization in Large Language Models

Title: CompressKV: Semantic Retrieval Heads Know What Tokens are Not Important Before Generation

Title: AI-Based Measurement of Innovation: Mapping Expert Insight into Large Language Model Applications

Title: LatentPrompt: Optimizing Prompts in Latent Space

Title: From Monolingual to Bilingual: Investigating Language Conditioning in Large Language Models for Psycholinguistic Tasks

Title: Modular Arithmetic: Language Models Solve Math Digit by Digit

Title: PoeTone: A Framework for Constrained Generation of Structured Chinese Songci with LLMs

Title: I Have No Mouth, and I Must Rhyme: Uncovering Internal Phonetic Representations in LLaMA 3.2

Title: Contextual Graph Transformer: A Small Language Model for Enhanced Engineering Document Information Extraction

Title: Sparse-dLLM: Accelerating Diffusion LLMs with Dynamic Cache Eviction

Title: Guess or Recall? Training CNNs to Classify and Localize Memorization in LLMs

Title: EHSAN: Leveraging ChatGPT in a Hybrid Framework for Arabic Aspect-Based Sentiment Analysis in Healthcare

Title: MArgE: Meshing Argumentative Evidence from Multiple Large Language Models for Justifiable Claim Verification

Title: CharBench: Evaluating the Role of Tokenization in Character-Level Tasks

Title: Mitigating Attention Hacking in Preference-Based Reward Modeling via Interaction Distillation

Title: Test Set Quality in Multilingual LLM Evaluation