2025-10-21

Title: Fusion-Augmented Large Language Models: Boosting Diagnostic Trustworthiness via Model Consensus

Title: Can LLMs Correct Themselves? A Benchmark of Self-Correction in LLMs

Title: EvolveR: Self-Evolving LLM Agents through an Experience-Driven Lifecycle

Title: Evaluating Prompting Strategies and Large Language Models in Systematic Literature Review Screening: Relevance and Task-Stage Classification

Title: Facts in Stats: Impacts of Pretraining Diversity on Language Model Generalization

Title: In Generative AI We (Dis)Trust? Computational Analysis of Trust and Distrust in Reddit Discussions

Title: EgMM-Corpus: A Multimodal Vision-Language Dataset for Egyptian Culture

Title: What Can String Probability Tell Us About Grammaticality?

Title: Towards Low-Resource Alignment to Diverse Perspectives with Sparse Feedback

Title: Instant Personalized Large Language Model Adaptation via Hypernetwork

Title: Thinking About Thinking: Evaluating Reasoning in Post-Trained Language Models

Title: Utilising Large Language Models for Generating Effective Counter Arguments to Anti-Vaccine Tweets

Title: End-to-End Argument Mining through Autoregressive Argumentative Structure Prediction

Title: Navigating through the hidden embedding space: steering LLMs to improve mental health assessment

Title: MoReBench: Evaluating Procedural and Pluralistic Moral Reasoning in Language Models, More than Outcomes

Title: ATA: A Neuro-Symbolic Approach to Implement Autonomous and Trustworthy Agents

Title: Probing the Hidden Talent of ASR Foundation Models for L2 English Oral Assessment

Title: FrugalPrompt: Reducing Contextual Overhead in Large Language Models via Token Attribution

Title: TrajSelector: Harnessing Latent Representations for Efficient and Effective Best-of-N in Large Reasoning Model

Title: RAVEN: Robust Advertisement Video Violation Temporal Grounding via Reinforcement Reasoning

Title: Check Yourself Before You Wreck Yourself: Selectively Quitting Improves LLM Agent Safety

Title: Automated Composition of Agents: A Knapsack Approach for Agentic Component Selection

Title: ReviewGuard: Enhancing Deficient Peer Review Detection via LLM-Driven Data Augmentation

Title: Language over Content: Tracing Cultural Understanding in Multilingual Large Language Models

Title: Hallucination Benchmark for Speech Foundation Models

Title: AI-Generated Text Detection in Low-Resource Languages: A Case Study on Urdu

Title: Fine-tuning of Large Language Models for Constituency Parsing Using a Sequence to Sequence Approach

Title: Unleashing Diverse Thinking Modes in LLMs through Multi-Agent Collaboration

Title: All You Need is One: Capsule Prompt Tuning with a Single Vector

Title: Temporal Understanding under Deictic Frame of Reference

Title: Investigating the Impact of Rationales for LLMs on Natural Language Understanding

Title: The Chameleon Nature of LLMs: Quantifying Multi-Turn Stance Instability in Search-Enabled Language Models

Title: so much depends / upon / a whitespace: Why Whitespace Matters for Poets and LLMs

Title: Beacon: Single-Turn Diagnosis and Mitigation of Latent Sycophancy in Large Language Models

Title: Enhancing Language Agent Strategic Reasoning through Self-Play in Adversarial Games

Title: LC-Eval: A Bilingual Multi-Task Evaluation Benchmark for Long-Context Understanding

Title: MOSAIC: Masked Objective with Selective Adaptation for In-domain Contrastive Learning

Title: Knowing the Facts but Choosing the Shortcut: Understanding How Large Language Models Compare Entities

Title: Cross-Genre Authorship Attribution via LLM-Based Retrieve-and-Rerank

Title: Who's Asking? Simulating Role-Based Questions for Conversational AI Evaluation

Title: FinSight: Towards Real-World Financial Deep Research

Title: Neuronal Group Communication for Efficient Neural Representation

Title: Does Visual Grounding Enhance the Understanding of Embodied Knowledge in Large Language Models?

Title: ChiKhaPo: A Large-Scale Multilingual Benchmark for Evaluating Lexical Comprehension and Generation in Large Language Models

Title: Prompt-MII: Meta-Learning Instruction Induction for LLMs

Title: Parameter-Efficient Fine-Tuning for Low-Resource Languages: A Comparative Study of LLMs for Bengali Hate Speech Detection

Title: Back to Bytes: Revisiting Tokenization Through UTF-8

Title: Vocab Diet: Reshaping the Vocabulary of LLMs with Vector Arithmetic

Title: Online Learning Defense against Iterative Jailbreak Attacks via Prompt Optimization

Title: DiscoTrack: A Multilingual LLM Benchmark for Discourse Tracking

Title: SafeSearch: Do Not Trade Safety for Utility in LLM Search Agents

Title: Mapping from Meaning: Addressing the Miscalibration of Prompt-Sensitive Language Models

Title: Investigating Thinking Behaviours of Reasoning-Based Language Models for Social Bias Mitigation

Title: Verification-Aware Planning for Multi-Agent Systems

Title: DVAGen: Dynamic Vocabulary Augmented Generation

Title: Rethinking On-policy Optimization for Query Augmentation

Title: When AI companions become witty: Can human brain recognize AI-generated irony?

Title: Understanding and Improving Length Generalization in Hierarchical Sparse Attention Models

Title: Wisdom is Knowing What not to Say: Hallucination-Free LLMs Unlearning via Attention Shifting

Title: StreamingThinker: Large Language Models Can Think While Reading

Title: From Preferences to Prejudice: The Role of Alignment Tuning in Shaping Social Bias in Video Diffusion Models

Title: Explainability of Large Language Models: Opportunities and Challenges toward Generating Trustworthy Explanations

Title: TaxoAlign: Scholarly Taxonomy Generation Using Language Models

Title: Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation

Title: The Atomic Instruction Gap: Instruction-Tuned LLMs Struggle with Simple, Self-Contained Directives

Title: EduAdapt: A Question Answer Benchmark Dataset for Evaluating Grade-Level Adaptability in LLMs

Title: Leveraging Group Relative Policy Optimization to Advance Large Language Models in Traditional Chinese Medicine

Title: BenCao: An Instruction-Tuned Large Language Model for Traditional Chinese Medicine

Title: Agentic Reinforcement Learning for Search is Unsafe

Title: Multilingual Clinical NER for Diseases and Medications Recognition in Cardiology Texts using BERT Embeddings

Title: Evaluating Large Language Models on Urdu Idiom Translation

Title: Disparities in Multilingual LLM-Based Healthcare Q&A

Title: ReXMoE: Reusing Experts with Minimal Overhead in Mixture-of-Experts

Title: DETree: DEtecting Human-AI Collaborative Texts via Tree-Structured Hierarchical Representation Learning

Title: Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents

Title: Deep Self-Evolving Reasoning

Title: Lingua Custodia's participation at the WMT 2025 Terminology shared task

Title: Annotation-Efficient Universal Honesty Alignment

Title: SimBench: Benchmarking the Ability of Large Language Models to Simulate Human Behaviors

Title: OncoReason: Structuring Clinical Reasoning in LLMs for Robust and Interpretable Survival Prediction

Title: When Annotators Disagree, Topology Explains: Mapper, a Topological Tool for Exploring Text Embedding Geometry and Ambiguity

Title: Language Confusion Gate: Language-Aware Decoding Through Model Self-Distillation

Title: HGAdapter: Hypergraph-based Adapters in Language Models for Code Summarization and Clone Detection

Title: LawChain: Modeling Legal Reasoning Chains for Chinese Tort Case Analysis

Title: Forget to Know, Remember to Use: Context-Aware Unlearning for Large Language Models

Title: Qomhra: A Bilingual Irish-English Large Language Model

Title: Towards Mining Effective Pedagogical Strategies from Learner-LLM Educational Dialogues

Title: QueST: Incentivizing LLMs to Generate Difficult Problems

Title: PANER: A Paraphrase-Augmented Framework for Low-Resource Named Entity Recognition

Title: AcademicEval: Live Long-Context LLM Benchmark

Title: Train for Truth, Keep the Skills: Binary Retrieval-Augmented Reward Mitigates Hallucinations

Title: Evaluating Medical LLMs by Levels of Autonomy: A Survey Moving from Benchmarks to Applications

Title: Foundational Automatic Evaluators: Scaling Multi-Task Generative Evaluator Training for Reasoning-Centric Domains

Title: Executable Knowledge Graphs for Replicating AI Research

Title: Enterprise Deep Research: Steerable Multi-Agent Deep Research for Enterprise Analytics