2025-06-11

Title: Conservative Bias in Large Language Models: Measuring Relation Predictions

Title: QA-LIGN: Aligning LLMs through Constitutionally Decomposed QA

Title: EconWebArena: Benchmarking Autonomous Agents on Economic Tasks in Realistic Web Environments

Title: Multilingual Hate Speech Detection in Social Media Using Translation-Based Approaches with Large Language Models

Title: Can Artificial Intelligence Write Like Borges? An Evaluation Protocol for Spanish Microfiction

Title: LLM-BT: Back-Translation as a Framework for Terminology Standardization and Dynamic Semantic Embedding

Title: Unable to forget: Proactive Interference Reveals Working Memory Limits in LLMs Beyond Context Length

Title: "I Wrote, I Paused, I Rewrote" Teaching LLMs to Read Between the Lines of Student Writing

Title: Compound AI Systems Optimization: A Survey of Methods, Challenges, and Future Directions

Title: Can AI Validate Science? Benchmarking LLMs for Accurate Scientific Claim $\rightarrow$ Evidence Reasoning

Title: Automatic Generation of Inference Making Questions for Reading Comprehension Assessments

Title: Institutional Books 1.0: A 242B token dataset from Harvard Library's collections, refined for accuracy and usability

Title: Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency

Title: Evaluating LLMs Across Multi-Cognitive Levels: From Medical Knowledge Mastery to Scenario-Based Problem Solving

Title: DEAL: Disentangling Transformer Head Activations for LLM Steering

Title: CC-RAG: Structured Multi-Hop Reasoning via Theme-Based Causal Graphs

Title: Mitigating Posterior Salience Attenuation in Long-Context LLMs with Positional Contrastive Decoding

Title: Draft-based Approximate Inference for LLMs

Title: EIFBENCH: Extremely Complex Instruction Following Benchmark for Large Language Models

Title: mSTEB: Massively Multilingual Evaluation of LLMs on Speech and Text Tasks

Title: TACTIC: Translation Agents with Cognitive-Theoretic Interactive Collaboration

Title: Large Language Models Have Intrinsic Meta-Cognition, but Need a Good Lens

Title: Know-MRI: A Knowledge Mechanisms Revealer&Interpreter for Large Language Models

Title: CAF-I: A Collaborative Multi-Agent Framework for Enhanced Irony Detection with Large Language Models

Title: Low-resource domain adaptation while minimizing energy and hardware resource consumption

Title: Olica: Efficient Structured Pruning of Large Language Models without Retraining

Title: Detecting Harmful Memes with Decoupled Understanding and Guided CoT Reasoning

Title: Efficient Context Selection for Long-Context QA: No Tuning, No Iteration, Just Adaptive-$k$

Title: Re-Thinking the Automatic Evaluation of Image-Text Alignment in Text-to-Image Models

Title: Fairness is Not Silence: Unmasking Vacuous Neutrality in Small Language Models

Title: EtiCor++: Towards Understanding Etiquettical Bias in LLMs

Title: Integration of Old and New Knowledge for Generalized Intent Discovery: A Consistency-driven Prototype-Prompting Framework

Title: DRAGged into Conflicts: Detecting and Addressing Conflicting Sources in Search-Augmented LLMs

Title: Efficient Post-Training Refinement of Latent Reasoning in Large Language Models

Title: CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmark of Large Language Models in Mental Health Counseling

Title: Hateful Person or Hateful Model? Investigating the Role of Personas in Hate Speech Detection by Large Language Models

Title: RAISE: Enhancing Scientific Reasoning in LLMs via Step-by-Step Retrieval

Title: MEMETRON: Metaheuristic Mechanisms for Test-time Response Optimization of Large Language Models

Title: TableDreamer: Progressive and Weakness-guided Data Synthesis from Scratch for Table Instruction Tuning

Title: Summarization for Generative Relation Extraction in the Microbiome Domain

Title: Brevity is the soul of sustainability: Characterizing LLM response lengths

Title: ClimateViz: A Benchmark for Statistical Reasoning and Fact Verification on Scientific Charts

Title: ConfPO: Exploiting Policy Model Confidence for Critical Token Selection in Large Language Model Preference Optimization

Title: Explainable Compliance Detection with Multi-Hop Natural Language Inference on Assurance Case Structure

Title: Improved LLM Agents for Financial Document Question Answering

Title: Towards Secure and Private Language Models for Nuclear Power Plants

Title: Unlocking the Potential of Large Language Models in the Nuclear Industry with Synthetic Data

Title: Factors affecting the in-context learning abilities of LLMs for dialogue state tracking

Title: Enhancing Accuracy and Maintainability in Nuclear Plant Data Retrieval: A Function-Calling LLM Approach Over NL-to-SQL

Title: AraReasoner: Evaluating Reasoning-Based LLMs for Arabic NLP

Title: The impact of fine tuning in LLaMA on hallucinations for named entity extraction in legal documentation

Title: AdversariaL attacK sAfety aLIgnment (ALKALI): Safeguarding LLMs through GRACE: Geometric Representation-Aware Contrastive Enhancement - Introducing Adversarial Vulnerability Quality Index (AVQI)

Title: PlantBert: An Open Source Language Model for Plant Science

Title: From Legal Texts to Defeasible Deontic Logic via LLMs: A Study in Automated Semantic Analysis

Title: Dialect Normalization using Large Language Models and Morphological Rules

Title: PropMEND: Hypernetworks for Knowledge Propagation in LLMs

Title: Can A Gamer Train A Mathematical Reasoning Model?

Title: FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation

Title: Can LLMs Ground when they (Don't) Know: A Study on Direct and Loaded Political Questions

Title: Pre-trained Language Models Learn Remarkably Accurate Representations of Numbers

Title: Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Scheduling System

Title: Learning to Reason Across Parallel Samples for LLM Reasoning

Title: Comparing human and LLM proofreading in L2 writing: Impact on lexical and syntactic features

Title: Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning

Title: Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs