2025-05-27

Title: Advancing Uto-Aztecan Language Technologies: A Case Study on the Endangered Comanche Language

Title: Do BERT-Like Bidirectional Models Still Perform Better on Text Classification in the Era of LLMs?

Title: CoMet: Metaphor-Driven Covert Communication for Multi-Agent Language Games

Title: IDA-Bench: Evaluating LLMs on Interactive Guided Data Analysis

Title: Think or Not? Exploring Thinking Efficiency in Large Reasoning Models via an Information-Theoretic Lens

Title: Taming LLMs with Negative Samples: A Reference-Free Framework to Evaluate Presentation Content with Actionable Feedback

Title: Multi-Scale Probabilistic Generation Theory: A Hierarchical Framework for Interpreting Large Language Models

Title: MetaGen Blended RAG: Higher Accuracy for Domain-Specific Q&A Without Fine-Tuning

Title: TAGS: A Test-Time Generalist-Specialist Framework with Retrieval-Augmented Reasoning and Verification

Title: Thinking Fast and Right: Balancing Accuracy and Reasoning Length with Adaptive Rewards

Title: Is It Bad to Work All the Time? Cross-Cultural Evaluation of Social Norm Biases in GPT-4

Title: PerMedCQA: Benchmarking Large Language Models on Medical Consumer Question Answering in Persian Language

Title: Model Editing with Graph-Based External Memory

Title: The Unreasonable Effectiveness of Model Merging for Cross-Lingual Transfer in LLMs

Title: SchemaGraphSQL: Efficient Schema Linking with Pathfinding Graph Algorithms for Text-to-SQL on Large-Scale Databases

Title: ShIOEnv: A CLI Behavior-Capturing Environment Enabling Grammar-Guided Command Synthesis for Dataset Curation

Title: NileChat: Towards Linguistically Diverse and Culturally Aware LLMs for Local Communities

Title: RaDeR: Reasoning-aware Dense Retrieval Models

Title: DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding

Title: Retrieval Augmented Generation-based Large Language Models for Bridging Transportation Cybersecurity Legal Knowledge Gaps

Title: Efficient Long CoT Reasoning in Small Language Models

Title: BRIT: Bidirectional Retrieval over Unified Image-Text Graph

Title: MedScore: Factuality Evaluation of Free-Form Medical Answers

Title: Hybrid Latent Reasoning via Reinforcement Learning

Title: Anchored Diffusion Language Model

Title: Measuring South Asian Biases in Large Language Models

Title: Investigating AI Rater Effects of Large Language Models: GPT, Claude, Gemini, and DeepSeek

Title: The Pragmatic Mind of Machines: Tracing the Emergence of Pragmatic Competence in Large Language Models

Title: How Does Sequence Modeling Architecture Influence Base Capabilities of Pre-trained Language Models? Exploring Key Architecture Design Principles to Avoid Base Capabilities Degradation

Title: metaTextGrad: Automatically optimizing language model optimizers

Title: Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models

Title: Business as Rulesual: A Benchmark and Framework for Business Rule Flow Modeling with LLMs

Title: Composable Cross-prompt Essay Scoring by Merging Models

Title: MSA at BEA 2025 Shared Task: Disagreement-Aware Instruction Tuning for Multi-Dimensional Evaluation of LLMs as Math Tutors

Title: Unraveling Misinformation Propagation in LLM Reasoning

Title: Exploring the Vulnerability of the Content Moderation Guardrail in Large Language Models via Intent Manipulation

Title: TAG-INSTRUCT: Controlled Instruction Complexity Enhancement through Structure-based Augmentation

Title: From Word to World: Evaluate and Mitigate Culture Bias via Word Association Test

Title: Removal of Hallucination on Hallucination: Debate-Augmented RAG

Title: Safety Alignment via Constrained Knowledge Unlearning

Title: Debate-to-Detect: Reformulating Misinformation Detection as a Real-World Debate with Large Language Models

Title: Flex-Judge: Think Once, Judge Anywhere

Title: PM-KVQ: Progressive Mixed-precision KV Cache Quantization for Long-CoT LLMs

Title: MAVL: A Multilingual Audio-Video Lyrics Dataset for Animated Song Translation

Title: DDO: Dual-Decision Optimization via Multi-Agent Collaboration for LLM-Based Medical Consultation

Title: Multilingual Question Answering in Low-Resource Settings: A Dzongkha-English Benchmark for Foundation Models

Title: Skip-Thinking: Chunk-wise Chain-of-Thought Distillation Enable Smaller Language Models to Reason Better and Faster

Title: Climate-Eval: A Comprehensive Benchmark for NLP Tasks Related to Climate Change

Title: Robustness in Large Language Models: A Survey of Mitigation Strategies and Evaluation Metrics

Title: Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models

Title: Social Good or Scientific Curiosity? Uncovering the Research Framing Behind NLP Artefacts

Title: TULUN: Transparent and Adaptable Low-resource Machine Translation

Title: Large Language Models in the Task of Automatic Validation of Text Classifier Predictions

Title: Benchmarking and Rethinking Knowledge Editing for Large Language Models

Title: Optimal Transport-Based Token Weighting scheme for Enhanced Preference Optimization

Title: LogicCat: A Chain-of-Thought Text-to-SQL Benchmark for Multi-Domain Reasoning Challenges

Title: Unifying Attention Heads and Task Vectors via Hidden State Geometry in In-Context Learning

Title: Few-Shot Optimization for Sensor Data Using Large Language Models: A Case Study on Fatigue Detection

Title: How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark

Title: Disentangling Knowledge Representations for Large Language Model Editing

Title: ALPS: Attention Localization and Pruning Strategy for Efficient Alignment of Large Language Models

Title: Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation

Title: Multi-Party Conversational Agents: A Survey

Title: Writing Like the Best: Exemplar-Based Expository Text Generation

Title: Audio Jailbreak Attacks: Exposing Vulnerabilities in SpeechGPT in a White-Box Framework

Title: Sci-LoRA: Mixture of Scientific LoRAs for Cross-Domain Lay Paraphrasing

Title: CRMArena-Pro: Holistic Assessment of LLM Agents Across Diverse Business Scenarios and Interactions

Title: Building a Functional Machine Translation Corpus for Kpelle

Title: Federated Retrieval-Augmented Generation: A Systematic Mapping Study

Title: SCRum-9: Multilingual Stance Classification over Rumours on Social Media

Title: Benchmarking Large Language Models for Cyberbullying Detection in Real-World YouTube Comments

Title: MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems

Title: The Price of Format: Diversity Collapse in LLMs

Title: BnMMLU: Measuring Massive Multitask Language Understanding in Bengali

Title: Evaluating AI for Finance: Is AI Credible at Assessing Investment Risk?

Title: System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts

Title: Learning to Explain: Prototype-Based Surrogate Models for LLM Classification

Title: Hierarchical Mamba Meets Hyperbolic Geometry: A New Paradigm for Structured Language Embeddings

Title: AI4Math: A Native Spanish Benchmark for University-Level Mathematical Reasoning in Large Language Models

Title: FiLLM -- A Filipino-optimized Large Language Model based on Southeast Asia Large Language Model (SEALLM)

Title: VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Guided Iterative Policy Optimization

Title: Efficient Data Selection at Scale via Influence Distillation

Title: An Embarrassingly Simple Defense Against LLM Abliteration Attacks

Title: UNCERTAINTY-LINE: Length-Invariant Estimation of Uncertainty for Large Language Models

Title: Towards Harmonized Uncertainty Estimation for Large Language Models

Title: ReadBench: Measuring the Dense Text Visual Reading Ability of Vision-Language Models

Title: ASPO: Adaptive Sentence-Level Preference Optimization for Fine-Grained Multimodal Reasoning

Title: CCHall: A Novel Benchmark for Joint Cross-Lingual and Cross-Modal Hallucinations Detection in Large Language Models

Title: Self-Critique Guided Iterative Reasoning for Multi-hop Question Answering

Title: Controlling Language Confusion in Multilingual LLMs

Title: Delving into Multilingual Ethical Bias: The MSQAD with Statistical Hypothesis Tests for Large Language Models

Title: MMATH: A Multilingual Benchmark for Mathematical Reasoning

Title: RetrieveAll: A Multilingual Named Entity Recognition Framework with Large Language Models

Title: Shifting AI Efficiency From Model-Centric to Data-Centric Compression

Title: SpokenNativQA: Multilingual Everyday Spoken Queries for LLMs

Title: Assistant-Guided Mitigation of Teacher Preference Bias in LLM-as-a-Judge

Title: Two LLMs debate, both are certain they've won

Title: LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling

Title: Misleading through Inconsistency: A Benchmark for Political Inconsistencies Detection

Title: DREAM: Drafting with Refined Target Features and Entropy-Adaptive Cross-Attention Fusion for Multimodal Speculative Decoding

Title: SpeakStream: Streaming Text-to-Speech with Interleaved Data

Title: MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search

Title: When Ethics and Payoffs Diverge: LLM Agents in Morally Charged Social Dilemmas

Title: The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training

Title: Evaluating Text Creativity across Diverse Domains: A Dataset and Large Language Model Evaluator

Title: LLLMs: A Data-Driven Survey of Evolving Research on Limitations of Large Language Models

Title: PATS: Process-Level Adaptive Thinking Mode Switching

Title: Unveiling Dual Quality in Product Reviews: An NLP-Based Approach

Title: A Graph Perspective to Probe Structural Patterns of Knowledge in Large Language Models

Title: 100-LongBench: Are de facto Long-Context Benchmarks Literally Evaluating Long-Context Ability?

Title: A Necessary Step toward Faithfulness: Measuring and Improving Consistency in Free-Text Explanations

Title: SituatedThinker: Grounding LLM Reasoning with Real-World through Situated Thinking

Title: PatentScore: Multi-dimensional Evaluation of LLM-Generated Patent Claims

Title: GC-KBVQA: A New Four-Stage Framework for Enhancing Knowledge Based Visual Question Answering Performance

Title: ChartLens: Fine-grained Visual Attribution in Charts

Title: Belief Attribution as Mental Explanation: The Role of Accuracy, Informativity, and Causality

Title: Simple and Effective Baselines for Code Summarisation Evaluation

Title: CoTGuard: Using Chain-of-Thought Triggering for Copyright Protection in Multi-Agent LLM Systems

Title: Self-Reflective Planning with Knowledge Graphs: Enhancing LLM Reasoning Reliability for Question Answering

Title: The Role of Diversity in In-Context Learning for Large Language Models

Title: Frictional Agent Alignment Framework: Slow Down and Don't Break Things

Title: Rhapsody: A Dataset for Highlight Detection in Podcasts

Title: Deriving Strategic Market Insights with Large Language Models: A Benchmark for Forward Counterfactual Generation

Title: Route to Reason: Adaptive Routing for LLM and Reasoning Strategy Selection

Title: Surrogate Signals from Format and Length: Reinforcement Learning for Solving Mathematical Problems without Ground Truth Answers

Title: The Birth of Knowledge: Emergent Features across Time, Space, and Scale in Large Language Models

Title: Balancing Computation Load and Representation Expressivity in Parallel Hybrid Neural Networks

Title: Continuous Self-Improvement of Large Language Models by Test-time Training with Verifier-Driven Sample Selection

Title: CulFiT: A Fine-grained Cultural-aware LLM Training Paradigm via Multilingual Critique Data Synthesis

Title: Anveshana: A New Benchmark Dataset for Cross-Lingual Information Retrieval On English Queries and Sanskrit Documents

Title: LLM Meets Scene Graph: Can Large Language Models Understand and Generate Scene Graphs? A Benchmark and Empirical Study

Title: Causal Distillation: Transferring Structured Explanations from Large to Compact Language Models

Title: SIPDO: Closed-Loop Prompt Optimization via Synthetic Data Feedback

Title: Bias in Political Dialogue: Tagging U.S. Presidential Debates with an Extended DAMSL Framework

Title: Small Language Models: Architectures, Techniques, Evaluation, Problems and Future Adaptation

Title: DoctorRAG: Medical RAG Fusing Knowledge with Patient Analogy through Textual Gradients

Title: How Syntax Specialization Emerges in Language Models

Title: Towards Multi-Granularity Memory Association and Selection for Long-Term Conversational Agents

Title: DocMEdit: Towards Document-Level Model Editing

Title: TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization

Title: Multi-Agent Collaboration via Evolving Orchestration

Title: Evaluating Robustness of Large Audio Language Models to Audio Injection: An Empirical Study

Title: Inconsistent Tokenizations Cause Language Models to be Perplexed by Japanese Grammar

Title: Languages in Multilingual Speech Foundation Models Align Both Phonetically and Semantically

Title: HomeBench: Evaluating LLMs in Smart Homes with Valid and Invalid Instructions Across Single and Multiple Devices

Title: DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue

Title: Segment First or Comprehend First? Explore the Limit of Unsupervised Word Segmentation with Large Language Models

Title: Faster and Better LLMs via Latency-Aware Test-Time Scaling

Title: Interleaved Reasoning for Large Language Models via Reinforcement Learning

Title: Select, Read, and Write: A Multi-Agent Framework of Full-Text-based Related Work Generation

Title: GenKI: Enhancing Open-Domain Question Answering with Knowledge Integration and Controllable Generation in Large Language Models

Title: LeCoDe: A Benchmark Dataset for Interactive Legal Consultation Dialogue Evaluation

Title: Reshaping Representation Space to Balance the Safety and Over-rejection in Large Audio Language Models

Title: Comparing Moral Values in Western English-speaking societies and LLMs with Word Associations

Title: Calibrating Pre-trained Language Classifiers on LLM-generated Noisy Labels via Iterative Refinement

Title: Grounding Language with Vision: A Conditional Mutual Information Calibrated Decoding Strategy for Reducing Hallucinations in LVLMs

Title: Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models

Title: Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision

Title: MT$^{3}$: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning

Title: Graceful Forgetting in Generative Language Models

Title: Distilling Closed-Source LLM's Knowledge for Locally Stable and Economic Biomedical Entity Linking

Title: Token-level Accept or Reject: A Micro Alignment Approach for Large Language Models

Title: NeuSym-RAG: Hybrid Neural Symbolic Retrieval with Multiview Structuring for PDF Question Answering

Title: Efficient Reasoning via Chain of Unconscious Thought

Title: SGM: A Framework for Building Specification-Guided Moderation Filters

Title: T^2Agent: A Tool-augmented Multimodal Misinformation Detection Agent with Monte Carlo Tree Search

Title: What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs

Title: Analyzing Political Bias in LLMs via Target-Oriented Sentiment Classification

Title: The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants

Title: MOLE: Metadata Extraction and Validation in Scientific Papers Using LLMs

Title: Compliance-to-Code: Enhancing Financial Compliance Checking via Code Generation

Title: Exploring Consciousness in LLMs: A Systematic Survey of Theories, Implementations, and Frontier Risks

Title: Deciphering Trajectory-Aided LLM Reasoning: An Optimization Perspective

Title: FoodTaxo: Generating Food Taxonomies with Large Language Models

Title: Improving Multilingual Math Reasoning for African Languages

Title: Beyond Specialization: Benchmarking LLMs for Transliteration of Indian Languages

Title: APE: A Data-Centric Benchmark for Efficient LLM Adaptation in Text Summarization

Title: Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles

Title: ALAS: Measuring Latent Speech-Text Alignment For Spoken Language Understanding In Multimodal LLMs

Title: MiniLongBench: The Low-cost Long Context Understanding Benchmark for Large Language Models

Title: CP-Router: An Uncertainty-Aware Router Between LLM and LRM

Title: Conversational Lexicography: Querying Lexicographic Data on Knowledge Graphs with SPARQL through Natural Language

Title: DeepDialogue: A Multi-Turn Emotionally-Rich Spoken Dialogue Dataset

Title: How Well Do Large Reasoning Models Translate? A Comprehensive Evaluation for Multi-Domain Machine Translation

Title: WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback

Title: Does Rationale Quality Matter? Enhancing Mental Disorder Detection via Selective Reasoning Distillation

Title: TTPA: Token-level Tool-use Preference Alignment Training Framework with Fine-grained Evaluation

Title: Training LLM-Based Agents with Synthetic Self-Reflected Trajectories and Partial Masking

Title: Uncertainty-Aware Attention Heads: Efficient Unsupervised Uncertainty Quantification for LLMs

Title: Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks

Title: Incentivizing Reasoning from Weak Supervision

Title: Inference-time Alignment in Continuous Space

Title: Multi-Domain Explainability of Preferences

Title: MA-RAG: Multi-Agent Retrieval-Augmented Generation via Collaborative Chain-of-Thought Reasoning

Title: S2LPP: Small-to-Large Prompt Prediction across LLMs

Title: Large Language Models Meet Knowledge Graphs for Question Answering: Synthesis and Opportunities

Title: Adaptive Deep Reasoning: Triggering Deep Thinking When Needed

Title: Language-Agnostic Suicidal Risk Detection Using Large Language Models

Title: ResSVD: Residual Compensated SVD for Large Language Model Compression

Title: Named Entity Recognition in Historical Italian: The Case of Giacomo Leopardi's Zibaldone

Title: TrojanStego: Your Language Model Can Secretly Be A Steganographic Privacy Leaking Agent

Title: Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers

Title: AweDist: Attention-aware Embedding Distillation for New Input Token Embeddings

Title: SeMe: Training-Free Language Model Merging via Semantic Alignment

Title: UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter-Efficient Fine-Tuning of Large Models

Title: Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs

Title: Exploring Generative Error Correction for Dysarthric Speech Recognition

Title: Visual Abstract Thinking Empowers Multimodal Reasoning

Title: THiNK: Can Large Language Models Think-aloud?

Title: Monocle: Hybrid Local-Global In-Context Evaluation for Long-Text Generation with Uncertainty-Based Active Learning

Title: Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking

Title: Reasoning Is Not All You Need: Examining LLMs for Multi-Turn Mental Health Conversations

Title: How to Improve the Robustness of Closed-Source Models on NLI

Title: FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models

Title: Efficient Speech Translation through Model Compression and Knowledge Distillation

Title: It's High Time: A Survey of Temporal Information Retrieval and Question Answering

Title: KnowTrace: Bootstrapping Iterative Retrieval-Augmented Generation with Structured Knowledge Tracing

Title: WXImpactBench: A Disruptive Weather Impact Understanding Benchmark for Evaluating Large Language Models

Title: Does quantization affect models' performance on long-context tasks?

Title: OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction

Title: One-shot Entropy Minimization

Title: MASKSEARCH: A Universal Pre-Training Framework to Enhance Agentic Search Capability

Title: Enhancing the Comprehensibility of Text Explanations via Unsupervised Concept Discovery

Title: Self-reflective Uncertainties: Do LLMs Know Their Internal Answer Distribution?

Title: Reasoning LLMs are Wandering Solution Explorers

Title: MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding