2026-03-23

Title: When Prompt Optimization Becomes Jailbreaking: Adaptive Red-Teaming of Large Language Models

Title: DuCCAE: A Hybrid Engine for Immersive Conversation via Collaboration, Augmentation, and Evolution

Title: Can Structural Cues Save LLMs? Evaluating Language Models in Massive Document Streams

Title: Enhancing Legal LLMs through Metadata-Enriched RAG Pipelines and Direct Preference Optimization

Title: GeoChallenge: A Multi-Answer Multiple-Choice Benchmark for Geometric Reasoning with Diagrams

Title: A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2

Title: From Comprehension to Reasoning: A Hierarchical Benchmark for Automated Financial Research Reporting

Title: LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models

Title: ShobdoSetu: A Data-Centric Framework for Bengali Long-Form Speech Recognition and Speaker Diarization

Title: Constraint-aware Path Planning from Natural Language Instructions Using Large Language Models

Title: MAPLE: Metadata Augmented Private Language Evolution

Title: Significance-Gain Pair Encoding for LLMs: A Statistical Alternative to Frequency-Based Subword Merging

Title: The α-Law of Observable Belief Revision in Large Language Model Inference

Title: Generative Active Testing: Efficient LLM Evaluation via Proxy Task Adaptation

Title: When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the Suppression of Genesis in Language Models

Title: Probing to Refine: Reinforcement Distillation of LLMs via Explanatory Inversion

Title: Reviewing the Reviewer: Graph-Enhanced LLMs for E-commerce Appeal Adjudication

Title: Full-Stack Domain Enhancement for Combustion LLMs: Construction and Optimization

Title: From Tokens To Agents: A Researcher's Guide To Understanding Large Language Models

Title: Autonoma: A Hierarchical Multi-Agent Framework for End-to-End Workflow Automation

Title: A Human-Centered Workflow for Using Large Language Models in Content Analysis

Title: Transformers are Stateless Differentiable Neural Computers

Title: LSR: Linguistic Safety Robustness Benchmark for Low-Resource West African Languages

Title: CURE: A Multimodal Benchmark for Clinical Understanding and Retrieval Evaluation

Title: Improving Automatic Summarization of Radiology Reports through Mid-Training of Large Language Models

Title: From Flat to Structural: Enhancing Automated Short Answer Grading with GraphRAG

Title: HypeLoRA: Hyper-Network-Generated LoRA Adapters for Calibrated Language Model Fine-Tuning

Title: From Feature-Based Models to Generative AI: Validity Evidence for Constructed Response Scoring

Title: URAG: A Benchmark for Uncertainty Quantification in Retrieval-Augmented Large Language Models

Title: Framing Effects in Independent-Agent Large Language Models: A Cross-Family Behavioral Analysis

Title: Automated Motif Indexing on the Arabian Nights

Title: LLM-MRD: LLM-Guided Multi-View Reasoning Distillation for Fake News Detection

Title: PrefPO: Pairwise Preference Prompt Optimization

Title: Memory-Driven Role-Playing: Evaluation and Enhancement of Persona Knowledge Utilization in LLMs

Title: Prompt-tuning with Attribute Guidance for Low-resource Entity Matching

Title: Scalable Prompt Routing via Fine-Grained Latent Task Discovery

Title: Is Evaluation Awareness Just Format Sensitivity? Limitations of Probe-Based Evidence under Controlled Prompt Structure

Title: Vocabulary shapes cross-lingual variation of word-order learnability in language models

Title: Cooperation and Exploitation in LLM Policy Synthesis for Sequential Social Dilemmas

Title: Inducing Sustained Creativity and Diversity in Large Language Models

Title: EvidenceRL: Reinforcing Evidence Consistency for Trustworthy Language Models

Title: FDARxBench: Benchmarking Regulatory and Clinical Reasoning on FDA Generic Drug Assessment

Title: TextReasoningBench: Does Reasoning Really Improve Text Classification in Large Language Models?

Title: BEAVER: A Training-Free Hierarchical Prompt Compression Method via Structure-Aware Page Selection

Title: Structured Prompting for Arabic Essay Proficiency: A Trait-Centric Evaluation Approach

Title: DataProphet: Demystifying Supervision Data Generalization in Multimodal LLMs

Title: EvoTaxo: Building and Evolving Taxonomy from Social Media Streams

Title: LoopRPT: Reinforcement Pre-Training for Looped Language Models

Title: PoC: Performance-oriented Context Compression for Large Language Models via Performance Prediction

Title: Rethinking Ground Truth: A Case Study on Human Label Variation in MLLM Benchmarking

Title: Neither Here Nor There: Cross-Lingual Representation Dynamics of Code-Mixed Text in Multilingual Encoders

Title: Semantic Delta: An Interpretable Signal Differentiating Human and LLMs Dialogue

Title: SAGE: Sustainable Agent-Guided Expert-tuning for Culturally Attuned Translation in Low-Resource Southeast Asia

Title: When Contextual Inference Fails: Cancelability in Interactive Instruction Following

Title: An Agentic Approach to Generating XAI-Narratives

Title: RouterKGQA: Specialized–General Model Routing for Constraint-Aware Knowledge Graph Question Answering

Title: LoASR-Bench: Evaluating Large Speech Language Models on Low-Resource Automatic Speech Recognition Across Language Families

Title: An Empirical Study of SFT-DPO Interaction and Parameterization in Small Language Models

Title: Current LLMs still cannot 'talk much' about grammar modules: Evidence from syntax

Title: Reasoning Gets Harder for LLMs Inside A Dialogue

Title: Semantic Token Clustering for Efficient Uncertainty Quantification in Large Language Models

Title: Evaluating Evidence Grounding Under User Pressure in Instruction-Tuned Language Models

Title: Measuring Faithfulness Depends on How You Measure: Classifier Sensitivity in LLM Chain-of-Thought Evaluation