2026-03-26

Title: Leveraging Computerized Adaptive Testing for Cost-effective Evaluation of Large Language Models in Medical Benchmarking

Title: Beyond Masks: Efficient, Flexible Diffusion Language Models via Deletion-Insertion Processes

Title: Fast and Faithful: Real-Time Verification for Long-Document Retrieval-Augmented Generation Systems

Title: Internal Safety Collapse in Frontier Large Language Models

Title: Visuospatial Perspective Taking in Multimodal Language Models

Title: DISCO: Document Intelligence Suite for COmparative Evaluation

Title: S-Path-RAG: Semantic-Aware Shortest-Path Retrieval Augmented Generation for Multi-Hop Knowledge Graph Question Answering

Title: Berta: an open-source, modular tool for AI-enabled clinical documentation

Title: DepthCharge: A Domain-Agnostic Framework for Measuring Depth-Dependent Knowledge in Large Language Models

Title: Training a Large Language Model for Medical Coding Using Privacy-Preserving Synthetic Clinical Data

Title: MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens

Title: Cluster-R1: Large Reasoning Models Are Instruction-following Clustering Agents

Title: MedMT-Bench: Can LLMs Memorize and Understand Long Multi-Turn Conversations in Medical Scenarios?

Title: From Physician Expertise to Clinical Agents: Preserving, Standardizing, and Scaling Physicians' Medical Expertise with Lightweight LLM

Title: Chitrakshara: A Large Multilingual Multimodal Dataset for Indian languages

Title: Qworld: Question-Specific Evaluation Criteria for LLMs

Title: Do 3D Large Language Models Really Understand 3D Spatial Relationships?

Title: Navigating the Concept Space of Language Models

Title: Prompt Compression in Production Task Orchestration: A Pre-Registered Randomized Trial

Title: Plato's Cave: A Human-Centered Research Verification System

Title: Compression Method Matters: Benchmark-Dependent Output Dynamics in LLM Prompt Compression

Title: The Compression Paradox in LLM Inference: Provider-Dependent Energy Effects of Prompt Compression

Title: Konkani LLM: Multi-Script Instruction Tuning and Evaluation for a Low-Resource Indian Language

Title: Did You Forget What I Asked? Prospective Memory Failures in Large Language Models

Title: Large Language Models Unpack Complex Political Opinions through Target-Stance Extraction

Title: Generating Hierarchical JSON Representations of Scientific Sentences Using LLMs

Title: MDKeyChunker: Single-Call LLM Enrichment with Rolling Keys and Key-Based Restructuring for High-Accuracy RAG

Title: Revisiting Real-Time Digging-In Effects: No Evidence from NP/Z Garden-Paths

Title: Swiss-Bench SBP-002: A Frontier Model Comparison on Swiss Legal and Regulatory Tasks

Title: Probing Ethical Framework Representations in Large Language Models: Structure, Entanglement, and Methodological Challenges

Title: PLACID: Privacy-preserving Large language models for Acronym Clinical Inference and Disambiguation

Title: The Diminishing Returns of Early-Exit Decoding in Modern LLMs

Title: IslamicMMLU: A Benchmark for Evaluating LLMs on Islamic Knowledge

Title: Perturbation: A simple and efficient adversarial tracer for representation learning in language models

Title: PoliticsBench: Benchmarking Political Values in Large Language Models with Multi-Turn Roleplay

Title: Language Model Planners do not Scale, but do Formalizers?

Title: BeliefShift: Benchmarking Temporal Belief Consistency and Opinion Drift in LLM Agents

Title: Self-Distillation for Multi-Token Prediction

Title: Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development

Title: Argument Mining as a Text-to-Text Generation Task

Title: From AI Assistant to AI Scientist: Autonomous Discovery of LLM-RL Algorithms with LLM Agents

Title: The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More

Title: Grounding Arabic LLMs in the Doha Historical Dictionary: Retrieval-Augmented Understanding of Quran and Hadith

Title: CoCR-RAG: Enhancing Retrieval-Augmented Generation in Web Q&A via Concept-oriented Context Reconstruction

Title: Thinking with Tables: Enhancing Multi-Modal Tabular Understanding via Neuro-Symbolic Reasoning

Title: CVPD at QIAS 2026: RAG-Guided LLM Reasoning for Al-Mawarith Share Computation and Heir Allocation

Title: Schema on the Inside: A Two-Phase Fine-Tuning Method for High-Efficiency Text-to-SQL at Scale

Title: From Oracle to Noisy Context: Mitigating Contextual Exposure Bias in Speech-LLMs

Title: FinToolSyn: A forward synthesis Framework for Financial Tool-Use Dialogue Data with Dynamic Tool Retrieval

Title: ConceptKT: A Benchmark for Concept-Level Deficiency Prediction in Knowledge Tracing

Title: LLMpedia: A Transparent Framework to Materialize an LLM's Encyclopedic Knowledge at Scale

Title: Alignment Reduces Expressed but Not Encoded Gender Bias: A Unified Framework and Study

Title: MedAidDialog: A Multilingual Multi-Turn Medical Dialogue Dataset for Accessible Healthcare

Title: Optimizing Multilingual LLMs via Federated Learning: A Study of Client Language Composition

Title: Semantic Alignment across Ancient Egyptian Language Stages via Normalization-Aware Multitask Learning

Title: GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents

Title: When AI Meets Early Childhood Education: Large Language Models as Assessment Teammates in Chinese Preschools

Title: PINGALA: Prosody-Aware Decoding for Sanskrit Poetry Generation

Title: Mechanic: Sorrifier-Driven Formal Decomposition Workflow for Automated Theorem Proving

Title: Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

Title: Representation Learning to Study Temporal Dynamics in Tutorial Scaffolding

Title: MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination

Title: Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA