2026-03-30

Title: RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation

Title: Doctorina MedBench: End-to-End Evaluation of Agent-Based Medical AI

Title: Density-aware Soft Context Compression with Semi-Dynamic Compression Ratio

Title: Can Small Models Reason About Legal Documents? A Comparative Study

Title: When Chain-of-Thought Backfires: Evaluating Prompt Sensitivity in Medical Language Models

Title: MemoryCD: Benchmarking Long-Context User Memory of LLM Agents for Lifelong Cross-Domain Personalization

Title: Toward Culturally Grounded Natural Language Processing

Title: AgentCollab: A Self-Evaluation-Driven Collaboration Paradigm for Efficient LLM Agents

Title: Retrieval-Augmented Generation Based Nurse Observation Extraction

Title: LLM Benchmark-User Need Misalignment for Climate Change

Title: Clash of the models: Comparing performance of BERT-based variants for generic news frame detection

Title: ClinicalAgents: Multi-Agent Orchestration for Clinical Decision Making with Dual-Memory

Title: Sparse Auto-Encoders and Holism about Large Language Models

Title: Ask or Assume? Uncertainty-Aware Clarification-Seeking in Coding Agents

Title: A Universal Vibe? Finding and Controlling Language-Agnostic Informal Register with SAEs

Title: Distilling Conversations: Abstract Compression of Conversational Audio Context for LLM-based ASR

Title: findsylls: A Language-Agnostic Toolkit for Syllable-Level Speech Tokenization and Embedding

Title: From Human Cognition to Neural Activations: Probing the Computational Primitives of Spatial Reasoning in LLMs

Title: CALRK-Bench: Evaluating Context-Aware Legal Reasoning in Korean Law

Title: Switch Attention: Towards Dynamic and Fine-grained Hybrid Transformers

Title: Why Models Know But Don't Say: Chain-of-Thought Faithfulness Divergence Between Thinking Tokens and Answers in Open-Weight Reasoning Models

Title: Automating Clinical Information Retrieval from Finnish Electronic Health Records Using Large Language Models

Title: ClimateCheck 2026: Scientific Fact-Checking and Disinformation Narrative Classification of Climate-related Claims

Title: Clinical named entity recognition in the Portuguese language: a benchmark of modern BERT models and LLMs

Title: AMALIA Technical Report: A Fully Open Source Large Language Model for European Portuguese

Title: JAL-Turn: Joint Acoustic-Linguistic Modeling for Real-Time and Robust Turn-Taking Detection in Full-Duplex Spoken Dialogue Systems

Title: ALBA: A European Portuguese Benchmark for Evaluating Language and Linguistic Dimensions in Generative LLMs

Title: How Open Must Language Models be to Enable Reliable Scientific Inference?

Title: Development of a European Union Time-Indexed Reference Dataset for Assessing the Performance of Signal Detection Methods in Pharmacovigilance using a Large Language Model

Title: MemBoost: A Memory-Boosted Framework for Cost-Aware LLM Inference

Title: Weight Tying Biases Token Embeddings Towards the Output Space