2026-03-03

Title: Personalization Increases Affective Alignment but Has Role-Dependent Effects on Epistemic Independence in LLMs

Title: TAB-PO: Preference Optimization with a Token-Level Adaptive Barrier for Token-Critical Structured Generation

Title: ActMem: Bridging the Gap Between Memory Retrieval and Reasoning in LLM Agents

Title: EPPCMinerBen: A Novel Benchmark for Evaluating Large Language Models on Electronic Patient-Provider Communication via the Patient Portal

Title: Embracing Anisotropy: Turning Massive Activations into Interpretable Control Knobs for Large Language Models

Title: SimpleTool: Parallel Decoding for Real-Time LLM Function Calling

Title: GRIP: Geometric Refinement and Adaptive Information Potential for Data Efficiency

Title: Autorubric: A Unified Framework for Rubric-Based LLM Evaluation

Title: Iterative LLM-based improvement for French Clinical Interview Transcription and Speaker Diarization

Title: Stepwise Penalization for Length-Efficient Chain-of-Thought Reasoning

Title: From Prerequisites to Predictions: Validating a Geometric Hallucination Taxonomy Through Controlled Induction

Title: When Metrics Disagree: Automatic Similarity vs. LLM-as-a-Judge for Clinical Dialogue Evaluation

Title: How Large Language Models Get Stuck: Early structure with persistent errors

Title: Distribution-Aware Companding Quantization of Large Language Models

Title: Policy Compliance of User Requests in Natural Language for AI Systems

Title: LLM-Bootstrapped Targeted Finding Guidance for Factual MLLM-based Medical Report Generation

Title: A Typologically Grounded Evaluation Framework for Word Order and Morphology Sensitivity in Multilingual Masked LMs

Title: CoMoL: Efficient Mixture of LoRA Experts via Dynamic Core Space Merging

Title: Super Research: Answering Highly Complex Questions with Large Language Models through Super Deep and Super Wide Research

Title: From Literature to Hypotheses: An AI Co-Scientist System for Biomarker-Guided Drug Combination Hypothesis Generation

Title: BLUFF: Benchmarking the Detection of False and Synthetic Content across 58 Low-Resource Languages

Title: SSKG Hub: An Expert-Guided Platform for LLM-Empowered Sustainability Standards Knowledge Graphs

Title: RAVEL: Reasoning Agents for Validating and Evaluating LLM Text Synthesis

Title: DRIV-EX: Counterfactual Explanations for Driving LLMs

Title: SkillCraft: Can LLM Agents Learn to Use Tools Skillfully?

Title: RLAR: An Agentic Reward System for Multi-task Reinforcement Learning on Large Language Models

Title: LaSTR: Language-Driven Time-Series Segment Retrieval

Title: Qwen3-Coder-Next Technical Report

Title: A Comprehensive Evaluation of LLM Unlearning Robustness under Multi-Turn Interaction

Title: Constitutional Black-Box Monitoring for Scheming in LLM Agents

Title: Learning Nested Named Entity Recognition from Flat Annotations

Title: MedGPT-oss: Training a General-Purpose Vision-Language Model for Biomedicine

Title: CHIMERA: Compact Synthetic Data for Generalizable LLM Reasoning

Title: KVSlimmer: Theoretical Insights and Practical Optimizations for Asymmetric KV Merging

Title: Prompt Sensitivity and Answer Consistency of Small Open-Source Large Language Models on Clinical Question Answering: Implications for Low-Resource Healthcare Deployment

Title: Hybrid Neural-LLM Pipeline for Morphological Glossing in Endangered Language Documentation: A Case Study of Jungar Tuvan

Title: Conformal Prediction for Risk-Controlled Medical Entity Extraction Across Clinical Domains

Title: The Aftermath of DrawEduMath: Vision Language Models Underperform with Struggling Students and Misdiagnose Errors

Title: Towards Orthographically-Informed Evaluation of Speech Recognition Systems for Indian Languages

Title: S-VoCAL: A Dataset and Evaluation Framework for Inferring Speaking Voice Character Attributes in Literature

Title: Thoth: Mid-Training Bridges LLMs to Time Series Understanding

Title: GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant

Title: How RL Unlocks the Aha Moment in Geometric Interleaved Reasoning

Title: CARD: Towards Conditional Design of Multi-agent Topological Structures

Title: DEP: A Decentralized Large Language Model Evaluation Protocol

Title: Token-level Data Selection for Safe LLM Fine-tuning

Title: Reasoning or Rationalization? The Role of Justifications in Masked Diffusion Models for Fact Verification

Title: Reasoning Boosts Opinion Alignment in LLMs

Title: Generative AI & Fictionality: How Novels Power Large Language Models

Title: Can Thinking Models Think to Detect Hateful Memes?

Title: Self-Anchoring Calibration Drift in Large Language Models: How Multi-Turn Conversations Reshape Model Confidence

Title: Suffix-Constrained Greedy Search Algorithms for Causal Language Models

Title: Linking Knowledge to Care: Knowledge Graph-Augmented Medical Follow-Up Question Generation

Title: LLM Self-Explanations Fail Semantic Invariance

Title: Spectral Attention Steering for Prompt Highlighting

Title: Individual Turing Test: A Case Study of LLM-based Simulation Using Longitudinal Personal Data

Title: Catalyst-Agent: Autonomous heterogeneous catalyst screening and optimization with an LLM Agent

Title: Truth as a Trajectory: What Internal Representations Reveal About Large Language Model Reasoning

Title: MetaState: Persistent Working Memory for Discrete Diffusion Language Models

Title: PanCanBench: A Comprehensive Benchmark for Evaluating Large Language Models in Pancreatic Oncology

Title: Toward Graph-Tokenizing Large Language Models with Reconstructive Graph Instruction Tuning

Title: Quantifying Conversational Reliability of Large Language Models under Multi-Turn Interaction

Title: LaSER: Internalizing Explicit Reasoning into Latent Space for Dense Retrieval

Title: Understanding the Physics of Key-Value Cache Compression for LLMs through Attention Dynamics

Title: Enhancing Persona Following at Decoding Time via Dynamic Importance Estimation for Role-Playing Agents

Title: Anatomy of the Modality Gap: Dissecting the Internal States of End-to-End Speech LLMs

Title: Extracting Training Dialogue Data from Large Language Model based Task Bots

Title: Markovian ODE-guided scoring can assess the quality of offline reasoning traces in language models

Title: Measuring What VLMs Don't Say: Validation Metrics Hide Clinical Terminology Erasure in Radiology Report Generation

Title: Learning to Draft: Adaptive Speculative Decoding with Reinforcement Learning

Title: LexChronos: An Agentic Framework for Structured Event Timeline Extraction in Indian Jurisprudence

Title: Surgical Post-Training: Cutting Errors, Keeping Knowledge

Title: Building a Strong Instruction Language Model for a Less-Resourced Language

Title: Legal RAG Bench: an end-to-end benchmark for legal RAG

Title: Bootstrapping Embeddings for Low Resource Languages

Title: AnnoABSA: A Web-Based Annotation Tool for Aspect-Based Sentiment Analysis with Retrieval-Augmented Suggestions

Title: Beyond the Resumé: A Rubric-Aware Automatic Interview System for Information Elicitation

Title: FreeAct: Freeing Activations for LLM Quantization

Title: LLM-as-an-Annotator: Training Lightweight Models with LLM-Annotated Examples for Aspect Sentiment Tuple Prediction

Title: nchellwig at SemEval-2026 Task 3: Self-Consistent Structured Generation (SCSG) for Dimensional Aspect-Based Sentiment Analysis using Large Language Models

Title: ALTER: Asymmetric LoRA for Token-Entropy-Guided Unlearning of LLMs

Title: OpenAutoNLU: Open Source AutoML Library for NLU

Title: Let the Agent Search: Autonomous Exploration Beats Rigid Workflows in Temporal Question Answering

Title: CyclicJudge: Mitigating Judge Bias Efficiently in LLM-based Evaluation

Title: KDFlow: A User-Friendly and Efficient Knowledge Distillation Framework for Large Language Models

Title: FLANS at SemEval-2026 Task 7: RAG with Open-Sourced Smaller LLMs for Everyday Knowledge Across Diverse Languages and Cultures

Title: Demonstrating ViviDoc: Generating Interactive Documents through Human-Agent Collaboration

Title: AdaPonderLM: Gated Pondering Language Models with Token-Wise Adaptive Depth

Title: AMemGym: Interactive Memory Benchmarking for Assistants in Long-Horizon Conversations

Title: CharacterFlywheel: Scaling Iterative Improvement of Engaging and Steerable LLMs in Production

Title: MMR-Life: Piecing Together Real-life Scenes for Multimodal Multi-image Reasoning

Title: EstLLM: Enhancing Estonian Capabilities in Multilingual LLMs via Continued Pretraining and Post-Training

Title: What Exactly do Children Receive in Language Acquisition? A Case Study on CHILDES with Automated Detection of Filler-Gap Dependencies

Title: ClinConsensus: A Consensus-Based Benchmark for Evaluating Chinese Medical LLMs across Difficulty Levels

Title: Recursive Think-Answer Process for LLMs and VLMs

Title: LLMs as Strategic Actors: Behavioral Alignment, Risk Calibration, and Argumentation Framing in Geopolitical Simulations

Title: LongRLVR: Long-Context Reinforcement Learning Requires Verifiable Context Rewards

Title: Zero- and Few-Shot Named-Entity Recognition: Case Study and Dataset in the Crime Domain (CrimeNER)

Title: Organizing, Orchestrating, and Benchmarking Agent Skills at Ecosystem Scale

Title: Reasoning Core: A Scalable Procedural Data Generation Suite for Symbolic Pre-training and Post-Training