2025-11-21

Title: What Really Counts? Examining Step and Token Level Attribution in Multilingual CoT Reasoning

Title: TOD-ProcBench: Benchmarking Complex Instruction-Following in Task-Oriented Dialogues

Title: Liars' Bench: Evaluating Lie Detectors for Language Models

Title: Learning Tractable Distributions Of Language Model Continuations

Title: Early science acceleration experiments with GPT-5

Title: ELPO: Ensemble Learning Based Prompt Optimization for Large Language Models

Title: SemanticCite: Citation Verification with AI-Powered Full-Text Analysis and Evidence-Based Reasoning

Title: SeSE: A Structural Information-Guided Uncertainty Quantification Framework for Hallucination Detection in LLMs

Title: SDA: Steering-Driven Distribution Alignment for Open LLMs without Fine-Tuning

Title: Incorporating Self-Rewriting into Large Language Model Reasoning Reinforcement

Title: NLP Datasets for Idiom and Figurative Language Tasks

Title: AICC: Parse HTML Finer, Make Models Better -- A 7.3T AI-Ready Corpus Built by a Model-Based HTML Parser

Title: ESGBench: A Benchmark for Explainable ESG Question Answering in Corporate Sustainability Reports

Title: Anatomy of an Idiom: Tracing Non-Compositionality in Language Models

Title: Beyond Tokens in Language Models: Interpreting Activations through Text Genre Chunks

Title: WER is Unaware: Assessing How ASR Errors Distort Clinical Understanding in Patient Facing Dialogue

Title: Integrating Symbolic Natural Language Understanding and Language Models for Word Sense Disambiguation

Title: Comparison of Text-Based and Image-Based Retrieval in Multimodal Retrieval Augmented Generation Large Language Model Systems

Title: Nemotron Elastic: Towards Efficient Many-in-One Reasoning LLMs