2025-08-22

Title: Efficient Switchable Safety Control in LLMs via Magic-Token-Guided Co-Training

Title: Bridging the Culture Gap: A Framework for LLM-Driven Socio-Cultural Localization of Math Word Problems in Low-Resource Languages

Title: Improving LLMs for Machine Translation Using Synthetic Preference Data

Title: Multilingual Datasets for Custom Input Extraction and Explanation Requests Parsing in Conversational XAI Systems

Title: Reward-Shifted Speculative Sampling Is An Efficient Test-Time Weak-to-Strong Aligner

Title: LongRecall: A Structured Approach for Robust Recall Evaluation in Long-Form Text

Title: Mapping the Course for Prompt-based Structured Prediction

Title: Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset

Title: Identifying and Answering Questions with False Assumptions: An Interpretable Approach

Title: ContextualLVLM-Agent: A Holistic Framework for Multi-Turn Visually-Grounded Dialogue and Complex Instruction Following

Title: SemToken: Semantic-Aware Tokenization for Efficient Long-Context Language Modeling

Title: Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models

Title: SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning

Title: Select to Know: An Internal-External Knowledge Self-Selection Framework for Domain-Specific Question Answering

Title: Self-Guided Function Calling in Large Language Models via Stepwise Experience Recall

Title: Are Checklists Really Useful for Automatic Evaluation of Generative Tasks?

Title: VocabTailor: Dynamic Vocabulary Selection for Downstream Tasks in Small Language Models

Title: WangchanThaiInstruct: An instruction-following Dataset for Culture-Aware, Multitask, and Multi-domain Evaluation in Thai

Title: EMNLP: Educator-role Moral and Normative Large Language Models Profiling

Title: Conflict-Aware Soft Prompting for Retrieval-Augmented Generation

Title: TComQA: Extracting Temporal Commonsense from Text

Title: A Survey on Large Language Model Benchmarks

Title: Unveiling Trust in Multimodal Large Language Models: Evaluation, Analysis, and Mitigation

Title: Confidence-Modulated Speculative Decoding for Large Language Models

Title: Exploiting Vocabulary Frequency Imbalance in Language Model Pre-training

Title: Attribution, Citation, and Quotation: A Survey of Evidence-based Text Generation with Large Language Models

Title: When Audio and Text Disagree: Revealing Text Bias in Large Audio-Language Models

Title: LLaSO: A Foundational Framework for Reproducible Research in Large Language and Speech Model

Title: A Study of Privacy-preserving Language Modeling Approaches

Title: PyTOD: Programmable Task-Oriented Dialogue with Execution Feedback

Title: RadReason: Radiology Report Evaluation Metric with Reasons and Sub-Scores

Title: SLM4Offer: Personalized Marketing Offer Generation Using Contrastive Learning Based Fine-Tuning

Title: Subjective Behaviors and Preferences in LLM: Language of Browsing

Title: Influence-driven Curriculum Learning for Pre-training on Limited Data

Title: SLM-Bench: A Comprehensive Benchmark of Small Language Models on Environmental Impacts -- Extended Version

Title: HebID: Detecting Social Identities in Hebrew-language Political Text

Title: Dream 7B: Diffusion Large Language Models

Title: The Enemy from Within: A Study of Political Delegitimization Discourse in Israeli Political Speech

Title: SafetyFlow: An Agent-Flow System for Automated LLM Safety Benchmarking

Title: Trained Miniatures: Low cost, High Efficacy SLMs for Sales & Marketing

Title: SDGO: Self-Discrimination-Guided Optimization for Consistent Safety in Large Language Models

Title: Benchmarking Computer Science Survey Generation

Title: EcomMMMU: Strategic Utilization of Visuals for Robust Multimodal E-Commerce Models

Title: End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning

Title: Dissecting Tool-Integrated Reasoning: An Empirical Study and Analysis

Title: LiveMCP-101: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries