2025-12-16

Title: Enhancing Urban Visual Place Recognition for Crowdsourced Flood Imagery via LLM-Guided Attention

Title: Reinforcement Learning for Latent-Space Thinking in LLMs

Title: Direct Confidence Alignment: Aligning Verbalized Confidence with Internal Confidence In Large Language Models

Title: Hold Onto That Thought: Assessing KV Cache Compression On Reasoning

Title: Benchmarking Contextual Understanding for In-Car Conversational Systems

Title: VOYAGER: A Training Free Approach for Generating Diverse Datasets using LLMs

Title: BLASST: Dynamic BLocked Attention Sparsity via Softmax Thresholding

Title: Extending the Context of Pretrained LLMs by Dropping Their Positional Embeddings

Title: Diffusion Language Model Inference with Monte Carlo Tree Search

Title: Semantic Distance Measurement based on Multi-Kernel Gaussian Processes

Title: Market-Bench: Evaluating Large Language Models on Introductory Quantitative Trading and Market Dynamics

Title: SCIR: A Self-Correcting Iterative Refinement Framework for Enhanced Information Extraction Based on Schema

Title: Can GPT replace human raters? Validity and reliability of machine-generated norms for metaphors

Title: Large language models have learned to use language

Title: The American Ghost in the Machine: How language models align culturally and the effects of cultural prompting

Title: NagaNLP: Bootstrapping NLP for Low-Resource Nagamese Creole with Human-in-the-Loop Synthetic Data

Title: HyperEdit: Unlocking Instruction-based Text Editing in LLMs via Hypernetworks

Title: Coupled Variational Reinforcement Learning for Language Model General Reasoning

Title: Human-Inspired Learning for Large Language Models via Obvious Record and Maximum-Entropy Method Discovery

Title: Understanding Syllogistic Reasoning in LLMs from Formal and Natural Language Perspectives

Title: LexRel: Benchmarking Legal Relation Extraction for Chinese Civil Cases

Title: Fine-Tuning Causal LLMs for Text Classification: Embedding-Based vs. Instruction-Based Approaches

Title: CoDA: A Context-Decoupled Hierarchical Agent with Reinforcement Learning

Title: NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

Title: Curió-Edu 7B: Examining Data Selection Impacts in LLM Continued Pretraining

Title: Persistent Personas? Role-Playing, Instruction Following, and Safety in Extended Interactions

Title: State over Tokens: Characterizing the Role of Reasoning Tokens

Title: Does Tone Change the Answer? Evaluating Prompt Politeness Effects on Modern LLMs: GPT, Gemini, LLaMA

Title: Hindsight is 20/20: Building Agent Memory that Retains, Recalls, and Reflects

Title: What Matters in Evaluating Book-Length Stories? A Systematic Study of Long Story Evaluation

Title: Counting Clues: A Lightweight Probabilistic Baseline Can Match an LLM

Title: Building from Scratch: A Multi-Agent Framework with Human-in-the-Loop for Multilingual Legal Terminology Mapping

Title: QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management

Title: Authors Should Annotate

Title: An Open and Reproducible Deep Research Agent for Long-Form Question Answering

Title: LLM Rationalis? Measuring Bargaining Capabilities of AI Negotiators

Title: Uncovering the Role of Initial Saliency in U-Shaped Attention Bias: Scaling Initial Token Weight for Enhanced Long-Text Processing

Title: Efficient Adaptive Rejection Sampling for Accelerating Speculative Decoding in Large Language Models

Title: AutoTool: Dynamic Tool Selection and Integration for Agentic Reasoning

Title: AIR: Post-training Data Selection for Reasoning via Attention Head Influence

Title: MiniLingua: A Small Open-Source LLM for European Languages

Title: FIN-bench-v2: A Unified and Robust Benchmark Suite for Evaluating Finnish Large Language Models

Title: Large language models are not about language

Title: Scaling Laws for Code: Every Programming Language Matters

Title: Non-Resolution Reasoning: A Framework for Preserving Semantic Ambiguity in Language Models

Title: SkipCat: Rank-Maximized Low-Rank Compression of Large Language Models via Shared Projection and Block Skipping

Title: Memory in the Age of AI Agents

Title: ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding

Title: Textual Gradients are a Flawed Metaphor for Automatic Prompt Optimization

Title: Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models

Title: Temporal Tokenization Strategies for Event Sequence Modeling with Large Language Models

Title: Large-Language Memorization During the Classification of United States Supreme Court Cases

Title: Comparative Analysis of LLM Abliteration Methods: A Cross-Architecture Evaluation

Title: Towards Effective Model Editing for LLM Personalization

Title: Beyond surface form: A pipeline for semantic analysis in Alzheimer's Disease detection from spontaneous speech