2025-05-19

Title: GeoGrid-Bench: Can Foundation Models Understand Multimodal Gridded Geo-Spatial Data?

Title: A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment

Title: AI-enhanced semantic feature norms for 786 concepts

Title: Tracr-Injection: Distilling Algorithms into Pre-trained Language Models

Title: Model Performance-Guided Evaluation Data Selection for Effective Prompt Optimization

Title: Ranked Voting based Self-Consistency of Large Language Models

Title: A Systematic Analysis of Base Model Choice for Reward Modeling

Title: Finetune-RAG: Fine-Tuning Language Models to Resist Hallucination in Retrieval-Augmented Generation

Title: Enhancing Low-Resource Minority Language Translation with LLMs and Retrieval-Augmented Generation for Cultural Nuances

Title: Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL

Title: Multimodal Event Detection: Current Approaches and Defining the New Playground through LLMs and VLMs

Title: Have Multimodal Large Language Models (MLLMs) Really Learned to Tell the Time on Analog Clocks?

Title: Improve Rule Retrieval and Reasoning with Self-Induction and Relevance ReEstimate

Title: A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron?

Title: Connecting the Dots: A Chain-of-Collaboration Prompting Framework for LLM Agents

Title: Reasoning with OmniThought: A Large CoT Dataset with Verbosity and Cognitive Difficulty Annotations

Title: Accurate KV Cache Quantization with Outlier Tokens Tracing

Title: GenKnowSub: Improving Modularity and Reusability of LLMs through General Knowledge Subtraction

Title: Semantic Aware Linear Transfer by Recycling Pre-trained Language Models for Cross-lingual Transfer

Title: The Way We Prompt: Conceptual Blending, Neural Dynamics, and Prompt-Induced Transitions in LLMs

Title: Illusion or Algorithm? Investigating Memorization, Emergence, and Symbolic Processing in In-Context Learning

Title: Review-Instruct: A Review-Driven Multi-Turn Conversations Generation Method for Large Language Models

Title: OntoURL: A Benchmark for Evaluating Large Language Models on Symbolic Ontological Understanding, Reasoning and Learning

Title: BLEUBERI: BLEU is a surprisingly effective reward for instruction following

Title: Towards Better Evaluation for Generated Patent Claims

Title: Scaling Reasoning can Improve Factuality in Large Language Models

Title: SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization

Title: Low-Resource Language Processing: An OCR-Driven Summarization and Translation Pipeline

Title: HAPO: Training Language Models to Reason Concisely via History-Aware Policy Optimization

Title: Semantic Caching of Contextual Summaries for Efficient Question-Answering with Language Models

Title: Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs

Title: Temporal fine-tuning for early risk detection

Title: XtraGPT: LLMs for Human-AI Collaboration on Controllable Academic Paper Revision

Title: Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models

Title: LegoSLM: Connecting LLM with Speech Encoder using CTC Posteriors

Title: GuideBench: Benchmarking Domain-Oriented Guideline Following for LLM Agents

Title: CARES: Comprehensive Evaluation of Safety and Adversarial Robustness in Medical LLMs

Title: Towards Cultural Bridge by Bahnaric-Vietnamese Translation Using Transfer Learning of Sequence-To-Sequence Pre-training Language Model

Title: When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs

Title: GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art

Title: Is Compression Really Linear with Code Intelligence?

Title: Disentangling Reasoning and Knowledge in Medical Large Language Models

Title: HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages

Title: Improving Assembly Code Performance with Large Language Models via Reinforcement Learning

Title: SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning

Title: Modeling cognitive processes of natural reading with transformer-based Language Models