2025-10-20

Title: Rethinking Toxicity Evaluation in Large Language Models: A Multi-Label Perspective

Title: Can generative AI figure out figurative language? The influence of idioms on essay scoring by ChatGPT, Gemini, and Deepseek

Title: A Generalizable Rhetorical Strategy Annotation Model Using LLM-based Debate Simulation and Labelling

Title: Continual Learning via Sparse Memory Finetuning

Title: Measuring the Effect of Disfluency in Multilingual Knowledge Probing Benchmarks

Title: Latent Topic Synthesis: Leveraging LLMs for Electoral Ad Analysis

Title: FarsiMCQGen: a Persian Multiple-choice Question Generation Framework

Title: Structure-R1: Dynamically Leveraging Structural Knowledge in LLM Reasoning through Reinforcement Learning

Title: Extending Audio Context for Long-Form Understanding in Large Audio-Language Models

Title: Planner and Executor: Collaboration between Discrete Diffusion And Autoregressive Models in Reasoning

Title: Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding

Title: TraceCoder: Towards Traceable ICD Coding via Multi-Source Knowledge Integration

Title: Exemplar-Guided Planing: Enhanced LLM Agent for KGQA

Title: Accelerating Mobile Language Model Generation via Hybrid Context and Hardware Coordination

Title: Capabilities and Evaluation Biases of Large Language Models in Classical Chinese Poetry Generation: A Case Study on Tang Poetry

Title: AutoGraph-R1: End-to-End Reinforcement Learning for Knowledge Graph Construction

Title: When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling

Title: Infinity Parser: Layout Aware Reinforcement Learning for Scanned Document Parsing

Title: VocalBench-DF: A Benchmark for Evaluating Speech LLM Robustness to Disfluency

Title: Fine-Tuning MedGemma for Clinical Captioning to Enhance Multimodal RAG over Malaysia CPGs

Title: When Seeing Is not Enough: Revealing the Limits of Active Reasoning in MLLMs

Title: Controllable Abstraction in Summary Generation for Large Language Models via Prompt Engineering

Title: CORE: Reducing UI Exposure in Mobile Agents via Collaboration Between Cloud and Local LLMs

Title: DeceptionBench: A Comprehensive Benchmark for AI Deception Behaviors in Real-world Scenarios

Title: Temporal Referential Consistency: Do LLMs Favor Sequences Over Absolute Time References?

Title: From Characters to Tokens: Dynamic Grouping with Hierarchical BPE

Title: Latent Reasoning in LLMs as a Vocabulary-Space Superposition

Title: MCA: Modality Composition Awareness for Robust Composed Multimodal Retrieval

Title: TokenTiming: A Dynamic Alignment Method for Universal Speculative Decoding Model Pairs

Title: Rethinking Cross-lingual Gaps from a Statistical Viewpoint

Title: Think Parallax: Solving Multi-Hop Problems via Multi-View Knowledge-Graph-Based Retrieval-Augmented Generation

Title: KITE: A Benchmark for Evaluating Korean Instruction-Following Abilities in Large Language Models

Title: Finetuning LLMs for EvaCun 2025 token prediction shared task

Title: HypoSpace: Evaluating LLM Creativity as Set-Valued Hypothesis Generators under Underdetermination

Title: Leveraging LLMs for Context-Aware Implicit Textual and Multimodal Hate Speech Detection

Title: Attention Sinks in Diffusion Language Models

Title: LLMs Judge Themselves: A Game-Theoretic Framework for Human-Aligned Evaluation

Title: On Non-interactive Evaluation of Animal Communication Translators

Title: Emergence of Linear Truth Encodings in Language Models

Title: Paper2Web: Let's Make Your Paper Alive!

Title: SpeechLLMs for Large-scale Contextualized Zero-shot Slot Filling

Title: InfiMed-ORBIT: Aligning LLMs on Open-Ended Complex Tasks via Rubric-Based Incremental Training

Title: PolySkill: Learning Generalizable Skills Through Polymorphic Abstraction