2025-06-17

Title: Focusing on Students, not Machines: Grounded Question Generation and Automated Answer Grading

Title: ChatbotManip: A Dataset to Facilitate Evaluation and Oversight of Manipulative Chatbot Behaviour

Title: Continuously Updating Digital Twins using Large Language Models

Title: UCD: Unlearning in LLMs via Contrastive Decoding

Title: Personalized LLM Decoding via Contrasting Personal Preference

Title: Eliciting Reasoning in Language Models with Cognitive Tools

Title: Can Mixture-of-Experts Surpass Dense LLMs Under Strictly Equal Resources?

Title: Hatevolution: What Static Benchmarks Don't Tell Us

Title: Maximally-Informative Retrieval for State Space Model Generation

Title: A Rigorous Evaluation of LLM Data Generation Strategies for Low-Resource Languages

Title: Instruction Tuning and CoT Prompting for Contextual Medical QA with LLMs

Title: Supernova Event Dataset: Interpreting Large Language Model's Personality through Critical Event Analysis

Title: Infini-gram mini: Exact n-gram Search at the Internet Scale with FM-Index

Title: Large Language Models for History, Philosophy, and Sociology of Science: Interpretive Uses, Methodological Challenges, and Critical Perspectives

Title: The Behavior Gap: Evaluating Zero-shot LLM Agents in Complex Task-Oriented Dialogs

Title: Med-U1: Incentivizing Unified Medical Reasoning in LLMs via Large-scale Reinforcement Learning

Title: Intersectional Bias in Japanese Large Language Models from a Contextualized Perspective

Title: Investigating the Effects of Cognitive Biases in Prompts on Large Language Model Outputs

Title: Refract ICL: Rethinking Example Selection in the Era of Million-Token Models

Title: Efficient Reasoning Through Suppression of Self-Affirmation Reflections in Large Reasoning Models

Title: Advances in LLMs with Focus on Reasoning, Adaptability, Efficiency and Ethics

Title: Understanding the Effect of Knowledge Graph Extraction Error on Downstream Graph Analyses: A Case Study on Affiliation Graphs

Title: Training-free LLM Merging for Multi-task Learning

Title: Recent Advances and Future Directions in Literature-Based Discovery

Title: Group then Scale: Dynamic Mixture-of-Experts Multilingual Language Model

Title: Exploring Cultural Variations in Moral Judgments with Large Language Models

Title: From Outcomes to Processes: Guiding PRM Learning from ORM for Inference-Time Alignment

Title: Language Surgery in Multilingual Large Language Models

Title: TagRouter: Learning Route to LLMs through Tags for Open-Domain Text Generation Tasks

Title: FlexRAG: A Flexible and Comprehensive Framework for Retrieval-Augmented Generation

Title: Improving Factuality for Dialogue Response Generation via Graph-Based Knowledge Augmentation

Title: Towards Fairness Assessment of Dutch Hate Speech Detection

Title: Detection, Classification, and Mitigation of Gender Bias in Large Language Models

Title: Speech-Language Models with Decoupled Tokenizers and Multi-Token Prediction

Title: RealFactBench: A Benchmark for Evaluating Large Language Models in Real-World Fact-Checking

Title: Profiling News Media for Factuality and Bias Using LLMs and the Fact-Checking Methodology of Human Experts

Title: DoTA-RAG: Dynamic of Thought Aggregation RAG

Title: Overview of the NLPCC 2025 Shared Task: Gender Bias Mitigation Challenge

Title: Enabling Precise Topic Alignment in Large Language Models Via Sparse Autoencoders

Title: OneEval: Benchmarking LLM Knowledge-intensive Reasoning over Diverse Knowledge Bases

Title: An Exploration of Mamba for Speech Self-Supervised Models

Title: Towards Building General Purpose Embedding Models for Industry 4.0 Agents

Title: OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics

Title: Between Predictability and Randomness: Seeking Artistic Inspiration from AI Generative Models

Title: Synthetic Socratic Debates: Examining Persona Effects on Moral Decision and Persuasion Dynamics

Title: Flexible Realignment of Language Models

Title: Rethinking Hate Speech Detection on Social Media: Can LLMs Replace Traditional Models?

Title: Democratic or Authoritarian? Probing a New Dimension of Political Biases in Large Language Models

Title: Surprise Calibration for Better In-Context Learning

Title: Transforming Chatbot Text: A Sequence-to-Sequence Approach

Title: QFFT, Question-Free Fine-Tuning for Adaptive Reasoning

Title: ArgHiTZ at ArchEHR-QA 2025: A Two-Step Divide and Conquer Approach to Patient Question Answering for Top Factuality

Title: SciDA: Scientific Dynamic Assessor of LLMs

Title: PersonaFeedback: A Large-scale Human-annotated Benchmark For Personalization

Title: SoundMind: RL-Incentivized Logic Reasoning for Audio-Language Models

Title: CliniDial: A Naturally Occurring Multimodal Dialogue Dataset for Team Reflection in Action During Clinical Operation

Title: Assessing the Role of Data Quality in Training Bilingual Language Models

Title: Multi-document Summarization through Multi-document Event Relation Graph Reasoning in LLMs: a case study in Framing Bias Mitigation

Title: Large Language Models Enhanced by Plug and Play Syntactic Knowledge for Aspect-based Sentiment Analysis

Title: Missing the human touch? A computational stylometry analysis of GPT-4 translations of online Chinese literature

Title: Just Go Parallel: Improving the Multilingual Capabilities of Large Language Models

Title: CFBenchmark-MM: Chinese Financial Assistant Benchmark for Multimodal Large Language Model

Title: Multipole Attention for Efficient Long Context Reasoning

Title: MotiveBench: How Far Are We From Human-Like Motivational Reasoning in Large Language Models?

Title: CHILL at SemEval-2025 Task 2: You Can't Just Throw Entities and Hope -- Make Your LLM to Get Them Right

Title: Rethinking Test-Time Scaling for Medical AI: Model and Task-Aware Strategies for LLMs and VLMs

Title: Leveraging In-Context Learning for Language Model Agents

Title: Adapting LLMs for Minimal-edit Grammatical Error Correction

Title: AI-Facilitated Analysis of Abstracts and Conclusions: Flagging Unsubstantiated Claims and Ambiguous Pronouns

Title: Development of the user-friendly decision aid Rule-based Evaluation and Support Tool (REST) for optimizing the resources of an information extraction task

Title: Enhancing Large Language Models with Reliable Knowledge Graphs

Title: Align-then-Unlearn: Embedding Alignment for LLM Unlearning

Title: Breaking Thought Patterns: A Multi-Dimensional Reasoning Framework for LLMs

Title: Do Music Preferences Reflect Cultural Values? A Cross-National Analysis Using Music Embedding and World Values Survey

Title: Capability Salience Vector: Fine-grained Alignment of Loss and Capabilities for Downstream Task Scaling Law

Title: IGD: Token Decisiveness Modeling via Information Gain in LLMs for Personalized Recommendation

Title: AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy

Title: Mitigating Safety Fallback in Editing-based Backdoor Injection on LLMs

Title: Seewo's Submission to MLC-SLM: Lessons learned from Speech Reasoning Language Models

Title: Large Language Models as 'Hidden Persuaders': Fake Product Reviews are Indistinguishable to Humans and Machines

Title: Document-Level Tabular Numerical Cross-Checking: A Coarse-to-Fine Approach

Title: NTU Speechlab LLM-Based Multilingual ASR System for Interspeech MLC-SLM Challenge 2025

Title: Direct Reasoning Optimization: LLMs Can Reward And Refine Their Own Reasoning for Open-Ended Tasks

Title: StoryBench: A Dynamic Benchmark for Evaluating Long-Term Memory with Multi Turns

Title: Efficient Medical VIE via Reinforcement Learning

Title: Decompositional Reasoning for Graph Retrieval with Large Language Models

Title: Bi-directional Context-Enhanced Speech Large Language Models for Multilingual Conversational ASR

Title: RealHiTBench: A Comprehensive Realistic Hierarchical Table Benchmark for Evaluating LLM-Based Table Analysis

Title: Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study

Title: ROSAQ: Rotation-based Saliency-Aware Weight Quantization for Efficiently Compressing Large Language Models

Title: Language Agents for Hypothesis-driven Clinical Decision Making with Reinforcement Learning

Title: Position: Pause Recycling LoRAs and Prioritize Mechanisms to Uncover Limits and Effectiveness

Title: TurBLiMP: A Turkish Benchmark of Linguistic Minimal Pairs

Title: BOW: Bottlenecked Next Word Exploration

Title: TensorSLM: Energy-efficient Embedding Compression of Sub-billion Parameter Language Models on Low-end Devices

Title: Mixture of Weight-shared Heterogeneous Group Attention Experts for Dynamic Token-wise KV Optimization

Title: Understand the Implication: Learning to Think for Pragmatic Understanding

Title: Qwen vs. Gemma Integration with Whisper: A Comparative Study in Multilingual SpeechLLM Systems

Title: CAMS: A CityGPT-Powered Agentic Framework for Urban Human Mobility Simulation

Title: An Empirical Study of LLM-as-a-Judge: How Design Choices Impact Evaluation Reliability

Title: EvolvTrip: Enhancing Literary Character Understanding with Temporal Theory-of-Mind Graphs

Title: Prefix-Tuning+: Modernizing Prefix-Tuning through Attention Independent Prefix Data

Title: Turning Down the Heat: A Critical Analysis of Min-p Sampling in Language Models

Title: Balancing Knowledge Delivery and Emotional Comfort in Healthcare Conversational Systems

Title: Instruction Following by Boosting Attention of Large Language Models

Title: LTRR: Learning To Rank Retrievers for LLMs

Title: Steering LLM Thinking with Budget Guidance