2025-09-01

Title: Mapping Toxic Comments Across Demographics: A Dataset from German Public Broadcasting

Title: How Does Cognitive Bias Affect Large Language Models? A Case Study on the Anchoring Effect in Price Negotiation Simulations

Title: Can Multimodal LLMs Solve the Basic Perception Problems of Percept-V?

Title: A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers

Title: Quantifying Label-Induced Bias in Large Language Model Self- and Cross-Evaluations

Title: BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design

Title: Improving Aviation Safety Analysis: Automated HFACS Classification Using Reinforcement Learning with Group Relative Policy Optimization

Title: Enhancing Robustness of Autoregressive Language Models against Orthographic Attacks via Pixel-based Approach

Title: Do Self-Supervised Speech Models Exhibit the Critical Period Effects in Language Acquisition?

Title: Decoding Memories: An Efficient Pipeline for Self-Consistency Hallucination Detection

Title: BLUEX Revisited: Enhancing Benchmark Coverage with Automatic Captioning

Title: Challenges and Applications of Large Language Models: A Comparison of GPT and DeepSeek family of models

Title: Normality and the Turing Test

Title: AllSummedUp: An Open-Source Framework for Comparing Summarization Evaluation Metrics

Title: Automatic Reviewers Fail to Detect Faulty Reasoning in Research Papers: A New Counterfactual Evaluation Framework

Title: Med-RewardBench: Benchmarking Reward Models and Judges for Medical Multimodal Large Language Models

Title: Discovering Semantic Subdimensions through Disentangled Conceptual Representations

Title: Beyond the Surface: Probing the Ideological Depth of Large Language Models

Title: Igniting Creative Writing in Small Language Models: LLM-as-a-Judge versus Multi-Agent Refined Rewards

Title: A Survey on Current Trends and Recent Advances in Text Anonymization

Title: Middo: Model-Informed Dynamic Data Optimization for Enhanced LLM Fine-Tuning via Closed-Loop Learning

Title: Personality Matters: User Traits Predict LLM Preferences in Multi-Turn Collaborative Tasks

Title: QZhou-Embedding Technical Report

Title: Is this chart lying to me? Automating the detection of misleading visualizations

Title: Not All Parameters Are Created Equal: Smart Isolation Boosts Fine-Tuning Performance

Title: Reasoning-Intensive Regression

Title: PiCSAR: Probabilistic Confidence Selection And Ranking

Title: Going over Fine Web with a Fine-Tooth Comb: Technical Report of Indexing Fine Web for Problematic Content Search and Retrieval