2025-10-14

Title: Table Question Answering in the Era of Large Language Models: A Comprehensive Survey of Tasks, Methods, and Evaluation

Title: Emotionally Charged, Logically Blurred: AI-driven Emotional Framing Impairs Human Fallacy Detection

Title: The Idola Tribus of AI: Large Language Models tend to perceive order where none exists

Title: SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG

Title: ReaLM: Residual Quantization Bridging Knowledge Graph Embeddings and Large Language Models

Title: All Code, No Thought: Current Language Models Struggle to Reason in Ciphered Language

Title: Preference-Aware Memory Update for Long-Term LLM Agents

Title: Layout-Aware Parsing Meets Efficient LLMs: A Unified, Scalable Framework for Resume Information Extraction and Evaluation

Title: VisRAG 2.0: Evidence-Guided Multi-Image Reasoning in Visual Retrieval-Augmented Generation

Title: Judge's Verdict: A Comprehensive Analysis of LLM Judge Capability Through Human Agreement

Title: Gold Panning: Turning Positional Bias into Signal for Multi-Document LLM Reasoning

Title: PromptGuard at BLP-2025 Task 1: A Few-Shot Classification Framework Using Majority Voting and Keyword Similarity for Bengali Hate Speech Detection

Title: Text Prompt Injection of Vision Language Models

Title: NG-Router: Graph-Supervised Multi-Agent Collaboration for Nutrition Question Answering

Title: NarraBench: A Comprehensive Framework for Narrative Benchmarking

Title: CoBia: Constructed Conversations Can Trigger Otherwise Concealed Societal Biases in LLMs

Title: Closing the Data-Efficiency Gap Between Autoregressive and Masked Diffusion LLMs

Title: Abductive Preference Learning

Title: HIPPD: Brain-Inspired Hierarchical Information Processing for Personality Detection

Title: Don't Throw Away Your Pretrained Model

Title: Enhancing Faithfulness in Abstractive Summarization via Span-Level Fine-Tuning

Title: Unpacking Hateful Memes: Presupposed Context and False Claims

Title: Beyond Fertility: Analyzing STRR as a Metric for Multilingual Tokenization Evaluation

Title: Unifying Tree Search Algorithm and Reward Design for LLM Reasoning: A Survey

Title: Toward Machine Translation Literacy: How Lay Users Perceive and Rely on Imperfect Translations

Title: Beyond the limitation of a single query: Train your LLM for query expansion with Reinforcement Learning

Title: Path Drift in Large Reasoning Models: How First-Person Commitments Override Safety

Title: Lightweight Baselines for Medical Abstract Classification: DistilBERT with Cross-Entropy as a Strong Default

Title: CLMN: Concept based Language Models via Neural Symbolic Reasoning

Title: Unilaw-R1: A Large Language Model for Legal Reasoning with Reinforcement Learning and Iterative Inference

Title: A-IPO: Adaptive Intent-driven Preference Optimization

Title: Stop When Enough: Adaptive Early-Stopping for Chain-of-Thought Reasoning

Title: LinearRAG: Linear Graph Retrieval Augmented Generation on Large-scale Corpora

Title: Hybrid OCR-LLM Framework for Enterprise-Scale Document Information Extraction Under Copy-heavy Task

Title: DiffHeads: Differential Analysis and Inference-Time Masking of Bias Heads in Large Language Models

Title: BILLY: Steering Large Language Models via Merging Persona Vectors for Creative Generation

Title: BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data

Title: Large Language Model Sourcing: A Survey

Title: A Survey of Inductive Reasoning for Large Language Models

Title: MedAgentAudit: Diagnosing and Quantifying Collaborative Failure Modes in Medical Multi-Agent Systems

Title: Weed Out, Then Harvest: Dual Low-Rank Adaptation is an Effective Noisy Label Detector for Noise-Robust Learning

Title: You only need 4 extra tokens: Synergistic Test-time Adaptation for LLMs

Title: Text2Token: Unsupervised Text Representation Learning with Token Target Prediction

Title: ImCoref-CeS: An Improved Lightweight Pipeline for Coreference Resolution with LLM-based Checker-Splitter Refinement

Title: Audit-of-Understanding: Posterior-Constrained Inference for Mathematical Reasoning in Language Models

Title: Backdoor Collapse: Eliminating Unknown Threats via Known Backdoor Aggregation in Language Models

Title: On the Entity-Level Alignment in Crosslingual Consistency

Title: MatryoshkaThinking: Recursive Test-Time Scaling Enables Efficient Reasoning

Title: Are LLMs Empathetic to All? Investigating the Influence of Multi-Demographic Personas on a Model's Empathy

Title: End-to-end Automatic Speech Recognition and Speech Translation: Integration of Speech Foundational Models and LLMs

Title: RefusalBench: Generative Evaluation of Selective Refusal in Grounded Language Models

Title: STEAM: A Semantic-Level Knowledge Editing Framework for Large Language Models

Title: LONGQAEVAL: Designing Reliable Evaluations of Long-Form Clinical QA under Resource Constraints

Title: Do Audio LLMs Really LISTEN, or Just Transcribe? Measuring Lexical vs. Acoustic Emotion Cues Reliance

Title: RECON: Reasoning with Condensation for Efficient Retrieval-Augmented Generation

Title: Steering Over-refusals Towards Safety in Retrieval Augmented Generation

Title: Rethinking LLM Evaluation: Can We Evaluate LLMs with 200x Less Data?

Title: NIM: Neuro-symbolic Ideographic Metalanguage for Inclusive Communication

Title: FML-bench: A Benchmark for Automatic ML Research Agents Highlighting the Importance of Exploration Breadth

Title: Assessing Large Language Models for Structured Medical Order Extraction

Title: UltraLLaDA: Scaling the Context Length to 128K for Diffusion Large Language Models

Title: Merlin's Whisper: Enabling Efficient Reasoning in LLMs via Black-box Adversarial Prompting

Title: Detecting Hallucinations in Authentic LLM-Human Interactions

Title: BitMar: Low-Bit Multimodal Fusion with Episodic Memory for Edge Devices

Title: Dynamic Topic Evolution with Temporal Decay and Attention in Large Language Models

Title: Preserving LLM Capabilities through Calibration Data Curation: From Analysis to Optimization

Title: AGENTIQL: An Agent-Inspired Multi-Expert Framework for Text-to-SQL Generation

Title: BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions

Title: Unlocking LLM Safeguards for Low-Resource Languages via Reasoning and Alignment with Minimal Training Data

Title: RePro: Training Language Models to Faithfully Recycle the Web for Pretraining

Title: Sarcasm Detection Using Deep Convolutional Neural Networks: A Modular Deep Learning Framework

Title: Large Language Models for Full-Text Methods Assessment: A Case Study on Mediation Analysis

Title: Review of Inference-Time Scaling Strategies: Reasoning, Search and RAG

Title: Is Implicit Knowledge Enough for LLMs? A RAG Approach for Tree-based Structures

Title: DUAL-Bench: Measuring Over-Refusal and Robustness in Vision-Language Models

Title: Rethinking Agentic Workflows: Evaluating Inference-Based Test-Time Scaling Strategies in Text2SQL Tasks

Title: LLM×MapReduce-V3: Enabling Interactive In-Depth Survey Generation through a MCP-Driven Hierarchically Modular Agent System

Title: ADVICE: Answer-Dependent Verbalized Confidence Estimation

Title: Evaluating Language Models' Evaluations of Games

Title: KOTOX: A Korean Toxic Dataset for Deobfuscation and Detoxification

Title: Judge Before Answer: Can MLLM Discern the False Premise in Question?

Title: Enhancing Large Language Model Reasoning via Selective Critical Token Fine-Tuning

Title: DeepResearchGuard: Deep Research with Open-Domain Evaluation and Multi-Stage Guardrails for Safety

Title: ABLEIST: Intersectional Disability Bias in LLM-Generated Hiring Scenarios

Title: DND: Boosting Large Language Models with Dynamic Nested Depth

Title: LogiNumSynth: Synthesizing Joint Logical-Numerical Reasoning Problems for Language Models

Title: Enabling Doctor-Centric Medical AI with LLMs through Workflow-Aligned Tasks and Benchmarks

Title: Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States

Title: Enhancing LLM Reasoning via Non-Human-Like Reasoning Path Preference Optimization

Title: TypePilot: Leveraging the Scala Type System for Secure LLM-generated Code

Title: Bridging Gaps in Hate Speech Detection: Meta-Collections and Benchmarks for Low-Resource Iberian Languages

Title: Evaluating Reasoning Faithfulness in Medical Vision-Language Models using Multimodal Perturbations

Title: Discursive Circuits: How Do Language Models Understand Discourse Relations?

Title: Domain-Specific Data Generation Framework for RAG Adaptation

Title: The Curious Case of Factual (Mis)Alignment between LLMs' Short- and Long-Form Answers

Title: WebRouter: Query-specific Router via Variational Information Bottleneck for Cost-sensitive Web Agent

Title: A Theorem-Proving-Based Evaluation of Neural Semantic Parsing

Title: CNSocialDepress: A Chinese Social Media Dataset for Depression Risk Detection and Structured Analysis

Title: XQuant: Achieving Ultra-Low Bit KV Cache Quantization with Cross-Layer Compression

Title: Attacks by Content: Automated Fact-checking is an AI Security Issue

Title: Do Psychometric Tests Work for Large Language Models? Evaluation of Tests on Sexism, Racism, and Morality

Title: Towards Real-Time Fake News Detection under Evidence Scarcity

Title: Emergent Misalignment via In-Context Learning: Narrow in-context examples can produce broadly misaligned LLMs

Title: Are Large Language Models Effective Knowledge Graph Constructors?

Title: FOSSIL: Harnessing Feedback on Suboptimal Samples for Data-Efficient Generalisation with Imitation Learning for Embodied Vision-and-Language Tasks

Title: Template-Based Text-to-Image Alignment for Language Accessibility: A Study on Visualizing Text Simplifications

Title: Do LLMs "Feel"? Emotion Circuits Discovery and Control

Title: LLM-Specific Utility: A New Perspective for Retrieval-Augmented Generation

Title: Stabilizing MoE Reinforcement Learning by Aligning Training and Inference Routers

Title: Early Detection and Reduction of Memorisation for Domain Adaptation and Instruction Tuning

Title: Beyond Survival: Evaluating LLMs in Social Deduction Games with Human-Aligned Strategies

Title: KnowRL: Teaching Language Models to Know What They Know

Title: Valid Survey Simulations with Limited Human Data: The Roles of Prompting, Fine-Tuning, and Rectification

Title: Who are you, ChatGPT? Personality and Demographic Style in LLM-Generated Content

Title: Investigating Large Language Models' Linguistic Abilities for Text Preprocessing

Title: Hallucination Detection via Internal States and Structured Reasoning Consistency in Large Language Models

Title: Information-Preserving Reformulation of Reasoning Traces for Antidistillation

Title: Invisible Languages of the LLM Universe

Title: Culturally-Aware Conversations: A Framework & Benchmark for LLMs

Title: LLMAtKGE: Large Language Models as Explainable Attackers against Knowledge Graph Embeddings

Title: Survey Response Generation: Generating Closed-Ended Survey Responses In-Silico with Large Language Models

Title: MeTA-LoRA: Data-Efficient Multi-Task Fine-Tuning for Large Language Models

Title: Deconstructing Attention: Investigating Design Principles for Effective Language Modeling

Title: LLM-Oriented Token-Adaptive Knowledge Distillation

Title: StoryBox: Collaborative Multi-Agent Simulation for Hybrid Bottom-Up Long-Form Story Generation Using Large Language Models

Title: Enhancing Long Chain-of-Thought Reasoning through Multi-Path Plan Aggregation

Title: ACADREASON: Exploring the Limits of Reasoning Models with Academic Research Problems

Title: Scaling Language-Centric Omnimodal Representation Learning

Title: When Agents Trade: Live Multi-Market Trading Benchmark for LLM Agents

Title: Demystifying Reinforcement Learning in Agentic Reasoning