2025-06-30

Title: VAT-KG: Knowledge-Intensive Multimodal Knowledge Graph Dataset for Retrieval-Augmented Generation

Title: Debunk and Infer: Multimodal Fake News Detection via Diffusion-Generated Evidence and LLM Reasoning

Title: Bench to the Future: A Pastcasting Benchmark for Forecasting Agents

Title: GraphLAMA: Enabling Efficient Adaptation of Graph Language Models with Limited Annotations

Title: Reinforcement Learning Fine-Tuning of Language Model for Instruction Following and Math Reasoning

Title: Reasoning Isn't Enough: Examining Truth-Bias and Sycophancy in LLMs

Title: FloorPlan-DeepSeek (FPDS): A multimodal approach to floorplan generation using vector-based next room prediction

Title: FormosanBench: Benchmarking Low-Resource Austronesian Languages in the Era of Large Language Models

Title: Team QUST at SemEval-2025 Task 10: Evaluating Large Language Models in Multiclass Multi-label Classification of News Entity Framing

Title: A Multi-Agent Probabilistic Inference Framework Inspired by Kairanban-Style CoT System with IdoBata Conversation for Debiasing

Title: BioPars: A Pretrained Biomedical Large Language Model for Persian Biomedical Text Mining

Title: Assessing RAG and HyDE on 1B vs. 4B-Parameter Gemma LLMs for Personal Assistants Integration

Title: Hybrid-NL2SVA: Integrating RAG and Finetuning for LLM-based NL2SVA

Title: Random Initialization Can't Catch Up: The Advantage of Language Model Transfer for Time Series Forecasting

Title: Towards Understanding the Cognitive Habits of Large Reasoning Models

Title: Aligning MLLM Benchmark With Human Preferences via Structural Equation Modeling

Title: Instruction Learning Paradigms: A Dual Perspective on White-box and Black-box LLMs

Title: Digital Gatekeepers: Exploring Large Language Model's Role in Immigration Decisions

Title: STRuCT-LLM: Unifying Tabular and Graph Reasoning with Reinforcement Learning for Semantic Parsing

Title: Adapting Whisper for Parameter-efficient Code-Switching Speech Recognition via Soft Prompt Tuning

Title: Language-Aware Prompt Tuning for Parameter-Efficient Seamless Language Expansion in Multilingual ASR

Title: HealthQA-BR: A System-Wide Benchmark Reveals Critical Knowledge Gaps in Large Language Models

Title: From General Reasoning to Domain Expertise: Uncovering the Limits of Generalization in Large Language Models

Title: VIDEE: Visual and Interactive Decomposition, Execution, and Evaluation of Text Analytics with Intelligent Agents

Title: Empirical Evidence for Alignment Faking in Small LLMs and Prompt-Based Mitigation Techniques

Title: Evaluation of LLM-based Strategies for the Extraction of Food Product Information from Online Shops

Title: Can Vision Language Models Understand Mimed Actions?

Title: Is DeepSeek a New Voice Among LLMs in Public Opinion Simulation?

Title: Understanding Verbatim Memorization in LLMs Through Circuit Discovery

Title: A General Method for Detecting Information Generated by Large Language Models

Title: Representation Consistency for Accurate and Coherent LLM Answer Aggregation

Title: FinEval-KR: A Financial Domain Evaluation Framework for Large Language Models' Knowledge and Reasoning

Title: Gazal-R1: Achieving State-of-the-Art Medical Reasoning with Parameter-Efficient Two-Stage Training

Title: Thunder-LLM: Efficiently Adapting LLMs to Korean with Minimal Resources

Title: Evaluating Multimodal Large Language Models on Educational Textbook Question Answering

Title: Overview of the ClinIQLink 2025 Shared Task on Medical Question-Answering

Title: Structured Attention Matters to Multimodal LLMs in Document Understanding

Title: BiMark: Unbiased Multilayer Watermarking for Large Language Models

Title: Operationalizing Automated Essay Scoring: A Human-Aware Approach

Title: MemBench: Towards More Comprehensive Evaluation on the Memory of LLM-based Agents

Title: Large Language Models as symbolic DNA of cultural dynamics

Title: CORE-KG: An LLM-Driven Knowledge Graph Construction Framework for Human Smuggling Networks

Title: SysTemp: A Multi-Agent System for Template-Based Generation of SysML v2

Title: From Thinking to Output: Chain-of-Thought and Text Generation Characteristics in Reasoning Language Models

Title: Does Multimodality Lead to Better Time Series Forecasting?

Title: ChildGuard: A Specialized Dataset for Combatting Child-Targeted Hate Speech

Title: LastingBench: Defend Benchmarks Against Knowledge Leakage

Title: Refine Medical Diagnosis Using Generation Augmented Retrieval and Clinical Practice Guidelines

Title: TIM: A Large-Scale Dataset and large Timeline Intelligence Model for Open-domain Timeline Summarization

Title: TrajTok: Technical Report for 2025 Waymo Open Sim Agents Challenge

Title: IndexTTS2: A Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech

Title: How Large Language Models play humans in online conversations: a simulated study of the 2016 US politics on Reddit

Title: The Open Proof Corpus: A Large-Scale Study of LLM-Generated Mathematical Proofs

Title: Doc2SAR: A Synergistic Framework for High-Fidelity Extraction of Structure-Activity Relationships from Scientific Documents

Title: Do We Really Need GNNs with Explicit Structural Modeling? MLPs Suffice for Language Model Representations

Title: (Fact) Check Your Bias

Title: Evaluating List Construction and Temporal Understanding capabilities of Large Language Models

Title: Offensive Language Detection on Social Media Using XLNet

Title: Towards Transparent AI: A Survey on Explainable Large Language Models

Title: Exploring the Structure of AI-Induced Language Change in Scientific English

Title: The Consistency Hypothesis in Uncertainty Quantification for Large Language Models

Title: Derivational Probing: Unveiling the Layer-wise Derivation of Syntactic Structures in Neural Language Models

Title: DeepTalk: Towards Seamless and Smart Speech Interaction with Adaptive Modality-Specific MoE

Title: WildSpeech-Bench: Benchmarking Audio LLMs in Natural Speech Conversation

Title: Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation

Title: A Dual-Layered Evaluation of Geopolitical and Cultural Bias in LLMs

Title: AutoMixer: Checkpoint Artifacts as Automatic Data Mixers

Title: PapersPlease: A Benchmark for Evaluating Motivational Values of Large Language Models Based on ERG Theory

Title: More Vulnerable than You Think: On the Stability of Tool-Integrated LLM Agents

Title: Advancing Jailbreak Strategies: A Hybrid Approach to Exploiting LLM Vulnerabilities and Bypassing Modern Defenses

Title: Don't Trust Generative Agents to Mimic Communication on Social Networks Unless You Benchmarked their Empirical Realism

Title: Analyzing and Fine-Tuning Whisper Models for Multilingual Pilot Speech Transcription in the Cockpit

Title: Can Peter Pan Survive MT? A Stylometric Study of LLMs, NMTs, and HTs in Children's Literature Translation

Title: Decoding Machine Translationese in English-Chinese News: LLMs vs. NMTs

Title: Lost at the Beginning of Reasoning

Title: Identifying a Circuit for Verb Conjugation in GPT-2

Title: SAGE: Spliced-Audio Generated Data for Enhancing Foundational Models in Low-Resource Arabic-English Code-Switched Speech Recognition

Title: Training Language Model to Critique for Better Refinement

Title: Leveraging In-Context Learning for Political Bias Testing of LLMs

Title: Detection of Personal Data in Structured Datasets Using a Large Language Model

Title: Evaluating Scoring Bias in LLM-as-a-Judge

Title: Why Are Parsing Actions for Understanding Message Hierarchies Not Random?

Title: QuickSilver -- Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization

Title: Refining Czech GEC: Insights from a Multi-Experiment Approach

Title: HyperCLOVA X THINK Technical Report

Title: Sequential Diagnosis with Language Models