2025-08-15

Title: A Transparent Fairness Evaluation Protocol for Open-Source Language Model Benchmarking on the Blockchain

Title: Thematic and Task-Based Categorization of K-12 GenAI Usages with Hierarchical Topic Modeling

Title: INTIMA: A Benchmark for Human-AI Companionship Behavior

Title: XFacta: Contemporary, Real-World Dataset and Evaluation for Multimodal Misinformation Detection with Multimodal LLMs

Title: AutoGeTS: Knowledge-based Automated Generation of Text Synthetics for Improving Text Classification

Title: Semantic Structure in Large Language Model Embeddings

Title: From Answers to Questions: EQGBench for Evaluating LLMs' Educational Question Generation

Title: Automated scoring of the Ambiguous Intentions Hostility Questionnaire using fine-tuned large language models

Title: Multidimensional classification of posts for online course discussion forum curation

Title: An Audit and Analysis of LLM-Assisted Health Misinformation Jailbreaks Against LLMs

Title: Evaluation of GPT-based large language generative AI models as study aids for the national licensure examination for registered dietitians in Japan

Title: Guided Navigation in Knowledge-Dense Environments: Structured Semantic Exploration with Guidance Graphs

Title: Semantic Bridge: Universal Multi-Hop Question Generation via AMR-Driven Graph Synthesis

Title: PersonaEval: Are LLM Evaluators Human Enough to Judge Role-Play?

Title: RealTalk-CN: A Realistic Chinese Speech-Text Dialogue Benchmark With Cross-Modal Interaction Analysis

Title: Training-Free Multimodal Large Language Model Orchestration

Title: A Rose by Any Other Name Would Smell as Sweet: Categorical Homotopy Theory for Large Language Models

Title: Decoupling Understanding from Reasoning via Problem Space Mapping for Small-scale Model Reasoning

Title: FedCoT: Communication-Efficient Federated Reasoning Enhancement for Large Language Models

Title: LATTE: Learning Aligned Transactions and Textual Embeddings for Bank Clients

Title: Conformal P-Value in Multiple-Choice Question Answering Tasks with Provable Risk Control

Title: RTTC: Reward-Guided Collaborative Test-Time Compute

Title: Detecting and explaining postpartum depression in real-time with generative artificial intelligence

Title: SABER: Switchable and Balanced Training for Efficient LLM Reasoning

Title: LLMCARE: Alzheimer's Detection via Transformer Models Enhanced by LLM-Generated Synthetic Data

Title: PREF: Reference-Free Evaluation of Personalised Text Generation in LLMs

Title: Latent Fusion Jailbreak: Blending Harmful and Harmless Representations to Elicit Unsafe LLM Outputs

Title: Inference-Aware Prompt Optimization for Aligning Black-Box Large Language Models

Title: The Cost of Thinking: Increased Jailbreak Risk in Large Language Models

Title: Reflect then Learn: Active Prompting for Information Extraction Guided by Introspective Confusion

Title: mSCoRe: a $M$ultilingual and Scalable Benchmark for $S$kill-based $Co$mmonsense $Re$asoning

Title: Multi-Turn Puzzles: Evaluating Interactive Reasoning and Strategic Dialogue in LLMs

Title: LaajMeter: A Framework for LaaJ Evaluation

Title: Estimating Machine Translation Difficulty

Title: Efficient Forward-Only Data Valuation for Pretrained LLMs and VLMs

Title: PakBBQ: A Culturally Adapted Bias Benchmark for QA

Title: Prompt-Response Semantic Divergence Metrics for Faithfulness Hallucination and Misalignment Detection in Large Language Models

Title: Using Large Language Models to Measure Symptom Severity in Patients At Risk for Schizophrenia

Title: Inductive Bias Extraction and Matching for LLM Prompts

Title: Yet another algorithmic bias: A Discursive Analysis of Large Language Models Reinforcing Dominant Discourses on Gender and Race

Title: From Surface to Semantics: Semantic Structure Parsing for Table-Centric Document Analysis

Title: Beyond Semantic Understanding: Preserving Collaborative Frequency Components in LLM-based Recommendation

Title: Cross-Prompt Encoder for Low-Performing Languages

Title: Making Qwen3 Think in Korean with Reinforcement Learning

Title: Advancing Cross-lingual Aspect-Based Sentiment Analysis with LLMs and Constrained Decoding for Sequence-to-Sequence Models

Title: Large Language Models for Summarizing Czech Historical Documents and Beyond

Title: Improving Generative Cross-lingual Aspect-Based Sentiment Analysis with Constrained Decoding

Title: Jailbreaking Commercial Black-Box LLMs with Explicitly Harmful Prompts

Title: Layer-Wise Perturbations via Sparse Autoencoders for Adversarial Text Generation

Title: ComoRAG: A Cognitive-Inspired Memory-Organized RAG for Stateful Long Narrative Reasoning

Title: Evaluating LLMs on Chinese Idiom Translation

Title: Computational Economics in Large Language Models: Exploring Model Behavior and Incentive Design under Resource Constraints

Title: DiFaR: Enhancing Multimodal Misinformation Detection with Diverse, Factual, and Relevant Rationales

Title: When Language Overrules: Revealing Text Dominance in Multimodal Large Language Models

Title: eDIF: A European Deep Inference Fabric for Remote Interpretability of LLM

Title: Learning from Natural Language Feedback for Personalized Question Answering

Title: Thinking Inside the Mask: In-Place Prompting in Diffusion LLMs

Title: Beyond "Not Novel Enough": Enriching Scholarly Critique with LLM-Assisted Feedback

Title: Reinforced Language Models for Sequential Decision Making

Title: Psyche-R1: Towards Reliable Psychological LLMs through Unified Empathy, Expertise, and Reasoning

Title: SSRL: Self-Search Reinforcement Learning

Title: A Survey on Diffusion Language Models