2025-04-11

Title: EnDive: A Cross-Dialect Benchmark for Fairness and Performance in Large Language Models

Title: How Robust Are Router-LLMs? Analysis of the Fragility of LLM Routing Capabilities

Title: ChatBench: From Static Benchmarks to Human-AI Evaluation

Title: CLEAR: Contrasting Textual Feedback with Experts and Amateurs for Reasoning

Title: DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Title: HypoEval: Hypothesis-Guided Evaluation for Natural Language Generation

Title: SemEval-2025 Task 5: LLMs4Subjects -- LLM-based Automated Subject Tagging for a National Technical Library's Open-Access Catalog

Title: ConceptCarve: Dynamic Realization of Evidence

Title: Language Modeling for the Future of Finance: A Quantitative Survey into Metrics, Tasks, and Data Opportunities

Title: RAISE: Reinforced Adaptive Instruction Selection For Large Language Models

Title: MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning

Title: PAYADOR: A Minimalist Approach to Grounding Language Models on Structured Data for Interactive Storytelling and Role-playing Games

Title: Alice: Proactive Learning with Teacher's Demonstrations for Weak-to-Strong Generalization

Title: Revisiting Prompt Optimization with Large Reasoning Models-A Case Study on Event Extraction

Title: Enhancing Time Series Forecasting via Multi-Level Text Alignment with LLMs

Title: TALE: A Tool-Augmented Framework for Reference-Free Evaluation of Large Language Models

Title: Talking Point based Ideological Discourse Analysis in News Events

Title: AI Coding with Few-Shot Prompting for Thematic Analysis

Title: AgentAda: Skill-Adaptive Data Analytics for Tailored Insight Discovery

Title: From Token to Line: Enhancing Code Generation with a Long-Term Perspective

Title: Revisiting LLM Evaluation through Mechanism Interpretability: a New Metric and Model Utility Law

Title: Beyond LLMs: A Linguistic Approach to Causal Graph Generation from Narrative Texts

Title: Defense against Prompt Injection Attacks via Mixture of Encodings

Title: Transformer-Based Temporal Information Extraction and Application: A Review

Title: Supervised Optimism Correction: Be Confident When LLMs Are Sure

Title: AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation

Title: Do LLMs Understand Your Translations? Evaluating Paragraph-level MT with Question Answering

Title: ConceptFormer: Towards Efficient Use of Knowledge-Graph Embeddings in Large Language Models

Title: On the Temporal Question-Answering Capabilities of Large Language Models Over Anonymized Data

Title: Unveiling the Impact of Multimodal Features on Chinese Spelling Correction: From Analysis to Design

Title: Synthetic Fluency: Hallucinations, Confabulations, and the Creation of Irish Words in LLM-Generated Translations

Title: Proactive User Information Acquisition via Chats on User-Favored Topics

Title: MRD-RAG: Enhancing Medical Diagnosis with Multi-Round Retrieval-Augmented Generation

Title: DeepGreen: Effective LLM-Driven Green-washing Monitoring System Designed for Empirical Testing -- Evidence from China

Title: Automated Construction of a Knowledge Graph of Nuclear Fusion Energy for Effective Elicitation and Retrieval of Information

Title: NorEval: A Norwegian Language Understanding and Generation Evaluation Benchmark

Title: Efficient Tuning of Large Language Models for Knowledge-Grounded Dialogue Generation

Title: Plan-and-Refine: Diverse and Comprehensive Retrieval-Augmented Generation

Title: A System for Comprehensive Assessment of RAG Frameworks

Title: Cluster-Driven Expert Pruning for Mixture-of-Experts Large Language Models

Title: What the HellaSwag? On the Validity of Common-Sense Reasoning Benchmarks

Title: MOSAIC: Modeling Social AI for Content Dissemination and Regulation in Multi-Agent Simulations

Title: The KL3M Data Project: Copyright-Clean Training Resources for Large Language Models

Title: Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs

Title: Token Level Routing Inference System for Edge Devices

Title: Benchmarking Adversarial Robustness to Bias Elicitation in Large Language Models: Scalable Automated Assessment with LLM-as-a-Judge

Title: Redefining Machine Translation on Social Network Services with Large Language Models