2025-04-22

Title: MEQA: A Meta-Evaluation Framework for Question & Answer LLM Benchmarks

Title: A Baseline for Self-state Identification and Classification in Mental Health Data: CLPsych 2025 Task

Title: LogicTree: Structured Proof Exploration for Coherent and Rigorous Logical Reasoning with Large Language Models

Title: PEFT A2Z: Parameter-Efficient Fine-Tuning Survey for Large Language and Vision Models

Title: Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations

Title: SConU: Selective Conformal Uncertainty in Large Language Models

Title: Self-Correction Makes LLMs Better Parsers

Title: Hypothetical Documents or Knowledge Leakage? Rethinking LLM-based Query Expansion

Title: Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models

Title: Bias Analysis and Mitigation through Protected Attribute Detection and Regard Classification

Title: Understanding the Repeat Curse in Large Language Models from a Feature Perspective

Title: SimplifyMyText: An LLM-Based System for Inclusive Plain Language Text Simplification

Title: Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale

Title: Probing the Subtle Ideological Manipulation of Large Language Models

Title: Empirical Evaluation of Knowledge Distillation from Transformers to Subquadratic Language Models

Title: Diverse Prompts: Illuminating the Prompt Space of Large Language Models with MAP-Elites

Title: ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data

Title: CoLoTa: A Dataset for Entity-based Commonsense Reasoning over Long-Tail Knowledge

Title: DialogueAgents: A Hybrid Agent-Based Speech Synthesis Framework for Multi-Party Dialogue

Title: FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering

Title: Functional Abstraction of Knowledge Recall in Large Language Models

Title: Causality for Natural Language Processing

Title: BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation

Title: a1: Steep Test-time Scaling Law via Environment Augmented Generation

Title: Translation Analytics for Freelancers: I. Introduction, Data Preparation, Baseline Evaluations

Title: A Hierarchical Framework for Measuring Scientific Paper Innovation via Large Language Models

Title: Automatic Text Summarization (ATS) for Research Documents in Sorani Kurdish

Title: Harnessing Generative LLMs for Enhanced Financial Event Entity Extraction Performance

Title: A Case Study Exploring the Current Landscape of Synthetic Medical Record Generation with Commercial LLMs

Title: Trans-Zero: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data

Title: FarsEval-PKBETS: A new diverse benchmark for evaluating Persian large language models

Title: OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding

Title: PROMPTEVALS: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines

Title: Disentangling Linguistic Features with Dimension-Wise Analysis of Vector Embeddings

Title: Knowledge Distillation and Dataset Distillation of Large Language Models: Emerging Trends, Challenges, and Future Directions

Title: Automatic Evaluation Metrics for Document-level Translation: Overview, Challenges and Trends

Title: On Self-improving Token Embeddings

Title: Transparentize the Internal and External Knowledge Utilization in LLMs with Trustworthy Citation

Title: Natural Fingerprints of Large Language Models

Title: Retrieval Augmented Generation Evaluation in the Era of Large Language Models: A Comprehensive Survey

Title: CRAVE: A Conflicting Reasoning Approach for Explainable Claim Verification Using LLMs

Title: Evaluating LLMs on Chinese Topic Constructions: A Research Proposal Inspired by Tian et al. (2024)

Title: Efficient Pretraining Length Scaling

Title: Stay Hungry, Stay Foolish: On the Extended Reading Articles Generation with LLMs

Title: LLMs as Data Annotators: How Close Are We to Human Performance

Title: DistilQwen2.5: Industrial Practices of Training Distilled Open Lightweight Language Models

Title: RainbowPlus: Enhancing Adversarial Prompt Generation via Evolutionary Quality-Diversity Search

Title: Testing LLMs' Capabilities in Annotating Translations Based on an Error Typology Designed for LSP Translation: First Experiments with ChatGPT

Title: Rethinking the Potential of Multimodality in Collaborative Problem Solving Diagnosis with Large Language Models

Title: Kuwain 1.5B: An Arabic SLM via Language Injection

Title: EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models

Title: The Synthetic Imputation Approach: Generating Optimal Synthetic Texts For Underrepresented Categories In Supervised Classification Tasks

Title: Support Evaluation for the TREC 2024 RAG Track: Comparing Human versus LLM Judges

Title: EvalAgent: Discovering Implicit Evaluation Criteria from the Web

Title: Values in the Wild: Discovering and Analyzing Values in Real-World Language Model Interactions

Title: MR. Guard: Multilingual Reasoning Guardrail using Curriculum Learning

Title: Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators