2025-05-06

Title: Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation

Title: MoxE: Mixture of xLSTM Experts with Entropy-Aware Routing for Efficient Language Modeling

Title: SymPlanner: Deliberate Planning in Language Models with Symbolic Representation

Title: On the effectiveness of Large Language Models in the mechanical design domain

Title: AI agents may be worth the hype but not the resources (yet): An initial exploration of machine translation quality and costs in three language pairs in the legal and news domains

Title: PIPA: A Unified Evaluation Protocol for Diagnosing Interactive Planning Agents

Title: Always Tell Me The Odds: Fine-grained Conditional Probability Estimation

Title: A Survey on Inference Engines for Large Language Models: Perspectives on Optimization and Efficiency

Title: High-Fidelity Pseudo-label Generation by Large Language Models for Training Robust Radiology Report Classifiers

Title: Efficient Shapley Value-based Non-Uniform Pruning of Large Language Models

Title: Same evaluation, more tokens: On the effect of input length for machine translation evaluation using Large Language Models

Title: New News: System-2 Fine-tuning for Robust Integration of New Knowledge

Title: Intra-Layer Recurrence in Transformers for Language Modeling

Title: Humans can learn to detect AI-generated texts, or at least learn when they can't

Title: CAMOUFLAGE: Exploiting Misinformation Detection Systems Through LLM-driven Adversarial Claim Transformation

Title: Analyzing Cognitive Differences Among Large Language Models through the Lens of Social Worldview

Title: LLM-based Text Simplification and its Effect on User Comprehension and Cognitive Load

Title: Towards Safer Pretraining: Analyzing and Filtering Harmful Content in Webscale datasets for Responsible LLMs

Title: An overview of artificial intelligence in computer-assisted language learning

Title: What do Language Model Probabilities Represent? From Distribution Estimation to Response Prediction

Title: LecEval: An Automated Metric for Multimodal Knowledge Acquisition in Multimedia Learning

Title: LLM-OptiRA: LLM-Driven Optimization of Resource Allocation for Non-Convex Problems in Wireless Communications

Title: Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study

Title: QiMeng-Xpiler: Transcompiling Tensor Programs for Deep Learning Systems with a Neural-Symbolic Approach

Title: Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents

Title: Incorporating Legal Structure in Retrieval-Augmented Generation: A Case Study on Copyright Fair Use

Title: A New HOPE: Domain-agnostic Automatic Evaluation of Text Chunking

Title: Identifying Legal Holdings with LLMs: A Systematic Study of Performance, Scale, and Memorization

Title: Measuring Hong Kong Massive Multi-Task Language Understanding

Title: SEval-Ex: A Statement-Level Framework for Explainable Summarization Evaluation

Title: Personalisation or Prejudice? Addressing Geographic Bias in Hate Speech Detection using Debias Tuning in Large Language Models

Title: Parameter-Efficient Transformer Embeddings

Title: Demystifying optimized prompts in language models

Title: Generative Sign-description Prompts with Multi-positive Contrastive Learning for Sign Language Recognition

Title: Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering

Title: SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning

Title: RM-R1: Reward Modeling as Reasoning

Title: Bielik 11B v2 Technical Report

Title: Colombian Waitresses y Jueces canadienses: Gender and Country Biases in Occupation Recommendations from LLMs

Title: EMORL: Ensemble Multi-Objective Reinforcement Learning for Efficient and Flexible LLM Fine-Tuning

Title: Automatic Proficiency Assessment in L2 English Learners

Title: LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis

Title: Proper Name Diacritization for Arabic Wikipedia: A Benchmark Dataset

Title: A Survey on Progress in LLM Alignment from the Perspective of Reward Design

Title: Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models

Title: Bye-bye, Bluebook? Automating Legal Procedure with Large Language Models

Title: ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations