2025-02-12

Title: Survey on Vision-Language-Action Models

Title: Self-Supervised Prompt Optimization

Title: LLM-Supported Natural Language to Bash Translation

Title: Knowledge Graph-Guided Retrieval Augmented Generation

Title: Forbidden Science: Dual-Use AI Challenge Benchmark and Scientific Refusal Tests

Title: Related Knowledge Perturbation Matters: Rethinking Multiple Pieces of Knowledge Editing in Same-Subject

Title: Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey

Title: Multimodal Cognitive Reframing Therapy via Multi-hop Psychotherapeutic Reasoning

Title: Group Reasoning Emission Estimation Networks

Title: Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging

Title: Multi-Agent Simulator Drives Language Models for Legal Intensive Interaction

Title: Investigating the Zone of Proximal Development of Language Models for In-Context Learning

Title: Demystifying Singular Defects in Large Language Models

Title: Finding Words Associated with DIF: Predicting Differential Item Functioning using LLMs and Explainable AI

Title: AIMS.au: A Dataset for the Analysis of Modern Slavery Countermeasures in Corporate Statements

Title: Tokenization Standards for Linguistic Integrity: Turkish as a Benchmark

Title: Using Contextually Aligned Online Reviews to Measure LLMs' Performance Disparities Across Language Varieties

Title: Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations

Title: IRepair: An Intent-Aware Approach to Repair Data-Driven Errors in Large Language Models

Title: Multi-turn Evaluation of Anthropomorphic Behaviours in Large Language Models

Title: SMAB: MAB based word Sensitivity Estimation Framework and its Applications in Adversarial Text Generation

Title: Structural Reformation of Large Language Model Neuron Encapsulation for Divergent Information Aggregation

Title: Cardiverse: Harnessing LLMs for Novel Card Game Prototyping

Title: Language-TPP: Integrating Temporal Point Processes with Language Models for Event Analysis

Title: Ask Patients with Patience: Enabling LLMs for Human-Centric Medical Dialogue with Grounded Reasoning

Title: Does Training on Synthetic Data Make Models Less Robust?

Title: Don't Just Demo, Teach Me the Principles: A Principle-Based Multi-Agent Prompting Strategy for Text Classification

Title: Refine Knowledge of Large Language Models via Adaptive Contrastive Learning

Title: Perceived Confidence Scoring for Data Annotation with Zero-Shot LLMs

Title: A Large-Scale Benchmark for Vietnamese Sentence Paraphrases

Title: Graph RAG-Tool Fusion

Title: GENERator: A Long-Context Generative Genomic Foundation Model

Title: Small Language Model Makes an Effective Long Text Extractor

Title: CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Title: MEMIT-Merge: Addressing MEMIT's Key-Value Conflicts in Same-Subject Batch Editing for LLMs

Title: Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering

Title: BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

Title: Bridging the Evaluation Gap: Leveraging Large Language Models for Topic Model Evaluation

Title: LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation

Title: Target-Augmented Shared Fusion-based Multimodal Sarcasm Explanation Generation

Title: Entity Linking using LLMs for Automated Product Carbon Footprint Estimation

Title: RomanLens: Latent Romanization and its role in Multilinguality in LLMs

Title: Forget What You Know about LLMs Evaluations - LLMs are Like a Chameleon

Title: PerCul: A Story-Driven Cultural Evaluation of LLMs in Persian

Title: Multi-Agent Collaboration for Multilingual Code Instruction Tuning

Title: Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More

Title: Grammar Control in Dialogue Response Generation for Language Learning Chatbots

Title: Unsupervised Translation of Emergent Communication

Title: O1 Embedder: Let Retrievers Think Before Action

Title: We Can't Understand AI Using our Existing Vocabulary

Title: DPO-Shift: Shifting the Distribution of Direct Preference Optimization

Title: Tractable Transformers for Flexible Conditional Generation

Title: FoQA: A Faroese Question-Answering Dataset

Title: Auto-Drafting Police Reports from Noisy ASR Outputs: A Trust-Centered LLM Approach

Title: Large Language Models as Proxies for Theories of Human Linguistic Cognition

Title: Making Language Models Robust Against Negation

Title: WHODUNIT: Evaluation benchmark for culprit detection in mystery stories

Title: Breaking Down Bias: On The Limits of Generalizable Pruning Strategies

Title: Auditing Prompt Caching in Language Model APIs