2024-06-10

Title: Large Language Model Confidence Estimation via Black-Box Access

Title: Phased Instruction Fine-Tuning for Large Language Models

Title: Exploring the Latest LLMs for Leaderboard Extraction

Title: MoralBench: Moral Evaluation of LLMs

Title: MAIRA-2: Grounded Radiology Report Generation

Title: Evaluating the Smooth Control of Attribute Intensity in Text Generation with LLMs

Title: PromptFix: Few-shot Backdoor Removal via Adversarial Prompt Tuning

Title: Automatic Bug Detection in LLM-Powered Text-Based Games Using LLMs

Title: Time Sensitive Knowledge Editing through Efficient Finetuning

Title: NATURAL PLAN: Benchmarking LLMs on Natural Language Planning

Title: Proofread: Fixes All Errors with One Tap

Title: llmNER: (Zero|Few)-Shot Named Entity Recognition, Exploiting the Power of Large Language Models

Title: Creating an AI Observer: Generative Semantic Workspaces

Title: SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models

Title: Extroversion or Introversion? Controlling The Personality of Your Large Language Models

Title: Learning Task Decomposition to Assist Humans in Competitive Programming

Title: LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model

Title: Key-Element-Informed sLLM Tuning for Document Summarization

Title: Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models

Title: Large Language Model-guided Document Selection

Title: DiNeR: a Large Realistic Dataset for Evaluating Compositional Generalization

Title: MATTER: Memory-Augmented Transformer Using Heterogeneous Knowledge Sources

Title: Mixture-of-Agents Enhances Large Language Model Capabilities

Title: AICoderEval: Improving AI Domain Code Generation of Large Language Models

Title: CRAG -- Comprehensive RAG Benchmark

Title: CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models

Title: Think out Loud: Emotion Deducing Explanation in Dialogues

Title: WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

Title: SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals

Title: BERTs are Generative In-Context Learners

Title: Annotating FrameNet via Structure-Conditioned Language Generation

Title: Revisiting Catastrophic Forgetting in Large Language Model Tuning

Title: FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models

Title: Do Language Models Exhibit Human-like Structural Priming Effects?

Title: Uncertainty Aware Learning for Language Model Alignment

Title: ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering

Title: A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques

Title: Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models

Title: TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models

Title: BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense

Title: Quantifying Geospatial in the Common Crawl Corpus

Title: MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter

Title: Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences

Title: Compositional Generalization with Grounded Language Models

Title: Scenarios and Approaches for Situated Natural Language Explanations

Title: Are Large Language Models More Empathetic than Humans?

Title: SUMIE: A Synthetic Benchmark for Incremental Entity Summarization

Title: Multi-Head RAG: Solving Multi-Aspect Problems with LLMs

Title: An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models