2024-02-26

Title: Orca-Math: Unlocking the potential of SLMs in Grade School Math

Title: CliqueParcel: An Approach For Batching LLM Prompts That Jointly Optimizes Efficiency And Faithfulness

Title: MIKE: A New Benchmark for Fine-grained Multimodal Entity Knowledge Editing

Title: Stealthy Attack on Large Language Model based Recommendation

Title: An Empirical Categorization of Prompting Techniques for Large Language Models: A Practitioner's Guide

Title: RFBES at SemEval-2024 Task 8: Investigating Syntactic and Semantic Features for Distinguishing AI-Generated and Human-Written Texts

Title: RJUA-MedDQA: A Multimodal Benchmark for Medical Document Question Answering and Clinical Reasoning

Title: Purifying Large Language Models by Ensembling a Small Language Model

Title: Stick to your Role! Stability of Personal Values Expressed in Large Language Models

Title: Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models

Title: CHATATC: Large Language Model-Driven Conversational Agents for Supporting Strategic Air Traffic Flow Management

Title: SQL-CRAFT: Text-to-SQL through Interactive Refinement and Enhanced Reasoning

Title: HumanEval on Latest GPT Models -- 2024

Title: NL2Formula: Generating Spreadsheet Formulas from Natural Language Queries

Title: A Dual-Prompting for Interpretable Mental Health Language Models

Title: An LLM Maturity Model for Reliable and Transparent Text-to-Query

Title: Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning

Title: Is the System Message Really Important to Jailbreaks in Large Language Models?

Title: ChatEL: Entity Linking with Chatbots

Title: Ranking Large Language Models without Ground Truth

Title: Evaluation of a semi-autonomous attentive listening system with takeover prompting

Title: DyVal 2: Dynamic Evaluation of Large Language Models by Meta Probing Agents

Title: APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models

Title: LLM Based Multi-Agent Generation of Semi-structured Documents from Semantic Templates in the Public Administration Domain

Title: Semantic Mirror Jailbreak: Genetic Algorithm Based Jailbreak Prompts Against Open-source LLMs

Title: Technical Report on the Checkfor.ai AI-Generated Text Classifier

Title: Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation

Title: What's in a Name? Auditing Large Language Models for Race and Gender Bias

Title: Driving Generative Agents With Their Personality

Title: Automatic Histograms: Leveraging Language Models for Text Dataset Exploration

Title: A Study on the Vulnerability of Test Questions against ChatGPT-based Cheating

Title: COBIAS: Contextual Reliability in Bias Assessment

Title: LLMBind: A Unified Modality-Task Integration Framework

Title: Data Augmentation is Dead, Long Live Data Augmentation

Title: Chain-of-Thought Unfaithfulness as Disguised Accuracy

Title: Tokenization counts: the impact of tokenization on arithmetic in frontier LLMs

Title: MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Title: Mirror: A Multiple-perspective Self-Reflection Method for Knowledge-rich Reasoning

Title: MultiLS: A Multi-task Lexical Simplification Framework

Title: GenCeption: Evaluate Multimodal LLMs with Unlabeled Unimodal Data

Title: Optimizing Language Models for Human Preferences is a Causal Inference Problem

Title: tinyBenchmarks: evaluating LLMs with fewer examples

Title: Divide-or-Conquer? Which Part Should You Distill Your LLM?

Title: How Important Is Tokenization in French Medical Masked Language Models?

Title: Unintended Impacts of LLM Alignment on Global Representation

Title: Probabilistically-sound beam search with masked language models

Title: KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models

Title: CARBD-Ko: A Contextually Annotated Review Benchmark Dataset for Aspect-Level Sentiment Classification in Korean

Title: Unlocking the Power of Large Language Models for Entity Alignment

Title: ToMBench: Benchmarking Theory of Mind in Large Language Models

Title: Interpreting Context Look-ups in Transformers: Investigating Attention-MLP Interactions

Title: On the Multi-turn Instruction Following for Conversational Web Agents

Title: ColBERT-XM: A Modular Multi-Vector Representation Model for Zero-Shot Multilingual Information Retrieval

Title: Fine-tuning Large Language Models for Domain-specific Machine Translation

Title: Gotcha! Don't trick me with unanswerable questions! Self-aligning Large Language Models for Responding to Unknown Questions

Title: Infusing Hierarchical Guidance into Prompt Tuning: A Parameter-Efficient Framework for Multi-level Implicit Discourse Relation Recognition

Title: PEMT: Multi-Task Correlation Guided Mixture-of-Experts Enables Parameter-Efficient Transfer Learning

Title: AttributionBench: How Hard is Automatic Attribution Evaluation?

Title: Trajectory-wise Iterative Reinforcement Learning Framework for Auto-bidding

Title: Multi-Armed Bandits with Abstention

Title: Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with Large Language Models

Title: Improving Sentence Embeddings with an Automatically Generated NLI Dataset

Title: Self-Adaptive Reconstruction with Contrastive Learning for Unsupervised Sentence Embeddings

Title: Machine Unlearning of Pre-trained Large Language Models

Title: Spatially-Aware Transformer Memory for Embodied Agents

Title: Entity-level Factual Adaptiveness of Fine-tuning based Abstractive Summarization Models

Title: Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer

Title: Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition

Title: Advancing Parameter Efficiency in Fine-tuning via Representation Editing

Title: Break the Breakout: Reinventing LM Defense Against Jailbreak Attacks with Self-Refinement

Title: GraphEdit: Large Language Models for Graph Structure Learning

Title: Biomedical Entity Linking as Multiple Choice Question Answering

Title: DeMPT: Decoding-enhanced Multi-phase Prompt Tuning for Making LLMs Be Better Context-aware Translators

Title: Fine-Grained Detoxification via Instance-Level Prefixes for Large Language Models

Title: ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition

Title: GPT-HateCheck: Can LLMs Write Better Functional Tests for Hate Speech Detection?

Title: Chitchat as Interference: Adding User Backstories to Task-Oriented Dialogues

Title: DEEM: Dynamic Experienced Expert Modeling for Stance Detection

Title: MemoryPrompt: A Light Wrapper to Improve Context Tracking in Pre-trained Language Models

Title: When in Doubt, Think Slow: Iterative Reasoning with Latent Imagination

Title: Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models

Title: How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries

Title: ArabianGPT: Native Arabic GPT-based Large Language Model

Title: GPTVQ: The Blessing of Dimensionality for LLM Quantization

Title: Ranking Entities along Conceptual Space Dimensions with LLMs: An Analysis of Fine-Tuning Strategies

Title: NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data

Title: AutoMMLab: Automatically Generating Deployable Models from Language Instructions for Computer Vision Tasks

Title: Explorations of Self-Repair in Language Models

Title: Genie: Generative Interactive Environments

Title: Offline Inverse RL: New Solution Concepts and Provably Efficient Algorithms

Title: Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy?

Title: A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models

Title: Repetition Improves Language Model Embeddings

Title: Leveraging Domain Knowledge for Efficient Reward Modelling in RLHF: A Case-Study in E-Commerce Opinion Summarization

Title: Prejudice and Caprice: A Statistical Framework for Measuring Social Discrimination in Large Language Models

Title: API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs

Title: AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning