2024-09-25

Title: Evaluating Large Language Models with Tests of Spanish as a Foreign Language: Pass or Fail?

Title: Watch Your Steps: Observable and Modular Chains of Thought

Title: Multitask Mayhem: Unveiling and Mitigating Safety Gaps in LLMs Fine-tuning

Title: VERA: Validation and Enhancement for Retrieval Augmented systems

Title: Bone: Block Affine Transformation as Parameter Efficient Fine-tuning Methods for Large Language Models

Title: Prompting Large Language Models for Supporting the Differential Diagnosis of Anemia

Title: Kalahi: A handcrafted, grassroots cultural LLM evaluation suite for Filipino

Title: Adversarial Attacks on Parts of Speech: An Empirical Study in Text-to-Image Generation

Title: Parse Trees Guided LLM Prompt Compression

Title: CUTE: Measuring LLMs' Understanding of Their Tokens

Title: In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models

Title: Learning When to Retrieve, What to Rewrite, and How to Respond in Conversational QA

Title: GEM-RAG: Graphical Eigen Memories For Retrieval Augmented Generation

Title: Beyond Turn-Based Interfaces: Synchronous LLMs as Full-Duplex Dialogue Agents

Title: A Survey of Stance Detection on Social Media: New Directions and Perspectives

Title: Lighter And Better: Towards Flexible Context Adaptation For Retrieval Augmented Generation

Title: XTRUST: On the Multilingual Trustworthiness of Large Language Models

Title: CHBench: A Chinese Dataset for Evaluating Health in Large Language Models

Title: Small Language Models: Survey, Measurements, and Insights

Title: NER-Luxury: Named entity recognition for the fashion and luxury domain

Title: Empirical Insights on Fine-Tuning Large Language Models for Question-Answering

Title: Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability

Title: A Zero-Shot Open-Vocabulary Pipeline for Dialogue Understanding

Title: Privacy Evaluation Benchmarks for NLP Models

Title: HLB: Benchmarking LLMs' Humanlikeness in Language Use

Title: Konstruktor: A Strong Baseline for Simple Knowledge Graph Question Answering

Title: Enhancing Text-to-SQL Capabilities of Large Language Models via Domain Database Knowledge Injection

Title: SLIMER-IT: Zero-Shot NER on Italian Language

Title: Automated test generation to evaluate tool-augmented LLMs as conversational AI agents

Title: Finetuning LLMs for Comparative Assessment Tasks

Title: Bridging Speech and Text: Enhancing ASR with Pinyin-to-Character Pre-training in LLMs

Title: AI Can Be Cognitively Biased: An Exploratory Study on Threshold Priming in LLM-Based Batch Relevance Assessment

Title: Unlocking Markets: A Multilingual Benchmark to Cross-Market Question Answering

Title: Exploring Hint Generation Approaches in Open-Domain Question Answering

Title: Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework

Title: HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Title: EuroLLM: Multilingual Language Models for Europe