2024-06-11

Title: LLMs Are Not Intelligent Thinkers: Introducing Mathematical Topic Tree Benchmark for Comprehensive Evaluation of LLMs

Title: On Subjective Uncertainty Quantification and Calibration in Natural Language Generation

Title: Improving Logits-based Detector without Logits from Black-box LLMs

Title: Generative Explore-Exploit: Training-free Optimization of Generative Recommender Systems using LLM Optimizers

Title: SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition of Multi Token Embeddings

Title: Concept Formation and Alignment in Language Models: Bridging Statistical Patterns in Latent Space to Concept Taxonomy

Title: Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from Imperfect Teacher Models in Low-Budget Scenarios

Title: Hidden Question Representations Tell Non-Factuality Within and Across Large Language Models

Title: MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention

Title: Toward Reliable Ad-hoc Scientific Information Extraction: A Case Study on Two Materials Datasets

Title: Flexible and Adaptable Summarization via Expertise Separation

Title: Write Summary Step-by-Step: A Pilot Study of Stepwise Summarization

Title: CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation

Title: Venn Diagram Prompting: Accelerating Comprehension with Scaffolding Effect

Title: VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers

Title: Planning Like Human: A Dual-process Framework for Dialogue Planning

Title: Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas

Title: MaTableGPT: GPT-based Table Data Extractor from Materials Science Literature

Title: Fighting Against the Repetitive Training and Sample Dependency Problem in Few-shot Named Entity Recognition

Title: Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation

Title: ThatiAR: Subjectivity Detection in Arabic News Sentences

Title: Do LLMs Recognize me, When I is not me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts

Title: Creativity Has Left the Chat: The Price of Debiasing Language Models

Title: CERET: Cost-Effective Extrinsic Refinement for Text Generation

Title: GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge?

Title: How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States

Title: DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation

Title: Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses

Title: MS-HuBERT: Mitigating Pre-training and Inference Mismatch in Masked Language Modelling methods for learning Speech Representations

Title: SinkLoRA: Enhanced Efficiency and Chat Capabilities for Long-Context Large Language Models

Title: Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions

Title: MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation

Title: MrRank: Improving Question Answering Retrieval System through Multi-Result Ranking Model

Title: Arabic Diacritics in the Wild: Exploiting Opportunities for Improved Diacritization

Title: The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Title: RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation

Title: Hidden Holes: topological aspects of language models

Title: Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper

Title: Seventeenth-Century Spanish American Notary Records for Fine-Tuning Spanish Large Language Models

Title: MedREQAL: Examining Medical Knowledge Recall of Large Language Models via Question Answering

Title: II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models

Title: Zero-Shot End-To-End Spoken Question Answering In Medical Domain

Title: Are Large Language Models Actually Good at Text Style Transfer?

Title: Feriji: A French-Zarma Parallel Corpus, Glossary & Translator

Title: Why Don't Prompt-Based Fairness Metrics Correlate?

Title: Hello Again! LLM-powered Personalized Agent for Long-term Dialogue

Title: HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs

Title: The Curse of Popularity: Popular Entities have Catastrophic Side Effects when Deleting Knowledge from Language Models

Title: MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models

Title: Synth-SBDH: A Synthetic Dataset of Social and Behavioral Determinants of Health for Clinical Text

Title: Recurrent Context Compression: Efficiently Expanding the Context Window of LLM

Title: Enhancing Long-Term Memory using Hierarchical Aggregate Tree for Retrieval Augmented Generation

Title: Verifiable Generation with Subsentence-Level Fine-Grained Citations

Title: Can I understand what I create? Self-Knowledge Evaluation of Large Language Models

Title: Language Models Resist Alignment

Title: LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages

Title: Multi-Prompting Decoder Helps Better Language Understanding

Title: Tx-LLM: A Large Language Model for Therapeutics

Title: Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching

Title: MedExQA: Medical Question Answering Benchmark with Multiple Explanations

Title: MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows

Title: Symmetric Dot-Product Attention for Efficient Training of BERT Language Models

Title: Annotation alignment: Comparing LLM and human annotations of conversational safety

Title: Should We Fine-Tune or RAG? Evaluating Different Techniques to Adapt LLMs for Dialogue

Title: Controlling Emotion in Text-to-Speech with Natural Language Prompts

Title: Language Models are Alignable Decision-Makers: Dataset and Application to the Medical Triage Domain

Title: Multimodal Contextualized Semantic Parsing from Speech

Title: Interpretability of Language Models via Task Spaces

Title: Evaluating the Retrieval Component in LLM-Based Question Answering Systems

Title: Reasoning in Token Economies: Budget-Aware Evaluation of LLM Reasoning Strategies

Title: Can Language Models Serve as Text-Based World Simulators?