language model

Title: I Know You Did Not Write That! A Sampling Based Watermarking Method for Identifying Machine Generated Text. (arXiv:2311.18054v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18054
Code URL: null
Copy Paste: [[2311.18054]] I Know You Did Not Write That! A Sampling Based Watermarking Method for Identifying Machine Generated Text(http://arxiv.org/abs/2311.18054)
Summary:
Potential harms of Large Language Models such as mass misinformation and plagiarism can be partially mitigated if there exists a reliable way to detect machine-generated text. In this paper, we propose a new watermarking method to detect machine-generated texts. Our method embeds a unique pattern within the generated text, ensuring that while the content remains coherent and natural to human readers, it carries distinct markers that can be identified algorithmically. Specifically, we intervene in the token sampling process in a way that enables us to trace back our token choices during the detection phase. We show how watermarking affects textual quality and compare our proposed method with a state-of-the-art watermarking method in terms of robustness and detectability. Through extensive experiments, we demonstrate the effectiveness of our watermarking scheme in distinguishing between watermarked and non-watermarked text, achieving high detection rates while maintaining textual quality.

Title: Understanding Your Agent: Leveraging Large Language Models for Behavior Explanation. (arXiv:2311.18062v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2311.18062
Code URL: null
Copy Paste: [[2311.18062]] Understanding Your Agent: Leveraging Large Language Models for Behavior Explanation(http://arxiv.org/abs/2311.18062)
Summary:
Intelligent agents such as robots are increasingly deployed in real-world, safety-critical settings. It is vital that these agents are able to explain the reasoning behind their decisions to human counterparts; however, their behavior is often produced by uninterpretable models such as deep neural networks. We propose an approach to generate natural language explanations for an agent's behavior based only on observations of states and actions, thus making our method independent from the underlying model's representation. For such models, we first learn a behavior representation and subsequently use it to produce plausible explanations with minimal hallucination while affording user interaction with a pre-trained large language model. We evaluate our method in a multi-agent search-and-rescue environment and demonstrate the effectiveness of our explanations for agents executing various behaviors. Through user studies and empirical experiments, we show that our approach generates explanations as helpful as those produced by a human domain expert while enabling beneficial interactions such as clarification and counterfactual queries.

Title: LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models. (arXiv:2311.18232v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18232
Code URL: null
Copy Paste: [[2311.18232]] LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models(http://arxiv.org/abs/2311.18232)
Summary:
Large language models (LLMs) provide excellent text-generation capabilities, but standard prompting and generation methods generally do not lead to intentional or goal-directed agents and might necessitate considerable prompt tuning. This becomes particularly apparent in multi-turn conversations: even the best current LLMs rarely ask clarifying questions, engage in explicit information gathering, or take actions now that lead to better decisions after multiple turns. Reinforcement learning has the potential to leverage the powerful modeling capabilities of LLMs, as well as their internal representation of textual interactions, to create capable goal-directed language agents. This can enable intentional and temporally extended interactions, such as with humans, through coordinated persuasion and carefully crafted questions, or in goal-directed play through text games to bring about desired final outcomes. However, enabling this requires the community to develop stable and reliable reinforcement learning algorithms that can effectively train LLMs. Developing such algorithms requires tasks that can gauge progress on algorithm design, provide accessible and reproducible evaluations for multi-turn interactions, and cover a range of task properties and challenges in improving reinforcement learning algorithms. Our paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for LLMs, together with an open-source research framework containing a basic toolkit for getting started on multi-turn RL with offline value-based and policy-based RL methods. Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.

Title: ESG Accountability Made Easy: DocQA at Your Service. (arXiv:2311.18481v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18481
Code URL: null
Copy Paste: [[2311.18481]] ESG Accountability Made Easy: DocQA at Your Service(http://arxiv.org/abs/2311.18481)
Summary:
We present Deep Search DocQA. This application enables information extraction from documents via a question-answering conversational assistant. The system integrates several technologies from different AI disciplines consisting of document conversion to machine-readable format (via computer vision), finding relevant data (via natural language processing), and formulating an eloquent response (via large language models). Users can explore over 10,000 Environmental, Social, and Governance (ESG) disclosure reports from over 2000 corporations. The Deep Search platform can be accessed at: https://ds4sd.github.io.

Title: CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation. (arXiv:2311.18702v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18702
Code URL: https://github.com/thu-coai/critiquellm
Copy Paste: [[2311.18702]] CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation(http://arxiv.org/abs/2311.18702)
Summary:
Since the natural language processing (NLP) community began using large language models (LLMs), such as GPT-4, as critics to evaluate the quality of generated texts, most existing works have only trained a critique generation model of a specific scale on specific datasets. We argue that a comprehensive investigation of the key factors of LLM-based evaluation models, such as scaling properties, is lacking, so it remains inconclusive whether these models have the potential to replace GPT-4's evaluation in practical scenarios. In this paper, we propose a new critique generation model called CritiqueLLM, which includes a dialogue-based prompting method for high-quality referenced / reference-free evaluation data. Experimental results show that our model can achieve comparable evaluation performance to GPT-4, especially in system-level correlations, and even outperform GPT-4 in 3 out of 8 tasks in a challenging reference-free setting. We conduct detailed analysis to show promising scaling properties of our model in the quality of generated critiques. We also demonstrate that our generated critiques can act as scalable feedback to directly improve the generation quality of LLMs.

Title: Zero-shot Conversational Summarization Evaluations with small Large Language Models. (arXiv:2311.18041v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18041
Code URL: null
Copy Paste: [[2311.18041]] Zero-shot Conversational Summarization Evaluations with small Large Language Models(http://arxiv.org/abs/2311.18041)
Summary:
Large Language Models (LLMs) exhibit powerful summarization abilities. However, their capabilities on conversational summarization remain underexplored. In this work we evaluate LLMs (approx. 10 billion parameters) on conversational summarization and showcase their performance on various prompts. We show that the summaries generated by models depend on the instructions and that the performance of LLMs varies with different instructions, sometimes resulting in a steep drop in ROUGE scores if prompts are not selected carefully. We also evaluate the models with human evaluations and discuss the limitations of the models on conversational summarization.

Title: TurkishBERTweet: Fast and Reliable Large Language Model for Social Media Analysis. (arXiv:2311.18063v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18063
Code URL: null
Copy Paste: [[2311.18063]] TurkishBERTweet: Fast and Reliable Large Language Model for Social Media Analysis(http://arxiv.org/abs/2311.18063)
Summary:
Turkish is one of the most popular languages in the world. The wide use of this language on social media platforms such as Twitter, Instagram, or TikTok and the strategic position of the country in world politics make it appealing to social network researchers and industry. To address this need, we introduce TurkishBERTweet, the first large-scale pre-trained language model for Turkish social media, built using almost 900 million tweets. The model shares the same architecture as the base BERT model with a smaller input length, making TurkishBERTweet lighter than BERTurk and allowing significantly lower inference time. We trained our model using the same approach as the RoBERTa model and evaluated it on two text classification tasks: Sentiment Classification and Hate Speech Detection. We demonstrate that TurkishBERTweet outperforms the other available alternatives on generalizability, and its lower inference time gives a significant advantage when processing large-scale datasets. We also compared our models with the commercial OpenAI solutions in terms of cost and performance to demonstrate that TurkishBERTweet is a scalable and cost-effective solution. As part of our research, we released TurkishBERTweet and fine-tuned LoRA adapters for the mentioned tasks under the MIT License to facilitate future research and applications on Turkish social media. Our TurkishBERTweet model is available at: https://github.com/ViralLab/TurkishBERTweet

Title: ROBBIE: Robust Bias Evaluation of Large Generative Language Models. (arXiv:2311.18140v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18140
Code URL: null
Copy Paste: [[2311.18140]] ROBBIE: Robust Bias Evaluation of Large Generative Language Models(http://arxiv.org/abs/2311.18140)
Summary:
As generative large language models (LLMs) grow more performant and prevalent, we must develop comprehensive enough tools to measure and improve their fairness. Different prompt-based datasets can be used to measure social bias across multiple text domains and demographic axes, meaning that testing LLMs on more datasets can potentially help us characterize their biases more fully, and better ensure equal and equitable treatment of marginalized demographic groups. In this work, our focus is two-fold:

(1) Benchmarking: a comparison of 6 different prompt-based bias and toxicity metrics across 12 demographic axes and 5 families of generative LLMs. Out of those 6 metrics, AdvPromptSet and HolisticBiasR are novel datasets proposed in the paper. The comparison of those benchmarks gives us insights about the bias and toxicity of the compared models. Therefore, we explore the frequency of demographic terms in common LLM pre-training corpora and how this may relate to model biases.

(2) Mitigation: we conduct a comprehensive study of how well 3 bias/toxicity mitigation techniques perform across our suite of measurements. ROBBIE aims to provide insights for practitioners while deploying a model, emphasizing the need to not only measure potential harms, but also understand how they arise by characterizing the data, mitigate harms once found, and balance any trade-offs. We open-source our analysis code in hopes of encouraging broader measurements of bias in future LLMs.

Title: DisCGen: A Framework for Discourse-Informed Counterspeech Generation. (arXiv:2311.18147v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18147
Code URL: https://github.com/sabithsn/discgen
Copy Paste: [[2311.18147]] DisCGen: A Framework for Discourse-Informed Counterspeech Generation(http://arxiv.org/abs/2311.18147)
Summary:
Counterspeech can be an effective method for battling hateful content on social media. Automated counterspeech generation can aid in this process. Generated counterspeech, however, can be viable only when grounded in the context of topic, audience and sensitivity as these factors influence both the efficacy and appropriateness. In this work, we propose a novel framework based on theories of discourse to study the inferential links that connect counter speeches to the hateful comment. Within this framework, we propose: i) a taxonomy of counterspeech derived from discourse frameworks, and ii) discourse-informed prompting strategies for generating contextually-grounded counterspeech. To construct and validate this framework, we present a process for collecting an in-the-wild dataset of counterspeech from Reddit. Using this process, we manually annotate a dataset of 3.9k Reddit comment pairs for the presence of hatespeech and counterspeech. The positive pairs are annotated for 10 classes in our proposed taxonomy. We annotate these pairs with paraphrased counterparts to remove offensiveness and first-person references. We show that by using our dataset and framework, large language models can generate contextually-grounded counterspeech informed by theories of discourse. According to our human evaluation, our approaches can act as a safeguard against critical failures of discourse-agnostic models.

Title: COVID-19 Vaccine Misinformation in Middle Income Countries. (arXiv:2311.18195v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18195
Code URL: https://github.com/zzoliman/covid-vaccine-misinfo-mic
Copy Paste: [[2311.18195]] COVID-19 Vaccine Misinformation in Middle Income Countries(http://arxiv.org/abs/2311.18195)
Summary:
This paper introduces a multilingual dataset of COVID-19 vaccine misinformation, consisting of annotated tweets from three middle-income countries: Brazil, Indonesia, and Nigeria. The expertly curated dataset includes annotations for 5,952 tweets, assessing their relevance to COVID-19 vaccines, presence of misinformation, and the themes of the misinformation. To address challenges posed by domain specificity, the low-resource setting, and data imbalance, we adopt two approaches for developing COVID-19 vaccine misinformation detection models: domain-specific pre-training and text augmentation using a large language model. Our best misinformation detection models demonstrate improvements ranging from 2.7 to 15.9 percentage points in macro F1-score compared to the baseline models. Additionally, we apply our misinformation detection models in a large-scale study of 19 million unlabeled tweets from the three countries between 2020 and 2022, showcasing the practical application of our dataset and models for detecting and analyzing vaccine misinformation in multiple countries and languages. Our analysis indicates that percentage changes in the number of new COVID-19 cases are positively associated with COVID-19 vaccine misinformation rates in a staggered manner for Brazil and Indonesia, and there are significant positive associations between the misinformation rates across the three countries.

Title: Automatic Construction of a Korean Toxic Instruction Dataset for Ethical Tuning of Large Language Models. (arXiv:2311.18215v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18215
Code URL: null
Copy Paste: [[2311.18215]] Automatic Construction of a Korean Toxic Instruction Dataset for Ethical Tuning of Large Language Models(http://arxiv.org/abs/2311.18215)
Summary:
Caution: this paper may include material that could be offensive or distressing.

The advent of Large Language Models (LLMs) necessitates the development of training approaches that mitigate the generation of unethical language and aptly manage toxic user queries. Given the challenges related to human labor and the scarcity of data, we present KoTox, comprising 39K unethical instruction-output pairs. This collection of automatically generated toxic instructions refines the training of LLMs and establishes a foundational framework for improving LLMs' ethical awareness and response to various toxic inputs, promoting more secure and responsible interactions in Natural Language Processing (NLP) applications.

Title: Evaluating the Rationale Understanding of Critical Reasoning in Logical Reading Comprehension. (arXiv:2311.18353v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18353
Code URL: null
Copy Paste: [[2311.18353]] Evaluating the Rationale Understanding of Critical Reasoning in Logical Reading Comprehension(http://arxiv.org/abs/2311.18353)
Summary:
To precisely evaluate a language model's capability for logical reading comprehension, we present a dataset for testing the understanding of the rationale behind critical reasoning. For questions taken from an existing multiple-choice logical reading comprehension dataset, we crowdsource rationale texts that explain why we should select or eliminate answer options, resulting in 3,003 multiple-choice subquestions that are associated with 943 main questions. Experiments on our dataset show that recent large language models (e.g., InstructGPT) struggle to answer the subquestions even if they are able to answer the main questions correctly. We find that the models perform particularly poorly in answering subquestions written for the incorrect options of the main questions, implying that the models have a limited capability for explaining why incorrect alternatives should be eliminated. These results suggest that our dataset encourages further investigation into the critical reasoning ability of language models while focusing on the elimination process of relevant alternatives.

Title: IAG: Induction-Augmented Generation Framework for Answering Reasoning Questions. (arXiv:2311.18397v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18397
Code URL: null
Copy Paste: [[2311.18397]] IAG: Induction-Augmented Generation Framework for Answering Reasoning Questions(http://arxiv.org/abs/2311.18397)
Summary:
Retrieval-Augmented Generation (RAG), by incorporating external knowledge with the parametric memory of language models, has become the state-of-the-art architecture for open-domain QA tasks. However, common knowledge bases are inherently constrained by limited coverage and noisy information, making retrieval-based approaches inadequate to answer implicit reasoning questions. In this paper, we propose an Induction-Augmented Generation (IAG) framework that utilizes inductive knowledge along with the retrieved documents for implicit reasoning. We leverage large language models (LLMs) for deriving such knowledge via a novel prompting method based on inductive reasoning patterns. On top of this, we implement two versions of IAG named IAG-GPT and IAG-Student, respectively. IAG-GPT directly utilizes the knowledge generated by GPT-3 for answer prediction, while IAG-Student removes the dependency on the GPT service at inference time by incorporating a student inductor model. The inductor is first trained via knowledge distillation and further optimized by back-propagating the generator feedback via differentiable beam scores. Experimental results show that IAG outperforms RAG baselines as well as ChatGPT on two Open-Domain QA tasks. Notably, our best models have won first place in the official leaderboards of CSQA2.0 (since Nov 1, 2022) and StrategyQA (since Jan 8, 2023).

Title: ArthModel: Enhance Arithmetic Skills to Large Language Model. (arXiv:2311.18609v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18609
Code URL: null
Copy Paste: [[2311.18609]] ArthModel: Enhance Arithmetic Skills to Large Language Model(http://arxiv.org/abs/2311.18609)
Summary:
With the great success of ChatGPT, research on large language models has become increasingly popular. However, the models have several limitations, such as toxicity and poor performance on arithmetic problems. Meanwhile, LLMs may have some potential abilities that have yet to be exploited. In this paper, we choose a different way to enhance the arithmetic ability of LLMs. We propose to train an LLM to generate a postfix expression related to the arithmetic problem and to combine it with small pretrained models. This small model converts the token embeddings into real dense numbers and invokes native functions of a deep learning platform to get the correct answer. To generate the final result, we propose prompt injection for adding the result output by the small model to the LLM. This work provides different ways of thinking about, training, and using a language model. The code and models will be released at https://github.com/eteced/arithmetic_finetuning_v1.
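
As a rough illustration of why a postfix expression is a convenient target for the LLM, the sketch below evaluates one with a plain stack. It is not the paper's small model (which, per the abstract, works from token embeddings inside a deep learning platform); the function name `eval_postfix`, the whitespace-separated token format, and the four supported operators are assumptions made for this example.

```python
import operator

# Binary operators supported by this toy evaluator (an illustrative choice,
# not a list taken from the paper).
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_postfix(expr: str) -> float:
    """Evaluate a whitespace-separated postfix (reverse Polish) expression."""
    stack = []
    for tok in expr.split():
        if tok in OPS:
            if len(stack) < 2:
                raise ValueError(f"operator {tok!r} is missing operands")
            right = stack.pop()  # the most recent operand is the right-hand side
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))  # any non-operator token must parse as a number
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]
```

For instance, `eval_postfix("3 4 + 2 *")` evaluates (3 + 4) * 2. Postfix needs no parentheses or precedence rules, so a single left-to-right pass suffices.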

Title: ArcMMLU: A Library and Information Science Benchmark for Large Language Models. (arXiv:2311.18658v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18658
Code URL: https://github.com/stzhang-patrick/arcmmlu
Copy Paste: [[2311.18658]] ArcMMLU: A Library and Information Science Benchmark for Large Language Models(http://arxiv.org/abs/2311.18658)
Summary:
In light of the rapidly evolving capabilities of large language models (LLMs), it becomes imperative to develop rigorous domain-specific evaluation benchmarks to accurately assess their capabilities. In response to this need, this paper introduces ArcMMLU, a specialized benchmark tailored for the Library & Information Science (LIS) domain in Chinese. This benchmark aims to measure the knowledge and reasoning capability of LLMs within four key sub-domains: Archival Science, Data Science, Library Science, and Information Science. Following the format of MMLU/CMMLU, we collected over 6,000 high-quality questions for the compilation of ArcMMLU. This extensive compilation can reflect the diverse nature of the LIS domain and offer a robust foundation for LLM evaluation. Our comprehensive evaluation reveals that while most mainstream LLMs achieve an average accuracy rate above 50% on ArcMMLU, there remains a notable performance gap, suggesting substantial headroom for refinement in LLM capabilities within the LIS domain. Further analysis explores the effectiveness of few-shot examples on model performance and highlights challenging questions where models consistently underperform, providing valuable insights for targeted improvements. ArcMMLU fills a critical gap in LLM evaluations within the Chinese LIS domain and paves the way for future development of LLMs tailored to this specialized area.

Title: Women Are Beautiful, Men Are Leaders: Gender Stereotypes in Machine Translation and Language Modeling. (arXiv:2311.18711v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18711
Code URL: null
Copy Paste: [[2311.18711]] Women Are Beautiful, Men Are Leaders: Gender Stereotypes in Machine Translation and Language Modeling(http://arxiv.org/abs/2311.18711)
Summary:
We present GEST -- a new dataset for measuring gender-stereotypical reasoning in masked LMs and English-to-X machine translation systems. GEST contains samples that are compatible with 9 Slavic languages and English for 16 gender stereotypes about men and women (e.g., Women are beautiful, Men are leaders). The definition of said stereotypes was informed by gender experts. We used GEST to evaluate 11 masked LMs and 4 machine translation systems. We discovered significant and consistent amounts of stereotypical reasoning in almost all the evaluated models and languages.

Title: Categorical Traffic Transformer: Interpretable and Diverse Behavior Prediction with Tokenized Latent. (arXiv:2311.18307v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2311.18307
Code URL: null
Copy Paste: [[2311.18307]] Categorical Traffic Transformer: Interpretable and Diverse Behavior Prediction with Tokenized Latent(http://arxiv.org/abs/2311.18307)
Summary:
Adept traffic models are critical to both planning and closed-loop simulation for autonomous vehicles (AV), and key design objectives include accuracy, diverse multimodal behaviors, interpretability, and downstream compatibility. Recently, with the advent of large language models (LLMs), an additional desirable feature for traffic models is LLM compatibility. We present Categorical Traffic Transformer (CTT), a traffic model that outputs both continuous trajectory predictions and tokenized categorical predictions (lane modes, homotopies, etc.). The most outstanding feature of CTT is its fully interpretable latent space, which enables direct supervision of the latent variable from the ground truth during training and avoids mode collapse completely. As a result, CTT can generate diverse behaviors conditioned on different latent modes with semantic meanings while beating SOTA on prediction accuracy. In addition, CTT's ability to input and output tokens enables integration with LLMs for common-sense reasoning and zero-shot generalization.

gpt

llm

Title: Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings. (arXiv:2311.18034v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18034
Code URL: https://github.com/andreawwenyi/hyperpolyglot
Copy Paste: [[2311.18034]] Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings(http://arxiv.org/abs/2311.18034)
Summary:
Cross-lingual transfer learning is an important property of multilingual large language models (LLMs). But how do LLMs represent relationships between languages? Every language model has an input layer that maps tokens to vectors. This ubiquitous layer of language models is often overlooked. We find that similarities between these input embeddings are highly interpretable and that the geometry of these embeddings differs between model families. In one case (XLM-RoBERTa), embeddings encode language: tokens in different writing systems can be linearly separated with an average of 99.2% accuracy. Another family (mT5) represents cross-lingual semantic similarity: the 50 nearest neighbors for any token represent an average of 7.61 writing systems, and are frequently translations. This result is surprising given that there are no explicit parallel cross-lingual training corpora and no explicit incentive for translations in pre-training objectives. Our research opens the door for investigations in 1) The effect of pre-training and model architectures on representations of languages and 2) The applications of cross-lingual representations embedded in language models.

Title: Positional Information Matters for Invariant In-Context Learning: A Case Study of Simple Function Classes. (arXiv:2311.18194v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2311.18194
Code URL: null
Copy Paste: [[2311.18194]] Positional Information Matters for Invariant In-Context Learning: A Case Study of Simple Function Classes(http://arxiv.org/abs/2311.18194)
Summary:
In-context learning (ICL) refers to the ability of a model to condition on a few in-context demonstrations (input-output examples of the underlying task) to generate the answer for a new query input, without updating parameters. Despite the impressive ICL ability of LLMs, it has also been found that ICL in LLMs is sensitive to input demonstrations and limited to short context lengths. To understand the limitations and principles for successful ICL, we conduct an investigation with ICL linear regression of transformers. We characterize several Out-of-Distribution (OOD) cases for ICL inspired by realistic LLM ICL failures and compare transformers with DeepSet, a simple yet powerful architecture for ICL. Surprisingly, DeepSet outperforms transformers across a variety of distribution shifts, implying that preserving permutation invariance symmetry to input demonstrations is crucial for OOD ICL. The phenomenon specifies a fundamental requirement of ICL, which we term ICL invariance. Nevertheless, the positional encodings in LLMs will break ICL invariance. To this end, we further evaluate transformers with identical positional encodings and find that preserving ICL invariance in transformers achieves state-of-the-art performance across various ICL distribution shifts.

Title: FFT: Towards Harmlessness Evaluation and Analysis for LLMs with Factuality, Fairness, Toxicity. (arXiv:2311.18580v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18580
Code URL: null
Copy Paste: [[2311.18580]] FFT: Towards Harmlessness Evaluation and Analysis for LLMs with Factuality, Fairness, Toxicity(http://arxiv.org/abs/2311.18580)
Summary:
The widespread adoption of generative artificial intelligence has heightened concerns about the potential harms posed by AI-generated texts, primarily stemming from factoid, unfair, and toxic content. Previous researchers have invested much effort in assessing the harmlessness of generative language models. However, existing benchmarks are struggling in the era of large language models (LLMs), due to the stronger language generation and instruction-following capabilities, as well as wider applications. In this paper, we propose FFT, a new benchmark with 2116 elaborately designed instances, for LLM harmlessness evaluation with factuality, fairness, and toxicity. To investigate the potential harms of LLMs, we evaluate 9 representative LLMs covering various parameter scales, training stages, and creators. Experiments show that the harmlessness of LLMs is still unsatisfactory, and extensive analysis derives some insightful findings that could inspire future research on harmless LLMs.

long context

lora

Title: Predictable Reinforcement Learning Dynamics through Entropy Rate Minimization. (arXiv:2311.18703v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2311.18703
Code URL: null
Copy Paste: [[2311.18703]] Predictable Reinforcement Learning Dynamics through Entropy Rate Minimization(http://arxiv.org/abs/2311.18703)
Summary:
In Reinforcement Learning (RL), agents have no incentive to exhibit predictable behaviors, and are often pushed (through e.g. policy entropy regularization) to randomize their actions in favor of exploration. From a human perspective, this makes RL agents hard to interpret and predict, and from a safety perspective, even harder to formally verify. We propose a novel method to induce predictable behavior in RL agents, referred to as Predictability-Aware RL (PA-RL), which employs the state sequence entropy rate as a predictability measure. We show how the entropy rate can be formulated as an average reward objective, and since its entropy reward function is policy-dependent, we introduce an action-dependent surrogate entropy enabling the use of PG methods. We prove that deterministic policies minimizing the average surrogate reward exist and also minimize the actual entropy rate, and show how, given a learned dynamical model, we are able to approximate the value function associated to the true entropy rate. Finally, we demonstrate the effectiveness of the approach in RL tasks inspired by human-robot use-cases, and show how it produces agents with more predictable behavior while achieving near-optimal rewards.
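
To make the predictability measure concrete: for a finite, ergodic Markov chain with transition matrix P and stationary distribution pi, the standard entropy rate is H = -sum_i pi_i sum_j P_ij log P_ij. The sketch below computes it in plain Python. It is only an illustration of that textbook quantity under a fixed policy, not the paper's PA-RL algorithm or its surrogate reward; the function names and the power-iteration solver are assumptions made for this example.

```python
import math


def stationary_distribution(P, iters=10_000, tol=1e-12):
    """Approximate the stationary distribution of row-stochastic P by power iteration.

    Assumes the chain is ergodic; for periodic chains the iteration may not
    converge from an arbitrary starting point.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    return pi


def entropy_rate(P):
    """Entropy rate (in nats) of the Markov chain with transition matrix P."""
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0  # 0 * log(0) is taken as 0
    )
```

A chain whose next state is fully determined by the current one has entropy rate 0, while a chain that picks the next state uniformly at random among two states has entropy rate log 2; lower values correspond to more predictable state sequences.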

Title: The Sliding Regret in Stochastic Bandits: Discriminating Index and Randomized Policies. (arXiv:2311.18437v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2311.18437
Code URL: null
Copy Paste: [[2311.18437]] The Sliding Regret in Stochastic Bandits: Discriminating Index and Randomized Policies(http://arxiv.org/abs/2311.18437)
Summary:
This paper studies the one-shot behavior of no-regret algorithms for stochastic bandits. Although many algorithms are known to be asymptotically optimal with respect to the expected regret, over a single run, their pseudo-regret seems to follow one of two tendencies: it is either smooth or bumpy. To measure this tendency, we introduce a new notion: the sliding regret, that measures the worst pseudo-regret over a time-window of fixed length sliding to infinity. We show that randomized methods (e.g. Thompson Sampling and MED) have optimal sliding regret, while index policies, although possibly asymptotically optimal for the expected regret, have the worst possible sliding regret under regularity conditions on their index (e.g. UCB, UCB-V, KL-UCB, MOSS, IMED etc.). We further analyze the average bumpiness of the pseudo-regret of index policies via the regret of exploration, that we show to be suboptimal as well.
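
A finite-run proxy can help build intuition for the sliding regret. The abstract defines it through a window of fixed length sliding to infinity; the sketch below instead takes one recorded run and reports the worst pseudo-regret over any window of that length after an optional burn-in. The function names, the `burn_in` parameter, and the use of known arm means are assumptions for illustration, not definitions from the paper.

```python
def windowed_pseudo_regret(means, pulls, window):
    """Pseudo-regret accumulated in every length-`window` slice of one run.

    means: true mean reward of each arm (assumed known, for analysis only).
    pulls: sequence of arm indices chosen by the algorithm.
    """
    if window <= 0 or window > len(pulls):
        raise ValueError("window must be between 1 and len(pulls)")
    best = max(means)
    gaps = [best - means[a] for a in pulls]  # per-step gap to the best arm
    current = sum(gaps[:window])
    out = [current]
    for t in range(window, len(gaps)):
        current += gaps[t] - gaps[t - window]  # slide the window by one step
        out.append(current)
    return out


def worst_window_regret(means, pulls, window, burn_in=0):
    """Largest windowed pseudo-regret among windows starting at or after `burn_in`."""
    values = windowed_pseudo_regret(means, pulls, window)[burn_in:]
    if not values:
        raise ValueError("burn_in leaves no complete window")
    return max(values)
```

With a large `burn_in`, a run whose late windows still contain bursts of suboptimal pulls yields a large value, which is the "bumpy" behavior the abstract contrasts with smooth pseudo-regret.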

hallucination

prompt

Title: Communication-Efficient Heterogeneous Federated Learning with Generalized Heavy-Ball Momentum. (arXiv:2311.18578v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2311.18578
Code URL: null
Copy Paste: [[2311.18578]] Communication-Efficient Heterogeneous Federated Learning with Generalized Heavy-Ball Momentum(http://arxiv.org/abs/2311.18578)
Summary:
Federated Learning (FL) is the state-of-the-art approach for learning from decentralized data in privacy-constrained scenarios. As the current literature reports, the main problems associated with FL are system and statistical challenges: the former demand efficient learning from edge devices, including lowering communication bandwidth and frequency, while the latter require algorithms robust to non-iidness. State-of-the-art approaches either guarantee convergence at increased communication cost or are not sufficiently robust to handle extremely heterogeneous local distributions. In this work we propose a novel generalization of the heavy-ball momentum, and present FedHBM to effectively address statistical heterogeneity in FL without introducing any communication overhead. We conduct extensive experimentation on common FL vision and NLP datasets, showing that our FedHBM algorithm empirically yields better model quality and higher convergence speed w.r.t. the state of the art, especially in pathological non-iid scenarios. While being designed for cross-silo settings, we show how FedHBM is applicable in moderate-to-high cross-device scenarios, and how good model initializations (e.g. pre-training) can be exploited for prompt acceleration. Extended experimentation on large-scale real-world federated datasets further corroborates the effectiveness of our approach for real-world FL applications.

code

Title: A trainable manifold for accurate approximation with ReLU Networks. (arXiv:2311.18022v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2311.18022
Code URL: null
Copy Paste: [[2311.18022]] A trainable manifold for accurate approximation with ReLU Networks(http://arxiv.org/abs/2311.18022)
Summary:
We present a novel technique for exercising greater control of the weights of ReLU activated neural networks to produce more accurate function approximations. Many theoretical works encode complex operations into ReLU networks using smaller base components. In these works, a common base component is a constant width approximation to x^2, which has exponentially decaying error with respect to depth. We extend this block to represent a greater range of convex one-dimensional functions. We derive a manifold of weights such that the output of these new networks utilizes exponentially many piecewise-linear segments. This manifold guides their training process to overcome drawbacks associated with random initialization and unassisted gradient descent. We train these networks to approximate functions which do not necessarily lie on the manifold, showing a significant reduction of error values over conventional approaches.
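
The constant-width x^2 block referred to above is commonly attributed to Yarotsky's construction, in which a ReLU "hat" function is composed with itself and the scaled compositions are subtracted from x. The sketch below reproduces that standard construction on [0, 1] to show the exponentially decaying error; it is not the trainable manifold proposed in this paper, and the function names are chosen here for illustration.

```python
def relu(z):
    return max(0.0, z)


def hat(x):
    """Tent map on [0, 1] built from three ReLU units: 2x, then 2 - 2x."""
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)


def square_approx(x, depth):
    """Approximate x**2 on [0, 1] with `depth` composed hat layers.

    f_m(x) = x - sum_{s=1..m} hat^(s)(x) / 4**s, which is the piecewise
    linear interpolant of x**2 on a uniform grid with 2**m segments.
    """
    approx = x
    g = x
    for s in range(1, depth + 1):
        g = hat(g)  # s-fold composition of the hat function
        approx -= g / 4.0 ** s
    return approx


def max_error(depth, n_grid=10001):
    """Largest absolute error against x**2 on a uniform test grid."""
    points = (i / (n_grid - 1) for i in range(n_grid))
    return max(abs(square_approx(x, depth) - x * x) for x in points)


for m in range(1, 7):
    # The known bound for this construction is 2 ** (-2m - 2).
    print(m, max_error(m), 2.0 ** (-2 * m - 2))
```

Each extra layer doubles the number of linear segments and cuts the worst-case error by a factor of four, which is the exponential decay with depth that the abstract mentions.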

Title: Solving the Team Orienteering Problem with Transformers. (arXiv:2311.18662v1 [cs.AI])

Paper URL: http://arxiv.org/abs/2311.18662
Code URL: https://github.com/danifuertes/top_transformer
Copy Paste: [[2311.18662]] Solving the Team Orienteering Problem with Transformers(http://arxiv.org/abs/2311.18662)
Summary:
Route planning for a fleet of vehicles is an important task in applications such as package delivery, surveillance, or transportation. This problem is usually modeled as a Combinatorial Optimization problem known as the Team Orienteering Problem. The most popular Team Orienteering Problem solvers are mainly based on either linear programming, which provides accurate solutions at the cost of a large computation time that grows with the size of the problem, or heuristic methods, which usually find suboptimal solutions in a shorter amount of time. In this paper, a multi-agent route planning system capable of solving the Team Orienteering Problem in a very fast and accurate manner is presented. The proposed system is based on a centralized Transformer neural network that can learn to encode the scenario (modeled as a graph) and the context of the agents to provide fast and accurate solutions. Several experiments have been performed to demonstrate that the presented system can outperform most of the state-of-the-art works in terms of computation speed. In addition, the code is publicly available at https://github.com/danifuertes/top_transformer.

Title: C3Net: Compound Conditioned ControlNet for Multimodal Content Generation. (arXiv:2311.17951v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2311.17951
Code URL: null
Copy Paste: [[2311.17951]] C3Net: Compound Conditioned ControlNet for Multimodal Content Generation(http://arxiv.org/abs/2311.17951)
Summary:
We present Compound Conditioned ControlNet, C3Net, a novel generative neural architecture taking conditions from multiple modalities and synthesizing multimodal content simultaneously (e.g., image, text, audio). C3Net adapts the ControlNet architecture to jointly train and make inferences on a production-ready diffusion model and its trainable copies. Specifically, C3Net first aligns the conditions from multiple modalities to the same semantic latent space using modality-specific encoders based on contrastive training. Then, it generates multimodal outputs based on the aligned latent space, whose semantic information is combined using a ControlNet-like architecture called Control C3-UNet. Correspondingly, with this system design, our model offers an improved solution for joint-modality generation through learning and explaining multimodal conditions instead of simply taking linear interpolations on the latent space. Meanwhile, as we align conditions to a unified latent space, C3Net only requires one trainable Control C3-UNet to work on multimodal semantic information. Furthermore, our model employs unimodal pretraining on the condition alignment stage, outperforming the non-pretrained alignment even on relatively scarce training data and thus demonstrating high-quality compound condition generation. We contribute the first high-quality tri-modal validation set to validate quantitatively that C3Net outperforms or is on par with first and contemporary state-of-the-art multimodal generation. Our code and tri-modal dataset will be released.

Title: SMaRt: Improving GANs with Score Matching Regularity. (arXiv:2311.18208v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2311.18208
Code URL: null
Copy Paste: [[2311.18208]] SMaRt: Improving GANs with Score Matching Regularity(http://arxiv.org/abs/2311.18208)
Summary:
Generative adversarial networks (GANs) usually struggle in learning from highly diverse data, whose underlying manifold is complex. In this work, we revisit the mathematical foundations of GANs, and theoretically reveal that the native adversarial loss for GAN training is insufficient to fix the problem of subsets with positive Lebesgue measure of the generated data manifold lying out of the real data manifold. Instead, we find that score matching serves as a valid solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold. We thereby propose to improve the optimization of GANs with score matching regularity (SMaRt). As empirical evidence, we first design a toy example to show that training GANs with the aid of a ground-truth score function can help reproduce the real data distribution more accurately, and then confirm that our approach can consistently boost the synthesis performance of various state-of-the-art GANs on real-world datasets with pre-trained diffusion models acting as the approximate score function. For instance, when training Aurora on the ImageNet 64x64 dataset, we manage to improve FID from 8.87 to 7.11, on par with the performance of a one-step consistency model. The source code will be made public.

Title: Learning Robust Precipitation Forecaster by Temporal Frame Interpolation. (arXiv:2311.18341v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2311.18341
Code URL: https://github.com/secilia-cxy/unettfi
Copy Paste: [[2311.18341]] Learning Robust Precipitation Forecaster by Temporal Frame Interpolation(http://arxiv.org/abs/2311.18341)
Summary:
Recent advancements in deep learning have propelled the field of weather prediction models to new heights. Despite their progress, these models often struggle with real-world application due to their sensitivity to spatial-temporal shifts, a vulnerability particularly pronounced in weather prediction tasks where overfitting to local and temporal variations is common. This paper presents an investigation into the development of a robust precipitation forecasting model that stands resilient to such shifts. We introduce Temporal Frame Interpolation (TFI), an innovative technique designed to fortify forecasting models against spatial-temporal discrepancies. TFI operates by generating synthetic samples through the interpolation of adjacent frames from satellite imagery and ground radar data, thereby enriching the training dataset and bolstering the model's defense against noise on frames. Additionally, we integrate a novel multi-level dice loss, which exploits the ordinal nature of rainfall intensities to further refine model performance. These methodologies have collectively advanced our model's forecasting precision, achieving 1st place on the transfer learning leaderboard in the Weather4Cast'23 competition. This result not only demonstrates the efficacy of our approaches but also sets a new benchmark for deep learning applications in meteorological forecasting. Our code and weights are publicly available at https://github.com/Secilia-Cxy/UNetTFI.
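
The idea of building synthetic samples from adjacent frames can be sketched as follows. The abstract does not state the interpolation scheme, so this example assumes a plain linear blend with one random weight shared across the whole sequence; the function names, the nested-list frame format, and the toy values are all hypothetical, and the authors' repository should be consulted for the actual method.

```python
import random


def interpolate_frames(frame_a, frame_b, alpha):
    """Pixel-wise linear blend: (1 - alpha) * frame_a + alpha * frame_b."""
    return [
        [(1.0 - alpha) * a + alpha * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def augment_sequence(frames, rng):
    """Build a synthetic sequence by blending each frame with its successor.

    One weight is drawn per sequence so that the synthetic frames stay
    evenly spaced in time (an assumption of this sketch). The result has
    len(frames) - 1 frames.
    """
    alpha = rng.random()
    return [
        interpolate_frames(frames[t], frames[t + 1], alpha)
        for t in range(len(frames) - 1)
    ]


# Three toy 1x2 "frames" standing in for satellite or radar images.
frames = [[[0.0, 2.0]], [[1.0, 4.0]], [[2.0, 6.0]]]
synthetic = augment_sequence(frames, random.Random(0))
print(len(synthetic), synthetic[0])
```

In a training pipeline the same blend would presumably also be applied to the target frames, so that inputs and labels remain aligned in time; that detail is likewise an assumption here.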

Title: Data-Agnostic Model Poisoning against Federated Learning: A Graph Autoencoder Approach. (arXiv:2311.18498v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2311.18498
Code URL: null
Copy Paste: [[2311.18498]] Data-Agnostic Model Poisoning against Federated Learning: A Graph Autoencoder Approach(http://arxiv.org/abs/2311.18498)
Summary:
This paper proposes a novel, data-agnostic, model poisoning attack on Federated Learning (FL), by designing a new adversarial graph autoencoder (GAE)-based framework. The attack requires no knowledge of FL training data and achieves both effectiveness and undetectability. By listening to the benign local models and the global model, the attacker extracts the graph structural correlations among the benign local models and the training data features substantiating the models. The attacker then adversarially regenerates the graph structural correlations while maximizing the FL training loss, and subsequently generates malicious local models using the adversarial graph structure and the training data features of the benign ones. A new algorithm is designed to iteratively train the malicious local models using GAE and sub-gradient descent. The convergence of FL under attack is rigorously proved, with a considerably large optimality gap. Experiments show that the FL accuracy drops gradually under the proposed attack and existing defense mechanisms fail to detect it. The attack can give rise to an infection across all benign devices, making it a serious threat to FL.

chat

Title: Use of explicit replies as coordination mechanisms in online student debate. (arXiv:2311.18466v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2311.18466
Code URL: null
Copy Paste: [[2311.18466]] Use of explicit replies as coordination mechanisms in online student debate(http://arxiv.org/abs/2311.18466)
Summary:
People in conversation entrain their linguistic behaviours through spontaneous alignment mechanisms [7] - both in face-to-face and computer-mediated communication (CMC) [8]. In CMC, one of the mechanisms through which linguistic entrainment happens is explicit replies. Indeed, the use of explicit replies influences the structure of conversations, favouring the formation of reply-trees typically delineated by topic shifts [5]. The interpersonal coordination mechanisms realized by how actors address each other have been studied using a probabilistic framework proposed by David Gibson [2,3]. Other recent approaches use computational methods and information theory to quantify changes in text. We explore coordination mechanisms concerned with some of the roles utterances play in dialogues - specifically in explicit replies. We identify these roles by finding community structure in the conversation's vocabulary using a non-parametric, hierarchical topic model. Some conversations may always stay on the ground, remaining at the level of general introductory chatter. Others may develop a specific sub-topic in significant depth and detail. Still others may jump between general chatter, off-topic remarks and people agreeing or disagreeing without further elaboration.