secure

Title: Blockchain Technology to Secure Bluetooth. (arXiv:2211.06451v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2211.06451
Code URL: null
Copy Paste: [[2211.06451] Blockchain Technology to Secure Bluetooth](http://arxiv.org/abs/2211.06451)
Summary:
Bluetooth is a communication technology used to wirelessly exchange data between devices. In the last few years a great number of security vulnerabilities have been found, and adversaries are taking advantage of them, causing harm and significant loss. Numerous system security updates have been approved and installed to close security holes, fix bugs, and prevent attacks that could expose personal or other valuable information. But those updates are not sufficient, and new bugs keep showing up. In Bluetooth technology, pairing is identified as the step where most bugs are found, and most attacks target this particular part of the process. A new technology that has proven bulletproof when it comes to security and the exchange of sensitive information is Blockchain. Blockchain technology promises to integrate well into a network of smart devices and to secure the Internet of Things (IoT), where Bluetooth technology is used extensively. This work presents a vulnerability discovered in the Bluetooth pairing process and proposes a Blockchain-based approach to secure pairing and mitigate this vulnerability. The paper first introduces Bluetooth technology and delves into how Blockchain technology can be a solution to certain security problems. Then a solution approach shows how Blockchain can be integrated and implemented to ensure the required level of security. Certain attack incidents on vulnerable points of Bluetooth are examined, and the discussion and conclusions describe the extent of the security-related problems.

Title: Distributed and secure linear algebra -- Master Thesis. (arXiv:2211.06732v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2211.06732
Code URL: null
Copy Paste: [[2211.06732] Distributed and secure linear algebra -- Master Thesis](http://arxiv.org/abs/2211.06732)
Summary:
Cryptography is the discipline that makes it possible to secure the exchange of information. In this internship, we will focus on one branch of this discipline: secure computation in a network. The main goal of this internship, illustrated in this report, is to adapt a set of protocols intended for linear algebra so that they work on matrices with polynomial coefficients. We then wish to make a complete analysis of the different complexities of these protocols.

security

Title: Investigating co-occurrences of MITRE ATT&CK Techniques. (arXiv:2211.06495v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2211.06495
Code URL: null
Copy Paste: [[2211.06495] Investigating co-occurrences of MITRE ATT&CK Techniques](http://arxiv.org/abs/2211.06495)
Summary:
Cyberattacks use adversarial techniques to bypass system defenses, persist, and eventually breach systems. The MITRE ATT&CK framework catalogs a set of adversarial techniques and maps adversaries to the techniques and tactics they use. Understanding how adversaries deploy techniques in conjunction is pivotal for learning adversary behavior, hunting potential threats, and formulating a proactive defense. The goal of this research is to aid cybersecurity practitioners and researchers in choosing detection and mitigation strategies through co-occurrence analysis of adversarial techniques reported in MITRE ATT&CK. We collect the adversarial techniques of 115 cybercrime groups and 484 malware from MITRE ATT&CK. We apply association rule mining and network analysis to investigate how adversarial techniques co-occur. We identify that adversaries pair the T1059: Command and Scripting Interpreter and T1105: Ingress Tool Transfer techniques with a relatively large number of ATT&CK techniques. We also identify adversaries using the T1082: System Information Discovery technique to determine their next course of action. We observe that adversaries deploy the highest number of techniques from the TA0005: Defense Evasion and TA0007: Discovery tactics. Based on our findings on co-occurrence, we identify six detection strategies, six mitigation strategies, and twelve adversary behaviors. We urge defenders to prioritize the detection of TA0007: Discovery techniques and the mitigation of TA0005: Defense Evasion techniques. Overall, this study approximates how adversaries leverage techniques based on publicly reported documents. We advocate that organizations investigate adversarial techniques in their environment and make the findings available for a more precise and actionable understanding.
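
The co-occurrence analysis described in this summary can be illustrated with a minimal association-rule sketch. The technique IDs are the ones named in the abstract, but the four profiles, the `pair_rules` helper, and the `min_support` threshold are invented for illustration and are not the paper's data or code.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set lists the technique IDs reported for one adversary.
profiles = [
    {"T1059", "T1105", "T1082"},
    {"T1059", "T1105"},
    {"T1059", "T1082"},
    {"T1105", "T1082"},
]


def pair_rules(profiles, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for technique pairs."""
    n = len(profiles)
    single = Counter(t for p in profiles for t in p)
    pairs = Counter(
        frozenset(c) for p in profiles for c in combinations(sorted(p), 2)
    )
    rules = []
    for pair, count in pairs.items():
        support = count / n  # fraction of profiles containing both techniques
        if support < min_support:
            continue
        a, b = sorted(pair)
        # confidence(a -> b) = count(a and b) / count(a)
        rules.append((a, b, support, count / single[a]))
        rules.append((b, a, support, count / single[b]))
    return sorted(rules)


rules = pair_rules(profiles)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

Only pairs are mined here; general association rule mining also considers larger itemsets, which this sketch omits for brevity.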

Title: An investigation of security controls and MITRE ATT&CK techniques. (arXiv:2211.06500v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2211.06500
Code URL: null
Copy Paste: [[2211.06500] An investigation of security controls and MITRE ATT&CK techniques](http://arxiv.org/abs/2211.06500)
Summary:
Attackers utilize a plethora of adversarial techniques in cyberattacks to compromise the confidentiality, integrity, and availability of target organizations and systems. Information security standards such as NIST and ISO/IEC specify hundreds of security controls that organizations can enforce to protect and defend information systems from adversarial techniques. However, implementing all available controls at the same time can be infeasible, so security controls also need to be investigated in terms of their ability to mitigate the adversarial techniques used in cyberattacks. The goal of this research is to aid organizations in making informed choices on security controls to defend against cyberthreats through an investigation of adversarial techniques used in current cyberattacks. In this study, we investigated the extent of mitigation of 298 NIST SP800-53 controls over 188 adversarial techniques used by 669 cybercrime groups and malware cataloged in the MITRE ATT&CK framework, based upon an existing mapping between the controls and techniques. We identify that, based on the mapping, only 101 out of 298 controls are capable of mitigating adversarial techniques. However, we also identify that 53 adversarial techniques cannot be mitigated by any existing controls, and these techniques primarily aid adversaries in bypassing system defense and discovering targeted system information. We identify a set of 20 critical controls that can mitigate 134 adversarial techniques, and on average, can mitigate 72% of all techniques used by 98% of the cataloged adversaries in MITRE ATT&CK. We urge organizations that do not have any controls in place to implement the top controls identified in the study.
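
The kind of coverage question raised in this summary (which controls mitigate anything, and which techniques no control covers) can be sketched as follows. The control-to-technique mapping below is invented for illustration, and the greedy selection is an assumed heuristic, not the procedure the paper used to pick its 20 critical controls.

```python
# Hypothetical mapping from control IDs to the technique IDs they mitigate.
control_to_techniques = {
    "AC-2": {"T1078", "T1059"},
    "SI-4": {"T1059", "T1105"},
    "CM-7": {"T1105"},
    "PE-3": set(),  # a control with no mapped techniques
}
observed_techniques = {"T1059", "T1078", "T1082", "T1105"}


def coverage(mapping, techniques):
    """Return (controls that mitigate something, covered techniques, uncovered techniques)."""
    mitigating = {c for c, ts in mapping.items() if ts & techniques}
    covered = set().union(*mapping.values()) & techniques
    return mitigating, covered, techniques - covered


def greedy_controls(mapping, techniques, k):
    """Assumed heuristic: repeatedly pick the control covering the most remaining techniques."""
    remaining = set(techniques)
    chosen = []
    for _ in range(k):
        # sorted() makes tie-breaking deterministic (first ID in alphabetical order wins).
        best = max(sorted(mapping), key=lambda c: len(mapping[c] & remaining))
        if not mapping[best] & remaining:
            break  # no control adds further coverage
        chosen.append(best)
        remaining -= mapping[best]
    return chosen, set(techniques) - remaining


mitigating, covered, uncovered = coverage(control_to_techniques, observed_techniques)
chosen, chosen_cover = greedy_controls(control_to_techniques, observed_techniques, k=2)
print(sorted(mitigating), sorted(uncovered), chosen, sorted(chosen_cover))
```

In this toy mapping one control mitigates nothing and one technique has no mitigating control, mirroring on a small scale the two gaps the abstract reports.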

Title: A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges. (arXiv:2211.06665v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2211.06665
Code URL: null
Copy Paste: [[2211.06665] A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges](http://arxiv.org/abs/2211.06665)
Summary:
Reinforcement Learning (RL) is a popular machine learning paradigm where intelligent agents interact with the environment to fulfill a long-term goal. Driven by the resurgence of deep learning, Deep RL (DRL) has witnessed great success over a wide spectrum of complex control tasks. Despite the encouraging results achieved, the deep neural network-based backbone is widely deemed a black box that prevents practitioners from trusting and employing trained agents in realistic scenarios where high security and reliability are essential. To alleviate this issue, a large volume of literature has been devoted to shedding light on the inner workings of the intelligent agents, by constructing intrinsic interpretability or post-hoc explainability. In this survey, we provide a comprehensive review of existing works on eXplainable RL (XRL) and introduce a new taxonomy where prior works are clearly categorized into model-explaining, reward-explaining, state-explaining, and task-explaining methods. We also review and highlight RL methods that conversely leverage human knowledge to promote learning efficiency and final performance of agents, although this kind of method is often ignored in the XRL field. Some open challenges and opportunities in XRL are discussed. This survey intends to provide a high-level summarization and better understanding of XRL and to motivate future research on more effective XRL solutions. Corresponding open source codes are collected and categorized at https://github.com/Plankson/awesome-explainable-reinforcement-learning.

privacy

Title: More Generalized and Personalized Unsupervised Representation Learning In A Distributed System. (arXiv:2211.06470v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06470
Code URL: null
Copy Paste: [[2211.06470] More Generalized and Personalized Unsupervised Representation Learning In A Distributed System](http://arxiv.org/abs/2211.06470)
Summary:
Discriminative unsupervised learning methods such as contrastive learning have demonstrated the ability to learn generalized visual representations on centralized data. It is nonetheless challenging to adapt such methods to a distributed system with unlabeled, private, and heterogeneous client data due to user styles and preferences. Federated learning enables multiple clients to collectively learn a global model without provoking any privacy breach between local clients. On the other hand, another direction of federated learning studies personalized methods to address the local heterogeneity. However, work on solving both generalization and personalization without labels in a decentralized setting remains underexplored. In this work, we propose a novel method, FedStyle, to learn a more generalized global model by infusing local style information with local content information for contrastive learning, and to learn more personalized local models by inducing local style information for downstream tasks. The style information is extracted by contrasting original local data with strongly augmented local data (Sobel filtered images). Through extensive experiments with linear evaluations in both IID and non-IID settings, we demonstrate that FedStyle outperforms both the generalization baseline methods and personalization baseline methods in a stylized decentralized setting. Through comprehensive ablations, we demonstrate that our design of style infusion and stylized personalization improves performance significantly.
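
The strong augmentation mentioned in this summary is Sobel filtering. The sketch below shows only what a Sobel gradient-magnitude filter computes on a tiny grayscale image stored as nested lists; the `sobel_magnitude` helper and the toy image are illustrative assumptions and say nothing about how FedStyle itself applies the filter.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity gradients.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Gradient magnitude over the valid interior; output is (h-2) x (w-2)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * (w - 2) for _ in range(h - 2)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(KX[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(KY[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y - 1][x - 1] = math.hypot(gx, gy)
    return out


# Toy 4x4 image with a vertical edge between the second and third columns.
image = [[0, 0, 1, 1] for _ in range(4)]
edges = sobel_magnitude(image)
flat = sobel_magnitude([[5, 5, 5] for _ in range(3)])
print(edges, flat)
```

The filter responds to the edge and returns zero on the flat patch, which is why a Sobel-filtered image keeps outlines while discarding most texture and color information.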

Title: MSLKANet: A Multi-Scale Large Kernel Attention Network for Scene Text Removal. (arXiv:2211.06565v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06565
Code URL: null
Copy Paste: [[2211.06565] MSLKANet: A Multi-Scale Large Kernel Attention Network for Scene Text Removal](http://arxiv.org/abs/2211.06565)
Summary:
Scene text removal aims to remove the text and fill the regions with perceptually plausible background information in natural images. It has attracted increasing attention due to its various applications in privacy protection, scene text retrieval, and text editing. With the development of deep learning, previous methods have achieved significant improvements. However, most of the existing methods seem to ignore large receptive fields and global information. The pioneering method can get significant improvements by only changing training data from the cropped image to the full image. In this paper, we present a single-stage multi-scale network, MSLKANet, for scene text removal in full images. To obtain large receptive fields and global information, we propose multi-scale large kernel attention (MSLKA) to capture long-range dependencies between the text regions and the backgrounds at various granularity levels. Furthermore, we combine the large kernel decomposition mechanism and atrous spatial pyramid pooling to build a large kernel spatial pyramid pooling (LKSPP), which can perceive more valid pixels in the spatial dimension while maintaining large receptive fields and low cost of computation. Extensive experimental results indicate that the proposed method achieves state-of-the-art performance on both synthetic and real-world datasets and confirm the effectiveness of the proposed components MSLKA and LKSPP.

Title: Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning. (arXiv:2211.06530v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2211.06530
Code URL: null
Copy Paste: [[2211.06530] Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning](http://arxiv.org/abs/2211.06530)
Summary:
We introduce new differentially private (DP) mechanisms for gradient-based machine learning (ML) training involving multiple passes (epochs) of a dataset, substantially improving the achievable privacy-utility-computation tradeoffs. Our key contribution is an extension of the online matrix factorization DP mechanism to multiple participations, substantially generalizing the approach of DMRST2022. We first give conditions under which it is possible to reduce the problem with per-iteration vector contributions to the simpler one of scalar contributions. Using this, we formulate the construction of optimal (in total squared error at each iterate) matrix mechanisms for SGD variants as a convex program. We propose an efficient optimization algorithm via a closed form solution to the dual function.

While tractable, both solving the convex problem offline and computing the necessary noise masks during training can become prohibitively expensive when many training steps are necessary. To address this, we design a Fourier-transform-based mechanism with significantly less computation and only a minor utility decrease.

Extensive empirical evaluation on two tasks, example-level DP for image classification and user-level DP for language modeling, demonstrates substantial improvements over the previous state-of-the-art. Though our primary application is to ML, we note our main DP results are applicable to arbitrary linear queries and hence may have much broader applicability.

Title: Dark patterns in e-commerce: a dataset and its baseline evaluations. (arXiv:2211.06543v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2211.06543
Code URL: https://github.com/yamanalab/ec-darkpattern
Copy Paste: [[2211.06543] Dark patterns in e-commerce: a dataset and its baseline evaluations](http://arxiv.org/abs/2211.06543)
Summary:
Dark patterns, which are user interface designs in online services, induce users to take unintended actions. Recently, dark patterns have been raised as an issue of privacy and fairness. Thus, a wide range of research on detecting dark patterns is eagerly awaited. In this work, we constructed a dataset for dark pattern detection and established its baseline detection performance with state-of-the-art machine learning methods. The original dataset, obtained from Mathur et al.'s 2019 study, consists of 1,818 dark pattern texts from shopping sites. Then, we added negative samples, i.e., non-dark pattern texts, by retrieving texts from the same websites as Mathur et al.'s dataset. We also applied state-of-the-art machine learning methods, including BERT, RoBERTa, ALBERT, and XLNet, to show the automatic detection accuracy as baselines. As a result of 5-fold cross-validation, we achieved the highest accuracy of 0.975 with RoBERTa. The dataset and baseline source codes are available at https://github.com/yamanalab/ec-darkpattern.

Title: TAPAS: a Toolbox for Adversarial Privacy Auditing of Synthetic Data. (arXiv:2211.06550v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2211.06550
Code URL: null
Copy Paste: [[2211.06550] TAPAS: a Toolbox for Adversarial Privacy Auditing of Synthetic Data](http://arxiv.org/abs/2211.06550)
Summary:
Personal data collected at scale promises to improve decision-making and accelerate innovation. However, sharing and using such data raises serious privacy concerns. A promising solution is to produce synthetic data, artificial records to share instead of real data. Since synthetic records are not linked to real persons, this intuitively prevents classical re-identification attacks. However, this is insufficient to protect privacy. We here present TAPAS, a toolbox of attacks to evaluate synthetic data privacy under a wide range of scenarios. These attacks include generalizations of prior works and novel attacks. We also introduce a general framework for reasoning about privacy threats to synthetic data and showcase TAPAS on several examples.

Title: Provable Membership Inference Privacy. (arXiv:2211.06582v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2211.06582
Code URL: null
Copy Paste: [[2211.06582] Provable Membership Inference Privacy](http://arxiv.org/abs/2211.06582)
Summary:
In applications involving sensitive data, such as finance and healthcare, the necessity for preserving data privacy can be a significant barrier to machine learning model development. Differential privacy (DP) has emerged as one canonical standard for provable privacy. However, DP's strong theoretical guarantees often come at the cost of a large drop in its utility for machine learning, and DP guarantees themselves can be difficult to interpret. In this work, we propose a novel privacy notion, membership inference privacy (MIP), to address these challenges. We give a precise characterization of the relationship between MIP and DP, and show that MIP can be achieved using less randomness than is required for guaranteeing DP, leading to a smaller drop in utility. MIP guarantees are also easily interpretable in terms of the success rate of membership inference attacks. Our theoretical results also give rise to a simple algorithm for guaranteeing MIP which can be used as a wrapper around any algorithm with a continuous output, including parametric model training.

Title: Privacy-Preserving Credit Card Fraud Detection using Homomorphic Encryption. (arXiv:2211.06675v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2211.06675
Code URL: https://github.com/davidnugent2425/he-cc-fraud-detection
Copy Paste: [[2211.06675] Privacy-Preserving Credit Card Fraud Detection using Homomorphic Encryption](http://arxiv.org/abs/2211.06675)
Summary:
Credit card fraud is a problem continuously faced by financial institutions and their customers, which is mitigated by fraud detection systems. However, these systems require the use of sensitive customer transaction data, which introduces both a lack of privacy for the customer and a data breach vulnerability to the card provider. This paper proposes a system for private fraud detection on encrypted transactions using homomorphic encryption. Two models, XGBoost and a feedforward classifier neural network, are trained as fraud detectors on plaintext data. They are then converted to models which use homomorphic encryption for private inference. Latency, storage, and detection results are discussed, along with use cases and feasibility of deployment. The XGBoost model has better performance, with an encrypted inference as low as 6ms, compared to 296ms for the neural network. However, the neural network implementation may still be preferred, as it is simpler to deploy securely. A codebase for the system is also provided, for simulation and further development.

Title: PriMask: Cascadable and Collusion-Resilient Data Masking for Mobile Cloud Inference. (arXiv:2211.06716v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2211.06716
Code URL: https://github.com/jls2007/primask
Copy Paste: [[2211.06716] PriMask: Cascadable and Collusion-Resilient Data Masking for Mobile Cloud Inference](http://arxiv.org/abs/2211.06716)
Summary:
Mobile cloud offloading is indispensable for inference tasks based on large-scale deep models. However, transmitting privacy-rich inference data to the cloud incurs concerns. This paper presents the design of a system called PriMask, in which the mobile device uses a secret small-scale neural network called MaskNet to mask the data before transmission. PriMask significantly weakens the cloud's capability to recover the data or extract certain private attributes. The MaskNet is cascadable in that the mobile can opt in to or out of its use seamlessly without any modifications to the cloud's inference service. Moreover, the mobiles use different MaskNets, such that the collusion between the cloud and some mobiles does not weaken the protection for other mobiles. We devise a split adversarial learning method to train a neural network that generates a new MaskNet quickly (within two seconds) at run time. We apply PriMask to three mobile sensing applications with diverse modalities and complexities, i.e., human activity recognition, urban environment crowdsensing, and driver behavior recognition. Results show PriMask's effectiveness in all three applications.

Title: Modular Clinical Decision Support Networks (MoDN) -- Updatable, Interpretable, and Portable Predictions for Evolving Clinical Environments. (arXiv:2211.06637v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2211.06637
Code URL: null
Copy Paste: [[2211.06637] Modular Clinical Decision Support Networks (MoDN) -- Updatable, Interpretable, and Portable Predictions for Evolving Clinical Environments](http://arxiv.org/abs/2211.06637)
Summary:
Data-driven Clinical Decision Support Systems (CDSS) have the potential to improve and standardise care with personalised probabilistic guidance. However, the size of data required necessitates collaborative learning from analogous CDSS's, which are often unsharable or imperfectly interoperable (IIO), meaning their feature sets are not perfectly overlapping. We propose Modular Clinical Decision Support Networks (MoDN) which allow flexible, privacy-preserving learning across IIO datasets, while providing interpretable, continuous predictive feedback to the clinician.

MoDN is a novel decision tree composed of feature-specific neural network modules. It creates dynamic personalised representations of patients, and can make multiple predictions of diagnoses, updatable at each step of a consultation. The modular design allows it to compartmentalise training updates to specific features and collaboratively learn between IIO datasets without sharing any data.

protect

defense

attack

Title: Generating Textual Adversaries with Minimal Perturbation. (arXiv:2211.06571v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2211.06571
Code URL: https://github.com/xingyizhao/tampers
Copy Paste: [[2211.06571] Generating Textual Adversaries with Minimal Perturbation](http://arxiv.org/abs/2211.06571)
Summary:
Many word-level adversarial attack approaches for textual data have been proposed in recent studies. However, due to the massive search space consisting of combinations of candidate words, the existing approaches face the problem of preserving the semantics of texts when crafting adversarial counterparts. In this paper, we develop a novel attack strategy to find adversarial texts with high similarity to the original texts while introducing minimal perturbation. The rationale is that we expect the adversarial texts with small perturbation can better preserve the semantic meaning of original texts. Experiments show that, compared with state-of-the-art attack approaches, our approach achieves higher success rates and lower perturbation rates in four benchmark datasets.

robust

Title: Affinity Feature Strengthening for Accurate, Complete and Robust Vessel Segmentation. (arXiv:2211.06578v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06578
Code URL: https://github.com/TY-Shi/AFN
Copy Paste: [[2211.06578] Affinity Feature Strengthening for Accurate, Complete and Robust Vessel Segmentation](http://arxiv.org/abs/2211.06578)
Summary:
Vessel segmentation is essential in many medical image applications, such as the detection of coronary stenoses, retinal vessel diseases and brain aneurysms. A high pixel-wise accuracy, complete topology structure and robustness to various contrast variations are three critical aspects of vessel segmentation. However, most existing methods only focus on achieving part of them via dedicated designs while few of them can concurrently achieve the three goals. In this paper, we present a novel affinity feature strengthening network (AFN) which adopts a contrast-insensitive approach based on multiscale affinity to jointly model topology and refine pixel-wise segmentation features. Specifically, for each pixel we derive a multiscale affinity field which captures the semantic relationships of the pixel with its neighbors on the predicted mask image. Such a multiscale affinity field can effectively represent the local topology of a vessel segment of different sizes. Meanwhile, it does not depend on image intensities and hence is robust to various illumination and contrast changes. We further learn spatial- and scale-aware adaptive weights for the corresponding affinity fields to strengthen vessel features. We evaluate our AFN on four different types of vascular datasets: X-ray angiography coronary vessel dataset (XCAD), portal vein dataset (PV), digital subtraction angiography cerebrovascular vessel dataset (DSA) and retinal vessel dataset (DRIVE). Extensive experimental results on the four datasets demonstrate that our AFN outperforms the state-of-the-art methods in terms of both higher accuracy and topological metrics, and meanwhile is more robust to various contrast changes than existing methods. Codes will be made public.

Title: OpenGait: Revisiting Gait Recognition Toward Better Practicality. (arXiv:2211.06597v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06597
Code URL: https://github.com/shiqiyu/opengait
Copy Paste: [[2211.06597] OpenGait: Revisiting Gait Recognition Toward Better Practicality](http://arxiv.org/abs/2211.06597)
Summary:
Gait recognition is one of the most important long-distance identification technologies and increasingly gains popularity in both research and industry communities. Although significant progress has been made in indoor datasets, much evidence shows that gait recognition techniques perform poorly in the wild. More importantly, we also find that many conclusions from prior works change with the evaluation datasets. Therefore, the more critical goal of this paper is to present a comprehensive benchmark study for better practicality rather than only a particular model for better performance. To this end, we first develop a flexible and efficient gait recognition codebase named OpenGait. Based on OpenGait, we deeply revisit the recent development of gait recognition by re-conducting the ablative experiments. Encouragingly, we find many hidden troubles of prior works and new insights for future research. Inspired by these discoveries, we develop a structurally simple, empirically powerful and practically robust baseline model, GaitBase. Experimentally, we comprehensively compare GaitBase with many current gait recognition methods on multiple public datasets, and the results reflect that GaitBase achieves significantly strong performance in most cases regardless of indoor or outdoor situations. The source code is available at https://github.com/ShiqiYu/OpenGait.

Title: AU-Aware Vision Transformers for Biased Facial Expression Recognition. (arXiv:2211.06609v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06609
Code URL: null
Copy Paste: [[2211.06609] AU-Aware Vision Transformers for Biased Facial Expression Recognition](http://arxiv.org/abs/2211.06609)
Summary:
Studies have proven that domain bias and label bias exist in different Facial Expression Recognition (FER) datasets, making it hard to improve the performance of a specific dataset by adding other datasets. For the FER bias issue, recent research mainly focuses on the cross-domain issue with advanced domain adaptation algorithms. This paper addresses another problem: how to boost FER performance by leveraging cross-domain datasets. Unlike the coarse and biased expression label, the facial Action Unit (AU) is fine-grained and objective, as suggested by psychological studies. Motivated by this, we resort to the AU information of different FER datasets for performance boosting and make contributions as follows. First, we experimentally show that the naive joint training of multiple FER datasets is harmful to the FER performance of individual datasets. We further introduce expression-specific mean images and AU cosine distances to measure FER dataset bias. This novel measurement shows consistent conclusions with experimental degradation of joint training. Second, we propose a simple yet conceptually-new framework, AU-aware Vision Transformer (AU-ViT). It improves the performance of individual datasets by jointly training auxiliary datasets with AU or pseudo-AU labels. We also find that the AU-ViT is robust to real-world occlusions. Moreover, for the first time, we prove that a carefully-initialized ViT achieves comparable performance to advanced deep convolutional networks. Our AU-ViT achieves state-of-the-art performance on three popular datasets, namely 91.10% on RAF-DB, 65.59% on AffectNet, and 90.15% on FERPlus. The code and models will be released soon.

Title: MARLIN: Masked Autoencoder for facial video Representation LearnINg. (arXiv:2211.06627v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06627
Code URL: https://github.com/ControlNet/MARLIN
Copy Paste: [[2211.06627] MARLIN: Masked Autoencoder for facial video Representation LearnINg](http://arxiv.org/abs/2211.06627)
Summary:
This paper proposes a self-supervised approach to learn universal facial representations from videos, that can transfer across a variety of facial analysis tasks such as Facial Attribute Recognition (FAR), Facial Expression Recognition (FER), DeepFake Detection (DFD), and Lip Synchronization (LS). Our proposed framework, named MARLIN, is a facial video masked autoencoder, that learns highly robust and generic facial embeddings from abundantly available non-annotated web crawled facial videos. As a challenging auxiliary task, MARLIN reconstructs the spatio-temporal details of the face from the densely masked facial regions which mainly include eyes, nose, mouth, lips, and skin to capture local and global aspects that in turn help in encoding generic and transferable features. Through a variety of experiments on diverse downstream tasks, we demonstrate MARLIN to be an excellent facial video encoder as well as feature extractor, that performs consistently well across a variety of downstream tasks including FAR (1.13% gain over supervised benchmark), FER (2.64% gain over unsupervised benchmark), DFD (1.86% gain over unsupervised benchmark), LS (29.36% gain for Frechet Inception Distance), and even in low data regime. Our codes and pre-trained models will be made public.

Title: Far Away in the Deep Space: Nearest-Neighbor-Based Dense Out-of-Distribution Detection. (arXiv:2211.06660v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06660
Code URL: null
Copy Paste: [[2211.06660] Far Away in the Deep Space: Nearest-Neighbor-Based Dense Out-of-Distribution Detection](http://arxiv.org/abs/2211.06660)
Summary:
The key to out-of-distribution detection is density estimation of the in-distribution data or of its feature representations. While good parametric solutions to this problem exist for well curated classification data, these are less suitable for complex domains, such as semantic segmentation. In this paper, we show that a k-Nearest-Neighbors approach can achieve surprisingly good results with small reference datasets and runtimes, and be robust with respect to hyperparameters, such as the number of neighbors and the choice of the support set size. Moreover, we show that it combines well with anomaly scores from standard parametric approaches, and we find that transformer features are particularly well suited to detect novel objects in combination with k-Nearest-Neighbors. Ultimately, the approach is simple and non-invasive, i.e., it does not affect the primary segmentation performance, avoids training on examples of anomalies, and achieves state-of-the-art results on the common benchmarks with +23% and +16% AP improvements on RoadAnomaly and StreetHazards respectively.
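
The k-Nearest-Neighbors idea in this summary can be sketched as a distance-based anomaly score. The abstract does not say how the score is derived from the neighbors, so using the distance to the k-th nearest reference vector is an assumption here, as are the `knn_score` name and the 2-D toy vectors standing in for per-pixel deep features.

```python
import math


def knn_score(query, reference, k=3):
    """Distance from `query` to its k-th nearest reference vector (higher = more anomalous)."""
    dists = sorted(math.dist(query, r) for r in reference)
    return dists[min(k, len(dists)) - 1]


# Hypothetical in-distribution reference features (a small support set).
reference = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (1.0, 1.0)]

in_dist = knn_score((0.05, 0.05), reference, k=3)   # close to the reference cluster
out_dist = knn_score((5.0, 5.0), reference, k=3)    # far from every reference vector
print(round(in_dist, 3), round(out_dist, 3))
```

For dense detection, such a score would be computed for every pixel's feature vector; thresholding it gives an out-of-distribution mask without any training on anomalous examples.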

Title: NeighborTrack: Improving Single Object Tracking by Bipartite Matching with Neighbor Tracklets. (arXiv:2211.06663v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06663
Code URL: https://github.com/franktpmvu/NeighborTrack
Copy Paste: [[2211.06663] NeighborTrack: Improving Single Object Tracking by Bipartite Matching with Neighbor Tracklets](http://arxiv.org/abs/2211.06663)
Summary:
We propose a post-processor, called NeighborTrack, that leverages neighbor information of the tracking target to validate and improve single-object tracking (SOT) results. It requires no additional data or retraining. Instead, it uses the confidence score predicted by the backbone SOT network to automatically derive neighbor information and then uses this information to improve the tracking results. When tracking an occluded target, its appearance features are untrustworthy. However, a general siamese network often cannot tell whether the tracked object is occluded by reading the confidence score alone, because it could be misled by neighbors with high confidence scores. Our proposed NeighborTrack takes advantage of unoccluded neighbors' information to reconfirm the tracking target and reduces false tracking when the target is occluded. It not only reduces the impact caused by occlusion, but also fixes tracking problems caused by object appearance changes. NeighborTrack is agnostic to SOT networks and post-processing methods. For the VOT challenge dataset commonly used in short-term object tracking, we improve three famous SOT networks, Ocean, TransT, and OSTrack, by an average of 1.92% EAO and 2.11% robustness. For the mid- and long-term tracking experiments based on OSTrack, we achieve state-of-the-art 72.25% AUC on LaSOT and 75.7% AO on GOT-10K.

Title: Adversarial and Random Transformations for Robust Domain Adaptation and Generalization. (arXiv:2211.06788v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06788
Code URL: null
Copy Paste: [[2211.06788] Adversarial and Random Transformations for Robust Domain Adaptation and Generalization](http://arxiv.org/abs/2211.06788)
Summary:
Data augmentation has been widely used to improve generalization in training deep neural networks. Recent works show that using worst-case transformations or adversarial augmentation strategies can significantly improve the accuracy and robustness. However, due to the non-differentiable properties of image transformations, searching algorithms such as reinforcement learning or evolution strategy have to be applied, which are not computationally practical for large scale problems. In this work, we show that by simply applying consistency training with random data augmentation, state-of-the-art results on domain adaptation (DA) and generalization (DG) can be obtained. To further improve the accuracy and robustness with adversarial examples, we propose a differentiable adversarial data augmentation method based on spatial transformer networks (STN). The combined adversarial and random transformations based method outperforms the state-of-the-art on multiple DA and DG benchmark datasets. Besides, the proposed method shows desirable robustness to corruption, which is also validated on commonly used datasets.

Title: Few-shot Multimodal Sentiment Analysis based on Multimodal Probabilistic Fusion Prompts. (arXiv:2211.06607v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2211.06607
Code URL: null
Copy Paste: [[2211.06607] Few-shot Multimodal Sentiment Analysis based on Multimodal Probabilistic Fusion Prompts](http://arxiv.org/abs/2211.06607)
Summary:
Multimodal sentiment analysis is a trending topic with the explosion of multimodal content on the web. Present studies in multimodal sentiment analysis rely on large-scale supervised data. Collating supervised data is time-consuming and labor-intensive. As such, it is essential to investigate the problem of few-shot multimodal sentiment analysis. Previous works in few-shot models generally use language model prompts, which can improve performance in low-resource settings. However, the textual prompt ignores the information from other modalities. We propose Multimodal Probabilistic Fusion Prompts, which can provide diverse cues for multimodal sentiment detection. We first design a unified multimodal prompt to reduce the discrepancy in different modal prompts. To improve the robustness of our model, we then leverage multiple diverse prompts for each input and propose a probabilistic method to fuse the output predictions. Extensive experiments conducted on three datasets confirm the effectiveness of our approach.

Title: Robust Training of Graph Neural Networks via Noise Governance. (arXiv:2211.06614v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2211.06614
Code URL: null
Copy Paste: [[2211.06614] Robust Training of Graph Neural Networks via Noise Governance](http://arxiv.org/abs/2211.06614)
Summary:
Graph Neural Networks (GNNs) have become widely-used models for semi-supervised learning. However, the robustness of GNNs in the presence of label noise remains a largely under-explored problem. In this paper, we consider an important yet challenging scenario where labels on nodes of graphs are not only noisy but also scarce. In this scenario, the performance of GNNs is prone to degrade due to label noise propagation and insufficient learning. To address these issues, we propose a novel RTGNN (Robust Training of Graph Neural Networks via Noise Governance) framework that achieves better robustness by learning to explicitly govern label noise. More specifically, we introduce self-reinforcement and consistency regularization as supplemental supervision. The self-reinforcement supervision is inspired by the memorization effects of deep neural networks and aims to correct noisy labels. Further, the consistency regularization prevents GNNs from overfitting to noisy labels via mimicry loss in both the inter-view and intra-view perspectives. To leverage such supervisions, we divide labels into clean and noisy types, rectify inaccurate labels, and further generate pseudo-labels on unlabeled nodes. Supervision for nodes with different types of labels is then chosen adaptively. This enables sufficient learning from clean labels while limiting the impact of noisy ones. We conduct extensive experiments to evaluate the effectiveness of our RTGNN framework, and the results validate its consistent superior performance over state-of-the-art methods with two types of label noises and various noise rates.

Title: Using Features at Multiple Temporal and Spatial Resolutions to Predict Human Behavior in Real Time. (arXiv:2211.06721v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2211.06721
Code URL: null
Copy Paste: [[2211.06721] Using Features at Multiple Temporal and Spatial Resolutions to Predict Human Behavior in Real Time](http://arxiv.org/abs/2211.06721)
Summary:
When performing complex tasks, humans naturally reason at multiple temporal and spatial resolutions simultaneously. We contend that for an artificially intelligent agent to effectively model human teammates, i.e., demonstrate computational theory of mind (ToM), it should do the same. In this paper, we present an approach for integrating high and low-resolution spatial and temporal information to predict human behavior in real time and evaluate it on data collected from human subjects performing simulated urban search and rescue (USAR) missions in a Minecraft-based environment. Our model composes neural networks for high and low-resolution feature extraction with a neural network for behavior prediction, with all three networks trained simultaneously. The high-resolution extractor encodes dynamically changing goals robustly by taking as input the Manhattan distance difference between the humans' Minecraft avatars and candidate goals in the environment for the latest few actions, computed from a high-resolution gridworld representation. In contrast, the low-resolution extractor encodes participants' historical behavior using a historical state matrix computed from a low-resolution graph representation. Through supervised learning, our model acquires a robust prior for human behavior prediction, and can effectively deal with long-term observations. Our experimental results demonstrate that our method significantly improves prediction accuracy compared to approaches that only use high-resolution information.
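
The high-resolution input described above, the change in Manhattan distance between the avatar and each candidate goal over recent actions, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the 2-D grid coordinates, and the sign convention (negative means moving closer) are illustrative choices, not details taken from the paper.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two grid cells."""
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_differences(positions, goals):
    """For each consecutive pair of positions, return the change in
    Manhattan distance to every goal (negative = moving closer)."""
    features = []
    for prev, curr in zip(positions, positions[1:]):
        features.append([manhattan(curr, g) - manhattan(prev, g) for g in goals])
    return features


# Hypothetical avatar trajectory and candidate goals on a gridworld.
positions = [(0, 0), (1, 0), (2, 0)]
goals = [(3, 0), (0, 3)]
features = distance_differences(positions, goals)
```

Here the avatar moves steadily toward the first goal and away from the second, which is the kind of signal a downstream network could use to infer the current goal.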

Title: Pareto-Optimal Learning-Augmented Algorithms for Online k-Search Problems. (arXiv:2211.06567v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2211.06567
Code URL: null
Copy Paste: [[2211.06567] Pareto-Optimal Learning-Augmented Algorithms for Online k-Search Problems](http://arxiv.org/abs/2211.06567)
Summary:
This paper leverages machine learned predictions to design online algorithms for the k-max and k-min search problems. Our algorithms can achieve performances competitive with the offline algorithm in hindsight when the predictions are accurate (i.e., consistency) and also provide worst-case guarantees when the predictions are arbitrarily wrong (i.e., robustness). Further, we show that our algorithms have attained the Pareto-optimal trade-off between consistency and robustness, where no other algorithms for k-max or k-min search can improve on the consistency for a given robustness. To demonstrate the performance of our algorithms, we evaluate them in experiments of buying and selling Bitcoin.

Title: RISE: Robust Individualized Decision Learning with Sensitive Variables. (arXiv:2211.06569v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2211.06569
Code URL: https://github.com/ellenxtan/rise
Copy Paste: [[2211.06569] RISE: Robust Individualized Decision Learning with Sensitive Variables](http://arxiv.org/abs/2211.06569)
Summary:
This paper introduces RISE, a robust individualized decision learning framework with sensitive variables, where sensitive variables are collectible data and important to the intervention decision, but their inclusion in decision making is prohibited due to reasons such as delayed availability or fairness concerns. A naive baseline is to ignore these sensitive variables in learning decision rules, leading to significant uncertainty and bias. To address this, we propose a decision learning framework to incorporate sensitive variables during offline training but not include them in the input of the learned decision rule during model deployment. Specifically, from a causal perspective, the proposed framework intends to improve the worst-case outcomes of individuals caused by sensitive variables that are unavailable at the time of decision. Unlike most existing literature that uses mean-optimal objectives, we propose a robust learning framework by finding a newly defined quantile- or infimum-optimal decision rule. The reliable performance of the proposed method is demonstrated through synthetic experiments and three real-world applications.

Title: A Generalized Doubly Robust Learning Framework for Debiasing Post-Click Conversion Rate Prediction. (arXiv:2211.06684v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2211.06684
Code URL: null
Copy Paste: [[2211.06684] A Generalized Doubly Robust Learning Framework for Debiasing Post-Click Conversion Rate Prediction](http://arxiv.org/abs/2211.06684)
Summary:
Post-click conversion rate (CVR) prediction is an essential task for discovering user interests and increasing platform revenues in a range of industrial applications. One of the most challenging problems of this task is the existence of severe selection bias caused by the inherent self-selection behavior of users and the item selection process of systems. Currently, doubly robust (DR) learning approaches achieve the state-of-the-art performance for debiasing CVR prediction. However, in this paper, by theoretically analyzing the bias, variance and generalization bounds of DR methods, we find that existing DR approaches may have poor generalization caused by inaccurate estimation of propensity scores and imputation errors, which often occur in practice. Motivated by such analysis, we propose a generalized learning framework that not only unifies existing DR methods, but also provides a valuable opportunity to develop a series of new debiasing techniques to accommodate different application scenarios. Based on the framework, we propose two new DR methods, namely DR-BIAS and DR-MSE. DR-BIAS directly controls the bias of DR loss, while DR-MSE balances the bias and variance flexibly, which achieves better generalization performance. In addition, we propose a novel tri-level joint learning optimization method for DR-MSE in CVR prediction, and an efficient training algorithm correspondingly. We conduct extensive experiments on both real-world and semi-synthetic datasets, which validate the effectiveness of our proposed methods.
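
For orientation, the sketch below computes the standard doubly robust estimate that this line of work builds on: the imputed error for every user-item pair plus an inverse-propensity-weighted correction on the observed pairs. It is a minimal sketch of the generic DR estimator, not of the DR-BIAS or DR-MSE objectives proposed in the paper; the function name and the toy values are assumptions.

```python
def dr_loss(errors, imputed, observed, propensities):
    """Standard doubly robust loss estimate over all user-item pairs.

    errors:       true prediction error, or None where the pair is unobserved
    imputed:      imputed error for every pair
    observed:     1 if the pair was observed (clicked), else 0
    propensities: estimated probability of each pair being observed
    """
    total = 0.0
    for e, e_hat, o, p in zip(errors, imputed, observed, propensities):
        # The correction term only applies where the true error is available.
        correction = (e - e_hat) / p if o else 0.0
        total += e_hat + correction
    return total / len(imputed)


# Toy example with three pairs, the second one unobserved.
loss = dr_loss(
    errors=[0.2, None, 0.6],
    imputed=[0.3, 0.5, 0.4],
    observed=[1, 0, 1],
    propensities=[0.5, 0.25, 0.8],
)
```

The estimate is unbiased if either the imputed errors or the propensities are accurate; inaccuracy in both is the situation the summary identifies as harmful to generalization.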

Title: Deep Reinforcement Learning with Vector Quantized Encoding. (arXiv:2211.06733v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2211.06733
Code URL: null
Copy Paste: [[2211.06733] Deep Reinforcement Learning with Vector Quantized Encoding](http://arxiv.org/abs/2211.06733)
Summary:
Human decision-making often involves combining similar states into categories and reasoning at the level of the categories rather than the actual states. Guided by this intuition, we propose a novel method for clustering state features in deep reinforcement learning (RL) methods to improve their interpretability. Specifically, we propose a plug-and-play framework termed vector quantized reinforcement learning (VQ-RL) that extends classic RL pipelines with an auxiliary classification task based on vector quantized (VQ) encoding and aligns with policy training. The VQ encoding method categorizes features with similar semantics into clusters and results in tighter clusters with better separation compared to classic deep RL methods, thus enabling neural models to learn similarities and differences between states better. Furthermore, we introduce two regularization methods to help increase the separation between clusters and avoid the risks associated with VQ training. In simulations, we demonstrate that VQ-RL improves interpretability and investigate its impact on robustness and generalization of deep RL.
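
The core operation behind vector quantized encoding is assigning a feature to its nearest codebook entry, which is what groups similar states into a shared category. The sketch below shows only that lookup step; the function name `quantize`, the squared Euclidean metric, and the two-entry codebook are assumptions, and the paper's auxiliary task and regularizers are not modeled.

```python
def quantize(feature, codebook):
    """Return (index, code vector) of the codebook entry nearest to `feature`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)), key=lambda i: sq_dist(feature, codebook[i]))
    return index, codebook[index]


# Hypothetical learned codebook with two cluster centers.
codebook = [(0.0, 0.0), (1.0, 1.0)]
cluster_id, code = quantize((0.9, 0.8), codebook)
```

States whose features map to the same `cluster_id` can then be inspected together, which is one way such an encoding may support interpretability.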

biometric

Title: Few-Shot Learning for Biometric Verification. (arXiv:2211.06761v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06761
Code URL: null
Copy Paste: [[2211.06761] Few-Shot Learning for Biometric Verification](http://arxiv.org/abs/2211.06761)
Summary:
In machine learning applications, it is common practice to feed the model as much information as possible. In most cases, the model can handle large data sets, which allow it to predict more accurately. In the presence of data scarcity, a Few-Shot learning (FSL) approach aims to build more accurate algorithms with limited training data. We propose a novel end-to-end lightweight architecture that verifies biometric data, producing results competitive with state-of-the-art accuracies through Few-Shot learning methods. Dense layers add to the complexity of state-of-the-art deep learning models, which prevents them from being used in low-power applications. In the presented approach, a shallow network is coupled with a conventional machine learning technique that exploits hand-crafted features to verify biometric images from multi-modal sources such as signatures, periocular region, iris, face, and fingerprints. We introduce a self-estimated threshold that strictly monitors the False Acceptance Rate (FAR) while generalizing its results, hence eliminating user-defined thresholds from ROC curves that are likely to be biased by the local data distribution. This hybrid model benefits from few-shot learning to make up for the scarcity of data in biometric use cases. We have conducted extensive experimentation with commonly used biometric datasets. The obtained results provide an effective solution for biometric verification systems.

steal

extraction

Title: Data-driven Approach for Automatically Correcting Faulty Road Maps. (arXiv:2211.06544v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06544
Code URL: null
Copy Paste: [[2211.06544] Data-driven Approach for Automatically Correcting Faulty Road Maps](http://arxiv.org/abs/2211.06544)
Summary:
Maintaining road networks is labor-intensive, especially in actively developing countries where roads change frequently. Many automatic road extraction approaches have been introduced to solve this real-world problem, fueled by the abundance of large-scale high-resolution satellite imagery and advances in data-driven vision technology. However, their performance is not yet sufficient to fully automate road map extraction in real-world services. Hence, many services employ human-in-the-loop approaches on the extracted road maps: semi-automatic detection and repair of faulty road maps. Our paper exclusively focuses on the latter, introducing a novel data-driven approach for fixing road maps. We incorporate image inpainting approaches to tackle complex road geometries without custom-made algorithms for each road shape, yielding a method that is readily applicable to any road map segmentation model. We compare our method with the baselines on various road geometries, such as straight and curvy roads, T-junctions, and intersections, to demonstrate the effectiveness of our approach.

Title: Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning. (arXiv:2211.06612v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06612
Code URL: null
Copy Paste: [[2211.06612] Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning](http://arxiv.org/abs/2211.06612)
Summary:
We investigate a practical domain adaptation task, called source-free domain adaptation (SFUDA), where the source-pretrained model is adapted to the target domain without access to the source data. Existing techniques mainly leverage self-supervised pseudo labeling to achieve class-wise global alignment [1] or rely on local structure extraction that encourages feature consistency among neighborhoods [2]. While impressive progress has been made, both lines of methods have their own drawbacks - the "global" approach is sensitive to noisy labels while the "local" counterpart suffers from source bias. In this paper, we present Divide and Contrast (DaC), a new paradigm for SFUDA that strives to connect the good ends of both worlds while bypassing their limitations. Based on the prediction confidence of the source model, DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals under an adaptive contrastive learning framework. Specifically, the source-like samples are utilized for learning global class clustering thanks to their relatively clean labels. The more noisy target-specific data are harnessed at the instance level for learning the intrinsic local structures. We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch. Extensive experiments on VisDA, Office-Home, and the more challenging DomainNet have verified the superior performance of DaC over current state-of-the-art approaches. The code is available at https://github.com/ZyeZhang/DaC.git.
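
The Maximum Mean Discrepancy mentioned above measures how far apart two sets of features are in a kernel space. The sketch below computes the simple biased estimate of squared MMD with an RBF kernel; it is a minimal illustration, not the DaC loss itself: the memory bank is not modeled, and the kernel choice, `gamma` value, and toy samples are assumptions.

```python
import math


def rbf(a, b, gamma=1.0):
    """RBF kernel exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between sample sets xs and ys."""
    def mean_kernel(us, vs):
        return sum(rbf(u, v, gamma) for u in us for v in vs) / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2 * mean_kernel(xs, ys)


# Hypothetical source-like and target-specific feature samples.
source_like = [(0.0, 0.0), (0.1, 0.0)]
target_specific = [(2.0, 2.0), (2.1, 2.0)]

same = mmd2(source_like, source_like)
shifted = mmd2(source_like, target_specific)
```

Minimizing such a quantity with respect to the feature extractor pulls the two groups of samples toward a common distribution.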

Title: Auto Lead Extraction and Digitization of ECG Paper Records using cGAN. (arXiv:2211.06720v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06720
Code URL: null
Copy Paste: [[2211.06720] Auto Lead Extraction and Digitization of ECG Paper Records using cGAN](http://arxiv.org/abs/2211.06720)
Summary:
Purpose: An Electrocardiogram (ECG) is the simplest and fastest bio-medical test that is used to detect any heart-related disease. ECG signals are generally stored in paper form, which makes it difficult to store and analyze the data. While capturing ECG leads from paper ECG records, a lot of background information is also captured, which results in incorrect data interpretation.

Methods: We propose a deep learning-based model for individually extracting all 12 leads from 12-lead ECG images captured using a camera. To simplify the analysis of the ECG and the calculation of complex parameters, we also propose a method to convert the paper ECG format into a storable digital format. The You Only Look Once, Version 3 (YOLOv3) algorithm has been used to extract the leads present in the image. These leads are then passed on to another deep learning model which separates the ECG signal and background from the single-lead image. After that, vertical scanning is performed on the ECG signal to convert it into a 1-Dimensional (1D) digital form. To perform the task of digitalization, we used the pix-2-pix deep learning model and binarized the ECG signals.

Results: Our proposed method was able to achieve an accuracy of 97.4%.

Conclusion: The information on the paper ECG fades away over time. Hence, the digitized ECG signals make it possible to store the records and access them anytime. This proves highly beneficial for heart patients who require frequent ECG reports. The stored data can also be useful for research purposes, as this data can be used to develop computer algorithms that are capable of analyzing the data.
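
The vertical scanning step described in the Methods can be pictured as reading one amplitude per image column from a binarized single-lead image. The sketch below is one plausible reading of that step, not the authors' code: the function name, the use of the mean row of signal pixels per column, and the convention that 1 marks a signal pixel are all assumptions.

```python
def vertical_scan(binary_image):
    """Convert a binarized ECG image (rows of 0/1) into a 1D signal.

    For each column, take the mean row index of the signal pixels and flip it
    so that higher positions in the image give larger amplitudes. Columns
    without any signal pixel yield None.
    """
    height = len(binary_image)
    signal = []
    for col in range(len(binary_image[0])):
        rows = [r for r in range(height) if binary_image[r][col]]
        if rows:
            signal.append((height - 1) - sum(rows) / len(rows))
        else:
            signal.append(None)
    return signal


# Tiny hypothetical binarized trace: rises for two columns, then drops.
image = [
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 1],
]
trace = vertical_scan(image)
```

Converting pixel units into millivolts and seconds would additionally require the grid scale of the paper record, which is outside this sketch.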

Title: Deep Unsupervised Key Frame Extraction for Efficient Video Classification. (arXiv:2211.06742v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06742
Code URL: null
Copy Paste: [[2211.06742] Deep Unsupervised Key Frame Extraction for Efficient Video Classification](http://arxiv.org/abs/2211.06742)
Summary:
Video processing and analysis have become an urgent task since a huge number of videos (e.g., YouTube, Hulu) are uploaded online every day. The extraction of representative key frames from videos is very important in video processing and analysis since it greatly reduces computing resources and time. Although great progress has been made recently, large-scale video classification remains an open problem, as the existing methods have not balanced performance and efficiency well. To tackle this problem, this work presents an unsupervised method to retrieve the key frames, which combines a Convolutional Neural Network (CNN) and Temporal Segment Density Peaks Clustering (TSDPC). The proposed TSDPC is a generic and powerful framework with two advantages over previous works: it can calculate the number of key frames automatically, and it can preserve the temporal information of the video. Thus it improves the efficiency of video classification. Furthermore, a Long Short-Term Memory network (LSTM) is added on top of the CNN to further elevate the performance of classification. Moreover, a weight fusion strategy of different input networks is presented to boost the performance. By optimizing both video classification and key frame extraction simultaneously, we achieve better classification performance and higher efficiency. We evaluate our method on two popular datasets (i.e., HMDB51 and UCF101) and the experimental results consistently demonstrate that our strategy achieves competitive performance and efficiency compared with the state-of-the-art approaches.

Title: Large-Scale Bidirectional Training for Zero-Shot Image Captioning. (arXiv:2211.06774v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06774
Code URL: https://github.com/tgisaturday/BITTERS
Copy Paste: [[2211.06774] Large-Scale Bidirectional Training for Zero-Shot Image Captioning](http://arxiv.org/abs/2211.06774)
Summary:
When trained on large-scale datasets, image captioning models can understand the content of images from a general domain but often fail to generate accurate, detailed captions. To improve performance, pretraining-and-finetuning has been a key strategy for image captioning. However, we find that large-scale bidirectional training between image and text enables zero-shot image captioning. In this paper, we introduce Bidirectional Image Text Training in largER Scale, BITTERS, an efficient training and inference framework for zero-shot image captioning. We also propose a new evaluation benchmark which comprises high-quality datasets and an extensive set of metrics to properly evaluate zero-shot captioning accuracy and societal bias. We additionally provide an efficient finetuning approach for keyword extraction. We show that careful selection of the large-scale training set and model architecture is the key to achieving zero-shot image captioning.

membership infer

federate

Title: Differentially Private Vertical Federated Learning. (arXiv:2211.06782v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2211.06782
Code URL: null
Copy Paste: [[2211.06782] Differentially Private Vertical Federated Learning](http://arxiv.org/abs/2211.06782)
Summary:
A successful machine learning (ML) algorithm often relies on a large amount of high-quality data to train well-performing models. Supervised learning approaches, such as deep learning techniques, generate high-quality ML functions for real-life applications, but at a large cost in human effort to label training data. Recent advancements in federated learning (FL) allow multiple data owners or organisations to collaboratively train a machine learning model without sharing raw data. In this light, vertical FL allows organisations to build a global model when the participating organisations have vertically partitioned data. Further, in the vertical FL setting the participating organisation generally requires fewer resources compared to sharing data directly, enabling lightweight and scalable distributed training solutions. However, privacy protection in vertical FL is challenging due to the communication of intermediate outputs and the gradients of model updates. This invites adversarial entities to infer other organisations' underlying data. Thus, in this paper, we aim to explore how to protect the privacy of individual organisation data in a differential privacy (DP) setting. We run experiments with different real-world datasets and DP budgets. Our experimental results show that a trade-off point needs to be found to achieve a balance between the vertical FL performance and privacy protection in terms of the amount of perturbation noise.
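
The perturbation noise mentioned above is commonly applied by bounding the norm of whatever is communicated and then adding random noise to it. The sketch below shows that generic clip-and-perturb pattern with Gaussian noise. It is an assumption about the general technique rather than the scheme used in the paper: the function name, the L2 clipping, and the parameter values are illustrative, and calibrating `noise_std` to a specific DP budget is not shown.

```python
import math
import random


def clip_and_perturb(vector, clip_norm, noise_std, rng):
    """Clip `vector` to L2 norm `clip_norm`, then add Gaussian noise per entry."""
    norm = math.sqrt(sum(v * v for v in vector))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * scale + rng.gauss(0.0, noise_std) for v in vector]


# Hypothetical intermediate output shared by one organisation.
rng = random.Random(0)
intermediate = [3.0, 4.0]
clipped_only = clip_and_perturb(intermediate, clip_norm=1.0, noise_std=0.0, rng=rng)
noisy = clip_and_perturb(intermediate, clip_norm=1.0, noise_std=0.5, rng=rng)
```

A larger `noise_std` gives stronger protection but degrades the signal the other parties receive, which is the trade-off the summary refers to.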

Title: FedRule: Federated Rule Recommendation System with Graph Neural Networks. (arXiv:2211.06812v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2211.06812
Code URL: null
Copy Paste: [[2211.06812] FedRule: Federated Rule Recommendation System with Graph Neural Networks](http://arxiv.org/abs/2211.06812)
Summary:
Much of the value that IoT (Internet-of-Things) devices bring to "smart" homes lies in their ability to automatically trigger other devices' actions: for example, a smart camera triggering a smart lock to unlock a door. Manually setting up these rules for smart devices or applications, however, is time-consuming and inefficient. Rule recommendation systems can automatically suggest rules for users by learning which rules are popular based on those previously deployed (e.g., in others' smart homes). Conventional recommendation formulations require a central server to record the rules used in many users' homes, which compromises their privacy and leaves them vulnerable to attacks on the central server's database of rules. Moreover, these solutions typically leverage generic user-item matrix methods that do not fully exploit the structure of the rule recommendation problem. In this paper, we propose a new rule recommendation system, dubbed FedRule, to address these challenges. One graph is constructed per user from the rules they are using, and the rule recommendation is formulated as a link prediction task in these graphs. This formulation enables us to design a federated training algorithm that is able to keep users' data private. Extensive experiments corroborate our claims by demonstrating that FedRule achieves performance comparable to the centralized setting and outperforms conventional solutions.

fair

interpretability

Title: Generalization Beyond Feature Alignment: Concept Activation-Guided Contrastive Learning. (arXiv:2211.06843v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2211.06843
Code URL: null
Copy Paste: [[2211.06843] Generalization Beyond Feature Alignment: Concept Activation-Guided Contrastive Learning](http://arxiv.org/abs/2211.06843)
Summary:
Learning invariant representations via contrastive learning has seen state-of-the-art performance in domain generalization (DG). Despite such success, in this paper, we find that its core learning strategy -- feature alignment -- could heavily hinder the model generalization. Inspired by the recent progress in neuron interpretability, we characterize this problem from a neuron activation view. Specifically, by treating feature elements as neuron activation states, we show that conventional alignment methods tend to deteriorate the diversity of learned invariant features, as they indiscriminately minimize all neuron activation differences. This instead ignores rich relations among neurons -- many of them often identify the same visual concepts though they emerge differently. With this finding, we present a simple yet effective approach, Concept Contrast (CoCo), which relaxes element-wise feature alignments by contrasting high-level concepts encoded in neurons. This approach is highly flexible and can be integrated into any contrastive method in DG. Through extensive experiments, we further demonstrate that our CoCo promotes the diversity of feature representations, and consistently improves model generalization capability over the DomainBed benchmark.

Title: Instance-based Learning for Knowledge Base Completion. (arXiv:2211.06807v1 [cs.AI])

Paper URL: http://arxiv.org/abs/2211.06807
Code URL: https://github.com/chenxran/instancebasedlearning
Copy Paste: [[2211.06807] Instance-based Learning for Knowledge Base Completion](http://arxiv.org/abs/2211.06807)
Summary:
In this paper, we propose a new method for knowledge base completion (KBC): instance-based learning (IBL). For example, to answer (Jill Biden, lived city, ?), instead of going directly to Washington D.C., our goal is to find Joe Biden, who has the same lived city as Jill Biden. Through prototype entities, IBL provides interpretability. We develop theories for modeling prototypes and combining IBL with translational models. Experiments on various tasks confirmed the IBL model's effectiveness and interpretability.

In addition, IBL sheds light on the mechanism of rule-based KBC models. Previous research has generally agreed that rule-based models provide rules with semantically compatible premises and hypotheses. We challenge this view. We begin by demonstrating that some logical rules represent instance-based equivalence (i.e., prototypes) rather than semantic compatibility. These are denoted as IBL rules. Surprisingly, despite occupying only a small portion of the rule space, IBL rules outperform non-IBL rules in all four benchmarks. We use a variety of experiments to demonstrate that rule-based models work because they have the ability to represent instance-based equivalence via IBL rules. The findings provide new insights into how rule-based models work and how to interpret their rules.