secure

Title: OblivIO: Securing reactive programs by oblivious execution with bounded traffic overheads. (arXiv:2301.08148v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2301.08148
Code URL: null
Copy Paste: [[2301.08148] OblivIO: Securing reactive programs by oblivious execution with bounded traffic overheads](http://arxiv.org/abs/2301.08148) #secure
Summary:
Traffic analysis attacks remain a significant problem for online security. Communication between nodes can be observed by network-level attackers as it inherently takes place in the open. Despite online services increasingly using encrypted traffic, the shape of the traffic is not hidden. To prevent traffic analysis, the shape of a system's traffic must be independent of secrets. We investigate adapting the data-oblivious approach to the reactive setting and present OblivIO, a secure language for writing reactive programs driven by network events. Our approach pads with dummy messages to hide which program sends are genuinely executed. We use an information-flow type system to provably enforce timing-sensitive noninterference. The type system is extended with potentials to bound the overhead in traffic introduced by our approach. We address challenges that arise from joining data-oblivious and reactive programming and demonstrate the feasibility of our resulting language by developing an interpreter that implements security-critical operations as constant-time algorithms.
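
The padding idea in the summary above can be illustrated with a minimal Python sketch. It is not OblivIO syntax or the authors' interpreter: the names `Message`, `oblivious_send`, `observable_shape`, and the fixed `PAYLOAD_SIZE` are hypothetical, and the payload is assumed to be encrypted on the wire so that an observer sees only the channel and the length. Python gives no constant-time guarantees, so the sketch models traffic shape only, not timing.

```python
from dataclasses import dataclass

PAYLOAD_SIZE = 16  # hypothetical fixed wire size for every message


@dataclass(frozen=True)
class Message:
    channel: str
    is_real: bool   # assumed to travel inside the encrypted payload
    payload: bytes  # always exactly PAYLOAD_SIZE bytes


def oblivious_send(channel, secret_condition, data):
    """Always emit one message; a dummy replaces the send that is not taken."""
    body = data if secret_condition else b""
    padded = body[:PAYLOAD_SIZE].ljust(PAYLOAD_SIZE, b"\x00")
    return Message(channel, secret_condition, padded)


def observable_shape(msg):
    """What a network-level observer is assumed to see: channel and length."""
    return (msg.channel, len(msg.payload))


real = oblivious_send("alerts", True, b"hi")
dummy = oblivious_send("alerts", False, b"hi")
print(observable_shape(real) == observable_shape(dummy))
```

Because a message of the same size is emitted whether or not the secret condition holds, the observable shape is the same in both cases; the cost is one extra message whenever the send is a dummy, which is the kind of overhead the summary says the type system bounds.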

security

privacy

Title: Differentially Private Online Bayesian Estimation With Adaptive Truncation. (arXiv:2301.08202v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2301.08202
Code URL: https://github.com/sinanyildirim/smc_dp_adatr
Copy Paste: [[2301.08202] Differentially Private Online Bayesian Estimation With Adaptive Truncation](http://arxiv.org/abs/2301.08202) #privacy
Summary:
We propose a novel online and adaptive truncation method for differentially private Bayesian online estimation of a static parameter regarding a population. We assume that sensitive information from individuals is collected sequentially and the inferential aim is to estimate, on-the-fly, a static parameter regarding the population to which those individuals belong. We propose sequential Monte Carlo to perform online Bayesian estimation. When individuals provide sensitive information in response to a query, it is necessary to perturb it with privacy-preserving noise to ensure the privacy of those individuals. The amount of perturbation is proportional to the sensitivity of the query, which is determined usually by the range of the queried information. The truncation technique we propose adapts to the previously collected observations to adjust the query range for the next individual. The idea is that, based on previous observations, we can carefully arrange the interval into which the next individual's information is to be truncated before being perturbed with privacy-preserving noise. In this way, we aim to design predictive queries with small sensitivity, hence small privacy-preserving noise, enabling more accurate estimation while maintaining the same level of privacy. To decide on the location and the width of the interval, we use an exploration-exploitation approach a la Thompson sampling with an objective function based on the Fisher information of the generated observation. We show the merits of our methodology with numerical examples.
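
The link between query range, sensitivity, and noise described above can be sketched in Python with the Laplace mechanism. This assumes a scalar response and Laplace noise, neither of which the summary specifies; `noise_scale`, `laplace_noise`, and `private_response` are hypothetical names, and the adaptive choice of the interval (the Thompson-sampling-style step) is not modeled.

```python
import random


def noise_scale(lo, hi, epsilon):
    """Laplace scale for a response truncated to [lo, hi]: sensitivity / epsilon."""
    return (hi - lo) / epsilon


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_response(x, lo, hi, epsilon, rng):
    """Truncate x to [lo, hi], then perturb it with Laplace noise."""
    clipped = min(max(x, lo), hi)
    return clipped + laplace_noise(noise_scale(lo, hi, epsilon), rng)


rng = random.Random(0)
wide = noise_scale(0.0, 100.0, epsilon=1.0)     # uninformed, wide interval
narrow = noise_scale(40.0, 60.0, epsilon=1.0)   # interval narrowed from past data
print(wide, narrow)
print(private_response(52.0, 40.0, 60.0, 1.0, rng))
```

At the same privacy level, the narrower interval needs a fifth of the noise scale, at the risk of truncation bias when the true value falls outside the interval.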

protect

defense

Title: On the Vulnerability of Backdoor Defenses for Federated Learning. (arXiv:2301.08170v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2301.08170
Code URL: null
Copy Paste: [[2301.08170] On the Vulnerability of Backdoor Defenses for Federated Learning](http://arxiv.org/abs/2301.08170) #defense
Summary:
Federated Learning (FL) is a popular distributed machine learning paradigm that enables jointly training a global model without sharing clients' data. However, its repetitive server-client communication gives room for backdoor attacks that aim to mislead the global model into a targeted misprediction when a specific trigger pattern is presented. In response to such backdoor threats on federated learning, various defense measures have been proposed. In this paper, we study whether the current defense mechanisms truly neutralize the backdoor threats from federated learning in a practical setting by proposing a new federated backdoor attack method for possible countermeasures. Different from traditional training (on triggered data) and rescaling (the malicious client model) based backdoor injection, the proposed backdoor attack framework (1) directly modifies (a small proportion of) local model weights to inject the backdoor trigger via sign flips; (2) jointly optimizes the trigger pattern with the client model, and is thus more persistent and stealthy for circumventing existing defenses. In a case study, we examine the strengths and weaknesses of recent federated backdoor defenses from three major categories and provide suggestions to practitioners for training federated models in practice.

attack

Title: System on Chip Rejuvenation in the Wake of Persistent Attacks. (arXiv:2301.08018v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2301.08018
Code URL: null
Copy Paste: [[2301.08018] System on Chip Rejuvenation in the Wake of Persistent Attacks](http://arxiv.org/abs/2301.08018) #attack
Summary:
To cope with the ever-increasing threats of dynamic and adaptive persistent attacks, Fault and Intrusion Tolerance (FIT) is being studied at the hardware level to increase critical systems' resilience. Based on state-machine replication, FIT is known to be effective if replicas are compromised and fail independently. This requires different ways of diversification at the software and hardware levels. In this paper, we introduce the first hardware-based rejuvenation framework, which we call Samsara, that allows for creating new computing cores (on which FIT replicas run) with diverse architectures. This is made possible by taking advantage of the programmable and reconfigurable features of MPSoC with an FPGA. A persistent attack that analyzes and exploits the vulnerability of a core will not be able to exploit it, as rejuvenation to a different core architecture is made fast enough. We discuss the feasibility of this design, and we leave the empirical evaluations for future work.

robust

Title: A Domain-Agnostic Approach for Characterization of Lifelong Learning Systems. (arXiv:2301.07799v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2301.07799
Code URL: null
Copy Paste: [[2301.07799] A Domain-Agnostic Approach for Characterization of Lifelong Learning Systems](http://arxiv.org/abs/2301.07799) #robust
Summary:
Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original training context, and systems will instead need to adapt to novel distributions and tasks while deployed. This critical gap may be addressed through the development of "Lifelong Learning" systems that are capable of 1) Continuous Learning, 2) Transfer and Adaptation, and 3) Scalability. Unfortunately, efforts to improve these capabilities are typically treated as distinct areas of research that are assessed independently, without regard to the impact of each separate capability on other aspects of the system. We instead propose a holistic approach, using a suite of metrics and an evaluation framework to assess Lifelong Learning in a principled way that is agnostic to specific domains or system techniques. Through five case studies, we show that this suite of metrics can inform the development of varied and complex Lifelong Learning systems. We highlight how the proposed suite of metrics quantifies performance trade-offs present during Lifelong Learning system development - both the widely discussed Stability-Plasticity dilemma and the newly proposed relationship between Sample Efficient and Robust Learning. Further, we make recommendations for the formulation and use of metrics to guide the continuing development of Lifelong Learning systems and assess their progress in the future.

Title: Measuring uncertainty in human visual segmentation. (arXiv:2301.07807v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2301.07807
Code URL: null
Copy Paste: [[2301.07807] Measuring uncertainty in human visual segmentation](http://arxiv.org/abs/2301.07807) #robust
Summary:
Segmenting visual stimuli into distinct groups of features and visual objects is central to visual function. Classical psychophysical methods have helped uncover many rules of human perceptual segmentation, and recent progress in machine learning has produced successful algorithms. Yet, the computational logic of human segmentation remains unclear, partially because we lack well-controlled paradigms to measure perceptual segmentation maps and compare models quantitatively. Here we propose a new, integrated approach: given an image, we measure multiple pixel-based same-different judgments and perform model-based reconstruction of the underlying segmentation map. The reconstruction is robust to several experimental manipulations and captures the variability of individual participants. We demonstrate the validity of the approach on human segmentation of natural images and composite textures. We show that image uncertainty affects measured human variability, and it influences how participants weigh different visual features. Because any putative segmentation algorithm can be inserted to perform the reconstruction, our paradigm affords quantitative tests of theories of perception as well as new benchmarks for segmentation algorithms.

Title: Spatio-Temporal Context Modeling for Road Obstacle Detection. (arXiv:2301.07921v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2301.07921
Code URL: null
Copy Paste: [[2301.07921] Spatio-Temporal Context Modeling for Road Obstacle Detection](http://arxiv.org/abs/2301.07921) #robust
Summary:
Road obstacle detection is an important problem for vehicle driving safety. In this paper, we aim to obtain robust road obstacle detection based on spatio-temporal context modeling. Firstly, a data-driven spatial context model of the driving scene is constructed with the layouts of the training data. Then, obstacles in the input image are detected via the state-of-the-art object detection algorithms, and the results are combined with the generated scene layout. In addition, to further improve the performance and robustness, temporal information in the image sequence is taken into consideration, and the optical flow is obtained in the vicinity of the detected objects to track the obstacles across neighboring frames. Qualitative and quantitative experiments were conducted on the Small Obstacle Detection (SOD) dataset and the Lost and Found dataset. The results indicate that our method with spatio-temporal context modeling is superior to existing methods for road obstacle detection.

Title: RNAS-CL: Robust Neural Architecture Search by Cross-Layer Knowledge Distillation. (arXiv:2301.08092v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2301.08092
Code URL: null
Copy Paste: [[2301.08092] RNAS-CL: Robust Neural Architecture Search by Cross-Layer Knowledge Distillation](http://arxiv.org/abs/2301.08092) #robust
Summary:
Deep Neural Networks are vulnerable to adversarial attacks. Neural Architecture Search (NAS), one of the driving tools of deep neural networks, demonstrates superior performance in prediction accuracy in various machine learning applications. However, it is unclear how it performs against adversarial attacks. Given the presence of a robust teacher, it would be interesting to investigate if NAS would produce a robust neural architecture by inheriting robustness from the teacher. In this paper, we propose Robust Neural Architecture Search by Cross-Layer Knowledge Distillation (RNAS-CL), a novel NAS algorithm that improves the robustness of NAS by learning from a robust teacher through cross-layer knowledge distillation. Unlike previous knowledge distillation methods that encourage close student/teacher output only in the last layer, RNAS-CL automatically searches for the best teacher layer to supervise each student layer. Experimental results evidence the effectiveness of RNAS-CL and show that RNAS-CL produces small and robust neural architectures.

biometric

steal

extraction

Title: Spatio-temporal neural structural causal models for bike flow prediction. (arXiv:2301.07843v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2301.07843
Code URL: null
Copy Paste: [[2301.07843] Spatio-temporal neural structural causal models for bike flow prediction](http://arxiv.org/abs/2301.07843) #extraction
Summary:
As a representative of public transportation, the fundamental issue of managing bike-sharing systems is bike flow prediction. Recent methods overemphasize the spatio-temporal correlations in the data, ignoring the effects of contextual conditions on the transportation system and the inter-regional time-varying causality. In addition, due to the disturbance of incomplete observations in the data, random contextual conditions lead to spurious correlations between data and features, making the prediction of the model ineffective in special scenarios. To overcome this issue, we propose a Spatio-temporal Neural Structure Causal Model (STNSCM) from the perspective of causality. First, we build a causal graph to describe the traffic prediction, and further analyze the causal relationship between the input data, contextual conditions, spatio-temporal states, and prediction results. Second, we propose to apply the front-door criterion to eliminate confounding biases in the feature extraction process. Finally, we propose a counterfactual representation reasoning module to extrapolate the spatio-temporal state under the factual scenario to future counterfactual scenarios to improve the prediction performance. Experiments on real-world datasets demonstrate the superior performance of our model, especially its resistance to fluctuations caused by the external environment. The source code and data will be released.

membership infer

federate

Title: Federated Automatic Differentiation. (arXiv:2301.07806v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2301.07806
Code URL: null
Copy Paste: [[2301.07806] Federated Automatic Differentiation](http://arxiv.org/abs/2301.07806) #federate
Summary:
Federated learning (FL) is a general framework for learning across heterogeneous clients while preserving data privacy, under the orchestration of a central server. FL methods often compute gradients of loss functions purely locally (i.e., entirely at each client, or entirely at the server), typically using automatic differentiation (AD) techniques. We propose a federated automatic differentiation (FAD) framework that 1) enables computing derivatives of functions involving client and server computation as well as communication between them and 2) operates in a manner compatible with existing federated technology. In other words, FAD computes derivatives across communication boundaries. We show, in analogy with traditional AD, that FAD may be implemented using various accumulation modes, which introduce distinct computation-communication trade-offs and systems requirements. Further, we show that a broad class of federated computations is closed under these various modes of FAD, implying in particular that if the original computation can be implemented using privacy-preserving primitives, its derivative may be computed using only these same primitives. We then show how FAD can be used to create algorithms that dynamically learn components of the algorithm itself. In particular, we show that FedAvg-style algorithms can exhibit significantly improved performance by using FAD to adjust the server optimization step automatically, or by using FAD to learn weighting schemes for computing weighted averages across clients.

fair

Title: Unposed: Unsupervised Pose Estimation based Product Image Recommendations. (arXiv:2301.07879v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2301.07879
Code URL: null
Copy Paste: [[2301.07879] Unposed: Unsupervised Pose Estimation based Product Image Recommendations](http://arxiv.org/abs/2301.07879) #fair
Summary:
Product images are the most impactful medium of customer interaction on the product detail pages of e-commerce websites. Millions of products are onboarded onto webstore catalogues daily, and maintaining a high quality bar for a product's set of images is a problem at scale. When products are grouped by category, clothing is a very high-volume and high-velocity category and thus deserves its own attention. Given the scale, it is challenging to monitor the completeness of the image set, which should adequately detail the product for consumers; incomplete image sets often lead to a poor customer experience and thus customer drop-off.

To supervise the quality and completeness of the images in the product pages for these product types and suggest improvements, we propose a Human Pose Detection based unsupervised method to scan the image set of a product for the missing ones. The unsupervised approach suggests a fair approach to sellers based on product and category irrespective of any biases. We first create a reference image set of popular products with complete image sets. Then we create clusters of images to label the most desirable poses to form the classes for the reference set from this set of ideal products. Further, for all test products we scan the images for all desired pose classes w.r.t. the reference set poses, determine the missing ones, and sort them in order of potential impact. These missing poses can further be used by sellers to enrich their product listing images. We gathered data from a popular online webstore and surveyed ~200 products manually, a large fraction of which had at least 1 repeated image or missing variant, and sampled 3K products (~20K images), of which a significant proportion had scope for adding many image variants as compared to highly rated products, which had more than double the image variants, indicating that our model can potentially be used on a large scale.

Title: RGB-D-Based Categorical Object Pose and Shape Estimation: Methods, Datasets, and Evaluation. (arXiv:2301.08147v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2301.08147
Code URL: https://github.com/roym899/pose_and_shape_evaluation
Copy Paste: [[2301.08147] RGB-D-Based Categorical Object Pose and Shape Estimation: Methods, Datasets, and Evaluation](http://arxiv.org/abs/2301.08147) #fair
Summary:
Recently, various methods for 6D pose and shape estimation of objects at a per-category level have been proposed. This work provides an overview of the field in terms of methods, datasets, and evaluation protocols. First, an overview of existing works and their commonalities and differences is provided. Second, we take a critical look at the predominant evaluation protocol, including metrics and datasets. Based on the findings, we propose a new set of metrics, contribute new annotations for the Redwood dataset, and evaluate state-of-the-art methods in a fair comparison. The results indicate that existing methods do not generalize well to unconstrained orientations and are actually heavily biased towards objects being upright. We provide an easy-to-use evaluation toolbox with well-defined metrics, methods, and dataset interfaces, which allows evaluation and comparison with various state-of-the-art approaches (https://github.com/roym899/pose_and_shape_evaluation).

interpretability

Title: Emergence of the SVD as an interpretable factorization in deep learning for inverse problems. (arXiv:2301.07820v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2301.07820
Code URL: https://github.com/shashanksule/descrambling-nn
Copy Paste: [[2301.07820] Emergence of the SVD as an interpretable factorization in deep learning for inverse problems](http://arxiv.org/abs/2301.07820) #interpretability
Summary:
We demonstrate the emergence of weight matrix singular value decomposition (SVD) in interpreting neural networks (NNs) for parameter estimation from noisy signals. The SVD appears naturally as a consequence of initial application of a descrambling transform - a recently-developed technique for addressing interpretability in NNs \cite{amey2021neural}. We find that within the class of noisy parameter estimation problems, the SVD may be the means by which networks memorize the signal model. We substantiate our theoretical findings with empirical evidence from both linear and non-linear settings. Our results also illuminate the connections between a mathematical theory of semantic development \cite{saxe2019mathematical} and neural network interpretability.

explainability

Title: CEnt: An Entropy-based Model-agnostic Explainability Framework to Contrast Classifiers' Decisions. (arXiv:2301.07941v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2301.07941
Code URL: null
Copy Paste: [[2301.07941] CEnt: An Entropy-based Model-agnostic Explainability Framework to Contrast Classifiers' Decisions](http://arxiv.org/abs/2301.07941) #explainability
Summary:
Current interpretability methods focus on explaining a particular model's decision through present input features. Such methods do not inform the user of the sufficient conditions that alter these decisions when they are not desirable. Contrastive explanations circumvent this problem by providing explanations of the form "If the feature $X>x$, the output $Y$ would be different". While different approaches have been developed to find contrasts, these methods do not all deal with mutability and attainability constraints.

In this work, we present a novel approach to locally contrast the prediction of any classifier. Our Contrastive Entropy-based explanation method, CEnt, approximates a model locally by a decision tree to compute entropy information of different feature splits. A graph, G, is then built where contrast nodes are found through a one-to-many shortest path search. Contrastive examples are generated from the shortest path to reflect feature splits that alter model decisions while maintaining lower entropy. We perform local sampling on manifold-like distances computed by variational auto-encoders to reflect data density. CEnt is the first non-gradient-based contrastive method generating diverse counterfactuals that do not necessarily exist in the training data while satisfying immutability (e.g., race) and semi-immutability (e.g., age can only change in an increasing direction). Empirical evaluation on four real-world numerical datasets demonstrates the ability of CEnt to generate counterfactuals that achieve better proximity rates than existing methods without compromising latency, feasibility, and attainability. We further extend CEnt to imagery data to derive visually appealing and useful contrasts between class labels on MNIST and Fashion MNIST datasets. Finally, we show how CEnt can serve as a tool to detect vulnerabilities of textual classifiers.

Title: Identification, explanation and clinical evaluation of hospital patient subtypes. (arXiv:2301.08019v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2301.08019
Code URL: null
Copy Paste: [[2301.08019] Identification, explanation and clinical evaluation of hospital patient subtypes](http://arxiv.org/abs/2301.08019) #explainability
Summary:
We present a pipeline in which unsupervised machine learning techniques are used to automatically identify subtypes of hospital patients admitted between 2017 and 2021 in a large UK teaching hospital. With the use of state-of-the-art explainability techniques, the identified subtypes are interpreted and assigned clinical meaning. In parallel, clinicians assessed intra-cluster similarities and inter-cluster differences of the identified patient subtypes within the context of their clinical knowledge. By confronting the outputs of both automatic and clinician-based explanations, we aim to highlight the mutual benefit of combining machine learning techniques with clinical expertise.

watermark

diffusion

Title: Fast Inference in Denoising Diffusion Models via MMD Finetuning. (arXiv:2301.07969v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2301.07969
Code URL: https://github.com/diegovalsesia/mmd-ddm
Copy Paste: [[2301.07969] Fast Inference in Denoising Diffusion Models via MMD Finetuning](http://arxiv.org/abs/2301.07969) #diffusion
Summary:
Denoising Diffusion Models (DDMs) have become a popular tool for generating high-quality samples from complex data distributions. These models are able to capture sophisticated patterns and structures in the data, and can generate samples that are highly diverse and representative of the underlying distribution. However, one of the main limitations of diffusion models is the complexity of sample generation, since a large number of inference timesteps is required to faithfully capture the data distribution. In this paper, we present MMD-DDM, a novel method for fast sampling of diffusion models. Our approach is based on the idea of using the Maximum Mean Discrepancy (MMD) to finetune the learned distribution with a given budget of timesteps. This allows the finetuned model to significantly improve the speed-quality trade-off, by substantially increasing fidelity in inference regimes with few steps or, equivalently, by reducing the required number of steps to reach a target fidelity, thus paving the way for a more practical adoption of diffusion models in a wide range of applications. We evaluate our approach on unconditional image generation with extensive experiments across the CIFAR-10, CelebA, ImageNet and LSUN-Church datasets. Our findings show that the proposed method is able to produce high-quality samples in a fraction of the time required by widely-used diffusion models, and outperforms state-of-the-art techniques for accelerated sampling. Code is available at: https://github.com/diegovalsesia/MMD-DDM.
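
For readers unfamiliar with the Maximum Mean Discrepancy mentioned above, the following Python sketch computes the standard biased estimator of squared MMD between two sample sets with a Gaussian kernel. It illustrates the discrepancy measure only, not the MMD-DDM finetuning procedure; the kernel choice, the `bandwidth` parameter, and the function names are assumptions that the summary does not specify.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y)."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of MMD^2: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


data = [(0.0,), (1.0,)]
close = [(0.1,), (0.9,)]
far = [(5.0,), (6.0,)]
print(mmd_squared(data, close), mmd_squared(data, far))
```

The estimate is zero when the two sample sets coincide and grows as they move apart, which is what makes it usable as a finetuning objective between generated and real samples.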

Title: Dif-Fusion: Towards High Color Fidelity in Infrared and Visible Image Fusion with Diffusion Models. (arXiv:2301.08072v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2301.08072
Code URL: null
Copy Paste: [[2301.08072] Dif-Fusion: Towards High Color Fidelity in Infrared and Visible Image Fusion with Diffusion Models](http://arxiv.org/abs/2301.08072) #diffusion
Summary:
Color plays an important role in human visual perception, reflecting the spectrum of objects. However, the existing infrared and visible image fusion methods rarely explore how to handle multi-spectral/channel data directly and achieve high color fidelity. This paper addresses the above issue by proposing a novel method with diffusion models, termed Dif-Fusion, to generate the distribution of the multi-channel input data, which increases the ability of multi-source information aggregation and the fidelity of colors. Specifically, instead of converting multi-channel images into single-channel data as in existing fusion methods, we create the multi-channel data distribution with a denoising network in a latent space with forward and reverse diffusion processes. Then, we use the denoising network to extract the multi-channel diffusion features with both visible and infrared information. Finally, we feed the multi-channel diffusion features to the multi-channel fusion module to directly generate the three-channel fused image. To retain the texture and intensity information, we propose multi-channel gradient loss and intensity loss. Along with the current evaluation metrics for measuring texture and intensity fidelity, we introduce a new evaluation metric to quantify color fidelity. Extensive experiments indicate that our method is more effective than other state-of-the-art image fusion methods, especially in color fidelity.

Title: Understanding the diffusion models by conditional expectations. (arXiv:2301.07882v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2301.07882
Code URL: null
Copy Paste: [[2301.07882] Understanding the diffusion models by conditional expectations](http://arxiv.org/abs/2301.07882) #diffusion
Summary:
This paper provides several mathematical analyses of the diffusion model in machine learning. The drift term of the backwards sampling process is represented as a conditional expectation involving the data distribution and the forward diffusion. The training process aims to find such a drift function by minimizing the mean-squared residue related to the conditional expectation. Using small-time approximations of the Green's function of the forward diffusion, we show that the analytical mean drift function in DDPM and the score function in SGM asymptotically blow up in the final stages of the sampling process for singular data distributions such as those concentrated on lower-dimensional manifolds, and are therefore difficult to approximate by a network. To overcome this difficulty, we derive a new target function and associated loss, which remains bounded even for singular data distributions. We illustrate the theoretical findings with several numerical examples.
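
A standard identity of this kind, written here under the common assumption of a Gaussian forward process $x_t = \alpha_t x_0 + \sigma_t \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$ independent of $x_0$, expresses the score as a conditional expectation. The symbols $\alpha_t$ and $\sigma_t$ are this sketch's notation and may differ from the paper's:

$$
\nabla_x \log p_t(x)
= \mathbb{E}\left[\nabla_x \log p_{t\mid 0}(x \mid x_0) \,\middle|\, x_t = x\right]
= -\frac{x - \alpha_t\,\mathbb{E}[x_0 \mid x_t = x]}{\sigma_t^2}
= -\frac{\mathbb{E}[\varepsilon \mid x_t = x]}{\sigma_t}.
$$

The last equality follows from $\varepsilon = (x_t - \alpha_t x_0)/\sigma_t$. The prefactor $1/\sigma_t$ grows without bound as $\sigma_t \to 0$, which is consistent with the blow-up the summary describes near the end of sampling.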