secure
Title: A survey of Digital Manufacturing Hardware and Software Trojans. (arXiv:2301.10336v1 [cs.CR])
- Paper URL: http://arxiv.org/abs/2301.10336
- Code URL: null
- Copy Paste:
[[2301.10336] A survey of Digital Manufacturing Hardware and Software Trojans](http://arxiv.org/abs/2301.10336) #secure
- Summary:
Digital Manufacturing (DM) refers to the on-going adoption of smarter, more agile manufacturing processes and cyber-physical systems. This includes modern techniques and technologies such as Additive Manufacturing (AM)/3D printing, as well as the Industrial Internet of Things (IIoT) and the broader trend toward Industry 4.0. However, this adoption is not without risks: with a growing complexity and connectivity, so too grows the cyber-physical attack surface. Here, malicious actors might seek to steal sensitive information or sabotage products or production lines, causing financial and reputational loss. Of particular concern are where such malicious attacks may enter the complex supply chains of DM systems as Trojans -- malicious modifications that may trigger their payloads at later times or stages of the product lifecycle.
In this work, we thus present a comprehensive overview of the threats posed by Trojans in Digital Manufacturing. We cover both hardware and software Trojans which may exist in products or their production and supply lines. From this, we produce a novel taxonomy for classifying and analyzing these threats, and elaborate on how different side channels (e.g. visual, thermal, acoustic, power, and magnetic) may be used to either enhance the impact of a given Trojan or utilized as part of a defensive strategy. Other defenses are also presented -- including hardware, web-, and software-related. To conclude, we discuss seven different case studies and elaborate how they fit into our taxonomy. Overall, this paper presents a detailed survey of the Trojan landscape for Digital Manufacturing: threats, defenses, and the importance of implementing secure practices.
security
Title: Is This Abstract Generated by AI? A Research for the Gap between AI-generated Scientific Text and Human-written Scientific Text. (arXiv:2301.10416v1 [cs.CL])
- Paper URL: http://arxiv.org/abs/2301.10416
- Code URL: null
- Copy Paste:
[[2301.10416] Is This Abstract Generated by AI? A Research for the Gap between AI-generated Scientific Text and Human-written Scientific Text](http://arxiv.org/abs/2301.10416) #security
- Summary:
BACKGROUND: Recent neural language models have taken a significant step forward in producing remarkably controllable, fluent, and grammatical text. Although some recent works have found that AI-generated text is not distinguishable from human-authored writing for crowd-sourcing workers, there still exist errors in AI-generated text which are even subtler and harder to spot. METHOD: In this paper, we investigate the gap between scientific content generated by AI and written by humans. Specifically, we first adopt several publicly available tools or models to investigate their performance in detecting GPT-generated scientific text. Then we utilize features from writing style to analyze the similarities and differences between the two types of content. Furthermore, more complex and deep perspectives, such as consistency, coherence, language redundancy, and factual errors, are also taken into consideration for in-depth analysis. RESULT: The results suggest that while AI has the potential to generate scientific content that is as accurate as human-written content, there is still a gap in terms of depth and overall quality. AI-generated scientific content is more likely to contain errors in language redundancy and factual issues. CONCLUSION: We find that there exists a "writing style" gap between AI-generated scientific text and human-written scientific text. Moreover, based on the analysis result, we summarize a series of model-agnostic or distribution-agnostic features, which could be applied to unknown or novel domain distributions and different generation methods. Future research should focus on not only improving the capabilities of AI models to produce high-quality content but also examining and addressing ethical and security concerns related to the generation and the use of AI-generated content.
Title: Breaking Bad: Quantifying the Addiction of Web Elements to JavaScript. (arXiv:2301.10597v1 [cs.CR])
- Paper URL: http://arxiv.org/abs/2301.10597
- Code URL: null
- Copy Paste:
[[2301.10597] Breaking Bad: Quantifying the Addiction of Web Elements to JavaScript](http://arxiv.org/abs/2301.10597) #security
- Summary:
While JavaScript established itself as a cornerstone of the modern web, it also constitutes a major tracking and security vector, thus raising critical privacy and security concerns. In this context, some browser extensions propose to systematically block scripts reported by crowdsourced trackers lists. However, this solution heavily depends on the quality of these built-in lists, which may be deprecated or incomplete, thus exposing the visitor to unknown trackers. In this paper, we explore a different strategy, by investigating the benefits of disabling JavaScript in the browser. More specifically, by adopting such a strict policy, we aim to quantify the JavaScript addiction of web elements composing a web page, through the observation of web breakages. As there is no standard mechanism for detecting such breakages, we introduce a framework to inspect several page features when blocking JavaScript, that we deploy to analyze 6,384 pages, including landing and internal web pages. We discover that 43% of web pages are not strictly dependent on JavaScript and that more than 67% of pages are likely to be usable as long as the visitor only requires the content from the main section of the page, for which the user most likely reached the page, while reducing the number of tracking requests by 85% on average. Finally, we discuss the viability of currently browsing the web without JavaScript and detail multiple incentives for websites to be kept usable without JavaScript.
privacy
Title: Huff-DP: Huffman Coding based Differential Privacy Mechanism for Real-Time Data. (arXiv:2301.10395v1 [cs.CR])
- Paper URL: http://arxiv.org/abs/2301.10395
- Code URL: null
- Copy Paste:
[[2301.10395] Huff-DP: Huffman Coding based Differential Privacy Mechanism for Real-Time Data](http://arxiv.org/abs/2301.10395) #privacy
- Summary:
With the advancements in connected devices, a huge amount of real-time data is being generated. Efficient storage, transmission, and analysis of this real-time big data is important, as it serves a number of purposes ranging from decision making to fault prediction. Alongside this, real-time big data has rigorous utility and privacy requirements; therefore, it is also important to choose the handling strategies meticulously. One of the optimal ways to store and transmit data with lossless compression is Huffman coding, which compresses the data into a variable-length binary stream. Similarly, in order to protect the privacy of such big data, differential privacy is now widely used, which perturbs the data on the basis of privacy budget and sensitivity. Traditional differential privacy mechanisms do provide privacy guarantees. However, real-time data cannot be treated as an ordinary set of records, because it usually has certain underlying patterns and cycles, which can be used to form a link to a specific individual's private information and can lead to severe privacy leakages (e.g., analysing smart metering data can lead to classification of an individual's daily routine). Thus, it is equally important to develop a privacy preservation model, which preserves the privacy on the basis of occurrences and patterns in the data. In this paper, we design a novel Huff-DP mechanism, which selects the optimal privacy budget on the basis of privacy requirement for that specific record. In order to further enhance the budget determination, we propose static, sine, and fuzzy logic based decision algorithms. From the experimental evaluations, it can be concluded that our proposed Huff-DP mechanism provides effective privacy protection alongside reducing the privacy budget computational cost.
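The Huffman coding step mentioned in this summary can be illustrated with a minimal sketch. This is only the classical lossless prefix code, not the Huff-DP mechanism itself; the function names `huffman_codes`, `encode`, and `decode` are hypothetical and chosen for illustration.

```python
import heapq
from collections import Counter


def huffman_codes(symbols):
    """Build a {symbol: bitstring} prefix code from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 / 1.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, codes):
    """Concatenate the code of every symbol into one bit string."""
    return "".join(codes[s] for s in symbols)


def decode(bits, codes):
    """Invert a prefix code by greedy matching of code words."""
    inverse = {c: s for s, c in codes.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return out
```

Frequent symbols receive shorter code words, which is why the encoded stream has variable-length entries.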
protect
Title: SCANTRAP: Protecting Content Management Systems from Vulnerability Scanners with Cyber Deception and Obfuscation. (arXiv:2301.10502v1 [cs.CR])
- Paper URL: http://arxiv.org/abs/2301.10502
- Code URL: null
- Copy Paste:
[[2301.10502] SCANTRAP: Protecting Content Management Systems from Vulnerability Scanners with Cyber Deception and Obfuscation](http://arxiv.org/abs/2301.10502) #protect
- Summary:
Every attack begins with gathering information about the target. The entry points for network breaches are often vulnerabilities in internet-facing websites, which often rely on an off-the-shelf Content Management System (CMS). Bot networks and human attackers alike rely on automated scanners to gather information about the CMS software installed and potential vulnerabilities. To increase the security of websites using a CMS, it is desirable to make the use of CMS scanners less reliable. The aim of this work is to extend the current knowledge about cyber deception with regard to CMS. To demonstrate this, a WordPress plugin called 'SCANTRAP' was created, which uses simulation and dissimulation with regard to plugins, themes, versions, and users. We found that the resulting plugin is capable of obfuscating real information and, to a certain extent, injecting false information into the output of one of the most popular WordPress scanners, WPScan, without limiting the legitimate functionality of the WordPress installation.
defense
Title: BDMMT: Backdoor Sample Detection for Language Models through Model Mutation Testing. (arXiv:2301.10412v1 [cs.CL])
- Paper URL: http://arxiv.org/abs/2301.10412
- Code URL: null
- Copy Paste:
[[2301.10412] BDMMT: Backdoor Sample Detection for Language Models through Model Mutation Testing](http://arxiv.org/abs/2301.10412) #defense
- Summary:
Deep neural networks (DNNs) and natural language processing (NLP) systems have developed rapidly and have been widely used in various real-world fields. However, they have been shown to be vulnerable to backdoor attacks. Specifically, the adversary injects a backdoor into the model during the training phase, so that input samples with backdoor triggers are classified as the target class. Some attacks have achieved high attack success rates on the pre-trained language models (LMs), but there have yet to be effective defense methods. In this work, we propose a defense method based on deep model mutation testing. Our main justification is that backdoor samples are much more robust than clean samples if we impose random mutations on the LMs and that backdoors are generalizable. We first confirm the effectiveness of model mutation testing in detecting backdoor samples and select the most appropriate mutation operators. We then systematically defend against three extensively studied backdoor attack levels (i.e., char-level, word-level, and sentence-level) by detecting backdoor samples. We also make the first attempt to defend against the latest style-level backdoor attacks. We evaluate our approach on three benchmark datasets (i.e., IMDB, Yelp, and AG news) and three style transfer datasets (i.e., SST-2, Hate-speech, and AG news). The extensive experimental results demonstrate that our approach can detect backdoor samples more efficiently and accurately than the three state-of-the-art defense approaches.
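The idea of measuring how robust a sample's prediction is under random model mutations can be sketched on a toy model. The following is an illustrative assumption, not the BDMMT implementation: it uses a hypothetical two-class linear classifier and Gaussian weight noise as the only mutation operator, and reports the fraction of mutants whose prediction differs from the original model (a label change rate).

```python
import random


def predict(weights, x):
    """Toy linear classifier: class 1 if the score is non-negative."""
    score = sum(w * xi for w, xi in zip(weights, x))
    return 1 if score >= 0 else 0


def label_change_rate(weights, x, n_mutants=200, sigma=0.5, seed=0):
    """Fraction of randomly mutated models that flip the prediction for x."""
    rng = random.Random(seed)
    base = predict(weights, x)
    changed = 0
    for _ in range(n_mutants):
        # Mutation operator (assumed): add Gaussian noise to every weight.
        mutant = [w + rng.gauss(0.0, sigma) for w in weights]
        if predict(mutant, x) != base:
            changed += 1
    return changed / n_mutants
```

In this toy setting, an input far from the decision boundary keeps its label under almost every mutant, while an input near the boundary flips often. A detector in this spirit would flag inputs whose label change rate is unusually low.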
Title: Evaluating Deception and Moving Target Defense with Network Attack Simulation. (arXiv:2301.10629v1 [cs.CR])
- Paper URL: http://arxiv.org/abs/2301.10629
- Code URL: null
- Copy Paste:
[[2301.10629] Evaluating Deception and Moving Target Defense with Network Attack Simulation](http://arxiv.org/abs/2301.10629) #defense
- Summary:
In the field of network security, with the ongoing arms race between attackers seeking new vulnerabilities to bypass defense mechanisms and defenders reinforcing their prevention, detection, and response strategies, the novel concept of cyber deception has emerged. Starting from the well-known example of honeypots, many other deception strategies have been developed, such as honeytokens and moving target defense, all sharing the objective of creating uncertainty for attackers and increasing the chance of the attacker making mistakes. In this paper, a methodology to evaluate the effectiveness of honeypots and moving target defense in a network is presented. This methodology makes it possible to quantitatively measure the effectiveness in a simulation environment and to make recommendations on how many honeypots to deploy and on how quickly network addresses have to be mutated to effectively disrupt an attack in multiple network and attacker configurations. With this optimum, attacks can be detected and slowed down with a minimal resource and configuration overhead. With the provided methodology, the optimal number of honeypots to be deployed and the optimal network address mutation interval can be determined. Furthermore, this work provides guidance on how to optimally deploy and configure them with respect to the attacker model and several network parameters.
attack
Title: A Data-Centric Approach for Improving Adversarial Training Through the Lens of Out-of-Distribution Detection. (arXiv:2301.10454v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2301.10454
- Code URL: null
- Copy Paste:
[[2301.10454] A Data-Centric Approach for Improving Adversarial Training Through the Lens of Out-of-Distribution Detection](http://arxiv.org/abs/2301.10454) #attack
- Summary:
Current machine learning models achieve super-human performance in many real-world applications. Still, they are susceptible to imperceptible adversarial perturbations. The most effective solution for this problem is adversarial training, which trains the model with adversarially perturbed samples instead of the original ones. Various methods have been developed over recent years to improve adversarial training, such as data augmentation or modifying training attacks. In this work, we examine the same problem from a new data-centric perspective. For this purpose, we first demonstrate that the existing model-based methods can be equivalent to applying smaller perturbation or optimization weights to the hard training examples. By using this finding, we propose detecting and removing these hard samples directly from the training procedure rather than applying complicated algorithms to mitigate their effects. For detection, we use maximum softmax probability as an effective method in out-of-distribution detection, since we can consider the hard samples as the out-of-distribution samples for the whole data distribution. Our results on SVHN and CIFAR-10 datasets show the effectiveness of this method in improving the adversarial training without adding too much computational cost.
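The maximum softmax probability (MSP) score used here for detection can be sketched as follows. This is a minimal illustration, assuming per-sample logits are already available; the helper name `drop_hard_samples` and the threshold value are hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_softmax_probability(logits):
    """MSP score: the largest class probability for one sample."""
    return max(softmax(logits))


def drop_hard_samples(samples, logits_per_sample, threshold=0.5):
    """Keep only samples whose MSP reaches the (assumed) threshold."""
    return [
        s
        for s, logits in zip(samples, logits_per_sample)
        if max_softmax_probability(logits) >= threshold
    ]
```

A low MSP means the model spreads its probability across classes, which is the signal this sketch treats as marking a hard sample.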
robust
Title: Learning Trustworthy Model from Noisy Labels based on Rough Set for Surface Defect Detection. (arXiv:2301.10441v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2301.10441
- Code URL: null
- Copy Paste:
[[2301.10441] Learning Trustworthy Model from Noisy Labels based on Rough Set for Surface Defect Detection](http://arxiv.org/abs/2301.10441) #robust
- Summary:
In surface defect detection, there are some suspicious regions that cannot be uniquely classified as abnormal or normal. The annotation of suspicious regions is easily affected by factors such as workers' emotional fluctuations and judgment standards, resulting in noisy labels, which in turn lead to missing and false detections, and ultimately to inconsistent judgments of product quality. Unlike the usual noisy labels, the ones used for surface defect detection appear to be inconsistent rather than mislabeled. The noise occurs in almost every label and is difficult to correct or evaluate. In this paper, we propose a framework that learns trustworthy models from noisy labels for surface defect detection. First, to avoid the negative impact of noisy labels on the model, we represent the suspicious regions with consistent and precise elements at the pixel level and redesign the loss function. Second, without changing the network structure or adding any extra labels, a pluggable spatially correlated Bayesian module is proposed. Finally, the defect discrimination confidence is proposed to measure the uncertainty, with which anomalies can be identified as defects. Our results indicate not only the effectiveness of the proposed method in learning from noisy labels, but also robustness and real-time performance.
Title: Connecting metrics for shape-texture knowledge in computer vision. (arXiv:2301.10608v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2301.10608
- Code URL: null
- Copy Paste:
[[2301.10608] Connecting metrics for shape-texture knowledge in computer vision](http://arxiv.org/abs/2301.10608) #robust
- Summary:
Modern artificial neural networks, including convolutional neural networks and vision transformers, have mastered several computer vision tasks, including object recognition. However, there are many significant differences between the behavior and robustness of these systems and of the human visual system. Deep neural networks remain brittle and susceptible to many changes in the image that do not cause humans to misclassify images. Part of this different behavior may be explained by the type of features humans and deep neural networks use in vision tasks. Humans tend to classify objects according to their shape while deep neural networks seem to rely mostly on texture. Exploring this question is relevant, since it may lead to better performing neural network architectures and to a better understanding of the workings of the vision system of primates. In this work, we advance the state of the art in our understanding of this phenomenon, by extending previous analyses to a much larger set of deep neural network architectures. We found that the performance of models in image classification tasks is highly correlated with their shape bias measured at the output and penultimate layer. Furthermore, our results showed that the number of neurons that represent shape and texture are strongly anti-correlated, thus providing evidence that there is competition between these two types of features. Finally, we observed that while in general there is a correlation between performance and shape bias, there are significant variations between architecture families.
Title: Out of Distribution Performance of State of Art Vision Model. (arXiv:2301.10750v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2301.10750
- Code URL: null
- Copy Paste:
[[2301.10750] Out of Distribution Performance of State of Art Vision Model](http://arxiv.org/abs/2301.10750) #robust
- Summary:
The vision transformer (ViT) has advanced to the cutting edge in the visual recognition task. According to recent research, transformers are more robust than CNNs, a property attributed to ViT's self-attention mechanism. However, we find that these conclusions are based on unfair experimental conditions and on comparisons of only a few models, which do not capture the full picture of robustness performance. In this study, we investigate the performance of 58 state-of-the-art computer vision models in a unified training setup, covering not only attention-based and convolution-based models but also neural networks based on a combination of convolution and attention mechanisms, sequence-based models, complementary search, and network-based methods. Our research demonstrates that robustness depends on the training setup and model types, and performance varies based on out-of-distribution type. Our research will aid the community in better understanding and benchmarking the robustness of computer vision models.
Title: On the Adversarial Robustness of Camera-based 3D Object Detection. (arXiv:2301.10766v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2301.10766
- Code URL: null
- Copy Paste:
[[2301.10766] On the Adversarial Robustness of Camera-based 3D Object Detection](http://arxiv.org/abs/2301.10766) #robust
- Summary:
In recent years, camera-based 3D object detection has gained widespread attention for its ability to achieve high performance with low computational cost. However, the robustness of these methods to adversarial attacks has not been thoroughly examined. In this study, we conduct the first comprehensive investigation of the robustness of leading camera-based 3D object detection methods under various adversarial conditions. Our experiments reveal five interesting findings: (a) the use of accurate depth estimation effectively improves robustness; (b) depth-estimation-free approaches do not show superior robustness; (c) bird's-eye-view-based representations exhibit greater robustness against localization attacks; (d) incorporating multi-frame benign inputs can effectively mitigate adversarial attacks; and (e) addressing long-tail problems can enhance robustness. We hope our work can provide guidance for the design of future camera-based object detection modules with improved adversarial robustness.
Title: Towards Robust Metrics for Concept Representation Evaluation. (arXiv:2301.10367v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2301.10367
- Code URL: null
- Copy Paste:
[[2301.10367] Towards Robust Metrics for Concept Representation Evaluation](http://arxiv.org/abs/2301.10367) #robust
- Summary:
Recent work on interpretability has focused on concept-based explanations, where deep learning models are explained in terms of high-level units of information, referred to as concepts. Concept learning models, however, have been shown to be prone to encoding impurities in their representations, failing to fully capture meaningful features of their inputs. While concept learning lacks metrics to measure such phenomena, the field of disentanglement learning has explored the related notion of underlying factors of variation in the data, with plenty of metrics to measure the purity of such factors. In this paper, we show that such metrics are not appropriate for concept learning and propose novel metrics for evaluating the purity of concept representations in both approaches. We show the advantage of these metrics over existing ones and demonstrate their utility in evaluating the robustness of concept representations and interventions performed on them. In addition, we show their utility for benchmarking state-of-the-art methods from both families and find that, contrary to common assumptions, supervision alone may not be sufficient for pure concept representations.
Title: Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning. (arXiv:2301.10500v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2301.10500
- Code URL: null
- Copy Paste:
[[2301.10500] Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning](http://arxiv.org/abs/2301.10500) #robust
- Summary:
We propose Banker-OMD, a novel framework generalizing the classical Online Mirror Descent (OMD) technique in the online learning literature. The Banker-OMD framework almost completely decouples feedback delay handling and the task-specific OMD algorithm design, thus allowing the easy design of new algorithms capable of easily and robustly handling feedback delays. Specifically, it offers a general methodology for achieving $\tilde{\mathcal O}(\sqrt{T} + \sqrt{D})$-style regret bounds in online bandit learning tasks with delayed feedback, where $T$ is the number of rounds and $D$ is the total feedback delay. We demonstrate the power of Banker-OMD by applications to two important bandit learning scenarios with delayed feedback, including delayed scale-free adversarial Multi-Armed Bandits (MAB) and delayed adversarial linear bandits. Banker-OMD leads to the first delayed scale-free adversarial MAB algorithm achieving $\tilde{\mathcal O}(\sqrt{K(D+T)}L)$ regret and the first delayed adversarial linear bandit algorithm achieving $\tilde{\mathcal O}(\text{poly}(n)(\sqrt{T} + \sqrt{D}))$ regret. As a corollary, the first application also implies $\tilde{\mathcal O}(\sqrt{KT}L)$ regret for non-delayed scale-free adversarial MABs, which is the first to match the $\Omega(\sqrt{KT}L)$ lower bound up to logarithmic factors and can be of independent interest.
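For readers unfamiliar with the classical technique being generalized, here is a minimal sketch of one Online Mirror Descent step on the probability simplex with the negative-entropy regularizer, which reduces to the well-known multiplicative-weights update. It covers only the non-delayed, full-information case and is not the Banker-OMD algorithm; the function name `omd_entropy_step` is hypothetical.

```python
import math


def omd_entropy_step(x, grad, eta):
    """One OMD step with negative entropy: x_i <- x_i * exp(-eta * g_i), renormalized."""
    unnormalized = [xi * math.exp(-eta * gi) for xi, gi in zip(x, grad)]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]
```

For example, starting from the uniform distribution `[0.5, 0.5]` with loss vector `[1.0, 0.0]` and step size `eta = 1.0`, the update shifts probability mass toward the second arm. A bandit variant would replace `grad` with a loss estimate built from the single observed arm.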
biometric
steal
extraction
Title: Few-Shot Learning Enables Population-Scale Analysis of Leaf Traits in Populus trichocarpa. (arXiv:2301.10351v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2301.10351
- Code URL: https://github.com/jlager/few-shot-leaf-segmentation
- Copy Paste:
[[2301.10351] Few-Shot Learning Enables Population-Scale Analysis of Leaf Traits in Populus trichocarpa](http://arxiv.org/abs/2301.10351) #extraction
- Summary:
Plant phenotyping is typically a time-consuming and expensive endeavor, requiring large groups of researchers to meticulously measure biologically relevant plant traits, and is the main bottleneck in understanding plant adaptation and the genetic architecture underlying complex traits at population scale. In this work, we address these challenges by leveraging few-shot learning with convolutional neural networks (CNNs) to segment the leaf body and visible venation of 2,906 P. trichocarpa leaf images obtained in the field. In contrast to previous methods, our approach (i) does not require experimental or image pre-processing, (ii) uses the raw RGB images at full resolution, and (iii) requires very few samples for training (e.g., just eight images for vein segmentation). Traits relating to leaf morphology and vein topology are extracted from the resulting segmentations using traditional open-source image-processing tools, validated using real-world physical measurements, and used to conduct a genome-wide association study to identify genes controlling the traits. In this way, the current work is designed to provide the plant phenotyping community with (i) methods for fast and accurate image-based feature extraction that require minimal training data, and (ii) a new population-scale data set, including 68 different leaf phenotypes, for domain scientists and machine learning researchers. All of the few-shot learning code, data, and results are made publicly available.
Title: Local Feature Extraction from Salient Regions by Feature Map Transformation. (arXiv:2301.10413v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2301.10413
- Code URL: null
- Copy Paste:
[[2301.10413] Local Feature Extraction from Salient Regions by Feature Map Transformation](http://arxiv.org/abs/2301.10413) #extraction
- Summary:
Local feature matching is essential for many applications, such as localization and 3D reconstruction. However, it is challenging to match feature points accurately in various camera viewpoints and illumination conditions. In this paper, we propose a framework that robustly extracts and describes salient local features regardless of changing light and viewpoints. The framework suppresses illumination variations and encourages structural information to ignore the noise from light and to focus on edges. We classify the elements in the feature covariance matrix, an implicit feature map information, into two components. Our model extracts feature points from salient regions leading to reduced incorrect matches. In our experiments, the proposed method achieved higher accuracy than the state-of-the-art methods in the public dataset, such as HPatches, Aachen Day-Night, and ETH, which especially show highly variant viewpoints and illumination.
membership infer
federate
Title: When to Trust Aggregated Gradients: Addressing Negative Client Sampling in Federated Learning. (arXiv:2301.10400v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2301.10400
- Code URL: null
- Copy Paste:
[[2301.10400] When to Trust Aggregated Gradients: Addressing Negative Client Sampling in Federated Learning](http://arxiv.org/abs/2301.10400) #federate
- Summary:
Federated Learning has become a widely-used framework which allows learning a global model on decentralized local datasets under the condition of protecting local data privacy. However, federated learning faces severe optimization difficulty when training samples are not independently and identically distributed (non-i.i.d.). In this paper, we point out that the client sampling practice plays a decisive role in the aforementioned optimization difficulty. We find that the negative client sampling will cause the merged data distribution of currently sampled clients heavily inconsistent with that of all available clients, and further make the aggregated gradient unreliable. To address this issue, we propose a novel learning rate adaptation mechanism to adaptively adjust the server learning rate for the aggregated gradient in each round, according to the consistency between the merged data distribution of currently sampled clients and that of all available clients. Specifically, we make theoretical deductions to find a meaningful and robust indicator that is positively related to the optimal server learning rate and can effectively reflect the merged data distribution of sampled clients, and we utilize it for the server learning rate adaptation. Extensive experiments on multiple image and text classification tasks validate the great effectiveness of our method.
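To make the role of the server learning rate concrete, the sketch below shows a generic server-side update in which the averaged client update is scaled by a server learning rate before being applied. This is an assumption-level illustration of where such a rate enters; it does not implement the paper's adaptation indicator, and `server_update` is a hypothetical name.

```python
def server_update(global_weights, client_deltas, server_lr):
    """Apply the averaged client update, scaled by the server learning rate.

    client_deltas holds one list per sampled client, each being
    (local weights after training) minus (global weights).
    """
    n_clients = len(client_deltas)
    averaged = [
        sum(delta[i] for delta in client_deltas) / n_clients
        for i in range(len(global_weights))
    ]
    return [w + server_lr * a for w, a in zip(global_weights, averaged)]
```

With `server_lr = 1.0` this reduces to plain averaging of the sampled clients. An adaptive scheme would lower the rate in rounds where the sampled clients are judged unrepresentative of the full population.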
Title: Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning. (arXiv:2301.10394v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2301.10394
- Code URL: null
- Copy Paste:
[[2301.10394] Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning](http://arxiv.org/abs/2301.10394) #federate
- Summary:
Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively in a data privacy-preserving manner. However, the data samples usually follow a long-tailed distribution in the real world, and FL on the decentralized and long-tailed data yields a poorly-behaved global model severely biased to the head classes with the majority of the training samples. To alleviate this issue, decoupled training has recently been introduced to FL, considering it has achieved promising results in centralized long-tailed learning by re-balancing the biased classifier after the instance-balanced training. However, the current study restricts the capacity of decoupled training in federated long-tailed learning with a sub-optimal classifier re-trained on a set of pseudo features, due to the unavailability of a global balanced dataset in FL. In this work, in order to re-balance the classifier more effectively, we integrate the local real data with the global gradient prototypes to form the local balanced datasets, and thus re-balance the classifier during the local training. Furthermore, we introduce an extra classifier in the training phase to help model the global data distribution, which addresses the problem of contradictory optimization goals caused by performing classifier re-balancing locally. Extensive experiments show that our method consistently outperforms the existing state-of-the-art methods in various settings.
fair
interpretability
explainability
watermark
diffusion
Title: Score Matching via Differentiable Physics. (arXiv:2301.10250v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2301.10250
- Code URL: null
- Copy Paste:
[[2301.10250] Score Matching via Differentiable Physics](http://arxiv.org/abs/2301.10250) #diffusion
- Summary:
Diffusion models based on stochastic differential equations (SDEs) gradually perturb a data distribution $p(\mathbf{x})$ over time by adding noise to it. A neural network is trained to approximate the score $\nabla_\mathbf{x} \log p_t(\mathbf{x})$ at time $t$, which can be used to reverse the corruption process. In this paper, we focus on learning the score field that is associated with the time evolution according to a physics operator in the presence of natural non-deterministic physical processes like diffusion. A decisive difference to previous methods is that the SDE underlying our approach transforms the state of a physical system to another state at a later time. For that purpose, we replace the drift of the underlying SDE formulation with a differentiable simulator or a neural network approximation of the physics. We propose different training strategies based on the so-called probability flow ODE to fit a training set of simulation trajectories and discuss their relation to the score matching objective. For inference, we sample plausible trajectories that evolve towards a given end state using the reverse-time SDE and demonstrate the competitiveness of our approach for different challenging inverse problems.
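The score $\nabla_\mathbf{x} \log p(\mathbf{x})$ that the network approximates has a closed form for simple densities. As a small worked example (a one-dimensional Gaussian chosen purely for illustration, not the physics setting of the paper), the score is $-(x-\mu)/\sigma^2$, which the sketch below checks against a finite-difference derivative of the log density.

```python
import math


def gaussian_log_density(x, mu, sigma):
    """Log density of a 1-D Gaussian N(mu, sigma^2)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def gaussian_score(x, mu, sigma):
    """Analytic score: d/dx log p(x) = -(x - mu) / sigma^2."""
    return -(x - mu) / sigma ** 2


def finite_difference_score(x, mu, sigma, h=1e-5):
    """Central-difference approximation of the score, for checking."""
    upper = gaussian_log_density(x + h, mu, sigma)
    lower = gaussian_log_density(x - h, mu, sigma)
    return (upper - lower) / (2.0 * h)
```

The score points toward the mean, so following it moves a sample toward regions of higher density. That property is what lets a learned score reverse a noising process.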