secure
Title: Which anonymization technique is best for which NLP task? -- It depends. A Systematic Study on Clinical Text Processing. (arXiv:2209.00262v1 [cs.CL])
- Paper URL: http://arxiv.org/abs/2209.00262
- Code URL: null
- Copy Paste:
[[2209.00262] Which anonymization technique is best for which NLP task? -- It depends](http://arxiv.org/abs/2209.00262)
- Summary:
Clinical text processing has gained more and more attention in recent years. The access to sensitive patient data, on the other hand, is still a big challenge, as text cannot be shared without legal hurdles and without removing personal information. There are many techniques to modify or remove patient related information, each with different strengths. This paper investigates the influence of different anonymization techniques on the performance of ML models using multiple datasets corresponding to five different NLP tasks. Several learnings and recommendations are presented. This work confirms that particularly stronger anonymization techniques lead to a significant drop of performance. In addition to that, most of the presented techniques are not secure against a re-identification attack based on similarity search.
Title: Efficient ML Models for Practical Secure Inference. (arXiv:2209.00411v1 [cs.CR])
- Paper URL: http://arxiv.org/abs/2209.00411
- Code URL: null
- Copy Paste:
[[2209.00411] Efficient ML Models for Practical Secure Inference](http://arxiv.org/abs/2209.00411)
- Summary:
ML-as-a-service continues to grow, and so does the need for very strong privacy guarantees. Secure inference has emerged as a potential solution, wherein cryptographic primitives allow inference without revealing users' inputs to a model provider or model's weights to a user. For instance, the model provider could be a diagnostics company that has trained a state-of-the-art DenseNet-121 model for interpreting a chest X-ray and the user could be a patient at a hospital. While secure inference is in principle feasible for this setting, there are no existing techniques that make it practical at scale. The CrypTFlow2 framework provides a potential solution with its ability to automatically and correctly translate clear-text inference to secure inference for arbitrary models. However, the resultant secure inference from CrypTFlow2 is impractically expensive: Almost 3TB of communication is required to interpret a single X-ray on DenseNet-121. In this paper, we address this outstanding challenge of inefficiency of secure inference with three contributions. First, we show that the primary bottlenecks in secure inference are large linear layers which can be optimized with the choice of network backbone and the use of operators developed for efficient clear-text inference. This finding and emphasis deviates from many recent works which focus on optimizing non-linear activation layers when performing secure inference of smaller networks. Second, based on analysis of a bottle-necked convolution layer, we design a X-operator which is a more efficient drop-in replacement. Third, we show that the fast Winograd convolution algorithm further improves efficiency of secure inference. In combination, these three optimizations prove to be highly effective for the problem of X-ray interpretation trained on the CheXpert dataset.
security
Title: CPS Attack Detection under Limited Local Information in Cyber Security: A Multi-node Multi-class Classification Ensemble Approach. (arXiv:2209.00170v1 [cs.CR])
- Paper URL: http://arxiv.org/abs/2209.00170
- Code URL: null
- Copy Paste:
[[2209.00170] CPS Attack Detection under Limited Local Information in Cyber Security: A Multi-node Multi-class Classification Ensemble Approach](http://arxiv.org/abs/2209.00170)
- Summary:
Cybersecurity breaches are the common anomalies for distributed cyber-physical systems (CPS). However, the cyber security breach classification is still a difficult problem, even using cutting-edge artificial intelligence (AI) approaches. In this paper, we study the multi-class classification problem in cyber security for attack detection. A challenging multi-node data-censoring case is considered. In such a case, data within each data center/node cannot be shared while the local data is incomplete. Particularly, local nodes contain only a part of the multiple classes. In order to train a global multi-class classifier without sharing the raw data across all nodes, the main result of our study is designing a multi-node multi-class classification ensemble approach. By gathering the estimated parameters of the binary classifiers and data densities from each local node, the missing information for each local node is completed to build the global multi-class classifier. Numerical experiments are given to validate the effectiveness of the proposed approach under the multi-node data-censoring case. Under such a case, we even show the out-performance of the proposed approach over the full-data approach.
Title: Memory Tagging: A Memory Efficient Design. (arXiv:2209.00307v1 [cs.CR])
- Paper URL: http://arxiv.org/abs/2209.00307
- Code URL: null
- Copy Paste:
[[2209.00307] Memory Tagging: A Memory Efficient Design](http://arxiv.org/abs/2209.00307)
- Summary:
ARM recently introduced a security feature called Memory Tagging Extension or MTE, which is designed to defend against common memory safety vulnerabilities, such as buffer overflow and use after free. In this paper, we examine three aspects of MTE. First, we survey how modern software systems, such as Glibc, Android, Chrome, Linux, and LLVM, use MTE. We identify some common weaknesses and propose improvements. Second, we develop and experiment with an architectural improvement to MTE that improves its memory efficiency. Our design enables longer memory tags, which improves the accuracy of MTE. Finally, we discuss a number of enhancements to MTE to improve its security against certain memory safety attacks.
Title: Towards Assessing Isolation Properties in Partitioning Hypervisors. (arXiv:2209.00405v1 [cs.CR])
- Paper URL: http://arxiv.org/abs/2209.00405
- Code URL: null
- Copy Paste:
[[2209.00405] Towards Assessing Isolation Properties in Partitioning Hypervisors](http://arxiv.org/abs/2209.00405)
- Summary:
Partitioning hypervisor solutions are becoming increasingly popular, to ensure stringent security and safety requirements related to isolation between co-hosted applications and to make more efficient use of available hardware resources. However, assessment and certification of isolation requirements remain a challenge and it is not trivial to understand what and how to test to validate these properties. Although the high-level requirements to be verified are mentioned in the different security- and safety-related standards, there is a lack of precise guidelines for the evaluator. This guidance should be comprehensive, generalizable to different products that implement partitioning, and tied specifically to lower-level requirements. The goal of this work is to provide a systematic framework that addresses this need.
Title: Authentication, Authorization, and Selective Disclosure for IoT data sharing using Verifiable Credentials and Zero-Knowledge Proofs. (arXiv:2209.00586v1 [cs.CR])
- Paper URL: http://arxiv.org/abs/2209.00586
- Code URL: null
- Copy Paste:
[[2209.00586] Authentication, Authorization, and Selective Disclosure for IoT data sharing using Verifiable Credentials and Zero-Knowledge Proofs](http://arxiv.org/abs/2209.00586)
- Summary:
As IoT becomes omnipresent vast amounts of data are generated, which can be used for building innovative applications. However,interoperability issues and security concerns, prevent harvesting the full potentials of these data. In this paper we consider the use case of data generated by smart buildings. Buildings are becoming ever "smarter" by integrating IoT devices that improve comfort through sensing and automation. However, these devices and their data are usually siloed in specific applications or manufacturers, even though they can be valuable for various interested stakeholders who provide different types of "over the top" services, e.g., energy management. Most data sharing techniques follow an "all or nothing" approach, creating significant security and privacy threats, when even partially revealed, privacy-preserving, data subsets can fuel innovative applications. With these in mind we develop a platform that enables controlled, privacy-preserving sharing of data items. Our system innovates in two directions: Firstly, it provides a framework for allowing discovery and selective disclosure of IoT data without violating their integrity. Secondly, it provides a user-friendly, intuitive mechanisms allowing efficient, fine-grained access control over the shared data. Our solution leverages recent advances in the areas of Self-Sovereign Identities, Verifiable Credentials, and Zero-Knowledge Proofs, and it integrates them in a platform that combines the industry-standard authorization framework OAuth 2.0 and the Web of Things specifications.
privacy
Title: Trading Off Privacy, Utility and Efficiency in Federated Learning. (arXiv:2209.00230v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2209.00230
- Code URL: null
- Copy Paste:
[[2209.00230] Trading Off Privacy, Utility and Efficiency in Federated Learning](http://arxiv.org/abs/2209.00230)
- Summary:
Federated learning (FL) enables participating parties to collaboratively build a global model with boosted utility without disclosing private data information. Appropriate protection mechanisms have to be adopted to fulfill the opposing requirements in preserving \textit{privacy} and maintaining high model \textit{utility}. In addition, it is a mandate for a federated learning system to achieve high \textit{efficiency} in order to enable large-scale model training and deployment. We propose a unified federated learning framework that reconciles horizontal and vertical federated learning. Based on this framework, we formulate and quantify the trade-offs between privacy leakage, utility loss, and efficiency reduction, which leads us to the No-Free-Lunch (NFL) theorem for the federated learning system. NFL indicates that it is unrealistic to expect an FL algorithm to simultaneously provide excellent privacy, utility, and efficiency in certain scenarios. We then analyze the lower bounds for the privacy leakage, utility loss and efficiency reduction for several widely-adopted protection mechanisms including \textit{Randomization}, \textit{Homomorphic Encryption}, \textit{Secret Sharing} and \textit{Compression}. Our analysis could serve as a guide for selecting protection parameters to meet particular requirements.
Title: Ensembling Neural Networks for Improved Prediction and Privacy in Early Diagnosis of Sepsis. (arXiv:2209.00439v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2209.00439
- Code URL: https://github.com/statnlp/sepens
- Copy Paste:
[[2209.00439] Ensembling Neural Networks for Improved Prediction and Privacy in Early Diagnosis of Sepsis](http://arxiv.org/abs/2209.00439)
- Summary:
Ensembling neural networks is a long-standing technique for improving the generalization error of neural networks by combining networks with orthogonal properties via a committee decision. We show that this technique is an ideal fit for machine learning on medical data: First, ensembles are amenable to parallel and asynchronous learning, thus enabling efficient training of patient-specific component neural networks. Second, building on the idea of minimizing generalization error by selecting uncorrelated patient-specific networks, we show that one can build an ensemble of a few selected patient-specific models that outperforms a single model trained on much larger pooled datasets. Third, the non-iterative ensemble combination step is an optimal low-dimensional entry point to apply output perturbation to guarantee the privacy of the patient-specific networks. We exemplify our framework of differentially private ensembles on the task of early prediction of sepsis, using real-life intensive care unit data labeled by clinical experts.
protect
defense
attack
Title: On the detection of morphing attacks generated by GANs. (arXiv:2209.00404v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2209.00404
- Code URL: null
- Copy Paste:
[[2209.00404] On the detection of morphing attacks generated by GANs](http://arxiv.org/abs/2209.00404)
- Summary:
Recent works have demonstrated the feasibility of GAN-based morphing attacks that reach similar success rates as more traditional landmark-based methods. This new type of "deep" morphs might require the development of new adequate detectors to protect face recognition systems. We explore simple deep morph detection baselines based on spectral features and LBP histograms features, as well as on CNN models, both in the intra-dataset and cross-dataset case. We observe that simple LBP-based systems are already quite accurate in the intra-dataset setting, but struggle with generalization, a phenomenon that is partially mitigated by fusing together several of those systems at score-level. We conclude that a pretrained ResNet effective for GAN image detection is the most effective overall, reaching close to perfect accuracy. We note however that LBP-based systems maintain a level of interest : additionally to their lower computational requirements and increased interpretability with respect to CNNs, LBP+ResNet fusions sometimes also showcase increased performance versus ResNet-only, hinting that LBP-based systems can focus on meaningful signal that is not necessarily picked up by the CNN detector.
Title: Be Your Own Neighborhood: Detecting Adversarial Example by the Neighborhood Relations Built on Self-Supervised Learning. (arXiv:2209.00005v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2209.00005
- Code URL: null
- Copy Paste:
[[2209.00005] Be Your Own Neighborhood: Detecting Adversarial Example by the Neighborhood Relations Built on Self-Supervised Learning](http://arxiv.org/abs/2209.00005)
- Summary:
Deep Neural Networks (DNNs) have achieved excellent performance in various fields. However, DNNs' vulnerability to Adversarial Examples (AE) hinders their deployments to safety-critical applications. This paper presents a novel AE detection framework, named BEYOND, for trustworthy predictions. BEYOND performs the detection by distinguishing the AE's abnormal relation with its augmented versions, i.e. neighbors, from two prospects: representation similarity and label consistency. An off-the-shelf Self-Supervised Learning (SSL) model is used to extract the representation and predict the label for its highly informative representation capacity compared to supervised learning models. For clean samples, their representations and predictions are closely consistent with their neighbors, whereas those of AEs differ greatly. Furthermore, we explain this observation and show that by leveraging this discrepancy BEYOND can effectively detect AEs. We develop a rigorous justification for the effectiveness of BEYOND. Furthermore, as a plug-and-play model, BEYOND can easily cooperate with the Adversarial Trained Classifier (ATC), achieving the state-of-the-art (SOTA) robustness accuracy. Experimental results show that BEYOND outperforms baselines by a large margin, especially under adaptive attacks. Empowered by the robust relation net built on SSL, we found that BEYOND outperforms baselines in terms of both detection ability and speed. Our code will be publicly available.
Title: Wiggle: Physical Challenge-Response Verification of Vehicle Platooning. (arXiv:2209.00080v1 [cs.CR])
- Paper URL: http://arxiv.org/abs/2209.00080
- Code URL: null
- Copy Paste:
[[2209.00080] Wiggle: Physical Challenge-Response Verification of Vehicle Platooning](http://arxiv.org/abs/2209.00080)
- Summary:
Autonomous vehicle platooning promises many benefits such as fuel efficiency, road safety, reduced traffic congestion, and passenger comfort. Platooning vehicles travel in a single file, in close distance, and at the same velocity. The platoon formation is autonomously maintained by a Cooperative Adaptive Cruise Control (CACC) system which relies on sensory data and vehicle-to-vehicle (V2V) communications. In fact, V2V messages play a critical role in shortening the platooning distance while maintaining safety. Whereas V2V message integrity and source authentication can be verified via cryptographic methods, establishing the truthfulness of the message contents is a much harder task.
This work establishes a physical access control mechanism to restrict V2V messages to platooning members. Specifically, we aim at tying the digital identity of a candidate requesting to join a platoon to its physical trajectory relative to the platoon. We propose the {\em Wiggle} protocol that employs a physical challenge-response exchange to prove that a candidate requesting to be admitted into a platoon actually follows it. The protocol name is inspired by the random longitudinal movements that the candidate is challenged to execute. {\em Wiggle} prevents any remote adversary from joining the platoon and injecting fake CACC messages. Compared to prior works, {\em Wiggle} is resistant to pre-recording attacks and can verify that the candidate is directly behind the verifier at the same lane.
Title: Attack Tactic Identification by Transfer Learning of Language Model. (arXiv:2209.00263v1 [cs.CR])
- Paper URL: http://arxiv.org/abs/2209.00263
- Code URL: null
- Copy Paste:
[[2209.00263] Attack Tactic Identification by Transfer Learning of Language Model](http://arxiv.org/abs/2209.00263)
- Summary:
Cybersecurity has become a primary global concern with the rapid increase in security attacks and data breaches. Artificial intelligence is promising to help humans analyzing and identifying attacks. However, labeling millions of packets for supervised learning is never easy. This study aims to leverage transfer learning technique that stores the knowledge gained from well-defined attack lifecycle documents and applies it to hundred thousands of unlabeled attacks (packets) for identifying their attack tactics. We anticipate the knowledge of an attack is well-described in the documents, and the cutting edge transformer-based language model can embed the knowledge into a high-dimensional latent space. Then, reusing the information from the language model for the learning of attack tactic carried by packets to improve the learning efficiency. We propose a system, PELAT, that fine-tunes BERT model with 1,417 articles from MITRE ATT&CK lifecycle framework to enhance its attack knowledge (including syntax used and semantic meanings embedded). PELAT then transfers its knowledge to perform semi-supervised learning for unlabeled packets to generate their tactic labels. Further, when a new attack packet arrives, the packet payload will be processed by the PELAT language model with a downstream classifier to predict its tactics. In this way, we can effectively reduce the burden of manually labeling big datasets. In a one-week honeypot attack dataset (227 thousand packets per day), PELAT performs 99% of precision, recall, and F1 on testing dataset. PELAT can infer over 99% of tactics on two other testing datasets (while nearly 90% of tactics are identified).
Title: Probabilistic Deduction: an Approach to Probabilistic Structured Argumentation. (arXiv:2209.00210v1 [cs.AI])
- Paper URL: http://arxiv.org/abs/2209.00210
- Code URL: null
- Copy Paste:
[[2209.00210] Probabilistic Deduction: an Approach to Probabilistic Structured Argumentation](http://arxiv.org/abs/2209.00210)
- Summary:
This paper introduces Probabilistic Deduction (PD) as an approach to probabilistic structured argumentation. A PD framework is composed of probabilistic rules (p-rules). As rules in classical structured argumentation frameworks, p-rules form deduction systems. In addition, p-rules also represent conditional probabilities that define joint probability distributions. With PD frameworks, one performs probabilistic reasoning by solving Rule-Probabilistic Satisfiability. At the same time, one can obtain an argumentative reading to the probabilistic reasoning with arguments and attacks. In this work, we introduce a probabilistic version of the Closed-World Assumption (P-CWA) and prove that our probabilistic approach coincides with the complete extension in classical argumentation under P-CWA and with maximum entropy reasoning. We present several approaches to compute the joint probability distribution from p-rules for achieving a practical proof theory for PD. PD provides a framework to unify probabilistic reasoning with argumentative reasoning. This is the first work in probabilistic structured argumentation where the joint distribution is not assumed form external sources.
robust
Title: Addressing Class Imbalance in Semi-supervised Image Segmentation: A Study on Cardiac MRI. (arXiv:2209.00123v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2209.00123
- Code URL: null
- Copy Paste:
[[2209.00123] Addressing Class Imbalance in Semi-supervised Image Segmentation: A Study on Cardiac MRI](http://arxiv.org/abs/2209.00123)
- Summary:
Due to the imbalanced and limited data, semi-supervised medical image segmentation methods often fail to produce superior performance for some specific tailed classes. Inadequate training for those particular classes could introduce more noise to the generated pseudo labels, affecting overall learning. To alleviate this shortcoming and identify the under-performing classes, we propose maintaining a confidence array that records class-wise performance during training. A fuzzy fusion of these confidence scores is proposed to adaptively prioritize individual confidence metrics in every sample rather than traditional ensemble approaches, where a set of predefined fixed weights are assigned for all the test cases. Further, we introduce a robust class-wise sampling method and dynamic stabilization for a better training strategy. Our proposed method considers all the under-performing classes with dynamic weighting and tries to remove most of the noises during training. Upon evaluation on two cardiac MRI datasets, ACDC and MMWHS, our proposed method shows effectiveness and generalizability and outperforms several state-of-the-art methods found in the literature.
Title: Wasserstein Embedding for Capsule Learning. (arXiv:2209.00232v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2209.00232
- Code URL: null
- Copy Paste:
[[2209.00232] Wasserstein Embedding for Capsule Learning](http://arxiv.org/abs/2209.00232)
- Summary:
Capsule networks (CapsNets) aim to parse images into a hierarchical component structure that consists of objects, parts, and their relations. Despite their potential, they are computationally expensive and pose a major drawback, which limits utilizing these networks efficiently on more complex datasets. The current CapsNet models only compare their performance with the capsule baselines and do not perform at the same level as deep CNN-based models on complicated tasks. This paper proposes an efficient way for learning capsules that detect atomic parts of an input image, through a group of SubCapsules, upon which an input vector is projected. Subsequently, we present the Wasserstein Embedding Module that first measures the dissimilarity between the input and components modeled by the SubCapsules, and then finds their degree of alignment based on the learned optimal transport. This strategy leverages new insights on defining alignment between the input and SubCapsules based on the similarity between their respective component distributions. Our proposed model, (i) is lightweight and allows to apply capsules for more complex vision tasks; (ii) performs better than or at par with CNN-based models on these challenging tasks. Our experimental results indicate that Wasserstein Embedding Capsules (WECapsules) perform more robustly on affine transformations, effectively scale up to larger datasets, and outperform the CNN and CapsNet models in several vision tasks.
Title: Combating Noisy Labels in Long-Tailed Image Classification. (arXiv:2209.00273v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2209.00273
- Code URL: null
- Copy Paste:
[[2209.00273] Combating Noisy Labels in Long-Tailed Image Classification](http://arxiv.org/abs/2209.00273)
- Summary:
Most existing methods that cope with noisy labels usually assume that the class distributions are well balanced, which has insufficient capacity to deal with the practical scenarios where training samples have imbalanced distributions. To this end, this paper makes an early effort to tackle the image classification task with both long-tailed distribution and label noise. Existing noise-robust learning methods cannot work in this scenario as it is challenging to differentiate noisy samples from clean samples of tail classes. To deal with this problem, we propose a new learning paradigm based on matching between inferences on weak and strong data augmentations to screen out noisy samples and introduce a leave-noise-out regularization to eliminate the effect of the recognized noisy samples. Furthermore, we incorporate a novel prediction penalty based on online prior distribution to avoid bias towards head classes. This mechanism has superiority in capturing the class fitting degree in realtime compared to the existing long-tail classification methods. Exhaustive experiments demonstrate that the proposed method outperforms state-of-the-art algorithms that address the distribution imbalance problem in long-tailed classification under noisy labels.
Title: Gait Recognition in the Wild with Multi-hop Temporal Switch. (arXiv:2209.00355v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2209.00355
- Code URL: null
- Copy Paste:
[[2209.00355] Gait Recognition in the Wild with Multi-hop Temporal Switch](http://arxiv.org/abs/2209.00355)
- Summary:
Existing studies for gait recognition are dominated by in-the-lab scenarios. Since people live in real-world senses, gait recognition in the wild is a more practical problem that has recently attracted the attention of the community of multimedia and computer vision. Current methods that obtain state-of-the-art performance on in-the-lab benchmarks achieve much worse accuracy on the recently proposed in-the-wild datasets because these methods can hardly model the varied temporal dynamics of gait sequences in unconstrained scenes. Therefore, this paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes. Concretely, we design a novel gait recognition network, named Multi-hop Temporal Switch Network (MTSGait), to learn spatial features and multi-scale temporal features simultaneously. Different from existing methods that use 3D convolutions for temporal modeling, our MTSGait models the temporal dynamics of gait sequences by 2D convolutions. By this means, it achieves high efficiency with fewer model parameters and reduces the difficulty in optimization compared with 3D convolution-based models. Based on the specific design of the 2D convolution kernels, our method can eliminate the misalignment of features among adjacent frames. In addition, a new sampling strategy, i.e., non-cyclic continuous sampling, is proposed to make the model learn more robust temporal features. Finally, the proposed method achieves superior performance on two public gait in-the-wild datasets, i.e., GREW and Gait3D, compared with state-of-the-art methods.
Title: TempCLR: Reconstructing Hands via Time-Coherent Contrastive Learning. (arXiv:2209.00489v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2209.00489
- Code URL: null
- Copy Paste:
[[2209.00489] TempCLR: Reconstructing Hands via Time-Coherent Contrastive Learning](http://arxiv.org/abs/2209.00489)
- Summary:
We introduce TempCLR, a new time-coherent contrastive learning approach for the structured regression task of 3D hand reconstruction. Unlike previous time-contrastive methods for hand pose estimation, our framework considers temporal consistency in its augmentation scheme, and accounts for the differences of hand poses along the temporal direction. Our data-driven method leverages unlabelled videos and a standard CNN, without relying on synthetic data, pseudo-labels, or specialized architectures. Our approach improves the performance of fully-supervised hand reconstruction methods by 15.9% and 7.6% in PA-V2V on the HO-3D and FreiHAND datasets respectively, thus establishing new state-of-the-art performance. Finally, we demonstrate that our approach produces smoother hand reconstructions through time, and is more robust to heavy occlusions compared to the previous state-of-the-art which we show quantitatively and qualitatively. Our code and models will be available at https://eth-ait.github.io/tempclr.
Title: Implicit and Efficient Point Cloud Completion for 3D Single Object Tracking. (arXiv:2209.00522v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2209.00522
- Code URL: null
- Copy Paste:
[[2209.00522] Implicit and Efficient Point Cloud Completion for 3D Single Object Tracking](http://arxiv.org/abs/2209.00522)
- Summary:
The point cloud based 3D single object tracking (3DSOT) has drawn increasing attention. Lots of breakthroughs have been made, but we also reveal two severe issues. By an extensive analysis, we find the prediction manner of current approaches is non-robust, i.e., exposing a misalignment gap between prediction score and actually localization accuracy. Another issue is the sparse point returns will damage the feature matching procedure of the SOT task. Based on these insights, we introduce two novel modules, i.e., Adaptive Refine Prediction (ARP) and Target Knowledge Transfer (TKT), to tackle them, respectively. To this end, we first design a strong pipeline to extract discriminative features and conduct the matching procedure with the attention mechanism. Then, ARP module is proposed to tackle the misalignment issue by aggregating all predicted candidates with valuable clues. Finally, TKT module is designed to effectively overcome incomplete point cloud due to sparse and occlusion issues. We call our overall framework PCET. By conducting extensive experiments on the KITTI and Waymo Open Dataset, our model achieves state-of-the-art performance while maintaining a lower computational consumption.
Title: Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised Text Recognition. (arXiv:2209.00641v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2209.00641
- Code URL: null
- Copy Paste:
[[2209.00641] Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised Text Recognition](http://arxiv.org/abs/2209.00641)
- Summary:
This paper looks at semi-supervised learning (SSL) for image-based text recognition. One of the most popular SSL approaches is pseudo-labeling (PL). PL approaches assign labels to unlabeled data before re-training the model with a combination of labeled and pseudo-labeled data. However, PL methods are severely degraded by noise and are prone to over-fitting to noisy labels, due to the inclusion of erroneous high confidence pseudo-labels generated from poorly calibrated models, thus, rendering threshold-based selection ineffective. Moreover, the combinatorial complexity of the hypothesis space and the error accumulation due to multiple incorrect autoregressive steps posit pseudo-labeling challenging for sequence models. To this end, we propose a pseudo-label generation and an uncertainty-based data selection framework for semi-supervised text recognition. We first use Beam-Search inference to yield highly probable hypotheses to assign pseudo-labels to the unlabelled examples. Then we adopt an ensemble of models, sampled by applying dropout, to obtain a robust estimate of the uncertainty associated with the prediction, considering both the character-level and word-level predictive distribution to select good quality pseudo-labels. Extensive experiments on several benchmark handwriting and scene-text datasets show that our method outperforms the baseline approaches and the previous state-of-the-art semi-supervised text-recognition methods.
Title: Isotropic Representation Can Improve Dense Retrieval. (arXiv:2209.00218v1 [cs.CL])
- Paper URL: http://arxiv.org/abs/2209.00218
- Code URL: null
- Copy Paste:
[[2209.00218] Isotropic Representation Can Improve Dense Retrieval](http://arxiv.org/abs/2209.00218)
- Summary:
The recent advancement in language representation modeling has broadly affected the design of dense retrieval models. In particular, many of the high-performing dense retrieval models evaluate representations of query and document using BERT, and subsequently apply a cosine-similarity based scoring to determine the relevance. BERT representations, however, are known to follow an anisotropic distribution of a narrow cone shape and such an anisotropic distribution can be undesirable for the cosine-similarity based scoring. In this work, we first show that BERT-based DR also follows an anisotropic distribution. To cope with the problem, we introduce unsupervised post-processing methods of Normalizing Flow and whitening, and develop token-wise method in addition to the sequence-wise method for applying the post-processing methods to the representations of dense retrieval models. We show that the proposed methods can effectively enhance the representations to be isotropic, then we perform experiments with ColBERT and RepBERT to show that the performance (NDCG at 10) of document re-ranking can be improved by 5.17\%$\sim$8.09\% for ColBERT and 6.88\%$\sim$22.81\% for RepBERT. To examine the potential of isotropic representation for improving the robustness of DR models, we investigate out-of-distribution tasks where the test dataset differs from the training dataset. The results show that isotropic representation can achieve a generally improved performance. For instance, when training dataset is MS-MARCO and test dataset is Robust04, isotropy post-processing can improve the baseline performance by up to 24.98\%. Furthermore, we show that an isotropic model trained with an out-of-distribution dataset can even outperform a baseline model trained with the in-distribution dataset.
Title: Holomorphic Equilibrium Propagation Computes Exact Gradients Through Finite Size Oscillations. (arXiv:2209.00530v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2209.00530
- Code URL: null
- Copy Paste:
[[2209.00530] Holomorphic Equilibrium Propagation Computes Exact Gradients Through Finite Size Oscillations](http://arxiv.org/abs/2209.00530)
- Summary:
Equilibrium propagation (EP) is an alternative to backpropagation (BP) that allows the training of deep neural networks with local learning rules. It thus provides a compelling framework for training neuromorphic systems and understanding learning in neurobiology. However, EP requires infinitesimal teaching signals, thereby limiting its applicability in noisy physical systems. Moreover, the algorithm requires separate temporal phases and has not been applied to large-scale problems. Here we address these issues by extending EP to holomorphic networks. We show analytically that this extension naturally leads to exact gradients even for finite-amplitude teaching signals. Importantly, the gradient can be computed as the first Fourier coefficient from finite neuronal activity oscillations in continuous time without requiring separate phases. Further, we demonstrate in numerical simulations that our approach permits robust estimation of gradients in the presence of noise and that deeper models benefit from the finite teaching signals. Finally, we establish the first benchmark for EP on the ImageNet 32x32 dataset and show that it matches the performance of an equivalent network trained with BP. Our work provides analytical insights that enable scaling EP to large-scale problems and establishes a formal framework for how oscillations could support learning in biological and neuromorphic systems.
Title: Progressive Fusion for Multimodal Integration. (arXiv:2209.00302v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2209.00302
- Code URL: null
- Copy Paste:
[[2209.00302] Progressive Fusion for Multimodal Integration](http://arxiv.org/abs/2209.00302)
- Summary:
Integration of multimodal information from various sources has been shown to boost the performance of machine learning models and thus has received increased attention in recent years. Often such models use deep modality-specific networks to obtain unimodal features which are combined to obtain "late-fusion" representations. However, these designs run the risk of information loss in the respective unimodal pipelines. On the other hand, "early-fusion" methodologies, which combine features early, suffer from the problems associated with feature heterogeneity and high sample complexity. In this work, we present an iterative representation refinement approach, called Progressive Fusion, which mitigates the issues with late fusion representations. Our model-agnostic technique introduces backward connections that make late stage fused representations available to early layers, improving the expressiveness of the representations at those stages, while retaining the advantages of late fusion designs. We test Progressive Fusion on tasks including affective sentiment detection, multimedia analysis, and time series fusion with different models, demonstrating its versatility. We show that our approach consistently improves performance, for instance attaining a 5% reduction in MSE and 40% improvement in robustness on multimodal time series prediction.
Title: Learning with Differentiable Algorithms. (arXiv:2209.00616v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2209.00616
- Code URL: null
- Copy Paste:
[[2209.00616] Learning with Differentiable Algorithms](http://arxiv.org/abs/2209.00616)
- Summary:
Classic algorithms and machine learning systems like neural networks are both abundant in everyday life. While classic computer science algorithms are suitable for precise execution of exactly defined tasks such as finding the shortest path in a large graph, neural networks allow learning from data to predict the most likely answer in more complex tasks such as image classification, which cannot be reduced to an exact algorithm. To get the best of both worlds, this thesis explores combining both concepts leading to more robust, better performing, more interpretable, more computationally efficient, and more data efficient architectures. The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm. When integrating an algorithm into a neural architecture, it is important that the algorithm is differentiable such that the architecture can be trained end-to-end and gradients can be propagated back through the algorithm in a meaningful way. To make algorithms differentiable, this thesis proposes a general method for continuously relaxing algorithms by perturbing variables and approximating the expectation value in closed form, i.e., without sampling. In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable renderers, and differentiable logic gate networks. Finally, this thesis presents alternative training strategies for learning with algorithms.
biometric
steal
extraction
Title: MM-PCQA: Multi-Modal Learning for No-reference Point Cloud Quality Assessment. (arXiv:2209.00244v1 [cs.CV])
- Paper URL: http://arxiv.org/abs/2209.00244
- Code URL: null
- Copy Paste:
[[2209.00244] MM-PCQA: Multi-Modal Learning for No-reference Point Cloud Quality Assessment](http://arxiv.org/abs/2209.00244)
- Summary:
The visual quality of point clouds has been greatly emphasized since the ever-increasing 3D vision applications are expected to provide cost-effective and high-quality experiences for users. Looking back on the development of point cloud quality assessment (PCQA) methods, the visual quality is usually evaluated by utilizing single-modal information, i.e., either extracted from the 2D projections or 3D point cloud. The 2D projections contain rich texture and semantic information but are highly dependent on viewpoints, while the 3D point clouds are more sensitive to geometry distortions and invariant to viewpoints. Therefore, to leverage the advantages of both point cloud and projected image modalities, we propose a novel no-reference point cloud quality assessment (NR-PCQA) metric in a multi-modal fashion. In specific, we split the point clouds into sub-models to represent local geometry distortions such as point shift and down-sampling. Then we render the point clouds into 2D image projections for texture feature extraction. To achieve the goals, the sub-models and projected images are encoded with point-based and image-based neural networks. Finally, symmetric cross-modal attention is employed to fuse multi-modal quality-aware information. Experimental results show that our approach outperforms all compared state-of-the-art methods and is far ahead of previous NR-PCQA methods, which highlights the effectiveness of the proposed method.
Title: Less is More: Rethinking State-of-the-art Continual Relation Extraction Models with a Frustratingly Easy but Effective Approach. (arXiv:2209.00243v1 [cs.CL])
- Paper URL: http://arxiv.org/abs/2209.00243
- Code URL: null
- Copy Paste:
[[2209.00243] Less is More: Rethinking State-of-the-art Continual Relation Extraction Models with a Frustratingly Easy but Effective Approach](http://arxiv.org/abs/2209.00243)
- Summary:
Continual relation extraction (CRE) requires the model to continually learn new relations from class-incremental data streams. In this paper, we propose a Frustratingly easy but Effective Approach (FEA) method with two learning stages for CRE: 1) Fast Adaption (FA) warms up the model with only new data. 2) Balanced Tuning (BT) finetunes the model on the balanced memory data. Despite its simplicity, FEA achieves comparable (on TACRED or superior (on FewRel) performance compared with the state-of-the-art baselines. With careful examinations, we find that the data imbalance between new and old relations leads to a skewed decision boundary in the head classifiers over the pretrained encoders, thus hurting the overall performance. In FEA, the FA stage unleashes the potential of memory data for the subsequent finetuning, while the BT stage helps establish a more balanced decision boundary. With a unified view, we find that two strong CRE baselines can be subsumed into the proposed training pipeline. The success of FEA also provides actionable insights and suggestions for future model designing in CRE.
Title: Find the Funding: Entity Linking with Incomplete Funding Knowledge Bases. (arXiv:2209.00351v1 [cs.CL])
- Paper URL: http://arxiv.org/abs/2209.00351
- Code URL: null
- Copy Paste:
[[2209.00351] Find the Funding: Entity Linking with Incomplete Funding Knowledge Bases](http://arxiv.org/abs/2209.00351)
- Summary:
Automatic extraction of funding information from academic articles adds significant value to industry and research communities, such as tracking research outcomes by funding organizations, profiling researchers and universities based on the received funding, and supporting open access policies. Two major challenges of identifying and linking funding entities are: (i) sparse graph structure of the Knowledge Base (KB), which makes the commonly used graph-based entity linking approaches suboptimal for the funding domain, (ii) missing entities in KB, which (unlike recent zero-shot approaches) requires marking entity mentions without KB entries as NIL. We propose an entity linking model that can perform NIL prediction and overcome data scarcity issues in a time and data-efficient manner. Our model builds on a transformer-based mention detection and bi-encoder model to perform entity linking. We show that our model outperforms strong existing baselines.
Title: KoCHET: a Korean Cultural Heritage corpus for Entity-related Tasks. (arXiv:2209.00367v1 [cs.CL])
- Paper URL: http://arxiv.org/abs/2209.00367
- Code URL: null
- Copy Paste:
[[2209.00367] KoCHET: a Korean Cultural Heritage corpus for Entity-related Tasks](http://arxiv.org/abs/2209.00367)
- Summary:
As digitized traditional cultural heritage documents have rapidly increased, resulting in an increased need for preservation and management, practical recognition of entities and typification of their classes has become essential. To achieve this, we propose KoCHET - a Korean cultural heritage corpus for the typical entity-related tasks, i.e., named entity recognition (NER), relation extraction (RE), and entity typing (ET). Advised by cultural heritage experts based on the data construction guidelines of government-affiliated organizations, KoCHET consists of respectively 112,362, 38,765, 113,198 examples for NER, RE, and ET tasks, covering all entity types related to Korean cultural heritage. Moreover, unlike the existing public corpora, modified redistribution can be allowed both domestic and foreign researchers. Our experimental results make the practical usability of KoCHET more valuable in terms of cultural heritage. We also provide practical insights of KoCHET in terms of statistical and linguistic analysis. Our corpus is freely available at https://github.com/Gyeongmin47/KoCHET.
Title: Negation detection in Dutch clinical texts: an evaluation of rule-based and machine learning methods. (arXiv:2209.00470v1 [cs.CL])
- Paper URL: http://arxiv.org/abs/2209.00470
- Code URL: https://github.com/umcu/negation-detection
- Copy Paste:
[[2209.00470] Negation detection in Dutch clinical texts: an evaluation of rule-based and machine learning methods](http://arxiv.org/abs/2209.00470)
- Summary:
As structured data are often insufficient, labels need to be extracted from free text in electronic health records when developing models for clinical information retrieval and decision support systems. One of the most important contextual properties in clinical text is negation, which indicates the absence of findings. We aimed to improve large scale extraction of labels by comparing three methods for negation detection in Dutch clinical notes. We used the Erasmus Medical Center Dutch Clinical Corpus to compare a rule-based method based on ContextD, a biLSTM model using MedCAT and (finetuned) RoBERTa-based models. We found that both the biLSTM and RoBERTa models consistently outperform the rule-based model in terms of F1 score, precision and recall. In addition, we systematically categorized the classification errors for each model, which can be used to further improve model performance in particular applications. Combining the three models naively was not beneficial in terms of performance. We conclude that the biLSTM and RoBERTa-based models in particular are highly accurate accurate in detecting clinical negations, but that ultimately all three approaches can be viable depending on the use case at hand.
Title: Multi-Scale Contrastive Co-Training for Event Temporal Relation Extraction. (arXiv:2209.00568v1 [cs.CL])
- Paper URL: http://arxiv.org/abs/2209.00568
- Code URL: null
- Copy Paste:
[[2209.00568] Multi-Scale Contrastive Co-Training for Event Temporal Relation Extraction](http://arxiv.org/abs/2209.00568)
- Summary:
Extracting temporal relationships between pairs of events in texts is a crucial yet challenging problem for natural language understanding. Depending on the distance between the events, models must learn to differently balance information from local and global contexts surrounding the event pair for temporal relation prediction. Learning how to fuse this information has proved challenging for transformer-based language models. Therefore, we present MulCo: Multi-Scale Contrastive Co-Training, a technique for the better fusion of local and global contextualized features. Our model uses a BERT-based language model to encode local context and a Graph Neural Network (GNN) to represent global document-level syntactic and temporal characteristics. Unlike previous state-of-the-art methods, which use simple concatenation on multi-view features or select optimal sentences using sophisticated reinforcement learning approaches, our model co-trains GNN and BERT modules using a multi-scale contrastive learning objective. The GNN and BERT modules learn a synergistic parameterization by contrasting GNN multi-layer multi-hop subgraphs (i.e., global context embeddings) and BERT outputs (i.e., local context embeddings) through end-to-end back-propagation. We empirically demonstrate that MulCo provides improved ability to fuse local and global contexts encoded using BERT and GNN compared to the current state-of-the-art. Our experimental results show that MulCo achieves new state-of-the-art results on several temporal relation extraction datasets.
membership infer
federate
Title: Federated Learning with Label Distribution Skew via Logits Calibration. (arXiv:2209.00189v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2209.00189
- Code URL: null
- Copy Paste:
[[2209.00189] Federated Learning with Label Distribution Skew via Logits Calibration](http://arxiv.org/abs/2209.00189)
- Summary:
Traditional federated optimization methods perform poorly with heterogeneous data (ie, accuracy reduction), especially for highly skewed data. In this paper, we investigate the label distribution skew in FL, where the distribution of labels varies across clients. First, we investigate the label distribution skew from a statistical view. We demonstrate both theoretically and empirically that previous methods based on softmax cross-entropy are not suitable, which can result in local models heavily overfitting to minority classes and missing classes. Additionally, we theoretically introduce a deviation bound to measure the deviation of the gradient after local update. At last, we propose FedLC (\textbf {Fed} erated learning via\textbf {L} ogits\textbf {C} alibration), which calibrates the logits before softmax cross-entropy according to the probability of occurrence of each class. FedLC applies a fine-grained calibrated cross-entropy loss to local update by adding a pairwise label margin. Extensive experiments on federated datasets and real-world datasets demonstrate that FedLC leads to a more accurate global model and much improved performance. Furthermore, integrating other FL methods into our approach can further enhance the performance of the global model.
Title: Online Data Selection for Federated Learning with Limited Storage. (arXiv:2209.00195v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2209.00195
- Code URL: null
- Copy Paste:
[[2209.00195] Online Data Selection for Federated Learning with Limited Storage](http://arxiv.org/abs/2209.00195)
- Summary:
Machine learning models have been deployed in mobile networks to deal with the data from different layers to enable automated network management and intelligence on devices. To overcome high communication cost and severe privacy concerns of centralized machine learning, Federated Learning (FL) has been proposed to achieve distributed machine learning among networked devices. While the computation and communication limitation has been widely studied in FL, the impact of on-device storage on the performance of FL is still not explored. Without an efficient and effective data selection policy to filter the abundant streaming data on devices, classical FL can suffer from much longer model training time (more than $4\times$) and significant inference accuracy reduction (more than $7\%$), observed in our experiments. In this work, we take the first step to consider the online data selection for FL with limited on-device storage. We first define a new data valuation metric for data selection in FL: the projection of local gradient over an on-device data sample onto the global gradient over the data from all devices. We further design \textbf{ODE}, a framework of \textbf{O}nline \textbf{D}ata s\textbf{E}lection for FL, to coordinate networked devices to store valuable data samples collaboratively, with theoretical guarantees for speeding up model convergence and enhancing final model accuracy, simultaneously. Experimental results on one industrial task (mobile network traffic classification) and three public tasks (synthetic task, image classification, human activity recognition) show the remarkable advantages of ODE over the state-of-the-art approaches. Particularly, on the industrial dataset, ODE achieves as high as $2.5\times$ speedup of training time and $6\%$ increase in final inference accuracy, and is robust to various factors in the practical environment.
Title: Versatile Single-Loop Method for Gradient Estimator: First and Second Order Optimality, and its Application to Federated Learning. (arXiv:2209.00361v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2209.00361
- Code URL: null
- Copy Paste:
[[2209.00361] Versatile Single-Loop Method for Gradient Estimator: First and Second Order Optimality, and its Application to Federated Learning](http://arxiv.org/abs/2209.00361)
- Summary:
While variance reduction methods have shown great success in solving large scale optimization problems, many of them suffer from accumulated errors and, therefore, should periodically require the full gradient computation. In this paper, we present a single-loop algorithm named SLEDGE (Single-Loop mEthoD for Gradient Estimator) for finite-sum nonconvex optimization, which does not require periodic refresh of the gradient estimator but achieves nearly optimal gradient complexity. Unlike existing methods, SLEDGE has the advantage of versatility; (i) second-order optimality, (ii) exponential convergence in the PL region, and (iii) smaller complexity under less heterogeneity of data.
We build an efficient federated learning algorithm by exploiting these favorable properties. We show the first and second-order optimality of the output and also provide analysis under PL conditions. When the local budget is sufficiently large and clients are less (Hessian-)~heterogeneous, the algorithm requires fewer communication rounds then existing methods such as FedAvg, SCAFFOLD, and Mime. The superiority of our method is verified in numerical experiments.
fair
Title: Fair mapping. (arXiv:2209.00617v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2209.00617
- Code URL: null
- Copy Paste:
[[2209.00617] Fair mapping](http://arxiv.org/abs/2209.00617)
- Summary:
To mitigate the effects of undesired biases in models, several approaches propose to pre-process the input dataset to reduce the risks of discrimination by preventing the inference of sensitive attributes. Unfortunately, most of these pre-processing methods lead to the generation a new distribution that is very different from the original one, thus often leading to unrealistic data. As a side effect, this new data distribution implies that existing models need to be re-trained to be able to make accurate predictions. To address this issue, we propose a novel pre-processing method, that we coin as fair mapping, based on the transformation of the distribution of protected groups onto a chosen target one, with additional privacy constraints whose objective is to prevent the inference of sensitive attributes. More precisely, we leverage on the recent works of the Wasserstein GAN and AttGAN frameworks to achieve the optimal transport of data points coupled with a discriminator enforcing the protection against attribute inference. Our proposed approach, preserves the interpretability of data and can be used without defining exactly the sensitive groups. In addition, our approach can be specialized to model existing state-of-the-art approaches, thus proposing a unifying view on these methods. Finally, several experiments on real and synthetic datasets demonstrate that our approach is able to hide the sensitive attributes, while limiting the distortion of the data and improving the fairness on subsequent data analysis tasks.
interpretability
Title: STDEN: Towards Physics-Guided Neural Networks for Traffic Flow Prediction. (arXiv:2209.00225v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2209.00225
- Code URL: https://github.com/echo-ji/stden
- Copy Paste:
[[2209.00225] STDEN: Towards Physics-Guided Neural Networks for Traffic Flow Prediction](http://arxiv.org/abs/2209.00225)
- Summary:
High-performance traffic flow prediction model designing, a core technology of Intelligent Transportation System, is a long-standing but still challenging task for industrial and academic communities. The lack of integration between physical principles and data-driven models is an important reason for limiting the development of this field. In the literature, physics-based methods can usually provide a clear interpretation of the dynamic process of traffic flow systems but are with limited accuracy, while data-driven methods, especially deep learning with black-box structures, can achieve improved performance but can not be fully trusted due to lack of a reasonable physical basis. To bridge the gap between purely data-driven and physics-driven approaches, we propose a physics-guided deep learning model named Spatio-Temporal Differential Equation Network (STDEN), which casts the physical mechanism of traffic flow dynamics into a deep neural network framework. Specifically, we assume the traffic flow on road networks is driven by a latent potential energy field (like water flows are driven by the gravity field), and model the spatio-temporal dynamic process of the potential energy field as a differential equation network. STDEN absorbs both the performance advantage of data-driven models and the interpretability of physics-based models, so is named a physics-guided prediction model. Experiments on three real-world traffic datasets in Beijing show that our model outperforms state-of-the-art baselines by a significant margin. A case study further verifies that STDEN can capture the mechanism of urban traffic and generate accurate predictions with physical meaning. The proposed framework of differential equation network modeling may also cast light on other similar applications.
Title: Unsupervised EHR-based Phenotyping via Matrix and Tensor Decompositions. (arXiv:2209.00322v1 [cs.LG])
- Paper URL: http://arxiv.org/abs/2209.00322
- Code URL: null
- Copy Paste:
[[2209.00322] Unsupervised EHR-based Phenotyping via Matrix and Tensor Decompositions](http://arxiv.org/abs/2209.00322)
- Summary:
Computational phenotyping allows for unsupervised discovery of subgroups of patients as well as corresponding co-occurring medical conditions from electronic health records (EHR). Typically, EHR data contains demographic information, diagnoses and laboratory results. Discovering (novel) phenotypes has the potential to be of prognostic and therapeutic value. Providing medical practitioners with transparent and interpretable results is an important requirement and an essential part for advancing precision medicine. Low-rank data approximation methods such as matrix (e.g., non-negative matrix factorization) and tensor decompositions (e.g., CANDECOMP/PARAFAC) have demonstrated that they can provide such transparent and interpretable insights. Recent developments have adapted low-rank data approximation methods by incorporating different constraints and regularizations that facilitate interpretability further. In addition, they offer solutions for common challenges within EHR data such as high dimensionality, data sparsity and incompleteness. Especially extracting temporal phenotypes from longitudinal EHR has received much attention in recent years. In this paper, we provide a comprehensive review of low-rank approximation-based approaches for computational phenotyping. The existing literature is categorized into temporal vs. static phenotyping approaches based on matrix vs. tensor decompositions. Furthermore, we outline different approaches for the validation of phenotypes, i.e., the assessment of clinical significance.