# Install from CRAN
install.packages("MLCOPULA")
MLCOPULA
Overview
MLCOPULA is a package that provides several classifiers based on probabilistic models. These classifiers model the dependence structure of continuous features through bivariate copula functions and graphical models; see Salinas-Gutiérrez et al. (2014).
The package has been published and is available in the official R CRAN repository: https://CRAN.R-project.org/package=MLCOPULA
Methodology
This package implements 7 copulas for supervised classification: frank, gaussian, clayton, joe, gumbel, AMH and grid. The classification model is based on Bayes' theorem, as in the naive Bayes classifier, but it does not assume that the features are independent.
The probability of a class \(A\) given a set of features (predictor variables) is:
\[P(A|x_1,\ldots,x_d) \propto P(A)\, c(u_1,\ldots,u_d) \prod_{i = 1}^{d}f_{X_i|A}(x_i)\]
where each \(u_i = F_{X_i|A}(x_i)\) with \(i = 1,2,\ldots,d\), and \(P(A)\) is the prior probability of the class.
The copula density function \(c(u_1,\ldots,u_d)\) is modeled by bivariate copula functions, using graphical models (trees and chains).
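As a rough illustration of the rule above, the following base-R sketch scores two iris features per class with normal marginals and a Gaussian copula. It does not use MLCOPULA: the helper names (`gauss_copula_density`, `class_score`), the normal marginals, the empirical class priors, the Pearson correlation as copula parameter and the clamping constant are all assumptions made for this example, and the package may estimate each piece differently.

```r
# Sketch only: base R, not the MLCOPULA implementation.
X <- as.matrix(iris[, c("Petal.Length", "Petal.Width")])
y <- iris$Species

# Hypothetical helper: density of the bivariate Gaussian copula with correlation rho.
gauss_copula_density <- function(u1, u2, rho) {
  z1 <- qnorm(u1)
  z2 <- qnorm(u2)
  exp(-(rho^2 * (z1^2 + z2^2) - 2 * rho * z1 * z2) / (2 * (1 - rho^2))) /
    sqrt(1 - rho^2)
}

# Hypothetical helper: unnormalised score P(A) * f_1(x_1) * f_2(x_2) * c(u_1, u_2)
# for every row of X under class `cls`.
class_score <- function(cls) {
  Xc    <- X[y == cls, , drop = FALSE]
  mu    <- colMeans(Xc)
  s     <- apply(Xc, 2, sd)
  rho   <- cor(Xc[, 1], Xc[, 2])   # assumption: Pearson correlation as copula parameter
  prior <- mean(y == cls)          # assumption: empirical class frequencies
  f1 <- dnorm(X[, 1], mu[1], s[1]) # assumption: normal marginal densities
  f2 <- dnorm(X[, 2], mu[2], s[2])
  # Clamp u away from 0 and 1 so that qnorm() stays finite.
  eps <- 1e-10
  u1 <- pmin(pmax(pnorm(X[, 1], mu[1], s[1]), eps), 1 - eps)
  u2 <- pmin(pmax(pnorm(X[, 2], mu[2], s[2]), eps), 1 - eps)
  prior * f1 * f2 * gauss_copula_density(u1, u2, rho)
}

scores    <- sapply(levels(y), class_score)  # one column per class
posterior <- scores / rowSums(scores)        # normalise so each row sums to 1
y_hat     <- factor(levels(y)[apply(posterior, 1, which.max)], levels = levels(y))
mean(y_hat == y)                             # in-sample accuracy
```

Setting the copula density to 1 in `class_score` would recover the naive Bayes score, which is the sense in which the copula term relaxes the independence assumption.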
Copulas
Frank copula:
\[C(u_1,u_2;\theta) = -\frac{1}{\theta} \ln \left[ 1 + \frac{(e^{-\theta u_1} - 1) (e^{-\theta u_2} - 1) } {e^{-\theta} - 1} \right]\]
with \(\theta \in (-\infty,\infty) \setminus \{0\}\)
This copula has neither upper nor lower tail dependence.
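The Frank formula can be transcribed directly into R to check a few of its properties numerically. This is a minimal sketch; `frank_cdf` is a name chosen for this example and is not a function of the package.

```r
# Frank copula CDF, transcribed from the formula above (theta must be non-zero).
frank_cdf <- function(u1, u2, theta) {
  stopifnot(theta != 0)
  num <- (exp(-theta * u1) - 1) * (exp(-theta * u2) - 1)
  -log(1 + num / (exp(-theta) - 1)) / theta
}

frank_cdf(0.3, 1, theta = 5)        # uniform margin: C(u, 1) = u
frank_cdf(0.3, 0.7, theta = 1e-4)   # theta near 0: close to independence, 0.3 * 0.7
frank_cdf(0.3, 0.7, theta = 5)      # theta > 0: positive dependence, above 0.3 * 0.7
```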
Clayton copula:
\[C(u_1,u_2;\theta) = \left( u_1^{-\theta} + u_2^{-\theta} - 1 \right)^{-1/\theta}\]
with \(\theta \in [-1,\infty) \setminus \{0\}\)
When \(\theta > 0\), this copula has lower tail dependence equal to \(\lambda_L = 2^{-1/\theta}\)
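The lower tail dependence coefficient is the limit of \(C(q,q)/q\) as \(q \to 0\), so the closed form can be compared with that ratio at a small \(q\). The sketch below is restricted to \(\theta > 0\); `clayton_cdf` is a name chosen for this example.

```r
# Clayton copula CDF for theta > 0, transcribed from the formula above.
clayton_cdf <- function(u1, u2, theta) {
  stopifnot(theta > 0)   # sketch restricted to the positively dependent case
  (u1^(-theta) + u2^(-theta) - 1)^(-1 / theta)
}

theta <- 2
q     <- 1e-6
clayton_ratio  <- clayton_cdf(q, q, theta) / q   # finite-q ratio C(q, q) / q
clayton_lambda <- 2^(-1 / theta)                 # closed-form lambda_L
c(ratio = clayton_ratio, closed_form = clayton_lambda)
```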
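Gaussian (normal) copula:
\[C(u_1,u_2;\theta) = \Phi_G (\Phi^{-1} (u_1) , \Phi^{-1} (u_2) )\]
with \(\theta \in (-1,1)\), where \(\Phi^{-1}\) is the standard normal quantile function and \(\Phi_G\) is the bivariate standard normal distribution function with correlation \(\theta\).
This copula has neither upper nor lower tail dependence.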
Joe copula:
\[C(u_1,u_2;\theta) = 1 - \left[ (1 - u_1)^\theta + (1 - u_2)^\theta - (1 - u_1)^\theta (1 - u_2)^\theta \right ] ^ {1/\theta}\]
with \(\theta \in [1,\infty)\)
This copula has upper tail dependence equal to \(\lambda_U = 2 - 2^{1/\theta}\)
Gumbel copula:
\[C(u_1,u_2;\theta) = \exp \left[ - \left[ ( -\ln(u_1) )^\theta + ( -\ln(u_2) )^\theta \right]^{1/\theta} \right]\]
with \(\theta \in [1,\infty)\)
This copula has upper tail dependence equal to \(\lambda_U = 2 - 2^{1/\theta}\)
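The Joe and Gumbel copulas share the same upper tail dependence coefficient, which is the limit of \((1 - 2q + C(q,q))/(1-q)\) as \(q \to 1\). The sketch below evaluates that ratio close to 1 for both; the function names are chosen for this example and are not part of the package.

```r
# Joe and Gumbel copula CDFs, transcribed from the formulas above.
joe_cdf <- function(u1, u2, theta) {
  a <- (1 - u1)^theta
  b <- (1 - u2)^theta
  1 - (a + b - a * b)^(1 / theta)
}
gumbel_cdf <- function(u1, u2, theta) {
  exp(-((-log(u1))^theta + (-log(u2))^theta)^(1 / theta))
}

# Finite-q version of lambda_U = lim_{q -> 1} (1 - 2q + C(q, q)) / (1 - q).
upper_tail_ratio <- function(cdf, theta, q = 1 - 1e-5) {
  (1 - 2 * q + cdf(q, q, theta)) / (1 - q)
}

theta        <- 2
joe_ratio    <- upper_tail_ratio(joe_cdf, theta)
gumbel_ratio <- upper_tail_ratio(gumbel_cdf, theta)
lambda_u     <- 2 - 2^(1 / theta)   # closed form shared by both copulas
c(joe = joe_ratio, gumbel = gumbel_ratio, closed_form = lambda_u)
```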
Ali–Mikhail–Haq copula:
\[C(u_1,u_2;\theta) = \frac{u_1 u_2}{1 - \theta (1 - u_1)(1- u_2)}\]
with \(\theta \in [-1,1)\)
This copula has neither upper nor lower tail dependence.
Installation
The released version can be installed from CRAN with install.packages("MLCOPULA").
Quick Example
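library(MLCOPULA)
# Fit the copula-based classifier on the four iris measurements
model <- copulaClassifier(X = iris[,1:4], y = iris$Species)
# Predict the class of the same observations (in-sample)
y_pred <- copulaPredict(X = iris[,1:4], model = model)
classification_report(iris$Species, y_pred$class)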
$metrics
           precision recall f1-score
setosa          1.00   1.00     1.00
versicolor      0.98   0.98     0.98
virginica       0.98   0.98     0.98

$confusion_matrix
            y_pred
y_true       setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         49         1
  virginica       0          1        49

$accuracy
[1] 0.9866667

$mutual_information
[1] 1.033253
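The reported metrics can be recomputed by hand from the confusion matrix above, which may help when interpreting the report. The sketch below uses base R only. The mutual information formula (natural logarithm, between true and predicted labels) is an assumption about how the package defines it; it happens to reproduce the printed value.

```r
# Confusion matrix copied from the report above (rows: y_true, columns: y_pred).
classes <- c("setosa", "versicolor", "virginica")
cm <- matrix(c(50,  0,  0,
                0, 49,  1,
                0,  1, 49),
             nrow = 3, byrow = TRUE,
             dimnames = list(y_true = classes, y_pred = classes))

precision <- diag(cm) / colSums(cm)   # correct predictions / all predictions of the class
recall    <- diag(cm) / rowSums(cm)   # correct predictions / all true members of the class
f1        <- 2 * precision * recall / (precision + recall)
accuracy  <- sum(diag(cm)) / sum(cm)

# Assumption: mutual information in nats between true and predicted labels.
p_joint <- cm / sum(cm)
p_indep <- outer(rowSums(p_joint), colSums(p_joint))
nz      <- p_joint > 0                # skip empty cells to avoid log(0)
mutual_information <- sum(p_joint[nz] * log(p_joint[nz] / p_indep[nz]))

round(cbind(precision, recall, `f1-score` = f1), 2)
accuracy
mutual_information
```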