[In Depth] Linear Discriminant Analysis: Concepts And Application

by Amritesh Kumar
December 15, 2023 - Updated on December 29, 2023
in Machine Learning
Reading Time: 27 mins read

In the previous post, we learned about PCA and how to use it in machine learning for various applications. In this tutorial, we are going to learn about linear discriminant analysis (LDA). In PCA we project the higher dimensional data to lower dimensions while considering only the features of the data; in LDA we also consider the classes of the data.

This means LDA allows us to project data in such a manner that the classes become linearly well separable. Since labels are involved, LDA is a supervised learning algorithm. So, let’s get started. If you have any questions, please ask in the forum/community support.

Table of Contents

  • Prerequisites
  • What You Will Learn
  • What Is Linear Discriminant Analysis?
  • Derivation Of LDA:
  • Numpy & Scikit Learn Implementation:
    • Example 1 – PCA Vs LDA
    • Example 2 – Just Using Numpy
    • Example 3 – On The Wine Dataset
  • Limitations Of LDA
  • Applications Of LDA:
  • Footnotes & Further Readings:

Prerequisites

  • Understanding of Linear Algebra Until PCA
  • Familiarity with statistical concepts such as variance, covariance and correlation
  • Python, Numpy & Scikit-Learn

What You Will Learn

  1. LDA concepts
  2. LDA derivation
  3. LDA applications

What Is Linear Discriminant Analysis?

LDA is a dimensionality reduction technique in which we consider the class labels. In other words, we try to preserve as much of the class-discriminatory information as possible while performing dimensionality reduction.

Definition:

The goal of the LDA technique is to project the original data matrix onto a lower dimensional space. To achieve this goal, three steps need to be performed. The first step is to calculate the separability between different classes (i.e. the distance between the means of different classes), which is called the between-class variance or between-class matrix.

The second step is to calculate the distance between the mean and the samples of each class, which is called the within-class variance or within-class matrix. The third step is to construct the lower dimensional space which maximizes the between-class variance and minimizes the within-class variance.

Alaa Tharwat 1

Fisher Linear Discriminant Analysis (also called Linear Discriminant Analysis (LDA)) is a method used in statistics, pattern recognition and machine learning to find a linear combination of features which characterizes or separates two or more classes of objects or events. The resulting combination may be used as a linear classifier, or, more commonly, for dimensionality reduction before later classification.

LDA is closely related to PCA, for both of them are based on linear, i.e. matrix multiplication, transformations. For the case of PCA, the transformation is based on minimizing the mean square error between original data vectors and data vectors that can be estimated from the reduced dimensionality data vectors, and PCA does not take into account any difference in class. But for the case of LDA, the transformation is based on maximizing a ratio of “between-class variance” to “within-class variance” with the goal of reducing data variation in the same class and increasing the separation between classes.

Cheng Li, Bingyu Wang2
[Figure: PCA projection, where classes may not be clearly separated after the projection]
[Figure: LDA projection. In LDA we try to find a line that maximizes the class separability after projection. This is what we set out to find. 3]

One way of doing that is by maximizing the distance between the projected means of each class. But simply maximizing the mean distance is not sufficient, as it does not take into account the variance (the spread) within the classes.


For this, we need a measure of within-class variability called scatter. It is equivalent to variance with the 1/n factor removed: the scatter is defined as the sum of squared differences between the projected samples and their class mean. Furthermore, we define the within-class scatter matrix (SW), which measures the spread of data points within each class, and the between-class scatter matrix (SB), which measures the spread of the class means.

SB reflects how much the class means deviate from the overall mean. The within-class scatter matrix is calculated by summing up the individual scatter matrices of each class. Detailed formulas for the two are provided in the derivation below. I have provided the derivation for two classes, but the same principle applies to more classes.
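For quick reference (using the notation of the derivation below: μc is the mean of class c, Nc the number of its samples, and μ the overall mean), the standard definitions of the two scatter matrices, and the Fisher criterion that LDA maximizes, are:

S_W = \sum_{c=1}^{C} \sum_{x_i \in c} (x_i - \mu_c)(x_i - \mu_c)^T

S_B = \sum_{c=1}^{C} N_c (\mu_c - \mu)(\mu_c - \mu)^T

J(w) = \frac{w^T S_B w}{w^T S_W w}

For the two-class case, SB reduces to (μ1 − μ2)(μ1 − μ2)^T, and the optimal projection direction is w ∝ SW⁻¹(μ1 − μ2).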

Derivation Of LDA:

Here is the detailed derivation of LDA:

[Derivation figures: Pages 1 through 4]

Solving the optimization leads to the generalized eigenvalue problem SB w = λ SW w, i.e. we compute the eigenvalues and eigenvectors of the transformation matrix W = SW⁻¹ SB. The eigenvectors of W represent the directions of the new space, and the corresponding eigenvalues represent the scaling factor, length, or magnitude of the eigenvectors.

Each eigenvector represents one axis of the LDA space, and the associated eigenvalue represents the robustness of this eigenvector. The robustness of the eigenvector reflects its ability to discriminate between different classes, i.e. increase the between-class variance, and decrease the within-class variance of each class; hence meets the LDA goal. Thus, the eigenvectors with the k highest eigenvalues are used to construct a lower dimensional space (Vk), while the other eigenvectors ({vk+1, vk+2, …, vM}) are neglected. 4
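As a practical aside, this generalized eigenvalue problem can be solved without explicitly inverting SW, which is numerically safer when SW is ill-conditioned. Here is a minimal sketch using scipy.linalg.eigh, which handles SB v = λ SW v directly; the two 2×2 matrices are made-up placeholders for illustration:

import numpy as np
from scipy.linalg import eigh

# Made-up example scatter matrices (SB symmetric PSD, SW positive definite)
SB = np.array([[4.0, 2.0],
               [2.0, 1.0]])
SW = np.array([[2.0, 0.5],
               [0.5, 1.5]])

# Solve SB v = lambda SW v; eigh returns eigenvalues in ascending order,
# so reverse to get the most discriminative directions first
eig_vals, eig_vecs = eigh(SB, SW)
order = np.argsort(eig_vals)[::-1]
eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]

print("Eigenvalues:", eig_vals)
print("Top eigenvector:", eig_vecs[:, 0])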

[Figure: Steps involved in LDA 5]

Numpy & Scikit Learn Implementation:

I have provided the code below. Make sure to type it out yourself, since the content cannot be copied. Outputs are not included.

Example 1 – PCA Vs LDA

import numpy as np
import matplotlib.pyplot as plt

# Two features (rows) for 14 samples; transpose to shape (14, 2)
X = np.array([[0, 1, 2, 3, 4, 5, 1, 2, 3, 3, 5, 6, 7, 8],
              [1, 2, 3, 3, 5, 5, 0, 1, 1, 2, 3, 5, 6, 6]])
y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1])
X = X.T

# plot the data
plt.scatter(X[:, 0], X[:, 1], c = y);

First, we will apply PCA to check how it is different from LDA. You will see that it does not separate the classes.

# Apply PCA with a single component
from sklearn.decomposition import PCA
pca = PCA(n_components = 1)
pca.fit(X)
Xr = pca.transform(X)
print(Xr)

# PCA projection: plotting the 1-D projection against itself places
# the points along a diagonal; note the class overlap
plt.scatter(Xr[:, 0], Xr[:, 0], c = y);

With LDA, by contrast, the classes become clearly linearly separable:

# Apply LDA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
lda = LinearDiscriminantAnalysis()
X_lda = lda.fit_transform(X, y)
plt.scatter(X_lda[:, 0], X_lda[:, 0], c = y);

Example 2 – Just Using Numpy

Now we will see an end-to-end example without using Sklearn.

# Generate simple 2D dataset with two classes
np.random.seed(42)
class1_data = np.random.randn(50, 2) + np.array([2, 2])
class2_data = np.random.randn(50, 2) + np.array([5, 5])

# Plot data
plt.scatter(class1_data[:, 0], class1_data[:, 1], label='Class 1')
plt.scatter(class2_data[:, 0], class2_data[:, 1], label='Class 2')
plt.title('Generated Data')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.show()
# Step 1: Calculate Class Means
mean_class1 = np.mean(class1_data, axis=0)
mean_class2 = np.mean(class2_data, axis=0)

print("Mean of Class 1:", mean_class1)
print("Mean of Class 2:", mean_class2)Code language: PHP (php)
# Step 2: Calculate Within-Class Scatter Matrix (SW)
# Note: np.cov normalizes by (n - 1), so this is proportional to the
# scatter matrix; the scaling does not change the eigenvector directions
cov_class1 = np.cov(class1_data.T)
cov_class2 = np.cov(class2_data.T)
SW = cov_class1 + cov_class2

print("Within-Class Scatter Matrix (SW):\n", SW)
# Step 3: Calculate Between-Class Scatter Matrix (SB)
mean_diff = (mean_class1 - mean_class2).reshape(2, 1)
SB = np.dot(mean_diff, mean_diff.T)

print("Between-Class Scatter Matrix (SB):\n", SB)Code language: PHP (php)
# Step 4: Solve the Generalized Eigenvalue Problem

# Compute the eigenvalues and eigenvectors
eig_vals, eig_vecs = np.linalg.eig(np.linalg.inv(SW).dot(SB))

# Sort eigenvalues and corresponding eigenvectors
sorted_indices = np.argsort(eig_vals)[::-1]
eig_vals = eig_vals[sorted_indices]
eig_vecs = eig_vecs[:, sorted_indices]

print("Eigenvalues:\n", eig_vals)
print("Eigenvectors:\n", eig_vecs)Code language: PHP (php)
# Step 5: Choose Top Eigenvector and Project Data
# Choose the top eigenvector for projection (LDA dimensionality reduction)
W = eig_vecs[:, 0]

# Project the data onto the new feature subspace
lda_result_class1 = class1_data.dot(W)
lda_result_class2 = class2_data.dot(W)

# Plot the LDA results with correct decision boundary
plt.scatter(lda_result_class1, np.zeros_like(lda_result_class1), label='Class 1', alpha=0.7)
plt.scatter(lda_result_class2, np.zeros_like(lda_result_class2), label='Class 2', alpha=0.7)

# Threshold at the midpoint between the projected class means
# (a reasonable linear boundary when the classes are balanced)
decision_boundary = (mean_class1.dot(W) + mean_class2.dot(W)) / 2
plt.axvline(x=decision_boundary, color='black', linestyle='--', label='Decision Boundary')

plt.title('LDA Projection Results with Decision Boundary')
plt.xlabel('LDA Projection Value')
plt.legend()
plt.show()
# Project the data onto the top two eigenvectors. With two classes,
# SB has rank 1, so the second eigenvalue is ~0 and eigenvector 2
# carries essentially no class-discriminatory information
W1 = eig_vecs[:, 0]
W2 = eig_vecs[:, 1]

lda_result_class1_W1 = class1_data.dot(W1)
lda_result_class2_W1 = class2_data.dot(W1)

lda_result_class1_W2 = class1_data.dot(W2)
lda_result_class2_W2 = class2_data.dot(W2)

# Plot the LDA results for the top two eigenvectors
plt.figure(figsize=(12, 4))

plt.subplot(1, 2, 1)
plt.scatter(lda_result_class1_W1, np.zeros_like(lda_result_class1_W1), label='Class 1', alpha=0.7)
plt.scatter(lda_result_class2_W1, np.zeros_like(lda_result_class2_W1), label='Class 2', alpha=0.7)
plt.axvline(x=decision_boundary, color='black', linestyle='--', label='Decision Boundary')
plt.title('LDA Projection - Eigenvector 1')
plt.xlabel('LDA Projection Value')
plt.legend()

plt.subplot(1, 2, 2)
plt.scatter(lda_result_class1_W2, np.zeros_like(lda_result_class1_W2), label='Class 1', alpha=0.7)
plt.scatter(lda_result_class2_W2, np.zeros_like(lda_result_class2_W2), label='Class 2', alpha=0.7)
plt.axvline(x=0, color='black', linestyle='--', label='Decision Boundary')
plt.title('LDA Projection - Eigenvector 2')
plt.xlabel('LDA Projection Value')
plt.legend()

plt.tight_layout()
plt.show()

Example 3 – On The Wine Dataset

  • Number of Instances: The dataset consists of a total of 178 instances.
  • Number of Features: There are 13 attributes (features) in the dataset, representing various chemical properties of the wines.
  • Classes: The dataset is divided into three classes, each corresponding to a different type of wine. The classes are labelled 1, 2, and 3.

Since there are 3 classes, LDA can project onto at most C − 1 = 2 dimensions. We will do that here.

from sklearn.datasets import load_wine
wine = load_wine()
X = np.array(wine.data)
y = np.array(wine.target)
print(X[1:5, :])
print(y)
wine.feature_names
# Apply PCA - Check the output 
from sklearn.decomposition import PCA
pca = PCA(n_components = 2)
result = pca.fit(X)
Z = result.transform(X)
plt.scatter(Z[:,0], Z[:,1], c = y);
# Apply LDA - Check the output and compare it with PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
lda = LinearDiscriminantAnalysis()
X_lda = lda.fit_transform(X, y)
plt.scatter(X_lda[:, 0], X_lda[:, 1], c = y);

You can also use LDA as a classifier. To do that, simply create a train and a test set and see how it performs; you can use the predict function provided by Sklearn. If you have any further queries, feel free to ask in community support.
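For instance, here is a minimal sketch of that workflow, continuing with the X and y wine variables defined above (the 80/20 split ratio and the random seed are my own arbitrary choices):

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Hold out 20% of the wine data for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Fit LDA as a classifier and score it on the held-out set
clf = LinearDiscriminantAnalysis()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Test accuracy:", accuracy_score(y_test, y_pred))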

Limitations Of LDA

  1. In Linear Discriminant Analysis (LDA) for a classification problem with C classes, the maximum number of discriminant functions (or feature projections) that can be derived is C−1. Each discriminant function represents a direction in the feature space that maximizes the separation between classes.
  2. LDA assumes that the decision boundaries separating different classes are linear.
  3. LDA assumes that the feature vectors for each class follow a multivariate Gaussian distribution.
  4. It can be sensitive to situations where the discriminatory information lies primarily in the variance of the data rather than in the mean differences between classes (see the sketch after this list).
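To illustrate the last limitation, here is a minimal sketch on synthetic data of my own choosing (not from the tutorial): two classes share the same mean but differ in spread, so the projected class means nearly coincide and LDA barely separates them:

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Two classes with the same mean but very different spread
rng = np.random.default_rng(0)
class_a = rng.normal(loc=0.0, scale=0.5, size=(200, 2))  # tight cluster
class_b = rng.normal(loc=0.0, scale=3.0, size=(200, 2))  # wide cluster

X_demo = np.vstack([class_a, class_b])
y_demo = np.array([0] * 200 + [1] * 200)

lda_demo = LinearDiscriminantAnalysis()
z = lda_demo.fit_transform(X_demo, y_demo)

# The projected means nearly coincide, so the classes overlap heavily
print("Projected mean, class 0:", z[y_demo == 0].mean())
print("Projected mean, class 1:", z[y_demo == 1].mean())
print("Training accuracy:", lda_demo.score(X_demo, y_demo))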

Applications Of LDA:


Linear Discriminant Analysis (LDA) has various applications across different domains. Here are a few of its applications:

  1. In medical research, LDA can be applied to distinguish between different patient groups or to identify patterns in medical data for disease diagnosis.
  2. LDA can be utilized in speech recognition systems to model and classify speech signals based on discriminant features.
  3. LDA is employed in biometric identification systems, such as fingerprint recognition, where it helps in extracting discriminant features for accurate identification.
  4. LDA can be used in human-computer interaction applications for gesture recognition by finding discriminant features in sensor data.
  5. In finance, LDA can be applied to credit scoring or fraud detection by distinguishing between different risk or fraud categories based on financial features.
  6. LDA can be used in genomics to analyze gene expression data and classify samples into different biological conditions or disease states.
  7. In market research, LDA can assist in segmenting customers or products based on various attributes, aiding in targeted marketing strategies.

Footnotes & Further Readings:

  1. Alaa Tharwat, Linear Discriminant Analysis: A Detailed Tutorial, Department of Computer Science and Engineering, Frankfurt University of Applied Sciences, Frankfurt am Main, Germany.
  2. Cheng Li and Bingyu Wang, Fisher Linear Discriminant Analysis, August 31, 2014.
  3. Shireen Elhabian and Aly A. Farag, University of Louisville, CVIP Lab, September 2009.
  4. Same as 1.
  5. Same as 1.
Tags: dimensionality reduction, Matrix Factorization, PCA, Supervised Learning, SVD