Feature selection is an important step in building a good machine learning model. One technique that helps us select features is checking the correlation between the different features of the dataset. In this article we will discuss the following:
The Pearson correlation coefficient measures the strength and direction of the linear relationship between two continuous variables. It is denoted by r and ranges from -1 to 1. The mathematical formula to calculate the correlation coefficient is given by:
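The standard definition divides the sum of the products of the deviations from the means by the square root of the product of the summed squared deviations: r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² · Σ(y − ȳ)²). The snippet below is a minimal sketch of that definition in plain Python. The function name `pearson_r` and the sample values are assumptions made up for illustration, not something taken from the article.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)

    # Deviations of every observation from its variable's mean.
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]

    # Numerator: sum of the products of paired deviations.
    numerator = sum(a * b for a, b in zip(dx, dy))
    # Denominator: square root of the product of the summed squared deviations.
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

    if denominator == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return numerator / denominator


# Hypothetical sample data, made up for illustration only.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 70, 72]
print(round(pearson_r(hours_studied, exam_score), 3))
```

A value close to 1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one. In practice a library routine such as `numpy.corrcoef` or `pandas.DataFrame.corr` would usually be used instead of a hand-written function.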
We have all come across the terms outliers and skewness while cleaning or preprocessing data for a project. In this article we will discuss the following points:
Skewness is the asymmetry in the distribution of continuous data: a skewed distribution is not symmetric about its center. Skewed data can be of two types: right-skewed data, also called positively skewed data, where the longer tail extends toward larger values, and left-skewed data, also called negatively skewed data, where the longer tail extends toward smaller values.
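One common numeric measure of this asymmetry is the moment coefficient of skewness, the third central moment divided by the second central moment raised to the power 3/2. The sketch below computes it in plain Python. The function name `sample_skewness` and the data are hypothetical, and this is only one of several skewness definitions in use, so it should be read as an illustration and not as the article's own method.

```python
def sample_skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5 (population form)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")

    mean = sum(values) / n
    # Second and third central moments.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n

    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Hypothetical data, made up for illustration only.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]     # one large value stretches the right tail
left_skewed = [-v for v in right_skewed]     # mirror image stretches the left tail
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_skewed), ("left", left_skewed), ("symmetric", symmetric)]:
    print(name, round(sample_skewness(data), 3))
```

A positive result points to right-skewed data, a negative result to left-skewed data, and a result near zero to a roughly symmetric distribution.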
Identification of skewness can be easily done…
Computer Science Undergraduate, Machine Learning and Deep Learning Enthusiast.