Posts

Showing posts from March, 2023

Effective ways to Handle Missing Values in Data - Part 1

Introduction

The real world is not perfect, and neither is real-world data. It often contains many missing values, which can lead to biased and inaccurate results if they are not handled properly. In addition, many machine-learning algorithms do not support missing values in the data. Therefore, it is important to address missing values appropriately.

In this article, we will discuss different methods of missing value imputation. In the next part of the article, we will explain these concepts using Python code. We will divide the methods into two groups:

1. Traditional Imputation Methods
2. Advanced Imputation Methods

Traditional Imputation Methods

1. Imputation with Mean, Median, or Mode

The missing values are imputed with the mean or median of the non-missing values in the same column. If there are outliers in the data, we should use the median instead of the mean, as the median is more robust to outliers. Likewise, for catego...
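
To make the mean versus median point above concrete, here is a minimal sketch using only the Python standard library. The `impute` helper and the `incomes` column are hypothetical, introduced purely for illustration, and missing entries are assumed to be represented as `None`.

```python
from statistics import mean, median


def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    # Keep observed values as they are; fill only the missing positions.
    return [fill if v is None else v for v in values]


# Hypothetical column with one outlier (1000) and two missing entries.
incomes = [30, 32, None, 35, 1000, None, 31]

mean_filled = impute(incomes, "mean")      # fill value is pulled up by the outlier
median_filled = impute(incomes, "median")  # fill value stays near the typical entries

print(mean_filled)
print(median_filled)
```

In this toy column, the single outlier drags the mean far above the typical values, while the median remains close to them, which is why the median is usually the safer fill value when outliers are present.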