NaN, or Not a Number, is a special floating-point value that represents an undefined or unrepresentable result. It typically arises when a calculation has no meaningful numerical value, such as dividing zero by zero or taking the square root of a negative number. In Pandas, NaN is also the marker for missing data.
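As a small illustration of how NaN arises and behaves, the sketch below uses NumPy floats (plain Python raises “ZeroDivisionError” for 0/0 instead of producing NaN). The variable names are made up for this example.

```python
import math
import numpy as np

# Suppress the "invalid value" warnings NumPy emits for these operations.
with np.errstate(invalid="ignore"):
    zero_over_zero = np.float64(0.0) / np.float64(0.0)  # NaN
    sqrt_negative = np.sqrt(np.float64(-1.0))           # NaN

# NaN never compares equal to anything, including itself,
# so a dedicated check such as math.isnan() or numpy.isnan() is needed.
nan_equals_itself = bool(zero_over_zero == zero_over_zero)
both_are_nan = math.isnan(zero_over_zero) and bool(np.isnan(sqrt_negative))

print(zero_over_zero, sqrt_negative, nan_equals_itself, both_are_nan)
```

This is why missing values are located with helpers such as “isna()” rather than with an equality comparison.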
To replace NaN values with zeros, Pandas provides methods such as “dataframe.fillna()” and “dataframe.replace()”.
This post covers several ways to replace NaN values with zeros in a Pandas DataFrame.
- Method 1: Using “df.fillna()” Function
- For Specific Column
- For an Entire DataFrame
- For Multiple Column DataFrame
- Method 2: Using “df.replace()” Function
- For Specific Column
- For an Entire DataFrame
- For Multiple Column DataFrame
Method 1: Using “df.fillna()” Function
The “dataframe.fillna()” method of the Pandas module is used to replace the Null/NaN values with a specific value. The syntax of the “df.fillna()” method is shown below:
df.fillna(value, method, axis, inplace, limit, downcast)
In the above syntax:
- The “value” parameter represents a value that fills/replaces missing values.
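- The optional parameter “method” specifies a fill strategy, such as forward fill or backward fill, to use instead of a fixed value. It has been deprecated since pandas 2.1 in favor of “df.ffill()” and “df.bfill()”, and it is not needed when filling with zeros.
- The optional parameter “axis” indicates the axis along which missing values are filled: 0 (or 'index') fills down each column, and 1 (or 'columns') fills across each row.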
- The other optional parameters are rarely used while replacing NaN values with zeros in DataFrame.
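The following is a minimal sketch of how the “value” and “inplace” parameters behave. The two-column frame and its column names “a” and “b” are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame used only for this illustration.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 2.0, np.nan]})

# A scalar "value" fills every missing cell and returns a new DataFrame.
filled_all = df.fillna(0)

# A dict "value" maps column names to fill values;
# columns that are not listed keep their NaN values.
filled_a_only = df.fillna({"a": 0})

# With inplace=True the frame is modified directly, so no reassignment is needed.
df_copy = df.copy()
df_copy.fillna(0, inplace=True)

print(filled_all)
print(filled_a_only)
print(df_copy)
```

Note that “df” itself is unchanged by the first two calls, because “fillna()” returns a new object unless “inplace=True” is passed.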
The following examples demonstrate how to replace NaN values with zeros.
Example 1: Replace NaN Values With Zeros for Specific Column
The code below replaces the NaN values with “0” for a specific column in a Pandas DataFrame. Because the column contains NaN, its dtype is float64, so the zeros are displayed as “0.0”:
Code:
import pandas
import numpy

# Build a DataFrame whose 'Salary' column contains missing values
data_frame = pandas.DataFrame({'Name': ['Joseph', 'Henry', 'Alex', 'Jon'],
                               'Salary': [1450, numpy.nan, 5215, numpy.nan]})
print(data_frame)

# Replace NaN with 0 in the 'Salary' column only
data_frame['Salary'] = data_frame['Salary'].fillna(0)
print('\n', data_frame)
- The “pandas” and “numpy” modules are imported.
- The “pandas.DataFrame()” function creates the DataFrame.
- The “fillna(0)” method is called on the “Salary” column to replace its “NaN” values with 0, and the result is assigned back to that column.
Output:
The NaN values have been replaced by 0.
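One way to confirm the replacement without reading the printed table is to count the missing values before and after. The sketch below repeats the example data; the final cast to integers is an optional extra step, not part of the original example, and the names “missing_before” and “missing_after” are made up here.

```python
import numpy as np
import pandas as pd

data_frame = pd.DataFrame({"Name": ["Joseph", "Henry", "Alex", "Jon"],
                           "Salary": [1450, np.nan, 5215, np.nan]})

# Count the missing salaries before and after filling.
missing_before = int(data_frame["Salary"].isna().sum())
data_frame["Salary"] = data_frame["Salary"].fillna(0)
missing_after = int(data_frame["Salary"].isna().sum())

# The column is still float64 after filling; cast it if whole numbers are wanted.
data_frame["Salary"] = data_frame["Salary"].astype(int)

print(missing_before, missing_after)
print(data_frame)
```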
Example 2: Replace NaN Values With Zeros for an Entire DataFrame
The code below replaces the NaN values with “0” for an entire DataFrame:
Code:
import pandas
import numpy

# Both columns contain missing values
data_frame = pandas.DataFrame({'Name': ['Joseph', numpy.nan, 'Alex', numpy.nan],
                               'Age': [22, numpy.nan, 24, numpy.nan]})
print(data_frame)

# Replace every NaN in the DataFrame with 0
data_frame = data_frame.fillna(0)
print('\n', data_frame)
Calling “fillna(0)” on the DataFrame itself replaces every “NaN” value in every column with 0, including the missing entries in the text column “Name”.
Output:
The NaN values of the entire DataFrame have been replaced/substituted with “0”.
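Filling the whole DataFrame also puts a numeric 0 into text columns such as “Name”, which may not be what is wanted. As a hedged alternative, and not part of the original example, the sketch below limits the fill to numeric columns by selecting them with “select_dtypes()”; the variable name “numeric_columns” is made up for this illustration.

```python
import numpy as np
import pandas as pd

data_frame = pd.DataFrame({"Name": ["Joseph", np.nan, "Alex", np.nan],
                           "Age": [22, np.nan, 24, np.nan]})

# Pick only the numeric columns (here just "Age").
numeric_columns = data_frame.select_dtypes(include="number").columns

# Fill NaN with 0 in those columns and leave text columns untouched.
data_frame[numeric_columns] = data_frame[numeric_columns].fillna(0)

print(data_frame)
```

After this runs, “Age” contains no missing values, while the two missing names are still marked as missing.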
Example 3: Replace NaN Values With Zeros for Multiple Column DataFrame
To replace the NaN values of multiple columns of DataFrame, use the following code in Python:
Code:
import pandas
import numpy

data_frame = pandas.DataFrame({'Name': ['Joseph', numpy.nan, 'Alex', numpy.nan],
                               'Age': [22, numpy.nan, 24, numpy.nan],
                               'Height': [5.7, numpy.nan, 5.3, numpy.nan]})
print(data_frame)

# Replace NaN with 0 in the 'Name' and 'Height' columns only
data_frame[['Name', 'Height']] = data_frame[['Name', 'Height']].fillna(0)
print('\n', data_frame)
Selecting the columns with “data_frame[['Name', 'Height']]” and calling “fillna(0)” on that selection replaces the “NaN” values in those columns only. The “Age” column keeps its NaN values.
Output:
The NaN values in the “Name” and “Height” columns have been replaced with 0.
Method 2: Using “df.replace()” Function
The “df.replace()” method of a Pandas DataFrame replaces occurrences of a given value with another value. Passing “numpy.nan” as the value to find and “0” as the replacement swaps every NaN for zero. The examples below show how:
Example 1: Replace NaN Values With Zeros for Specific Column
The following code uses the “df.replace()” function to replace the NaN values with zeros for an individual column:
Code:
import pandas
import numpy

data_frame = pandas.DataFrame({'Name': ['Joseph', numpy.nan, 'Alex', numpy.nan],
                               'Age': [22, numpy.nan, 24, numpy.nan],
                               'Height': [5.7, numpy.nan, 5.3, numpy.nan]})
print(data_frame)

# Replace NaN with 0 in the 'Name' column only
data_frame['Name'] = data_frame['Name'].replace(numpy.nan, 0)
print('\n', data_frame)
- The “pandas” and “numpy” modules are imported at the start of the code.
- The “pandas.DataFrame()” function is used to create the DataFrame consisting of the “Name”, “Age”, and “Height” columns.
- The “replace()” method accepts “numpy.nan” as its first argument and “0” as its second argument, replacing the “NaN” values of the “Name” column with zeros.
Output:
The NaN values have been replaced by 0.
Example 2: Replace NaN Values With Zeros for an Entire DataFrame
The following code is used to replace the ‘NaN’ values of the complete DataFrame with “0”:
Code:
import pandas
import numpy

data_frame = pandas.DataFrame({'Name': ['Joseph', numpy.nan, 'Alex', numpy.nan],
                               'Age': [22, numpy.nan, 24, numpy.nan],
                               'Height': [5.7, numpy.nan, 5.3, numpy.nan]})
print(data_frame)

# Replace every NaN in the DataFrame with 0
data_frame = data_frame.replace(numpy.nan, 0)
print('\n', data_frame)
The “df.replace()” method takes “numpy.nan” and “0” as arguments and replaces the “NaN” values of the entire DataFrame with zeros.
Output:
All NaN values in the DataFrame have been replaced/substituted with “0”.
Example 3: Replace NaN Values With Zeros for Multiple Column DataFrame
The code below replaces the “NaN” values with zeros for multiple columns:
Code:
import pandas
import numpy

data_frame = pandas.DataFrame({'Name': ['Joseph', numpy.nan, 'Alex', numpy.nan],
                               'Age': [22, numpy.nan, 24, numpy.nan],
                               'Height': [5.7, numpy.nan, 5.3, numpy.nan]})
print(data_frame)

# Replace NaN with 0 in the 'Name' and 'Height' columns only
data_frame[['Name', 'Height']] = data_frame[['Name', 'Height']].replace(numpy.nan, 0)
print('\n', data_frame)
The “replace()” method is called on the selected columns with “numpy.nan” and “0” as arguments, so the “NaN” values in “Name” and “Height” are replaced with zeros while “Age” is left unchanged.
Output:
The NaN values in the selected columns have been replaced with zeros.
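Since both methods were applied to the same kind of data, a quick check can show whether they agree. The sketch below compares them on a numeric-only version of the example frame; the names “frame”, “via_fillna”, “via_replace”, and “same_result” are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Numeric columns from the examples above.
frame = pd.DataFrame({"Age": [22, np.nan, 24, np.nan],
                      "Height": [5.7, np.nan, 5.3, np.nan]})

via_fillna = frame.fillna(0)
via_replace = frame.replace(np.nan, 0)

# DataFrame.equals() compares values and dtypes of both results.
same_result = via_fillna.equals(via_replace)

print(same_result)
```

For this data the two calls give the same DataFrame, so choosing between them is largely a matter of preference: “fillna()” targets missing values specifically, while “replace()” can substitute any value.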
Conclusion
The “dataframe.fillna()” and “dataframe.replace()” methods of the Pandas module are used to replace NaN values with zeros. Either method can be applied to a specific column, to an entire DataFrame, or to multiple columns of the DataFrame. In the examples, “numpy.nan” represents the NaN/Null value in the DataFrame. This guide provided various ways to replace DataFrame NaN values with zeros.