How to Set Index in Pandas DataFrame?

Pandas DataFrames are among the most widely used data structures in Python data analysis. A DataFrame is a two-dimensional table of rows and columns, where each column can have its own data type.

The Pandas module provides many functions for working with DataFrames. Setting the index is one of the most common operations, and Pandas offers several methods related to it, which are explained in this post:

  • Method 1: Using the set_index() Method
  • Method 2: Using the reset_index() Method
  • Method 3: Using the reindex() Method
  • Method 4: Using the sort_index() Method

Method 1: Using the set_index() Method

The “set_index()” method of a Pandas DataFrame turns one or more existing columns into the DataFrame’s row index. Here is an example of how this can be done:

Code:

import pandas
data_frame = pandas.DataFrame({'Name': ['Joseph','Anna','Lily'],
                              'Age': [24, 25, 26],
                              'Height': [5.7, 4.8, 4.9]})
print('Original DataFrame: \n',data_frame)
# set the 'Name' column as the index
data_frame.set_index('Name', inplace=True)
print('\n',data_frame)
  • A pandas DataFrame named “data_frame” is created with three columns.
  • The “set_index()” method sets the ‘Name’ column as the DataFrame’s index. The “inplace=True” argument modifies the existing DataFrame instead of returning a new one.

Output:

The DataFrame index has been successfully set to the ‘Name’ column.
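
Since “set_index()” accepts one or more columns, the following is a minimal sketch of two variations on the example above: calling it without “inplace=True”, and passing a list of columns. The variable names “by_name” and “by_name_age” are chosen here for illustration only.

```python
import pandas

data_frame = pandas.DataFrame({'Name': ['Joseph', 'Anna', 'Lily'],
                               'Age': [24, 25, 26],
                               'Height': [5.7, 4.8, 4.9]})

# Without inplace=True, set_index() returns a new DataFrame
# and leaves the original one unchanged.
by_name = data_frame.set_index('Name')

# Passing a list of columns builds a multi-level index (MultiIndex).
by_name_age = data_frame.set_index(['Name', 'Age'])

print(by_name)
print(by_name_age)
print(by_name_age.index.nlevels)  # number of index levels
```

Returning a new DataFrame rather than modifying in place makes it easier to keep the original data around for comparison.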

Method 2: Using the reset_index() Method

The “reset_index()” method replaces the current index of a DataFrame with the default integer index (0, 1, 2, and so on). Here is an example that first sets a custom index and then resets it:

Code:

import pandas
data_frame = pandas.DataFrame({'Name': ['Joseph','Anna','Lily'],
                              'Age': [24, 25, 26],
                              'Height': [5.7, 4.8, 4.9]})
print('Original DataFrame: \n',data_frame)
# set the 'Name' column as the index
data_frame.set_index('Name', inplace=True)
print('\n',data_frame)
# reset the index
data_frame.reset_index(drop=True, inplace=True)
print('\n',data_frame)
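  • The “set_index()” method is used to set the ‘Name’ column as the DataFrame’s index.
  • The “reset_index()” method resets the DataFrame index to the default integer index. Because “drop=True” is passed, the old ‘Name’ index is discarded instead of being restored as a column.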

Output:

The DataFrame with the specific index has been reset to the default integer index.
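
The effect of the “drop” parameter is easy to miss, so here is a short sketch comparing both settings on the same data. The names “indexed”, “dropped”, and “restored” are illustrative only.

```python
import pandas

data_frame = pandas.DataFrame({'Name': ['Joseph', 'Anna', 'Lily'],
                               'Age': [24, 25, 26],
                               'Height': [5.7, 4.8, 4.9]})
indexed = data_frame.set_index('Name')

# drop=True discards the old index, so the 'Name' values are lost.
dropped = indexed.reset_index(drop=True)

# The default (drop=False) moves the old index back into a regular column.
restored = indexed.reset_index()

print(list(dropped.columns))
print(list(restored.columns))
```

If the values in the index are still needed, leaving “drop” at its default is usually the safer choice.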

Method 3: Using the reindex() Method

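The “reindex()” method conforms a DataFrame to a new set of row or column labels. Rows whose labels already exist are rearranged to match the new order, and labels that do not exist in the current index produce rows of missing values. Let’s use this method to reorder the index of a Pandas DataFrame: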

Code:

import pandas
data_frame = pandas.DataFrame({'Name': ['Joseph','Anna','Lily'],
                              'Age': [24, 25, 26],
                              'Height': [5.7, 4.8, 4.9]})
print('Original DataFrame: \n',data_frame)
# set the 'Name' column as the index
data_frame.set_index('Name', inplace=True)
print('\n',data_frame)
# create a new index
new_index = ['Anna', 'Lily', 'Joseph']
# reindex the DataFrame
data_frame = data_frame.reindex(new_index)
print('\n',data_frame)
  • The “set_index()” method is used to set/assign the ‘Name’ column as the index of the given DataFrame. The parameter named “inplace=True” is utilized to change the DataFrame in place.
  • The list named “new_index” holds the existing row labels in the desired new order.
  • The “reindex()” method returns a new DataFrame whose rows follow the order given in “new_index”, which is why the result is assigned back to “data_frame”.

Output:

The data frame has been reindexed using the new index.
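
The example above only reorders labels that already exist. As a hedged sketch of what happens otherwise, the code below reindexes with a label that is not in the original index. The label ‘Max’ is a made-up value used only for illustration, and “fill_value” is the “reindex()” parameter that supplies a replacement for the missing entries.

```python
import pandas

data_frame = pandas.DataFrame({'Name': ['Joseph', 'Anna', 'Lily'],
                               'Age': [24, 25, 26]}).set_index('Name')

# 'Max' is a hypothetical label that does not exist in the current index;
# 'Lily' is left out on purpose.
new_index = ['Anna', 'Max', 'Joseph']

# Missing labels get NaN, and labels not listed are dropped.
with_gaps = data_frame.reindex(new_index)

# fill_value replaces the NaN that would otherwise appear.
filled = data_frame.reindex(new_index, fill_value=0)

print(with_gaps)
print(filled)
```

This shows that “reindex()” selects and arranges rows by label rather than renaming the existing index entries.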

Method 4: Using the sort_index() Method

In Pandas, the “sort_index()” method sorts a DataFrame or Series by its index labels. It does not set an index by itself, so the code below first sets the ‘Age’ column as the index and then sorts the rows by it:

Code:

import pandas
data_frame = pandas.DataFrame({'Name': ['Joseph','Anna','Lily'],
                              'Age': [35, 25, 22],
                              'Height': [5.7, 4.8, 4.9]})
print('Original DataFrame: \n',data_frame)
# set the 'Age' column as the index
data_frame.set_index('Age', inplace=True)
print('\n',data_frame)
# sort the dataframe by index
data_frame.sort_index(inplace=True)
print('\n',data_frame)
  • The DataFrame is created, and the ‘Age’ column is set as its index using the “data_frame.set_index()” method.
  • The “inplace=True” argument means that the original DataFrame is modified instead of a new one being created.
  • The “data_frame.sort_index()” method sorts the DataFrame by its index (which is now the ‘Age’ column) in ascending order.

Output:

The index has been set to the ‘Age’ column, and the rows are sorted by it in ascending order.
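
As a small sketch of the sorting direction, the code below sorts the same data both ways using the “ascending” parameter of “sort_index()”. It skips “inplace=True” so that each result can be compared; the variable names are illustrative.

```python
import pandas

data_frame = pandas.DataFrame({'Name': ['Joseph', 'Anna', 'Lily'],
                               'Age': [35, 25, 22],
                               'Height': [5.7, 4.8, 4.9]}).set_index('Age')

# Default: sort by index labels in ascending order.
ascending = data_frame.sort_index()

# ascending=False reverses the order.
descending = data_frame.sort_index(ascending=False)

print(ascending)
print(descending)
```

Both calls return new DataFrames, so “data_frame” keeps its original row order.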

Conclusion

The “set_index()” method is the direct way to set the index of a Pandas DataFrame: it turns one or more columns into the row index. The other methods work on an index that already exists: “reset_index()” restores the default integer index, “reindex()” conforms the DataFrame to a new set of row labels, and “sort_index()” orders the rows by their index. This guide demonstrated each of these methods with an example.