How to Set Columns as Index in Pandas DataFrame?

Python’s “pandas” package is used for organizing, handling, and manipulating data. It provides a variety of functions for creating, modifying, and managing DataFrames, which makes it well suited to analyzing large datasets. To set a column as the index of a Pandas DataFrame, you can call the “set_index()” method or assign to the “index” attribute.

This post explains how to set columns as the index of a Pandas DataFrame using two methods: the “set_index()” function and the “index” attribute.

Method 1: Using set_index() Function

The “set_index()” function sets one or more columns as the index of a DataFrame. The following examples show how it works:

Example 1: Setting a Single Column as the Index of a DataFrame

In the code given below, the “set_index()” function sets a single column as the index of the given DataFrame:

Code:

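import pandas as pd

# Build a DataFrame from a list of rows; column names are given via 'columns'
data = pd.DataFrame([['Alex', 22, 5.7],['John', 32, 6.7],
    ['Lily', 21, 4.7]], columns=['Name', 'Age', 'Height'])
print(data)

# Use the 'Name' column as the index (set_index() returns a new DataFrame)
data = data.set_index('Name')
print(data)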

In the above code:

  • The “pd.DataFrame()” function creates a DataFrame from the given list of rows.
  • The column names are passed to “pd.DataFrame()” through the “columns” parameter.
  • The “set_index()” function returns a new DataFrame with the “Name” column as its index, which is assigned back to “data”.

Output:

The output shows the DataFrame first with its default integer index and then with the “Name” column as its index.
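
The result can also be checked programmatically. The following is a minimal sketch, not part of the original example, that illustrates two properties of “set_index()”: it returns a new DataFrame and leaves the original untouched, and its “drop” parameter (which defaults to True) controls whether the column is removed from the regular columns. The variable names “indexed” and “kept” are chosen here for illustration only.

```python
import pandas as pd

data = pd.DataFrame(
    [['Alex', 22, 5.7], ['John', 32, 6.7], ['Lily', 21, 4.7]],
    columns=['Name', 'Age', 'Height'],
)

# set_index() returns a new DataFrame; 'data' itself keeps its default index.
indexed = data.set_index('Name')
print(indexed.index.tolist())    # ['Alex', 'John', 'Lily']
print(indexed.columns.tolist())  # ['Age', 'Height']
print(data.index.tolist())       # [0, 1, 2]

# drop=False keeps 'Name' as a regular column in addition to using it as the index.
kept = data.set_index('Name', drop=False)
print(kept.columns.tolist())     # ['Name', 'Age', 'Height']

# Rows can now be looked up by their index label.
print(indexed.loc['John', 'Age'])  # 32
```

Because the original DataFrame is not modified, the result has to be assigned to a variable, as the example above does with “data = data.set_index('Name')”.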

Example 2: Setting Multiple Columns as the Index of a DataFrame

In the code below, the “set_index()” function sets multiple columns as the index of the DataFrame:

Code:

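import pandas as pd

# Build a DataFrame from a list of rows; column names are given via 'columns'
data = pd.DataFrame([['Alex', 22, 5.7],['John', 32, 6.7],
    ['Lily', 21, 4.7]], columns=['Name', 'Age', 'Height'])
print(data)

# Pass a list of column names to build a multi-level index
data = data.set_index(['Name', 'Age'])
print('\n\n',data)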

In the above code:

  • The “pd.DataFrame()” function creates a DataFrame from the given data.
  • The “set_index()” function takes a list of column names and returns a DataFrame with a multi-level index (a MultiIndex), built in our case from “Name” and “Age”.

Output:

The output shows the DataFrame first with its default index and then with the “Name” and “Age” columns combined into a multi-index.
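
To make the multi-index more concrete, here is a minimal sketch that inspects it and selects a row from it. It reuses the data from the example above; the names “multi” and “restored” are illustrative, and “reset_index()” is shown only as one way to undo the operation.

```python
import pandas as pd

data = pd.DataFrame(
    [['Alex', 22, 5.7], ['John', 32, 6.7], ['Lily', 21, 4.7]],
    columns=['Name', 'Age', 'Height'],
)
multi = data.set_index(['Name', 'Age'])

# The index is now a MultiIndex with one level per column.
print(type(multi.index).__name__)  # MultiIndex
print(list(multi.index.names))     # ['Name', 'Age']

# Select a value by passing a tuple with one label per level.
print(multi.loc[('Alex', 22), 'Height'])  # 5.7

# reset_index() moves the index levels back into regular columns.
restored = multi.reset_index()
print(restored.columns.tolist())   # ['Name', 'Age', 'Height']
```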

Method 2: Using the index Attribute

The “index” attribute of a Pandas DataFrame can also be used to set a specified column as the DataFrame index. In the code below, a column is assigned to the “index” attribute:

Code:

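import pandas as pd

# Build a DataFrame from a list of rows; column names are given via 'columns'
data = pd.DataFrame([['Alex', 22, 5.7],['John', 32, 6.7],
    ['Lily', 21, 4.7]], columns=['Name', 'Age', 'Height'])
print(data)

# Assign the 'Age' column to the index attribute (modifies 'data' directly)
data.index = data['Age']
print(data)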

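In the above code, the “Age” column of the DataFrame “data” is assigned to the “index” attribute, which makes it the index of the DataFrame. Unlike “set_index()”, this assignment modifies the DataFrame directly and leaves “Age” available as a regular column as well.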

Output:

The output shows the DataFrame first with its default index and then with the “Age” column as its index.
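
The difference from “set_index()” can be seen in a short sketch. It assumes the same sample data as above; the final “drop()” call is an optional extra step, not part of the original example, that removes the duplicated column so the result matches what “set_index('Age')” would produce.

```python
import pandas as pd

data = pd.DataFrame(
    [['Alex', 22, 5.7], ['John', 32, 6.7], ['Lily', 21, 4.7]],
    columns=['Name', 'Age', 'Height'],
)

# Assigning to the index attribute changes 'data' itself.
data.index = data['Age']
print(data.index.tolist())    # [22, 32, 21]
print(data.index.name)        # Age

# Unlike set_index('Age'), the 'Age' column is still present.
print(data.columns.tolist())  # ['Name', 'Age', 'Height']

# Optional: drop the now-duplicated column to match set_index('Age').
data = data.drop(columns='Age')
print(data.columns.tolist())  # ['Name', 'Height']
```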

Conclusion

To set columns as the index of a Pandas DataFrame, you can use the “set_index()” function or the “index” attribute. The “set_index()” function accepts either a single column name or a list of column names and returns a new DataFrame indexed accordingly, while assigning a column to the “index” attribute changes the index of the existing DataFrame. This guide demonstrated both approaches with examples.