How to Select Rows From Pandas DataFrame?

A Pandas “DataFrame” is a 2-dimensional data structure that stores different types of data in rows and columns. Many operations can be performed on a DataFrame using various functions, such as adding rows and columns, removing rows or columns, sorting, etc.

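To select specific rows from a Pandas DataFrame, the “loc[]” and “iloc[]” indexers are used in Python. Although they are often written as “loc()” and “iloc()”, both are accessed with square brackets rather than called like functions. This guide shows how to select rows from Pandas DataFrame using both.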

Using Pandas loc[] Indexer

In Pandas, the “loc[]” indexer is used to access rows according to the index label or a Boolean condition. The result can be narrowed to particular rows, particular columns, or both. The following examples are used to select rows from Pandas DataFrame:
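
Before the condition-based examples, here is a minimal sketch of plain label-based selection. It is an illustration only: the small dataset is made up, and it assumes the “Name” column is promoted to the index with “set_index()” so that the rows carry string labels.

```python
import pandas

# Hypothetical example data; 'Name' becomes the index so rows have string labels.
data = {'Name': ['Alex', 'Joseph', 'Lily'], 'Age': [15, 22, 23]}
data_frame = pandas.DataFrame(data).set_index('Name')

# A single label returns that row as a Series.
row = data_frame.loc['Lily']

# A list of labels returns a DataFrame with those rows, in the order requested.
rows = data_frame.loc[['Joseph', 'Alex']]

print(row)
print(rows)
```

Passing a list keeps the result two-dimensional, which is usually more convenient when the selection is processed further as a table.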

Example 1: Select Rows From Pandas DataFrame

The below code is used to select specific rows from Pandas DataFrame:

Code:

import pandas
data = {'Name': ['Alex','Joseph','Lily','Anna', 'Alex'],
        'Age': [15, 22, 23, 18, 16], 'Height': [5.3, 5.5, 5.7, 5.1, 4.7]}
data_frame = pandas.DataFrame(data, columns= ['Name','Age','Height'])
output = data_frame.loc[data_frame['Name'] == 'Alex']
print (output)
  • The DataFrame is created using the “pandas.DataFrame()” function, which accepts the dictionary and the column names as arguments.
  • The “data_frame.loc[]” indexer receives the condition “data_frame['Name'] == 'Alex'” and selects the rows whose “Name” is “Alex”.

Output:

The rows containing “Alex” have been displayed.
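
The same condition can be combined with a column selection. The sketch below is an illustration rather than part of the original example; it assumes the same data and passes a second argument to “loc[]” listing the columns to keep.

```python
import pandas

data = {'Name': ['Alex', 'Joseph', 'Lily', 'Anna', 'Alex'],
        'Age': [15, 22, 23, 18, 16], 'Height': [5.3, 5.5, 5.7, 5.1, 4.7]}
data_frame = pandas.DataFrame(data, columns=['Name', 'Age', 'Height'])

# First argument: which rows to keep; second argument: which columns to keep.
output = data_frame.loc[data_frame['Name'] == 'Alex', ['Name', 'Age']]
print(output)
```

Only the rows at index 0 and 4 match, and the “Height” column is left out of the result.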

Example 2: Select Rows Based on Specific Condition

The below code is used to select rows based on specific conditions:

Code:

import pandas
data = {'Name': ['Alex','Joseph','Lily','Anna', 'Alex'],
        'Age': [15, 22, 23, 18, 16], 'Height': [5.3, 5.5, 5.7, 5.1, 4.7]}
data_frame = pandas.DataFrame(data, columns= ['Name','Age','Height'])
output = data_frame.loc[data_frame['Age'] >= 20]
print (output)
  • The pandas module is imported, and the dictionary named “data” is initialized.
  • The “pandas.DataFrame()” function takes the dictionary and the column names as arguments.
  • The “data_frame.loc[]” indexer receives a condition on the “Age” column and selects every row whose age is greater than or equal to “20”.

Output:

The rows in the “Age” column having a value greater than or equal to “20” have been selected.
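
It may help to look at the condition on its own. As a small sketch using the same data, the comparison produces a Boolean Series (named “mask” here purely for illustration), and “loc[]” keeps the rows where that Series is True.

```python
import pandas

data = {'Name': ['Alex', 'Joseph', 'Lily', 'Anna', 'Alex'],
        'Age': [15, 22, 23, 18, 16], 'Height': [5.3, 5.5, 5.7, 5.1, 4.7]}
data_frame = pandas.DataFrame(data, columns=['Name', 'Age', 'Height'])

# The comparison is evaluated row by row and yields one True/False per row.
mask = data_frame['Age'] >= 20
print(mask.tolist())  # [False, True, True, False, False]

# loc[] keeps only the rows where the mask is True.
output = data_frame.loc[mask]
print(output)
```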

Example 3: Select Rows Based on Multiple Conditions

The below code is used to select rows based on multiple defined conditions:

Code-1: (Multiple Condition Using & Operator)

import pandas
data = {'Name': ['Alex','Joseph','Lily','Anna', 'Alex'],
        'Age': [15, 22, 23, 18, 16], 'Height': [5.3, 5.5, 5.7, 5.1, 4.7]}
data_frame = pandas.DataFrame(data, columns= ['Name','Age','Height'])
output = data_frame.loc[(data_frame['Age'] >= 20) & (data_frame['Height'] <= 5.5)]
print (output)
  • The “data_frame.loc[]” indexer takes the combined condition and selects the rows for which it evaluates to True. Each individual condition is wrapped in parentheses because “&” binds more tightly than the comparison operators.
  • It selects only the rows that have an age value greater than or equal to “20” and a height value less than or equal to “5.5”.

Output:

The only row satisfying both conditions (“Joseph”, age 22, height 5.5) has been displayed.

Code-2: (Multiple Condition Using | Operator)

import pandas
data = {'Name': ['Alex','Joseph','Lily','Anna', 'Alex'],
        'Age': [15, 22, 23, 18, 16], 'Height': [5.3, 5.5, 5.7, 5.1, 4.7]}
data_frame = pandas.DataFrame(data, columns= ['Name','Age','Height'])
output = data_frame.loc[(data_frame['Age'] <= 16) | (data_frame['Height'] <= 2)]
print (output)
  • The “data_frame.loc[]” indexer takes multiple conditions joined by the OR “|” operator.
  • It selects a row if at least one of the conditions is True. Here no height is less than or equal to “2”, so only the age condition contributes matches.

Output:

The rows satisfying the multiple conditions have been displayed.
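
To make the difference between the two operators concrete, the following sketch applies the same pair of conditions once with “&” and once with “|”. The variable names are chosen for illustration only.

```python
import pandas

data = {'Name': ['Alex', 'Joseph', 'Lily', 'Anna', 'Alex'],
        'Age': [15, 22, 23, 18, 16], 'Height': [5.3, 5.5, 5.7, 5.1, 4.7]}
data_frame = pandas.DataFrame(data, columns=['Name', 'Age', 'Height'])

is_adult = data_frame['Age'] >= 20
is_short = data_frame['Height'] <= 5.5

# "&" keeps rows where both conditions are True.
both = data_frame.loc[is_adult & is_short]

# "|" keeps rows where at least one condition is True.
either = data_frame.loc[is_adult | is_short]

print(both)
print(either)
```

With this data, “&” leaves only the “Joseph” row, while “|” keeps all five rows because every row meets at least one of the two conditions. Storing each condition in its own variable also avoids the need for extra parentheses inside “loc[]”.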

Example 4: Select Rows Based on List of Values in Column

The below code is used to select rows whose value in a specified column appears in a given list:

Code:

import pandas
data = {'Name': ['Alex','Joseph','Lily','Anna', 'Alex'],
        'Age': [15, 22, 23, 18, 16], 'Height': [5.3, 5.5, 5.7, 5.1, 4.7]}
data_frame = pandas.DataFrame(data, columns= ['Name','Age','Height'])
output = data_frame.loc[data_frame['Height'].isin([5.1, 5.5, 4.7])]
print (output)
  • The DataFrame is created using the “pandas.DataFrame()” function.
  • The “data_frame.loc[]” indexer is used along with the “isin()” method, which returns True for each row whose “Height” is one of the listed values, to select those rows.

Output:

The rows containing the specified values have been selected.
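
The opposite selection, rows whose value is not in the list, can be sketched by negating the “isin()” result with the “~” operator. This variation is an addition for illustration and uses the same data as above.

```python
import pandas

data = {'Name': ['Alex', 'Joseph', 'Lily', 'Anna', 'Alex'],
        'Age': [15, 22, 23, 18, 16], 'Height': [5.3, 5.5, 5.7, 5.1, 4.7]}
data_frame = pandas.DataFrame(data, columns=['Name', 'Age', 'Height'])

in_list = data_frame['Height'].isin([5.1, 5.5, 4.7])

# Rows whose height appears in the list.
selected = data_frame.loc[in_list]

# "~" flips every True/False, so this keeps the remaining rows.
excluded = data_frame.loc[~in_list]

print(selected)
print(excluded)
```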

Using Pandas iloc[] Indexer

The “iloc[]” indexer is also used to select single and multiple rows from the given pandas DataFrame. It takes integer positions, counted from 0, rather than index labels and returns the rows at those positions. The following examples are utilized to select rows from pandas DataFrame:

Example 1: Select Rows From Pandas DataFrame

The below code is used to select rows from Pandas DataFrame:

Code:

import pandas
data = {'Name': ['Alex','Joseph','Lily','Anna', 'Alex'],
        'Age': [15, 22, 23, 18, 16], 'Height': [5.3, 5.5, 5.7, 5.1, 4.7]}
data_frame = pandas.DataFrame(data, columns= ['Name','Age','Height'])
print(data_frame, '\n')
output = data_frame.iloc[3]
print (output)
  • The “pandas.DataFrame()” function takes the given dictionary and column names as parameters and returns the DataFrame.
  • The “data_frame.iloc[3]” expression takes the integer “3” and returns the single row at position “3” (the fourth row) as a pandas Series.

Output: 

The row placed at index “3” has been selected.
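
A single integer returns the row as a Series, which prints vertically. If a one-row DataFrame is preferred, a list containing that integer can be passed instead. The sketch below illustrates the difference with the same data.

```python
import pandas

data = {'Name': ['Alex', 'Joseph', 'Lily', 'Anna', 'Alex'],
        'Age': [15, 22, 23, 18, 16], 'Height': [5.3, 5.5, 5.7, 5.1, 4.7]}
data_frame = pandas.DataFrame(data, columns=['Name', 'Age', 'Height'])

# A bare integer returns a Series whose index holds the column names.
as_series = data_frame.iloc[3]

# A list with one integer returns a DataFrame with a single row.
as_frame = data_frame.iloc[[3]]

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
```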

Example 2: Select Multiple Rows From Pandas DataFrame

The following code is used to select multiple rows from Pandas DataFrame:

Code:

import pandas
data = {'Name': ['Alex','Joseph','Lily','Anna', 'Alex'],
        'Age': [15, 22, 23, 18, 16], 'Height': [5.3, 5.5, 5.7, 5.1, 4.7]}
data_frame = pandas.DataFrame(data, columns= ['Name','Age','Height'])
print(data_frame, '\n')
output = data_frame.iloc[2:4]
print (output)
  • The “iloc[]” indexer takes the slice “2:4” and returns the rows from position “2” up to, but not including, position “4”.

Output:

The rows placed at index “2” and “3” have been selected.
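
Slices behave differently in “loc[]”, where the end label is included. The following sketch compares the two on the same data; it relies on the default integer index, in which labels and positions happen to coincide.

```python
import pandas

data = {'Name': ['Alex', 'Joseph', 'Lily', 'Anna', 'Alex'],
        'Age': [15, 22, 23, 18, 16], 'Height': [5.3, 5.5, 5.7, 5.1, 4.7]}
data_frame = pandas.DataFrame(data, columns=['Name', 'Age', 'Height'])

# iloc[] slices by position and excludes the end position.
by_position = data_frame.iloc[2:4]

# loc[] slices by label and includes the end label.
by_label = data_frame.loc[2:4]

print(by_position.index.tolist())  # [2, 3]
print(by_label.index.tolist())     # [2, 3, 4]
```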

Conclusion

The “loc[]” and “iloc[]” indexers are used to select rows from Pandas DataFrame. The “loc[]” indexer selects single and multiple rows by index label or where a specified condition, a combination of conditions, or a list-of-values check is “True”. The “iloc[]” indexer takes integer positions and selects the rows at those positions. This blog presented a guide on how to select rows from Pandas DataFrame.