
Selection by Label

In this tutorial, we'll explore the concept of 'Selection by Label' in pandas, a powerful Python library for data manipulation and analysis.

Understanding the Basics

Pandas provides various methods for selecting data out of a DataFrame or Series, and 'Selection by Label' is one of the most common operations. Labels are the column names and the index values that identify the rows.

DataFrame Creation

Let's start by creating a simple DataFrame:

import pandas as pd

data = {
    'fruit': ['apple', 'banana', 'cherry', 'date'],
    'color': ['red', 'yellow', 'red', 'brown'],
    'weight': [120, 150, 10, 15]
}
df = pd.DataFrame(data)
df

This code creates a DataFrame with columns labeled 'fruit', 'color', and 'weight'. Because no index was specified, the rows receive the default integer labels 0 through 3.

.loc Accessor

Pandas provides the .loc accessor for label-based selection. It's used like so:

df.loc[1, 'fruit']

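In this example, 1 is the row label and 'fruit' is the column label, so the expression returns the fruit stored in the row labeled 1, which is 'banana'. Because this DataFrame uses the default integer index, the label 1 happens to coincide with the second row position.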
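To make the difference between a label and a position more concrete, here is a minimal sketch that rebuilds the same DataFrame and then uses the fruit names as row labels. The `by_name` variant and the variable names are assumptions made for illustration; they are not part of the original example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "fruit": ["apple", "banana", "cherry", "date"],
        "color": ["red", "yellow", "red", "brown"],
        "weight": [120, 150, 10, 15],
    }
)

# With the default index, the row labels are the integers 0 to 3.
by_default_label = df.loc[1, "fruit"]

# Hypothetical variant: use the fruit names as the row labels instead.
by_name = df.set_index("fruit")
cherry_color = by_name.loc["cherry", "color"]

# Passing only a row label selects the whole row as a Series.
banana_row = by_name.loc["banana"]

print(by_default_label)
print(cherry_color)
print(banana_row)
```

Once the index holds strings, `.loc` no longer accepts integer row labels such as 1, which shows that it looks rows up by label rather than by position.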

Selecting Multiple Columns

You can select multiple columns by passing a list of column labels:

df.loc[:, ['fruit', 'color']]

Here, the colon : means "all rows", and ['fruit', 'color'] is a list of the column labels we're interested in.
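As a small sketch of how the column argument shapes the result, the snippet below compares a list of labels with a single label. It rebuilds the tutorial's DataFrame so it can run on its own; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "fruit": ["apple", "banana", "cherry", "date"],
        "color": ["red", "yellow", "red", "brown"],
        "weight": [120, 150, 10, 15],
    }
)

# A list of column labels returns a DataFrame with those columns.
subset = df.loc[:, ["fruit", "color"]]

# A single label returns a Series; a one-element list keeps a DataFrame.
as_series = df.loc[:, "weight"]
as_frame = df.loc[:, ["weight"]]

# Lists work on the row side too: rows labeled 0 and 2, two columns.
rows_and_cols = df.loc[[0, 2], ["fruit", "weight"]]

print(subset)
print(type(as_series).__name__, type(as_frame).__name__)
print(rows_and_cols)
```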

Selecting Ranges

You can also use label-based slicing to select a range of rows:

df.loc[1:3]

This returns the rows labeled 1, 2, and 3. Unlike ordinary Python slicing, a label-based slice includes both endpoints.
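The inclusive endpoint is easy to overlook, so the following sketch contrasts the label-based slice with a position-based slice through `.iloc`, which this tutorial does not otherwise cover. It also slices columns by label, assuming the column order used when the DataFrame was created.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "fruit": ["apple", "banana", "cherry", "date"],
        "color": ["red", "yellow", "red", "brown"],
        "weight": [120, 150, 10, 15],
    }
)

# Label-based slice: both endpoints are included.
by_label = df.loc[1:3]

# Position-based slice for comparison: the end position is excluded.
by_position = df.iloc[1:3]

# Column labels can be sliced as well, again including both ends.
fruit_to_color = df.loc[:, "fruit":"color"]

print(list(by_label.index))
print(list(by_position.index))
print(list(fruit_to_color.columns))
```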

Conditional Selection

The .loc accessor also supports boolean indexing for conditional selection:

df.loc[df['weight'] > 100]

This will return all rows where the 'weight' is greater than 100.
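A boolean mask can also be combined with a column selection in the same call. The sketch below assumes the tutorial's DataFrame and adds a second condition purely for illustration. Each condition needs its own parentheses because `&` binds more tightly than the comparison operators.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "fruit": ["apple", "banana", "cherry", "date"],
        "color": ["red", "yellow", "red", "brown"],
        "weight": [120, 150, 10, 15],
    }
)

# Rows where the weight exceeds 100.
heavy = df.loc[df["weight"] > 100]

# Two conditions joined with &, plus a list of columns to keep.
heavy_red = df.loc[
    (df["weight"] > 100) & (df["color"] == "red"),
    ["fruit", "weight"],
]

print(heavy)
print(heavy_red)
```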

.at Accessor

For accessing a single scalar value, the .at accessor is a faster alternative:

df.at[1, 'fruit']

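.at takes the same row and column labels as .loc, but it accepts only a single label for each. That lets it skip the extra handling .loc needs for lists, slices, and boolean masks, which makes it faster for scalar access.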
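Besides reading a value, `.at` can assign one. The sketch below assumes the tutorial's DataFrame; the new weight of 20 is a made-up value used only to show the assignment.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "fruit": ["apple", "banana", "cherry", "date"],
        "color": ["red", "yellow", "red", "brown"],
        "weight": [120, 150, 10, 15],
    }
)

# Read one cell by its row and column labels.
value = df.at[1, "fruit"]

# Write one cell in place (20 is an illustrative value).
df.at[3, "weight"] = 20

print(value)
print(df.loc[3, "weight"])
```

For a single cell, `.at` and `.loc` return the same value, so the choice mainly matters in code that reads or writes many individual cells.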

Wrap Up

In this tutorial, we have learned the basics of Selection by Label in pandas. We have covered how to use .loc and .at to select data based on the labels. We've also seen how to select multiple columns, how to select ranges of data, and how to use conditional selection.

Remember, practice is essential to get the hang of these concepts. Try to play around with these methods with different datasets. Happy data wrangling!