How to Efficiently Select Non-Adjacent Columns by Column Number in Pandas (iloc Method)

When working with data in Pandas, selecting specific columns is a fundamental task. Many users rely on column names (using loc), but there are scenarios where selecting columns by their position (column number) is more convenient: for example, when column names are long, non-descriptive, or dynamically generated. Pandas’ iloc indexer is designed for integer-based indexing, making it well suited to this job.

In this post, we’ll focus on a common challenge: selecting non-adjacent columns (columns that are not next to each other) using their position with iloc. We’ll break down the syntax, walk through practical examples, highlight pitfalls to avoid, and share best practices to make your code robust and readable.

Table of Contents#

  1. Understanding iloc Basics
  2. Selecting Non-Adjacent Columns with iloc
  3. Step-by-Step Examples
  4. Common Pitfalls and How to Avoid Them
  5. Best Practices
  6. Conclusion

1. Understanding iloc Basics#

Before diving into non-adjacent column selection, let’s recap how iloc works.

iloc is Pandas’ integer-location based indexer, used to select rows and columns by their position (0-based index). The syntax is:

df.iloc[row_indexer, column_indexer]  
  • row_indexer: Specifies which rows to select (e.g., 0:5 for the first 5 rows, [0, 2, 4] for specific rows).
  • column_indexer: Specifies which columns to select (using the same logic as row_indexer).

For example, to select rows 0-2 and columns 1 and 3:

df.iloc[0:3, [1, 3]]  # Rows 0,1,2; Columns 1 and 3  

Key Note: iloc uses 0-based indexing, meaning the first column is 0, the second is 1, and so on.
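As a rough sketch of how the two indexers behave, the snippet below uses a small made-up frame (the column names w, x, y, z are assumptions for illustration only). It shows that a scalar column position returns a Series, a one-element list returns a DataFrame, and a slice excludes its end position.

```python
import pandas as pd

# Hypothetical 4x4 frame used only for illustration.
df = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]],
    columns=["w", "x", "y", "z"],
)

cell = df.iloc[0, 1]             # scalar row + scalar column -> a single value
col_as_series = df.iloc[:, 1]    # scalar column position -> Series
col_as_frame = df.iloc[:, [1]]   # one-element list -> DataFrame
block = df.iloc[0:3, [1, 3]]     # slice end is exclusive: rows 0, 1, 2

print(type(col_as_series).__name__, type(col_as_frame).__name__)
print(block.shape)
```

Wrapping a single position in a list is a handy way to keep the result two-dimensional when later code expects a DataFrame.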

2. Selecting Non-Adjacent Columns with iloc#

To select non-adjacent columns (columns that are not next to each other) by their position, use a list of integers as the column_indexer. This list contains the positions of the columns you want to keep.

Basic Syntax for Non-Adjacent Columns:#

df.iloc[:, [col1, col2, col3, ...]]  
  • The : in the row_indexer selects all rows (replace with specific row indices if needed).
  • [col1, col2, ...] is a list of non-adjacent column positions (e.g., [0, 2, 4] for the 1st, 3rd, and 5th columns).
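One detail worth seeing in action: the list does not have to be sorted or unique. The sketch below, on a small illustrative frame, suggests that the output follows the order of the list and that a repeated position simply repeats the column.

```python
import pandas as pd

# Small illustrative frame with five columns, A through E.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6], "D": [7, 8], "E": [9, 10]})

reordered = df.iloc[:, [4, 0, 2]]          # columns come back in list order: E, A, C
repeated = df.iloc[:, [1, 1]]              # duplicate positions are allowed: B, B
first_row_only = df.iloc[[0], [0, 2, 4]]   # restrict the rows as well

print(list(reordered.columns))
print(list(repeated.columns))
```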

3. Step-by-Step Examples#

Let’s walk through practical examples to solidify your understanding.

Example 1: Basic Non-Adjacent Column Selection#

Suppose we have a small DataFrame with 5 columns (A, B, C, D, E). We want to select the 1st, 3rd, and 5th columns (positions 0, 2, and 4).

Step 1: Create a Sample DataFrame#

import pandas as pd  
 
data = {  
    'A': [10, 20, 30],  
    'B': [40, 50, 60],  
    'C': [70, 80, 90],  
    'D': [100, 110, 120],  
    'E': [130, 140, 150]  
}  
df = pd.DataFrame(data)  
print("Original DataFrame:\n", df)  

Output:

Original DataFrame:  
     A   B   C    D    E  
0  10  40  70  100  130  
1  20  50  80  110  140  
2  30  60  90  120  150  

Step 2: Select Non-Adjacent Columns (Positions 0, 2, 4)#

selected_columns = df.iloc[:, [0, 2, 4]]  # Columns 0 (A), 2 (C), 4 (E)  
print("Selected Non-Adjacent Columns:\n", selected_columns)  

Output:

Selected Non-Adjacent Columns:  
     A   C    E  
0  10  70  130  
1  20  80  140  
2  30  90  150  
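When the positions you want happen to be evenly spaced, as they are here, a slice with a step is a possible shortcut. This is only a sketch of an alternative, not a replacement for the list form, which remains the general tool for arbitrary positions.

```python
import pandas as pd

# Same sample data as in Example 1.
df = pd.DataFrame({
    "A": [10, 20, 30],
    "B": [40, 50, 60],
    "C": [70, 80, 90],
    "D": [100, 110, 120],
    "E": [130, 140, 150],
})

by_list = df.iloc[:, [0, 2, 4]]   # explicit positions
by_step = df.iloc[:, ::2]         # every second column, starting at position 0

print(by_list.equals(by_step))
```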

Example 2: Selecting Non-Adjacent Columns in a Large Dataset#

For larger datasets, column positions are often easier to track than long/complex column names. Let’s use the Titanic dataset (via Seaborn) to demonstrate.

Step 1: Load the Dataset and Inspect Columns#

import seaborn as sns  
 
titanic = sns.load_dataset('titanic')  
print("Titanic Columns (with positions):")  
for idx, col in enumerate(titanic.columns):  
    print(f"Position {idx}: {col}")  

Output (truncated):

Titanic Columns (with positions):  
Position 0: survived  
Position 1: pclass  
Position 2: sex  
Position 3: age  
Position 4: sibsp  
Position 5: parch  
Position 6: fare  
...  

Step 2: Select Non-Adjacent Columns by Position#

Suppose we want survived (pos 0), age (pos 3), and fare (pos 6):

selected_titanic = titanic.iloc[:, [0, 3, 6]]  # Columns 0, 3, 6  
print("Selected Titanic Columns:\n", selected_titanic.head())  

Output:

Selected Titanic Columns:  
   survived   age     fare  
0         0  22.0   7.2500  
1         1  38.0  71.2833  
2         1  26.0   7.9250  
3         1  35.0  53.1000  
4         0  35.0   8.0500  

Example 3: Combining Adjacent and Non-Adjacent Columns#

Sometimes you may want to select a mix of adjacent (e.g., columns 0-1) and non-adjacent (e.g., columns 3 and 5) columns. A single iloc indexer cannot mix slice syntax with a list, so convert the adjacent range to a list with list(range()) and concatenate it with the list of non-adjacent positions.

Example: Select Columns 0-1 (adjacent) and 3, 5 (non-adjacent)#

# Columns 0-1 (adjacent) + 3,5 (non-adjacent)  
columns_to_select = list(range(0, 2)) + [3, 5]  # [0,1,3,5]  
selected_mixed = titanic.iloc[:, columns_to_select]  
print("Mixed Selection Columns:\n", selected_mixed.columns)  

Output:

Mixed Selection Columns:  
Index(['survived', 'pclass', 'age', 'parch'], dtype='object')  
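If NumPy is available, np.r_ offers another way to build the same position list, since it accepts slice notation and single integers together. The sketch below uses a made-up six-column frame (names c0 to c5 are assumptions) rather than the Titanic data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with six columns named c0..c5.
df = pd.DataFrame([[0, 1, 2, 3, 4, 5]], columns=[f"c{i}" for i in range(6)])

# np.r_ expands the slice 0:2 and appends 3 and 5 -> positions 0, 1, 3, 5
positions = np.r_[0:2, 3, 5]
selected = df.iloc[:, positions]

print(positions.tolist())
print(list(selected.columns))
```

Note that np.r_ knows nothing about the DataFrame, so each slice needs an explicit stop value.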

4. Common Pitfalls and How to Avoid Them#

While iloc is powerful, it’s easy to make mistakes. Here are key pitfalls and fixes:

Pitfall 1: Confusing 0-Based vs. 1-Based Indexing#

Problem: Assuming the first column is position 1 (instead of 0).
Example:

# Trying to select the first column (A) with [1] (incorrect)  
df.iloc[:, [1]]  # Returns column B instead of A  

Fix: Use 0 for the first column:

df.iloc[:, [0]]  # Correctly selects column A  

Pitfall 2: Using Negative Indices (Without Awareness)#

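iloc supports negative indices that count from the end (e.g., -1 is the last column, -2 the second to last). These are easy to misread, and the column they point to changes whenever columns are appended or removed.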
Example:

df.iloc[:, [-1]]  # Selects the last column (E in our sample DataFrame)  

Fix: Explicitly note negative indices in comments to avoid ambiguity.
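To make the behavior concrete, here is a small sketch showing that a negative position and its non-negative equivalent select the same column. The modulo step at the end is just one possible way to normalize positions for logging or comments.

```python
import pandas as pd

df = pd.DataFrame({"A": [1], "B": [2], "C": [3], "D": [4], "E": [5]})
n_cols = df.shape[1]

last_by_negative = df.iloc[:, [-1]]            # last column (E)
last_by_positive = df.iloc[:, [n_cols - 1]]    # same column via a non-negative position
first_and_last = df.iloc[:, [0, -1]]           # mixing both styles works too

# Convert negative positions to their non-negative equivalents.
normalised = [p % n_cols for p in [0, -1, -2]]

print(list(first_and_last.columns))
print(normalised)
```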

Pitfall 3: Using String Names in iloc#

iloc only accepts positional indexers (integers, lists of integers, slices, or boolean arrays). Passing column names raises an error.
Problem:

df.iloc[:, ['A']]  # IndexError: .iloc requires numeric indexers

Fix: Use loc for column names, or iloc with positions:

df.loc[:, ['A']]  # Correct (using loc with names)  
df.iloc[:, [0]]   # Correct (using iloc with position)  
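If you start from names but still need positions (for instance, to pass them to iloc), the column Index can translate for you. The sketch below uses Index.get_loc and Index.get_indexer. The names list and the missing-name check are illustrative assumptions about how you might wire this up.

```python
import pandas as pd

df = pd.DataFrame({"A": [1], "B": [2], "C": [3], "D": [4], "E": [5]})

# Position of a single column name.
pos_a = df.columns.get_loc("A")

# Positions of several names at once; get_indexer marks missing names with -1.
names = ["A", "C", "E"]
positions = df.columns.get_indexer(names)
if (positions == -1).any():
    raise KeyError(f"some of {names} are not columns of the frame")

selected = df.iloc[:, positions]
print(positions.tolist())
```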

Pitfall 4: Out-of-Bounds Column Indices#

Passing a list that contains a position equal to or greater than the number of columns raises an IndexError.
Problem:

df.iloc[:, [10]]  # df has 5 columns (0-4) → IndexError  

Fix: Check column count first with len(df.columns):

print(f"Total columns: {len(df.columns)}")  # Ensure indices are < len(df.columns)  
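The check above can be folded into a small guard. The helper below, select_by_position, is a hypothetical name and not part of Pandas. It is a minimal sketch that reports the offending positions before calling iloc. The last line also illustrates that slices, unlike lists, are clipped quietly rather than raising.

```python
import pandas as pd


def select_by_position(frame, positions):
    """Hypothetical helper: validate positions, then select columns with iloc."""
    n_cols = frame.shape[1]
    # Valid positions run from -n_cols to n_cols - 1 inclusive.
    bad = [p for p in positions if not -n_cols <= p < n_cols]
    if bad:
        raise IndexError(
            f"positions {bad} fall outside the valid range -{n_cols}..{n_cols - 1}"
        )
    return frame.iloc[:, list(positions)]


df = pd.DataFrame({"A": [1], "B": [2], "C": [3], "D": [4], "E": [5]})

ok = select_by_position(df, [0, 2, 4])

try:
    select_by_position(df, [0, 10])
    message = None
except IndexError as exc:
    message = str(exc)
print(message)

# A slice that runs past the end is clipped instead of raising.
clipped = df.iloc[:, 3:10]
print(list(clipped.columns))
```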

5. Best Practices#

To make your code readable and robust when selecting non-adjacent columns with iloc:

1. Document Column Positions#

Comment the purpose of each column position to clarify intent:

# Select 'survived' (0), 'age' (3), 'fare' (6)  
selected = titanic.iloc[:, [0, 3, 6]]  

2. Use Variables for Reusable Column Indices#

Assign column positions to variables for clarity, especially in large projects:

col_survived = 0  
col_age = 3  
col_fare = 6  
selected = titanic.iloc[:, [col_survived, col_age, col_fare]]  

3. Validate Selections with df.columns#

Check that your selected positions map to the correct columns:

selected_cols = [0, 3, 6]  
print("Selected Column Names:", titanic.columns[selected_cols])  # Verify names  
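Printing is fine while exploring. In a script, you may prefer a check that fails loudly. The sketch below assumes a tiny stand-in frame with made-up rows (not the real Titanic data) and compares the names found at the chosen positions against the names you expect.

```python
import pandas as pd

# Stand-in frame with made-up rows; only the column order matters here.
df = pd.DataFrame({
    "survived": [0, 1],
    "pclass": [3, 1],
    "sex": ["male", "female"],
    "age": [22.0, 38.0],
})

selected_cols = [0, 3]
expected_names = ["survived", "age"]

# Look up the names at those positions and compare before selecting.
actual_names = df.columns[selected_cols].tolist()
if actual_names != expected_names:
    raise ValueError(f"expected {expected_names}, found {actual_names}")

selected = df.iloc[:, selected_cols]
print(actual_names)
```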

4. Avoid Over-Reliance on Positions#

If column order might change (e.g., after data preprocessing), prefer loc with names. Use iloc only when column positions are stable.
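A quick illustration of why this matters: in the sketch below, reordering a small made-up frame changes what the same positions return, while the name-based selection is unaffected.

```python
import pandas as pd

df = pd.DataFrame({"A": [1], "B": [2], "C": [3]})

# Simulate a preprocessing step that reorders the columns.
shuffled = df[["C", "A", "B"]]

by_position_before = df.iloc[:, [0, 2]].columns.tolist()        # A and C
by_position_after = shuffled.iloc[:, [0, 2]].columns.tolist()   # now C and B
by_name_after = shuffled.loc[:, ["A", "C"]].columns.tolist()    # still A and C

print(by_position_before, by_position_after, by_name_after)
```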

6. Conclusion#

Selecting non-adjacent columns by position in Pandas is straightforward with iloc. By using a list of integers as the column indexer, you can quickly extract the columns you need, even in large datasets. Remember to:

  • Use 0-based indexing.
  • Combine list(range()) with explicit lists for mixed adjacent/non-adjacent selections.
  • Avoid common pitfalls like string indices or out-of-bounds errors.
  • Document and validate your selections for readability.

With these techniques, you’ll streamline your data wrangling workflow and avoid costly mistakes.
