How to Select Local Minima and Maxima from pandas Series: Fixing scipy.signal.argrelextrema Error
Local minima and maxima (collectively "extrema") are critical data points in time series analysis, financial modeling, sensor data processing, and scientific research. They represent peaks (maxima) and troughs (minima) that often signal trend changes, anomalies, or key events. For example, in stock prices, local maxima might indicate sell signals, while minima could suggest buy opportunities.
Pandas, a popular Python library for data manipulation, is often used to store and preprocess time series data. To identify extrema, many analysts turn to scipy.signal.argrelextrema, a function designed to detect local maxima and minima in arrays. However, argrelextrema can be tricky to use with pandas Series, leading to errors like "ValueError: The length of the input array must be at least order + 1" or unexpected results (e.g., missing extrema in flat regions).
This blog will demystify the process of finding local extrema in pandas Series using argrelextrema. We’ll cover:
- What local extrema are and why they matter.
- How to use argrelextrema with pandas Series.
- Common errors and their root causes.
- Step-by-step fixes for these errors.
- Alternative methods for finding extrema.
Table of Contents#
- Understanding Local Minima and Maxima
- Using scipy.signal.argrelextrema with pandas Series
- Common Errors with argrelextrema
- Fixing the Errors: Practical Solutions
- Alternative Methods to Find Extrema
- Practical Example: Stock Price Data
- Conclusion
- References
1. Understanding Local Minima and Maxima#
A local maximum is a data point greater than its neighboring points, and a local minimum is a data point smaller than its neighbors. For example, in the series [1, 3, 2, 5, 4]:
- 3 is a local maximum (greater than 1 and 2).
- 2 is a local minimum (smaller than 3 and 5).
- 5 is a local maximum (greater than 2 and 4).
Formally, for a point \( x_i \) in a series \( x \):
- \( x_i \) is a local maximum if \( x_i > x_{i-1}, x_{i-2}, \dots, x_{i-\text{order}} \) and \( x_i > x_{i+1}, x_{i+2}, \dots, x_{i+\text{order}} \), where order is the number of neighbors to compare on each side.
- \( x_i \) is a local minimum if \( x_i < x_{i-1}, \dots, x_{i-\text{order}} \) and \( x_i < x_{i+1}, \dots, x_{i+\text{order}} \).
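To make the definition concrete, here is a minimal brute-force sketch that applies it literally. The helper name `local_extrema_bruteforce` and its `kind` argument are made up for illustration, and it assumes that only interior points with a full set of `order` neighbors on both sides are candidates.

```python
import numpy as np

def local_extrema_bruteforce(values, order=1, kind="max"):
    """Return positions whose value strictly beats every neighbor within `order` steps.

    Hypothetical helper for illustration: points closer than `order`
    positions to either end are skipped because they lack a full window.
    """
    values = np.asarray(values, dtype=float)
    found = []
    for i in range(order, len(values) - order):
        left = values[i - order:i]
        right = values[i + 1:i + order + 1]
        neighbors = np.concatenate([left, right])
        if kind == "max" and np.all(values[i] > neighbors):
            found.append(i)
        elif kind == "min" and np.all(values[i] < neighbors):
            found.append(i)
    return found

data = [1, 3, 2, 5, 4]
maxima = local_extrema_bruteforce(data, order=1, kind="max")
minima = local_extrema_bruteforce(data, order=1, kind="min")
print(maxima)  # positions of 3 and 5
print(minima)  # position of 2
```

This loop is far slower than a vectorized routine, but it is handy as a reference when you want to sanity-check results on a small sample.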
2. Using scipy.signal.argrelextrema with pandas Series#
scipy.signal.argrelextrema is a powerful tool to find indices of local extrema in an array. It works by comparing each element to its neighbors (controlled by the order parameter) and returns indices where extrema occur.
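One detail that is easy to miss: the function returns a tuple of index arrays (one per dimension of the input), not a bare array. The short sketch below, using a toy array chosen for illustration, shows why the examples in this post append `[0]` to the call.

```python
import numpy as np
from scipy.signal import argrelextrema

values = np.array([1, 3, 2, 5, 4])
result = argrelextrema(values, np.greater)  # order defaults to 1

print(type(result), len(result))  # a tuple with one entry for a 1-D input
peak_positions = result[0]        # integer positions into `values`
print(peak_positions)
print(values[peak_positions])     # the peak values themselves
```

Because the positions are plain integers, they pair naturally with `Series.iloc` rather than with label-based indexing.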
Step 1: Install Required Libraries#
Ensure scipy and pandas are installed:
pip install scipy pandas numpy
Step 2: Basic Usage with pandas Series#
To use argrelextrema with a pandas Series:
- Convert the Series to a numpy array (since argrelextrema requires arrays, not Series).
- Specify order (number of neighbors to compare on each side).
- Use comparison functions like np.greater (for maxima) or np.less (for minima).
Example Code:
import pandas as pd
import numpy as np
from scipy.signal import argrelextrema
# Create a sample pandas Series (e.g., a noisy sine wave)
np.random.seed(42)
x = np.linspace(0, 10, 100)
y = np.sin(x) + np.random.normal(0, 0.1, 100) # Noisy sine wave
series = pd.Series(y, index=x)
# Find local maxima (order=2: compare 2 neighbors on each side)
max_indices = argrelextrema(series.values, np.greater, order=2)[0]
local_max = series.iloc[max_indices]
# Find local minima
min_indices = argrelextrema(series.values, np.less, order=2)[0]
local_min = series.iloc[min_indices]
print("Local Maxima Indices:", max_indices)
print("Local Maxima Values:\n", local_max.head())
Key Parameters of argrelextrema:#
- data: Input array (convert a pandas Series with .values or .to_numpy()).
- comparator: Function used to compare elements (e.g., np.greater for maxima, np.less for minima).
- order: Number of neighbors to compare on each side (default=1).
- mode: How to handle edge values. 'clip' (the default) clips out-of-range neighbor positions to the first/last element, which behaves like padding with the edge value; 'wrap' treats the data as periodic.
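The effect of mode is easiest to see on a tiny array whose largest value sits at the start. The sketch below uses made-up data and compares the two modes; with 'wrap' the last element acts as the left neighbor of the first one.

```python
import numpy as np
from scipy.signal import argrelextrema

values = np.array([5, 1, 2, 3, 4])

# 'clip': the first element is compared against itself on the left side
clip_max = argrelextrema(values, np.greater, order=1, mode="clip")[0]
# 'wrap': the first element is compared against the last element (4)
wrap_max = argrelextrema(values, np.greater, order=1, mode="wrap")[0]

print("clip:", clip_max)
print("wrap:", wrap_max)
```

'wrap' only makes sense when the data really is periodic (for example, one full cycle of a seasonal signal); for ordinary time series it would compare the first observation with the last, which is rarely meaningful.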
3. Common Errors with argrelextrema#
Despite its utility, argrelextrema often throws errors or returns unexpected results. Below are the most common issues and their causes:
Error 1: "ValueError: The length of the input array must be at least order + 1"#
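Cause: The input Series is shorter than the window needed to compare order neighbors. For order=k, a point needs \( k \) neighbors on each side, so a full comparison requires at least \( 2k + 1 \) elements. Note that, as far as the SciPy source goes, argrelextrema does not check this itself: with the default mode='clip', out-of-range neighbor positions are clipped to the first/last element, so a too-short input typically returns a (possibly misleading) result instead of raising. If you hit a length-related ValueError, check the traceback to see which layer raised it.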
Example:
short_series = pd.Series([1, 3, 2]) # Length=3
argrelextrema(short_series.values, np.greater, order=2) # order=2 wants 2*2+1=5 points, but the Series has only 3
Error 2: Flat Regions Are Ignored#
Cause: argrelextrema uses strict comparisons when you pass a strict comparator (e.g., np.greater requires \( x_i > x_j \)). Flat regions (e.g., [2, 2, 2]) or plateaus (e.g., [1, 3, 3, 3, 2]) will not be detected as extrema.
Example:
flat_series = pd.Series([1, 3, 3, 3, 2])
max_indices = argrelextrema(flat_series.values, np.greater, order=1)[0]
print("Max Indices:", max_indices) # Output: [] (no maxima detected!)
Error 3: Edge Extrema Are Missed#
Cause: Extrema at the start/end of the Series may be missed because there are not enough neighbors to compare. For example, the first element has no left neighbor; with the default mode='clip' it is compared against itself, so a strict comparator such as np.greater with order=1 will never flag it as a maximum.
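A quick way to confirm the edge behavior is to run a strict and a non-strict comparator on the same toy data, as in the sketch below. The array is made up for illustration, and the non-strict variant is shown only to expose the mechanism: it also flags every point of a plateau, so it is not a drop-in fix.

```python
import numpy as np
from scipy.signal import argrelextrema

values = np.array([5, 3, 4, 2, 1])  # the largest value sits at position 0

# Strict comparison: 5 > 5 is False for the clipped left neighbor
strict = argrelextrema(values, np.greater, order=1)[0]
# Non-strict comparison: 5 >= 5 is True, so the edge can qualify
non_strict = argrelextrema(values, np.greater_equal, order=1)[0]

print("np.greater:      ", strict)
print("np.greater_equal:", non_strict)
```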
Error 4: NaN Values Break the Function#
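Cause: argrelextrema does not treat NaN values specially. Any comparison involving NaN evaluates to False, so the NaN itself and every point that has a NaN within order neighbors are silently excluded. No exception is raised, which makes the problem easy to overlook.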
Example:
nan_series = pd.Series([1, 3, np.nan, 2, 5])
argrelextrema(nan_series.values, np.greater, order=1) # Returns no indices: the peak at 3 is lost because its right neighbor is NaN
4. Fixing the Errors: Practical Solutions#
Fix 1: Handle Short Series (Length Error)#
Solution: Ensure the Series length is at least \( 2 \times \text{order} + 1 \). If not, reduce order or filter short Series.
def safe_extrema(series, comparator, order=1):
    min_length = 2 * order + 1
    if len(series) < min_length:
        # Largest order whose full window (2*order + 1) still fits; never below 1
        order = max(1, (len(series) - 1) // 2)
        print(f"Warning: Series too short. Reducing order to {order}")
    return argrelextrema(series.values, comparator, order=order)[0]
# Test with short series
short_series = pd.Series([1, 3, 2])
max_indices = safe_extrema(short_series, np.greater, order=2) # len=3 → order reduced to 1 (2*1+1 = 3)
print("Fixed Max Indices:", max_indices) # Output: [1] (correct: 3 is a maximum)
Fix 2: Detect Flat Regions (Plateau Error)#
Solution: Use non-strict comparators like np.greater_equal (for maxima) or np.less_equal (for minima) to include flat regions.
flat_series = pd.Series([1, 3, 3, 3, 2])
# Use np.greater_equal to include plateaus
max_indices = argrelextrema(flat_series.values, np.greater_equal, order=1)[0]
print("Max Indices (with plateaus):", max_indices) # Output: [1 2 3] (all 3s are maxima)
Fix 3: Capture Edge Extrema#
Solution: Pad the Series with -inf (when searching for maxima) or inf (when searching for minima) to simulate neighbors for edge elements. Padding with the edge values themselves does not help with a strict comparator, because an edge value is never strictly greater than a copy of itself.
def pad_edges(series, order=1, pad_value=-np.inf):
    # Cast to float first: an integer array cannot hold -inf/inf
    values = series.to_numpy(dtype=float)
    # Pad 'order' elements to the left and right
    return np.pad(values, (order, order), mode='constant', constant_values=pad_value)
# Original series with edge maximum (first element)
edge_series = pd.Series([5, 3, 4, 2, 1])
order = 1
padded = pad_edges(edge_series, order=order, pad_value=-np.inf)
# Find maxima in padded array, then adjust indices by subtracting padding
max_indices_padded = argrelextrema(padded, np.greater, order=order)[0]
original_indices = max_indices_padded - order # Subtract left padding
print("Original Max Indices (with edges):", original_indices) # Output: [0 2] (5 and 4 are maxima)
Fix 4: Remove NaN Values#
Solution: Drop or interpolate NaNs before using argrelextrema.
nan_series = pd.Series([1, 3, np.nan, 2, 5])
clean_series = nan_series.dropna() # or nan_series.interpolate()
max_indices = argrelextrema(clean_series.values, np.greater, order=1)[0]
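print("Max Indices (cleaned):", max_indices) # Output: [1] (3 is a maximum; the trailing 5 is an edge value and needs Fix 3)
# Note: these are positions in clean_series, not labels of nan_series; use clean_series.index[max_indices] to get labels
5. Alternative Methods to Find Extrema#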
If argrelextrema feels cumbersome, try these pandas/numpy-based alternatives:
Method 1: Using pandas.diff()#
Detect sign changes in the first difference to identify extrema. A maximum occurs where the difference changes from positive to negative; a minimum where it changes from negative to positive.
def find_extrema_diff(series):
    diff = series.diff()
    # diff[i] = x[i] - x[i-1]; diff.shift(-1)[i] = x[i+1] - x[i]
    maxima = (diff > 0) & (diff.shift(-1) < 0)
    minima = (diff < 0) & (diff.shift(-1) > 0)
    return series[maxima], series[minima]
# Example
series = pd.Series([1, 3, 2, 5, 4])
max_diff, min_diff = find_extrema_diff(series)
print("Maxima (diff method):", max_diff) # Values: 3, 5
print("Minima (diff method):", min_diff) # Values: 2
Method 2: Rolling Window#
Use a rolling window to compare the central value to its neighbors.
def rolling_extrema(series, window=3):
    # Window=3 → compare 1 neighbor on each side
    roll_max = series.rolling(window=window, center=True).max()
    maxima = series == roll_max
    roll_min = series.rolling(window=window, center=True).min()
    minima = series == roll_min
    return series[maxima], series[minima]
# Example
series = pd.Series([1, 3, 2, 5, 4])
max_roll, min_roll = rolling_extrema(series, window=3)
print("Maxima (rolling):", max_roll) # Values: 3, 5
6. Practical Example: Stock Price Data#
Let’s apply argrelextrema to real-world stock price data to find local peaks and troughs.
Step 1: Load and Preprocess Data#
import yfinance as yf # Install with: pip install yfinance
# Download Apple stock data
ticker = "AAPL"
data = yf.download(ticker, start="2023-01-01", end="2023-06-01")
prices = data["Close"].squeeze() # Closing prices; squeeze() gives a Series in case yfinance returns a one-column DataFrame
# Clean data (remove NaNs)
prices = prices.dropna()
Step 2: Find Extrema with Error Fixes#
# Find maxima (order=5: compare 5 days on each side)
max_indices = argrelextrema(prices.values, np.greater, order=5)[0]
local_max = prices.iloc[max_indices]
# Find minima
min_indices = argrelextrema(prices.values, np.less, order=5)[0]
local_min = prices.iloc[min_indices]
# Plot results (matplotlib)
import matplotlib.pyplot as plt
plt.figure(figsize=(12, 6))
plt.plot(prices, label="AAPL Close Price")
plt.scatter(local_max.index, local_max, color='red', label='Local Maxima')
plt.scatter(local_min.index, local_min, color='green', label='Local Minima')
plt.title("AAPL Stock Price with Local Extrema (order=5)")
plt.legend()
plt.show()
Output: A plot with price data, red dots (maxima), and green dots (minima), highlighting key trend reversals.
7. Conclusion#
Identifying local extrema in pandas Series is critical for data analysis, but scipy.signal.argrelextrema requires careful handling to avoid errors. By addressing common issues like short series, flat regions, edge effects, and NaNs, you can reliably extract extrema. Alternatives like pandas.diff() or rolling windows offer simplicity for basic use cases, but argrelextrema remains superior for flexibility with neighbor comparisons.
Experiment with order and comparators to match your data’s characteristics, and always visualize results to validate extrema detection!
8. References#
- SciPy Documentation: scipy.signal.argrelextrema
- Pandas Documentation: Series.diff()
- Yahoo Finance API (yfinance)
- "Numerical Recipes in Python: The Art of Scientific Computing" (2007), Cambridge University Press.