Hybrid Methods
Hybrid methods combine undersampling and oversampling strategies.
SVDDWSMOTE
Combines Support Vector Data Description with SMOTE.
from fairsample import SVDDWSMOTE
sampler = SVDDWSMOTE(
nu=0.5,
kernel='rbf',
gamma='scale',
k_neighbors=5,
random_state=42
)
X_resampled, y_resampled = sampler.fit_resample(X, y)
Parameters:
- nu: SVDD parameter (default: 0.5)
- kernel: Kernel type (default: 'rbf')
- gamma: Kernel coefficient (default: 'scale')
- k_neighbors: SMOTE neighbors (default: 5)
Best for: Small datasets with non-linear boundaries
ODBOT - Outlier Detection-Based Oversampling Technique
Uses outlier detection to guide oversampling.
from fairsample import ODBOT
sampler = ODBOT(
contamination=0.1,
n_neighbors=5,
random_state=42
)
X_resampled, y_resampled = sampler.fit_resample(X, y)
Parameters:
- contamination: Expected outlier proportion (default: 0.1)
- n_neighbors: Number of neighbors (default: 5)
Best for: Datasets with outliers
EHSO - Evolutionary Hybrid Sampling
Uses evolutionary algorithms for hybrid sampling.
from fairsample import EHSO
sampler = EHSO(
population_size=50,
generations=100,
mutation_rate=0.1,
random_state=42
)
X_resampled, y_resampled = sampler.fit_resample(X, y)
Parameters:
- population_size: EA population size (default: 50)
- generations: Number of generations (default: 100)
- mutation_rate: Mutation probability (default: 0.1)
Best for: Complex datasets, willing to trade speed for quality
Comparison
from fairsample.utils import compare_techniques
results = compare_techniques(
X, y,
techniques=['SVDDWSMOTE', 'ODBOT', 'EHSO'],
complexity_measures='basic'
)
print(results[['technique', 'N3', 'sample_size']])