We Tested 3 Feature Selection Techniques: This One Did the Best

by SkillAiNest

Image by editor

Introduction

In any machine learning project, feature selection can make or break your model. Choosing the right subset of features reduces noise, prevents overfitting, improves interpretability, and often improves accuracy. With too many irrelevant or redundant variables, models become bloated and slow to train. With too few, they risk losing critical signals.

To tackle this challenge, we experimented with three popular feature selection techniques on a real dataset. The goal was to determine which approach provides the best balance of performance, interpretability, and efficiency. In this article, we share our experience testing the three techniques and show which worked best for our data.

Why Does Feature Selection Matter?

When training a machine learning model, especially on high-dimensional data, not all features contribute equally. A lean, highly informative input set offers several benefits:

  • Reduced overfitting – eliminating irrelevant variables helps models generalize better to unseen data.
  • Faster training – fewer features mean shorter training times and lower computational costs.
  • Better interpretability – with a compact set of predictors, it is easier to explain what drives the model's decisions.
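The first of these benefits is easy to demonstrate. As a small illustrative sketch (not part of the original experiment), we can pad the diabetes data with pure-noise columns and watch the cross-validated R² of a linear model degrade:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)

# Append 100 pure-noise columns to the 10 real features
rng = np.random.default_rng(0)
X_noisy = np.hstack([X, rng.normal(size=(X.shape[0], 100))])

base = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2").mean()
noisy = cross_val_score(LinearRegression(), X_noisy, y, cv=5, scoring="r2").mean()
print(f"R2 with 10 real features:        {base:.3f}")
print(f"R2 with 100 noise columns added: {noisy:.3f}")
```

The noisy variant scores noticeably worse, even though it contains all the real information, because the model also fits the noise.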

The Dataset

For this experiment, we used the diabetes dataset that ships with scikit-learn. It includes 10 baseline features, such as body mass index (BMI), blood pressure, several serum measurements, and age, recorded for 442 patients. The target is a quantitative measure of disease progression one year after baseline.

Let’s load the dataset and prepare it:

import pandas as pd
from sklearn.datasets import load_diabetes

# Load dataset
data = load_diabetes(as_frame=True)
df = data.frame

X = df.drop(columns=["target"])
y = df["target"]

print(df.head())

Here, X contains the features and y contains the target. Now we can apply the different feature selection methods.

The Filter Method

Filter methods rank or eliminate features based on statistical properties of the data rather than on model training. They offer a simple, fast, and transparent way to remove obvious redundancy.

First, we examined the most highly correlated feature pairs and dropped any feature whose absolute correlation with another exceeded a threshold of 0.85.

import numpy as np

corr = X.corr()
threshold = 0.85
upper = corr.abs().where(np.triu(np.ones(corr.shape), k=1).astype(bool))
to_drop = [col for col in upper.columns if any(upper[col] > threshold)]
X_filter = X.drop(columns=to_drop)
print("Remaining features after filter:", X_filter.columns.tolist())

Output:

Remaining features after filter: ['age', 'sex', 'bmi', 'bp', 's1', 's3', 's4', 's5', 's6']

Only one redundant feature was removed, so the dataset retained 9 of the 10 predictors. This suggests the diabetes data is relatively clean in terms of correlation.
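Correlation pruning is only one kind of filter. Another common filter-style technique, shown here as an illustrative sketch rather than part of our original experiment, is univariate scoring with scikit-learn's SelectKBest, which rates each feature individually by its F-statistic against the target (the choice of k=5 below is arbitrary):

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SelectKBest, f_regression

data = load_diabetes(as_frame=True)
X = data.frame.drop(columns=["target"])
y = data.frame["target"]

# Score each feature on its own against the target and keep the top 5
selector = SelectKBest(score_func=f_regression, k=5).fit(X, y)
selected = X.columns[selector.get_support()].tolist()
print("Top 5 features by F-score:", selected)
```

Because each feature is scored in isolation, this is even cheaper than the pairwise correlation check, but it can miss features that are only useful in combination.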

The Wrapper Method

Wrapper methods evaluate subsets of features by actually training models and measuring their performance. A popular technique is recursive feature elimination (RFE).

RFE starts with all the features, fits a model, ranks the features by importance, and removes the least useful ones until the desired number remains.

from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import RFE

lr = LinearRegression()
rfe = RFE(lr, n_features_to_select=5)
rfe.fit(X, y)

selected_rfe = X.columns[rfe.support_]
print("Selected by RFE:", selected_rfe.tolist())

Output:

Selected by RFE: ['bmi', 'bp', 's1', 's2', 's5']

RFE selected 5 of the 10 features. The trade-off is that this approach is more computationally expensive, since it requires multiple rounds of model fitting.
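If fixing n_features_to_select=5 in advance feels arbitrary, scikit-learn also offers RFECV, which runs the same elimination loop under cross-validation and picks the feature count automatically. A minimal sketch (not part of the original comparison):

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

data = load_diabetes(as_frame=True)
X = data.frame.drop(columns=["target"])
y = data.frame["target"]

# Let cross-validation decide how many features to keep
cv = KFold(n_splits=5, shuffle=True, random_state=42)
rfecv = RFECV(LinearRegression(), cv=cv, scoring="r2").fit(X, y)

print("Optimal number of features:", rfecv.n_features_)
print("Selected:", X.columns[rfecv.support_].tolist())
```

This costs even more model fits than plain RFE, but it removes one hyperparameter from your hands.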

The Embedded Method

Embedded methods build feature selection into the model training process itself. Lasso regression (L1 regularization) is a classic example: it penalizes feature weights, shrinking the less important ones to exactly zero.

from sklearn.linear_model import LassoCV

lasso = LassoCV(cv=5, random_state=42).fit(X, y)

coef = pd.Series(lasso.coef_, index=X.columns)
selected_lasso = coef[coef != 0].index
print("Selected by Lasso:", selected_lasso.tolist())

Output:

Selected by Lasso: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's4', 's5', 's6']

Lasso kept 9 features and eliminated one that contributed little predictive power. In contrast to the filter method, however, the decision was based on model performance, not just correlation.
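To see which feature Lasso actually zeroed out, and which regularization strength the cross-validation settled on, the fitted model can be inspected directly. A quick sketch that refits the same LassoCV as above:

```python
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LassoCV

data = load_diabetes(as_frame=True)
X = data.frame.drop(columns=["target"])
y = data.frame["target"]

lasso = LassoCV(cv=5, random_state=42).fit(X, y)
coef = pd.Series(lasso.coef_, index=X.columns)

# alpha_ is the penalty strength chosen by cross-validation
print("Chosen alpha:", round(lasso.alpha_, 4))
print("Zeroed-out features:", coef[coef == 0].index.tolist())
print(coef[coef != 0].abs().sort_values(ascending=False))
```

Ranking the surviving coefficients by magnitude also gives a rough picture of which predictors the model leans on most.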

Comparing the Results

We trained a linear regression model on each selected feature set, using 5-fold cross-validation and measuring performance with the R² score and mean squared error (MSE).

from sklearn.model_selection import cross_val_score, KFold
from sklearn.linear_model import LinearRegression

# Helper evaluation function
def evaluate_model(X, y, model):
    cv = KFold(n_splits=5, shuffle=True, random_state=42)
    r2_scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
    mse_scores = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_squared_error")
    return r2_scores.mean(), -mse_scores.mean()

# 1. Filter Method results
lr = LinearRegression()
r2_filter, mse_filter = evaluate_model(X_filter, y, lr)

# 2. Wrapper (RFE) results
X_rfe = X[selected_rfe]
r2_rfe, mse_rfe = evaluate_model(X_rfe, y, lr)

# 3. Embedded (Lasso) results
X_lasso = X[selected_lasso]
r2_lasso, mse_lasso = evaluate_model(X_lasso, y, lr)

# Print results
print("=== Results Comparison ===")
print(f"Filter Method   -> R2: {r2_filter:.4f}, MSE: {mse_filter:.2f}, Features: {X_filter.shape[1]}")
print(f"Wrapper (RFE)   -> R2: {r2_rfe:.4f}, MSE: {mse_rfe:.2f}, Features: {X_rfe.shape[1]}")
print(f"Embedded (Lasso)-> R2: {r2_lasso:.4f}, MSE: {mse_lasso:.2f}, Features: {X_lasso.shape[1]}")
Output:

=== Results Comparison ===
Filter Method   -> R2: 0.4776, MSE: 3021.77, Features: 9
Wrapper (RFE)   -> R2: 0.4657, MSE: 3087.79, Features: 5
Embedded (Lasso)-> R2: 0.4818, MSE: 2996.21, Features: 9

The filter method removed only one redundant feature and gave a solid baseline performance. The wrapper (RFE) cut the feature set in half but lost a little accuracy. The embedded method (Lasso) kept 9 features and delivered the best R² and the lowest MSE. Overall, Lasso offered the best balance of accuracy, efficiency, and interpretability.
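One practical caveat is worth noting: because we selected features on the full dataset before cross-validating, a little information can leak across folds. A hedged sketch of a leakage-free variant, wrapping Lasso-based selection in a pipeline with scikit-learn's SelectFromModel (this is a variation, not the code that produced the numbers above):

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LassoCV, LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline

data = load_diabetes(as_frame=True)
X = data.frame.drop(columns=["target"])
y = data.frame["target"]

# The selector is re-fit inside each CV fold, so the test fold
# never influences which features are chosen
pipe = make_pipeline(
    SelectFromModel(LassoCV(cv=5, random_state=42)),
    LinearRegression(),
)
cv = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(pipe, X, y, cv=cv, scoring="r2")
print(f"Leakage-free R2: {scores.mean():.4f}")
```

On a dataset this small the difference is minor, but on larger pipelines keeping selection inside the cross-validation loop is the safer habit.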

Conclusion

Feature selection is not just a preprocessing step but a strategic decision that shapes the overall success of a machine learning pipeline. Our experiment reinforced that while simple filters and exhaustive wrappers each have their place, embedded methods often hit the sweet spot.

On the diabetes dataset, Lasso regularization came out as the clear winner. It helped us build a leaner, more accurate, and more explainable model without heavy computation.

For practitioners, the takeaway is this: do not blindly rely on a single method. Start with quick filters to remove obvious redundancy, try wrappers if you need an exhaustive search, and always consider embedded methods like Lasso for a practical balance.

Jayita Gulati is a machine learning enthusiast and technical writer driven by her passion for building machine learning models. She holds a Master's degree in Computer Science from the University of Liverpool.
