What Does Python’s __slots__ Actually Do?

Photo by Author | Canva

What if there were a way to make your Python code faster? __slots__ is easy to implement and can improve your code’s performance by reducing memory usage.

In this article, we will walk through how it works using a real-world data science project that Allegro uses as a challenge in its data science recruitment process. However, before we get to the project, let’s build a solid understanding of what __slots__ does.

What is `__slots__` in Python?

In Python, every object keeps a dictionary of its attributes. This lets you add, change, or delete attributes dynamically, but it comes at a cost: extra memory and slower attribute access.
The __slots__ declaration tells Python that these are the only attributes the object will ever need. This is a limitation, but it saves us time. Let’s look at an example.

class WithoutSlots:
    def __init__(self, name, age):
        self.name = name
        self.age = age

class WithSlots:
    __slots__ = ('name', 'age')

    def __init__(self, name, age):
        self.name = name
        self.age = age

In the second class, __slots__ tells Python not to create a dictionary for each object. Instead, it reserves a fixed space in memory for the name and age values, which makes attribute access faster and reduces memory usage.
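To make the difference concrete, the short sketch below repeats the two example classes and checks two things: whether an instance carries a `__dict__`, and what happens when an undeclared attribute is assigned. The `email` attribute and the sample values are made up for illustration.

```python
class WithoutSlots:
    def __init__(self, name, age):
        self.name = name
        self.age = age


class WithSlots:
    __slots__ = ('name', 'age')

    def __init__(self, name, age):
        self.name = name
        self.age = age


regular = WithoutSlots("Ada", 36)
slotted = WithSlots("Ada", 36)

# A regular instance stores its attributes in a per-instance dictionary.
has_dict_regular = hasattr(regular, "__dict__")   # True
# A slotted instance has no __dict__ at all.
has_dict_slotted = hasattr(slotted, "__dict__")   # False

# Adding an undeclared attribute works on the regular class...
regular.email = "ada@example.com"

# ...but raises AttributeError on the slotted class.
try:
    slotted.email = "ada@example.com"
    new_attribute_allowed = True
except AttributeError:
    new_attribute_allowed = False

print(has_dict_regular, has_dict_slotted, new_attribute_allowed)
```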

Why Use `__slots__`?

Now, before starting the data project, let’s go over the reasons why you should use __slots__.

Memory: Objects take up less space when Python does not create a dictionary for them.
Speed: Accessing values is quicker because Python knows where each value is stored.
Bugs: This structure avoids silent bugs because only the defined attributes are allowed.
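The memory point can be checked with the standard library’s `tracemalloc` module. The sketch below builds many instances of the two example classes and compares how many bytes each batch allocates. The helper name `allocated_bytes` and the instance count are arbitrary choices for illustration, and the exact numbers depend on your Python version.

```python
import tracemalloc


class WithoutSlots:
    def __init__(self, name, age):
        self.name = name
        self.age = age


class WithSlots:
    __slots__ = ('name', 'age')

    def __init__(self, name, age):
        self.name = name
        self.age = age


def allocated_bytes(cls, n=50_000):
    """Return the bytes allocated while building n instances of cls."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    # Keep the instances alive in a list so they are counted.
    objects = [cls("Ada", 36) for _ in range(n)]
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


regular_bytes = allocated_bytes(WithoutSlots)
slotted_bytes = allocated_bytes(WithSlots)

print(f"Regular instances: {regular_bytes:,} bytes")
print(f"Slotted instances: {slotted_bytes:,} bytes")
```

Because the same constant values are passed to every instance, the difference between the two totals comes from the instances themselves rather than from the data they hold.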

Example Use Case: Allegro’s Data Science Challenge

In this data project, Allegro asked data science candidates to build a machine learning model that predicts laptop prices.


Link to this data project:

There are three different datasets:

train_dataset.json
val_dataset.json
test_dataset.json

Good, let’s continue with the data exploration process.

Data Exploration

Now let’s load one of them to see the data structure.

import json
import pandas as pd

with open('train_dataset.json', 'r') as f:
    train_data = json.load(f)
# Drop rows with missing values and rebuild a clean 0..n-1 index.
df = pd.DataFrame(train_data).dropna().reset_index(drop=True)
df.head()

Here is the output.

Good, let’s look at the columns.

Here is the output.


Now, let’s check the numerical columns.

Here is the output.


Data Exploration with `__slots__` vs Regular Classes

Let’s create a class called SlottedDataExploration, which uses __slots__ to allow only one attribute, called df. Let’s see the code.

class SlottedDataExploration:
    __slots__ = ('df',)  # trailing comma makes this a one-element tuple

    def __init__(self, df):
        self.df = df

    def info(self):
        return self.df.info()

    def head(self, n=5):
        return self.df.head(n)

    def tail(self, n=5):
        return self.df.tail(n)

    def describe(self):
        return self.df.describe(include="all")
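One detail about the `__slots__` declaration is worth a quick check: parentheses alone do not make a tuple, so `('df')` is just the string `'df'`. Python accepts a single string as one slot name, so both spellings produce the same single slot, but the trailing comma makes the intent explicit. The class names below are hypothetical stand-ins.

```python
# Parentheses without a trailing comma do not create a tuple.
print(type(('df')).__name__)    # str
print(type(('df',)).__name__)   # tuple


class StringSlot:
    __slots__ = 'df'       # a single string is treated as one slot name


class TupleSlot:
    __slots__ = ('df',)    # explicit one-element tuple


# Both classes end up with exactly one slot called "df" and no __dict__.
string_obj, tuple_obj = StringSlot(), TupleSlot()
string_obj.df = "frame"
tuple_obj.df = "frame"
print(hasattr(string_obj, '__dict__'), hasattr(tuple_obj, '__dict__'))  # False False
```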

Now let’s see the same implementation with a regular class instead of __slots__.

class DataExploration:
    def __init__(self, df):
        self.df = df

    def info(self):
        return self.df.info()

    def head(self, n=5):
        return self.df.head(n)

    def tail(self, n=5):
        return self.df.tail(n)

    def describe(self):
        return self.df.describe(include="all")

You can read more about how class methods work in this Python Class Methods guide.

Performance Comparison: Time Benchmark

Now let’s measure the performance by measuring time and memory.

import time
from pympler import asizeof  # memory measurement

start_normal = time.time()
de = DataExploration(df)
_ = de.head()
_ = de.tail()
_ = de.describe()
_ = de.info()
end_normal = time.time()
normal_duration = end_normal - start_normal
normal_memory = asizeof.asizeof(de)

start_slotted = time.time()
sde = SlottedDataExploration(df)
_ = sde.head()
_ = sde.tail()
_ = sde.describe()
_ = sde.info()
end_slotted = time.time()
slotted_duration = end_slotted - start_slotted
slotted_memory = asizeof.asizeof(sde)

print(f"⏱️ Normal class duration: {normal_duration:.4f} seconds")
print(f"⏱️ Slotted class duration: {slotted_duration:.4f} seconds")

print(f"📦 Normal class memory usage: {normal_memory:.2f} bytes")
print(f"📦 Slotted class memory usage: {slotted_memory:.2f} bytes")

Now let’s see the result.

The slotted class is 46.45% faster, but the memory usage is the same in this example.
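A single `time.time()` measurement around a handful of pandas calls is sensitive to noise such as caching and warm-up, so the percentage can vary from run to run. As a complementary sketch, `timeit` can repeat an attribute read many times on small stand-in classes and report the best of several runs. `RegularPoint`, `SlottedPoint`, and `best_read_time` are hypothetical names, not part of the project, and the absolute numbers depend on your machine and Python version.

```python
import timeit


class RegularPoint:
    def __init__(self, x):
        self.x = x


class SlottedPoint:
    __slots__ = ("x",)

    def __init__(self, x):
        self.x = x


def best_read_time(obj, number=200_000, repeat=5):
    """Best-of-`repeat` time in seconds for `number` reads of obj.x."""
    timer = timeit.Timer("obj.x", globals={"obj": obj})
    return min(timer.repeat(repeat=repeat, number=number))


regular_time = best_read_time(RegularPoint(1))
slotted_time = best_read_time(SlottedPoint(1))

print(f"Regular attribute read: {regular_time:.4f} s")
print(f"Slotted attribute read: {slotted_time:.4f} s")
```

Taking the minimum over several repeats is a common way to reduce the influence of background activity on a micro-benchmark.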

Machine Learning in Action

Now, in this section, let’s continue with machine learning. But before doing so, let’s do a train and test split.

Train and Test Split

Now we have three different datasets (train, val, and test), so let’s first find their indices.

train_indeces = train_df.dropna().index
val_indeces = val_df.dropna().index
test_indeces = test_df.dropna().index

Now it’s time to assign these indices so we can easily select these datasets in the next step.

train_df = new_df.loc[train_indeces]
val_df = new_df.loc[val_indeces]
test_df = new_df.loc[test_indeces]

Great, now let’s format these data frames, because NumPy wants the flat (n,) format instead of (n, 1). To do this, we need to use .ravel() after to_numpy().

X_train, X_val, X_test = train_df[selected_features].to_numpy(), val_df[selected_features].to_numpy(), test_df[selected_features].to_numpy()
y_train, y_val, y_test = df.loc[train_indeces][label_col].to_numpy().ravel(), df.loc[val_indeces][label_col].to_numpy().ravel(), df.loc[test_indeces][label_col].to_numpy().ravel()
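To illustrate the shape difference, here is a small sketch that uses a made-up NumPy array rather than the project data: a column-shaped array of shape (n, 1) becomes a flat array of shape (n,) after `.ravel()`.

```python
import numpy as np

# A made-up label column with shape (3, 1), as a one-column selection would produce.
column = np.array([[1200.0], [850.0], [990.0]])
flat = column.ravel()

print(column.shape)  # (3, 1)
print(flat.shape)    # (3,)
```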

Applying the Machine Learning Models

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error 
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.ensemble import VotingRegressor
from sklearn import linear_model
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler, MaxAbsScaler
import matplotlib.pyplot as plt
from sklearn import tree
import seaborn as sns
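def rmse(y_true, y_pred):
    # Root mean squared error. Taking np.sqrt of the MSE avoids the `squared`
    # argument, which was deprecated in scikit-learn 1.4.
    return np.sqrt(mean_squared_error(y_true, y_pred))

def regression(regressor_name, regressor):
    # Scale the features, fit on the training set, and predict on the test set.
    pipe = make_pipeline(MaxAbsScaler(), regressor)
    pipe.fit(X_train, y_train)
    predicted = pipe.predict(X_test)
    rmse_val = rmse(y_test, predicted)
    print(regressor_name, ':', rmse_val)
    # Store the predictions next to the actual values.
    pred_df[regressor_name + '_Pred'] = predicted
    # Plot predicted vs. actual values with a fitted regression line.
    plt.figure(regressor_name)
    plt.title(regressor_name)
    plt.xlabel('predicted')
    plt.ylabel('actual')
    sns.regplot(y=y_test, x=predicted)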

Next, we will define a dictionary of regressors and run each model.

regressors = {
    'Linear' : LinearRegression(),
    'MLP': MLPRegressor(random_state=42, max_iter=500, learning_rate="constant", learning_rate_init=0.6),
    'DecisionTree': DecisionTreeRegressor(max_depth=15, random_state=42),
    'RandomForest': RandomForestRegressor(random_state=42),
    'GradientBoosting': GradientBoostingRegressor(random_state=42, criterion='squared_error',
                                                  loss="squared_error",learning_rate=0.6, warm_start=True),
    'ExtraTrees': ExtraTreesRegressor(n_estimators=100, random_state=42),
}
pred_df = pd.DataFrame(columns=["Actual"])
pred_df["Actual"] = y_test
for key in regressors.keys():
    regression(key, regressors[key])

Here are the results.

Now, let’s implement it with both slotted and regular classes.

Machine Learning with `__slots__` vs Regular Classes

Now let’s check the code with slots.

class SlottedMachineLearning:
    __slots__ = ('X_train', 'y_train', 'X_test', 'y_test', 'pred_df')

    def __init__(self, X_train, y_train, X_test, y_test):
        self.X_train = X_train
        self.y_train = y_train
        self.X_test = X_test
        self.y_test = y_test
        self.pred_df = pd.DataFrame({'Actual': y_test})

    def rmse(self, y_true, y_pred):
        # np.sqrt avoids the `squared` argument, deprecated in scikit-learn 1.4.
        return np.sqrt(mean_squared_error(y_true, y_pred))

    def regression(self, name, model):
        pipe = make_pipeline(MaxAbsScaler(), model)
        pipe.fit(self.X_train, self.y_train)
        predicted = pipe.predict(self.X_test)
        self.pred_df[name + '_Pred'] = predicted

        score = self.rmse(self.y_test, predicted)
        print(f"{name} RMSE:", score)

        plt.figure(figsize=(6, 4))
        sns.regplot(x=predicted, y=self.y_test, scatter_kws={"s": 10})
        plt.xlabel('Predicted')
        plt.ylabel('Actual')
        plt.title(f'{name} Predictions')
        plt.grid(True)
        plt.show()

    def run_all(self):
        models = {
            'Linear': LinearRegression(),
            'MLP': MLPRegressor(random_state=42, max_iter=500, learning_rate="constant", learning_rate_init=0.6),
            'DecisionTree': DecisionTreeRegressor(max_depth=15, random_state=42),
            'RandomForest': RandomForestRegressor(random_state=42),
            'GradientBoosting': GradientBoostingRegressor(random_state=42, learning_rate=0.6, warm_start=True),
            'ExtraTrees': ExtraTreesRegressor(n_estimators=100, random_state=42)
        }

        for name, model in models.items():
            self.regression(name, model)

Here is the regular class implementation.

class MachineLearning:
    def __init__(self, X_train, y_train, X_test, y_test):
        self.X_train = X_train
        self.y_train = y_train
        self.X_test = X_test
        self.y_test = y_test
        self.pred_df = pd.DataFrame({'Actual': y_test})

    def rmse(self, y_true, y_pred):
        # np.sqrt avoids the `squared` argument, deprecated in scikit-learn 1.4.
        return np.sqrt(mean_squared_error(y_true, y_pred))

    def regression(self, name, model):
        pipe = make_pipeline(MaxAbsScaler(), model)
        pipe.fit(self.X_train, self.y_train)
        predicted = pipe.predict(self.X_test)
        self.pred_df[name + '_Pred'] = predicted

        score = self.rmse(self.y_test, predicted)
        print(f"{name} RMSE:", score)

        plt.figure(figsize=(6, 4))
        sns.regplot(x=predicted, y=self.y_test, scatter_kws={"s": 10})
        plt.xlabel('Predicted')
        plt.ylabel('Actual')
        plt.title(f'{name} Predictions')
        plt.grid(True)
        plt.show()

    def run_all(self):
        models = {
            'Linear': LinearRegression(),
            'MLP': MLPRegressor(random_state=42, max_iter=500, learning_rate="constant", learning_rate_init=0.6),
            'DecisionTree': DecisionTreeRegressor(max_depth=15, random_state=42),
            'RandomForest': RandomForestRegressor(random_state=42),
            'GradientBoosting': GradientBoostingRegressor(random_state=42, learning_rate=0.6, warm_start=True),
            'ExtraTrees': ExtraTreesRegressor(n_estimators=100, random_state=42)
        }

        for name, model in models.items():
            self.regression(name, model)

Performance Comparison: Time Benchmark

Now let’s compare the two implementations, as we did in the previous section.

import time

start_normal = time.time()
ml = MachineLearning(X_train, y_train, X_test, y_test)
ml.run_all()
end_normal = time.time()
normal_duration = end_normal - start_normal
normal_memory = (
    ml.X_train.nbytes +
    ml.X_test.nbytes +
    ml.y_train.nbytes +
    ml.y_test.nbytes
)

start_slotted = time.time()
sml = SlottedMachineLearning(X_train, y_train, X_test, y_test)
sml.run_all()
end_slotted = time.time()
slotted_duration = end_slotted - start_slotted
slotted_memory = (
    sml.X_train.nbytes +
    sml.X_test.nbytes +
    sml.y_train.nbytes +
    sml.y_test.nbytes
)

print(f"⏱️ Normal ML class duration: {normal_duration:.4f} seconds")
print(f"⏱️ Slotted ML class duration: {slotted_duration:.4f} seconds")

print(f"📦 Normal ML class memory usage: {normal_memory:.2f} bytes")
print(f"📦 Slotted ML class memory usage: {slotted_memory:.2f} bytes")

time_diff = normal_duration - slotted_duration
percent_faster = (time_diff / normal_duration) * 100
if percent_faster > 0:
    print(f"✅ Slotted ML class is {percent_faster:.2f}% faster than the regular ML class.")
else:
    print(f"ℹ️ No speed improvement with slots in this run.")

memory_diff = normal_memory - slotted_memory
percent_smaller = (memory_diff / normal_memory) * 100
if percent_smaller > 0:
    print(f"✅ Slotted ML class uses {percent_smaller:.2f}% less memory than the regular ML class.")
else:
    print(f"ℹ️ No memory savings with slots in this run.")

Here is the output.
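Note that the memory figure in this benchmark sums the `nbytes` of the NumPy arrays, and both objects hold the same arrays, so that figure cannot differ between the two classes. A rough way to look at the container itself, sketched here with hypothetical stand-in classes and `sys.getsizeof`, is to add the size of the instance and, if present, its `__dict__`. This is an approximation that ignores the referenced values.

```python
import sys


class RegularHolder:
    def __init__(self, a, b):
        self.a = a
        self.b = b


class SlottedHolder:
    __slots__ = ("a", "b")

    def __init__(self, a, b):
        self.a = a
        self.b = b


def container_size(obj):
    """Approximate size of the instance itself plus its __dict__, if any."""
    size = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        size += sys.getsizeof(obj.__dict__)
    return size


payload = list(range(1000))  # stand-in for a large array shared by both objects
regular_size = container_size(RegularHolder(payload, payload))
slotted_size = container_size(SlottedHolder(payload, payload))

print(f"Regular container: {regular_size} bytes")
print(f"Slotted container: {slotted_size} bytes")
```

The saving is per instance, so it only becomes significant when a program creates many such objects rather than a single wrapper around large arrays.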

Conclusion

By preventing the dynamic creation of __dict__ for each instance, Python’s __slots__ is great at reducing memory usage and speeding up attribute access. You saw how it works in practice through both data exploration and machine learning, using Allegro’s real recruitment project.

On small datasets, the improvements may be modest. But as data scales, the benefits become more noticeable, especially in memory-bound or performance-critical applications.

Nate Rosidi is a data scientist and works in product strategy. He is also an adjunct professor teaching analytics, and is the founder of StrataScratch, a platform that helps data scientists prepare for their interviews with real interview questions from top companies. Nate writes on the latest trends in the career market, gives interview advice, shares data science projects, and covers everything SQL.
