Stress Testing FastAPI Application - KDnuggets

Image by Author

. Introduction

Stress testing is crucial for understanding how your application behaves under heavy load. For machine learning-powered APIs, it is especially important because model inference can be CPU-intensive. By simulating a large number of users, we can identify performance bottlenecks, determine our system's capacity, and ensure reliability.

In this tutorial, we will use:

FastAPI: A modern, fast (high-performance) web framework for building APIs with Python.
Uvicorn: An ASGI server to run our FastAPI application.
Locust: An open-source load testing tool. You define user behavior with Python code and swarm your system with hundreds of simultaneous users.
scikit-learn: For our example machine learning model.

. 1. Project Setup and Dependencies

Set up the project structure and install the necessary dependencies.

Create a requirements.txt file and add the following Python packages:

fastapi==0.115.12
locust==2.37.10
numpy==2.3.0
pandas==2.3.0
pydantic==2.11.5
scikit-learn==1.7.0
uvicorn==0.34.3
orjson==3.10.18

Open your terminal, create a virtual environment, and activate it.

python -m venv venv
venv\Scripts\activate

Install all the packages using the requirements.txt file.

pip install -r requirements.txt
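After installation, it can be useful to confirm that the environment matches the pins. The following is a minimal sketch using only the standard library; the helper name check_pins is hypothetical and not part of the project.

```python
from importlib import metadata

# Pins copied from the requirements.txt above.
PINNED = {
    "fastapi": "0.115.12",
    "locust": "2.37.10",
    "numpy": "2.3.0",
    "pandas": "2.3.0",
    "pydantic": "2.11.5",
    "scikit-learn": "1.7.0",
    "uvicorn": "0.34.3",
    "orjson": "3.10.18",
}


def check_pins(pins):
    """Return {package: status}; status is 'ok', 'missing', or 'mismatch (<installed>)'."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed == wanted else f"mismatch ({installed})"
    return report


if __name__ == "__main__":
    for package, status in check_pins(PINNED).items():
        print(f"{package:<14} {status}")
```

Run it inside the activated virtual environment; any line that does not say "ok" points to a package worth reinstalling.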

. 2. Building the FastAPI Application

In this section, we will create files for model training, the Pydantic models, and the FastAPI application.

The ml_model.py file handles the machine learning model. It uses a singleton pattern to ensure that only one instance of the model is loaded. The model is a Random Forest Regressor trained on the California Housing dataset. If a pre-trained model (model.pkl and scaler.pkl) does not exist, it trains and saves a new one.

app/ml_model.py:

import os
import threading

import joblib
import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

class MLModel:
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        if not hasattr(self, "initialized"):
            self.model = None
            self.scaler = None
            self.model_path = "model.pkl"
            self.scaler_path = "scaler.pkl"
            self.feature_names = None
            self.initialized = True
            self.load_or_create_model()

    def load_or_create_model(self):
        """Load existing model or create a new one using California housing dataset"""
        if os.path.exists(self.model_path) and os.path.exists(self.scaler_path):
            self.model = joblib.load(self.model_path)
            self.scaler = joblib.load(self.scaler_path)
            housing = fetch_california_housing()
            self.feature_names = housing.feature_names
            print("Model loaded successfully")
        else:
            print("Creating new model...")
            housing = fetch_california_housing()
            X, y = housing.data, housing.target
            self.feature_names = housing.feature_names

            X_train, X_test, y_train, y_test = train_test_split(
                X, y, test_size=0.2, random_state=42
            )

            self.scaler = StandardScaler()
            X_train_scaled = self.scaler.fit_transform(X_train)

            self.model = RandomForestRegressor(
                n_estimators=50,  # Reduced for faster predictions
                max_depth=8,  # Reduced for faster predictions
                random_state=42,
                n_jobs=1,  # Single thread for consistency
            )
            self.model.fit(X_train_scaled, y_train)

            joblib.dump(self.model, self.model_path)
            joblib.dump(self.scaler, self.scaler_path)

            X_test_scaled = self.scaler.transform(X_test)
            score = self.model.score(X_test_scaled, y_test)
            print(f"Model R² score: {score:.4f}")

    def predict(self, features):
        """Make prediction for house price"""
        features_array = np.array(features).reshape(1, -1)
        features_scaled = self.scaler.transform(features_array)
        prediction = self.model.predict(features_scaled)[0]
        return prediction * 100000

    def get_feature_info(self):
        """Get information about the features"""
        return {
            "feature_names": list(self.feature_names),
            "num_features": len(self.feature_names),
            "description": "California housing dataset features",
        }

# Initialize model as singleton
ml_model = MLModel()
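The double-checked locking in __new__ can be exercised in isolation. The sketch below uses a stripped-down stand-in class (Singleton, a hypothetical name) with the same __new__ logic but no model loading, and checks that many threads all receive the same object.

```python
import threading


class Singleton:
    """Stripped-down stand-in for MLModel: same __new__ logic, no model loading."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        if cls._instance is None:          # fast path, no lock once created
            with cls._lock:
                if cls._instance is None:  # re-check inside the lock
                    cls._instance = super().__new__(cls)
        return cls._instance


def collect_instance_ids(n_threads=16):
    """Construct Singleton() from several threads and return the distinct object ids seen."""
    seen = []
    seen_lock = threading.Lock()

    def worker():
        obj = Singleton()
        with seen_lock:
            seen.append(id(obj))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return set(seen)


if __name__ == "__main__":
    print(len(collect_instance_ids()))  # number of distinct instances observed
```

Note that the guarantee holds per process. When Uvicorn is started with several workers, each worker process would hold its own instance of the model.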

The pydantic_models.py file defines the Pydantic models for request and response data validation and serialization.

app/pydantic_models.py:

from typing import List

from pydantic import BaseModel, Field

class PredictionRequest(BaseModel):
    features: List[float] = Field(
        ...,
        description="List of 8 features: MedInc, HouseAge, AveRooms, AveBedrms, Population, AveOccup, Latitude, Longitude",
        min_length=8,
        max_length=8,
    )

    model_config = {
        "json_schema_extra": {
            "examples": [
                {"features": [8.3252, 41.0, 6.984, 1.024, 322.0, 2.556, 37.88, -122.23]}
            ]
        }
    }
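To see what the length constraint does in practice, here is a small sketch that rebuilds the same field and tries a valid and an invalid payload. It assumes Pydantic v2 (as pinned in requirements.txt); the helper is_valid is hypothetical.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class PredictionRequest(BaseModel):
    # Same constraint as in app/pydantic_models.py: exactly 8 floats.
    features: List[float] = Field(..., min_length=8, max_length=8)


def is_valid(features):
    """Return True if the payload passes validation, False otherwise."""
    try:
        PredictionRequest(features=features)
        return True
    except ValidationError:
        return False


if __name__ == "__main__":
    good = [8.3252, 41.0, 6.984, 1.024, 322.0, 2.556, 37.88, -122.23]
    print(is_valid(good))      # eight floats
    print(is_valid(good[:7]))  # one feature short
```

When this model is used as a FastAPI request body, a payload that fails validation is rejected with a 422 response before the endpoint function runs.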

app/main.py: This file is the core FastAPI application, which defines the API endpoints.

import asyncio
from contextlib import asynccontextmanager

from fastapi import FastAPI, HTTPException
from fastapi.responses import ORJSONResponse

from .ml_model import ml_model
from .pydantic_models import (
    PredictionRequest,
)

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Pre-load the model
    _ = ml_model.get_feature_info()
    yield

app = FastAPI(
    title="California Housing Price Prediction API",
    version="1.0.0",
    description="API for predicting California housing prices using Random Forest model",
    lifespan=lifespan,
    default_response_class=ORJSONResponse,
)

@app.get("/health")
async def health_check():
    """Health check endpoint"""
    return {"status": "healthy", "message": "Service is operational"}

@app.get("/model-info")
async def model_info():
    """Get information about the ML model"""
    try:
        feature_info = await asyncio.to_thread(ml_model.get_feature_info)
        return {
            "model_type": "Random Forest Regressor",
            "dataset": "California Housing Dataset",
            "features": feature_info,
        }
    except Exception:
        raise HTTPException(
            status_code=500, detail="Error retrieving model information"
        )

@app.post("/predict")
async def predict(request: PredictionRequest):
    """Make house price prediction"""
    if len(request.features) != 8:
        raise HTTPException(
            status_code=400,
            detail=f"Expected 8 features, got {len(request.features)}",
        )
    try:
        prediction = await asyncio.to_thread(ml_model.predict, request.features)
        return {
            "prediction": float(prediction),
            "status": "success",
            "features_used": request.features,
        }
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    except Exception:
        raise HTTPException(status_code=500, detail="Prediction error")

Key points:

lifespan manager: The ML model is loaded during application startup.
asyncio.to_thread: This is crucial because scikit-learn's prediction is CPU-bound and synchronous. Running it in a separate thread prevents it from blocking FastAPI's asynchronous event loop, which allows the server to handle other requests concurrently.
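The effect of asyncio.to_thread on the event loop can be shown with a small, self-contained experiment. This is only a sketch: slow_predict is a hypothetical stand-in that sleeps instead of running a model, and time.sleep releases the GIL, so it is an idealised case compared with real CPU-bound work.

```python
import asyncio
import time


def slow_predict(seconds=0.2):
    """Hypothetical stand-in for a blocking model call; it just sleeps."""
    time.sleep(seconds)
    return "done"


async def count_ticks(stop):
    """Count how often the event loop gets to run while other work is in flight."""
    ticks = 0
    while not stop.is_set():
        ticks += 1
        await asyncio.sleep(0.01)
    return ticks


async def run(offload):
    """Run slow_predict either inline or in a worker thread; return the tick count."""
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    await asyncio.sleep(0)  # give the ticker a chance to start
    if offload:
        await asyncio.to_thread(slow_predict)  # event loop stays free
    else:
        slow_predict()  # blocks the event loop for the whole call
    stop.set()
    return await ticker


if __name__ == "__main__":
    print("ticks while blocking: ", asyncio.run(run(offload=False)))
    print("ticks while offloaded:", asyncio.run(run(offload=True)))
```

With the inline call, the ticker barely advances because nothing else can run. With the offloaded call, it keeps ticking, which is the same mechanism that lets the server accept other requests during a prediction.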

Endpoints:

/health: A simple health check.
/model-info: Provides metadata about the ML model.
/predict: Accepts a list of features and returns a house price prediction.
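As a quick illustration of the /predict contract, the sketch below builds a POST request with the standard library. It assumes the server is listening on localhost:8000, as configured in run_server.py; the helper names build_predict_request and send are hypothetical.

```python
import json
import urllib.request

# Assumption: host and port match the uvicorn.run() call in run_server.py.
BASE_URL = "http://localhost:8000"


def build_predict_request(features, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /predict endpoint."""
    if len(features) != 8:
        raise ValueError(f"Expected 8 features, got {len(features)}")
    body = json.dumps({"features": list(features)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, send(build_predict_request([8.3252, 41.0, 6.984, 1.024, 322.0, 2.556, 37.88, -122.23])) should return a dictionary with the prediction, status, and features_used keys defined in app/main.py.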

run_server.py: This script runs the FastAPI application using Uvicorn.

import uvicorn

if __name__ == "__main__":

    uvicorn.run("app.main:app", host="localhost", port=8000, workers=4)

All files and configurations are available in the GitHub repository: kingabzpro/Stress-Testing-FastAPI

. 3. Writing the Locust Stress Test

Now, let's create the stress test script using Locust.

tests/locustfile.py: This file defines the behavior of the simulated users.

import json
import logging
import random

from locust import HttpUser, between, task

# Reduce logging to improve performance
logging.getLogger("urllib3").setLevel(logging.WARNING)

class HousingAPIUser(HttpUser):
    # Each simulated user waits 0.5 to 2 seconds between tasks
    wait_time = between(0.5, 2)

    def generate_random_features(self):
        """Generate random but realistic California housing features"""
        return (
            round(random.uniform(0.5, 15.0), 4),  # MedInc
            round(random.uniform(1.0, 52.0), 1),  # HouseAge
            round(random.uniform(2.0, 10.0), 2),  # AveRooms
            round(random.uniform(0.5, 2.0), 2),  # AveBedrms
            round(random.uniform(3.0, 35000.0), 0),  # Population
            round(random.uniform(1.0, 10.0), 2),  # AveOccup
            round(random.uniform(32.0, 42.0), 2),  # Latitude
            round(random.uniform(-124.0, -114.0), 2),  # Longitude
        )

    @task(1)
    def model_info(self):
        """Test model info endpoint"""
        with self.client.get("/model-info", catch_response=True) as response:
            if response.status_code == 200:
                response.success()
            else:
                response.failure(f"Model info failed: {response.status_code}")

    @task(3)
    def single_prediction(self):
        """Test single prediction endpoint"""
        features = self.generate_random_features()


        with self.client.post(
            "/predict", json={"features": features}, catch_response=True, timeout=10
        ) as response:
            if response.status_code == 200:
                try:
                    data = response.json()
                    if "prediction" in data:
                        response.success()
                    else:
                        response.failure("Invalid response format")
                except json.JSONDecodeError:
                    response.failure("Failed to parse JSON")
            elif response.status_code == 503:
                response.failure("Service unavailable")
            else:
                response.failure(f"Status code: {response.status_code}")

Key points:

Each simulated user will wait between 0.5 and 2 seconds between executing tasks.
The script generates realistic random feature data for the prediction requests.
Tasks are weighted so that each user makes roughly one model_info request for every three single_prediction requests.
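Locust chooses the next task at random in proportion to the @task weights, so the 1:3 ratio holds on average rather than in strict order. The sketch below imitates that selection with random.choices; it does not use Locust itself, and the function names are hypothetical.

```python
import random
from collections import Counter

# Weights mirror the @task decorators in tests/locustfile.py.
TASK_WEIGHTS = {"model_info": 1, "single_prediction": 3}


def expected_share(weights):
    """Long-run fraction of executions per task under weighted random choice."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def simulate(weights, n_picks, seed=0):
    """Draw n_picks tasks with weighted random choice and count them."""
    rng = random.Random(seed)
    names = list(weights)
    picks = rng.choices(names, weights=[weights[n] for n in names], k=n_picks)
    return Counter(picks)


if __name__ == "__main__":
    print(expected_share(TASK_WEIGHTS))
    print(simulate(TASK_WEIGHTS, 10_000))
```

Over a long run, about a quarter of the requests should hit /model-info and three quarters should hit /predict.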

. 4. Running the Stress Test

To evaluate the performance of your application under load, begin by starting your asynchronous machine learning application in one terminal by running the run_server.py script.

Model loaded successfully
INFO:     Started server process [26216]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on  (Press CTRL+C to quit)

Open your browser and navigate to the interactive API documentation (FastAPI serves it at /docs by default) to test your endpoints and make sure they are working correctly.

Open a new terminal window, activate the virtual environment, and navigate to your project's root directory to run Locust with the web UI:

locust -f tests/locustfile.py --host

Access the Locust web UI in your browser.

In the Locust web UI, set the total number of users to 500, the spawn rate to 10 users per second, and run it for a minute.
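These settings interact in a way that is easy to overlook. The following back-of-the-envelope sketch, using hypothetical helper names, works out how long the ramp-up takes and how much of the one-minute run is spent at full load. It also gives a rough upper bound on the request rate, assuming the 0.5 to 2 second wait described earlier and instantaneous responses.

```python
def ramp_up_seconds(users, spawn_rate):
    """Seconds needed to start all users at spawn_rate users per second."""
    return users / spawn_rate


def seconds_at_full_load(users, spawn_rate, run_time):
    """Time left in the run once every user has been spawned (never negative)."""
    return max(0.0, run_time - ramp_up_seconds(users, spawn_rate))


def max_request_rate(users, min_wait, max_wait):
    """Upper bound on requests/second if responses were instantaneous.

    Each user then issues one request per average wait interval.
    """
    return users / ((min_wait + max_wait) / 2)


if __name__ == "__main__":
    print(ramp_up_seconds(500, 10))           # 50.0
    print(seconds_at_full_load(500, 10, 60))  # 10.0
    print(max_request_rate(500, 0.5, 2.0))    # 400.0
```

With 500 users at 10 users per second, the ramp-up takes 50 seconds, so only the last 10 seconds of a 60-second run see all 500 users. A longer run time or a higher spawn rate gives a longer steady-state window.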

During the test, Locust will show real-time statistics, including the number of requests, failures, and response times for each endpoint.

Once the test is complete, click on the Charts tab to view interactive graphs showing the number of users, requests per second, and response times.

To run Locust without the web UI and automatically generate an HTML report, use the following command:

locust -f tests/locustfile.py --host  --users 500 --spawn-rate 10 --run-time 60s --headless  --html report.html

After the test is finished, an HTML report named report.html will be saved in your project directory for later review.

. Final Thoughts

Our app can handle a large number of users because we are using a simple machine learning model. The results show that the model-info endpoint has a higher response time than the prediction endpoint, which is impressive. This is the best-case scenario for testing your application locally before pushing it to production.

If you would like to try this setup, please visit the kingabzpro/Stress-Testing-FastAPI repository and follow the instructions in the documentation.

Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a master's degree in technology management and a bachelor's degree in telecommunication engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.
