https://www.youtube.com/watch?v=9rej66Crlcm
Learning to work with AI programmatically opens a world of possibilities beyond using ChatGPT in a browser. When you understand how to connect to AI services through an application programming interface (API), you can build custom applications, integrate AI into existing systems, and create personalized experiences tailored to your exact needs.
In this hands-on tutorial, we'll build a fully functional chatbot from scratch using Python and the OpenAI API. You'll learn how to manage conversation history, control costs with a token budget, and create a custom AI personality that persists across multiple exchanges. By the end, you'll have both a working chatbot and the foundational skills to build more sophisticated AI-powered applications.
Why build your own chatbot?
While AI tools like ChatGPT are powerful, building your own chatbot teaches you the skills needed to work with AI APIs professionally. You'll understand how conversation memory actually works, learn to manage API costs effectively, and gain the ability to customize AI behavior for specific use cases.
This knowledge translates directly into real-world applications: customer service bots that speak in your company's voice, educational assistants for specific subjects, or personal productivity tools that understand your workflow.
What you'll learn
By the end of this tutorial, you'll know how to:
- Connect to the OpenAI API with secure authentication
- Design custom AI personas using system prompts
- Build conversation loops that remember previous exchanges
- Implement token counting and budget management
- Structure chatbot code using functions and classes
- Handle API errors and edge cases gracefully
- Deploy your chatbot for others to use
Before you start: Setup guide
Prerequisites
You should be comfortable with basic Python concepts such as variables, functions, loops, and dictionaries. Familiarity with writing your own functions is especially important. Basic knowledge of APIs is helpful but not required – we'll cover what you need to know.
Environment setup
First, you'll need a local development environment. We suggest VS Code if you're new to local development, though any IDE will work.
Install the required libraries by running this command in your terminal:
pip install openai tiktoken
API key setup
To access an AI model, you have two options:
Free option: Sign up for Together AI, which provides $1 in free credit – more than enough for this entire tutorial. Their free model is slower, but it costs nothing.
Premium option: Use OpenAI directly. The model we'll use (gpt-4o-mini) is extremely affordable – our entire tutorial test cost less than 5 cents.
Important security note: Never hard-code API keys in your scripts. We'll use environment variables to keep them safe.
Windows users can set environment variables through Settings > Environment Variables, then restart the computer. Mac and Linux users can configure environment variables without rebooting.
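Once the variable is set, it's worth confirming that Python can actually see it. Here's a quick sanity check, assuming you named the variable OPENAI_API_KEY (or TOGETHER_API_KEY, as the later scripts do):
import os

# Confirm the API key is visible to Python before going further
key = os.getenv("OPENAI_API_KEY") or os.getenv("TOGETHER_API_KEY")
print("API key found!" if key else "No API key found – check your environment variables.")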
Part 1: Your first AI response
Let's start with the simplest possible chatbot – one that can respond to a single message. This foundation will teach you the core concepts before we add complexity.
Create a new file called chatbot.py and add this code:
import os
from openai import OpenAI
# Load API key securely from environment variables
api_key = os.getenv("OPENAI_API_KEY") or os.getenv("TOGETHER_API_KEY")
# Create the OpenAI client
client = OpenAI(api_key=api_key)
# Send a message and get a response
response = client.chat.completions.create(
model="gpt-4o-mini", # or "meta-llama/Llama-3.3-70B-Instruct-Turbo-Free" for Together
    messages=[
{"role": "system", "content": "You are a fed up and sassy assistant who hates answering questions."},
{"role": "user", "content": "What is the weather like today?"}
    ],
temperature=0.7,
max_tokens=100
)
# Extract and display the reply
reply = response.choices[0].message.content
print("Assistant:", reply)
Run this script and you'll see something like this:
Assistant: Oh fantastic, another weather question! I don't have real-time weather data, but here's a wild idea – maybe look outside your window or check a weather app like everyone else.
Understanding the code
The magic happens in the messages parameter, which uses three distinct roles:
- system: Defines the AI's personality and behavior. This is like briefing an actor on a character that shapes every response.
- user: Represents what you (or your users) type into the chatbot.
- assistant: The AI's replies (we'll add these to the conversation history later).
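To make the three roles concrete, here's a minimal sketch of what a short conversation looks like as a messages list once an assistant reply has been recorded (the content strings are made up for illustration):
messages = [
    {"role": "system", "content": "You are a fed up and sassy assistant who hates answering questions."},
    {"role": "user", "content": "What is the weather like today?"},
    {"role": "assistant", "content": "Oh, thrilling. I don't have weather data. Try a window."},
    {"role": "user", "content": "Fine, what about tomorrow?"},  # this follow-up is sent along with all prior turns
]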
Key parameters explained
temperature controls the AI's "creativity". Lower values (0–0.3) produce consistent, predictable responses. Higher values (0.7–1.0) produce more creative but potentially unexpected results. We use 0.7 as a good balance.
max_tokens limits response length and protects your budget. A token corresponds to roughly between half a word and a full word, so 100 tokens allow a substantial response while preventing runaway costs.
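You can check this rule of thumb yourself with the tiktoken library installed earlier – a small experiment using the cl100k_base encoding as an example:
import tiktoken

# Compare word count to token count for a sample sentence
enc = tiktoken.get_encoding("cl100k_base")
sentence = "Tokens are the basic units that language models read and write."
print(len(sentence.split()), "words ->", len(enc.encode(sentence)), "tokens")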
Part 2: Understanding AI variability
Run your script several times and notice how the response differs each run. This happens because AI models use probabilistic sampling – rather than always picking the single "best" next word, they sample from likely options based on context.
Let's experiment with this by editing your temperature:
# Try temperature=0 for consistent responses
temperature=0,
max_tokens=100
Run this version multiple times and observe the more consistent (though not identical) responses.
Now try temperature=1.0 and see how much more creative and unpredictable the responses become. Higher temperatures often produce longer responses, which leads us to an important lesson about cost management.
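If you want to compare settings side by side, here's a quick sketch that loops over a few temperatures (the prompt is just an example, and the client setup mirrors chatbot.py):
import os
from openai import OpenAI

# Same client setup as chatbot.py
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY") or os.getenv("TOGETHER_API_KEY"))

# Compare outputs across temperatures with the same prompt
for temp in [0.0, 0.7, 1.0]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # swap in the Together model name if using Together
        messages=[{"role": "user", "content": "Describe the sky in one sentence."}],
        temperature=temp,
        max_tokens=50,  # always cap output tokens while experimenting
    )
    print(f"temperature={temp}: {response.choices[0].message.content}")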
Lesson learned: While developing a different project, I accidentally spent $20 on a single API call because I forgot to set max_tokens when processing a large file. Always set token limits when experimenting!
Part 3: Refactoring with functions
As your chatbot becomes more complex, organized code becomes essential. Let's refactor our script to use functions and module-level configuration variables.
Create app.py with this code:
import os
from openai import OpenAI
# Configuration variables
api_key = os.getenv("OPENAI_API_KEY") or os.getenv("TOGETHER_API_KEY")
client = OpenAI(api_key=api_key)
MODEL = "gpt-4o-mini" # or "meta-llama/Llama-3.3-70B-Instruct-Turbo-Free"
TEMPERATURE = 0.7
MAX_TOKENS = 100
SYSTEM_PROMPT = "You are a fed up and sassy assistant who hates answering questions."
def chat(user_input):
"""Send a message to the AI and return the response."""
response = client.chat.completions.create(
model=MODEL,
        messages=[
{"role": "system", "content": SYSTEM_PROMPT},
{"role": "user", "content": user_input}
        ],
temperature=TEMPERATURE,
max_tokens=MAX_TOKENS
)
    reply = response.choices[0].message.content
return reply
# Test the function
print(chat("How are you doing today?"))
This refactoring makes our code more maintainable and reusable. The configuration variables let us adjust settings easily, while the function encapsulates the chat logic for reuse.
Part 4: Adding conversation memory
Real chatbots remember previous exchanges. Let's add conversation memory by maintaining a growing list of messages.
Create part3_chat_loop.py:
import os
from openai import OpenAI
# Configuration
api_key = os.getenv("OPENAI_API_KEY") or os.getenv("TOGETHER_API_KEY")
client = OpenAI(api_key=api_key)
MODEL = "gpt-4o-mini"
TEMPERATURE = 0.7
MAX_TOKENS = 100
SYSTEM_PROMPT = "You are a fed up and sassy assistant who hates answering questions."
# Initialize conversation with system prompt
messages = [{"role": "system", "content": SYSTEM_PROMPT}]
def chat(user_input):
"""Add user input to conversation and get AI response."""
# Add user message to conversation history
messages.append({"role": "user", "content": user_input})
# Get AI response using full conversation history
response = client.chat.completions.create(
model=MODEL,
messages=messages,
temperature=TEMPERATURE,
max_tokens=MAX_TOKENS
)
    reply = response.choices[0].message.content
# Add AI response to conversation history
messages.append({"role": "assistant", "content": reply})
return reply
# Interactive chat loop
while True:
user_input = input("You: ")
if user_input.strip().lower() in {"exit", "quit"}:
break
answer = chat(user_input)
print("Assistant:", answer)Now run your chat boot and try to ask the same question twice:
You: Hi, how are you?
Assistant: Oh fantastic, just living the dream of answering questions I don't care about. What do you want?
You: Hi, how are you?
Assistant: Seriously, again? Look, I'm here to help, not to exchange pleasantries all day. What do you need?
The AI remembers your previous question and responds accordingly.
How the memory works
Each time someone sends a message, we append both the user's input and the AI's response to the messages list. The API processes this entire conversation history as context to generate an appropriate response.
However, this creates a growing problem: longer conversations mean more tokens, which means higher costs.
Part 5: Token management and cost control
As a conversation grows, so does the token count – and your bill. Let's add smart token management to prevent runaway costs.
Create part4_final.py:
import os
from openai import OpenAI
import tiktoken
# Configuration
api_key = os.getenv("OPENAI_API_KEY") or os.getenv("TOGETHER_API_KEY")
client = OpenAI(api_key=api_key)
MODEL = "gpt-4o-mini"
TEMPERATURE = 0.7
MAX_TOKENS = 100
TOKEN_BUDGET = 1000 # Maximum tokens to keep in conversation
SYSTEM_PROMPT = "You are a fed up and sassy assistant who hates answering questions."
# Initialize conversation
messages = [{"role": "system", "content": SYSTEM_PROMPT}]
def get_encoding(model):
"""Get the appropriate tokenizer for the model."""
try:
return tiktoken.encoding_for_model(model)
except KeyError:
print(f"Warning: Tokenizer for model '{model}' not found. Falling back to 'cl100k_base'.")
return tiktoken.get_encoding("cl100k_base")
ENCODING = get_encoding(MODEL)
def count_tokens(text):
"""Count tokens in a text string."""
return len(ENCODING.encode(text))
def total_tokens_used(messages):
"""Calculate total tokens used in conversation."""
try:
        return sum(count_tokens(msg["content"]) for msg in messages)
except Exception as e:
print(f"(token count error): {e}")
return 0
def enforce_token_budget(messages, budget=TOKEN_BUDGET):
"""Remove old messages if conversation exceeds token budget."""
try:
while total_tokens_used(messages) > budget:
if len(messages) <= 2: # Keep system prompt + at least one exchange
break
messages.pop(1) # Remove oldest non-system message
except Exception as e:
print(f"(token budget error): {e}")
def chat(user_input):
"""Chat with memory and token management."""
messages.append({"role": "user", "content": user_input})
response = client.chat.completions.create(
model=MODEL,
messages=messages,
temperature=TEMPERATURE,
max_tokens=MAX_TOKENS
)
    reply = response.choices[0].message.content
messages.append({"role": "assistant", "content": reply})
# Prune old messages if over budget
enforce_token_budget(messages)
return reply
# Interactive chat with token monitoring
while True:
user_input = input("You: ")
if user_input.strip().lower() in {"exit", "quit"}:
break
answer = chat(user_input)
print("Assistant:", answer)
print(f"Current tokens: {total_tokens_used(messages)}")How does the token management work
The token management system works in three stages:
- Count tokens: We use tiktoken to count the tokens in each message
- Monitor the total: Track the total token count across the whole conversation
- Enforce the budget: When the conversation exceeds the token budget, automatically remove the oldest messages (but keep the system prompt)
Lesson learned: Different models use different tokenization schemes. The word "dog" might be 1 token in one model but 2 tokens in another. Our get_encoding function handles these differences gracefully by falling back to a default encoding.
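You can see this for yourself with two encodings that tiktoken ships with – a small illustration (the sample word is arbitrary):
import tiktoken

# The same text can produce different token counts under different encodings
text = "unbelievable"
for name in ["cl100k_base", "p50k_base"]:
    enc = tiktoken.get_encoding(name)
    print(f"{name}: {len(enc.encode(text))} token(s) -> {enc.encode(text)}")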
Run your chatbot and have a long conversation. Watch the token count climb, then notice when old messages get pruned. The chatbot keeps recent context while staying within budget.
Part 6: Production-ready code structure
For production applications, object-oriented design provides better organization and encapsulation. Here's how to convert our function-based code to a class-based approach:
Create oop_chatbot.py:
import os
import tiktoken
from openai import OpenAI
class Chatbot:
def __init__(self, api_key, model="gpt-4o-mini", temperature=0.7, max_tokens=100,
token_budget=1000, system_prompt="You are a helpful assistant."):
self.client = OpenAI(api_key=api_key)
self.model = model
self.temperature = temperature
self.max_tokens = max_tokens
self.token_budget = token_budget
        self.messages = [{"role": "system", "content": system_prompt}]
self.encoding = self._get_encoding()
def _get_encoding(self):
"""Get tokenizer for the model."""
try:
return tiktoken.encoding_for_model(self.model)
except KeyError:
print(f"Warning: No tokenizer found for model '{self.model}'. Falling back to 'cl100k_base'.")
return tiktoken.get_encoding("cl100k_base")
def _count_tokens(self, text):
"""Count tokens in text."""
return len(self.encoding.encode(text))
def _total_tokens_used(self):
"""Calculate total tokens in conversation."""
try:
            return sum(self._count_tokens(msg["content"]) for msg in self.messages)
except Exception as e:
print(f"(token count error): {e}")
return 0
def _enforce_token_budget(self):
"""Remove old messages if over budget."""
try:
while self._total_tokens_used() > self.token_budget:
if len(self.messages) <= 2:
break
self.messages.pop(1)
except Exception as e:
print(f"(token budget error): {e}")
def chat(self, user_input):
"""Send message and get response."""
self.messages.append({"role": "user", "content": user_input})
response = self.client.chat.completions.create(
model=self.model,
messages=self.messages,
temperature=self.temperature,
max_tokens=self.max_tokens
)
        reply = response.choices[0].message.content
self.messages.append({"role": "assistant", "content": reply})
self._enforce_token_budget()
return reply
def get_token_count(self):
"""Get current token usage."""
return self._total_tokens_used()
# Usage example
api_key = os.getenv("OPENAI_API_KEY") or os.getenv("TOGETHER_API_KEY")
if not api_key:
raise ValueError("No API key found. Set OPENAI_API_KEY or TOGETHER_API_KEY.")
bot = Chatbot(
api_key=api_key,
system_prompt="You are a fed up and sassy assistant who hates answering questions."
)
while True:
user_input = input("You: ")
if user_input.strip().lower() in {"exit", "quit"}:
break
response = bot.chat(user_input)
print("Assistant:", response)
print("Current tokens used:", bot.get_token_count())The class -based approach to the chat boot boots all the functionality, makes the code more maintained, and provides a clean interface for integration into large applications.
Testing your chatbot
Run your complete chatbot and test these scenarios:
- Memory test: Ask a question, then refer back to it later in the conversation (see the sketch after this list)
- Personality test: Confirm the sassy persona stays consistent across exchanges
- Token management test: Have a long conversation and watch the token count
- Error handling test: Try invalid input to verify errors are handled gracefully
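Here's a minimal scripted version of the memory test, assuming the Chatbot class and api_key from oop_chatbot.py are in scope:
# Scripted memory test: the second answer should mention "teal"
bot = Chatbot(api_key=api_key, system_prompt="You are a helpful assistant.")
print(bot.chat("My favorite color is teal."))
print(bot.chat("What is my favorite color?"))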
Common problems and solutions
Environment variable problems: If you get authentication errors, confirm your API key is configured correctly. Windows users may need to restart after setting environment variables.
Token counting inconsistencies: Different models use different tokenizers. Our fallback encoding provides reasonable estimates when a model's exact tokenizer isn't available.
Memory management: If the chatbot seems forgetful, your token budget may be too low, causing important context to be pruned too aggressively.
What’s ahead?
You now have a complete working chatbot with memory, personality, and cost control. Here are the natural next steps:
Quick extensions
- Web interface: Deploy with Streamlit or Gradio for a user-friendly interface
- Multiple personalities: Create different system prompts for different use cases
- Conversation persistence: Save conversations to JSON files so they survive restarts (see the sketch after this list)
- Usage analytics: Track token usage and costs over time
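For the persistence idea, a couple of hypothetical helper functions are enough, since the messages list is already JSON-serializable – a minimal sketch:
import json

# Hypothetical helpers: write and read the messages list as JSON
def save_conversation(messages, path="conversation.json"):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(messages, f, indent=2)

def load_conversation(path="conversation.json"):
    with open(path, encoding="utf-8") as f:
        return json.load(f)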
Advanced features
- Multi-model support: Compare responses from different AI models
- Custom knowledge: Connect your own documents or data sources
- Voice interface: Add speech-to-text and text-to-speech capabilities
- User authentication: Support multiple users with separate conversation histories
Production considerations
- Rate limiting: Handle API rate limits gracefully
- Monitoring: Add logging and error tracking
- Scalability: Design for multiple concurrent users
- Security: Implement proper input validation and sanitization
Key takeaways
Building your own chatbot teaches the fundamental skills for working professionally with AI APIs. You've learned to manage conversation state, control costs through token budgets, and structure code for maintainability.
These skills transfer directly to production applications: customer service bots, educational assistants, creative writing tools, and countless other AI-powered projects.
The chatbot you've built, together with the techniques you've mastered – API integration, memory management, and cost control – represents a solid foundation.
Remember to experiment with different personalities, temperature settings, and token budgets for your specific use case. The real power of building your own chatbot lies in customization you can't achieve through any off-the-shelf AI interface.
Resources and next steps
- Full code: All examples are available in the solution notebook
- Community support: Join the Dataquest Community to discuss your project and get help extending it
- Related learning: Explore API integration patterns and advanced techniques for building even more sophisticated applications
Start experimenting with your new chatbot, and remember that every conversation is a learning opportunity for both you and your AI assistant!