

Photo by Author | Canva
With a large login model (LLMS), every coder today is! This is a message that you get from LLM promo content. This is not as true as any advertisement. The coding brake speed is much more than producing the code. However, translating English (or other natural languages) into viable SQL questions is one of LLM’s most forced use, and has its place in the world.
. Why use LLMS to produce SQL?
There are many benefits to using LLM to produce SQL, and, like everything, there are anything.
. Two types of text -to -SSQ LLM
We can distinguish between two of the most wide type of text -to -SSQL technology available in relation to their access to your database scheme.
- Llms without direct access
- LLM with direct access
!! 1. LLM without direct access to database scheme
These LLMs are not connected to the questions against the original database and do not implement them. The closest you can get is that you upload the datases you want to inquire. These tools depend on you to provide contexts about your scheme.
Examples of the device:
Use matters:
- Question Drafting and Prototing
- Learning and teaching
- Static code generation to review later
!! 2. LLM with direct access to database scheme
These LLMs are directly connected to your direct data sources, such as PostgresqlFor, for, for,. AsnophilicFor, for, for,. Big CuriOr Redshift. They allow you to produce, implement and return directly to your database from SQL questions.
Examples of the device:
Use matters:
- Analytics of conversation for business users
- Real Time Data Exploration
- Ambed AI Assistants in BI Platforms
. Stepped: How to go from the text to SQL
The basic workflow of obtaining SQL from the text is uniform, whether you use disconnected or associated LLM.
We will try to make a solution Question of interview from Shapif and Amazon Use the aforementioned steps in Chat GPT.
!! 1. Explain the scheme
In order to work on your data, LLM needs to clearly understand your data structure. It usually covers:
- The name of the table
- Names and types of column
- Relationships between tables (join, keys)
This information can be transmitted directly into indicators or recovered by dynamically using Search for vector within the Recovery Generation (RAG) Pipeline
!! 2. Indicated with natural tongue
Indications will usually contain two classes:
- Skima praise
- Question (Languages) for which we need SQL answer
For example: I first provide you with a quick structure that includes place holders. After that we will write an original gesture.
We will use Roll play indicatingWhich means to instruct Chattagat to play a special role.
Here is the way to create a gesture.
Dataset: My dataset consists of (number of tables) tables.
The first one is (table “” not found /)
with the following columns and data types:
(column names and data types)
The second table is (table “” not found /)
with the following columns and data types:
(column names and data types)
Question: (provide a question to be answered)
Assumptions: (provide assumptions for solving the question)
Role: (describe a role LLM has to play)
Let’s now write an original gesture that solves our interview questions.
This is, and we hope it will provide the right solution to the SQL. The names, data and assumptions of the column are pasted only from the question.
Dataset: My dataset consists of two tables.
The first one is "customers" with the following columns and data types:
address:
text
city:
text
first_name:
text
id:
bigint
last_name:
text
phone_number:
text
The second table is "orders" with the following columns and data types:
cust_id:
bigint
id:
bigint
order_date:
date
order_details:
text
total_order_cost:
bigint
Question: Find the customers with the highest daily total order cost between 2019-02-01 and 2019-05-01. If a customer had more than one order on a certain day, sum the order costs on a daily basis. Output each customer's first name, total cost of their items, and the date.
Assumption: For simplicity, you can assume that every first name in the dataset is unique.
Role: Act as a SQL expert and write a PostgreSQL query that answers the question.
!! 3. Create SQL Question
At this stage, the LLMSQL of your choice produces a question.
For example: Here’s a postgrade QL code chatiguepted for us.
SELECT
c.first_name,
daily_totals.order_date,
daily_totals.total_cost
FROM (
SELECT
cust_id,
order_date,
SUM(total_order_cost) AS total_cost
FROM orders
WHERE order_date BETWEEN '2019-02-01' AND '2019-05-01'
GROUP BY cust_id, order_date
) AS daily_totals
JOIN customers c ON c.id = daily_totals.cust_id
WHERE (daily_totals.cust_id, daily_totals.order_date) IN (
SELECT
cust_id,
order_date
FROM (
SELECT
cust_id,
order_date,
SUM(total_order_cost) AS total_cost,
RANK() OVER (ORDER BY SUM(total_order_cost) DESC) AS rnk
FROM orders
WHERE order_date BETWEEN '2019-02-01' AND '2019-05-01'
GROUP BY cust_id, order_date
) ranked
WHERE rnk = 1
);
!! 4. Follow the inquiry
Now, you can process the inquiry, be directly (if your LLM supports direct contacts) or by copying it and running it into your database system.
For example: In our case, we will copy the code to the Stratskrich Code Editor and use the ‘Check Solution’ button to correct the solution.
This is what is the result of it, and it is a right solution. Bravo for Chat GPT! She made her nail on the first attempt!
First_ Name | Order_Det | Total_cost |
---|---|---|
Burning | 2019-04-19 | 275 |
Hit | 2019-04-19 | 275 |
!! 5. Review, imagine, and improve
Depending on the purpose of using LLMS to write SQL code, this step can be optional. In the business world, you usually offer an output of inquiry in a user -friendly form, which usually includes:
- Showing the results as a table and/or chart
- Allowing follow -up requirements (such as, “Can you add customer City?”) And providing converted inquiry and output
. Drags and the best action
In our example, Chattagpat immediately. He came with the correct answer. However, this does not mean that it always happens, especially when data and requirements are more complicated. The use of LLMS is not without losses to get SQL questions from the text. If you want to make the LLM inquiry generation part of your data science workflow, you can avoid them by applying some of the best methods.
. Conclusion
LLM can be your best friend when you want to make SQL questions from the text. However, to make the best of these tools, you should have a clear understanding of what you want to achieve and use of the use of LLM.
This article provides you with such guidelines, as well as an example of how to indicate LLM in natural language and get a working SQL code.
Net Razii A data is in a scientist and product strategy. He is also an affiliated professor of Teaching Analytics, and is the founder of Stratskrich, a platform that helps data scientists prepare for his interview with the real questions of high companies. The net carrier writes on the latest trends in the market, gives interview advice, sharing data science projects, and everything covers SQL.