ask chatGPT to read structured data using langchain

Shi
CI/CD/DevOps
Published in
2 min readNov 25, 2023

--

I am following instructions at https://python.langchain.com/docs/use_cases/qa_structured/sql to experiment querying structured data using human language.

in short, we will write a very short python3 program to initialize SQLDatabaseChain to accept some intuitive human query again a small sqlite3 sample database.

# Set env var OPENAI_API_KEY or load from a .env file

import dotenv
from langchain.llms import OpenAI
from langchain.utilities import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain

dotenv.load_dotenv()
db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = OpenAI(temperature=0, verbose=True)
db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)

db_chain.run("How many employees are there?")

The details of the database can be found here at: https://github.com/lerocha/chinook-database/

I can’t reproduce the exact steps mentioned in langchain.com to create the .db, so I took the shortcut to download some existing db hosted https://www.sqlitetutorial.net/wp-content/uploads/2018/03/chinook.zip, and it seems to be working as expected.

so I documented the exact steps here: https://github.com/whoissqr/LLM-learn/blob/main/sqlite-test-langchain/

OK, I am thinking what would I do if I wish to do this without LLM?

sqlite> .tables
albums employees invoices playlists
artists genres media_types tracks
customers invoice_items playlist_track


sqlite> .schema employees
CREATE TABLE IF NOT EXISTS "employees"
(
[EmployeeId] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
[LastName] NVARCHAR(20) NOT NULL,
[FirstName] NVARCHAR(20) NOT NULL,
[Title] NVARCHAR(30),
[ReportsTo] INTEGER,
[BirthDate] DATETIME,
[HireDate] DATETIME,
[Address] NVARCHAR(70),
[City] NVARCHAR(40),
[State] NVARCHAR(40),
[Country] NVARCHAR(40),
[PostalCode] NVARCHAR(10),
[Phone] NVARCHAR(24),
[Fax] NVARCHAR(24),
[Email] NVARCHAR(60),
FOREIGN KEY ([ReportsTo]) REFERENCES "employees" ([EmployeeId])
ON DELETE NO ACTION ON UPDATE NO ACTION
);
CREATE INDEX [IFK_EmployeeReportsTo] ON "employees" ([ReportsTo]);

sqlite> select count(EmployeeId) from employees;
8

okay, first, the results from LLM are correct; and, since I didn’t have experience with sqlite, it actually took me a while to figure out the command line above.

And, 5 mins later, OMG, the scary part, GPT is smarter than I thought:

--

--

Shi
CI/CD/DevOps

I am a coder/engineer/application security specialist. I like to play around with language and tools; I have strong interest in efficiency improvement.