How to Build a Smart Chatbot in 10 mins with LangChain

Alex Xu
Jun 06, 2023


A large number of people have shown a keen interest in learning how to build a smart chatbot. To help us gain a better understanding of the process, I'm excited to bring you a special guest post by Damien Benveniste. He is the author of The AiEdge newsletter and was a Machine Learning Tech Lead at Meta. He holds a PhD from The Johns Hopkins University.

Below, he shares how to build a smart chatbot in 10 minutes with LangChain.

Subscribe to Damien's The AiEdge newsletter for more. You can also follow him on LinkedIn and Twitter.


LangChain is an incredible tool for interacting with Large Language Models (LLMs). In this deep dive, I’ll show you how to use databases, tools, and memory to build a smart chatbot. At the end, I show how to ask ChatGPT for investment advice. This article covers:

  • What is LangChain?

  • Indexing and searching new data

    • Let’s get some data

    • Pinecone: A vector database

    • Storing the data

    • Retrieving data with ChatGPT

  • Giving ChatGPT access to tools

  • Providing a conversation memory

  • Putting everything together

    • Giving access to Google Search

    • Utilizing the database as a tool

    • Solving a difficult problem: Should I invest in Google today?


What is LangChain?

LangChain is a package to build applications using LLMs. It is composed of 6 modules:

  • Prompts: This module allows you to build dynamic prompts using templates. It can adapt to different LLM types depending on the context window size and input variables used as context, such as conversation history, search results, previous answers, and more.

  • Models: This module provides an abstraction layer to connect to most available third-party LLM APIs. It has API connections to ~40 public LLM, chat, and embedding models.

  • Memory: This gives the LLMs access to the conversation history.

  • Indexes: Indexes refer to ways to structure documents so that LLMs can best interact with them. This module contains utility functions for working with documents and integration to different vector databases.

  • Agents: Some applications require not just a predetermined chain of calls to LLMs or other tools, but potentially to an unknown chain that depends on the user’s input. In these types of chains, there is an agent with access to a suite of tools. Depending on the user’s input, the agent can decide which – if any – tool to call.

  • Chains: Using an LLM in isolation is fine for some simple applications, but many more complex ones require the chaining of LLMs, either with each other, or other experts. LangChain provides a standard interface for Chains, as well as some common implementations of chains for ease of use.
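To get a feel for how these modules fit together, here is a minimal sketch (not from the original post) that combines a prompt template, a chat model, and a chain. It assumes an OPENAI_API_KEY is set in the environment:

from langchain import PromptTemplate, LLMChain
from langchain.chat_models import ChatOpenAI

# Prompts module: a template with an input variable
prompt = PromptTemplate(
    input_variables=['topic'],
    template='Explain {topic} in one short paragraph.'
)

# Models module: a wrapper around the ChatGPT API
llm = ChatOpenAI(temperature=0)

# Chains module: glue the prompt and the model together
chain = LLMChain(llm=llm, prompt=prompt)
chain.run(topic='vector databases')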

Currently, the API is not well documented and somewhat disorganized, but if you are willing to dig into the source code, it is well worth the effort. I also recommend watching an introductory video to become more familiar with it.

I now demonstrate how to use LangChain. You can install all the necessary libraries by running the following:

pip install pinecone-client langchain openai wikipedia google-api-python-client unstructured tabulate pdf2image

Indexing and searching new data

One difficulty with LLMs is that they only know what they learned during training. So how do we get them to use private data? One way is to make new text data discoverable by the LLM. The typical way to do this is to convert all private data into embeddings stored in a vector database. The process is as follows:

  • Chunk the data into small pieces

  • Pass each chunk through an LLM; the resulting final layer of the network can be used as a semantic vector representation of that chunk

  • Store each vector in a database, keyed to the original piece of data so the text can be recovered later

A question we ask can itself be converted into an embedding; this becomes the query. We can then search for the pieces of data closest to it in the embedding space and feed the relevant documents to the LLM so it can extract an answer from them.
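As a back-of-the-envelope illustration of that idea (this snippet is not part of the original pipeline), we can embed a couple of text chunks and a question with the same OpenAI embedding model used later in this post, and compare them with cosine similarity:

import numpy as np
from langchain.embeddings.openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()  # requires OPENAI_API_KEY in the environment

chunks = [
    "Alphabet Inc. reports its quarterly earnings results.",
    "The cafeteria menu features a new soup every week.",
]
chunk_vectors = embeddings.embed_documents(chunks)
query_vector = embeddings.embed_query("What were Google's earnings this quarter?")

def cosine(a, b):
    a, b = np.array(a), np.array(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# the earnings-related chunk should end up closer to the query
print([cosine(query_vector, v) for v in chunk_vectors])

A vector database simply performs this nearest-neighbor search at scale, with a proper index instead of a brute-force loop.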

Let’s get some data

I sourced interesting data for a demonstration and selected the earnings reports of the tech giant Alphabet (Google): https://abc.xyz/investor/previous/

For simplicity, I downloaded and stored the reports on my computer’s hard drive.

We can now load those documents into memory with LangChain, using 2 lines of code:

from langchain.document_loaders import DirectoryLoader

loader = DirectoryLoader(
    './Langchain/data/', # my local directory
    glob='**/*.pdf',     # we only get pdfs
    show_progress=True
)
docs = loader.load()
docs

We split them into chunks. Each chunk corresponds to an embedding vector.

from langchain.text_splitter import CharacterTextSplitter

text_splitter = CharacterTextSplitter(
    chunk_size=1000, 
    chunk_overlap=0
)
docs_split = text_splitter.split_documents(docs)
docs_split

Next, we need to convert those chunks into embeddings and store them in a database.

Pinecone: A vector database

To store the data, I use Pinecone. You can create a free account and automatically get API keys with which to access the database.

In the “indexes” tab, click on “create index.” Give it a name and a dimension. I used “1536” for the dimension, as that is the size of the vectors produced by the chosen OpenAI embedding model, and the cosine similarity metric to search for similar documents.

This will create a vector table.
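If you prefer to create the index from code rather than the console, the pinecone-client exposes an equivalent call. This is a sketch that assumes the PINECONE_API_KEY and PINECONE_ENV variables defined in the next section:

import pinecone

pinecone.init(
    api_key=PINECONE_API_KEY,
    environment=PINECONE_ENV
)

# 1536 matches the output size of OpenAI's text-embedding-ada-002
pinecone.create_index(
    name='langchain-demo',
    dimension=1536,
    metric='cosine'
)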

Storing the data

Before continuing, make sure to get an OpenAI API key by signing up on the OpenAI platform.

Let’s first write down our API keys:

import os

PINECONE_API_KEY = ... # find at app.pinecone.io
PINECONE_ENV = ...     # next to api key in console
OPENAI_API_KEY = ...   # found at platform.openai.com/account/api-keys

os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY

We upload the data to the vector database. The default OpenAI embedding model used in LangChain is 'text-embedding-ada-002' (see OpenAI's embedding models). It is used to convert the data into embedding vectors:

import pinecone 
from langchain.vectorstores import Pinecone
from langchain.embeddings.openai import OpenAIEmbeddings

# we use the openAI embedding model
embeddings = OpenAIEmbeddings()
pinecone.init(
    api_key=PINECONE_API_KEY,
    environment=PINECONE_ENV
)

doc_db = Pinecone.from_documents(
    docs_split, 
    embeddings, 
    index_name='langchain-demo'
)

We can now search for relevant documents in that database using the cosine similarity metric:

query = "What were the most important events for Google in 2021?"
search_docs = doc_db.similarity_search(query)
search_docs
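If you also want to see how close each match is, the LangChain Pinecone wrapper exposes a variant that returns the scores as well (a small optional sketch, not in the original post):

search_docs_with_scores = doc_db.similarity_search_with_score(query, k=4)
for doc, score in search_docs_with_scores:
    print(round(score, 3), doc.metadata.get('source'))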

Retrieving data with ChatGPT

We can now use an LLM to answer questions from the data in the database. Let’s get an LLM such as GPT-3 using:

from langchain import OpenAI
llm = OpenAI()

or we could get ChatGPT using

from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI()
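Both wrappers accept the usual generation parameters. For example, to pin the model and make the answers deterministic (a sketch; the parameter names are those of the LangChain OpenAI integrations):

llm = ChatOpenAI(
    model_name='gpt-3.5-turbo',  # which chat model to use
    temperature=0                # deterministic answers for retrieval QA
)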

Let’s use the RetrievalQA module to query that data:

from langchain.chains import RetrievalQA

qa = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type='stuff',
    retriever=doc_db.as_retriever(),
)

query = "What were the earnings in 2022?"
result = qa.run(query)

result

> 'The total revenues for the full year 2022 were $282,836 million, with operating income and operating margin information not provided in the given context.'

RetrievalQA is actually a wrapper around a specific prompt. The “stuff” chain type stuffs all of the retrieved documents into a single prompt, assuming the whole text fits into the context window. It uses the following prompt template:

Use the following pieces of context to answer the users question. 
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
{context}

{question}

Here {context} is populated with the retrieved documents found in the database, and {question} with the user’s question. You can use other chain types, such as “map_reduce”, “refine”, and “map_rerank”, if the text is longer than the context window.
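For example, switching to “map_reduce” runs the LLM over each retrieved document separately and then combines the partial answers, which is useful when the retrieved text would not fit in a single prompt. A sketch reusing the retriever from above:

qa_map_reduce = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type='map_reduce',
    retriever=doc_db.as_retriever(),
)
qa_map_reduce.run("What were the earnings in 2022?")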

Giving ChatGPT access to tools
