Build a Custom AI Chatbot with LangChain, Faiss, and Next.js

In this post, I talk about how I built an open-source custom content AI chatbot with Upstash, Next.js, LangChain, and Fly.io. Upstash helped me schedule model training, offered a number of rate-limiting methods, and cached the OpenAI API responses.

What we'll be using

Next.js (front end and back end)
LangChain (framework for developing applications powered by language models)
Upstash (scheduling model training via QStash, rate limiting, and caching OpenAI responses)
Tailwind CSS (styling)
Fly.io (deployment)

What you'll need

Node.js 18
An Upstash account
An OpenAI account (for the OpenAI API key)

Setting up Upstash Redis

Once you have created an Upstash account and logged in, go to the Redis tab and create a database.

After you have created the database, go to the Details tab. Scroll down until you find the Connect your database section. Copy the content and save it somewhere safe.

Scroll down further until you find the REST API section and select the .env button. Copy the content and save it somewhere safe.

Setting up Upstash QStash

While logged in, go to the QStash tab and obtain the QSTASH_URL, QSTASH_TOKEN, QSTASH_CURRENT_SIGNING_KEY, and QSTASH_NEXT_SIGNING_KEY. Copy the content and save it somewhere safe.

 
 
Project setup

To set up, just clone the app repo and follow this tutorial to learn about everything that's in it. To fork the project, run:
git clone https://github.com/rishi-raj-jain/custom-content-ai-chatbot
cd custom-content-ai-chatbot
npm install
 
Once you have cloned the repo, create a .env file. This is where you add the values saved in the sections above.

It should look something like this:
 # .env
 
# Obtained from the steps as above
 
# Upstash Redis Secrets
UPSTASH_REDIS_REST_URL="https://....upstash.io"
UPSTASH_REDIS_REST_TOKEN="..."
 
# Upstash QStash Secrets
QSTASH_URL="https://qstash.upstash.io/v1/publish/"
QSTASH_TOKEN="..."
QSTASH_CURRENT_SIGNING_KEY="sig_..."
QSTASH_NEXT_SIGNING_KEY="sig_..."
 
# OpenAI Key
OPENAI_API_KEY="sk-..."
 
# Admin Access Key
# Used to verify that a training request comes from an admin
ADMIN_KEY="..."
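
Before starting the dev server, it can help to confirm that none of these values were missed. The following is a minimal sketch, assuming the variable names from the .env file above; the missingEnvVars helper is hypothetical and not part of the repo:

```javascript
// Names taken from the .env file above
const REQUIRED_ENV_VARS = [
  'UPSTASH_REDIS_REST_URL',
  'UPSTASH_REDIS_REST_TOKEN',
  'QSTASH_URL',
  'QSTASH_TOKEN',
  'QSTASH_CURRENT_SIGNING_KEY',
  'QSTASH_NEXT_SIGNING_KEY',
  'OPENAI_API_KEY',
  'ADMIN_KEY',
]

// Hypothetical helper: returns the names that are unset or blank in `env`
function missingEnvVars(env = process.env) {
  return REQUIRED_ENV_VARS.filter((name) => !env[name] || env[name].trim() === '')
}

// Warn early instead of failing later inside an API route
const missingNames = missingEnvVars()
if (missingNames.length > 0) {
  console.warn(`Missing environment variables: ${missingNames.join(', ')}`)
}
```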
 
After these steps, you should be able to start the local environment using the following command:
npm run dev

Repository structure

This is the main folder structure for the project. I have marked in red the files that are discussed further in this post. They deal with managing the vector store, creating the API routes for chatting with an AI trained on your custom content (with response caching), and scheduling the model training process.
 
 
High-level data flow and operations

This is a high-level diagram of the data flow and the operations that take place 👇🏻


When a user asks a question via the chatbot, the user's IP is checked against the rate limit. The response, if not already cached in Upstash Redis, is fetched from the OpenAI API (and then cached) and streamed to the user.
When an admin requests training the existing model on a given set of URLs, Upstash's QStash makes a POST request serverlessly after a set delay to fetch the content at the given URLs and update the model (in the background).
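
The chat flow described above can be outlined as a plain function. This is an illustrative sketch only: rateLimit, cache, and askModel are hypothetical stand-ins supplied by the caller, not the repo's actual Upstash or OpenAI clients, which are covered in the sections that follow.

```javascript
// Illustrative outline of the chat flow: rate limit, then cache, then model.
// All three dependencies are hypothetical stand-ins supplied by the caller.
async function answerQuestion({ ip, question }, { rateLimit, cache, askModel }) {
  // 1. Reject the request if this IP has used up its quota
  const allowed = await rateLimit(ip)
  if (!allowed) return { status: 'rate_limited' }

  // 2. Serve from the cache when this question has been answered before
  const cached = await cache.get(question)
  if (cached !== undefined) return { status: 'ok', answer: cached, fromCache: true }

  // 3. Otherwise ask the model and remember the answer for next time
  const answer = await askModel(question)
  await cache.set(question, answer)
  return { status: 'ok', answer, fromCache: false }
}
```

Passing the dependencies in as arguments keeps the outline independent of any particular client library, so it can be exercised with simple in-memory fakes.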
 
Setting up the chat and train API routes in Next.js

In this section, we talk about how we set up two routes: /api/chat, to enable cross-origin requests, rate limit the chat API calls, cache and stream responses to the user, and expose a way to schedule training on the content at specific URLs; and /api/train, to perform the training on the given URLs, but in the background.

1. Enabling CORS

Using the cors package, we enabled CORS in the application so that the chatbot can be used in multiple places, for example as a bot on your website. As soon as the API route is invoked, we run the cors setup as shown below 👇🏻
// File: pages/api/chat.js

// Reference Function to cors
import { runMiddleware } from '@/lib/cors'

export default async function (req, res) {
  try {
    // Run the middleware
    await runMiddleware(req, res)
    // ...
  } catch (e) {
    console.log(e.message || e.toString())
  }
  return res.end()
}

// Cors Function
// File: lib/cors.js
import Cors from 'cors'

// Initializing the cors middleware
// You can read more about the available options here: https://github.com/expressjs/cors#configuration-options
const cors = Cors({
  methods: ['POST', 'OPTIONS', 'HEAD'],
})

// Helper method to wait for a middleware to execute before continuing
// And to throw an error when an error happens in a middleware
export function runMiddleware(req, res, fn = cors) {
  return new Promise((resolve, reject) => {
    fn(req, res, (result) => {
      if (result instanceof Error) return reject(result)
      return resolve(result)
    })
  })
}
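
To see how runMiddleware behaves, here is a self-contained sketch that repeats the helper and drives it with two hypothetical stand-in middlewares instead of the real cors package: one that calls next() normally and one that passes an error to it.

```javascript
// Same helper as in lib/cors.js, without the default `cors` argument
function runMiddleware(req, res, fn) {
  return new Promise((resolve, reject) => {
    fn(req, res, (result) => {
      if (result instanceof Error) return reject(result)
      return resolve(result)
    })
  })
}

// Hypothetical middlewares, used only for illustration
const allowAll = (req, res, next) => next()
const alwaysFails = (req, res, next) => next(new Error('Origin not allowed'))

async function demo() {
  // Resolves, so a real handler would simply continue
  await runMiddleware({}, {}, allowAll)
  try {
    await runMiddleware({}, {}, alwaysFails)
  } catch (e) {
    // The middleware error surfaces in the caller's try/catch
    return e.message
  }
}
```

Wrapping the callback-style middleware in a Promise is what lets the API route use a single await and a single try/catch for both outcomes.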

2. Scheduling training requests on the content at given URLs

With Upstash QStash, we can create APIs that are fire-and-forget. You don't need to wait for the main function to finish in order to get a response; the work is done in the background (optionally, after some delay). It's like a cron job, except that it runs once per request rather than repeatedly at a fixed interval.

In the same chat API route, we accept requests that carry an admin-key header and, if it matches the server-side secret (ADMIN_KEY), we schedule training on the set of URLs passed in the request body after a delay (here, 10 seconds). Once that delay has passed, QStash sends the training request to the specified endpoint (here, /api/train).
// File: pages/api/chat.js

// Assumed setup: the original snippet does not show how qstashClient is created
import { Client } from '@upstash/qstash'
const qstashClient = new Client({ token: process.env.QSTASH_TOKEN })

// ...

// If the headers contain an `admin-key` header
if (req.headers['admin-key'] === process.env.ADMIN_KEY) {
  // If `urls` is not in body, return with `Bad Request`
  if (!req.body.urls) return res.status(400).send('No urls to train on.')
  // Hit QStash API to train on this set of URLs 10 seconds from now
  await qstashClient.publishJSON({
    delay: 10,
    body: { urls: req.body.urls },
    url: 'https://custom-content-ai-chatbot.fly.dev/api/train',
  })
  return res.status(200).end()
}
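
From the admin's side, triggering a training run is then a single POST request to the chat route. The following is a minimal sketch, assuming the request body is JSON with a urls array and that you supply the base URL where the app is served; buildTrainRequest is a hypothetical helper, not part of the repo:

```javascript
// Hypothetical helper: builds the arguments for fetch() to request training
function buildTrainRequest(baseUrl, adminKey, urls) {
  if (!Array.isArray(urls) || urls.length === 0) {
    throw new Error('Provide at least one URL to train on.')
  }
  return {
    url: new URL('/api/chat', baseUrl).toString(),
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        // Must equal ADMIN_KEY on the server for training to be scheduled
        'admin-key': adminKey,
      },
      body: JSON.stringify({ urls }),
    },
  }
}

// Usage (Node.js 18 ships a global fetch):
// const { url, options } = buildTrainRequest('http://localhost:3000', process.env.ADMIN_KEY, ['https://example.com/post'])
// const response = await fetch(url, options)
```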
 
Now, let's dig into what the train API route (/api/train) contains 👇🏻
// File: pages/api/train.js

import train from '@/lib/train'
import * as dotenv from 'dotenv'
import { redis } from '@/lib/redis'
import { runMiddleware } from '@/lib/cors'
import { verifySignature } from '@upstash/qstash/nextjs'

dotenv.config()

// Disabling converting request body to JSON directly
// More on https://nextjs.org/docs/pages/building-your-application/routing/api-routes#custom-config
export const config = {
  api: {
    bodyParser: false,
  },
}

async function handler(req, res) {
  try {
    // Run the middleware
    await runMiddleware(req, res)
    // If method is not POST, return with `Forbidden Access`
    if (req.method !== 'POST') return res.status(403).send('No other methods allowed.')
    // If `urls` is not in body, return with `Bad Request`
    if (!req.body.urls) return res.status(400).send('No urls to train on.')
    // Train on the particular URLs
    await train(req.body.urls)
    // Once saved, clear all the cached responses in Upstash
    const allKeys = await redis.keys('*')
    // Filter out the rate limiter keys so that quotas are preserved
    const cachedKeys = allKeys.filter((i) => !i.includes('@upstash/ratelimit:'))
    // Skip the pipeline when there is nothing to delete
    if (cachedKeys.length > 0) {
      const p = redis.pipeline()
      // Queue a DEL command for every cached response
      cachedKeys.forEach((i) => p.del(i))
      // Send all the queued commands to Redis in one request
      await p.exec()
      console.log('Cleaned cached responses in Upstash.')
    }
    return res.status(200).end()
  } catch (e) {
    console.log(e.message || e.toString())
  }
  return res.end()
}

// Verify the incoming request to be a valid
// QStash Scheduled POST request with Upstash-Signature
export default verifySignature(handler)
 
In the code above, we perform three important operations:
Verify the incoming request using QStash's verifySignature method. Under the hood, it looks for the Upstash-Signature header and validates it against the raw body received.
Call the train function, which fetches the content of the URLs and adds it to the existing vector store (and saves it).
Clear the cached responses in Upstash Redis via a Redis pipeline, after filtering out the keys that belong to the rate limiter.
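
The key filtering in the third step can be checked in isolation. Here is a small sketch with made-up key names; the only detail taken from the handler above is the @upstash/ratelimit: marker.

```javascript
// Keep only the keys that hold cached responses, i.e. drop the rate
// limiter's bookkeeping keys so that user quotas survive retraining
function cachedResponseKeys(allKeys) {
  return allKeys.filter((key) => !key.includes('@upstash/ratelimit:'))
}

// Made-up key names, for illustration only
const exampleKeys = [
  '@upstash/ratelimit:203.0.113.7:19000',
  'cached-response-a',
  'cached-response-b',
]

console.log(cachedResponseKeys(exampleKeys))
// prints [ 'cached-response-a', 'cached-response-b' ]
```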
 
3. Rate limiting

To implement rate limiting, we use the Upstash Redis database client and a rate limiter library called @upstash/ratelimit.
// File: lib/redis.js
// Reference Function to ratelimiting

import * as dotenv from 'dotenv'
import { Redis } from '@upstash/redis'
import { Ratelimit } from '@upstash/ratelimit'

// Load environment variables
dotenv.config()

// Initialize Upstash Redis
export const redis = new Redis({
  url: process.env.UPSTASH_REDIS_REST_URL,
  token: process.env.UPSTASH_REDIS_REST_TOKEN,
})

// Initialize Upstash Rate Limiter
export const ratelimit = {
  chat: new Ratelimit({
    redis,
    // Limit requests to 30 questions per day per IP Address
    limiter: Ratelimit.slidingWindow(30, '86400s'),
  }),
}
 
Rate limiting is what let me offer the service completely free and open to the public! It allowed me to showcase what the system can do, namely the chat responses. Literally anyone can ask 30 questions per day via the website. We enforce this limit of 30 questions per day by using the client's IP address as the key.
 // File: pages/api/chat.js
 
import requestIp from 'request-ip'
import { ratelimit } from '@/lib/redis'
 
// ...
 
// Get the client IP
const detectedIp = requestIp.getClientIp(req)
 
// If no IP detected, return with a `Bad Request`
if (!detectedIp) return res.status(400).send('Bad request.')
 
// Check the Rate Limit
const result = await ratelimit.chat.limit(detectedIp)
 
// If rate limited, return with the same
if (!result.success) return res.status(400).send('Rate limit exceeded.')
 
// Continue with serving the chat responses
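
To make the idea of a sliding window concrete, here is a small in-memory limiter. It is an illustration only: it keeps a log of timestamps per key, which is not how @upstash/ratelimit stores its state in Redis, and the createSlidingWindowLimiter name is hypothetical.

```javascript
// In-memory sliding-window limiter (illustration, not the Upstash algorithm).
// Allows at most `limit` accepted calls per key within any `windowMs` milliseconds.
function createSlidingWindowLimiter(limit, windowMs) {
  const hits = new Map() // key -> timestamps of accepted requests

  return function check(key, now = Date.now()) {
    // Drop timestamps that have slid out of the window
    const recent = (hits.get(key) || []).filter((t) => now - t < windowMs)
    const success = recent.length < limit
    if (success) recent.push(now)
    hits.set(key, recent)
    return { success, remaining: limit - recent.length }
  }
}

// 30 questions per day per IP, mirroring the configuration above
const checkChatLimit = createSlidingWindowLimiter(30, 86400 * 1000)
```

Unlike a fixed daily counter that resets at midnight, the window here moves with each request, so a burst just before and just after a reset boundary still counts against the same quota.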

4. Loading the indexed vector store and requesting an answer from OpenAI

With all the checks done, we now head to the main task: calling the OpenAI API with our custom content and sending the response to the user. To keep things simple, we'll break this into further parts:

4.1: Fetching the saved vector store
 
// File: pages/api/chat.js

// Reference Function to loadVectorStore
import { loadVectorStore } from '@/lib/vectorStore'

// Load the trained model
const vectorStore = await loadVectorStore()

// ...

// Vector Store Function
// File: lib/vectorStore.js

import { join } from 'path'
import { existsSync } from 'fs'
import { Document } from 'langchain/document'
import { FaissStore } from 'langchain/vectorstores/faiss'
import { OpenAIEmbeddings } from 'langchain/embeddings/openai'

export async function loadVectorStore() {
  const directory = join(process.cwd(), 'loadedVectorStore')
  const docStoreJSON = join(process.cwd(), 'loadedVectorStore', 'docstore.json')
  if (existsSync(docStoreJSON)) {
    // If the saved docstore is found, load the vector store saved by Faiss integration
    return await FaissStore.load(directory, new OpenAIEmbeddings())
  } else {
    // If no content is there, load the vector store with just `Hey` for starters
    return await FaissStore.fromDocuments([new Document({ pageContent: 'Hey' })], new OpenAIEmbeddings())
  }
}
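
Conceptually, a retriever built from this store turns the question into an embedding and returns the stored documents whose embeddings are closest to it. The sketch below shows that idea with tiny hand-written vectors and a brute-force cosine-similarity search. It is not Faiss and does not call OpenAI; the vectors, document names, and the topK helper are invented for illustration.

```javascript
// Cosine similarity between two equal-length numeric vectors
function cosineSimilarity(a, b) {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Brute-force search: score every document, keep the k most similar
function topK(queryVector, documents, k) {
  return documents
    .map((doc) => ({ ...doc, score: cosineSimilarity(queryVector, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
}

// Invented three-dimensional "embeddings", for illustration only
const documents = [
  { pageContent: 'Pricing page', vector: [1, 0, 0] },
  { pageContent: 'Deployment guide', vector: [0, 1, 0] },
  { pageContent: 'Deployment FAQ', vector: [0, 0.9, 0.1] },
]

const results = topK([0, 1, 0], documents, 2)
console.log(results.map((doc) => doc.pageContent))
// prints [ 'Deployment guide', 'Deployment FAQ' ]
```

A library such as Faiss exists because scoring every document like this becomes too slow as the collection grows; the underlying notion of "closest embedding" is the same.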

4.2: Adding prompt guidelines to the user's question


Using PromptTemplate by LangChain with the user's query, we pass along instructions on how, and in what manner, the AI should answer the question:
// File: pages/api/chat.js

import { z } from 'zod'
import { PromptTemplate } from 'langchain/prompts'
import { RetrievalQAChain } from 'langchain/chains'
import { OutputFixingParser, StructuredOutputParser } from 'langchain/output_parsers'

// Load the trained model
// ...

// Create a prompt specifying for OpenAI what to write
const outputParser = StructuredOutputParser.fromZodSchema(
  z.object({
    answer: z.string().describe('answer to question in HTML friendly format, use all of the tags wherever possible and including reference links'),
  }),
)

// ...

// Create an instance of output parser class to help refine the response of OpenAI
const outputFixingParser = OutputFixingParser.fromLLM(model, outputParser)

// Create a prompt specifying for OpenAI how to process on the input
const prompt = new PromptTemplate({
  template: `Answer the user's question as best and be as detailed as possible:\n{format_instructions}\n{query}`,
  inputVariables: ['query'],
  partialVariables: {
    format_instructions: outputFixingParser.getFormatInstructions(),
  },
})

// Pass the prompt to the query with the model to OpenAI API
const chain = RetrievalQAChain.fromLLM(model, vectorStore.asRetriever(), prompt)
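
It may help to see roughly what the template turns into once its placeholders are filled. The sketch below imitates the substitution step with a hypothetical fillTemplate helper. It is not LangChain's PromptTemplate implementation, and the format-instructions string is a placeholder rather than the text the output parser actually generates.

```javascript
// Hypothetical stand-in for template substitution:
// replaces every {name} in the template with values[name]
function fillTemplate(template, values) {
  return template.replace(/\{(\w+)\}/g, (match, name) => {
    if (!(name in values)) throw new Error(`Missing value for "${name}"`)
    return String(values[name])
  })
}

// Same template text as in the PromptTemplate above
const template = "Answer the user's question as best and be as detailed as possible:\n{format_instructions}\n{query}"

const filled = fillTemplate(template, {
  // Placeholder, not the real parser output
  format_instructions: '<format instructions from the output parser>',
  query: 'How do I deploy the app?',
})

console.log(filled)
```

The split between a value fixed up front (the format instructions) and a value supplied per request (the query) mirrors the partialVariables and inputVariables fields in the snippet above.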

4.3: Streaming and caching responses


To cache the responses with Upstash Redis, we use the langchain/cache/upstash_redis caching library by LangChain. We pass the existing Redis instance as the client, and pass the cache handler to the ChatOpenAI wrapper so that it is used for caching whenever a response is dispatched:
// File: pages/api/chat.js

import { redis } from '@/lib/redis'
import { ChatOpenAI } from 'langchain/chat_models/openai'
import { UpstashRedisCache } from 'langchain/cache/upstash_redis'

// Load the trained model
// ...

// Create Upstash caching
const upstashRedisCache = new UpstashRedisCache({ client: redis })

// A flag to detect if response was not cached
let doesToken = false

const model = new ChatOpenAI({
  // Enable streaming to return responses to the user as quickly as possible
  streaming: true,
  // Cache responses using Upstash Redis cache client
  cache: upstashRedisCache,
  callbacks: [
    {
      handleLLMNewToken(token) {
        // Set the flag to true if we receive stream from OpenAI
        doesToken = true
        // Stream the token to the user
        res.write(token)
      },
    },
  ],
})

// Create a LLM QA Chain
// ...

// Store the output to refer to in case cached
const chainOutput = await chain.call({ query: req.body.input })

// If no tokens received implies that the content is cached
// Return the cached response as is
if (!doesToken) return res.status(200).send(chainOutput.text)
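
As the comments above indicate, the doesToken flag exists because a cached answer arrives without any tokens passing through the callback, so nothing has been written to the response yet. The sketch below reproduces that branch with hypothetical stand-ins: createFakeRes records what would be sent, and the two chain functions play the role of the real chain on a cache miss and on a cache hit.

```javascript
// Sketch of the stream-or-send-cached decision, using stand-in objects
async function respond(res, runChain) {
  let receivedToken = false
  const output = await runChain((token) => {
    receivedToken = true
    // Stream each token as it arrives
    res.write(token)
  })
  // No tokens were streamed, so the answer came from the cache: send it whole
  if (!receivedToken) res.write(output.text)
  res.end()
}

// Hypothetical response object that records what was written
function createFakeRes() {
  const chunks = []
  return { chunks, write: (chunk) => chunks.push(chunk), end: () => {} }
}

// Cache miss: tokens stream through the callback
const streamingChain = async (onToken) => {
  for (const token of ['Hel', 'lo']) onToken(token)
  return { text: 'Hello' }
}

// Cache hit: the callback is never invoked
const cachedChain = async () => ({ text: 'Hello' })
```

Either way the user ends up with the same text; the only difference is whether it arrives in pieces or in one write.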
 
That was a lot to take in! You're all done now.

Deploying to Fly.io

The repository comes with a baked-in setup for Fly.io, specifically:
Dockerfile
fly.toml
.dockerignore


Deploying requires an account on Fly.io. Once you have an account, you can create an app in Fly.io by running the following command in the root folder of your project:
# Create an app based on the baked-in configuration in your account
# This will result only in the change of app name in existing fly.toml
fly launch

And deploy via 👇🏻
# Deploy the app based on the configuration created above
fly deploy

We're done with the deployment! Yes, that's all.

Conclusion

In conclusion, this project gave me valuable experience in implementing OpenAI response caching, rate limiting, and scheduled API requests to train a model, all while using services that scale with your needs, such as Upstash.
 