RAG: Routing

Posted on 2025-07-05 in RAG

Overview

The main RAG flow is fairly simple and was covered earlier, so a quick look at the diagram below is enough.

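As the diagram shows, answering a user's question in RAG involves at least two interactions: (1) with the vector database and (2) with the LLM. Now suppose both are specialized: all physics documents live in a physics vector database, and a physics LLM is trained only on physics data. If the user asks a physics question, will querying the physics vector database and the physics LLM give a better answer? The answer is yes: with no unrelated data polluting the results, both retrieval and generation should be faster and more accurate.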

Routing

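The above points to two places where routing can happen: the vector database and the LLM. There is also a third, less obvious place: the prompt template. Every LLM call takes a prompt, and prompts are built from templates. If we classify questions carefully and pick a dedicated template for each type of question, we also raise the odds of generating a better answer.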

Put plainly, routing means choosing the vector store, prompt template, and LLM that match the type of the question.
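
One way to picture this is a routing table with one entry per question category. The sketch below is only an illustration: the `Route` class, the `ROUTES` table, and every store, template, and model name in it are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    vector_store: str     # name of the vector store to query
    prompt_template: str  # template used to build the final prompt
    model: str            # model that generates the answer


# Hypothetical routing table: one entry per question category.
ROUTES = {
    "physics": Route(
        vector_store="physics_store",
        prompt_template="You are a physics professor.\n\nQuestion: {query}",
        model="physics-llm",
    ),
    "math": Route(
        vector_store="math_store",
        prompt_template="You are a mathematician.\n\nQuestion: {query}",
        model="math-llm",
    ),
}


def resolve(category: str) -> Route:
    """Return the vector store, template, and model for a category."""
    return ROUTES[category]


route = resolve("physics")
print(route.vector_store, route.model)
```

The rest of the post is about how to obtain the `category` value in the first place.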

Vector database routing

We cannot determine the type of a user's question directly, but we can have an LLM classify it and then use that classification to select the route.

graph LR
A(Start) --> B[Define the categories]
B --> C[LLM classifies the question]
C --> D[Select the vector store]
D --> E(End)

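1. Define the categories: derive a fixed set of categories from the vector databases we already have, and standardize their names.

2. LLM classification: give the LLM the user's question together with the category list, and have it decide which category the question belongs to, or which category it is most relevant to.

3. Select the vector store: use the LLM's verdict to pick the matching existing vector store.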

from typing import Literal

from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field  # langchain_core.pydantic_v1 is deprecated in recent LangChain releases
from langchain_openai import ChatOpenAI

# Data model
class RouteQuery(BaseModel):
    """Route a user query to the most relevant datasource."""

    datasource: Literal["python_docs", "js_docs", "golang_docs"] = Field(
        ...,
        description="Given a user question choose which datasource would be most relevant for answering their question",
    )

# LLM with function call 
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)
structured_llm = llm.with_structured_output(RouteQuery)

# Prompt 
system = """You are an expert at routing a user question to the appropriate data source.

Based on the programming language the question is referring to, route it to the relevant data source."""

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)

# Define router 
router = prompt | structured_llm
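
The code above stops once the router is defined. Step 3, turning the chosen label into an actual vector store lookup, could look roughly like the sketch below. The retrievers here are stubs invented for illustration so that the example runs without any API key; in a real setup each one would wrap a vector store.

```python
from typing import Callable, Dict, List

Retriever = Callable[[str], List[str]]


def make_stub_retriever(name: str) -> Retriever:
    """Build a fake retriever; a real one would query a vector store."""
    def retrieve(question: str) -> List[str]:
        return [f"[{name}] document relevant to: {question}"]
    return retrieve


# Hypothetical retrievers, one per datasource label in RouteQuery.
RETRIEVERS: Dict[str, Retriever] = {
    name: make_stub_retriever(name)
    for name in ("python_docs", "js_docs", "golang_docs")
}


def choose_retriever(datasource: str) -> Retriever:
    """Step 3: map the label chosen by the LLM to a retriever."""
    if datasource not in RETRIEVERS:
        raise ValueError(f"unknown datasource: {datasource}")
    return RETRIEVERS[datasource]


# With the real router this would be:
#   datasource = router.invoke({"question": question}).datasource
datasource = "python_docs"  # stand-in for the router's answer
docs = choose_retriever(datasource)("How do I reverse a list?")
print(docs)
```

Because `datasource` is declared as a `Literal` in `RouteQuery`, the LLM's answer should always be one of the dictionary keys; the `ValueError` is only a safety net.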

Prompt template routing

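The approach is clearly the same as for vector stores: use the LLM to get the category, then use the category to look up a preset prompt template. The code is analogous: once you have the category, you only need to find the template mapped to it.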

graph LR
A(Start) --> B[Define the categories]
B --> C[LLM classifies the question]
C --> D[Select the prompt template]
D --> E(End)
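
A minimal sketch of this flow, assuming the classifier is passed in as a plain function. The templates and `fake_classify` are placeholders invented for the example; in practice `classify` would call the LLM router.

```python
from typing import Callable

# Hypothetical preset templates, keyed by category.
PROMPT_TEMPLATES = {
    "physics": "You are a physics professor. Answer concisely.\n\nHere is a question:\n{query}",
    "math": "You are a mathematician. Break the problem into parts.\n\nHere is a question:\n{query}",
}


def route_prompt(query: str, classify: Callable[[str], str]) -> str:
    """Classify the question, then fill in the template for that category."""
    category = classify(query)
    return PROMPT_TEMPLATES[category].format(query=query)


def fake_classify(query: str) -> str:
    # Stand-in for the LLM call: treat anything containing a digit as math.
    return "math" if any(ch.isdigit() for ch in query) else "physics"


print(route_prompt("What is 2 + 2?", fake_classify))
```

Passing the classifier in as an argument keeps the template lookup testable without a model.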

LLM routing

Same approach: use the LLM's classification to select the matching LLM. The code is analogous.

graph LR
A(Start) --> B[Define the categories]
B --> C[LLM classifies the question]
C --> D[Select the LLM]
D --> E(End)
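
A rough sketch of the last step. The model names are invented, and the fallback to a general model for unrecognized categories is an extra assumption of this example, not part of the flow above.

```python
# Hypothetical model names; replace with whatever models you actually deploy.
MODELS = {
    "physics": "physics-llm",
    "math": "math-llm",
}
DEFAULT_MODEL = "general-llm"  # assumed fallback, not part of the original flow


def choose_model(category: str) -> str:
    """Return the model for a category, or the general model if none matches."""
    return MODELS.get(category, DEFAULT_MODEL)


for category in ("physics", "history"):
    print(category, "->", choose_model(category))
```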

Other routing methods

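Is there another way to classify a question besides asking an LLM? Certainly. One option is to measure relevance with vector similarity and pick the most relevant category. Use an embedding model to compute a vector for each preset template, compute a vector for the user's question, and then find the preset template most similar to the question vector. Knowing the template tells us its category, which we can then use to select the vector store and the LLM.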

from langchain_community.utils.math import cosine_similarity
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Two prompts
physics_template = """You are a very smart physics professor. \
You are great at answering questions about physics in a concise and easy to understand manner. \
When you don't know the answer to a question you admit that you don't know.

Here is a question:
{query}"""

math_template = """You are a very good mathematician. You are great at answering math questions. \
You are so good because you are able to break down hard problems into their component parts, \
answer the component parts, and then put them together to answer the broader question.

Here is a question:
{query}"""

# Embed prompts
embeddings = OpenAIEmbeddings()
prompt_templates = [physics_template, math_template]
prompt_embeddings = embeddings.embed_documents(prompt_templates)

# Route question to prompt 
def prompt_router(input):
    # Embed question
    query_embedding = embeddings.embed_query(input["query"])
    # Compute similarity
    similarity = cosine_similarity([query_embedding], prompt_embeddings)[0]
    most_similar = prompt_templates[similarity.argmax()]
    # Chosen prompt 
    print("Using MATH" if most_similar == math_template else "Using PHYSICS")
    return PromptTemplate.from_template(most_similar)


chain = (
    {"query": RunnablePassthrough()}
    | RunnableLambda(prompt_router)
    | ChatOpenAI()
    | StrOutputParser()
)

print(chain.invoke("What's a black hole"))
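
To try the similarity logic without calling an embedding API, here is a self-contained sketch. The three-dimensional vectors are made up for illustration; real embeddings would come from a model such as `OpenAIEmbeddings` and have far more dimensions.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy "embeddings" of the preset templates, keyed by category (invented values).
CATEGORY_VECTORS: Dict[str, List[float]] = {
    "physics": [0.9, 0.1, 0.0],
    "math": [0.1, 0.9, 0.0],
}


def route_by_similarity(query_vector: List[float]) -> str:
    """Return the category whose template vector is closest to the query."""
    return max(
        CATEGORY_VECTORS,
        key=lambda category: cosine(query_vector, CATEGORY_VECTORS[category]),
    )


print(route_by_similarity([0.8, 0.2, 0.1]))  # prints "physics"
```

The returned category can then be used to look up the vector store and the LLM, as described above.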