LangChain for LLM Application Development

1. Models, Prompts and Parsers

Model: the underlying language model that powers the application
Prompts: a way of constructing the input that is passed to the model
Parsers: parse the model's output into a structured form that is convenient for downstream processing
See the LangChain website for instructions on installing LangChain with conda.


2. Using Prompt Templates in LangChain

Templates make it easy to reuse good prompts: only the core pieces need to change. They work like a function interface, where the implementation is abstracted away and you just supply the arguments.
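The analogy can be sketched in plain Python: a template is a parametrized string, and filling it in is like calling a function with arguments (a toy illustration with simplified delimiters, not LangChain code):

```python
# A prompt template behaves like a function: fixed structure, variable slots.
template = (
    "Translate the text that is delimited by angle brackets "
    "into a style that is {style}. text: <{text}>"
)

def format_prompt(style: str, text: str) -> str:
    """Fill in the template's slots, like passing arguments to a function."""
    return template.format(style=style, text=text)

prompt = format_prompt("calm American English", "Arrr, I be fuming!")
print(prompt)
```

LangChain's ChatPromptTemplate adds input-variable checking and message formatting on top of this basic idea.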

import os
from langchain.chat_models import ChatOpenAI
# initialize a chat model, passing in the API key
# note: you may need to pass openai_api_key explicitly by keyword
chat = ChatOpenAI(openai_api_key = os.getenv("CHATGPT_API_KEY"), temperature = 0)
# the prompt template used for the demo
template_string = """Translate the text \
that is delimited by triple backticks \
into a style that is {style}. \
text: ```{text}```
"""
# import the corresponding class
from langchain.prompts import ChatPromptTemplate
# build a prompt template from the string, similar to defining and initializing a function
prompt_template = ChatPromptTemplate.from_template(template_string)
# inspect the generated prompt template
prompt_template.messages[0].prompt
# inspect the template's required input variables
prompt_template.messages[0].prompt.input_variables

# the template's two input values
customer_style = """American English \
in a calm and respectful tone
"""
customer_email = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse, \
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""
# pass the two values into the template, like calling a function
customer_messages = prompt_template.format_messages(
                    style=customer_style,
                    text=customer_email)
# check that the values were slotted into customer_messages as expected
print(customer_messages[0])
# send the messages to the LLM via the chat model created earlier; customer_response holds the reply
customer_response = chat(customer_messages)
# print the model's response
print(customer_response.content)

3. Using a Parser to format the output

First import the relevant LangChain classes:

from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser

Next, define the schemas for the desired output format:

gift_schema = ResponseSchema(name="gift",
                             description="Was the item purchased\
                             as a gift for someone else? \
                             Answer True if yes,\
                             False if not or unknown.")
delivery_days_schema = ResponseSchema(name="delivery_days",
                                      description="How many days\
                                      did it take for the product\
                                      to arrive? If this \
                                      information is not found,\
                                      output -1.")
price_value_schema = ResponseSchema(name="price_value",
                                    description="Extract any\
                                    sentences about the value or \
                                    price, and output them as a \
                                    comma separated Python list.")

# collect the response schemas into a list
response_schemas = [gift_schema, 
                    delivery_days_schema,
                    price_value_schema]

Build an output parser from the schemas:

output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

Then generate format instructions from the parser; these will go into the message sent to the LLM:

format_instructions = output_parser.get_format_instructions()

Insert the generated format instructions into the prompt:

review_template_2 = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product\
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price,\
and output them as a comma separated Python list.

text: {text}

{format_instructions}
"""

Build a prompt template from review_template_2 and pass in the required variables (customer_review here is the review text being analyzed):

prompt = ChatPromptTemplate.from_template(review_template_2)

messages = prompt.format_messages(text=customer_review, 
                                format_instructions=format_instructions)

Send the messages to the LLM for processing:

response = chat(messages)

Inspect the returned content:

print(response.content)

Use the parser to convert the response into structured output stored in output_dict, a Python dictionary:

output_dict = output_parser.parse(response.content)
print(type(output_dict))

4. Memory: making the model remember context, i.e. the earlier conversation

4.1 ConversationBufferMemory

First import the three LangChain classes used here:

from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

Create and initialize the language model and the memory, then pass both in to construct the ConversationChain.
The verbose parameter controls whether the chain prints its internal details, which can be seen in the output later.

llm = ChatOpenAI(openai_api_key = openai.api_key,temperature = 0)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm = llm,
    memory = memory,
    verbose = True)

Running the following shows the chain's internal details.
It also shows that the chain automatically stitches the conversation history together, which is what lets the LLM remember context.

conversation.predict(input = "Hi, I'm Marcus")
conversation.predict(input = "Can you tell me where the capital of China is?")
conversation.predict(input = "Do you still remember my name?")

The following two lines show the conversation currently stored in memory:

print(memory.buffer)
# the empty dict here is a required argument
# this method has other capabilities that we won't cover here
memory.load_memory_variables({})

The save_context method appends new content to the memory:

memory.save_context({"input":"Hi"},{"output":"What's up"})

Each LLM API call is independent and stateless; the model only appears to have "memory" because that capability is implemented in code around it:
on every API call, the full recorded conversation so far is sent along.
As the conversation grows, each call carries more data and more tokens, so to reduce cost LangChain offers other, more economical memory types.
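The replay can be sketched in plain Python (a toy stand-in for ConversationBufferMemory, not the real implementation): the full transcript is rebuilt and sent along on every call, so the prompt grows with each turn:

```python
class ToyBufferMemory:
    """Minimal stand-in for ConversationBufferMemory: stores every exchange."""
    def __init__(self):
        self.turns = []  # list of (human, ai) pairs

    def save_context(self, inputs, outputs):
        self.turns.append((inputs["input"], outputs["output"]))

    def buffer(self):
        # The full transcript that would be prepended to the next API call.
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

memory = ToyBufferMemory()
memory.save_context({"input": "Hi, I'm Marcus"}, {"output": "Hello Marcus!"})
memory.save_context({"input": "What's my name?"}, {"output": "Marcus."})
print(memory.buffer())  # both turns are replayed on every call
```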

4.2 ConversationBufferWindowMemory

Limits the memory by number of conversational exchanges.
The parameter k caps how much is kept; with k = 1, only the most recent exchange between the two parties is stored:

from langchain.memory import ConversationBufferWindowMemory
# create the memory
memory = ConversationBufferWindowMemory(k = 1)
# create the model
llm = ChatOpenAI(openai_api_key = openai.api_key,temperature = 0)
# create the conversation chain
conversation = ConversationChain(
    llm = llm,
    memory = memory,
    verbose = True)
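The effect of k can be sketched with a deque (a toy model, not the real implementation): with k = 1 only the most recent exchange survives:

```python
from collections import deque

class ToyWindowMemory:
    """Keeps only the last k exchanges, like ConversationBufferWindowMemory."""
    def __init__(self, k: int):
        self.turns = deque(maxlen=k)  # older exchanges fall off automatically

    def save_context(self, inputs, outputs):
        self.turns.append((inputs["input"], outputs["output"]))

memory = ToyWindowMemory(k=1)
memory.save_context({"input": "Hi, I'm Marcus"}, {"output": "Hello!"})
memory.save_context({"input": "Where is the capital of China?"},
                    {"output": "Beijing."})
print(list(memory.turns))  # only the second exchange remains
```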

4.3 ConversationTokenBufferMemory

Limits the memory by token count. The code is below.
ConversationTokenBufferMemory takes two arguments:

  • the LLM, because different LLMs count tokens differently
  • the maximum number of tokens allowed
from langchain.memory import ConversationTokenBufferMemory
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=30)
memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, 
                    {"output": "Charming!"})

By varying the token limit, you can inspect what remains in memory with:

memory.load_memory_variables({})
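Token-based pruning can be sketched by counting words as a stand-in for tokens (a toy model; real token counts come from the model's tokenizer):

```python
class ToyTokenMemory:
    """Drops the oldest exchanges until the transcript fits the 'token' budget."""
    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.turns = []

    def save_context(self, inputs, outputs):
        self.turns.append((inputs["input"], outputs["output"]))
        # Prune from the front while over budget (word count ~ token count).
        while self._count() > self.max_tokens and self.turns:
            self.turns.pop(0)

    def _count(self):
        return sum(len(h.split()) + len(a.split()) for h, a in self.turns)

memory = ToyTokenMemory(max_tokens=6)
memory.save_context({"input": "AI is what?!"}, {"output": "Amazing!"})
memory.save_context({"input": "Chatbots are what?"}, {"output": "Charming!"})
print(memory.turns)  # only the exchanges that fit the budget remain
```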

4.4 ConversationSummaryBufferMemory

Uses an LLM to summarize past conversation, then loads the summary into memory to save space.
Messages within the token limit are stored verbatim; anything beyond it is summarized instead.
Like the token buffer memory, ConversationSummaryBufferMemory takes an llm and a max_token_limit when created.

The code:

from langchain.memory import ConversationSummaryBufferMemory
# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because Langchain is such a powerful tool. \
At noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."
# create the summary buffer memory
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"}, 
                    {"output": f"{schedule}"})

To verify and understand its behavior, test it as follows:

memory.load_memory_variables({})
conversation = ConversationChain(
    llm = llm,
    memory = memory,
    verbose = True)
conversation.predict(input = "What is a good demo?")
memory.load_memory_variables({})

5. Chains

A chain links an LLM with a prompt; composing several of these building blocks enables a whole sequence of operations on data.

5.1 LLMChain

First, import the required classes:

from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain

Initialize the model:

llm = ChatOpenAI(temperature=0.9)

Create the prompt template:

prompt = ChatPromptTemplate.from_template(
    "What is the best name to describe \
    a company that makes {product}?"
)

Link the model and the prompt template with an LLMChain:

chain = LLMChain(llm=llm, prompt=prompt)

Provide the template's input value and run the chain:

product = "Queen Size Sheet Set"
chain.run(product)

5.2 SimpleSequentialChain

Runs a series of chains one after another; suitable when each sub-chain has exactly one input and one output.
First import the class:

from langchain.chains import SimpleSequentialChain

Initialize the LLM and a prompt template, and link them.
This prompt produces a company name from a product name:

llm = ChatOpenAI(temperature=0.9)

# prompt template 1
first_prompt = ChatPromptTemplate.from_template(
    "What is the best name to describe \
    a company that makes {product}?"
)

# Chain 1
chain_one = LLMChain(llm=llm, prompt=first_prompt)

Create a second prompt template and link it with the LLM.
This prompt produces a company description from a company name:


# prompt template 2
second_prompt = ChatPromptTemplate.from_template(
    "Write a 20 words description for the following \
    company:{company_name}"
)
# chain 2
chain_two = LLMChain(llm=llm, prompt=second_prompt)

Put the two LLMChains into a SimpleSequentialChain and run them in order:

overall_simple_chain = SimpleSequentialChain(chains=[chain_one, chain_two],
                                             verbose=True
                                            )
overall_simple_chain.run(product)

5.3 SequentialChain

Suitable when sub-chains have multiple inputs or outputs.
First import the class:

from langchain.chains import SequentialChain

Create the first LLMChain and specify its output_key:

llm = ChatOpenAI(temperature=0.9)

# prompt template 1: translate to english
first_prompt = ChatPromptTemplate.from_template(
    "Translate the following review to english:"
    "\n\n{Review}"
)
# chain 1: input= Review and output= English_Review
chain_one = LLMChain(llm=llm, prompt=first_prompt, 
                     output_key="English_Review"
                    )

Create the second LLMChain and specify its output_key:

second_prompt = ChatPromptTemplate.from_template(
    "Can you summarize the following review in 1 sentence:"
    "\n\n{English_Review}"
)
# chain 2: input= English_Review and output= summary
chain_two = LLMChain(llm=llm, prompt=second_prompt, 
                     output_key="summary"
                    )

Create the third LLMChain and specify its output_key:

# prompt template 3: detect the review's language
third_prompt = ChatPromptTemplate.from_template(
    "What language is the following review:\n\n{Review}"
)
# chain 3: input= Review and output= language
chain_three = LLMChain(llm=llm, prompt=third_prompt,
                       output_key="language"
                      )

Create the fourth LLMChain and specify its output_key:

# prompt template 4: follow up message
fourth_prompt = ChatPromptTemplate.from_template(
    "Write a follow up response to the following "
    "summary in the specified language:"
    "\n\nSummary: {summary}\n\nLanguage: {language}"
)
# chain 4: input= summary, language and output= followup_message
chain_four = LLMChain(llm=llm, prompt=fourth_prompt,
                      output_key="followup_message"
                     )

Combine the four LLMChains above and feed them into a SequentialChain:

# overall_chain: input= Review 
# and output= English_Review,summary, followup_message
overall_chain = SequentialChain(
    chains=[chain_one, chain_two, chain_three, chain_four],
    input_variables=["Review"],
    output_variables=["English_Review", "summary","followup_message"],
    verbose=True
)

Finally, run the SequentialChain and inspect the result (df here is a pandas DataFrame of product reviews loaded earlier in the course notebook):

review = df.Review[5]
overall_chain(review)

5.4 RouterChain

Handles more complex tasks where the input may branch to different sub-chains.
First, set up templates for four subjects:

physics_template = """You are a very smart physics professor. \
You are great at answering questions about physics in a concise\
and easy to understand manner. \
When you don't know the answer to a question you admit\
that you don't know.

Here is a question:
{input}"""


math_template = """You are a very good mathematician. \
You are great at answering math questions. \
You are so good because you are able to break down \
hard problems into their component parts, 
answer the component parts, and then put them together\
to answer the broader question.

Here is a question:
{input}"""

history_template = """You are a very good historian. \
You have an excellent knowledge of and understanding of people,\
events and contexts from a range of historical periods. \
You have the ability to think, reflect, debate, discuss and \
evaluate the past. You have a respect for historical evidence\
and the ability to make use of it to support your explanations \
and judgements.

Here is a question:
{input}"""


computerscience_template = """ You are a successful computer scientist.\
You have a passion for creativity, collaboration,\
forward-thinking, confidence, strong problem-solving capabilities,\
understanding of theories and algorithms, and excellent communication \
skills. You are great at answering coding questions. \
You are so good because you know how to solve a problem by \
describing the solution in imperative steps \
that a machine can easily interpret and you know how to \
choose a solution that has a good balance between \
time complexity and space complexity. 

Here is a question:
{input}"""

Collect them in a list so they are easy to process:

prompt_infos = [
    {
        "name": "physics", 
        "description": "Good for answering questions about physics", 
        "prompt_template": physics_template
    },
    {
        "name": "math", 
        "description": "Good for answering math questions", 
        "prompt_template": math_template
    },
    {
        "name": "History", 
        "description": "Good for answering history questions", 
        "prompt_template": history_template
    },
    {
        "name": "computer science", 
        "description": "Good for answering computer science questions", 
        "prompt_template": computerscience_template
    }
]

Import the required classes:

from langchain.chains.router import MultiPromptChain
from langchain.chains.router.llm_router import LLMRouterChain,RouterOutputParser
from langchain.prompts import PromptTemplate

Initialize the LLM:

llm = ChatOpenAI(temperature=0)

Create the destination_chains dictionary, linking each template above with the LLM and adding the result to the dictionary:

# dictionary of destination chains
destination_chains = {}
for p_info in prompt_infos:
    # extract the subject name via the "name" key
    name = p_info["name"]
    # extract the template via the "prompt_template" key
    prompt_template = p_info["prompt_template"]
    # build the prompt from the template
    prompt = ChatPromptTemplate.from_template(template=prompt_template)
    # link the LLM and the prompt
    chain = LLMChain(llm=llm, prompt=prompt)
    # store the chain in destination_chains under its name
    destination_chains[name] = chain

# one "name: description" string per entry in prompt_infos
destinations = [f"{p['name']}: {p['description']}" for p in prompt_infos]
# join the entries with "\n" into a single string
destinations_str = "\n".join(destinations)

Create a default chain to handle inputs that fit none of the subjects above:

default_prompt = ChatPromptTemplate.from_template("{input}")
default_chain = LLMChain(llm=llm, prompt=default_prompt)

The multi-prompt router template is as follows:

MULTI_PROMPT_ROUTER_TEMPLATE = """Given a raw text input to a \
language model select the model prompt best suited for the input. \
You will be given the names of the available prompts and a \
description of what the prompt is best suited for. \
You may also revise the original input if you think that revising\
it will ultimately lead to a better response from the language model.

<< FORMATTING >>
Return a markdown code snippet with a JSON object formatted to look like:
```json
{{{{
    "destination": string \ name of the prompt to use or "DEFAULT"
    "next_inputs": string \ a potentially modified version of the original input
}}}}
```

REMEMBER: "destination" MUST be one of the candidate prompt \
names specified below OR it can be "DEFAULT" if the input is not\
well suited for any of the candidate prompts.
REMEMBER: "next_inputs" can just be the original input \
if you don't think any modifications are needed.

<< CANDIDATE PROMPTS >>
{destinations}

<< INPUT >>
{{input}}

<< OUTPUT (remember to include the ```json)>>"""

Generate the router chain from the template above:

# substitute destinations_str for destinations
router_template = MULTI_PROMPT_ROUTER_TEMPLATE.format(
    destinations=destinations_str
)
router_prompt = PromptTemplate(
    template=router_template,
    input_variables=["input"],
    # the parser helps the chain decide which sub-chain to route to
    output_parser=RouterOutputParser(),
)
# link the LLM with router_prompt
router_chain = LLMRouterChain.from_llm(llm, router_prompt)

Finally, wire up the multi-prompt chain:

chain = MultiPromptChain(
                         router_chain=router_chain, 
                         destination_chains=destination_chains, 
                         default_chain=default_chain, 
                         verbose=True
                        )

The chain can now handle these more complex requests:

chain.run("What is black body radiation?")
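Conceptually, routing reduces to picking a key into the destination_chains dictionary. A toy sketch, with a keyword-matching function standing in for the router LLM (an illustration only, not LangChain internals):

```python
def fake_router(text):
    """Stand-in for the router LLM: picks a destination by keyword."""
    if "force" in text or "radiation" in text:
        return "physics"
    if "integral" in text or "2+2" in text:
        return "math"
    return "DEFAULT"

# Destination "chains" as simple functions keyed by name.
destination_chains = {
    "physics": lambda q: f"[physics expert] {q}",
    "math": lambda q: f"[math expert] {q}",
}
default_chain = lambda q: f"[generalist] {q}"

def run(text):
    dest = fake_router(text)
    # Fall back to the default chain when the router returns "DEFAULT".
    chain = destination_chains.get(dest, default_chain)
    return chain(text)

print(run("What is black body radiation?"))
print(run("Tell me a joke"))
```

The real router replaces the keyword check with an LLM call whose JSON output names the destination.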

6. Question & Answer Over Documents

This combines a language model with documents. A typical language model can only inspect a few thousand words at a time, so how do we handle large documents?
That is the role of embeddings and vector stores.

6.1 Embeddings

An embedding is a numerical representation of text that captures the semantic meaning of the text fragment it covers; texts with similar content have similar vectors.
This lets us compare text fragments in vector space, which is very useful when deciding which pieces of text to pass to the LLM to answer a question.
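The comparison itself is plain arithmetic. Cosine similarity between two embedding vectors can be sketched as follows (toy 3-dimensional vectors; real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: semantically similar texts get similar vectors.
v_dog = [0.9, 0.1, 0.0]
v_puppy = [0.8, 0.2, 0.1]
v_car = [0.0, 0.1, 0.9]

print(cosine_similarity(v_dog, v_puppy))  # high: similar meaning
print(cosine_similarity(v_dog, v_car))    # low: unrelated meaning
```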

6.2 Vector Databases

A vector database is a way of storing the vector representations created in the previous step.
It is populated with text fragments from incoming documents: a large document is first split into smaller chunks, so that only the relevant chunks need to be passed to the LLM; an embedding is then created for each chunk, and the embeddings are stored in the vector database.

When a query arrives, an embedding is created for it first; its vector is compared against every vector in the database, and the closest few are selected.
The selected fragments are returned and placed into the prompt passed to the LLM to produce the final answer.
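This query flow can be sketched end to end in plain Python (a toy in-memory store with a fake keyword-count embedding, standing in for a real embedding model and an optimized index):

```python
import math

def embed(text):
    """Fake embedding: counts of a few keywords. Stands in for a real model."""
    words = text.lower().split()
    vocab = ["shirt", "sun", "jacket", "rain"]
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

# "Vector database": store a (vector, chunk) pair for each text chunk.
chunks = ["shirt with sun protection", "rain jacket for hiking"]
store = [(embed(c), c) for c in chunks]

# At query time: embed the query, then return the closest chunk.
query_vec = embed("sun shirt")
best = max(store, key=lambda pair: cosine(pair[0], query_vec))
print(best[1])  # the chunk that would be passed to the LLM
```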

6.3 Example Code

Further reference material is available on CSDN.
VectorstoreIndexCreator is a wrapper that bundles several operations: creating the embeddings, initializing the index, and generating a retriever.

6.3.1 Quick version

Import the relevant classes:

# chain for question answering over documents
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
# document loader, for loading the data to combine with the model
from langchain.document_loaders import CSVLoader
# in-memory vector store
from langchain.vectorstores import DocArrayInMemorySearch
from IPython.display import display, Markdown

Set the file path and load the file with the loader:

file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file)

Import the index creator:

# used to create a vector store index
from langchain.indexes import VectorstoreIndexCreator

Create the vector store index:

# the first argument specifies the vector store class
# from_loaders takes a list of document loaders as input
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])

Create a query:

query ="Please list all your shirts with sun protection \
in a table in markdown and summarize each one."

Run the query against the index:

response = index.query(query)
display(Markdown(response))

6.3.2 Step-by-step version

First create a document loader and load the documents:

loader = CSVLoader(file_path=file)
docs = loader.load()

Create the embeddings object:

from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()

The embeddings object turns a piece of text into a vector:

embed = embeddings.embed_query("Hi my name is Harrison")
print(len(embed))
print(embed[:5])

Create embeddings for all the loaded documents and store them in a vector database:

# takes a list of documents and an embeddings object, and produces a vector database
db = DocArrayInMemorySearch.from_documents(
    docs, 
    embeddings
)

Use the vector database to find text fragments similar to an incoming query:

query = "Please suggest a shirt with sunblocking"
docs = db.similarity_search(query)
len(docs)
docs[0]

Use the existing vector index, i.e. the vector database, to generate a retriever:

retriever = db.as_retriever()

Create the LLM object:

llm = ChatOpenAI(temperature = 0.0)

Join the retrieved documents into one long string and pass it to the LLM as part of the prompt:

qdocs = "".join([docs[i].page_content for i in range(len(docs))])
response = llm.call_as_llm(f"{qdocs} Question: Please list all your \
shirts with sun protection in a table in markdown and summarize each one.") 
display(Markdown(response))

Combine the steps above into a single chain and use it:

# this chain first retrieves documents, then answers over the retrieved documents
qa_stuff = RetrievalQA.from_chain_type(
    # the LLM, used to generate the final answer text
    llm=llm, 
    # the chain type
    chain_type="stuff", 
    # the interface for fetching documents; retrieved documents are passed to the language model
    retriever=retriever, 
    verbose=True
)
query =  "Please list all your shirts with sun protection in a table \
in markdown and summarize each one."
response = qa_stuff.run(query)
display(Markdown(response))

The same response can also be produced with:

response = index.query(query, llm=llm)

There are four chain types to choose from when creating the chain:

  • stuff: put everything into a single prompt, send it to the LLM, and get one reply;
  • map_reduce: pass each text chunk to the LLM together with the question, collect the answers, then have another LLM call combine them into a final answer;
  • refine: process the documents iteratively, with each answer building on the answer from the previous document;
  • map_rerank: make one LLM call per document and ask it to return a score, then pick the highest score; you must define the scoring criteria in the prompt.
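The difference between stuff and map_reduce can be sketched with toy functions standing in for the LLM calls (hypothetical helpers for illustration, not LangChain internals):

```python
def fake_llm_answer(chunk, question):
    """Stand-in for one LLM call: 'answers' by extracting matching words."""
    return [w for w in chunk.split() if w in question.split()]

def fake_llm_combine(partial_answers):
    """Stand-in for the final reduce call: merges the partial answers."""
    merged = []
    for ans in partial_answers:
        for w in ans:
            if w not in merged:
                merged.append(w)
    return merged

question = "sun protection shirts"
chunks = ["shirts with sun protection", "waterproof rain jackets"]

# stuff: a single call with every chunk concatenated into one prompt
stuff_answer = fake_llm_answer(" ".join(chunks), question)

# map_reduce: one call per chunk, then a final combining call
partials = [fake_llm_answer(c, question) for c in chunks]
map_reduce_answer = fake_llm_combine(partials)

print(stuff_answer, map_reduce_answer)
```

stuff is cheapest but is limited by the context window; map_reduce trades extra LLM calls for the ability to cover many chunks.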

7. Evaluation: assessing the application's performance

First build an application to evaluate; the example is the program from the previous section:

from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.vectorstores import DocArrayInMemorySearch
file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file)
data = loader.load()
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])
llm = ChatOpenAI(temperature = 0.0)
qa = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=index.vectorstore.as_retriever(), 
    verbose=True,
    chain_type_kwargs = {
        "document_separator": "<<<<>>>>>"
    }
)

Next, find some data points to evaluate the application on; there are several ways to get them.
One is to come up with data points we consider good examples ourselves.
The manual way: read the docs yourself and write out QA examples:

data[10]
data[11]
examples = [
    {
        "query": "Do the Cozy Comfort Pullover Set\
        have side pockets?",
        "answer": "Yes"
    },
    {
        "query": "What collection is the Ultra-Lofty \
        850 Stretch Down Hooded Jacket from?",
        "answer": "The DownTek collection"
    }
]

Alternatively, use a chain to generate QA examples automatically, which saves a lot of time:

from langchain.evaluation.qa import QAGenerateChain
example_gen_chain = QAGenerateChain.from_llm(ChatOpenAI())
new_examples = example_gen_chain.apply_and_parse(
    [{"doc": t} for t in data[:5]]
)
new_examples[0]

Combine the hand-written and auto-generated examples, and run one of the queries through the chain:

examples += new_examples
qa.run(examples[0]["query"])

The evaluation step.
Manual evaluation ("grade it yourself"):
it is hard to do error analysis from the chain's final output alone, so set debug to True and run again to show the details:

import langchain
langchain.debug = True
qa.run(examples[0]["query"])

LLM-assisted evaluation:
use a fresh LLM in an evaluation chain (QAEvalChain) to grade the content the first LLM produced:

# Turn off the debug mode
langchain.debug = False
predictions = qa.apply(examples)
from langchain.evaluation.qa import QAEvalChain
llm = ChatOpenAI(temperature=0)
eval_chain = QAEvalChain.from_llm(llm)
graded_outputs = eval_chain.evaluate(examples, predictions)
for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question: " + predictions[i]['query'])
    print("Real Answer: " + predictions[i]['answer'])
    print("Predicted Answer: " + predictions[i]['result'])
    print("Predicted Grade: " + graded_outputs[i]['text'])
    print()

8. Agents: a LangChain component

An agent lets the LLM use new knowledge and data supplied by the user to answer questions or reason over content.
The tools in this example include a Wikipedia API.
First import the required functions:

from langchain.agents.agent_toolkits import create_python_agent
from langchain.agents import load_tools, initialize_agent
from langchain.agents import AgentType
from langchain.tools.python.tool import PythonREPLTool
from langchain.python import PythonREPL
from langchain.chat_models import ChatOpenAI

Create the LLM and the tools it will use:

llm = ChatOpenAI(temperature=0)
tools = load_tools(["llm-math","wikipedia"], llm=llm)

Next create an agent:

agent= initialize_agent(
    tools, 
    llm, 
    # CHAT: an agent optimized for chat models
    # REACT: a prompting technique designed to elicit the best reasoning from the model
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    # when the output format is malformed, send the error text back to the LLM so it can correct itself
    handle_parsing_errors=True,
    verbose = True)

Ask a math question:

agent("What is the 25% of 300?")

Ask a question that requires a Wikipedia lookup:

question = "Tom M. Mitchell is an American computer scientist \
and the Founders University Professor at Carnegie Mellon University (CMU)\
what book did he write?"
result = agent(question) 

Use the language model to write code and then execute it:

agent = create_python_agent(
    llm,
# the REPL can be thought of as a notebook: a tool for executing code
    tool=PythonREPLTool(),
    verbose=True
)

Ask the agent to sort a list:

customer_list = [["Harrison", "Chase"], 
                 ["Lang", "Chain"],
                 ["Dolly", "Too"],
                 ["Elle", "Elem"], 
                 ["Geoff","Fusion"], 
                 ["Trance","Former"],
                 ["Jen","Ayai"]
                ]
agent.run(f"""Sort these customers by \
last name and then first name \
and print the output: {customer_list}""") 

View the agent's detailed run with debug on:

import langchain
langchain.debug=True
agent.run(f"""Sort these customers by \
last name and then first name \
and print the output: {customer_list}""") 
langchain.debug=False

Define your own tool and add it to the tools list:

#!pip install DateTime
from langchain.agents import tool
from datetime import date
@tool
def time(text: str) -> str:
# the docstring below is required: the agent uses it to decide when to call this tool
    """Returns today's date, use this for any \
    questions related to knowing today's date. \
    The input should always be an empty string, \
    and this function will always return today's \
    date - any date mathematics should occur \
    outside this function."""
    return str(date.today())

agent= initialize_agent(
    tools + [time], 
    llm, 
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors=True,
    verbose = True)