1. Models, Prompts and Parsers
Model: the underlying language model that powers the application
Prompts: a way of constructing inputs for the model
Parsers: take the model's output and parse it into a structured form that is convenient for downstream processing
For the conda command that installs LangChain, see the LangChain official site.
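For reference, a sketch of the documented conda command (check the official site for the current form):
conda install langchain -c conda-forge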
2. Using Prompt Templates in LangChain
Prompt templates make good prompts reusable: you change only the core pieces, much like a function interface that abstracts the implementation away and only asks you for arguments.
import os
from langchain.chat_models import ChatOpenAI
# Initialize a chat model, passing in the API key
# Note: you may need to pass openai_api_key explicitly by keyword
chat = ChatOpenAI(openai_api_key=os.getenv("CHATGPT_API_KEY"), temperature=0)
# The prompt template used for the demo
template_string = """Translate the text \
that is delimited by triple backticks \
into a style that is {style}. \
text: ```{text}```
"""
# Import the corresponding class
from langchain.prompts import ChatPromptTemplate
# Build a prompt template from the string, similar to defining and initializing a function
prompt_template = ChatPromptTemplate.from_template(template_string)
# Inspect the generated prompt template
prompt_template.messages[0].prompt
# Inspect the input variables the template expects
prompt_template.messages[0].prompt.input_variables
# The template's two arguments
customer_style = """American English \
in a calm and respectful tone
"""
customer_email = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse, \
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""
# Pass the two arguments into the template, similar to calling a function
customer_messages = prompt_template.format_messages(
    style=customer_style,
    text=customer_email)
# Check that the arguments were slotted into customer_messages as the template specifies
print(customer_messages[0])
# Use the chat model initialized earlier to send the messages to the LLM, capturing the reply in customer_response
customer_response = chat(customer_messages)
# Print the model's response
print(customer_response.content)
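The same template can now be reused with different arguments, which is the point of templating. A quick sketch, with a made-up service reply and target style:
service_reply = """Hey there customer, the warranty does not cover \
cleaning expenses for your kitchen because it's your fault that \
you misused your blender. Tough luck! See ya!"""
service_messages = prompt_template.format_messages(
    style="a polite tone that speaks in English Pirate",
    text=service_reply)
print(chat(service_messages).content)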
3. Using a Parser to structure the output
First, import the relevant LangChain classes:
from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser
Next, define the response schemas that describe the desired format:
gift_schema = ResponseSchema(
    name="gift",
    description="Was the item purchased as a gift for someone else? "
                "Answer True if yes, False if not or unknown.")
delivery_days_schema = ResponseSchema(
    name="delivery_days",
    description="How many days did it take for the product to arrive? "
                "If this information is not found, output -1.")
price_value_schema = ResponseSchema(
    name="price_value",
    description="Extract any sentences about the value or price, "
                "and output them as a comma separated Python list.")
# Collect the schemas into a list
response_schemas = [gift_schema,
                    delivery_days_schema,
                    price_value_schema]
Build an output parser from the schemas:
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
Then generate format instructions from the parser, to be placed into the message handed to the LLM:
format_instructions = output_parser.get_format_instructions()
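To see what the parser will ask the model for, print the generated instructions:
print(format_instructions)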
Put the generated format instructions into the prompt:
review_template_2 = """\
For the following text, extract the following information:
gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.
delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.
price_value: Extract any sentences about the value or price, \
and output them as a comma separated Python list.
text: {text}
{format_instructions}
"""
Build a prompt template from review_template_2 and pass in the required arguments (customer_review is the raw product review text to analyze):
prompt = ChatPromptTemplate.from_template(review_template_2)
messages = prompt.format_messages(text=customer_review,
                                  format_instructions=format_instructions)
Hand the messages to the LLM for analysis:
response = chat(messages)
Inspect the returned content:
print(response.content)
Use the parser to convert the returned content into structured output stored in output_dict, a Python dictionary:
output_dict = output_parser.parse(response.content)
print(type(output_dict))
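Since the result is a plain dict, individual fields can be read directly (a usage sketch; the value shown is illustrative):
output_dict.get('delivery_days')   # e.g. '2'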
4. Memory: letting the model remember context, i.e. the earlier conversation
4.1 ConversationBufferMemory
First import the three LangChain classes we will use:
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
Create and initialize the language model and the memory, then pass both in to initialize the ConversationChain.
The verbose parameter controls whether the ConversationChain prints its internal details, which you will see in the output below.
llm = ChatOpenAI(openai_api_key=openai.api_key, temperature=0)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True)
The following calls show the ConversationChain's detailed trace.
They also show that the chain automatically stitches the context back in, helping the LLM remember earlier turns:
conversation.predict(input="Hi, I'm Marcus")
conversation.predict(input="Can you tell me what the capital of China is?")
conversation.predict(input="Do you still remember my name?")
The following lines show the conversation currently stored in memory:
print(memory.buffer)
# The braces here are an empty dict, a required argument
# The method has other powerful capabilities, which we set aside for now
memory.load_memory_variables({})
The save_context method adds content to the memory directly:
memory.save_context({"input": "Hi"}, {"output": "What's up"})
Each LLM exchange is an independent, stateless API call with no built-in link to earlier context; the model only appears to have "memory" because the wrapper code implements it:
on every API call, the complete conversation recorded so far is sent along with the new input.
As the conversation grows, each call transmits more data and consumes more tokens, so to keep costs down LangChain offers other, more flexible storage strategies.
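To make this concrete, here is a minimal sketch (not LangChain's actual implementation) of why the model appears to remember: each call rebuilds and re-sends the accumulated transcript.
history = []
def chat_with_memory(user_input):
    history.append(("Human", user_input))
    # The full transcript so far is rebuilt into the prompt on every call
    transcript = "\n".join(f"{role}: {text}" for role, text in history)
    reply = llm.predict(transcript + "\nAI:")
    history.append(("AI", reply))
    return reply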
4.2 ConversationBufferWindowMemory
Uses the number of conversation turns to limit how much is remembered.
The creation code is below;
the parameter k caps the window size: k = 1 in the example means only the most recent exchange between the two parties is stored.
from langchain.memory import ConversationBufferWindowMemory
# Create the memory
memory = ConversationBufferWindowMemory(k=1)
# Create the model
llm = ChatOpenAI(openai_api_key=openai.api_key, temperature=0)
# Create the conversation chain
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True)
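A quick check of the windowing behavior (a sketch; with k = 1 only the most recent exchange should survive):
memory.save_context({"input": "Hi"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
memory.load_memory_variables({})
# -> {'history': 'Human: Not much, just hanging\nAI: Cool'}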
4.3 ConversationTokenBufferMemory
Uses a token count to limit how much is remembered.
The code is below.
ConversationTokenBufferMemory takes two arguments:
- an LLM, because different LLMs count tokens in different ways
- the maximum number of tokens to keep
from langchain.memory import ConversationTokenBufferMemory
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=30)
memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"},
                    {"output": "Charming!"})
Vary the token limit and inspect what remains in memory with:
memory.load_memory_variables({})
4.4 ConversationSummaryBufferMemory
Uses an LLM to summarize the older parts of the conversation, then loads the summary into memory to save space.
Messages within the token limit are stored verbatim; anything beyond it is summarized instead.
Like the token buffer memory, a SummaryBufferMemory needs two arguments at creation: an LLM and max_token_limit.
The code is below:
from langchain.memory import ConversationSummaryBufferMemory
# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your PowerPoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because LangChain is such a powerful tool. \
At noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."
# Create the SummaryBufferMemory
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"},
                    {"output": f"{schedule}"})
To verify and understand the behavior, test with the following code:
memory.load_memory_variables({})
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True)
conversation.predict(input="What is a good demo?")
memory.load_memory_variables({})
5. Chains
A chain links an LLM with a prompt; multiple such building blocks can be composed to run a sequence of operations on your data.
5.1 LLMChain
First, import the required classes:
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain
Initialize the model:
llm = ChatOpenAI(temperature=0.9)
Create the prompt template:
prompt = ChatPromptTemplate.from_template(
"What is the best name to describe \
a company that makes {product}?"
)
Use LLMChain to link the model with the prompt template:
chain = LLMChain(llm=llm, prompt=prompt)
Supply the argument the template expects and run the chain:
product = "Queen Size Sheet Set"
chain.run(product)
5.2 SimpleSequentialChain
Runs a series of chains one after another; suitable when each sub-chain has exactly one input and one output.
First import the relevant class:
from langchain.chains import SimpleSequentialChain
Initialize the LLM and the first prompt template, and link them:
this prompt produces a company name from a product name.
llm = ChatOpenAI(temperature=0.9)
# prompt template 1
first_prompt = ChatPromptTemplate.from_template(
"What is the best name to describe \
a company that makes {product}?"
)
# Chain 1
chain_one = LLMChain(llm=llm, prompt=first_prompt)
Create a second prompt template and link it with the LLM:
this prompt produces a company description from a company name.
# prompt template 2
second_prompt = ChatPromptTemplate.from_template(
    "Write a 20-word description for the following \
company: {company_name}"
)
# chain 2
chain_two = LLMChain(llm=llm, prompt=second_prompt)
Load the two LLMChains into a SimpleSequentialChain, which runs them in order:
overall_simple_chain = SimpleSequentialChain(chains=[chain_one, chain_two],
                                             verbose=True)
overall_simple_chain.run(product)
5.3 SequentialChain
Suitable when sub-chains have multiple inputs or multiple outputs.
First import the required class:
from langchain.chains import SequentialChain
Create the first LLMChain and name its output_key:
llm = ChatOpenAI(temperature=0.9)
# prompt template 1: translate to english
first_prompt = ChatPromptTemplate.from_template(
    "Translate the following review to english:"
    "\n\n{Review}"
)
# chain 1: input= Review, output= English_Review
chain_one = LLMChain(llm=llm, prompt=first_prompt,
                     output_key="English_Review")
Create the second LLMChain and name its output_key:
second_prompt = ChatPromptTemplate.from_template(
    "Can you summarize the following review in 1 sentence:"
    "\n\n{English_Review}"
)
# chain 2: input= English_Review, output= summary
chain_two = LLMChain(llm=llm, prompt=second_prompt,
                     output_key="summary")
Create the third LLMChain and name its output_key:
# prompt template 3: detect the language
third_prompt = ChatPromptTemplate.from_template(
    "What language is the following review:\n\n{Review}"
)
# chain 3: input= Review, output= language
chain_three = LLMChain(llm=llm, prompt=third_prompt,
                       output_key="language")
Create the fourth LLMChain and name its output_key:
# prompt template 4: follow-up message
fourth_prompt = ChatPromptTemplate.from_template(
    "Write a follow up response to the following "
    "summary in the specified language:"
    "\n\nSummary: {summary}\n\nLanguage: {language}"
)
# chain 4: input= summary, language; output= followup_message
chain_four = LLMChain(llm=llm, prompt=fourth_prompt,
                      output_key="followup_message")
Combine the four LLMChains and feed them into a SequentialChain:
# overall_chain: input= Review
# output= English_Review, summary, followup_message
overall_chain = SequentialChain(
    chains=[chain_one, chain_two, chain_three, chain_four],
    input_variables=["Review"],
    output_variables=["English_Review", "summary", "followup_message"],
    verbose=True
)
Finally, run the SequentialChain and inspect the result (df here is a pandas DataFrame of product reviews loaded earlier in the notebook):
review = df.Review[5]
overall_chain(review)
5.4 RouterChain
Handles more complex tasks, where the flow branches to different sub-chains.
First, set up templates for the four subjects:
physics_template = """You are a very smart physics professor. \
You are great at answering questions about physics in a concise \
and easy to understand manner. \
When you don't know the answer to a question you admit \
that you don't know.
Here is a question:
{input}"""
math_template = """You are a very good mathematician. \
You are great at answering math questions. \
You are so good because you are able to break down \
hard problems into their component parts,
answer the component parts, and then put them together \
to answer the broader question.
Here is a question:
{input}"""
history_template = """You are a very good historian. \
You have an excellent knowledge of and understanding of people, \
events and contexts from a range of historical periods. \
You have the ability to think, reflect, debate, discuss and \
evaluate the past. You have a respect for historical evidence \
and the ability to make use of it to support your explanations \
and judgements.
Here is a question:
{input}"""
computerscience_template = """You are a successful computer scientist. \
You have a passion for creativity, collaboration, \
forward-thinking, confidence, strong problem-solving capabilities, \
understanding of theories and algorithms, and excellent communication \
skills. You are great at answering coding questions. \
You are so good because you know how to solve a problem by \
describing the solution in imperative steps \
that a machine can easily interpret and you know how to \
choose a solution that has a good balance between \
time complexity and space complexity.
Here is a question:
{input}"""
Gather the templates into a list so they are easy to process:
prompt_infos = [
    {
        "name": "physics",
        "description": "Good for answering questions about physics",
        "prompt_template": physics_template
    },
    {
        "name": "math",
        "description": "Good for answering math questions",
        "prompt_template": math_template
    },
    {
        "name": "History",
        "description": "Good for answering history questions",
        "prompt_template": history_template
    },
    {
        "name": "computer science",
        "description": "Good for answering computer science questions",
        "prompt_template": computerscience_template
    }
]
Import the required classes:
from langchain.chains.router import MultiPromptChain
from langchain.chains.router.llm_router import LLMRouterChain,RouterOutputParser
from langchain.prompts import PromptTemplate
Initialize the LLM:
llm = ChatOpenAI(temperature=0)
Create a destination_chains dict, link each template above with the LLM, and store the chains in the dict:
# Initialize the dict of destination chains
destination_chains = {}
for p_info in prompt_infos:
    # Extract the subject name via the "name" key
    name = p_info["name"]
    # Extract the template via the "prompt_template" key
    prompt_template = p_info["prompt_template"]
    # Generate the prompt from the template
    prompt = ChatPromptTemplate.from_template(template=prompt_template)
    # Link the LLM with the prompt
    chain = LLMChain(llm=llm, prompt=prompt)
    # Store the chain in destination_chains under its name
    destination_chains[name] = chain
# Render each prompt_infos entry as a "name: description" string
destinations = [f"{p['name']}: {p['description']}" for p in prompt_infos]
# Join the entries into a single newline-separated string
destinations_str = "\n".join(destinations)
Create a default chain to handle questions that none of the subjects cover:
default_prompt = ChatPromptTemplate.from_template("{input}")
default_chain = LLMChain(llm=llm, prompt=default_prompt)
Below is the multi-prompt router template:
MULTI_PROMPT_ROUTER_TEMPLATE = """Given a raw text input to a \
language model select the model prompt best suited for the input. \
You will be given the names of the available prompts and a \
description of what the prompt is best suited for. \
You may also revise the original input if you think that revising\
it will ultimately lead to a better response from the language model.
<< FORMATTING >>
Return a markdown code snippet with a JSON object formatted to look like:
```json
{{{{
    "destination": string \ name of the prompt to use or "DEFAULT"
    "next_inputs": string \ a potentially modified version of the original input
}}}}
```
REMEMBER: "destination" MUST be one of the candidate prompt \
names specified below OR it can be "DEFAULT" if the input is not\
well suited for any of the candidate prompts.
REMEMBER: "next_inputs" can just be the original input \
if you don't think any modifications are needed.
<< CANDIDATE PROMPTS >>
{destinations}
<< INPUT >>
{{input}}
<< OUTPUT (remember to include the ```json)>>"""
Generate the router chain from the template above:
# Substitute destinations_str for the {destinations} placeholder
router_template = MULTI_PROMPT_ROUTER_TEMPLATE.format(
    destinations=destinations_str
)
router_prompt = PromptTemplate(
    template=router_template,
    input_variables=["input"],
    # The parser helps the chain decide which sub-chain to route to
    output_parser=RouterOutputParser(),
)
# Link the LLM with router_prompt
router_chain = LLMRouterChain.from_llm(llm, router_prompt)
Finally, wire up the multi-prompt chain:
chain = MultiPromptChain(
    router_chain=router_chain,
    destination_chains=destination_chains,
    default_chain=default_chain,
    verbose=True
)
The chain can now handle complex, branching requests:
chain.run("What is black body radiation?")
6. Question & Answer Over Documents
This combines a language model with your documents; but a typical language model can only inspect a few thousand words at a time, so how do we handle large documents?
That is where embeddings and vector stores come in.
6.1 Embeddings
An embedding is a numerical representation of a piece of text that captures the semantic meaning of the span it covers; texts with similar content have similar vectors.
This lets us compare text fragments in vector space,
which is very useful when deciding which fragments to pass to the LLM to answer a question.
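A small sketch of "comparing texts in vector space" using cosine similarity (the sentences are made up for illustration):
import numpy as np
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
v1 = embeddings.embed_query("My blender lid flew off")
v2 = embeddings.embed_query("The mixer cover came loose")
v3 = embeddings.embed_query("The capital of France is Paris")

def cosine(a, b):
    a, b = np.array(a), np.array(b)
    # Cosine similarity: dot product over the product of the norms
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(v1, v2))  # similar meaning -> closer to 1
print(cosine(v1, v3))  # unrelated -> noticeably lower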
6.2 Vector Database
A vector database is a way to store the vector representations created in the previous step.
It is populated with chunks of text taken from the incoming documents:
a large document is first split into small chunks, an embedding is created for each chunk, and the embeddings are stored in the vector database.
When a query arrives, an embedding is created for it first; that vector is compared against all the vectors in the database and the closest few are selected.
The selected chunks are returned, placed into the prompt, and handed to the LLM to produce the final answer.
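A minimal sketch of the split -> embed -> store -> query flow just described, assuming a hypothetical long_text string (the splitter parameters are illustrative):
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import DocArrayInMemorySearch

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.create_documents([long_text])  # split the big document into small chunks
db = DocArrayInMemorySearch.from_documents(chunks, OpenAIEmbeddings())  # embed and store
closest = db.similarity_search("your question here", k=3)  # embed the query, return the closest chunks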
6.3 Example code
Further reference material is available on CSDN.
VectorstoreIndexCreator wraps a lot of logic in one call: creating the embeddings, initializing the index, and generating a retriever.
6.3.1 The quick version
Import the relevant classes:
# Used to run retrieval QA over documents
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
# Document loader, for loading the documents to combine with the model
from langchain.document_loaders import CSVLoader
# In-memory vector store
from langchain.vectorstores import DocArrayInMemorySearch
from IPython.display import display, Markdown
Set the file path and load the file with a loader:
file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file)
Import the index creator:
# Used to create a vector store index
from langchain.indexes import VectorstoreIndexCreator
Create the vector store index:
# The first argument specifies the vector store class to use
# from_loaders takes a list of document loaders as input
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])
Create the query:
query = "Please list all your shirts with sun protection \
in a table in markdown and summarize each one."
Query the index with it:
response = index.query(query)
display(Markdown(response))
6.3.2 The detailed version
First create a document loader and load the documents:
loader = CSVLoader(file_path=file)
docs = loader.load()
Create the embeddings:
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
embeddings turns a piece of text into a vector, as the following shows:
embed = embeddings.embed_query("Hi my name is Harrison")
print(len(embed))
print(embed[:5])
Create embeddings for all the loaded documents and store them in a vector database:
# Takes a list of documents and an embedding object, and builds a vector store
db = DocArrayInMemorySearch.from_documents(
    docs,
    embeddings
)
Use the vector database to find text fragments similar to an incoming query:
query = "Please suggest a shirt with sunblocking"
docs = db.similarity_search(query)
len(docs)
docs[0]
Turn the vector database, i.e. our vector index, into a retriever:
retriever = db.as_retriever()
Create the LLM object:
llm = ChatOpenAI(temperature = 0.0)
Join the documents into one long string and pass it to the LLM as part of the prompt:
qdocs = "".join([docs[i].page_content for i in range(len(docs))])
response = llm.call_as_llm(f"{qdocs} Question: Please list all your \
shirts with sun protection in a table in markdown and summarize each one.")
display(Markdown(response))
Combine the steps above into a single chain and use it:
# The chain first retrieves, then answers over the retrieved documents
qa_stuff = RetrievalQA.from_chain_type(
    # The LLM used to generate the final answer text
    llm=llm,
    # The chain type
    chain_type="stuff",
    # The document-fetching interface; retrieved documents are passed to the language model
    retriever=retriever,
    verbose=True
)
query = "Please list all your shirts with sun protection in a table \
in markdown and summarize each one."
response = qa_stuff.run(query)
display(Markdown(response))
The same response can also be produced in one line:
response = index.query(query, llm=llm)
There are four chain types to choose from when creating the chain (a usage sketch follows the list):
- stuff: put everything into one prompt, hand it to the LLM, and get a single reply;
- map_reduce: pass each text chunk, together with the question, to the LLM to get an answer per chunk, then have another LLM call consolidate those answers;
- refine: process the documents iteratively, each answer building on the answer for the previous document;
- map_rerank: make one LLM call per document and ask it to return a score, then pick the highest; you must define the scoring criteria in the prompt.
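Switching type is just a matter of changing the chain_type argument; a sketch reusing the llm and retriever from above, with map_reduce:
qa_map_reduce = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="map_reduce",  # answer each chunk separately, then consolidate
    retriever=retriever,
)
response = qa_map_reduce.run(query)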
7. Evaluation: assessing the application's performance
First build an application to evaluate; the example reuses the document QA program from the previous section:
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.vectorstores import DocArrayInMemorySearch
file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file)
data = loader.load()
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])
llm = ChatOpenAI(temperature=0.0)
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=index.vectorstore.as_retriever(),
    verbose=True,
    chain_type_kwargs={
        "document_separator": "<<<<>>>>>"
    }
)
Next we need data points to evaluate against; there are several ways to get them.
One is to come up with examples we consider good ourselves.
The manual approach: read the documents and write QA examples by hand:
data[10]
data[11]
examples = [
    {
        "query": "Does the Cozy Comfort Pullover Set "
                 "have side pockets?",
        "answer": "Yes"
    },
    {
        "query": "What collection is the Ultra-Lofty "
                 "850 Stretch Down Hooded Jacket from?",
        "answer": "The DownTek collection"
    }
]
Using a chain to generate QA examples automatically saves a lot of time:
from langchain.evaluation.qa import QAGenerateChain
example_gen_chain = QAGenerateChain.from_llm(ChatOpenAI())
new_examples = example_gen_chain.apply_and_parse(
    [{"doc": t} for t in data[:5]]
)
new_examples[0]
Combine the hand-written and the generated examples, then run one query through the chain:
examples += new_examples
qa.run(examples[0]["query"])
The evaluation step.
Manual evaluation ("grade it yourself"):
it is hard to do error analysis from the chain's final output alone, so set debug to True and run again to see the details:
import langchain
langchain.debug = True
qa.run(examples[0]["query"])
LLM-assisted evaluation:
use a fresh LLM to build an evaluation chain (QAEvalChain) that grades the content the first LLM produced:
# Turn off the debug mode
langchain.debug = False
predictions = qa.apply(examples)
from langchain.evaluation.qa import QAEvalChain
llm = ChatOpenAI(temperature=0)
eval_chain = QAEvalChain.from_llm(llm)
graded_outputs = eval_chain.evaluate(examples, predictions)
for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question: " + predictions[i]['query'])
    print("Real Answer: " + predictions[i]['answer'])
    print("Predicted Answer: " + predictions[i]['result'])
    print("Predicted Grade: " + graded_outputs[i]['text'])
    print()
8. Agent: a LangChain component
An agent lets the LLM use new knowledge and data supplied by the user to answer questions or reason over content.
The tools in the example include a Wikipedia API.
First import what we need:
from langchain.agents.agent_toolkits import create_python_agent
from langchain.agents import load_tools, initialize_agent
from langchain.agents import AgentType
from langchain.tools.python.tool import PythonREPLTool
from langchain.python import PythonREPL
from langchain.chat_models import ChatOpenAI
Create the LLM and the tools to use:
llm = ChatOpenAI(temperature=0)
tools = load_tools(["llm-math","wikipedia"], llm=llm)
Then create an agent:
agent = initialize_agent(
    tools,
    llm,
    # CHAT: an agent optimized for chat models
    # REACT: a prompting technique designed to elicit the best reasoning from the model
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    # On malformed output, pass the error text back to the LLM and ask it to correct itself
    handle_parsing_errors=True,
    verbose=True)
Ask a math question:
agent("What is the 25% of 300?")
Ask a Wikipedia question:
question = "Tom M. Mitchell is an American computer scientist \
and the Founders University Professor at Carnegie Mellon University (CMU)\
what book did he write?"
result = agent(question)
Use the language model to write code and then execute it:
agent = create_python_agent(
    llm,
    # The REPL can be thought of as a notebook for executing code
    tool=PythonREPLTool(),
    verbose=True
)
Ask the agent to sort a list:
customer_list = [["Harrison", "Chase"],
                 ["Lang", "Chain"],
                 ["Dolly", "Too"],
                 ["Elle", "Elem"],
                 ["Geoff", "Fusion"],
                 ["Trance", "Former"],
                 ["Jen", "Ayai"]]
agent.run(f"""Sort these customers by \
last name and then first name \
and print the output: {customer_list}""")
To watch the agent's detailed run:
import langchain
langchain.debug=True
agent.run(f"""Sort these customers by \
last name and then first name \
and print the output: {customer_list}""")
langchain.debug=False
Define your own tool and add it to the tools list:
#!pip install DateTime
from langchain.agents import tool
from datetime import date
@tool
def time(text: str) -> str:
    # The docstring below is required: the agent uses it to decide when to call this tool
    """Returns today's date, use this for any \
    questions related to knowing today's date. \
    The input should always be an empty string, \
    and this function will always return today's \
    date - any date mathematics should occur \
    outside this function."""
    return str(date.today())
agent = initialize_agent(
    tools + [time],
    llm,
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors=True,
    verbose=True)
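Finally, ask the agent something that should trigger the custom tool (a sketch; the agent may occasionally mis-parse its own output, so a try/except is prudent):
try:
    result = agent("What's the date today?")
except Exception as e:
    print("exception on external access:", e)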