LlamaIndex is a RAG (Retrieval-Augmented Generation) framework. It provides the abstractions needed to more easily ingest, structure, and access private or domain-specific data, so that this data can be injected into an LLM safely and reliably for more accurate text generation.
When introducing new knowledge, RAG tends to work better than fine-tuning and offers more control: it injects the new knowledge into a pretrained language model at query time, reducing hallucinations by simplifying the problem the model has to solve.
One advantage of LlamaIndex is that it ships with a built-in vector store. Two other RAG frameworks are LangChain and GroundX; LangChain is commonly paired with the Pinecone vector database, the so-called "LCPC" stack.
The ModelScope community provides an example of retrieval augmentation with LlamaIndex, and you can run it in ModelScope's free GPU environment.
Step1: Install the dependencies
!pip install llama-index llama-index-llms-huggingface ipywidgets
!pip install transformers -U
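The code below also relies on the modelscope package for model downloads and the embedding pipeline. ModelScope notebook environments usually ship with it preinstalled; if yours does not, install it as well:
!pip install modelscope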
import logging
import sys
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
from IPython.display import Markdown, display
import torch
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core.prompts import PromptTemplate
from modelscope import snapshot_download
from llama_index.core.base.embeddings.base import BaseEmbedding
from abc import ABC
from typing import Any, List
from llama_index.core import (
    VectorStoreIndex,
    ServiceContext,
    set_global_service_context,
    SimpleDirectoryReader,
)
Step2: Load the large language model
# Model name (Qwen1.5-4B-Chat, downloaded via ModelScope)
qwen1_5_4B_CHAT = "qwen/Qwen1.5-4B-Chat"
selected_model = snapshot_download(qwen1_5_4B_CHAT)
SYSTEM_PROMPT = """You are a helpful AI assistant.
"""
query_wrapper_prompt = PromptTemplate(
    "[INST]<<SYS>>\n" + SYSTEM_PROMPT + "<</SYS>>\n\n{query_str}[/INST] "
)
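To check what the model will actually receive, you can render the template yourself: PromptTemplate.format() fills in the {query_str} placeholder (the sample question here is just an illustration):
print(query_wrapper_prompt.format(query_str="What is RAG?"))
# [INST]<<SYS>>
# You are a helpful AI assistant.
# <</SYS>>
#
# What is RAG?[/INST]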
llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=2048,
    generate_kwargs={"temperature": 0.0, "do_sample": False},
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name=selected_model,
    model_name=selected_model,
    device_map="auto",
    # change the settings below depending on your GPU
    model_kwargs={"torch_dtype": torch.float16},
)
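Before wiring the model into an index, a quick smoke test confirms it loaded and generates text; HuggingFaceLLM exposes a complete() method whose response carries the generation in its text field (the prompt here is just an illustration):
print(llm.complete("請用一句話介紹西安。").text)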
Step3: Load the knowledge base documents (markdown format)
!mkdir -p 'data/xianjiaoda/'
!wget 'https://modelscope.oss-cn-beijing.aliyuncs.com/resource/rag/xianjiaoda.md' -O 'data/xianjiaoda/xianjiaoda.md'
documents = SimpleDirectoryReader("data/xianjiaoda/").load_data()  # same relative path as the mkdir/wget above
documents
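SimpleDirectoryReader returns a list of Document objects, each carrying the file's text plus metadata, so it is worth a quick inspection before indexing:
print(len(documents))
print(documents[0].metadata)
print(documents[0].text[:200])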
Step4: Construct embeddings with the GTE model
embedding_model = "iic/nlp_gte_sentence-embedding_chinese-base"
class ModelScopeEmbeddings4LlamaIndex(BaseEmbedding, ABC):
    embed: Any = None
    model_id: str = "iic/nlp_gte_sentence-embedding_chinese-base"

    def __init__(
        self,
        model_id: str,
        **kwargs: Any,
    ) -> None:
        super().__init__(**kwargs)
        try:
            from modelscope.pipelines import pipeline
            from modelscope.utils.constant import Tasks

            # Build the ModelScope sentence-embedding pipeline (downloads the model if needed);
            # pass the model_id argument rather than the class default so the parameter takes effect
            self.embed = pipeline(Tasks.sentence_embedding, model=model_id)
        except ImportError as e:
            raise ValueError(
                "Could not import some python packages. "
                "Please install them with `pip install modelscope`."
            ) from e

    def _get_query_embedding(self, query: str) -> List[float]:
        text = query.replace("\n", " ")
        inputs = {"source_sentence": [text]}
        return self.embed(input=inputs)["text_embedding"][0].tolist()

    def _get_text_embedding(self, text: str) -> List[float]:
        text = text.replace("\n", " ")
        inputs = {"source_sentence": [text]}
        return self.embed(input=inputs)["text_embedding"][0].tolist()

    def _get_text_embeddings(self, texts: List[str]) -> List[List[float]]:
        texts = [t.replace("\n", " ") for t in texts]
        inputs = {"source_sentence": texts}
        return self.embed(input=inputs)["text_embedding"].tolist()

    async def _aget_query_embedding(self, query: str) -> List[float]:
        return self._get_query_embedding(query)
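A quick way to verify the wrapper is to embed a sentence and check the vector's length via the public get_text_embedding() method inherited from BaseEmbedding; the expected dimensionality of 768 is an assumption based on the GTE chinese-base model's BERT-base backbone:
test_embeddings = ModelScopeEmbeddings4LlamaIndex(model_id=embedding_model)
vector = test_embeddings.get_text_embedding("西安交通大學")
print(len(vector))  # expected 768 for the chinese-base GTE model (assumption)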
Step5: Build the LlamaIndex vector index used for retrieval; the embeddings and LLM need to be set first
embeddings = ModelScopeEmbeddings4LlamaIndex(model_id=embedding_model)
service_context = ServiceContext.from_defaults(embed_model=embeddings, llm=llm)
set_global_service_context(service_context)
index = VectorStoreIndex.from_documents(documents)
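from_documents() re-embeds every document each run. If you plan to reuse the index, llama-index can persist it to disk and load it back; the ./storage directory below is an arbitrary choice, and the global service context set above supplies the same embedding model on reload:
index.storage_context.persist(persist_dir="./storage")
# later, reload instead of rebuilding:
from llama_index.core import StorageContext, load_index_from_storage
index = load_index_from_storage(StorageContext.from_defaults(persist_dir="./storage"))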
Step6: Finally, query and answer questions based on the local knowledge base!
query_engine = index.as_query_engine()
response = query_engine.query("西安交大是由哪幾個學校合并的?")  # "Which schools were merged to form Xi'an Jiaotong University?"
print(response)
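Besides the answer text, the response object exposes the retrieved chunks that grounded it via its source_nodes attribute, which is handy for verifying what the index actually matched:
for node in response.source_nodes:
    print(node.score, node.node.get_text()[:100])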