Andrew Ng has proposed a reflection-based LLM translation workflow — GitHub - andrewyng/translation-agent. The workflow goes like this:
- Prompt an LLM to translate a text from source_language to target_language;
- Have the LLM reflect on its translation and give constructive suggestions for improvement;
- Use those suggestions to improve the translation.
This is a fairly new approach to AI translation: the LLM critiques and revises its own output, which yields noticeably better results.
The project also shows how long texts can be split into chunks that are each put through the reflection process, working around the LLM's token limit and making efficient, high-quality one-click translation of long documents practical.
In addition, the project constrains the model to a specific country or region for more precise output (for example, distinguishing American from British English), and it suggests further optimizations, such as building a glossary for terms the LLM was never trained on (or terms with several competing translations) to push accuracy even higher.
All of this can be reproduced with a FastGPT workflow. This article walks you through replicating Andrew Ng's translation-agent in FastGPT, step by step.
Reflection translation for a single text block
Let's start with the simple case: a single text block that stays within the LLM's token limit.
Initial translation
Step one is to have the LLM produce an initial translation of the source text block:
Use the "text concatenation" module to pull in three parameters — source language, target language, and source text — to build the prompt, then pass it to the LLM to get a first-pass translation.
Prompt:
This is an {{source_lang}} to {{target_lang}} translation, please provide the {{target_lang}} translation for this text. \
Do not provide any explanations or text apart from the translation.
{{source_lang}}: {{source_text}}
{{target_lang}}:
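If you'd rather assemble this prompt in a code-run module instead of the text-concatenation module, a minimal sketch looks like the following (the parameter names simply mirror the workflow variables above):

function main({ source_lang, target_lang, source_text }) {
  // Fill the initial-translation prompt template with the three workflow variables
  const prompt =
    `This is an ${source_lang} to ${target_lang} translation, please provide the ${target_lang} translation for this text. ` +
    `Do not provide any explanations or text apart from the translation.\n` +
    `${source_lang}: ${source_text}\n\n${target_lang}:`;
  return { prompt };
}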
Reflection
Next, have the LLM review the initial translation from step one and propose revisions; we call this the reflection step.
Prompt:
Your task is to carefully read a source text and a translation from {{source_lang}} to {{target_lang}}, and then give constructive criticism and helpful suggestions to improve the translation. \
The final style and tone of the translation should match the style of {{target_lang}} colloquially spoken in {{country}}.
The source text and initial translation, delimited by XML tags <SOURCE_TEXT></SOURCE_TEXT> and <TRANSLATION></TRANSLATION>, are as follows:
<SOURCE_TEXT>
{{source_text}}
</SOURCE_TEXT>
<TRANSLATION>
{{translation_1}}
</TRANSLATION>
When writing suggestions, pay attention to whether there are ways to improve the translation's \n\
(i) accuracy (by correcting errors of addition, mistranslation, omission, or untranslated text),\n\
(ii) fluency (by applying {{target_lang}} grammar, spelling and punctuation rules, and ensuring there are no unnecessary repetitions),\n\
(iii) style (by ensuring the translations reflect the style of the source text and takes into account any cultural context),\n\
(iv) terminology (by ensuring terminology use is consistent and reflects the source text domain; and by only ensuring you use equivalent idioms {{target_lang}}).\n\
Write a list of specific, helpful and constructive suggestions for improving the translation.
Each suggestion should address one specific part of the translation.
Output only the suggestions and nothing else.
This prompt takes five parameters: the source text, the initial translation, the source language, the target language, and the country/region qualifier. With these, the LLM produces a good number of concrete suggestions for revising the earlier translation, setting up the improvement step that follows.
Improved translation
Prompt:
Your task is to carefully read, then edit, a translation from {{source_lang}} to {{target_lang}}, taking into
account a list of expert suggestions and constructive criticisms.
The source text, the initial translation, and the expert linguist suggestions are delimited by XML tags <SOURCE_TEXT></SOURCE_TEXT>, <TRANSLATION></TRANSLATION> and <EXPERT_SUGGESTIONS></EXPERT_SUGGESTIONS> \
as follows:
<SOURCE_TEXT>
{{source_text}}
</SOURCE_TEXT>
<TRANSLATION>
{{translation_1}}
</TRANSLATION>
<EXPERT_SUGGESTIONS>
{{reflection}}
</EXPERT_SUGGESTIONS>
Please take into account the expert suggestions when editing the translation. Edit the translation by ensuring:
(i) accuracy (by correcting errors of addition, mistranslation, omission, or untranslated text),
(ii) fluency (by applying {{target_lang}} grammar, spelling and punctuation rules and ensuring there are no unnecessary repetitions), \
(iii) style (by ensuring the translations reflect the style of the source text)
(iv) terminology (inappropriate for context, inconsistent use), or
(v) other errors.
Output only the new translation and nothing else.
With the initial translation and the reflection in hand, we feed both into a third LLM call, which gives us a final translation of noticeably higher quality.
Results
Since I planned to reuse this reflection translation later, I built it as a plugin; below I simply invoke that plugin to run it. The result looks like this:
I picked a passage from Harry Potter at random.
As you can see, the reflection-translated version is quite a bit better. The reflection output itself looks like this:
Reflection translation for long texts
Once single-block reflection translation works, extending it to long texts — that is, multiple text blocks — is just a matter of splitting and looping.
The overall logic: first check the token count of the input text. If it is within the configured token limit, call the single-block reflection translation directly; if it exceeds the limit, split the text into reasonably sized chunks and run reflection translation on each chunk (see the sketch after the list below).
There are two reasons for splitting into chunks:
1. The model's output context is only about 4k tokens, so it cannot emit more than 4k tokens of text.
2. Splitting the input reduces the hallucinations that overly long inputs tend to cause.
Counting tokens
First, I use the "Laf function" module to count the tokens of the input text.
Laf functions are very easy to use, essentially ready out of the box: create an application on the Laf cloud development platform, install the tiktoken dependency, and paste in the following code:
const { Tiktoken } = require("tiktoken/lite");
const cl100k_base = require("tiktoken/encoders/cl100k_base.json");

interface IRequestBody {
  str: string
}

interface RequestProps extends IRequestBody {
  systemParams: {
    appId: string,
    variables: string,
    histories: string,
    cTime: string,
    chatId: string,
    responseChatItemId: string
  }
}

interface IResponse {
  message: string;
  tokens: number;
}

export default async function (ctx: FunctionContext): Promise<IResponse> {
  const { str = "" }: RequestProps = ctx.body
  // Count tokens with the cl100k_base encoding (the one used by GPT-3.5/GPT-4)
  const encoding = new Tiktoken(
    cl100k_base.bpe_ranks,
    cl100k_base.special_tokens,
    cl100k_base.pat_str
  );
  const tokens = encoding.encode(str);
  encoding.free();
  return {
    message: 'ok',
    tokens: tokens.length
  };
}
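Once deployed, the Laf function is called over HTTP with the text in the str field. A hypothetical example call (the endpoint URL is a placeholder for your own Laf function address):

// Example request to the token-counting function above (placeholder URL)
const res = await fetch("https://your-app.laf.run/count-tokens", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ str: "The quick brown fox jumps over the lazy dog." }),
});
const data = await res.json(); // e.g. { message: "ok", tokens: 10 }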
Back in FastGPT, click "Sync parameters", then wire the source text into the module to get its token count.
Calculating the chunk size
This step needs no third-party packages, just a bit of arithmetic, so the "code run" module is enough:
function main({tokenCount, tokenLimit}){
  // Number of chunks needed to stay under the limit
  const numChunks = Math.ceil(tokenCount / tokenLimit);
  let chunkSize = Math.floor(tokenCount / numChunks);
  // Spread any remainder across the chunks
  const remainingTokens = tokenCount % tokenLimit;
  if (remainingTokens > 0) {
    chunkSize += Math.floor(remainingTokens / numChunks);
  }
  return {chunkSize};
}
The code above gives us a reasonable single-block size that stays within the token limit.
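For example, a 2,500-token input with a 1,000-token limit works out to three chunks of roughly 999 tokens each:

// Worked example for the chunk-size calculation above
main({ tokenCount: 2500, tokenLimit: 1000 });
// numChunks = ceil(2500 / 1000) = 3
// chunkSize = floor(2500 / 3) = 833, remainder = 2500 % 1000 = 500
// chunkSize += floor(500 / 3) = 166  →  { chunkSize: 999 }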
Obtaining the split source text blocks
Given the chunk size and the source text, we write another Laf function that uses langchain's textsplitters package to split the text. The code is as follows:
import cloud from '@lafjs/cloud'
import { TokenTextSplitter } from "@langchain/textsplitters";

interface IRequestBody {
  text: string
  chunkSize: number
}

interface RequestProps extends IRequestBody {
  systemParams: {
    appId: string,
    variables: string,
    histories: string,
    cTime: string,
    chatId: string,
    responseChatItemId: string
  }
}

interface IResponse {
  output: string[];
}

export default async function (ctx: FunctionContext): Promise<IResponse> {
  const { text = '', chunkSize = 1000 }: RequestProps = ctx.body;
  // First pass: split purely by token count
  const splitter = new TokenTextSplitter({
    encodingName: "gpt2",
    chunkSize: Number(chunkSize),
    chunkOverlap: 0,
  });
  const initialChunks = await splitter.splitText(text);
  console.log(initialChunks)
  // Sentence delimiters for different languages
  const sentenceDelimiters = /[。！？.!?]/;
  // Second pass: re-align chunk boundaries to sentence boundaries
  const output = [];
  let currentChunk = initialChunks[0];
  for (let i = 1; i < initialChunks.length; i++) {
    const sentences = initialChunks[i].split(sentenceDelimiters);
    if (sentences.length > 0) {
      currentChunk += sentences[0]; // append the first sentence to the current chunk
      output.push(currentChunk.trim()); // push the current chunk to the output array
      currentChunk = sentences.slice(1).join(''); // remaining sentences start the next chunk
    }
  }
  // Push the final chunk
  if (currentChunk.trim().length > 0) {
    output.push(currentChunk.trim());
  }
  console.log(output);
  return {
    output
  }
}
With that, we have the split text blocks, and from here the process closely mirrors the single-block reflection translation.
Multi-block translation
We can't quite reuse the single-block reflection translation plugin directly here, because the prompts need to handle some surrounding context (alternatively, you could extend the plugin built earlier to take a few more parameters).
The details are much like before: the prompts get a few substitutions, plus some very simple data processing. The overall flow consists of the three steps below (see the prompt sketch after the list).
Multi-block initial translation
Multi-block reflection
Multi-block improved translation
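The main change from the single-block prompts is that each chunk's prompt also carries the surrounding text as context and marks out the part to translate. A rough sketch of such an initial-translation prompt, loosely following the original project's approach (the {{tagged_text}} variable and the TRANSLATE_THIS tag are illustrative, not part of the workflow above):

Your task is to provide a professional translation from {{source_lang}} to {{target_lang}} of PART of a text.
The source text is given below, delimited by XML tags <SOURCE_TEXT></SOURCE_TEXT>. Translate only the part delimited by <TRANSLATE_THIS> and </TRANSLATE_THIS>; you may use the rest of the source text as context, but do not translate any other part.
<SOURCE_TEXT>
{{tagged_text}}
</SOURCE_TEXT>
Output only the translation of the indicated part and nothing else.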
Loop execution
The key piece of long-text reflection translation is looping the reflection translation over the text blocks.
FastGPT allows a workflow edge to loop back to an earlier node, so a very simple check function is enough to decide whether to stop or keep going.
JS code:
function main({chunks, currentChunk}){
  const findIndex = chunks.findIndex((item) => item === currentChunk)
  return {
    isEnd: chunks.length - 1 === findIndex,
    i: findIndex + 1,
  }
}
In other words, we check whether the block currently being processed is the last one, and use that to decide whether to continue. And with that, long-text reflection translation works end to end.
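For example, with three chunks and the second one currently being processed, the check returns:

// Example: three chunks, currently on the second one
main({ chunks: ["chunk A", "chunk B", "chunk C"], currentChunk: "chunk B" });
// → { isEnd: false, i: 2 }   (i is the index of the next chunk to process)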
Results
First, fill in the global settings:
Then paste in the text to translate. I chose a chapter of Harry Potter in the original English; its length, measured with OpenAI's tokenizer, is:
The actual run looks like this:
As you can see, the output is perfectly readable.
Further tuning
Prompt tuning
In the original project, the system prompts given to the AI are fairly brief. With more thorough prompts we can push the LLM toward better translations and raise the quality further. For example, we can apply chain-of-thought (CoT), having the LLM explicitly and systematically lay out its reasoning and show the full thought process behind the translation.
For instance, the initial translation prompt can be replaced with the following:
# Role: Senior translation expert
## Background:
You are a seasoned translation expert, fluent in both {{source_lang}} and {{target_lang}}, and especially skilled at rendering {{source_lang}} articles into smooth, easy-to-read {{target_lang}}. You have led many large translation projects and your translations are highly regarded.
## Attention:
- Throughout the translation, follow the principles of faithfulness, expressiveness, and elegance, with expressiveness mattering most
- The translation must match natural {{target_lang}} usage: plain, coherent, and fluent
- Avoid overly literary phrasing and obscure allusions
- Proper nouns and technical terms may be kept as-is or transliterated where appropriate
## Constraints:
- Strictly follow the four-round translation process: literal translation, free translation, review, and final version
- The translation must be faithful to the source and accurate, with nothing omitted or distorted
- Pay attention to context and avoid translating the same content twice
## Goals:
- Produce a high-quality {{target_lang}} translation of the {{source_lang}} source through the four-round process
- Convey the original meaning accurately, in language that is plain, readable, and flows well
- Use common sayings, idioms, and popular internet expressions in moderation to make the translation more approachable
- On top of the literal translation, provide at least 2 free translations in different styles to choose from
## Skills:
- Mastery of both {{source_lang}} and {{target_lang}}, with a solid linguistic foundation and extensive translation experience
- Skilled at converting {{source_lang}} phrasing into natural, idiomatic {{target_lang}}
- Keen awareness of how contemporary {{target_lang}} is evolving and of current language trends
## Workflow:
1. Round one, literal translation: translate sentence by sentence, faithful to the source, omitting nothing
2. Round two, free translation: building on the literal version, render the source in plain, fluent {{target_lang}}, giving at least 2 versions in different styles
3. Round three, review: examine the translation carefully, remove deviations and gaps, and make it more idiomatic and readable
4. Round four, final version: pick the best option, revise and polish it repeatedly, and deliver one concise, fluent translation suited to general readers
## OutputFormat:
- Before each round, use 【思考】 to state the key points of that round
- After each round, use 【翻譯】 to present that round's translation
- Present the final version inside a \`\`\` code block, with nothing added after the closing \`\`\`
## Suggestions:
- In the literal round, stay faithful to the source, but don't be slavishly word-for-word
- In the free round, express the original meaning accurately in the plainest possible {{target_lang}}
- In the review round, focus on whether the translation matches {{target_lang}} usage and reads easily
- In the final round, use common sayings, proverbs, and popular internet expressions in moderation to make the translation more down-to-earth
- Make good use of the flexibility of {{target_lang}}, expressing the same content in different ways to improve readability
This yields a more accurate, higher-quality initial translation. We then need one extra node to extract the round-four final version from that output:
JS code:
function main({data1}){
  // Split on ``` fences and take the last non-empty segment (the final version)
  const result = data1.split("```").filter(item => !!item.trim())
  if(result[result.length-1]) {
    return {
      result: result[result.length-1]
    }
  }
  return {
    result: 'No translation content extracted'
  }
}
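For example, given a model output that ends with the final version inside a code block, the function returns just that final version:

// Example: extract the round-four final version from the model output
main({ data1: "【思考】...\n【翻譯】...\n```\nFinal polished translation\n```" });
// → { result: "\nFinal polished translation\n" }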
The reflection and improvement steps that follow can likewise be given more precise prompts, for example:
Prompt:
# Role: Senior translation expert
## Background:
You are an experienced judge of translation quality, fluent in both {{source_lang}} and {{target_lang}}, and especially skilled at rendering {{source_lang}} articles into smooth, easy-to-read {{target_lang}}. You have proofread and reviewed many translated articles and can offer incisive feedback on a translation.
## Attention:
- The translation should follow the principles of faithfulness, expressiveness, and elegance, with expressiveness mattering most
- The translation must match natural {{target_lang}} usage: plain, coherent, and fluent
- The translation should avoid overly literary phrasing and obscure allusions
## Constraints:
- The translation must be faithful to the source and accurate, with nothing omitted or distorted
- Suggestions must be concrete, actionable, and incisive
- Give suggestions for each passage in as much detail as possible
## Goals:
- You will be given a passage of {{source_lang}} source text and its initial translation; your task is to give suggestions for improving that translation
- Assess each passage in as much detail as possible, suggesting changes where needed and leaving alone the parts that need none
- The translation should convey the original meaning accurately, in language that is plain, readable, and flows well
- Use common sayings, idioms, and popular internet expressions in moderation to make the translation more approachable
## Skills:
- Mastery of both {{source_lang}} and {{target_lang}}, with a solid linguistic foundation and extensive translation experience
- Skilled at converting {{source_lang}} phrasing into natural, idiomatic {{target_lang}}
- Keen awareness of how contemporary {{target_lang}} is evolving and of current language trends
Now let's look at the final result, testing it on a technical article:
In February of 1992, the development of Windows 3.1 was nearing a close, and the Windows team was trying to figure out what their next steps would be. By the 5th of March, the team knew that they’d be focusing on desktops, laptops, mobile, and pen with NT taking servers and workstations. The team also knew that they needed to address three major areas: UI, hardware support, networking.
There was a ton of stuff being worked on at this time (and through the rest of the 1990s) within Microsoft. Just within the Systems group (as distinct from the Apps group) Janus would release on the 6th of April as Windows 3.1, Astro would release in March of 1993 as MS-DOS 6.0, Winball would release in October of 1992 as Windows for Workgroups 3.1, Jaguar while being worked on at this time would never see an independent release (more on that in a bit), and then came the next windows projects: Cougar, Panther, Rover, NT, and Cairo. Cougar was a project to build a fully 32 bit Windows kernel, evolving the Windows 3.x 386 mode kernel for 386-class and higher machines. Panther was a project to port the win32 API to this new kernel. Rover was a project to make a mobile computing version of Cougar/Panther. The NT project was Microsoft’s first steps into a dedicated workstation and server release of Windows, and it would release in July of 1993. Cairo was a project for the next major release of NT, and it would mirror many of the changes to Windows from Cougar/Panther (and the reverse is also true). This system comprised of Cougar and Panther was known as Chicago. The Cougar portion of this system was vital to making a more stable and robust Windows. Beyond being a fully 32 bit protected-mode system, this new kernel would feature dynamically loaded and unloaded protected-mode device drivers. This system would also be threaded and fully support any MS-DOS program running from Windows (where previously in Windows 2 and 3, programs that wrote directly to video RAM would require Windows to terminate and stay resident, one side effect being that in really big Command and Conquer maps, the memory space of Windows would be overwritten and as a result Windows would not restore on exit).
These moves were huge for Chicago and for Microsoft more generally. When Chicago was taking shape in 1992, MS-DOS was still Microsoft’s bread and butter. Brad Silverberg was relatively new to Microsoft, but he had a very strong background. He had worked at Apple on the Lisa, and he had worked at Borland. By early 1992, he was the project leader of Chicago and the SVP of Microsoft’s personal systems division. In an internal Microsoft memo Silverberg said:
Lest anyone be confused, ms-dos is the the bedrock product of the company, accounting for a very major portion of Microsoft’s profits (ie, stock price). Further, it is under strong competitive pressures (I am more inclined to say “under attack”) from DR-DOS and IBM. We must protect this franchise with our lives. Short term, that means continued aggressive marketing plans. In addition, it also means we need to get yearly product releases out so we put the other guys on a treadmill, rather than be put on the treadmill. As a result, we are going to release a new version of MS-DOS this year, chock full of new goodies, while we move with full-speed toward cougar.
That new MS-DOS release was MS-DOS 6 mentioned earlier. The most visible and important new “goodies” referenced by Silverberg were disk defragmentation, disk compression, anti-virus, a new backup system, and file transfer tools. MS-DOS 6 was released in March of 1993 with updates being pushed until June of 1994.
I bring this up to try and portray where Microsoft and the industry were at this time. IBM compatible computers outnumbered all other computers by nearly 80 million units. MS-DOS or a compatible DOS system was installed on almost all of them (with OS/2 or Linux being rare). Most software on these computers ran in 16 bit real mode. Most hardware was configured with dip switches, and the config had to match that setting exactly. Loading a driver required knowledge of autoexec and load-high tools. Windows 3 was a huge success, and Windows 3.1 was an even greater success. Despite these successes and the resultant changes in Microsoft’s future plans, MS-DOS was still the market leader in PC operating systems by a very wide margin. Windows 3x did ameliorate some problems, but the old systems remained dominant. Due to this, Microsoft absolutely needed to ensure that MS-DOS was still part of their future despite having a more technically advanced system in NT. Adding to this, most computers that home users were purchasing were incapable of providing a good experience with NT. Chicago needed to provide the best experience possible for win16, win32, and MS-DOS applications on modest hardware, and it needed to be a noticeable improvement over Windows 3. If Microsoft failed in either case, they would be yielding ground to Digital Research or to IBM.
Ultimately, the need for backwards compatibility meant that some 16 bit code remained in Chicago. Without this, the backwards compatibility wouldn’t have been as good. In hindsight, given that IBM’s OS/2 could run DOS and Windows software, this was a very good decision on the part of Microsoft.
Chicago was structured in a way that is similar to Windows for Workgroups 3.1 (386 enhanced), but is far more refined. There are a large number of virtual device drivers (VxDs) running in 32 bit protected mode alongside virtual DOS machines (VDMs) running in a virtual real mode. These virtual device drivers are used for real physical hardware, for emulating devices for virtual machines, and for providing services to other software. Three of these VxDs comprise the very heart of Chicago: Virtual Machine Manager (VMM32.VXD), Configuration Manager (CONFIGMG), Installable Filesystem Manager (IFM). VMM32 is essentially the Chicago kernel. It handles memory management, event handling, interrupt handling, device driver loading and initialization, the creation of virtual machines, and the scheduling. CONFIGMG handles plug and play. IFM coordinates filesystem access, provides a disk buffer, and provides a 32 bit protected mode I/O access system. This bypasses MS-DOS entirely and was first seen 386 Windows 3 releases.
The translation comes out like this:
Impressive!
From now on, whatever article you want translated, however long it is, you can simply hand it to this translation expert, go about your business, and come back a little later to collect a polished translation. Who can top that?
Other tuning
Qualifier tuning, for example: the original project already demonstrates this by adding a country/region qualifier, and in practice it does bring a noticeable improvement.
Because the LLM is so capable, different prompts yield different translations, which means we can easily add specific qualifiers to get targeted, more precise results.
For terms that fall outside what the LLM knows, we can also use FastGPT's knowledge base feature to extend the bot accordingly (see the sketch below) and round out the translation assistant's capabilities.
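As a simple illustration of the terminology idea (a sketch only: the glossary object and the way the note is injected into the prompt are assumptions, not part of the original workflow), a code-run module could prepend matched glossary entries to the translation prompt:

// Sketch: collect glossary entries that appear in the source text and
// return a note that can be concatenated into the translation prompt.
// The glossary content and wording here are illustrative assumptions.
function main({ source_text }) {
  const glossary = {
    "FastGPT": "FastGPT (keep as-is, do not translate)",
    "workflow": "工作流",
  };
  const hits = Object.entries(glossary)
    .filter(([term]) => source_text.includes(term))
    .map(([term, rendering]) => `${term} -> ${rendering}`);
  const glossaryNote = hits.length
    ? `When translating, use the following terminology:\n${hits.join("\n")}`
    : "";
  return { glossaryNote };
}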
Conclusion
The next article will bring you an even more capable agent: a subtitle reflection-translation expert.
What can it do? Say you have an English subtitle file. However long it is, you can copy the whole thing, hand it to the subtitle translation expert, go about your business, and come back a little later to collect a polished bilingual Chinese-English subtitle file. Who can top that?
Finally, the giveaway: I've shared the complete workflow for this translation expert — the long-text reflection translation expert workflow — so help yourself.