Paper title: Direct Preference Optimization: Your Language Model is Secretly a Reward Model. Paper link...
1. Overview. During pre-training, large language models (LLMs) capture the characteristics of their training data. Because this data typically contains both high-quality and low-quality content, models sometimes exhibit undesirable behaviors, such as fabricating facts or generating biased...
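The core of the paper is the DPO objective, which optimizes the policy directly on preference pairs against a frozen reference model, with no explicit reward model. A minimal sketch in plain Python follows; the function and argument names are illustrative, and `beta` is the paper's temperature hyperparameter:

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed log-probability of a full response
    under either the trainable policy or the frozen reference model.
    """
    # Implicit reward margin: how much more the policy favors the
    # chosen response over the reference model, compared to the same
    # quantity for the rejected response.
    margin = beta * ((policy_logp_chosen - ref_logp_chosen)
                     - (policy_logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)), written in a numerically stable form.
    return math.log1p(math.exp(-margin)) if margin > -30.0 else -margin

# When the policy equals the reference model, the margin is 0 and the
# loss is log(2): the model is indifferent between the two responses.
```

Minimizing this loss pushes the policy to widen the log-probability gap between chosen and rejected responses relative to the reference model, which is exactly the reparameterized RLHF objective the paper derives.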
Paper title: LoRA: Low-Rank Adaptation of Large Language Models. Paper link: https://arxiv.org/abs/2106....
Paper title: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. Paper link: https://...
Paper title: Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallel...
Paper title: Tree of Thoughts: Deliberate Problem Solving with Large Language Models. Paper link: https:...
Paper title: LIMA: Less Is More for Alignment. Paper link: https://arxiv.org/abs/2305.11206
Paper title: Self-Consistency Improves Chain of Thought Reasoning in Language Models. Paper link: https:...
Paper title: GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism. Paper link: https://arxiv.org/ab...