酷酷的群 - Jianshu

酷酷的群

IP location: Zhejiang

Direct Preference Optimization (DPO): Basic Theory and Derivation
Paper title: Direct Preference Optimization: Your Language Model is Secretly a R...

A Graph Contrastive Learning Method for Rumor Detection with Adaptive View Augmentation
Paper title: Propagation Tree Is Not Deep: Adaptive Graph Contrastive Learning A...


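RLHF for Generative Large Models (Part 1): Fundamentals
1. Overview: During pretraining, large language models (LLMs) typically capture the characteristics of their training data, and that data usually contains both high-quality and low-quality content, so the models sometimes produce undesired...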

LoRA: Low-Rank Adaptation of Large Models for Downstream Tasks
Paper title: LoRA: Low-Rank Adaptation of Large Language Models. Paper link: https://arxi...

Megatron-LM: A Distributed Tensor Model Parallelism Method for Transformer Models
Paper title: Megatron-LM: Training Multi-Billion Parameter Language Models Using...

Tree of Thoughts: A Complex Reasoning Technique for Large Models
Paper title: Tree of Thoughts: Deliberate Problem Solving with Large Language Mo...

LIMA: Instruction Fine-Tuning with Small-Scale Supervised Data
Paper title: LIMA: Less Is More for Alignment. Paper link: https://arxiv.org/abs/2305.112...


Self-Consistency Chain-of-Thought Reasoning for Language Models
Paper title: Self-Consistency Improves Chain of Thought Reasoning in Language Mo...

GPipe: Micro-Batch Pipeline Parallelism
Paper title: GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism. Paper link: https...
