Paper title: Direct Preference Optimization: Your Language Model is Secretly a R...
Paper title: Propagation Tree Is Not Deep: Adaptive Graph Contrastive Learning A...
一给涕、概述 大語(yǔ)言模型(LLMs)在預(yù)訓(xùn)練的過(guò)程中通常會(huì)捕捉數(shù)據(jù)的特征垛叨,而這些訓(xùn)練數(shù)據(jù)通常既包含高質(zhì)量的也包含低質(zhì)量的,因此模型有時(shí)會(huì)產(chǎn)生不被期望...
Paper title: LoRA: Low-Rank Adaptation of Large Language Models. Paper link: https://arxi...
Paper title: Megatron-LM: Training Multi-Billion Parameter Language Models Using...
Paper title: Tree of Thoughts: Deliberate Problem Solving with Large Language Mo...
Paper title: LIMA: Less Is More for Alignment. Paper link: https://arxiv.org/abs/2305.112...
Paper title: Self-Consistency Improves Chain of Thought Reasoning in Language Mo...
Paper title: GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism. Paper link: https...