XingfuYi

Flamingo

Posted on 2025-02-12. Edited on 2025-11-26. In notes, LLM.

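Flamingo scales the output of each cross-attention layer by a tanh gate whose learnable parameter is initialized to 0. Since tanh(0) = 0, the cross-attention output is 0 at the start of training, so the model initially behaves like the pretrained language model. This improves training stability.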
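A minimal sketch of the gating idea in plain Python, not the actual Flamingo implementation. The function name `gated_residual`, the parameter name `alpha`, and the hard-coded `attn_out` values are assumptions for illustration; a real layer would compute `attn_out` with cross-attention over visual features and learn `alpha` by gradient descent.

```python
import math

def gated_residual(x, attn_out, alpha):
    """Return x + tanh(alpha) * attn_out, elementwise over two equal-length lists."""
    gate = math.tanh(alpha)  # tanh(0) == 0, so the gate is closed at initialization
    return [xi + gate * ai for xi, ai in zip(x, attn_out)]

x = [0.5, -1.0, 2.0]         # hidden state coming from the language model
attn_out = [3.0, 4.0, -5.0]  # hypothetical stand-in for a cross-attention output

y_init = gated_residual(x, attn_out, alpha=0.0)   # gate closed: output equals x
y_later = gated_residual(x, attn_out, alpha=1.0)  # gate partly open: attn_out mixes in
```

With `alpha = 0.0` the layer is an identity on `x`, so adding it does not disturb the language model's hidden states at the first training step. As `alpha` moves away from 0, the cross-attention signal is blended in gradually.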
