Flamingo

Posted on 2025-02-12 Edited on 2025-11-26 In notes, LLM

Flamingo applies a tanh gate, with its learnable scalar parameter initialized to zero, so that each newly inserted cross-attention layer initially outputs zero. This keeps the frozen language model's behavior unchanged at initialization and improves training stability.
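A minimal PyTorch sketch of this gating, assuming a standard multi-head attention layer; the module and parameter names are illustrative, not taken from the Flamingo codebase (which also applies the same tanh gating to its inserted feed-forward sublayers):

```python
import torch
import torch.nn as nn

class GatedCrossAttention(nn.Module):
    """Illustrative Flamingo-style tanh-gated cross-attention block."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Learnable scalar gate initialized to 0, so tanh(0) = 0 and the
        # block contributes nothing at the start of training.
        self.attn_gate = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor, visual: torch.Tensor) -> torch.Tensor:
        # x: language features (B, T, D); visual: vision features (B, S, D)
        attn_out, _ = self.attn(query=x, key=visual, value=visual)
        # tanh gating: the residual branch starts at exactly 0, so the
        # frozen LM is undisturbed at init; the gate opens as training
        # proceeds, which is what stabilizes optimization.
        return x + torch.tanh(self.attn_gate) * attn_out
```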
