Language & LLMs

What Is QLoRA?

QLoRA is a fine-tuning method that applies LoRA on top of a quantized model, storing the base weights in a compressed low-bit format. This reduces memory requirements enough to fine-tune very large models on a single GPU. It preserves much of the quality of full-precision fine-tuning.

What Is QLoRA?

Related topics

Further reading