From the same authors as Unsloth, sharing a new finding: when quantizing to 4-bit, some layers have to be kept in their original precision. They call accounting for these layers "dynamic quantization". They include not only the models they have quantized with this method, but also Colab notebooks that provide a template for fine-tuning them. The more resources they provide, the easier it is for someone to get started.
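
As a rough sketch of the underlying idea (not Unsloth's actual implementation), Hugging Face's `BitsAndBytesConfig` lets you exclude named modules from quantization when loading a model in 4-bit. The model id and the skipped layer names below are placeholders; which layers are worth keeping in full precision is exactly what their per-model analysis determines.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    # Modules listed here stay in their original precision instead of 4-bit.
    # Which layers matter is model-specific; these names are illustrative.
    llm_int8_skip_modules=["lm_head", "model.layers.0.mlp.down_proj"],
)

model = AutoModelForCausalLM.from_pretrained(
    "your-model-id",  # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",
)
```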