“… what level of success have you had making something work that wasn’t working with the base model + RAG?”

Experiences shared:

  • Tried fine-tuning a model to specialize in breaking text down into Q&A pairs; it ended up hallucinating, generating assumed Q&A pairs that were not in the source text. The process was much harder than they expected.
  • “Go look at unsloth. Free Jupyter notebooks with a GPU on Google colab. All you have to do is run it. Start with that, see how it works, then start tweaking things. Like try a different dataset, how does learning rate affect things, then epochs, then batch size, etc.”
    • A reply supporting the point that good devs can earn a reputation by sharing resources like this.
  • “For every application I have had, finetuning was critical to get any acceptable level of accuracy.” Their use cases: “semantic text correction and structured knowledge creation.”
  • “In my opinion it really just comes down to how much work you put into your dataset. Results are typically going to be subpar if you just do some basic automated data extraction and roll with whatever was spit out. It’s tedious, but you need to go over the datasets yourself to some extent. Whether that’s random sampling to look for issues or just going through it piece by piece. That tends to be where you notice issues with perspective around how data is presented, bias, repetitive language, etc. That said, I think that fine tuning plus RAG is always the best rather than going with one or the other.”
  • I am still working on the “self learning” part, but here’s the RAG and indexer. For fine-tuning the model I am using the unsloth chat completion template; you can change the template name to whatever suits you. For the dataset, I built one from scratch since my use case is for a very specific domain. Feel free to let me know if you have any questions.
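The dataset-curation advice above (random sampling to look for issues, plus watching for repetitive language) can be sketched in plain Python. The field names, sample data, and repetition threshold below are illustrative assumptions, not details from the thread:

```python
import random
from collections import Counter

def sample_for_review(dataset, k=5, seed=0):
    """Draw a reproducible random sample of examples for manual review."""
    rng = random.Random(seed)
    return rng.sample(dataset, min(k, len(dataset)))

def flag_repetitive(dataset, field="question", threshold=2):
    """Flag field values that repeat more than `threshold` times --
    a cheap proxy for the repetitive-language issue mentioned above."""
    counts = Counter(ex[field] for ex in dataset)
    return {value for value, n in counts.items() if n > threshold}

# Hypothetical Q&A dataset in the shape an automated extractor might emit.
dataset = [
    {"question": "What is RAG?", "answer": "Retrieval-augmented generation."},
    {"question": "What is RAG?", "answer": "A retrieval technique."},
    {"question": "What is RAG?", "answer": "Retrieval plus generation."},
    {"question": "What is LoRA?", "answer": "Low-rank adaptation."},
]

for ex in sample_for_review(dataset, k=2):
    print(ex)
print("repetitive:", flag_repetitive(dataset))
```

Automated checks like this only surface the obvious problems; as the comment says, some amount of piece-by-piece human review is still needed to catch issues of perspective and bias.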
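The “chat completion template” step mentioned above amounts to rendering each Q&A pair into the string format the base model was trained on. A minimal sketch using a ChatML-style layout (the template string here is an illustrative assumption; in unsloth you would select a named template for your model rather than hand-writing one):

```python
# ChatML-style training template -- an illustrative assumption;
# swap in whatever template your base model actually expects.
CHATML = (
    "<|im_start|>user\n{question}<|im_end|>\n"
    "<|im_start|>assistant\n{answer}<|im_end|>\n"
)

def to_training_text(example):
    """Render one Q&A pair into a single training string."""
    return CHATML.format(**example)

example = {
    "question": "What is fine-tuning?",
    "answer": "Continuing training on a task-specific dataset.",
}
print(to_training_text(example))
```

Using the wrong template for a model is a common silent failure mode, which is one reason the commenter highlights that the template name is configurable.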

One thing people reiterate in the comments is how little good documentation there is, and how few good tutorials, on doing fine-tuning well (which is likely why a post like this exists).
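The tweak-one-thing-at-a-time advice from the unsloth comment (try a different learning rate, then epochs, then batch size) can be organized as a small grid of runs. The values below are placeholders for illustration, not recommendations from the thread:

```python
from itertools import product

# Placeholder hyperparameter values -- vary one axis at a time,
# as the comment suggests, to see how each affects results.
grid = {
    "learning_rate": [2e-4, 5e-5],
    "num_epochs": [1, 3],
    "batch_size": [2, 8],
}

def configs(grid):
    """Yield one config dict per combination in the grid."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

runs = list(configs(grid))
print(len(runs), "runs")  # 2 * 2 * 2 = 8
print(runs[0])
```

Each config dict would then be passed to your training setup (e.g. a notebook run); logging the config alongside the resulting eval metrics is what turns the tweaking into something you can learn from.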