Reasoning model fine-tuned from Qwen2.5-3B Instruct. They not only trained this model, but also created a dataset from Qwen’s QwQ model, which was one of the first reasoning models to come to the open source community. They use the dataset of long CoT chains to fine-tune the models.
They specifically mention “To encourage research in the open-source community, we’ve also made the dataset publicly available - feel free to use it!”