Posts sorted into themes
Codes were developed only for posts with an expressed intent; posts that do not make the OP's reason for posting clear were not considered.
Addressing a shared and in-demand problem of the community (spc)
Note: adding properties post_id, github, huggingface, notebook, and blogpost to all of these.
Posts
- Post as flywheel: Benchmarking effects of quantization
- Indirect compute donation: Llama 3 quants comparison
- Encouraging research: SmallThinker-3B-Preview
- Public service announcement: Llama 2 prompt format
- Model reviews: LLM “serious use” comparison, Huge LLM Comparison, Experience with Codestral for Android development
- Model release: SmallThinker-3B-Preview, Lexi Llama-3-8B-Uncensored, Translate to 400+ languages, FastApply (open-source Cursor)
- Simply converting models into a more convenient file format is helpful in itself, as with GGUF development for llama.cpp
- Model quantization:
- Quantization reviews/comparisons: Mistral-Large 35GB quantization, Qwen2.5 32B quant comparison, Qwen2.5 14B quant comparison, Benchmarking effects of quantization, Llama 3 quants comparison
- Quantization tooling: exl2 quantization, Unsloth dynamic quantization
- Prompt formatting/prompt engineering: Llama 2 prompt format, Prompt format comparisons, Perfect labels via prompting
- Open-source LLM tooling:
- Uncensoring models: llama.cpp --logit-bias flag, Prompt format comparisons
- Alternative cloud compute providers: Vast.ai cloud inferencing guide
Desiring feedback from the community (ffc)
Posts
- Lexi Llama-3-8B-Uncensored
- Perfect labels via prompting 2
- Conflicting feedback enables OP to reappraise their work
- Axiom prompt engineering
- Novel approach to prompting, encouraged to open source
- NexaAI, an Ollama alternative (repeat from spc)
- Background erase network
- Feedback can be about increasing accessibility rather than about the model itself.
- Omnivision-968M
- Very good example of an OP-user feedback loop, with timely iteration and reporting back with updates
- Yet another RAG system
- Another good example. Maybe something we can go through during the meeting.
- Quantizing Llama 3 8B seems more harmful
- This asks for feedback on a finding, not a project.
- Benchmarking effects of quantization
- WizardVicunaLM
- The community extending the idea to other models
- Local translator based on LLaMA
- OpusV1 models for story-writing and role-playing
- Thinking in more detail about what fine-tuning really is, and about what happens when a model does something that others haven't done before
- LLM fine-tuning datasets
- Open-source Perplexity alternative
- PocketPal
- How we chunk
- This post definitely has more pe vibes, intended to teach the community about their trials and errors with RAG
- Interesting that they aren't sure whether to open source or sell the product, but the post is purely educational.
- Open LLM Leaderboard new interface
- A leaderboard is another potential interface
- UGI-Leaderboard remake
Peer education (pe)
Has a lot of overlapping codes with ffc and spc.
- Comparing different whisper packages
- Optimize Whisper for fast inference
- Llama 2 prompt format
- Trailing whitespace in prompts
- The kinds of issues the community had to deal with in the early LLM era
- Prompt format comparisons
- Perfect labels via prompting
- A common misconception about RAG frameworks
- Collection of open source RAG techniques
- exl2 quantization
- Unsloth dynamic quantization
- LLM “serious use” comparison
- Huge LLM Comparison
- Misguided attention
- A new type of artifact (a list of prompts), along with the debates it sparked
- llama.cpp breaking change
- GGUF security advisory
- Llama 3 GGUF conversion bug
- Interaction between GitHub and Reddit
- A paradigm shift in machine translation
- Shared by someone other than the author, but the author appears in the comments
- How we chunk
Asking the community to share their projects (spi)
- Your best RAG projects
- Not the most interesting, but notable as an example of this theme.
- Dangers of malicious GGUF files
- Local models aren’t automatically safe
- Do you guys finetune models?
- How many people are fine-tuning?
- Open source coding assistants?
- Interesting artifact in that it mixes two sizes and types of models into one workflow
- GPT-4 alternatives
- Most are just lists of models and frontends, but notable w.r.t. the three levers of control and six levels of transparency