Posts sorted into themes
Codes were developed only for posts with an expressed intent; posts that do not make the OP's reason for posting clear were not considered.
Addressing a shared and in-demand problem of the community (spc)
Note: adding properties post_id, github, huggingface, notebook, and blogpost to all of these.
Posts
- Post as flywheel: Benchmarking effects of quantization
- Indirect compute donation: Llama 3 quants comparison
- Encouraging research: SmallThinker-3B-Preview
- Public service announcement: Llama 2 prompt format
- Model reviews: LLM “serious use” comparison, Huge LLM Comparison, Experience with Codestral for Android development
- Model release: SmallThinker-3B-Preview, Lexi Llama-3-8B-Uncensored, Translate to 400+ languages, FastApply (open-source Cursor)
- Simply converting models into a more convenient file format is helpful in itself, as with GGUF development for llama.cpp
- Model quantization:
- Quantization reviews/comparisons: Mistral-Large 35GB quantization, Qwen2.5 32B quant comparison, Qwen2.5 14B quant comparison, Benchmarking effects of quantization, Llama 3 quants comparison
- Quantization tooling: exl2 quantization, Unsloth dynamic quantization
- Prompt formatting/prompt engineering: Llama 2 prompt format, Prompt format comparisons, Perfect labels via prompting
- Open-source LLM tooling:
- Uncensoring models: llama.cpp --logit-bias flag, Prompt format comparisons
- Alternative cloud compute providers: Vast.ai cloud inferencing guide
Desiring feedback from the community (ffc)
Posts
- Lexi Llama-3-8B-Uncensored
- Perfect labels via prompting 2
- Conflicting feedback enables OP to reappraise their work
- Axiom prompt engineering
- Novel approach to prompting, encouraged to open source
- NexaAI, an Ollama alternative (repeat from spc)
- Background erase network
- Feedback can be about increasing accessibility rather than about the model itself.
- Omnivision-968M
- Very good example of an OP-user feedback loop, with timely iteration and reporting back with updates
- Yet another RAG system
- Another good example. Maybe something we can go through during the meeting.
- Quantizing Llama 3 8B seems more harmful
- This asks for feedback on a finding, not a project.
- Benchmarking effects of quantization
- WizardVicunaLM
- The community extending the idea to other models
- Local translator based on LLaMA
- OpusV1 models for story-writing and role-playing
- Thinking in more detail about what fine-tuning really is, and about what happens when a model does something that others haven't done before
- LLM fine-tuning datasets
- Open-source Perplexity alternative
- PocketPal
- How we chunk
- This post definitely has more pe vibes, intended to teach the community about their trials and errors with RAG
- Interesting that they aren't sure whether to open source or sell the product, but the post is purely educational.
- Open LLM Leaderboard new interface
- A leaderboard is another potential interface
- UGI-Leaderboard remake
Peer education (pe)
Has a lot of overlapping codes with ffc and spc.
- Comparing different whisper packages
- Optimize Whisper for fast inference
- Llama 2 prompt format
- Trailing whitespace in prompts
- The kinds of issues the community had to deal with in the early LLM era
- Prompt format comparisons
- Perfect labels via prompting
- A common misconception about RAG frameworks
- Collection of open source RAG techniques
- exl2 quantization
- Unsloth dynamic quantization
- LLM “serious use” comparison
- Huge LLM Comparison
- Misguided attention
- A new type of artifact (a list of prompts), along with the debates it sparked
- llama.cpp breaking change
- GGUF security advisory
- Llama 3 GGUF conversion bug
- Interaction between GitHub and Reddit
- A paradigm shift in machine translation
- Shared by someone other than the author, but the author appears in the comments
- How we chunk
Asking the community to share their projects (spi)
- Your best RAG projects
- Not the most interesting, but notable as an example of this theme.
- Dangers of malicious GGUF files
- Local models aren’t automatically safe
- Do you guys finetune models?
- How many people are fine-tuning?
- Open source coding assistants?
- Interesting artifact in that it mixes two sizes and types of models into one workflow
- GPT-4 alternatives
- Most are just lists of models and frontends, but notable w.r.t. the three levers of control and six levels of transparency