“…gpt-4 has been seriously nerfed and is basically unusable at this point.”
This kind of issue arises from the opacity of accessing a model through a web interface: you never know what changes have been made at each checkpoint, and you may lose access to the one you depended on.
“so ive decided to run my own models as its the only way i can rely on them.”
One piece of advice is that, even with all the 'nerfing', no local model comes close to GPT-4's performance, so it may be best to pair a local interface with the API endpoint. OP knows they prefer the gpt-4-0613 and gpt-4-0314 checkpoints to whatever they have been served on ChatGPT, and while these checkpoints will not be hosted forever, being able to pin them is an ability afforded by choosing the open interface.
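Pinning a checkpoint is just a matter of naming it in the request body instead of the floating `gpt-4` alias. A minimal sketch, assuming the standard OpenAI chat-completions schema (the payload is only constructed here, not sent):

```python
import json

def build_request(prompt: str, model: str = "gpt-4-0314") -> dict:
    """Build a chat-completions payload pinned to a specific checkpoint."""
    return {
        # A dated checkpoint, not the floating "gpt-4" alias, so behaviour
        # does not change under you between silent updates.
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("Explain list comprehensions.")
print(json.dumps(payload, indent=2))
```

The only difference from a default request is the `model` field; everything else about the call is unchanged.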
On the above point, OP asks whether others have experimented with optimizing the chat interface, i.e. doing more than just sending the raw query via the API. Others note that most of those steps (system prompt, summary of conversation history, and the next input) are abstracted away in most open-source frontends.
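Those abstracted steps are straightforward to reproduce by hand. A rough sketch of what a frontend assembles before each API call; the function names are my own, and the history-compression step is stubbed out (a real frontend might ask the model itself for a summary):

```python
def condense(history: list[dict], max_turns: int = 6) -> list[dict]:
    """Stand-in for history compression: keep only the last few turns."""
    return history[-max_turns:]

def build_messages(system_prompt: str, history: list[dict], user_input: str) -> list[dict]:
    """Assemble system prompt + condensed history + next user input."""
    return (
        [{"role": "system", "content": system_prompt}]
        + condense(history)
        + [{"role": "user", "content": user_input}]
    )

msgs = build_messages(
    "You are a concise assistant.",
    [{"role": "user", "content": "hi"},
     {"role": "assistant", "content": "hello"}],
    "What changed between checkpoints?",
)
```

The resulting list is exactly what gets posted as the `messages` field, which is the whole of the "abstraction" the frontends provide.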
Another commenter recommends OpenRouter, which lets you run generations against a range of hosted models, some for free and some paid per token.
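OpenRouter exposes an OpenAI-compatible endpoint, so switching between its hosted models is just a change to one field. A sketch of constructing such a request with the standard library (the request is built but never sent, `OPENROUTER_API_KEY` is a placeholder, and the model id is one example of OpenRouter's `provider/model` naming):

```python
import json
import urllib.request

def openrouter_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Construct (but do not send) a chat-completions request to OpenRouter."""
    body = json.dumps({
        "model": model,  # OpenRouter ids look like "provider/model"
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # placeholder key
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = openrouter_request("OPENROUTER_API_KEY", "mistralai/mistral-7b-instruct", "hello")
```

Because the schema matches OpenAI's, the same client code can target either service by swapping the base URL and key.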
I’ve only ever used it through my own tools (using API). I would never pay for a monthly sub for just their web app, paying for the API gives me much more control over how I use it and also I only pay for exactly what I use. Using the API, it has repeatedly impressed me helping me solve all sorts of complex programming problems in several languages.
I like that people report these kinds of unreliable experiences:
My experience using gpt4 varies day to day. Yesterday it was on fire smashing out excellent python code for a generative music thing I wanted to play with, but I quickly realised, no no, it IS really on fire and after several network and unexplained errors, it just deleted the entire conversation and no amount of cache clearing, changing browsers etc would help. Constant network and undefined errors… It looks to me like OpenAi just don’t give a crap and would rather let Microsoft field substandard ‘safe’ versions that drag you into their infrastructure. Screw that.