Frankenmerges are actually kind of great...

For the past few months I've been working on a quiet little project on the weekends, whenever I can scrounge up time, and part of that project involves looking for the best model for each domain.

Of course, there are some great coding, medical, math, etc. finetunes, but one thing that local models REALLY struggle with is context understanding. Reading between the lines.

What I need, essentially, is a model that actually understands what I'm saying when I give it a request. If I ask it to help me with a task, I don't want the LLM tripping all over itself trying to parse my implied meaning and coming up with the wrong requirements altogether. Unfortunately, most of the coding models aren't exactly a chatty bunch, so they're the worst about this.

Well, interestingly enough, the roleplayers appear to have solved this by throwing MORE model at the problem lol

Enter Frankenmerges: smooshing the same model, or multiple models, together like play-dough to see what you get. Jam two 70bs together and somehow get a 120b; the model loses some coherence in terms of raw knowledge and problem-solving ability, but its general understanding? WAY better. Its ability to actually figure out what you're asking for is amazing.
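
For anyone wondering what "jamming two 70bs together" actually means mechanically: as I understand it, tools like mergekit do a "passthrough" merge, literally copying ranges of transformer layers from each parent and stacking them in order, with some ranges overlapping. Here's a tiny Python sketch of that idea; the model names and slice plan are hypothetical, not the actual recipe behind Goliath or any other released merge.

```python
# Toy sketch of a passthrough / layer-interleave "frankenmerge".
# The slice plan and model names below are made up for illustration.

def build_layer_plan(slices):
    """Flatten (model, start, end) slices into the merged model's layer stack."""
    plan = []
    for model, start, end in slices:
        # Layers are copied verbatim from the source model, in order.
        plan.extend((model, layer) for layer in range(start, end))
    return plan

# Two hypothetical 80-layer 70b parents, stacked with overlapping ranges.
slices = [
    ("model_a", 0, 40),
    ("model_b", 20, 60),
    ("model_a", 40, 80),
    ("model_b", 60, 80),
]

plan = build_layer_plan(slices)
print(f"merged depth: {len(plan)} layers")  # 140 layers, vs. 80 in each parent
```

No weights get averaged or retrained here, which is the wild part: you just end up with a much deeper model that somehow still mostly works.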

Goliath-120b is the best so far, for sure. Miqu-1-120b is also really solid. I tried some others, like Miquliz-120b, but that one didn't work out as well.

I wonder if it's the extra layers? Maybe more layers == better comprehension?