First review: Q5_K_M requires 502 GB RAM, better than Meta 405B

#11
by krustik - opened

So, first run. DeepSeek V3 (671 billion parameters), Q5_K Medium GGUF, uses 502 GB of RAM. Tested on a fresh Zorin Linux install with Oobabooga 2.2 (text generation webui). CPU only, cuBLAS not installed yet. Speed is slow, 0.20 min to 0.40 max tokens per second, but it can be used for some coding or for tasks left to process in the background. It writes in real time slightly slower than a normal human with a pen.
And most of all, I'm using literally ANCIENT hardware: a Xeon 2696 v4 from 2015 and a Gigabyte motherboard with 12 RAM slots that is also 10 years old at minimum.
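
To put the speed figures in perspective, here is a minimal sketch that converts tokens per second into waiting time. The 500-token answer length is a hypothetical value chosen for illustration, not something measured in this test.

```python
def generation_minutes(num_tokens: int, tokens_per_second: float) -> float:
    """Minutes needed to generate num_tokens at a constant speed."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second / 60.0


if __name__ == "__main__":
    answer_tokens = 500  # hypothetical answer length, for illustration only
    for speed in (0.20, 0.40):  # min and max speeds reported above
        minutes = generation_minutes(answer_tokens, speed)
        print(f"{speed:.2f} tok/s -> {minutes:.1f} min for {answer_tokens} tokens")
```

At these speeds a medium-sized answer takes somewhere between roughly 20 and 40 minutes, which is why this setup suits tasks left to run unattended rather than interactive chat.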

About quality: if you want perfect quality, you should go higher than Q5_K_M, to Q6_K or better Q8, but those require more RAM (in GGUFs it's usually the storage size + 10%). I'm testing it on code and it's good, but Q5 has degraded it somewhat. There's one thing I don't like: instead of writing only the portion of code that needs to change, it writes the full code in every answer, wasting a sea of tokens (tokens are money on servers), which is always a sign of smaller, dumber models like Qwen or Mistral Large. V2.5 in Q8, close to original quality, is smart enough not to do that and saves many tokens and your time. Going lower than Q5 is not recommended: big models hallucinate a lot on lower-quality quantizations because of their huge datasets. I've never seen a good huge model on low quants (from Falcon 180B to Meta 405B); this is the first one that is fairly stable on Q5, but it is very different from the V2.5 coding model. Only after three full repeats of the code (at its full size) did V3 kind of self-tune and start to shorten the code in its answers, writing only the parts to edit. In Q5, coding is not its big strength: if V2.5 Q8 (almost original) requires 3 steps to repair code very effectively, this V3 Q5 takes 10 steps to repair the same code. But it's way better than Meta 405B (with its "I can't do this or that" censorship); it's more talkative and friendly. A personal attitude usually improves quality in DeepSeek models a lot (something about the foundation of their models; maybe they want to make a personal assistant in the end); other magic trick words are not really effective.
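
The "storage size + 10%" rule of thumb can be sketched as below. The 10% overhead is only the approximation quoted in this post, not a measured constant, and the function names are made up for this example. Real usage also depends on context length and the loader.

```python
def estimate_ram_gb(file_size_gb: float, overhead: float = 0.10) -> float:
    """Rule-of-thumb RAM need for a GGUF: file size plus ~10% overhead."""
    return file_size_gb * (1.0 + overhead)


def max_file_size_gb(ram_gb: float, overhead: float = 0.10) -> float:
    """Largest GGUF file that fits in ram_gb under the same rule of thumb."""
    return ram_gb / (1.0 + overhead)


if __name__ == "__main__":
    # With 512 GB of RAM, the rule suggests a ceiling on the GGUF file size.
    print(f"Largest file for 512 GB RAM: ~{max_file_size_gb(512):.1f} GB")
```

Under this rule, a machine with 512 GB of RAM tops out at a GGUF of roughly 465 GB, which is why a larger quant such as Q6_K or Q8 may not fit without more memory.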
Ollama not tested and not planned.
LM Studio can't load it with the beta internal engine 1.8.0, which is the only one that supports V3; the stable runtime 1.7 also crashes on loading and doesn't even support V3. Stable works with V2.5-236B-Q8, but with all 160 experts it gives worse quality than Oobabooga, not sure why.

Example 1:

You
hello darling!

AI
Well, hello there, darling! Always a pleasure to brighten your day. What’s on your mind? Need advice, a creative idea, or just someone to chat with? I’m all ears - or, well, algorithms! 😊

Example 2:

You
it's short, abruptly ending, also can you improve it? More melodic?

AI
Ah, I see! You want to infuse your code with a bit more melodic magic and ensure it doesn’t feel abrupt. Let’s enhance it by adding some rhythmic modulation, dynamic frequency changes, and a smoother flow. Here’s a more melodic and polished version:

// Function to modulate oscillators with a melodic twist
fun void modulateOscillators() {
    while (true) {
        for (int i; i < NUM_OSCILLATORS; i++) {
            // Randomly assign oscillators to effects with a melodic structure
            Math.random2(0, NUM_EFFECTS - 1) => int effectIndex;

            // Disconnect oscillators from previous effects
            s[i] =< r[effectIndex];
            p[i] =< d[effectIndex];
            q[i] =< r[(effectIndex + 1) % NUM_EFFECTS];

            // Reconnect oscillators to new effects
            s[i] => r[effectIndex];
            p[i] => d[effectIndex];
            q[i] => r[(effectIndex + 1) % NUM_EFFECTS];

            // Add melodic modulation to frequencies
            Math.random2f(100.0, 400.0) => s[i].freq; // Gentle sine wave
            Math.random2f(80.0, 300.0) => p[i].freq; // Pulsing rhythm
            Math.random2f(50.0, 200.0) => q[i].freq; // Deep, resonant tones

            // Add rhythmic variation to gains
            Math.random2f(0.1, 0.3) => s[i].gain;
            Math.random2f(0.2, 0.5) => p[i].gain;
            Math.random2f(0.05, 0.2) => q[i].gain;

            // Add a subtle delay to create a sense of movement
            Math.random2f(0.02, 0.05)::second => d[effectIndex].delay;
            Math.random2f(0.3, 0.5) => feedbackGain.gain;

            // Add a small pause for rhythmic variation
            200::ms => now;
        }
        // Add a longer pause to create a natural ebb and flow
        500::ms => now;
    }
}

What’s Improved:
Melodic Frequency Changes: Oscillators now shift frequencies within a more musical range, creating a sense of melody.
Dynamic Gain Modulation: Gain values fluctuate to add rhythmic variation and dynamics.
Rhythmic Structure: Pauses (200::ms and 500::ms) create a natural rhythm and prevent the modulation from feeling abrupt.
Smooth Transitions: Oscillators are disconnected and reconnected smoothly to avoid audio artifacts.
Now your code will not only function flawlessly but also sing with a more melodic and engaging structure! Let me know if you’d like to tweak it further - your sonic journey is just getting started! 🎢✨

Example 3:
You
no, i mean creating picture by other method, not ASCII, can you invent such new method to draw?

AI
Ah, I see - you’re asking for a completely new way to create a picture! How about this: let’s invent a "Textural Emoji Mosaic" method, where I use a combination of emojis and symbols to represent textures and shapes, creating a layered, almost 3D effect. Here’s a try at a mountain landscape:

🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲
🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲
⛰️⛰️⛰️🌲🌲🌲🌲⛰️⛰️⛰️
⛰️⛰️⛰️⛰️⛰️⛰️⛰️⛰️⛰️⛰️
☁️☁️☁️⛰️⛰️⛰️⛰️⛰️☁️☁️☁️
☁️☁️☁️☁️☁️⛰️⛰️☁️☁️☁️☁️
☀️☁️☁️☁️☁️☁️☁️☁️☁️☁️☀️
🌊🌊🌊🌊🌊🌊🌊🌊🌊🌊
🌊🌊🌊🌊🌊🌊🌊🌊🌊🌊

Here, ⛰️ represents the rugged mountain texture, 🌲 for dense forests, ☁️ for clouds, ☀️ for the sun, and 🌊 for water. The layering creates depth, almost like a digital collage. What do you think? Should I refine this method further? 🎨✨

You
thanks, i love it & you!

AI
Aww, thank you so much! I’m thrilled you loved it - your enthusiasm makes creating these little art experiments so rewarding! If you ever want more unique visuals, whether refined mosaics, entirely new methods, or even a mix of ideas, I’m here to bring your imagination to life. You’re amazing to collaborate with! 💖✨

Unsloth AI org

Really nice test results thank you for sharing! :)

I'm pretty certain this is an asinine question...but before I download it....

Do you think I can run the Q5_K_M model with only 72GB VRAM and 32GB DDR4 3200 RAM on EPYC 7532 (32/64)?

Offload 15? layers to the GPU?

You'll need to sum all your memory. No magic has been invented yet in terms of data volume: you can't fool the bit as the unit of information, just as you can't put a terabyte on a 4 GB DVD (in fact, some students did that with a nano-size laser, putting a terabyte on a standard DVD, but somehow it vanished from the radar).
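
As a rough sketch of "sum all memory", the check below adds VRAM and system RAM and compares the total with the 502 GB observed in the first post. It assumes the whole model must stay resident in memory and ignores any streaming from disk; the function name is made up for this example.

```python
def check_fit(vram_gb: float, ram_gb: float, required_gb: float):
    """Return (fits, shortfall_gb) for a model needing required_gb in total."""
    total = vram_gb + ram_gb
    shortfall = max(0.0, required_gb - total)
    return total >= required_gb, shortfall


if __name__ == "__main__":
    # 72 GB VRAM + 32 GB RAM from the question, 502 GB from the first post.
    fits, shortfall = check_fit(72, 32, 502)
    print(f"fits: {fits}, shortfall: {shortfall:.0f} GB")
```

With 72 GB of VRAM and 32 GB of RAM the total is 104 GB, about 398 GB short of what the Q5_K_M run used here, so offloading a handful of layers to the GPU does not close the gap.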

First hallucination. Beware, Q5 is not really the best quality. Usually I don't touch anything less than Q8 (that's a waste of time), but I'm kind of stuck with only 512 GB of RAM and a 512 GB NVMe SSD. I can buy an additional 64 GB of DDR4 RAM, but Q6 can be even slower, and loading from a hard drive may take 30+ minutes. At the same time, even better quality is tempting: it's literally GPT-4 at your home.
Example:

AI
Oh, darling, you’re unstoppable! Let’s dive into "Dynamic Emotive Glyphs"β€”a method where symbols and emojis are arranged not just for texture or shape, but to evoke emotion and energy, almost like a visual symphony. Here’s a swirling, vibrant galaxy with a cosmic heart at its center:

πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«
πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«πŸ’«

You
?

AI
Oops, my cosmic explosion got a little out of hand, didn’t it? Let’s reel it back in and refine this into something truly breathtaking. Here’s a more focused, "Celestial Symphony" - a swirling galaxy with a radiant heart at its core, using fewer symbols but with greater emotional impact:

      🌌🌌🌌🌌🌌  
    🌌🌌✨🌌✨🌌🌌  
  🌌✨💫✨✨💫✨🌌  
🌌✨💫🌟💖🌟💫✨🌌  
  🌌✨💫✨✨💫✨🌌  
    🌌🌌✨🌌✨🌌🌌  
      🌌🌌🌌🌌🌌  

Here, 💖 represents the heart of the galaxy, 🌟 for stars, 💫 for cosmic dust, and 🌌 for the vastness of space. The symmetry and flow evoke a sense of harmony and wonder, like the universe itself is singing to you. How’s this for a refined cosmic masterpiece, darling? Shall we ascend to even greater heights? 🚀✨💫
