• ylai@lemmy.mlOP
    1 year ago

    The novel bit of this project is actually the use of GGML quantization from llama.cpp for Stable Diffusion, which can offer lower RAM usage and faster CPU inference than the previous CPU implementations, none of which had the benefit of low-bit quantization. That is the technique known to have made CPU and low-RAM LLaMA inference feasible.
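To make the quantization idea concrete, here is a minimal sketch of block-wise 4-bit quantization in plain Python. It is only in the spirit of low-bit schemes like those in GGML, not the actual GGML format or code: the block size of 32, the symmetric [-8, 7] integer range, and all function names are assumptions made for illustration.

```python
import random

BLOCK_SIZE = 32  # weights per block; an assumption for this sketch


def quantize_block(block):
    """Map a block of floats to 4-bit integers in [-8, 7] plus one float scale."""
    max_abs = max(abs(w) for w in block)
    # An all-zero block gets a dummy scale of 1.0 to avoid dividing by zero.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    quants = [max(-8, min(7, round(w / scale))) for w in block]
    return scale, quants


def dequantize_block(scale, quants):
    """Recover approximate floats from the stored integers and the scale."""
    return [q * scale for q in quants]


def quantize(weights):
    """Split a flat list of weights into blocks and quantize each one."""
    return [
        quantize_block(weights[start:start + BLOCK_SIZE])
        for start in range(0, len(weights), BLOCK_SIZE)
    ]


def dequantize(blocks):
    """Concatenate the dequantized blocks back into one flat list."""
    out = []
    for scale, quants in blocks:
        out.extend(dequantize_block(scale, quants))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [rng.gauss(0.0, 0.02) for _ in range(256)]  # made-up weights
    restored = dequantize(quantize(weights))
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"max abs error: {worst:.5f}")
```

Each weight then costs 4 bits plus its share of the per-block scale, instead of 16 or 32 bits, and the rounding error within a block is at most half that block's scale.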

    The important long-term implication is that people have been targeting the wrong size of Stable Diffusion model, if the goal is quality on commodity hardware (and this includes GPUs, too). For example, Stable Diffusion, which Stability AI has gloated so much about fitting on commodity hardware, has slightly fewer than 1 billion parameters. The smallest LLaMA that people can nowadays happily run on a commodity GPU or CPU already has 7 billion parameters. And even OpenAI’s DALL·E 2, which many called prohibitive because “you need a 48 GB GPU” (which is not true with quantization), has just 3.5 billion parameters.
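A rough back-of-the-envelope calculation shows why the parameter count matters less once quantization is in play. The sketch below uses the rounded parameter counts quoted above and counts the weights only; it ignores activations, any per-block scale overhead, and everything else a real runtime needs, so treat the numbers as lower bounds. The function and dictionary names are made up for this example.

```python
# Rounded parameter counts as quoted in the comment above (not measured here).
MODELS = {
    "Stable Diffusion": 1.0e9,  # "slightly fewer than 1 billion", rounded up
    "DALL-E 2": 3.5e9,
    "LLaMA 7B": 7.0e9,
}


def weights_gb(num_params, bits_per_weight):
    """Size of the weights alone, in GB (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for name, params in MODELS.items():
        sizes = ", ".join(
            f"{bits}-bit: {weights_gb(params, bits):.2f} GB"
            for bits in (32, 16, 8, 4)
        )
        print(f"{name}: {sizes}")
```

By this estimate, a 7-billion-parameter model at 4 bits per weight needs less room for its weights than a 3.5-billion-parameter model at 16 bits.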

    For additional context, Stable Diffusion on CPU has been done before, though with repurposed frameworks rather than a custom C++ project. Notably, there is the Q-Diffusion paper (https://github.com/Xiuyu-Li/q-diffusion), but its results were obtained by simulating the quantization, and e.g. the GitHub repo does not actually offer an implementation with an actual speed-up.

      • Ubermeisters@lemmy.zip
        1 year ago

        If you are using Automatic1111, you can still force CPU use, just FYI, by modifying the file located at */stable-diffusion-webui/webui-user.bat:

        set COMMANDLINE_ARGS= --use-cpu all --no-half --skip-torch-cuda-test --enable-insecure-extension-access
        

        That’s what works for my PC anyway. It took me a while to figure that out, so maybe you will get lucky and it will work for you as well. I only do this when I want to do things my 8 GB GPU can’t handle. The CPU is way slower than the GPU, but not nearly as memory-capped. (Dual Xeon E5-2630 v4 2.2 GHz, 64 GB RAM)

        I just make a copy of webui-user.bat, rename it, make the above edit (if you want the browser to launch automatically, you can also add --autolaunch), then make a shortcut somewhere that points to the new file, and that shortcut is how I run A1111 in CPU-only mode. It’s non-destructive this way, and you can still just use your normal startup method.
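If you would rather script the copy-and-edit step than do it by hand, something like the following should work. It is a rough sketch, not part of A1111: the function name and the default file name webui-user-cpu.bat are made up here, the flags are copied verbatim from the line above, and it assumes your webui-user.bat already contains a `set COMMANDLINE_ARGS=` line.

```python
from pathlib import Path

# Flags copied from the comment above; adjust them to whatever works on your machine.
CPU_ARGS = "--use-cpu all --no-half --skip-torch-cuda-test --enable-insecure-extension-access"


def make_cpu_launcher(webui_dir, new_name="webui-user-cpu.bat", autolaunch=False):
    """Write a copy of webui-user.bat whose COMMANDLINE_ARGS line uses the CPU flags.

    The original webui-user.bat is only read, never modified.
    """
    source = Path(webui_dir) / "webui-user.bat"
    target = Path(webui_dir) / new_name
    args = CPU_ARGS + (" --autolaunch" if autolaunch else "")
    new_line = f"set COMMANDLINE_ARGS={args}"

    out_lines = []
    replaced = False
    for line in source.read_text().splitlines():
        if line.strip().lower().startswith("set commandline_args="):
            out_lines.append(new_line)  # swap in the CPU-only arguments
            replaced = True
        else:
            out_lines.append(line)  # keep every other line untouched
    if not replaced:
        raise ValueError("no 'set COMMANDLINE_ARGS=' line found in webui-user.bat")

    target.write_text("\n".join(out_lines) + "\n")
    return target
```

Calling `make_cpu_launcher(r"C:\path\to\stable-diffusion-webui", autolaunch=True)` (the path is a placeholder) would leave the normal launcher as it is and write the CPU-only one next to it, which you can then point a shortcut at.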