SVDQuant: 4-Bit Quantization Powers 12B Flux on a 16GB 4090 GPU with 3x Speedup

(hanlab.mit.edu)

178 points | by lmxyy 6 days ago

65 comments