Three Models Doctrine
Home inference does not need one perfect machine. It needs a small fleet of imperfect machines with clear roles. My current doctrine is built around three models of compute. The first model is cheap VRAM: a Tesla P40 with 24GB. It is old, passive, loud, and power-hungry, but it still