Wink - AI-native innovation, loyal to users, an exclusive intelligent experience

Startup Tenstorrent wants to challenge GPU dominance with its AI accelerator cards, and its latest model, the p150a, comes with 32GB of GDDR6 memory. Russian tech blogger Pro Hi-Tech ran a series of tests on it, with interesting results.

The tests used the original Llama 3 8B model, with inference run through the Transformers library. Time to first token was slightly better than on the RTX 5090 and A100, but sustained generation speed was only half that of the 5090, on par with the A30. Power consumption was impressive: performance per watt was better than that of most of the graphics cards in the comparison.
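The two speed figures above describe different phases of inference. As a minimal sketch of how such numbers are commonly derived (not the blogger's actual harness, which is not described here), the helper below times any iterable that yields one item per generated token. The names `measure_generation` and `tokens_per_joule` are hypothetical, and the average power draw is assumed to come from a separate measurement.

```python
import time
from typing import Callable, Iterable


def measure_generation(
    token_stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Time a token stream: time to first token and decode speed."""
    start = clock()
    first_token_at = None
    last_token_at = start
    count = 0
    for _ in token_stream:
        now = clock()
        if first_token_at is None:
            first_token_at = now  # includes prompt processing
        last_token_at = now
        count += 1
    if count == 0:
        raise ValueError("stream produced no tokens")
    decode_time = last_token_at - first_token_at
    # Decode speed excludes the first token, so it reflects
    # sustained generation rather than prompt processing.
    decode_tps = (count - 1) / decode_time if decode_time > 0 else float("nan")
    return {
        "tokens": count,
        "ttft_s": first_token_at - start,
        "decode_tokens_per_s": decode_tps,
    }


def tokens_per_joule(tokens_per_s: float, avg_power_w: float) -> float:
    """Tokens/s divided by watts (joules/s) gives tokens per joule."""
    if avg_power_w <= 0:
        raise ValueError("average power must be positive")
    return tokens_per_s / avg_power_w


if __name__ == "__main__":
    def fake_stream(n: int, delay_s: float):
        # Stand-in for a real model; sleeps to simulate per-token latency.
        for i in range(n):
            time.sleep(delay_s)
            yield f"tok{i}"

    print(measure_generation(fake_stream(20, 0.01)))
```

Separating the first token from the rest is what lets a card lead on time to first token while trailing on sustained generation speed, as reported above.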

The problems lie in the software layer:

- The official installation guide is outdated

- Training containers cannot be started

- Monitoring programs don't display memory usage

- Enabling monitoring during operation causes system crashes

- The 14B model could not be tested (an insufficient-memory error is reported)

- The system crashes under sustained load
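On the insufficient-memory error for the 14B model, a rough back-of-the-envelope estimate may help. The sketch below assumes 16-bit weights (2 bytes per parameter) and nominal parameter counts, neither of which is confirmed for these tests, and it ignores the KV cache, activations, and runtime overhead. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for model weights alone, in GiB.

    Assumption: 2 bytes per parameter (fp16/bf16). KV cache,
    activations, and runtime overhead are NOT included.
    """
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    card_gib = 32  # stated memory of the p150a
    for name, params in [("8B", 8e9), ("14B", 14e9)]:
        need = weight_memory_gib(params)
        print(f"{name}: ~{need:.1f} GiB of weights, "
              f"~{card_gib - need:.1f} GiB left on a {card_gib} GiB card")
```

Under these assumptions the 14B weights alone would take most of the card's memory, leaving little headroom for everything else. This is one plausible contributor to the error, not a confirmed cause.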

The hardware design does have highlights, such as four 800G interconnect interfaces (similar to NVLink), which make it suitable for building distributed training clusters. The software, however, is clearly not yet mature enough for production environments.

One commenter noted that the AMD PRO W7900, at the same price point, might be the more practical choice. Still, if Tenstorrent can fix its software shortcomings, this hardware architecture has real potential. For now, spending $1400 on this card essentially makes you a tester for the startup.

Wink Pings

Tenstorrent AI Accelerator Tested: Hardware Potential Hindered by Software