Following our prior announcements, we are pleased to share a new set of compelling benchmarks for Bitfusion's production-grade Elastic AI Platform. We ran several ML benchmarks with SkyScale Cloud Solutions, demonstrating that Elastic AI with remote-attached GPUs is a mainstream, production-grade solution.
The chart below shows our virtual cluster as hosted on SkyScale.com.
Bitfusion's GPU-attached network follows the typical evolution of network disaggregation: separate the resources from the compute, and offer elasticity, efficiency, and scale. We see this trend as an unavoidable evolution of heterogeneous compute, in which accelerators (e.g., GPUs and FPGAs) form network-connected clusters that any user or workload can attach to on demand.
Essentially, this is the epitome of hyperconvergence, or consumption-based AI. Many industry experts buy into this concept, but they have asked us what the network latency impact will be. After running many tests, we are convinced that with Bitfusion the impact is minimal for the majority of AI use cases. We ran several TensorFlow benchmarks and compared local (native) GPU performance to remote performance, creating an apples-to-apples comparison. Here are the results:
To sum it up: Bitfusion remote achieves near-native performance, with a negligible performance gap.
We are aware that other networking configurations may be less optimal. That is why we prefer to say that our performance gap will be less than 10% (rather than the ~0% shown above). Networks, however, keep getting better and faster, so remote-attached AI of the kind Bitfusion enables will organically improve over time.
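As a quick sketch of what this comparison means in practice, the performance gap can be computed directly from benchmark throughput numbers (e.g., TensorFlow images/sec). The helper below and the numbers in it are purely illustrative, not Bitfusion's published results:

```python
# Illustrative helper: compute the remote-vs-native performance gap from
# benchmark throughputs. The throughput numbers used here are made up,
# not actual Bitfusion benchmark data.

def performance_gap(native_throughput: float, remote_throughput: float) -> float:
    """Return the gap as a percentage of native throughput."""
    return (native_throughput - remote_throughput) / native_throughput * 100.0

# Example with hypothetical numbers: 500 img/sec native vs. 480 img/sec remote
gap = performance_gap(500.0, 480.0)
print(f"Performance gap: {gap:.1f}%")  # prints "Performance gap: 4.0%"
```

A gap computed this way under 10% is what we mean when we say remote-attached GPUs are within a small margin of native performance.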
Please contact us for access to our virtual cluster hosted on SkyScale.com so that you can run your own benchmarks.
Welcome to the AI Attached Network with Bitfusion.