Enterprise Refact Edition - Model Hosting
Refact Enterprise is a version of Refact that is optimized for enterprise use cases. It allows you to use all of the models available in Refact.ai Self-hosted and also supports vLLM models.
Enabling vLLM
With the enterprise version of Refact, you can use an inference engine that uses PagedAttention from the vLLM library. It runs faster and supports continuous batching, which means it can start work on new inference tasks while continuing to serve other clients at the same time.
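To make the idea of continuous batching more concrete, here is a minimal toy simulation in Python. It is not vLLM's or Refact's actual scheduler, and it does not model PagedAttention. The function names, the `max_batch` parameter, and the assumption that every active request produces one token per step are all hypothetical simplifications chosen for illustration. The sketch compares a scheduler that admits a waiting request as soon as a slot frees up with one that waits for the whole batch to finish.

```python
from collections import deque


def continuous_batching(lengths, max_batch):
    """Toy model: return {request_id: step at which it finished}.

    lengths[i] is the number of tokens request i needs (hypothetical input).
    A waiting request is admitted as soon as any slot in the batch is free.
    """
    waiting = deque(enumerate(lengths))
    active = {}        # request id -> tokens still to generate
    finished_at = {}
    step = 0
    while waiting or active:
        # Fill free slots before each decoding step.
        while waiting and len(active) < max_batch:
            rid, remaining = waiting.popleft()
            active[rid] = remaining
        step += 1
        # Every active request generates one token in this step.
        for rid in list(active):
            active[rid] -= 1
            if active[rid] <= 0:
                del active[rid]
                finished_at[rid] = step
    return finished_at


def static_batching(lengths, max_batch):
    """Toy baseline: a batch is only replaced once all its requests finish."""
    finished_at = {}
    step = 0
    for start in range(0, len(lengths), max_batch):
        batch = lengths[start:start + max_batch]
        step += max(batch)  # the longest request holds the batch open
        for offset in range(len(batch)):
            finished_at[start + offset] = step
    return finished_at


if __name__ == "__main__":
    lengths = [2, 8, 3, 1]
    print("continuous:", continuous_batching(lengths, max_batch=2))
    print("static:    ", static_batching(lengths, max_batch=2))
```

In this toy run with two slots, the short fourth request finishes at step 6 under continuous batching, while under static batching it has to wait for the 8-token request in the first batch and finishes at step 11.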
To enable vLLM, select one of the available vLLM models on the Model Hosting page. The full list of available models can be found on the Supported Models page.