Self-Hosted Refact: 15B Code Model for Code Transformation, Completion, and Chat

May 4, 2023
by Oleg Klimov

We’re releasing a new Self-Hosted Server that lets Refact users run code completion, code transformation, and chat powered by the StarCoder 15B code model on their own GPU.

What is StarCoder?

StarCoder is a new 15B-parameter, state-of-the-art large language model (LLM) for code released by BigCode*.

StarCoder is a high-performance LLM for code, trained on permissively licensed GitHub code covering over 80 programming languages. The 15B-parameter model outperforms models such as OpenAI’s code-cushman-001 on popular programming benchmarks, making it the most powerful open-source code model to date.

With an input size of over 8,000 tokens, it can use more context than any other open LLM, enabling developers to apply it not only to code completion but also to more complex coding tasks and chat.
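
To make the completion use case concrete, here is a minimal sketch of a fill-in-the-middle prompt using StarCoder’s documented FIM special tokens; the surrounding code is placeholder text, not taken from Refact’s implementation:

```python
# Illustrative fill-in-the-middle (FIM) prompt for StarCoder.
# The <fim_*> markers are special tokens in StarCoder's vocabulary;
# the file contents here are placeholders.
prefix = "def median(values):\n    values = sorted(values)\n"
suffix = "\n    return result\n"

# The model generates the missing middle section given the prefix
# and suffix, which is what long-context code completion builds on.
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"
```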

Self-hosting StarCoder in Refact

In addition to code completion, we have implemented code transformation tools and chat powered by the StarCoder model using prompt engineering. Check this repo for the implementation, or try adding your own tools!

You can use self-hosted chat to ask for code examples and the AI Toolbox with tools like “Fix Bug”, “Make Code Shorter”, “Explain Code”, “Add Console Logs”, “Comment Each Line”, “Add Type Hints”, etc.
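
As a rough illustration of how such a toolbox function can be expressed through prompt engineering, here is a hypothetical sketch; the actual prompts live in the repo linked above and may look different:

```python
# Hypothetical sketch of a prompt-engineered toolbox function.
# The real prompts are in Refact's repo and may differ.
def make_toolbox_prompt(instruction: str, code: str) -> str:
    # Combine a natural-language instruction with the user's code
    # so the model rewrites the code according to the instruction.
    return f"{instruction}\n\n{code}\n\nRewritten code:\n"

prompt = make_toolbox_prompt(
    "Add type hints to the following Python function.",
    "def add(a, b):\n    return a + b",
)
```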

Check out the docs on self-hosting to get your AI code assistant up and running.

To run StarCoder with 4-bit quantization you’ll need a 12GB GPU, and for 8-bit you’ll need a 24GB GPU.
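
For a rough sense of how those memory budgets come about, here is a minimal sketch of loading StarCoder with reduced-precision weights via Hugging Face transformers and bitsandbytes; the self-hosted server ships its own serving code, so this is only an illustration:

```python
# Illustrative only: loading StarCoder with quantized weights using
# Hugging Face transformers + bitsandbytes. The Refact self-hosted
# server uses its own serving stack; this just shows the
# memory/precision trade-off described above.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")
model = AutoModelForCausalLM.from_pretrained(
    "bigcode/starcoder",
    device_map="auto",    # place layers on the available GPU(s)
    load_in_8bit=True,    # 8-bit weights -> roughly a 24GB GPU
    # load_in_4bit=True,  # 4-bit weights -> roughly a 12GB GPU
)
```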

It’s currently available for VS Code and JetBrains IDEs.

Enterprise Version

We are building an enterprise self-hosted version with the ability to fine-tune on your company’s code. Contact us if you’re interested in trying it for your company.

* BigCode is an initiative by Hugging Face and ServiceNow aimed at developing “state-of-the-art” AI systems for code in an open and responsible way. Refact’s team is part of BigCode and takes part in building open-source models.