Image by author
The war between open source and closed source has been going on for some time. After OpenAI launched GPT-3 as a closed-source model, EleutherAI launched an open-source alternative called GPT-Neo, which provided comparable results. Similarly, when DALL·E 2 was launched, an open source version of DALL·E 2 by Stability AI called Stable Diffusion was released.
We all know about ChatGPT and how people want to get the open source version of the model and build their apps securely with more control. ChatGPT currently offers API access and customization, but you’ll be using their service and machine to perform all sorts of tasks.
On March 10, 2023, Together Computer released an open source version of ChatGPT called OpenChatKit. The open source alternative allows developers to have more control over the chatbot’s behavior and tailor it to their specific needs. Moreover, it is more accessible to a wider range of users and communities, especially those who may not have the resources to access proprietary models.
OpenChatKit provides an open source, powerful toolkit for building generalized and specialized chatbot applications. It’s the first version of the model, and the developers have released a number of tools and processes to improve the model with community input.
Together Computer has released OpenChatKit 0.15 under the Apache-2.0 license, which comes with code, model weights, and training datasets.
You can try demo based model in Hugging Face in OpenChatKit. It’s similar to ChatGPT where you write a prompt and the model responds to you with an answer, code block, tables or text.
Image by |: OpenChatKit:
OpenChatKit comes with a base bot and building blocks to build custom chatbot applications from the ground up.
The set consists of 4 components.
- A large command-driven language model fine-tuned for conversation from EleutherAI’s GPT-NeoX-20B.
- Instructions for adjusting the model to achieve high accuracy on specific tasks.
- A comprehensive search engine to update the bot’s answer using knowledge from Wikipedia, news feeds or sports scores.
- Fine tuned from GPT-JT-6B with moderate goals to filter which questions the bot answers.
The basis of OpenChatKit is a large language model called GPT-NeoXT-Chat-Base-20B. It is based on EleutherAI’s GPT-NeoX model and refined on 43 million high-quality spoken instructions. The development team has particularly focused on fine-tuning several tasks, such as multi-faceted dialogue, question answering, classification, extraction, and summarization.
Out of the box, the model provides a solid foundation. As we can see, it scores higher than its base model GPT-NeoX on the HELM benchmark. The GPT-NeoXT-Chat-Base-20B model performed reasonably well on the question-and-answer, extraction, and classification tasks.
It is the first version of the model and you will see many bugs, errors and corresponding answers. In this session, we’ll review a few areas that the model struggles to understand.
- Knowledge basedA chatbot can actually produce incorrect results. ChatGPT has the same problems. The team is working on a search engine that will update the incorrect information.
- Code basedThe model was not developed on a large enough corpus of source code to write accurate code. You may be disappointed.
- Context switchingIf you start talking about something else during the conversation, the chatbot will not automatically change the topic and will continue to give you answers related to the previous topics.
- RepetitionThe chatbot sometimes repeats the answer or gets stuck. You can refresh the page to restore it.
- Creative answersUnlike ChatGPT, the chatbot does not generate essays or creative stories. It is limited to short answers.
OpenChatKit is a good initiative and with the help of the community we can see a better version of the chatbot soon. If you’re expecting OpenChatKit to be as relaxing as ChatGPT or to provide amazing responses, you’ll be disappointed because it’s in its early stages and it’s been trained on a less diverse database.
In this post, we learned valuable insights about the open source version of ChatGPT, which is great news for the developer and data science community. What’s more, we’ve explored how it works and delved into the four components of the package that can help create a fully customizable chatbot, complete with the latest news updates and customization options.
Try the demo and read more about the model to learn about model refinement and other important tools.
Abid Ali Awan (@1abidaliawan:) is a certified data scientist who loves building machine learning models. He currently focuses on content creation and writes technical blogs on machine learning and data science technologies. Abid holds an MSc in Technology Management and a BS in Telecommunications Engineering. His vision is to create an AI product using a graph neural network for students struggling with mental illness.