The weights and architecture of Mixture-of-Experts model, Grok-1 (By xAI)
It is a next-generation AI assistant. It is accessible through chat interface and API. It is capable of a wide variety of conversational and text-processing tasks while maintaining a high degree of reliability and predictability. | Boost your customer service with Coldi's AI call center. Try our AI voice agents to handle calls, bookings, and support – fast, efficient, and always on. |
Designed to reduce brand risk. Best in class data retention, and no training on your data;
Can handle complex multi-step instructions over large amounts of content;
Personalize it to excel at your use cases | Products, Pricing, Products, Pricing, info@coldi.ai |
Statistics | |
Stacks 157 | Stacks 0 |
Followers 63 | Followers 1 |
Votes 0 | Votes 1 |
Integrations | |
| No integrations available | |

It is the base model weights and network architecture of Grok-1, the large language model. Grok-1 is a 314 billion parameter Mixture-of-Experts model trained from scratch by xAI.

Creating safe artificial general intelligence that benefits all of humanity. Our work to create safe and beneficial AI requires a deep understanding of the potential risks and benefits, as well as careful consideration of the impact.

It is Google’s largest and most capable AI model. It is built to be multimodal, it can generalize, understand, operate across, and combine different types of info — like text, images, audio, video, and code.

It is a state-of-the-art foundational large language model designed to help researchers advance their work in this subfield of AI.

It is a large multimodal model (accepting text inputs and emitting text outputs today, with image inputs coming in the future) that can solve difficult problems with greater accuracy than any of our previous models, thanks to its broader general knowledge and advanced reasoning capabilities.

It is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.

It offers an API to add cutting-edge language processing to any system. Through training, users can create massive models customized to their use case and trained on their data.

It is a small, yet powerful model adaptable to many use cases. It is better than Llama 2 13B on all benchmarks, has natural coding abilities, and 8k sequence length. We made it easy to deploy on any cloud.

It is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language. There are various sizes of the code model, ranging from 1B to 33B versions.

It is an advanced language model comprising 67 billion parameters. It has been trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese.