Microsoft CEO Satya Nadella described the arrival of large language AI models like GPT-4 as a “Mosaic moment,” comparable to the arrival of the first graphical web browser. Unlike those early days, when Microsoft was late to the browser wars and had to buy its first web development tool, the company has taken pole position in AI, rapidly rolling out AI technologies across its enterprise and consumer products.
The key to understanding Microsoft is its view of itself as a platform company. That internal culture drives it to deliver tools and technologies for developers, foundations that developers can build on. For AI, that starts with the Azure OpenAI APIs and extends to tools like Prompt Engine and Semantic Kernel, which simplify the development of custom experiences on top of OpenAI’s transformer-based neural networks.
As a result, much of this year’s Microsoft Build developer event focused on how you can use this tooling to build your own AI-powered applications, following the “copilot” model of AI assistants that Microsoft is rolling out across its entire product range: in its Edge browser and Bing search engine, in GitHub and its developer tools, and in Microsoft 365 and the Power Platform for the enterprise. We also learned where Microsoft plans to fill gaps in its platform and make its tooling a one-stop shop for AI development.
LLMs are vector processing tools
At the heart of a large language model like OpenAI’s GPT-4 is a massive neural network that works with a vector representation of language, searching for vectors similar to those that describe its prompts and then building and refining the optimal path through a multidimensional semantic space that leads to an intelligible result. It’s similar to the approach used by search engines, but where search is about finding vectors similar to those that answer your queries, LLMs extend the initial set of semantic tokens that make up your original prompt (and the prompt used to set the context in which the LLM operates). This is why Microsoft’s first LLM products, GitHub Copilot and Bing Copilot, build on search-based services, which already use vector databases and indexes to keep LLM responses on track and in context.
Unfortunately for the rest of us, vector databases are relatively rare, and they’re built on very different principles from the familiar SQL and NoSQL databases. They’re perhaps best thought of as multidimensional extensions of graph databases, with data transformed and embedded as vectors that have shape, direction, and dimension. Vectors make finding similar data fast and accurate, but they require a very different way of working than other forms of data.
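To make “similar” concrete: most vector search boils down to a distance measure, such as cosine similarity, computed between embedding vectors. Here’s a minimal sketch with toy four-dimensional vectors; real embedding models produce hundreds or thousands of dimensions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: 1.0 means same direction (very similar), 0.0 means unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings"; models like text-embedding-ada-002 use 1,536 dimensions.
doc_vector = np.array([0.2, 0.8, 0.1, 0.4])
query_vector = np.array([0.25, 0.7, 0.05, 0.5])

print(cosine_similarity(doc_vector, query_vector))  # close to 1.0: semantically similar
```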
If you want to build your own enterprise copilot, you need your own vector database, because it allows you to extend and ground an LLM in your domain-specific data. Maybe that data is a library of custom contracts, decades’ worth of product documentation, or even all of your customer support queries and responses. If you could store that data in just the right way, it could be used to build AI-powered interfaces to your business.
But do you have the time or resources to take that data and store it in an unfamiliar format, in an unproven product? What you need is a way to deliver that data to AI quickly, building on the tools you already use.
Vector search comes to Cosmos DB
Microsoft announced a number of updates to its Cosmos DB cloud document database at Build 2023. While most of the updates focus on working with large amounts of data and managing queries, perhaps the most useful for developing AI applications is the addition of vector search capabilities. This applies to existing Cosmos DB instances too, allowing customers to avoid migrating data to a new vector database.
Cosmos DB’s new vector search builds on the recently launched Cosmos DB for MongoDB vCore service, which lets you scale instances across dedicated virtual infrastructure, with high availability across availability zones, and use a more predictable per-node pricing model, all while keeping the familiar MongoDB APIs. Existing MongoDB databases can be migrated to Cosmos DB, so you can use MongoDB on premises to manage your data and Cosmos DB in Azure to run your applications. Cosmos DB’s new change feed tooling should make it easier to build replicas across regions, replicating changes from one database to other clusters.
Vector search extends this tooling, adding a new query mode to your databases that can be used to power your AI applications. While vector search is not a true vector database, it offers many of the same features, including a way to store embeddings and use them as search keys for your data, applying the same similarity rules as more sophisticated alternatives. At launch, the tooling supports basic vector indexing (using IVF Flat), three types of distance measures, and the ability to store and search on vectors of up to 2,000 dimensions. Distance measures are a key feature of vector search, as they define how similar two vectors are.
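To see how those options fit together, here is a minimal sketch of creating an IVF vector index through the MongoDB vCore API using pymongo. The database, collection, and field names are placeholders, and the options follow the documented `cosmosSearchOptions` shape; check the current Cosmos DB documentation before relying on the details.

```python
from pymongo import MongoClient

# Placeholder connection string for a Cosmos DB for MongoDB vCore instance.
client = MongoClient("YOUR-COSMOSDB-MONGODB-VCORE-CONNECTION-STRING")
db = client["ragdb"]  # hypothetical database name

db.command({
    "createIndexes": "documents",  # hypothetical collection name
    "indexes": [{
        "name": "vectorSearchIndex",
        "key": {"contentVector": "cosmosSearch"},  # field holding the embeddings
        "cosmosSearchOptions": {
            "kind": "vector-ivf",   # IVF Flat indexing, as described above
            "numLists": 1,          # number of IVF clusters; tune for your data size
            "similarity": "COS",    # cosine distance; other measures are supported
            "dimensions": 1536,     # must match your embedding model's output size
        },
    }],
})
```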
Perhaps the most interesting aspect of Microsoft’s initial solution is that it’s an extension of a popular document database. Using a document database to create a semantic store for an LLM makes a lot of sense: It’s a familiar tool we already know how to use to deliver and manage content. Libraries already exist that let us take different document formats and encapsulate them in JSON, so we can go from existing storage tools to LLM-ready vector embeddings without changing workflows or developing skills with a whole new class of databases.
It’s an approach that should simplify the task of assembling the custom data sets needed to build your own semantic search. Azure OpenAI provides APIs for creating embeddings from your documents, which can then be stored in Cosmos DB alongside the source documents. Applications create new embeddings from user input, which can be used with Cosmos DB’s vector search to find similar documents.
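A hedged sketch of that round trip, using the 2023-era openai Python SDK (v0.28-style calls) against Azure OpenAI alongside pymongo. The deployment name, endpoint, and the `ragdb`/`documents`/`contentVector` names are assumptions for illustration.

```python
import openai
from pymongo import MongoClient

# Azure OpenAI settings (placeholders).
openai.api_type = "azure"
openai.api_base = "https://YOUR-RESOURCE.openai.azure.com/"
openai.api_version = "2023-05-15"
openai.api_key = "YOUR-KEY"

collection = MongoClient("YOUR-CONNECTION-STRING")["ragdb"]["documents"]

def embed(text: str) -> list[float]:
    """Create an embedding with an Azure OpenAI deployment of text-embedding-ada-002."""
    response = openai.Embedding.create(engine="text-embedding-ada-002", input=text)
    return response["data"][0]["embedding"]

# Store a source document alongside its embedding.
content = "Contoso's winning 2019 highways maintenance bid..."
collection.insert_one({"content": content, "contentVector": embed(content)})

# Search: embed the user's input and find the most similar documents.
results = collection.aggregate([
    {"$search": {"cosmosSearch": {
        "vector": embed("past road maintenance bids"),
        "path": "contentVector",
        "k": 5,
    }}}
])
for doc in results:
    print(doc["content"])
```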
Those documents don’t need to contain any of the keywords in the original query; they only need to be semantically similar. All you need to do is run your documents through GPT tooling to create the embeddings, adding a data preparation step to your application development process. Once you have a prepared data set, you need to build a loading process that automates adding embeddings as new documents are stored in Cosmos DB.
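That loading step could look something like the following sketch, which walks a folder of text files and stores each document together with its embedding so new content is immediately searchable. The `embed` function stands in for a call to Azure OpenAI’s embeddings API, as in the previous example; the folder layout and field names are assumptions.

```python
import pathlib
from typing import Callable

from pymongo.collection import Collection

def load_folder(collection: Collection, folder: str,
                embed: Callable[[str], list[float]]) -> None:
    """Ingest every .txt file in a folder, computing its embedding at write time."""
    for path in pathlib.Path(folder).glob("*.txt"):
        text = path.read_text(encoding="utf-8")
        collection.insert_one({
            "source": path.name,
            "content": text,
            "contentVector": embed(text),  # embedding stored alongside the document
        })
```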
This approach should work well with the updates to Azure AI Studio that deliver AI-ready private data to your Azure OpenAI-based applications. What this means for your code is that it will be much easier to keep applications focused, reducing the risk of them quickly going off topic and producing hallucinated results. Instead, an application that generates bid responses for, say, government contracts can use document data from your company’s history of successful bids to produce an outline that can be fleshed out and personalized.
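A minimal sketch of that grounding pattern, assuming the same 2023-era openai SDK configuration and placeholder collection as the earlier examples (the `gpt-4` deployment name is also an assumption): retrieve similar past documents with vector search, then include them in the prompt so the model answers from your data.

```python
import openai
from pymongo.collection import Collection

def grounded_answer(collection: Collection, question: str) -> str:
    """Answer a question using the most similar stored documents as context."""
    q = openai.Embedding.create(engine="text-embedding-ada-002", input=question)
    query_vector = q["data"][0]["embedding"]

    # Retrieve the three most semantically similar documents.
    hits = collection.aggregate([
        {"$search": {"cosmosSearch": {
            "vector": query_vector, "path": "contentVector", "k": 3,
        }}}
    ])
    context = "\n\n".join(hit["content"] for hit in hits)

    # Ask the model to answer only from the retrieved context.
    chat = openai.ChatCompletion.create(
        engine="gpt-4",  # your Azure OpenAI chat deployment name
        messages=[
            {"role": "system", "content": "Answer using only the context provided."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return chat["choices"][0]["message"]["content"]
```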
Using vector search as semantic memory
Alongside its cloud-based AI tools, Microsoft is bringing an interactive Semantic Kernel extension to Visual Studio Code, letting developers build and test AI skills and plugins around the Azure OpenAI and OpenAI APIs using C# or Python. Tooling like Cosmos DB’s vector search should simplify building semantic memory for Semantic Kernel, allowing more complex applications to be built around its API calls. An example of using embeddings is available as an extension to the Copilot Chat sample, which should let you swap in vector search in place of the prebuilt document analysis function.
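To make the semantic memory pattern concrete, here is a toy in-process version of the idea Semantic Kernel formalizes: store text with its embedding, recall by similarity. This is not Semantic Kernel’s actual API; a production application would back it with Cosmos DB vector search rather than an in-memory list.

```python
from typing import Callable
import numpy as np

class SemanticMemory:
    """Toy semantic memory: save text with its embedding, recall by cosine similarity."""

    def __init__(self, embed: Callable[[str], list[float]]):
        self.embed = embed  # e.g., a call to Azure OpenAI's embeddings API
        self.items: list[tuple[str, np.ndarray]] = []

    def save(self, text: str) -> None:
        self.items.append((text, np.asarray(self.embed(text))))

    def recall(self, query: str, k: int = 3) -> list[str]:
        q = np.asarray(self.embed(query))
        # Rank stored items by cosine similarity to the query embedding.
        scored = sorted(
            self.items,
            key=lambda item: -float(
                np.dot(q, item[1]) / (np.linalg.norm(q) * np.linalg.norm(item[1]))
            ),
        )
        return [text for text, _ in scored[:k]]
```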
Microsoft’s AI platform is just that, a platform for you to build on. Azure OpenAI forms the backbone, hosting the LLMs. Bringing vector search to data in Cosmos DB will make it easier to ground results in your own organization’s knowledge and content. Factor in other AI platform announcements, around tools like Azure Cognitive Search, which automates attaching any data source to Azure OpenAI models, providing a simple endpoint for your applications and tooling to test the service without leaving Azure AI Studio.
What Microsoft is offering here is a spectrum of tools for AI development, starting with Azure AI Studio and its low-code Copilot Maker, through custom Cognitive Search endpoints, to your own vector search across your own documents. It should be enough to help you build the LLM-based application that meets your needs.
Copyright © 2023 IDG Communications, Inc.