Part 1 - Architecture: Building the Foundation for AI-Powered Conversations


To power chatbot applications over your own documents, the most important step is to store vector embeddings of those documents.

What are vector embeddings?

Vector embeddings, also known as vector representations or word embeddings, are a fundamental concept in natural language processing and machine learning. They are numerical representations of words, phrases, sentences, or even larger pieces of text, designed to capture the semantic and contextual relationships between them in a high-dimensional space.
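To make "capturing semantic relationships" concrete, here is a minimal sketch of how closeness between embeddings is commonly measured with cosine similarity. The three-dimensional vectors are invented for illustration only; real embedding models return vectors with hundreds or thousands of dimensions, and the helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy "embeddings", made up for this example (not produced by any model).
toy_embeddings = {
    "invoice": [0.9, 0.1, 0.0],
    "bill": [0.8, 0.2, 0.1],
    "giraffe": [0.0, 0.2, 0.9],
}


def most_similar(word, embeddings):
    """Return the other word whose vector is closest to the vector of `word`."""
    query = embeddings[word]
    others = [w for w in embeddings if w != word]
    return max(others, key=lambda w: cosine_similarity(query, embeddings[w]))


print(most_similar("invoice", toy_embeddings))  # prints "bill"
```

Texts with related meanings end up with vectors that point in similar directions, which is what a vector search exploits when it looks for the chunks closest to a question.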

Azure Cognitive Search

Azure Cognitive Search is a cloud-based search service provided by Microsoft Azure. It lets you build powerful, customizable search solutions for your applications, websites, or enterprise data. In June 2023, Microsoft announced a public preview of vector storage and vector search in Azure Cognitive Search. In previous posts we used Pinecone as the vector database, and there are other providers; however, having everything inside Azure is a no-brainer.


Some concepts before we start:

  • Indexes: An index is where we store data. Basically, an index holds the content of the files as JSON documents, and when using vector storage those documents can also contain vector embeddings.
  • Indexers: An indexer is the software component that goes to a data source, fetches new or modified files, and puts them into an index as text/JSON documents.
  • Data Sources: A data source is just the connection to the underlying storage. In our case the files are in Blob Storage, but there are several other options available, such as Azure SQL, Cosmos DB, etc.
  • Skillsets: Skillsets are attached to indexers. They are made of skills, pieces of software that process documents in a pipeline fashion: after the text has been retrieved from a document, you can plug in skills that perform other tasks, such as generating text from images or extracting text from pictures with OCR. Microsoft provides several built-in skills for us to reuse; the interesting part, however, is that we can develop our own custom skills.
  • Knowledge Store: A knowledge store is a data sink created by a Cognitive Search enrichment pipeline. It stores AI-enriched content in tables and blob containers in Azure Storage for independent analysis or downstream processing in non-search scenarios such as knowledge mining. Projections are the physical tables, objects, and files in a knowledge store that receive the content of the enrichment pipeline, and defining and shaping projections is most of the work when you create a knowledge store. You can learn more about creating a knowledge store in Azure Cognitive Search by following this quickstart guide. In the context of our architecture, the Knowledge Store contains our chunks of text with their vector embeddings. A Knowledge Store is backed by a storage account, and the chunk indexer then reads from this storage account to create the chunk indexes (where our vector embeddings reside).
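The "chunks of text" mentioned above are smaller pieces of each document, each of which gets its own embedding. The article does not say how the chunks are produced, so the following is only a sketch of one common approach, fixed-size chunks with overlap. The function name, the sizes, and the record shape (`chunk_id`, `text`) are assumptions made for illustration.

```python
def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into chunks of at most `chunk_size` characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def to_chunk_records(doc_id, text, chunk_size=200, overlap=50):
    """Wrap each chunk in a small record (hypothetical shape) ready for embedding."""
    return [
        {"chunk_id": f"{doc_id}-{i}", "text": chunk}
        for i, chunk in enumerate(split_into_chunks(text, chunk_size, overlap))
    ]


records = to_chunk_records("doc1", "abcdefghij", chunk_size=4, overlap=1)
print([r["text"] for r in records])  # prints ['abcd', 'defg', 'ghij']
```

Splitting by characters keeps the sketch short; splitting on sentences or tokens is a frequent alternative when chunk boundaries matter for answer quality.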

Visual overview

Explanation: Our documents are stored in Blob Storage (left of the diagram). Azure Cognitive Search picks those files up with an indexer, processes the text, and creates the vector embeddings with a custom skill that calls the OpenAI Embedding Generator. The processed results are then stored in indexes for later retrieval. The chunks and embeddings are generated during this process as projections and stored in a Knowledge Store (Blob Storage). Data sources are created as the connections between the Blob Storage/Knowledge Store and the indexers. The Backend is the core of the solution from a code perspective, and it is explained below.


The solution is split into several software components that contribute to the entire process. We developed only three of them; the rest of the components in our diagram are cloud components that support the solution (these are explained above).

  • OpenAI Embedding Generator

It's an Azure Function that is called for every document indexed: it receives text and returns embedding vectors. The function is used as part of an Azure Cognitive Search skillset.
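A custom skill exchanges JSON with Azure Cognitive Search in a fixed envelope: the request carries a `values` array of records, each with a `recordId` and a `data` object, and the response must return the same `recordId` values with the outputs. The sketch below shows only that request/response handling, not the authors' actual function. The input name `text`, the output name `embedding`, and the injected `embed` callable (standing in for the call to the embedding model) are assumptions.

```python
def run_embedding_skill(request_body, embed):
    """Turn a custom-skill request into a custom-skill response.

    `embed` is any callable that maps a string to a list of floats;
    in a real deployment it would call the embedding model.
    """
    results = []
    for record in request_body.get("values", []):
        output = {
            "recordId": record["recordId"],  # must echo the incoming id
            "data": {},
            "errors": [],
            "warnings": [],
        }
        text = record.get("data", {}).get("text")  # "text" is an assumed input name
        if not text:
            output["warnings"].append(
                {"message": "No text supplied; no embedding generated."}
            )
        else:
            try:
                output["data"]["embedding"] = embed(text)  # assumed output name
            except Exception as exc:
                # Report the failure for this record only; keep processing the rest.
                output["errors"].append({"message": f"Embedding failed: {exc}"})
        results.append(output)
    return {"values": results}


def fake_embed(text):
    """Stand-in embedder for the example; not a real model."""
    return [float(len(text)), 0.0]


request = {
    "values": [
        {"recordId": "1", "data": {"text": "hello"}},
        {"recordId": "2", "data": {}},
    ]
}
response = run_embedding_skill(request, fake_embed)
print(response["values"][0]["data"]["embedding"])  # prints [5.0, 0.0]
```

Injecting `embed` keeps the envelope logic testable without network access; inside the Azure Function, the HTTP trigger would parse the request body, call this function, and serialize the result as the JSON response.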

  • Backend

IndexDocuments: This is an HTTP-triggered Azure Function whose responsibility is to create the Azure Cognitive Search infrastructure that makes everything happen. It creates the indexes (one for plain-text documents and another for vector storage), the data source, the indexer, and the skillset. It then waits for the indexer to finish and, at the end, returns all the resources created.

DeleteDocuments: It deletes all the resources created by the previous function: indexes, indexers, skillsets, and data sources. It won't delete the source documents in Blob Storage.

AskYourDocuments: This backend function is responsible for receiving a question and generating an answer. It uses the Langchain framework to encapsulate the complexity of building LLM applications, Azure Cognitive Search to find relevant document chunks through their vector embeddings, and Azure OpenAI to generate an answer in English (or any supported language) from the context in the chunks retrieved for the user's question.
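The flow described here (embed the question, retrieve the closest chunks, ask the model to answer from them) can be sketched in plain Python. This is not the Langchain-based implementation the article refers to; it only illustrates the sequence of steps. The three callables `embed`, `search`, and `complete` are hypothetical placeholders for the embedding model, the Azure Cognitive Search vector query, and the Azure OpenAI completion call, and the prompt wording is an assumption.

```python
def build_prompt(question, chunks):
    """Assemble a prompt that asks the model to answer from the given chunks only."""
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer the question using only the context below. "
        "If the context is not sufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


def ask_your_documents(question, embed, search, complete, top_k=3):
    """Retrieve-then-generate: embed the question, fetch chunks, ask the LLM."""
    query_vector = embed(question)        # question -> embedding vector
    chunks = search(query_vector, top_k)  # vector search -> list of chunk texts
    if not chunks:
        return "I could not find anything relevant in the indexed documents."
    return complete(build_prompt(question, chunks))  # LLM writes the answer


# Stand-ins so the sketch runs without any cloud service.
def fake_embed(text):
    return [1.0, 0.0]


def fake_search(vector, top_k):
    return ["Invoices are issued on the first day of each month."][:top_k]


def fake_complete(prompt):
    return "stub answer based on %d prompt characters" % len(prompt)


print(ask_your_documents("When are invoices issued?", fake_embed, fake_search, fake_complete))
```

Restricting the model to the retrieved context is what keeps answers grounded in your documents rather than in whatever the model learned during training.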

  • Frontend

It's an Azure Web Application developed in Streamlit. Its only job is to call the backend REST API, specifically the AskYourDocuments Azure Function, with a query and then render the response on screen.

Want to know more?

This insight is part of a series where we go through the necessary steps to create and optimize Chat & AI Applications.

Below, you can find the full overview and the links to the different parts of the series:

If you want to get started with creating your own AI-powered chatbot, contact us.