Recently I needed to index documents stored in Azure Blob Storage. I also wanted to use the blob storage metadata to attach some extra information to those documents. However, because I needed rich text in the metadata, I couldn't use blob storage metadata directly. See here why.
So I had to use two different data sources: one for the documents and another for the metadata. I chose Azure Blob Storage for the documents and Azure Table Storage for the metadata. This is the full diagram of the final solution:
The indexers are responsible for updating the index with the contents of the two data sources. One field is especially important: in my case it is called UniqueIdentifier, and it is the field marked with the key property. This is the field that uniquely identifies each document in the Azure Search index.
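To make the role of the key field concrete, here is a minimal sketch of what such an index definition might look like, written as a Python dictionary in the shape the Azure Search REST API expects for fields (name, type, key). Only UniqueIdentifier comes from my setup as described here. The index name and the other two fields (content, Description) are assumptions added for illustration.

```python
import json

# Sketch of an index definition. Only "UniqueIdentifier" is taken from the
# article; the index name and the other fields are hypothetical.
index_definition = {
    "name": "documents-index",  # hypothetical index name
    "fields": [
        # The key field: uniquely identifies each document in the index.
        {"name": "UniqueIdentifier", "type": "Edm.String", "key": True, "searchable": False},
        # Hypothetical field filled from the blob (extracted document text).
        {"name": "content", "type": "Edm.String", "searchable": True},
        # Hypothetical field filled from the table storage record (rich text metadata).
        {"name": "Description", "type": "Edm.String", "searchable": True},
    ],
}


def key_fields(definition):
    """Return the names of the fields marked with the key property."""
    return [field["name"] for field in definition["fields"] if field.get("key")]


if __name__ == "__main__":
    # Print the JSON body that could be sent when creating the index.
    print(json.dumps(index_definition, indent=2))
```

An index has exactly one key field, so key_fields(index_definition) returns a single name here.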
It is also this field that correlates the items coming from one data source (documents from blob storage) with the items coming from the other data source (records from table storage).
Every document inserted in blob storage has a custom metadata property, also named UniqueIdentifier, whose value matches a table storage record that holds the corresponding metadata.
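The correlation can be pictured with a small simulation. This is not the Azure SDK and not how the service is implemented internally. It is only a sketch of the merge-by-key idea, assuming each indexer writes its items into the index under the same UniqueIdentifier value. The function name, the sample values, and the content and Description fields are all hypothetical.

```python
def merge_into_index(index, source_items, key_field="UniqueIdentifier"):
    """Merge items from one data source into the index, matching on the key field.

    Items that share a key end up as a single index document whose fields
    are the union of what each data source contributed.
    """
    for item in source_items:
        key = item[key_field]
        # Create the document on first sight, then merge the new fields in.
        index.setdefault(key, {}).update(item)
    return index


# Hypothetical output of the blob indexer: text plus the custom metadata property.
blob_items = [
    {"UniqueIdentifier": "doc-001", "content": "Text extracted from report.pdf"},
]

# Hypothetical output of the table indexer: the rich text metadata record.
table_items = [
    {"UniqueIdentifier": "doc-001", "Description": "<p>Quarterly <b>report</b></p>"},
]

index = {}
merge_into_index(index, blob_items)   # first indexer run
merge_into_index(index, table_items)  # second indexer run, same key

if __name__ == "__main__":
    print(index["doc-001"])
```

After both runs the index holds one document for doc-001 with both the content and the Description fields. If the two UniqueIdentifier values did not match, the result would be two separate, incomplete documents, which is why the blob metadata property and the table record must carry the same value.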