Azure Search Index with 2 datasources

Recently I needed to index documents stored in Azure Blob Storage. I also wanted to attach some extra information to those documents as metadata. Because that metadata had to contain rich text, I couldn't use blob storage metadata directly. See here why.

So I had to use 2 different data sources: one for the documents and another for the metadata. I chose Azure Blob Storage and Azure Table Storage. This is the full diagram of the final solution:

The indexers are responsible for updating the index with the contents of the 2 data sources. One field is especially important: the field marked with the key property, which in my case is called UniqueIdentifier. This is the field that uniquely identifies each document in the Azure Search index.

And it’s this field that is responsible for correlating the items that come from one data source (documents from blob storage) and items that come from the other data source (records from table storage).

Every document inserted in blob storage has a custom metadata property, also named UniqueIdentifier, whose value matches a table storage record that holds the corresponding metadata.
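To make the correlation concrete, here is a minimal sketch in plain C#, with no Azure SDK involved. A dictionary stands in for the index, and each call plays the role of one indexer writing one item. The class name, the method name and the field names other than UniqueIdentifier are hypothetical, and the sketch assumes that both indexers write into whichever document shares the same key value.

```csharp
using System;
using System.Collections.Generic;

public static class MergeByKeySketch
{
    // In-memory stand-in for the search index: key value -> document fields.
    public static readonly Dictionary<string, Dictionary<string, object>> Index =
        new Dictionary<string, Dictionary<string, object>>();

    // Mimics what an indexer does for one item: create the document if the key
    // is new, otherwise merge the incoming fields into the existing document.
    public static void MergeOrUpload(string uniqueIdentifier, IDictionary<string, object> fields)
    {
        if (!Index.TryGetValue(uniqueIdentifier, out var document))
        {
            document = new Dictionary<string, object> { ["UniqueIdentifier"] = uniqueIdentifier };
            Index[uniqueIdentifier] = document;
        }

        foreach (var field in fields)
        {
            document[field.Key] = field.Value;
        }
    }

    public static void Main()
    {
        // Item coming from the blob data source (hypothetical "content" field).
        MergeOrUpload("doc-001", new Dictionary<string, object> { ["content"] = "Extracted document text" });

        // Item coming from the table data source (hypothetical "description" field).
        MergeOrUpload("doc-001", new Dictionary<string, object> { ["description"] = "Rich text metadata" });

        // Both items share the key, so the index holds a single document.
        Console.WriteLine(Index.Count); // prints 1
        Console.WriteLine(Index["doc-001"].ContainsKey("content") && Index["doc-001"].ContainsKey("description")); // prints True
    }
}
```

If the two sources used different key values for the same document, the sketch would end up with two separate entries, which is why the blob metadata property and the table record have to agree on UniqueIdentifier.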

Azure Blob Storage Metadata 400 Bad Request

I was getting a 400 Bad Request when inserting blobs in Azure Blob Storage because I was setting metadata with non-ASCII characters.


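using Microsoft.Azure; // Namespace for CloudConfigurationManager (may differ by package version)
using Microsoft.WindowsAzure.Storage; // Namespace for CloudStorageAccount
using Microsoft.WindowsAzure.Storage.Blob; // Namespace for the blob types

// Retrieve storage account from connection string.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
    CloudConfigurationManager.GetSetting("StorageConnectionString"));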

// Create the blob client.
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

// Retrieve a reference to a container.
CloudBlobContainer container = blobClient.GetContainerReference("mycontainer");

// Create the container if it doesn't already exist.
container.CreateIfNotExists();

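// Add some metadata to the container.
container.Metadata.Add("docType", "textDocuments");

// Send the metadata to the service; a non-ASCII name or value fails here with 400 Bad Request.
container.SetMetadata();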

According to the Microsoft documentation:

“You will receive a 400 Bad Request if any name/value pairs contain non-ASCII characters. Metadata name/value pairs are valid HTTP headers, and so must adhere to all restrictions governing HTTP headers. It is therefore recommended that you use URL encoding or Base64 encoding for names and values containing non-ASCII characters.”

https://docs.microsoft.com/en-us/azure/storage/blobs/storage-properties-metadata
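As a rough sketch of the recommendation quoted above, the helpers below encode a metadata value before it is stored and decode it after it is read back. The class and method names are hypothetical and the sample value is made up; only standard .NET calls are used, and the actual storage call is left as a comment.

```csharp
using System;
using System.Text;

public static class MetadataEncoding
{
    // URL encoding: non-ASCII characters become %XX escapes of their UTF-8 bytes.
    public static string ToUrlEncoded(string value) => Uri.EscapeDataString(value);

    public static string FromUrlEncoded(string encoded) => Uri.UnescapeDataString(encoded);

    // Base64 encoding of the UTF-8 bytes: the result only contains ASCII characters.
    public static string ToBase64(string value) =>
        Convert.ToBase64String(Encoding.UTF8.GetBytes(value));

    public static string FromBase64(string encoded) =>
        Encoding.UTF8.GetString(Convert.FromBase64String(encoded));

    public static void Main()
    {
        string title = "Relatório de março"; // hypothetical value with non-ASCII characters

        string safe = ToUrlEncoded(title);
        Console.WriteLine(safe); // prints Relat%C3%B3rio%20de%20mar%C3%A7o

        // The encoded value can then be stored as metadata, for example:
        // container.Metadata.Add("title", safe);
        // container.SetMetadata();

        // Decode after reading the metadata back.
        Console.WriteLine(FromUrlEncoded(safe) == title);        // prints True
        Console.WriteLine(FromBase64(ToBase64(title)) == title); // prints True
    }
}
```

Whichever encoding is chosen, every reader of the metadata has to apply the matching decode step, so it is worth settling on one of the two for the whole container.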