Is there a way to do symbolic links to the blob data when using Azure Storage to avoid duplicate blobs?

混江龙づ霸主 submitted on 2019-12-05 10:25:17

No, there are no symbolic links (source: http://social.msdn.microsoft.com/Forums/vi-VN/windowsazuredata/thread/6e5fa93a-0d09-44a8-82cf-a3403a695922).

A good solution depends on the anticipated size of the files and the number of duplicates. If there aren't going to be many duplicates, or the files are small, then it may actually be quicker and cheaper to live with it - $0.15 per gigabyte per month is not a great deal to pay, compared to the development cost! (That's the approach we're taking.)

If it were worthwhile to remove duplicates, I'd use table storage to create some kind of redirection between the file name and the actual location of the data. I'd then do a client-side redirect to send the client's browser to the proper version for download.

If you do this, you'll want to preserve the file name (as that is what's visible to the user), but you can call the "folder" location whatever you want.
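As a rough illustration of that redirection idea, here is a minimal Python sketch that uses an in-memory dict in place of table storage. The class `RedirectIndex`, its methods, the use of SHA-256, and the `<hash>/<file name>` location layout are all assumptions made for the example, not part of any Azure API.

```python
import hashlib
import posixpath


class RedirectIndex:
    """In-memory stand-in for a table that maps visible paths to stored data."""

    def __init__(self):
        self._by_path = {}     # visible path -> location of the stored data
        self._stored = set()   # locations that already hold data

    def register(self, visible_path, content):
        """Record an upload and return (location, needs_upload)."""
        digest = hashlib.sha256(content).hexdigest()
        # Keep the file name, since that is what the user sees on download;
        # the "folder" part is free, so the content hash is used for it here.
        location = digest + "/" + posixpath.basename(visible_path)
        needs_upload = location not in self._stored
        self._stored.add(location)
        self._by_path[visible_path] = location
        return location, needs_upload

    def resolve(self, visible_path):
        """Return the location the client should be redirected to."""
        return self._by_path[visible_path]


index = RedirectIndex()
first = index.register("reports/2019/q1.pdf", b"same bytes")
second = index.register("archive/q1.pdf", b"same bytes")
```

With this layout, `second` points at the same location as `first` and reports that no new upload is needed, because both the content and the file name match.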

Poul K. Sørensen

Another solution that keeps the full structure of your files but still provides a way to do "symbolic links" is described below. As the other answer notes, though, the price might be so small that it's not worth the effort of implementing it.

In a similar setup, I decided to just store the MD5 of each uploaded file in a table, and then go back in a year to see how many duplicates were uploaded and how much storage could be saved. At that point it will be easy to evaluate whether it's worth implementing a solution for symbolic links.
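That kind of after-the-fact evaluation could look roughly like the following Python sketch. The function `duplicate_report` and its input shape (pairs of name and content bytes) are hypothetical, and the default price is simply the $0.15 per gigabyte per month figure quoted above; in practice you would read the stored hashes and sizes from your table rather than hashing content again.

```python
import hashlib
from collections import defaultdict


def duplicate_report(uploads, price_per_gb_month=0.15):
    """Summarize duplicates in an iterable of (name, content_bytes) pairs."""
    sizes = {}                  # MD5 hex digest -> size in bytes
    counts = defaultdict(int)   # MD5 hex digest -> number of uploads seen
    for _name, content in uploads:
        digest = hashlib.md5(content).hexdigest()
        sizes[digest] = len(content)
        counts[digest] += 1

    # Every upload beyond the first with the same hash is redundant storage.
    wasted = sum(sizes[d] * (n - 1) for d, n in counts.items())
    return {
        "duplicate_uploads": sum(n - 1 for n in counts.values()),
        "wasted_bytes": wasted,
        "monthly_cost_saved": wasted / 1024 ** 3 * price_per_gb_month,
    }


report = duplicate_report([
    ("a.txt", b"hello"),
    ("b.txt", b"hello"),
    ("c.txt", b"world!"),
])
```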

The downside of maintaining it all in table storage is that you get a limited query API to your blobs. Instead, I would suggest using the metadata on blobs to create links. (Metadata entries are exposed as x-ms-meta-* headers on requests when using the REST API.)

So for duplicate blobs, just keep one of them and store a link header telling where the data is.

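// Empty the duplicate blob (this also creates it if it is missing), then
// persist the "link" metadata pointing at the blob that holds the real data.
blob.Metadata["link"] = dataBlob.Name;
await blob.UploadTextAsync("");
await blob.SetMetadataAsync();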

At this point the blob holds no data, but it is still present in storage and will be returned when listing blobs.

Then, when accessing data, you simply check whether the blob has "link" metadata set (or, with REST, whether an x-ms-meta-link header is present) and, if so, read the data from the linked blob instead.

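await blob.FetchAttributesAsync(); // metadata must be loaded before it can be read
var text = await blob.Container.GetBlockBlobReference(blob.Metadata["link"]).DownloadTextAsync();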

or any of the other methods for accessing the data.
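To make the read path concrete without depending on any SDK, here is a small Python model of the idea. `FakeBlob`, `make_link`, and `read_blob` are hypothetical names, the container is just a dict, and the `max_hops` guard against link cycles is an extra assumption that is not part of the approach described above.

```python
class FakeBlob:
    """Stand-in for a stored blob: a name, a metadata dict, and content."""

    def __init__(self, name, content=b"", metadata=None):
        self.name = name
        self.content = content
        self.metadata = dict(metadata or {})


def make_link(container, duplicate_name, data_name):
    """Turn a duplicate into an empty blob that points at the real data."""
    blob = container[duplicate_name]
    blob.metadata["link"] = data_name
    blob.content = b""


def read_blob(container, name, max_hops=8):
    """Follow "link" metadata until reaching a blob that holds real data."""
    blob = container[name]
    hops = 0
    while "link" in blob.metadata:
        if hops >= max_hops:
            raise RuntimeError("too many link hops; possible cycle")
        blob = container[blob.metadata["link"]]
        hops += 1
    return blob.content


container = {
    "data/original.txt": FakeBlob("data/original.txt", b"payload"),
    "copies/duplicate.txt": FakeBlob("copies/duplicate.txt", b"payload"),
}
make_link(container, "copies/duplicate.txt", "data/original.txt")
result = read_blob(container, "copies/duplicate.txt")
```

The duplicate still shows up when iterating over the container, but only the original holds any bytes.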

The above covers just the basics; the rest should be straightforward to work out if you go this route.
