2019-09-26

Copy an Azure Storage Table using Rust

To set up a development environment, I need to copy a bunch of data from another test environment. A lot of the data is stored in Azure Storage Tables, so what is an easy, repeatable, and performant way to copy Azure Storage Tables? Ideally, I want to run the copy from an Azure Cloud Shell in the same region as the storage accounts to minimize network latency. In my case, the storage accounts are both in East US, so I moved my cloud shell there too. From cloud shell, I ran `clouddrive unmount` to disconnect the current instance and went through the advanced setup prompts when I launched cloud shell again.

The Table Storage REST API has Query Entities and Insert Entity operations, which I used for this. It looks simple enough. The only issue is that I have several tables that exceed 1000 entities, so the results will be paged. The service may actually return fewer than 1000 entities per page for server-side reasons, such as a query crossing a partition boundary or hitting the server's time limit. When there is another page, the response includes a continuation token for requesting it. To copy all of the entities, we need all of the pages. Paging can be wrapped into an async stream.
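The paging loop can be sketched without the SDK. The following is a minimal, synchronous illustration that uses a plain `Iterator` over an in-memory table instead of an async stream. `Page`, `query_page`, `all_pages`, and `copy_all` are hypothetical names made up for this sketch, not part of AzureSDKForRust, and the continuation is modeled as a simple offset rather than the service's real token.

```rust
/// One page of query results plus the continuation for the next page, if any.
struct Page {
    entities: Vec<String>,
    continuation: Option<usize>,
}

/// Hypothetical stand-in for a "query entities" request. The real service
/// returns opaque continuation values; here the token is just an offset.
fn query_page(table: &[String], token: Option<usize>, page_size: usize) -> Page {
    let start = token.unwrap_or(0);
    let end = (start + page_size).min(table.len());
    Page {
        entities: table[start..end].to_vec(),
        continuation: if end < table.len() { Some(end) } else { None },
    }
}

/// Iterator that keeps requesting pages until no continuation comes back.
struct Pages<'a> {
    table: &'a [String],
    page_size: usize,
    /// `Some(token)` means one more request is pending; `None` means done.
    pending: Option<Option<usize>>,
}

impl<'a> Iterator for Pages<'a> {
    type Item = Vec<String>;

    fn next(&mut self) -> Option<Self::Item> {
        // Stop once the previous page came back without a continuation.
        let token = self.pending.take()?;
        let page = query_page(self.table, token, self.page_size);
        self.pending = page.continuation.map(Some);
        Some(page.entities)
    }
}

/// Start with one pending request that carries no token (the first page).
fn all_pages(table: &[String], page_size: usize) -> Pages<'_> {
    assert!(page_size > 0, "page_size must be positive");
    Pages { table, page_size, pending: Some(None) }
}

/// Copy every entity by draining all pages into the destination.
fn copy_all(source: &[String], page_size: usize) -> Vec<String> {
    let mut destination = Vec::new();
    for page in all_pages(source, page_size) {
        destination.extend(page);
    }
    destination
}

fn main() {
    let source: Vec<String> = (0..2500).map(|i| format!("entity-{}", i)).collect();
    let pages = all_pages(&source, 1000).count();
    let copied = copy_all(&source, 1000);
    println!("copied {} entities in {} pages", copied.len(), pages);
}
```

The caller only sees a sequence of pages and never handles the continuation itself. An async version would have the same shape, with each `next` step awaiting a network request instead of slicing a vector.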

I made a copy_table example in the AzureSDKForRust project. It built into a static Linux executable of only 7 MB and copied 17,000 entities in 2 minutes, running from my cloud shell. That meets my needs. The example code is very readable.

The hard part was figuring out how to add stream_query_entities, because it did not exist yet. The project hasn't upgraded to std::future::Future yet and is still using futures 0.1, so the new Rust async/await can't be used, and I find the combinators and callbacks pretty painful. The addition of async/await makes things much easier and makes Rust my favorite language to work with. My F# experience really helps. Rust fully embraces options instead of nulls, results for error handling, pattern matching, async polling instead of eager evaluation, and compiling everything from source.
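As a small illustration of the options, results, and pattern matching mentioned above, here is a standalone sketch. `InsertError`, `describe_continuation`, and `insert_entity` are hypothetical names invented for this example, and the "table" is just a vector of row keys.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum InsertError {
    Conflict(String),
}

/// An absent continuation is `None`, not a null that might be forgotten.
fn describe_continuation(token: Option<&str>) -> String {
    match token {
        Some(t) => format!("more pages, continue from {}", t),
        None => "last page".to_string(),
    }
}

/// Failure is part of the return type, so the caller has to handle it.
fn insert_entity(existing: &mut Vec<String>, row_key: &str) -> Result<usize, InsertError> {
    if existing.iter().any(|k| k == row_key) {
        return Err(InsertError::Conflict(row_key.to_string()));
    }
    existing.push(row_key.to_string());
    Ok(existing.len())
}

fn main() {
    println!("{}", describe_continuation(Some("row-1000")));
    println!("{}", describe_continuation(None));

    let mut keys = vec!["a".to_string()];
    // Pattern matching destructures both the success and the error case.
    match insert_entity(&mut keys, "b") {
        Ok(count) => println!("inserted, {} entities now", count),
        Err(InsertError::Conflict(key)) => println!("{} already exists", key),
    }
    if let Err(e) = insert_entity(&mut keys, "a") {
        println!("error: {:?}", e);
    }
}
```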

Since I'm on a Mac, an easy way to target x86_64-unknown-linux-musl was to use a Docker image. I built a target/x86_64-unknown-linux-musl/release/examples/copy_table executable with the following commands, running the cargo build from the shell inside the container:

git clone git@github.com:MindFlavor/AzureSDKForRust.git
docker run -it --rm -v ${PWD}:/src -w /src clux/muslrust
cargo build --release --example copy_table --target x86_64-unknown-linux-musl

I used `az storage blob upload` and `az storage blob download` to upload the executable to storage and download it onto my cloud shell.