Azure for Beginners - Creating an Azure Storage Account

You might have heard the names Azure and/or AWS dropped, but what are they? And what are they used for? Azure and AWS (Amazon Web Services) are two of the leading cloud computing platforms, provided by Microsoft and Amazon, respectively. Both are widely used by businesses for, among other things, data storage, processing, and analysis. A common problem for beginners is that these platforms contain so many features that they can be very overwhelming if you don't know what to look for.
In this article we will focus on Azure. We will guide you through creating and using one of the most fundamental features of Azure: a storage account.
Note: The look of Azure tends to change regularly. Your screen might not look exactly like the images in this guide. Some features might have been moved to a different tab, but the general steps should be the same.
Note: The services in Azure are all paid services. The cost depends on your usage and configuration. (More information can be found here.) It's advised to follow this guide under the supervision of a database administrator or a data engineer.

Azure Storage Account

One of the first things you want to do in Azure is to create an Azure storage account. An Azure storage account provides access to a scalable and secure object storage service for your data files. It offers several types of storage services, including blobs, files, queues, and tables. In practice, such accounts are often used to safely and securely share files with clients or across different teams. They're also great for automated data storage.
After logging in, navigate to Storage accounts. You can do so in various ways. One of the easiest ways would be to click on the 'Create a resource' button in the top left corner of the start screen.
Azure - Creating a storage account - Create a resource in Azure services
Next, click on Storage in the bottom left corner and then on the Storage account button.
Azure - Creating a storage account - Create button
If this is your first Azure storage account, click on the blue button in the middle of the screen. If you already have a storage account, click on the '+ Create' button in the top left corner.
Azure - Creating a storage account - Create storage account button
The setup of our new storage account will lead us through several steps, divided into multiple categories. These categories are reflected by the tabs at the top (Basics, Advanced, Networking, etc.). We will start with the Basics tab. Select your subscription and click on 'Create new' to create a new Resource group.
Azure - Creating a storage account - Create new resource group cropped
If you haven't created a resource group yet, do so now.
Azure - Creating a storage account - Create new resource group - popup notice
Next, we will have to name our new storage account. The name can only contain lowercase letters and numbers. (So no spaces.) It also needs to be unique; you can't choose one which has already been taken by another user.
Azure - Creating a storage account - Storage account name unique
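The naming rules above are easy to check before you even open the portal. The snippet below is a minimal sketch of such a check. The function name is our own invention, and the length limit of 3 to 24 characters comes from Azure's naming rules rather than from the steps above. Note that it cannot tell you whether a name is still available; only Azure can answer that.

```python
import re

# Lowercase letters and digits only. The 3-24 length bound is taken from
# Azure's naming rules for storage accounts, not from this guide's text.
_ACCOUNT_NAME_PATTERN = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if `name` has a valid format (hypothetical helper).

    This only checks the format. Global uniqueness is not checked here.
    """
    return _ACCOUNT_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["northwinddata01", "Northwind Data", "ab"]:
    print(candidate, "->", is_valid_storage_account_name(candidate))
```

A check like this is mostly useful in scripts, where a rejected name would otherwise only show up after the deployment request has been sent.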
After choosing a name, you'll need to select the region where you'd like your storage account to be located. This is a very important step, because you're essentially choosing the location of the data center for your files. When you want to access your files, you will have to retrieve them from that data center. If you choose a region far away, retrieving your files will take longer.
However, choosing a region might not be as straightforward as choosing the region closest to you. For instance, if you are creating a storage account on behalf of your client, you might have to take into account your client's region. What if you're working with other teams, remote and otherwise, accessing the data from different locations? This can make such a choice quite challenging. Depending on your company's needs, it is also very possible you will have to use storage accounts in multiple data centers, all located in different regions. For our beginner guide, simply choose the location closest to you. If you have no idea what region to choose, you can use this tool to see which region would give you the highest speeds. (Lower latency is better.)
Azure - Creating a storage account - Region choice
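If you have measured the latency to a few regions yourself, picking the best one is a matter of comparing the numbers. The sketch below assumes you have collected a handful of latency samples per region. The function name and the sample values are hypothetical and only serve as an illustration.

```python
from statistics import median


def pick_lowest_latency_region(samples_ms: dict[str, list[float]]) -> str:
    """Return the region with the lowest median latency (hypothetical helper)."""
    medians = {
        region: median(values)
        for region, values in samples_ms.items()
        if values  # skip regions without any measurement
    }
    if not medians:
        raise ValueError("no latency samples provided")
    return min(medians, key=medians.get)


# Made-up measurements in milliseconds, for illustration only.
example_samples = {
    "westeurope": [28.0, 31.0, 30.0],
    "northeurope": [41.0, 39.0, 44.0],
    "eastus": [95.0, 102.0, 99.0],
}
print(pick_lowest_latency_region(example_samples))
```

The median is used instead of the average so that a single slow measurement does not distort the comparison.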
Next, we'll need to make some decisions about the type of storage account we want or need. Firstly, we need to choose between standard or premium performance.
Standard accounts are ideal for the majority of customer scenarios. In a standard general purpose V2 account, you will have access to all four of the storage subresources: blobs, files, queues, and tables.
Premium accounts are meant for low-latency scenarios, letting you retrieve your files very quickly. If you opt for a premium account, you will have to choose which service it should be premium for (block blobs, file shares, or page blobs) and the redundancy type (locally- or zone-redundant storage).
For this beginner's guide, we will choose the standard performance, as it will be what most people will choose. However, if you have clients or services which depend heavily on this storage, you might want to choose premium. Premium accounts come with a higher price tag.
The last decision on the Basics tab is about the level of redundancy we want for the data in our account. Data redundancy comes down to keeping copies of your data in more than one place, so that when something happens to one copy (e.g., transient hardware failures, network or power outages, or massive natural disasters) you have another copy to fall back on. Microsoft always stores multiple copies of your data for this reason, but depending on your needs you still have several options when it comes to redundancy. This mostly comes down to tradeoffs between lower costs and higher availability:
Locally-redundant storage (LRS) - Your data is copied synchronously three times within a single physical location in the primary region. This provides strong resiliency against hardware failures like server rack and drive failures, but if something happens to the data center itself, like a natural disaster, your data might be lost. It is not the best option if you require high data availability or durability. It is the cheapest of all redundancy options.
Zone Redundant Storage (ZRS) - Your data is copied synchronously across three Azure availability zones in the primary region. Each availability zone is a data center in a separate physical location, having its own independent power, cooling, and networking. This offers strong protection within a single Azure region. This is recommended for high availability scenarios.
Geo-redundant Storage (GRS) - Your data is copied synchronously three times within a single physical location in the primary region using LRS, and then copied asynchronously to a single physical location in a secondary region hundreds of miles away from the primary region. This is great for failover scenarios, such as regional unavailability of the primary region (e.g., a natural disaster or power grid failure). Note that with plain GRS the data in the secondary region only becomes readable after a failover. If you want read-only access to the secondary region at any time, you need the read-access variant (RA-GRS).
Geo Zone Redundant Storage (GZRS) - Your data is copied across three Azure availability zones in the primary region. The data is then also replicated to a secondary geographic region. This provides great protection against regional disasters. It basically combines the offerings of both GRS and ZRS, and offers the highest level of data protection of the four options. As with GRS, read-only access to data in the secondary region requires the read-access variant (RA-GZRS). This is the most expensive option.
In our case, we will choose Locally-redundant storage (LRS).
Azure - Creating a storage account - Redundancy choice
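To make the four redundancy options easier to compare, they can be summarized as a small lookup table. The sketch below is only an illustration: the class and function names are our own, while the SKU strings (Standard_LRS and so on) are the names Azure tooling uses for standard accounts. It picks the option that matches which kind of outage you want your data to survive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedundancyOption:
    sku_name: str         # SKU name used by Azure tooling for a standard account
    zone_redundant: bool  # copies spread over availability zones in the primary region
    geo_redundant: bool   # additional copy in a secondary region


OPTIONS = {
    "LRS": RedundancyOption("Standard_LRS", zone_redundant=False, geo_redundant=False),
    "ZRS": RedundancyOption("Standard_ZRS", zone_redundant=True, geo_redundant=False),
    "GRS": RedundancyOption("Standard_GRS", zone_redundant=False, geo_redundant=True),
    "GZRS": RedundancyOption("Standard_GZRS", zone_redundant=True, geo_redundant=True),
}


def choose_redundancy(survive_zone_outage: bool, survive_region_outage: bool) -> str:
    """Return the option that matches the two requirements (hypothetical helper)."""
    for name, option in OPTIONS.items():
        if (option.zone_redundant == survive_zone_outage
                and option.geo_redundant == survive_region_outage):
            return name
    raise ValueError("no matching redundancy option")


choice = choose_redundancy(survive_zone_outage=False, survive_region_outage=False)
print(choice, OPTIONS[choice].sku_name)
```

With both requirements set to False, the sketch lands on LRS, which is the choice made in this guide.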

Advanced Tab

We will move on to the Advanced tab now, where you will be able to configure key security settings for the storage account, as well as settings specific to the blobs and file services. Unless your situation specifically asks for an exception, you can leave most of the settings untouched.
6. Azure - Creating a storage account - Advanced tab choice
For example, if your situation asks for a way to securely transfer files, you might need to place a checkmark next to Enable SFTP as well as one next to Enable hierarchical namespace. However, for this beginner's guide, we will leave them unchecked.
6.1 Azure - Creating a storage account - SFTP 2 checkmark
Next we have to specify the default access tier for your blob storage. Unless you will use your blob storage for backing up your files, which means that you're not accessing these files regularly, it's best to stick with hot storage. Hot storage might have the highest storage costs, but it also has the lowest access costs.
Below it you will have an option to enable large file shares for your file storage. If checked, this setting allows your file share to scale up to 100 terabytes. This only applies to standard file shares, because all premium file shares already scale up to 100 terabytes.
6.2 Azure - Creating a storage account - Hot 3 checkmark
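The tradeoff between storage costs and access costs can be made concrete with a bit of arithmetic. The sketch below uses made-up rates in abstract cost units, so they are NOT real Azure prices, and the model is deliberately simplified to one storage rate and one read rate per tier. It only shows why rarely read data tends to be cheaper in a cooler tier, while frequently read data tends to be cheaper in the hot tier.

```python
def monthly_cost(stored_gb: float, read_gb: float,
                 storage_rate: float, read_rate: float) -> float:
    """Total monthly cost: a storage part plus an access part."""
    return stored_gb * storage_rate + read_gb * read_rate


# Made-up rates in abstract cost units per GB. These are NOT real Azure prices.
HOT = {"storage_rate": 2.0, "read_rate": 0.5}
COOL = {"storage_rate": 1.0, "read_rate": 3.0}


def cheaper_tier(stored_gb: float, read_gb: float) -> str:
    """Return which of the two example tiers is cheaper for this usage."""
    hot = monthly_cost(stored_gb, read_gb, **HOT)
    cool = monthly_cost(stored_gb, read_gb, **COOL)
    return "hot" if hot <= cool else "cool"


print(cheaper_tier(stored_gb=1000, read_gb=100))  # a backup that is rarely read
print(cheaper_tier(stored_gb=1000, read_gb=500))  # data that is read regularly
```

To get a realistic answer for your own situation, replace the example rates with the current prices for your region and include the other cost components, such as the number of operations.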

Networking Tab

The Networking tab is where we can define network access to the storage account. 
6.3 Azure - Creating a storage account - Networking tab
First, we will have to choose the way you want your storage account to be accessed. Do you want your storage account to be accessed from all networks, from a specific set of virtual networks (you can add the virtual network and the subnet here), or only through a private endpoint? For this beginner's guide we will choose Enable public access from all networks.
6.5 Azure - Creating a storage account - Enable public access
Next we will have to choose how your traffic will travel to the storage account. In most situations it's best to choose Microsoft network routing, as it will use Microsoft's own resilient networking capabilities to transmit your data securely. If you choose Internet routing, you keep your data out of the Azure network (and on the public Internet) up until the last possible moment. If you are not sure which option to choose, it's good to know you can change the settings on this tab after you've created your storage account. For our purpose, we will choose Microsoft network routing.
6.6 Azure - Creating a storage account - Network routing option

Data Protection Tab

The next tab is called Data protection, which is about protecting your data against accidental deletion or modification, as well as managing version control and tracking changes. All of the settings on this tab can be modified later on if needed.
6.7 Azure - Creating a storage account - Data protection
The first settings we see are about recovering your data from accidental or erroneous deletion or modification. By default, soft delete settings for blobs, containers and file shares are enabled, with a retention period of seven days. This means that if the data in your blob has been soft-deleted or overwritten, or if your container or data has been soft-deleted by you or another storage account user (or application), you will be able to restore the data to its state at the time it was deleted. However, you will need to do so within the particular retention period. When the retention period expires, the data will be deleted permanently and cannot be recovered.
In case you want even more protection you can enable point-in-time restore for containers, which lets you restore one or more containers to an earlier state, for which you will also have to specify a retention period (between 1 and 19 days). To enable this option, you will also need to enable versioning, blob change feed, and blob soft delete. (See below.)
6.8 Azure - Creating a storage account - Recovery options
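The retention period for soft delete works like a simple deadline: data deleted at a certain moment can be restored until the retention period has passed. The sketch below illustrates that rule with plain date arithmetic. The function names are hypothetical, and the code does not talk to Azure at all.

```python
from datetime import datetime, timedelta, timezone


def restore_deadline(deleted_at: datetime, retention_days: int = 7) -> datetime:
    """Moment after which soft-deleted data is removed permanently."""
    return deleted_at + timedelta(days=retention_days)


def is_restorable(deleted_at: datetime, now: datetime, retention_days: int = 7) -> bool:
    """True while `now` still falls inside the retention period."""
    return now < restore_deadline(deleted_at, retention_days)


deleted = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)
print(restore_deadline(deleted))
print(is_restorable(deleted, now=datetime(2024, 3, 5, 9, 0, tzinfo=timezone.utc)))
print(is_restorable(deleted, now=datetime(2024, 3, 9, 9, 0, tzinfo=timezone.utc)))
```

The default of seven days in the sketch mirrors the default retention period mentioned above.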
Next, we have to choose whether or not we want to enable versioning and change feed for blobs.
• Versioning lets you access earlier versions of a blob to recover your data if it has been modified or deleted. Since every write operation to a blob will result in the creation of a new 'version', enabling versioning might become quite expensive if you haven't set up a lifecycle management policy to automatically delete old versions.
• Enabling change feed will create transaction logs whenever you make a change to the blobs (and blob metadata) in your storage account.
For this beginner's guide, we will leave both disabled.
6.8.1 Azure - Creating a storage account - Tracking options
Finally, you have the option to enable version-level immutability support at the account level. This paid option lets you store your business-critical data in a WORM (Write Once, Read Many) state. Data stored in a WORM state can't be modified or deleted for a specific interval. It comes in two flavors: time-based retention policies and legal hold policies. The former lets you set up policies to store data for a specified interval, after which it can be deleted but not overwritten. The latter lets you specify a legal hold, during which your data is immutable until the legal hold is explicitly cleared. During the legal hold, objects can be created and read, but not modified or deleted. For this guide, we will leave this option unchecked.

Encryption Tab

Next, we move on to the Encryption tab, which is about, you guessed it, encryption of your data. Regardless of your choice, your data in your Azure storage account will remain encrypted using 256-bit AES with GCM mode encryption. This is true regardless of your performance tier (standard or premium), access tier (hot or cool), deployment model (Azure Resource Manager or classic) and resource type (blobs, disks, files, queues, and tables). 
You have a choice in letting Microsoft manage your encryption keys or doing it yourself:
Microsoft-managed keys (MMK) - The data in your storage account is encrypted with Microsoft-managed keys. 
Customer-managed keys (CMK) - The data in your storage account is encrypted with your own keys. These keys must be stored in Azure Key Vault or Azure Key Vault Managed Hardware Security Module (HSM). You will also need to create a user-assigned identity, which is used to grant the storage account access to other resources, like Azure Key Vault.
If you are in need of a second layer of encryption at the infrastructure level, place a checkmark next to Enable infrastructure encryption. This encrypts your data through 256-bit AES with CBC encryption at the Azure Storage infrastructure level. This additional layer of encryption protects your data in case one of the encryption algorithms or keys is compromised.
For our beginner's guide, we will opt for Microsoft-managed keys and leave Enable infrastructure encryption unchecked.
6.9. Azure - Creating a storage account - Encryption type options

Tags Tab

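On the Tags tab you can add tags to your storage account (if needed). These tags are name-value pairs that let you categorize your Azure resources, for example by project, department, or environment, which makes it easier to find related resources and to view the costs of everything that shares a tag. They should not be confused with blob index tags, a separate feature that you apply to individual blobs once the account exists. Using blob index tags might be a great option if you work with lots of data. If you have many blobs in your storage account, finding the exact data you're looking for can be a real pain. Blob index tags let you (dynamically) categorize your data using key-value index tags, so you can quickly find objects in your storage account, even across containers. For example, if you have to find all data from a specific project somewhere in your storage account, tags will let you do this quickly and efficiently. For our beginner's guide, we won't use any tags.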
6.10. Azure - Creating a storage account - Tags tab options 2

Review + Create Tab

Finally, we have arrived at the Review + create tab. This is an overview of all the choices we have made thus far.
6.11. Azure - Creating a storage account - Review and Create options
Your answers will be validated when you switch to this tab. If an answer doesn't meet the requirements you will see a message, along with a red cross next to the tab which contains the error.
6.12. Azure - Creating a storage account - Not validated options
After having checked everything, create your storage account. It might take a while before the deployment has been completed. If successful, you will see the screen below. Click the 'Go to resource' button to enter your new storage account. That's all there is to it. Great job!
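Once you are comfortable with the portal, the same storage account can also be created from a script. The sketch below is one possible way to do this, assuming the azure-identity and azure-mgmt-storage packages are installed and that you are already logged in (for example through the Azure CLI). The helper names, the resource group, and the account name are hypothetical, and only the basic choices from this guide (general-purpose v2, standard performance, LRS) are passed along. Everything else is left at its default.

```python
def build_create_parameters(location: str, sku_name: str = "Standard_LRS") -> dict:
    """Translate the Basics tab choices of this guide into a parameters dictionary."""
    return {
        "location": location,
        "kind": "StorageV2",        # general-purpose v2 account
        "sku": {"name": sku_name},  # standard performance with LRS by default
    }


def create_storage_account(subscription_id: str, resource_group: str,
                           account_name: str, location: str):
    """Create the account with the Azure SDK (requires the packages named above)."""
    # Imported here so the rest of this sketch also runs without the SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.storage import StorageManagementClient

    client = StorageManagementClient(DefaultAzureCredential(), subscription_id)
    poller = client.storage_accounts.begin_create(
        resource_group, account_name, build_create_parameters(location)
    )
    return poller.result()  # waits until the deployment has been completed


# Only prints the parameters. Nothing is created until create_storage_account is called.
print(build_create_parameters("westeurope"))
```

Keep in mind that calling create_storage_account creates a real, paid resource, just like clicking the Create button in the portal.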

Uploading Files to our Azure Storage Account

In your resource, you will be bombarded with options, most of which you might never use. What we do want is to store files, for which we have to create a container. In Azure, containers are used to organize and store unstructured data. When we upload our files into our storage account, they will be placed in a container of our choice. Our files will be stored as so-called blobs (Binary Large Objects). Our files' content and format are preserved exactly as they were, but they are now managed as blobs within Azure's infrastructure. If we were to download them to our computer, they would be downloaded in their original file format. Each container can hold an unlimited number of blobs, but the names of the blobs within a container must be unique. To use an analogy: if a storage account is like a hard drive, a container is like a top-level folder on that drive, and a blob is like an individual file within that folder.
To create a container, click on Containers in the sidebar to the left, and click on the '+ Container' button.
Azure for beginners - How to create an Azure Storage Account - 4. Create container
You will need to name your new container, after which you will need to click on the 'Create' button.
Azure for beginners - How to create an Azure Storage Account - 4.1. Container name new
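Container names have their own rules, which differ slightly from those for storage account names. The sketch below checks a name against the rules as we understand them from Azure's naming documentation (they are not spelled out in this guide): 3 to 63 characters, only lowercase letters, numbers, and hyphens, starting and ending with a letter or number, and no two hyphens in a row. The function name is hypothetical.

```python
import re

# One letter or digit, followed by letters or digits that may each be
# preceded by a single hyphen. This rules out leading, trailing, and
# consecutive hyphens. The length is checked separately.
_CONTAINER_NAME_PATTERN = re.compile(r"[a-z0-9](?:-?[a-z0-9])*")


def is_valid_container_name(name: str) -> bool:
    """Return True if `name` has a valid container name format."""
    if not 3 <= len(name) <= 63:
        return False
    return _CONTAINER_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["northwind-csv", "Northwind", "northwind--csv", "-northwind"]:
    print(candidate, "->", is_valid_container_name(candidate))
```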
If you have been successful in creating a new container, it will show up in the container overview. To see the contents of your container, click on it.
Azure for beginners - How to create an Azure Storage Account - 5. Your new container showing
In the top left corner we can see that we are now in our new container. We will now upload some files. In our case, we will upload the famous Northwind dataset (in CSV format), which is freely available on Kaggle.com. Click on the Upload button.
Azure for beginners - How to create an Azure Storage Account - 8. Upload files
A new screen will slide in. To upload our files, we can simply drag and drop them into the newly appeared field, or we can browse to the location of these files on our computer by clicking 'Browse for files'. When you have done so, click the blue 'Upload' button.
Azure for beginners - How to create an Azure Storage Account - 9. Upload files drop
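Uploading through the portal works fine for a handful of files, but for larger or recurring uploads a script can be more convenient. The sketch below shows one possible approach, assuming the azure-storage-blob package is installed and that you have copied a connection string for your storage account. The helper names and the folder layout are hypothetical. Each CSV file in a local folder is uploaded as a blob that keeps the file's name.

```python
from pathlib import Path


def blob_name_for(local_file: Path, prefix: str = "") -> str:
    """Derive a blob name from a local file, with an optional virtual folder prefix."""
    cleaned_prefix = prefix.strip("/")
    return f"{cleaned_prefix}/{local_file.name}" if cleaned_prefix else local_file.name


def upload_csv_files(connection_string: str, container_name: str, folder: Path) -> list[str]:
    """Upload all CSV files in `folder` to an existing container."""
    # Imported here so the rest of this sketch also runs without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    container = service.get_container_client(container_name)
    uploaded = []
    for path in sorted(folder.glob("*.csv")):
        with path.open("rb") as data:
            # overwrite=True replaces a blob that already has this name.
            container.upload_blob(name=blob_name_for(path), data=data, overwrite=True)
        uploaded.append(path.name)
    return uploaded


print(blob_name_for(Path("northwind/orders.csv")))
print(blob_name_for(Path("northwind/orders.csv"), prefix="raw"))
```

Because blob names within a container must be unique, the sketch overwrites an existing blob with the same name. Set overwrite to False if you would rather get an error in that case.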
Our new files will now be visible in the overview. With that, we have come to the end of this part of the Azure for Beginners guide.