Every now and then we get calls from customers who find themselves in a very bad situation. They needed a restore, but at a certain point hit an obstacle they could not get around. And I’m not talking about lost backups, CryptoLocker or anything like that! It’s just that their focus was on creating a backup or replica. They never considered that data recovery is a whole different process that must be examined and tested separately. Here are several examples to give you a taste of it:
- The customer had a critical 20-terabyte VM that failed. Nobody wants downtime, so they started the VM in instant recovery and had it working in five minutes. However, instant recovery is a temporary state and must be finalized by migrating the VM to the production datastore. As it turned out, the infrastructure could not copy 20 TB of data in any reasonable time. And since instant recovery was started with the option to write changes to the C: drive of the Veeam Backup & Replication server (as opposed to using a vSphere snapshot), that drive was quickly filling up with no way to extend it sufficiently. Because some time had passed before the customer approached support, the VM had already accumulated changes that could not be discarded. So critical data was at risk, instant recovery could not be finalized quickly enough, and failure was imminent. Quite a pickle, huh?
- The customer had a single domain controller in the infrastructure, and everything was added to Veeam Backup & Replication by DNS name. I know, I know. It could have gone wrong in a hundred ways, but here is what happened: The customer planned some maintenance and decided to fail over to the replica of that DC. They used planned failover, which is ideal for such situations. The first phase went fine; however, during the second phase the original VM was turned off to transfer the last bits of data. Of course, at that moment the job failed because DNS went down. Luckily, here we could simply turn on the replica VM manually from vSphere (this is not something we recommend, see the next piece of advice). However, it disrupted and delayed the maintenance process. Plus, we had to manually add host names to the C:\Windows\System32\drivers\etc\hosts file on the Veeam Backup & Replication server to allow a proper failback.
- The customer built their backup infrastructure around tapes and maintained only a very short backup chain on disk. When they had to restore some guest files from a large file server, it turned out that no machine had enough free space to act as a staging repository.
I think in all these situations the clients fell into the same trap: they simply assumed that if a backup is successful, the restore will be as well! Learn about restore, just as you learn about backups. A good way to start is our user guide. This section contains information on all the major types of restores. In the “Before you begin” section of each restore option, you can find initial considerations and prerequisites. Information on other types of restores, such as restore from tapes or from storage snapshots, can be found in their respective sections. Apart from the main user guide, be sure to check out the Veeam Explorers guide too. Each Veeam Explorer has a “Planning and preparation” section, which will help you prepare your system for restore beforehand.
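To make the first and third examples a bit more concrete, here is a minimal Python sketch of the kind of back-of-the-envelope check you could run before you ever need a restore. It is not part of any Veeam tooling. The function names, the throughput figures and the 10% headroom are my own assumptions for illustration; plug in the sustained rate you actually measure between your repository and production storage.

```python
import shutil

TB = 1000**4  # one decimal terabyte, in bytes


def copy_hours(size_bytes: float, throughput_mb_s: float) -> float:
    """Hours needed to move size_bytes at a sustained rate given in MB/s."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = size_bytes / (throughput_mb_s * 1000**2)
    return seconds / 3600


def has_staging_space(path: str, needed_bytes: float, headroom: float = 1.1) -> bool:
    """True if the volume holding `path` has needed_bytes plus headroom free.

    The 10% default headroom is an arbitrary illustrative margin.
    """
    return shutil.disk_usage(path).free >= needed_bytes * headroom


if __name__ == "__main__":
    # Hypothetical sustained rates; measure your own before trusting the result.
    for rate in (100, 500, 1000):
        print(f"20 TB at {rate} MB/s: {copy_hours(20 * TB, rate):.1f} h")
```

At a sustained 100 MB/s, a 20 TB VM needs roughly 55.6 hours to migrate, which is exactly the kind of number you want to know before instant recovery becomes your only option. The same goes for `has_staging_space`: run it against the machine you expect to use as a staging repository, with the size of your largest file server.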
Do not manage replicas from vSphere console
Veeam replicas are essentially normal virtual machines. As such, they can be managed with the usual vSphere management tools, mainly the vSphere client. They can be, but they should not be. Replica failover in Veeam Backup & Replication is a sophisticated process that lets you proceed carefully, one step at a time (with the possibility to roll back if something goes wrong), and finalize failover in a proper way. Take a look at the scheme below:
If, instead of using the Veeam Backup & Replication console, you simply start a replica in the vSphere client, or you start a failover from Veeam Backup & Replication but later switch to managing it from the vSphere client, you face a number of serious consequences:
- The failover mechanism in Veeam Backup & Replication will no longer be usable for this VM, so all the flexibility described above will be lost.
- You will have data in the Veeam Backup & Replication database that does not represent the actual state of the VM. In the worst cases, fixing it requires database edits.
- You can lose data. Consider this example: A customer started a replica manually in the vSphere client and decided to simply stick with it. Some time passed, and they noticed that the replica was still present in the Veeam Backup & Replication console. The customer decided to clean things up a little, right-clicked on the replica and chose “Delete from disk.” Veeam Backup & Replication did exactly what it was told: it deleted the replica, which, unbeknownst to the software, had become a production VM with data.
There are situations when starting the replicas from the vSphere client is necessary (mainly, if the Veeam Backup & Replication server is down as well and replicas must be started without delay). However, if the Veeam Backup & Replication server is operational, it should be the management point from start to finish.
It is also not recommended to delete replica VMs from the vSphere client. Veeam Backup & Replication will not be aware of such changes, which can lead to failures and stale data in the console. If you do not need a replica anymore, delete it from the console and not from the vSphere client as a VM. That way your list of replicas will show only the replicas that actually exist.
Use Replication? Consider doing more with Veeam Availability Orchestrator
If you are using replication, or want to, consider Veeam Availability Orchestrator (VAO). VAO implements advanced DataLab capabilities that aren’t available in Veeam Backup & Replication by itself. This can be used for Disaster Recovery, advanced testing use cases and more!
Careful with updates!
I’m speaking about updates for hypervisors and various applications backed up by Veeam. From a Veeam Backup & Replication perspective, such updates can be roughly divided into two categories — major updates that bring a lot of changes and minor updates.
Let’s speak about major updates first. The most important ones are hypervisor updates. Before installing them, it is necessary to confirm that Veeam Backup & Replication supports them. These updates bring a lot of changes to the libraries and APIs that Veeam Backup & Replication uses, so code changes in Veeam Backup & Replication and rigorous testing by QA are necessary before a new version is officially supported. Unfortunately, as of now VMware does not provide vendors with any preliminary access to new vSphere versions. So Veeam’s R&D gets access together with the rest of the world, which means that there is always a lag between a new version release and official support. The magnitude of the changes also does not allow R&D to fit everything into a hotfix, so official support is typically added with new Veeam Backup & Replication versions. This puts support and our customers in a tricky situation. Usually after a new vSphere release, the number of support cases increases because administrators start installing updates, only to find out that their backups are failing with weird issues. This forces us, support, to ask the customers to perform a rollback (if possible) or to propose workarounds that we cannot officially support, due to lack of testing. So please check the version compatibility before updates!
The same applies to backed-up applications. Veeam Explorers also have a list of supported versions, and new versions are added to this list with Veeam Backup & Replication updates. So once again, be sure to check the Veeam Explorers user guide before moving to a new version.
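If you want to turn "check the compatibility first" into a habit, one option is a small gate in your own change process. The Python sketch below is only an illustration under my own assumptions: the function names are hypothetical, the version strings are placeholders rather than real support data, and it assumes plain dotted numeric versions. The supported list has to be copied by hand from the official documentation for the Veeam Backup & Replication version you run.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '1.5.3' into (1, 5, 3)."""
    return tuple(int(part) for part in text.split("."))


def safe_to_update(target: str, supported: list[str]) -> bool:
    """True if target's major.minor matches an entry in the supported list.

    `supported` is a hand-maintained list that you copy from the official
    documentation; nothing here queries Veeam or the hypervisor.
    """
    target_mm = parse_version(target)[:2]
    return any(parse_version(entry)[:2] == target_mm for entry in supported)


if __name__ == "__main__":
    # Placeholder values for an imaginary product, NOT real support data.
    supported_by_my_backup_version = ["1.0", "1.5"]
    for candidate in ("1.5.3", "2.0"):
        verdict = "ok" if safe_to_update(candidate, supported_by_my_backup_version) else "hold off"
        print(f"update to {candidate}: {verdict}")
```

The point of keeping the list manual is that someone has to open the compatibility information before each major update, which is the step that usually gets skipped.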
In the minor updates category, I put things like cumulative updates for Exchange, new VMware Tools versions, security updates for vSphere, etc. Typically, they do not contain major changes, and in most situations Veeam Backup & Replication does not experience any issues. That’s why QA does not release official statements as it does with major updates. However, in our experience there have been situations where minor updates changed the workflow enough to cause issues with Veeam Backup & Replication. In these cases, once the presence of an issue is confirmed, R&D develops a hotfix as soon as possible.
How should you stay up to date on the recent developments? My advice is to register on https://forums.veeam.com/. You will be subscribed to a weekly “Word from Gostev” newsletter from our Senior Vice President Anton Gostev. It contains information on discovered issues (and not limited to Veeam products), release plans and interesting IT news. If you do not find what you are looking for in the newsletter, I recommend checking the forum. Due to the sheer number of Veeam clients, if any update breaks something, a related thread appears soon after.
Now, backups are not the only thing that patches and updates can break. In reality, they can break a lot of stuff, the application itself included. And here Veeam has something to offer: Veeam DataLabs. Maybe you have heard of SureBackup, our ultimate tool for verifying the consistency of backups. SureBackup is based on DataLabs, which allows you to create an isolated environment where you can test updates before bringing them to production.
Advice to those planning to buy Veeam Backup & Replication or switching from another solution
Sometimes in technical support we get cases that go like this: “We have designed our backup strategy like this, we acquired Veeam Backup & Replication, however we can’t seem to find a way to do X. Can you help with it?” (Most commonly such requests are about unusual retention policies or tape management). We are happy to help, but at times we have to explain that Veeam Backup & Replication works differently and they will need to change their design. Sure enough, customers are not happy to hear that. However, I believe they are following an incorrect approach.
Veeam Backup & Replication is very robust and flexible, and in its current form it can satisfy the needs of the vast majority of companies. But it is important to understand that it was designed with certain ideas in mind, and to make the product really shine, it is necessary to follow these ideas. Unfortunately, sometimes the reality is quite different. Here is what I imagine happens with some of the customers: They decide that they need a backup solution. So they sit down in a room and meticulously design each element of their strategy. Once done, they move on to choosing a backup solution, where Veeam Backup & Replication seems to be an obvious choice. In another scenario, the customer already has a backup solution and a developed backup strategy. However, for some reason their solution does not meet their expectations. So they decide to switch to Veeam and wish to carry their backup strategy over to Veeam Backup & Replication unchanged. My firm belief is that this process should go the other way around.
These days Veeam Backup & Replication has become a de facto standard among backup solutions, so probably any administrator would like to have a glance at it. However, if you are serious about implementation, Veeam Backup & Replication needs to be studied and tested. Once you know the capabilities and know this is what you are looking for, build your backup strategy specifically for Veeam Backup & Replication. You will be able to use the functionality to the maximum and reduce the risks, and support will have an easier time understanding the setup.
And that’s what I have for today’s episode. I hope that gave you something to consider.
source: Veeam