Closed for Maintenance: Things You Should Know About Maintenance Mode

Admins shut down their hosts for servicing from time to time. When one node is taken down for maintenance, the resources of a vSAN cluster have to be redistributed, and this is where maintenance mode comes into play. Today, I’d like to discuss the whole idea of maintenance mode and its options.


Enabling maintenance mode on a standalone host

Once the host enters maintenance mode, its icon and state in the Summary tab change. The new icon means that the host accepts no I/O and has no active client network sessions. For these two reasons, you cannot run or create VMs on that host (they all are basically shut down) while it is under maintenance.


The host can leave maintenance mode automatically (after some process finishes) or at the user’s request. Note that rebooting alone won’t return the host to normal operation. When you apply updates with vSphere Update Manager, though, there’s an option to exit maintenance mode after the reboot.

How to put a host in maintenance mode?

You can activate maintenance mode via vCenter.


For a standalone host, you can enter this mode in the web console too.


For CLI-minded users, there’s a way to enter the mode from an SSH session. Here are 3 commands that may come in handy:

  • esxcli system maintenanceMode get reports whether maintenance mode is enabled.

  • esxcli system maintenanceMode set --enable true enables maintenance mode on the host.

  • esxcli system maintenanceMode set --enable false disables maintenance mode.
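The three commands above can be combined into a small helper that only flips the switch when it is needed. This is a minimal sketch, not an official script: the function names are my own, and it assumes that the get command prints exactly "Enabled" or "Disabled".

```shell
#!/bin/sh
# Sketch: enter maintenance mode only if the host is not already in it.
# Assumption: `esxcli system maintenanceMode get` prints "Enabled" or "Disabled".

# Succeeds (exit status 0) when the host is already in maintenance mode.
in_maintenance_mode() {
    [ "$(esxcli system maintenanceMode get)" = "Enabled" ]
}

# Hypothetical helper name; enables maintenance mode unless it is already on.
enter_maintenance_mode() {
    if in_maintenance_mode; then
        echo "already in maintenance mode"
    else
        esxcli system maintenanceMode set --enable true &&
            echo "maintenance mode enabled"
    fi
}
```

After pasting the functions into an SSH session on the host, calling enter_maintenance_mode is safe to repeat: a second call just reports that the mode is already on.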


You can do the same from a vCenter Server instance with PowerCLI. Here are some commands:

  • Connect-VIServer "My vCenter IP" -user "[email protected]" -password "password" connects you to a vCenter Server instance.

  • Get-VMHost -name "My ESXi host IP" reports the current host state.

  • Set-VMHost -VMHost "My ESXi host IP" -State "Maintenance" -RunAsync puts the host into maintenance mode.

  • Set-VMHost -VMHost "My ESXi host IP" -State "Connected" -RunAsync returns the host to its normal state.

  • Disconnect-VIServer 172.16.10.5 -confirm:$false disconnects you from the vCenter Server instance.


How does it work in a vSAN cluster?

Before I move any further, I’d like to clarify the whole concept of maintenance mode for a host in a vSAN cluster. Long story short: by putting a host under maintenance, you basically disconnect it from the cluster. In other words, you temporarily remove its capacity and compute power from that cluster. Of course, this triggers the workload distribution mechanisms, but you need to be really careful: your VMs may become a bit sluggish, and there may even be stability risks.

When you enable maintenance mode on a host, a message pops up and offers three maintenance mode options to mitigate the risks:

1. Full data migration

2. Ensure accessibility

3. No data migration
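The same three choices can also be made from an SSH session. As far as I know, esxcli takes them through a --vsanmode argument with the values evacuateAllData, ensureObjectAccessibility, and noAction; treat that as an assumption and check esxcli system maintenanceMode set --help on your build. The sketch below maps the option names from the list above to those values; the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: translate the option names from the dialog into esxcli values.
# Assumption: --vsanmode and its three values exist in your esxcli version.

vsan_mode_for() {
    case "$1" in
        "Full data migration")  echo "evacuateAllData" ;;
        "Ensure accessibility") echo "ensureObjectAccessibility" ;;
        "No data migration")    echo "noAction" ;;
        *) echo "unknown option: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: enter maintenance mode with the chosen option.
enter_with_option() {
    mode=$(vsan_mode_for "$1") || return 1
    esxcli system maintenanceMode set --enable true --vsanmode "$mode"
}
```

For example, enter_with_option "Ensure accessibility" would request the default behavior explicitly, while an unknown name fails before esxcli is ever called.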


Full data migration

That’s the right option if you expect the host to stay shut down for a long time. Keep in mind that migration is an I/O-intensive process that puts a heavy load on the network, so schedule it for a time when it won’t overlap with your production activity. Good news: the Enter Maintenance Mode wizard shows the approximate amount of data that has to be migrated (this figure can be translated into time). Note that the host cannot be shut down until the data transfer is over. The scheme below shows how data is evacuated.

Figure: Migrating all the data from the host
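To translate the amount of data reported by the wizard into time, a back-of-the-envelope calculation is usually enough. The sketch below is only an illustration: estimate_minutes is a hypothetical helper, and the throughput is a number you have to measure or guess for your own vSAN network, since real evacuation competes with production I/O and rarely runs at a steady rate.

```shell
#!/bin/sh
# Sketch: rough evacuation time from data size and sustained throughput.
# Usage: estimate_minutes <data_in_GB> <throughput_in_MB_per_s>
# Assumption: 1 GB = 1024 MB and the throughput stays constant.

estimate_minutes() {
    data_gb=$1
    mbps=$2
    if [ "$mbps" -le 0 ]; then
        echo "throughput must be positive" >&2
        return 1
    fi
    seconds=$(( data_gb * 1024 / mbps ))
    # Round up to whole minutes.
    echo $(( (seconds + 59) / 60 ))
}
```

With these assumptions, estimate_minutes 500 100 prints 86, meaning that 500 GB at a sustained 100 MB/s takes roughly an hour and a half.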

To start migrating all the data from the host, select Full data migration in the Enter Maintenance Mode wizard.

Ensure accessibility

Ensure accessibility is the default maintenance mode option. It works fine when you shut down a host for a short time (e.g., to update ESXi or swap some worn-out parts). Unlike the option discussed above, this one is not recommended when you expect the host to be down for more than one day: there’s a real risk of performance and stability degradation.

This mode strikes a balance between stability and migration duration. Only a partial migration is done: just as much data as is needed to keep VMs up while the server stays shut down. Here’s a scheme showing how Ensure accessibility works. Red flags indicate data that is unavailable for as long as the node is stopped.

Figure: Ensure accessibility mode
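The rule of thumb from this section and the previous one (short outage: Ensure accessibility; more than a day: Full data migration) fits into a few lines of shell. This is just my reading of the guidance turned into code, not a VMware recommendation engine; pick_vsan_mode is a hypothetical name, and the values it prints are the esxcli --vsanmode values as I understand them.

```shell
#!/bin/sh
# Sketch: choose a data migration mode from the expected downtime in hours.
# Assumption: "more than one day" is taken literally as more than 24 hours.

pick_vsan_mode() {
    hours=$1
    if [ "$hours" -gt 24 ]; then
        echo "evacuateAllData"            # Full data migration
    else
        echo "ensureObjectAccessibility"  # Ensure accessibility
    fi
}
```

No data migration is left out on purpose: it is a risk you should accept by hand, not something a script should pick for you.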

Note that enabling Ensure accessibility mode may lead to changes in storage policy compliance. The cluster temporarily loses some resources, so vSAN tries to prevent performance degradation, instability, and data loss. The Enter Maintenance Mode wizard shows how much data needs to be evacuated and how many objects may become incompatible with the new storage policy. Knowing the amount of data to be transferred, you can free some datastore space in advance if needed.


By default, after you set the host in maintenance mode, there’s a 60-minute window before synchronization starts and new storage policies are applied to VMs. You can prolong that waiting time by setting a greater Object Repair Time value.
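Before a short maintenance window, it helps to compare the planned downtime with that timer. The sketch below rests on an assumption: that the timer is readable on the host as the advanced setting /VSAN/ClomRepairDelay (in minutes) and that esxcli prints it on an "Int Value:" line. On newer vSAN releases the value is managed cluster-wide in vCenter, so the per-host reading may not be authoritative. Both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: will the planned downtime outlast the repair delay?
# Assumption: the delay is exposed as /VSAN/ClomRepairDelay, in minutes.

# Print the current repair delay as reported by esxcli.
repair_delay_minutes() {
    esxcli system settings advanced list -o /VSAN/ClomRepairDelay |
        awk -F': *' '/^ *Int Value:/ { print $2; exit }'
}

# Succeeds when the planned downtime (minutes) exceeds the delay (minutes),
# i.e. when vSAN is expected to start resynchronizing before the host returns.
rebuild_expected() {
    planned=$1
    delay=$2
    [ "$planned" -gt "$delay" ]
}
```

A check such as rebuild_expected 90 "$(repair_delay_minutes)" succeeding is a hint to either shorten the window or raise the Object Repair Time value first.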


No data migration

As the name of this option implies, no data is migrated at all, which makes it the fastest way to put the host into maintenance mode. It is also the most dangerous one: there’s a risk that some VMs go down because they are pinned to the host that leaves the cluster.

Figure: Maintenance mode with no data migration

If you are ready to take that risk, select No data migration in the Enter Maintenance Mode wizard.

Conclusion

Maintenance mode makes managing vSphere environments more convenient; nevertheless, you should be aware of the risks associated with this feature. I believe this article covers maintenance mode well enough and provides the important information you should know before enabling it.