Author Archives: Mike

About Mike

Michael Pruett, CISSP has a wide range of cyber-security and network engineering expertise. The plethora of vendors that resell hardware but have zero engineering knowledge resulting in the wrong hardware or configuration being deployed is a major pet peeve of Michael's. This site was started in an effort to spread information while providing the option of quality consulting services at a much lower price than Fortinet Professional Services. Owns PacketLlama.Com (Fortinet Hardware Sales) and Office Of The CISO, LLC (Cybersecurity consulting firm).

HA and failover protection

In FortiGate active-passive HA, the FortiGate Clustering Protocol (FGCP) provides failover protection. This means that an active-passive cluster can provide FortiGate services even when one of the cluster units encounters a problem that would result in complete loss of connectivity for a stand-alone FortiGate unit. This failover protection provides a backup mechanism that can be used to reduce the risk of unexpected downtime, especially in a mission-critical environment.

The FGCP supports three kinds of failover protection. Device failover automatically replaces a failed device and restarts traffic flow with minimal impact on the network. Link failover maintains traffic flow if a link fails. Session failover resumes communication sessions with minimal loss of data if a device or link failover occurs.

This chapter describes how FGCP failover protection works and provides detailed NAT/Route and Transparent mode packet flow descriptions.

 

About active-passive failover

To achieve failover protection in an active-passive cluster, one of the cluster units functions as the primary unit, while the rest of the cluster units are subordinate units, operating in an active stand-by mode. The cluster IP addresses and HA virtual MAC addresses are associated with the cluster interfaces of the primary unit. All traffic directed at the cluster is actually sent to and processed by the primary unit.

While the cluster is functioning, the primary unit functions as the FortiGate network security device for the networks that it is connected to. In addition, the primary unit and subordinate units use the HA heartbeat to keep in constant communication. The subordinate units report their status to the primary unit and receive and store connection and state table updates.

 

Device failure

If the primary unit encounters a problem that is severe enough to cause it to fail, the remaining cluster units negotiate to select a new primary unit. This occurs because all of the subordinate units are constantly waiting to negotiate to become primary units. Only the heartbeat packets sent by the primary unit keep the subordinate units from becoming primary units. Each received heartbeat packet resets a negotiation timer in each subordinate unit. If this timer is allowed to run out because the subordinate units do not receive heartbeat packets from the primary unit, the subordinate units assume that the primary unit has failed, and negotiate to become primary units themselves.
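The timer behaviour described above can be sketched in Python. This is a simplified model and not FGCP itself: the class name, the 1.2-second timeout, and the use of explicit timestamps are assumptions made for illustration only.

```python
class NegotiationTimer:
    """Toy model of a subordinate unit's negotiation timer (hypothetical)."""

    def __init__(self, timeout_s, now=0.0):
        self.timeout_s = timeout_s    # assumed value, not a FortiOS default
        self.last_heartbeat = now     # time the last heartbeat arrived

    def heartbeat_received(self, now):
        # Each heartbeat from the primary unit resets the timer.
        self.last_heartbeat = now

    def primary_considered_failed(self, now):
        # Once the timer runs out, the subordinate assumes the primary
        # unit has failed and would start negotiating.
        return (now - self.last_heartbeat) >= self.timeout_s


timer = NegotiationTimer(timeout_s=1.2)
timer.heartbeat_received(now=1.0)
print(timer.primary_considered_failed(now=2.0))  # 1.0 s since last heartbeat
print(timer.primary_considered_failed(now=2.5))  # 1.5 s since last heartbeat
```

Passing the current time in explicitly keeps the sketch deterministic; a real monitor would read a clock instead.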

Using the same FGCP negotiation process that occurs when the cluster starts up, after they determine that the primary unit has failed, the subordinate units negotiate amongst themselves to select a new primary unit. The subordinate unit that wins the negotiation becomes the new primary unit with the same MAC and IP addresses as the former primary unit. The new primary unit then sends gratuitous ARP packets out all of its interfaces to inform attached switches to send traffic to the new primary unit. Sessions then resume with the new primary unit.

 

Link failure

If a primary unit interface fails or is disconnected while a cluster is operating, a link failure occurs. When a link failure occurs, the cluster units negotiate to select a new primary unit. Since the primary unit has not stopped operating, it participates in the negotiation. The link failure means that a new primary unit must be selected and the cluster unit with the link failure joins the cluster as a subordinate unit.

Just as for a device failover, the new primary unit sends gratuitous ARP packets out all of its interfaces to inform attached switches to send traffic to it. Sessions then resume with the new primary unit.

If a subordinate unit experiences a link failure, its status in the cluster does not change. However, in future negotiations a cluster unit with a link failure is unlikely to become the primary unit.

 

Session failover

If you enable session failover (also called session pickup) for the cluster, during cluster operation the primary unit informs the subordinate units of changes to the primary unit connection and state tables, keeping the subordinate units up-to-date with the traffic currently being processed by the cluster.

After a failover the new primary unit recognizes open sessions that were being handled by the cluster. The sessions continue to be processed by the new primary unit and are handled according to their last known state.

If you leave session pickup disabled, the cluster does not keep track of sessions and after a failover, active sessions have to be restarted or resumed.
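The difference between the two settings can be illustrated with a small Python model. It is only a sketch: the class and function names are hypothetical, negotiation is not modelled, and the real FGCP synchronization is considerably more involved.

```python
class ClusterUnit:
    """Hypothetical stand-in for a cluster unit and its session table."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # session id -> last known state


def record_session(primary, subordinates, session_id, state, session_pickup):
    """Update the primary's table; copy the change to the subordinates
    only when session pickup is enabled."""
    primary.sessions[session_id] = state
    if session_pickup:
        for unit in subordinates:
            unit.sessions[session_id] = state


def fail_over(subordinates):
    """Promote the first subordinate (selection is not modelled here)."""
    return subordinates[0]


for pickup in (True, False):
    primary, backup = ClusterUnit("unit-a"), ClusterUnit("unit-b")
    record_session(primary, [backup], "tcp-1", "ESTABLISHED", pickup)
    new_primary = fail_over([backup])
    print("session pickup:", pickup, "known sessions:", new_primary.sessions)
```

With pickup enabled the new primary already holds the session's last known state; with it disabled the table is empty and the session has to be restarted.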

 

Primary unit recovery

If a primary unit recovers after a device or link failure, it will operate as a subordinate unit, unless the override CLI keyword is enabled and its device priority is set higher than the device priority of the other cluster units (see HA override).
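The rule in the previous sentence can be written as a small Python function. This is a simplification under stated assumptions: the function name is hypothetical and it deliberately ignores every other primary unit selection criterion.

```python
def role_after_recovery(own_priority, other_priorities, override_enabled):
    """Return the role a recovered unit takes, considering only the
    override setting and device priority (a deliberate simplification)."""
    higher_than_all = all(own_priority > p for p in other_priorities)
    if override_enabled and higher_than_all:
        return "primary"
    return "subordinate"


print(role_after_recovery(200, [128], override_enabled=True))
print(role_after_recovery(200, [128], override_enabled=False))
```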

HA diagnose commands

You can use the following diagnose command to display data about a cluster:

diagnose sys ha dump-by {all-xdb | all-vcluster | rcache | all-group | memory | debug-zone | vdom | kernel | device | stat | sesync}

The example output below is from a cluster of two FortiGate-5001Cs. In this cluster the base1 and base2 interfaces communicate the HA heartbeat and port monitoring has been added to port1.

 

all-xdb

This command displays information about the current configuration of the cluster and how it is operating. You can use the output to determine the primary unit, the state of port monitoring, and most cluster configuration details and status.

diagnose sys ha dump-by all-xdb

HA information.

idx=1,nxentry=2,linkfails=7,flags=0,digest=7.72.e3.2e.8e.d1…

xentry FG-5KC3E13800046 nhbdev=2,nventry=0, hops=0.
base1, 50, mac=0.9.f,bc.e.6c, neighbor=1.
id=FG-5KC3E13800084, mac=0.9.f,bc.11.18.
base2, 50, mac=0.9.f,bc.e.71, neighbor=1.
id=FG-5KC3E13800084, mac=0.9.f,bc.11.1d.

xentry FG-5KC3E13800084 nhbdev=2,nventry=1, hops=1.
base1, 50, mac=0.9.f,bc.11.18, neighbor=1.
id=FG-5KC3E13800046, mac=0.9.f,bc.e.6c.
base2, 50, mac=0.9.f,bc.11.1d, neighbor=1.
id=FG-5KC3E13800046, mac=0.9.f,bc.e.71.
npath=1,FG-5KC3E13800084
ventry idx=0,id=1,FG-5KC3E13800084,prio=128,0,claimed=0,override=0,flag=0,time=12974,mon=0
mondev=port1,50

idx=0,nxentry=2,linkfails=7,flags=3,digest=7.95.b.9.a8.5d…

xentry FG-5KC3E13800084 nhbdev=2,nventry=1, hops=0.
base1, 50, mac=0.9.f,bc.11.18, neighbor=1.
id=FG-5KC3E13800046, mac=0.9.f,bc.e.6c.
base2, 50, mac=0.9.f,bc.11.1d, neighbor=1.
id=FG-5KC3E13800046, mac=0.9.f,bc.e.71.
ventry idx=0,id=1,FG-5KC3E13800084,prio=128,0,claimed=0,override=0,flag=0,time=12974,mon=0
mondev=port1,50

xentry FG-5KC3E13800046 nhbdev=2,nventry=1, hops=1.
base1, 50, mac=0.9.f,bc.e.6c, neighbor=1.
id=FG-5KC3E13800084, mac=0.9.f,bc.11.18.
base2, 50, mac=0.9.f,bc.e.71, neighbor=1.
id=FG-5KC3E13800084, mac=0.9.f,bc.11.1d.
npath=1,FG-5KC3E13800046
ventry idx=0,id=1,FG-5KC3E13800046,prio=128,0,claimed=0,override=0,flag=0,time=2,mon=0
mondev=port1,50

 

all-vcluster

This command displays the status and configuration of the individual cluster units. You can use the output of this command to determine the primary unit and the status of each cluster unit.

diagnose sys ha dump-by all-vcluster

HA information.

vcluster id=1, nventry=2, state=work, digest=5.f8.d1.63.4d.d2…

ventry idx=0,id=1,FG-5KC3E13800046,prio=128,0,claimed=0,override=0,flag=1,time=0,mon=0 mondev=port1,50

ventry idx=1,id=1,FG-5KC3E13800084,prio=128,0,claimed=0,override=0,flag=0,time=12974,mon=0
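If you want to compare ventry lines from several units, a short script can pull out the labelled fields. The following Python sketch parses only what is visible in the sample output above. The function name is hypothetical, the serial-number pattern assumes the FG- prefix seen in this example, and no meaning is assigned to the individual fields.

```python
import re


def parse_ventry(line):
    """Split one 'ventry' line into its serial number and key=value fields.

    Unlabelled values (such as the bare number after prio=...) are ignored.
    """
    fields = dict(re.findall(r"(\w+)=([^,\s]+)", line))
    serial = re.search(r"\bFG-\w+", line)  # assumes the FG- prefix shown above
    fields["serial"] = serial.group(0) if serial else None
    return fields


line = ("ventry idx=0,id=1,FG-5KC3E13800046,prio=128,0,claimed=0,"
        "override=0,flag=1,time=0,mon=0 mondev=port1,50")
entry = parse_ventry(line)
print(entry["serial"], entry["prio"], entry["override"], entry["mondev"])
```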

 

stat

This command displays some statistics about how well the cluster is functioning. Information includes packet counts, memory use, failed links and ping failures.

diagnose sys ha dump-by stat

HA information.

packet count = 1, memory = 220.

check_linkfails = 0, linkfails = 0, check_pingsvrfails = 2822

bufcnt = -5, bufmem = 0

 

Adding a disconnected FortiGate unit back to its cluster

If you disconnect a FortiGate unit from a cluster, you can re-connect the disconnected FortiGate unit to the cluster by setting the HA mode of the disconnected unit to match the HA mode of the cluster. Usually the disconnected unit rejoins the cluster as a subordinate unit and the cluster automatically synchronizes its configuration.

You do not have to change the HA password on the disconnected unit unless the HA password has been changed after the unit was disconnected. Disconnecting a unit from a cluster does not change the HA password.

You should make sure that the device priority of the disconnected unit is lower than the device priority of the current primary unit. You should also make sure that the HA override CLI option is not enabled on the disconnected unit. Otherwise, when the disconnected unit joins the cluster, the cluster will renegotiate and the disconnected unit may become the primary unit. If this happens, the configuration of the disconnected unit is synchronized to all other cluster units. This configuration change might disrupt the operation of the cluster.
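The two precautions above lend themselves to a quick pre-check before you rejoin a unit. The Python sketch below is an illustration only: the function name and message strings are hypothetical, and the priority and override values are assumed to have been read from the units by hand.

```python
def rejoin_warnings(unit_priority, primary_priority, override_enabled):
    """List the reasons a rejoining unit might take over as primary."""
    warnings = []
    if unit_priority >= primary_priority:
        warnings.append("device priority is not lower than the primary unit's")
    if override_enabled:
        warnings.append("HA override is enabled on the rejoining unit")
    return warnings


print(rejoin_warnings(50, 128, override_enabled=False))
print(rejoin_warnings(200, 128, override_enabled=True))
```

An empty list means neither of the two conditions described above applies.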

The following procedure assumes that the disconnected FortiGate unit is correctly physically connected to your network and to the cluster but is not running in HA mode and not part of the cluster.

Before you start this procedure you should note the device priority of the primary unit.

 

To add a disconnected FortiGate unit back to its cluster – web-based manager

1. Log into the disconnected FortiGate unit.

If virtual domains are enabled, log in as the admin administrator and select Global Configuration.

2. Go to System > HA.

3. Change Mode to match the mode of the cluster.

4. If required, change the group name and password to match the cluster.

5. Set the Device Priority lower than the device priority of the primary unit.

6. Select OK.

The disconnected FortiGate unit joins the cluster.

 

 

To add a disconnected FortiGate unit back to its cluster – CLI

1. Log into the CLI of the FortiGate unit to be added back to the cluster.

2. Enter the following command to access the global configuration and add the FortiGate unit back to a cluster operating in active-passive mode and set the device priority to 50 (a low number) so that this unit will not become the primary unit:

config global

config system ha

set mode a-p

set priority 50

end

end

You may also have to change the group name, group ID, and password. However, if you have not changed these for the cluster or the FortiGate unit after it was disconnected from the cluster, you should not have to adjust them now.

Disconnecting a cluster unit from a cluster

Use the following procedures to disconnect a cluster unit from a functioning cluster without disrupting the operation of the cluster. You can disconnect a cluster unit if you need to use the disconnected FortiGate unit for another purpose, such as to act as a standalone firewall.

You can use the following procedures for a standard cluster and for a virtual clustering configuration. To use the following procedures from a virtual cluster you must be logged in as the admin administrator and you must have selected Global Configuration.

When you disconnect a cluster unit, you must assign an IP address and netmask to one of the interfaces of the disconnected unit. You can disconnect any unit from the cluster, even the primary unit. After the unit is disconnected, the cluster responds as if the disconnected unit has failed. The cluster may renegotiate and may select a new primary unit.

When the cluster unit is disconnected the HA mode is changed to standalone. In addition, all interface IP addresses of the disconnected unit are set to 0.0.0.0 except for the interface that you configure.

Otherwise the configuration of the disconnected unit is not changed. The HA configuration of the disconnected unit is not changed either (except to change the HA mode to Standalone).

 

To disconnect a cluster unit from a cluster – web-based manager

1. Go to System > HA to view the cluster members list.

2. Select the Disconnect from cluster icon for the cluster unit to disconnect from the cluster.

3. Select the interface that you want to configure. You also specify the IP address and netmask for this interface.

When the FortiGate unit is disconnected, all management access options are enabled for this interface.

4. Specify an IP address and netmask for the interface. You can use this IP address to connect to the interface to configure the disconnected FortiGate unit.

5. Select OK.

The FortiGate unit is disconnected from the cluster and the cluster may renegotiate and select a new primary unit. The selected interface of the disconnected unit is configured with the specified IP address and netmask.

 

To disconnect a cluster unit from a cluster – CLI

1. Enter the following command to disconnect a cluster unit with serial number FGT5002803033050. The internal interface of the disconnected unit is set to IP address 1.1.1.1 and netmask 255.255.255.0.

execute ha disconnect FGT5002803033050 internal 1.1.1.1 255.255.255.0
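If you generate this command from a script, it helps to validate the address and netmask first. The Python sketch below assembles the command string in the same argument order as the example above. The helper name is hypothetical, and the script only builds text; it does not talk to a FortiGate unit.

```python
import ipaddress


def ha_disconnect_command(serial, interface, ip, netmask):
    """Build an 'execute ha disconnect' command after basic validation."""
    ipaddress.IPv4Address(ip)                    # ValueError if not an IPv4 address
    ipaddress.IPv4Network(f"0.0.0.0/{netmask}")  # ValueError if not a valid netmask
    return f"execute ha disconnect {serial} {interface} {ip} {netmask}"


print(ha_disconnect_command("FGT5002803033050", "internal",
                            "1.1.1.1", "255.255.255.0"))
```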

Managing individual cluster units

The following procedure describes how to use SSH to log into the primary unit CLI and from there to use the execute ha manage command to connect to the CLI of any other unit in the cluster. The procedure is very similar if you use telnet, or the web-based manager dashboard CLI console.

You can use the execute ha manage command from the CLI of any cluster unit to log into the CLI of another cluster unit. Usually you would use this command from the CLI of the primary unit to log into the CLI of a subordinate unit. However, if you have logged into a subordinate unit CLI, you can use this command to log into the primary unit CLI, or the CLI of another subordinate unit.

Using SSH or telnet or the web-based manager dashboard CLI console you can only log into the primary unit CLI. Using a direct console connection you can log into any cluster unit. In both cases you can use execute ha manage to connect to the CLI of other cluster units.

You log into the subordinate unit using the FGT_ha_admin administrator account. This built-in administrator account gives you read and write permission on the subordinate unit. Normally this built-in administrative account is not visible; however, FGT_ha_admin does appear in event log messages.

1. Use SSH to connect to the cluster and log into the primary unit CLI.

Connect to any cluster interface configured for SSH administrative access to log into the cluster.

2. Enter the following command followed by a space and type a question mark (?):

execute ha manage

The CLI displays a list of all the subordinate units in the cluster. Each cluster unit is numbered, starting at 1. The information displayed for each cluster unit includes the unit serial number and the host name of the unit.

3. Complete the command with the number of the subordinate unit to log into. For example, to log into subordinate unit 1, enter the following command:

execute ha manage 1

Press Enter to connect to and log into the CLI of the selected subordinate unit. If this subordinate unit has a different host name, the CLI prompt changes to this host name.

You can use CLI commands to manage this subordinate unit. If you make changes to the configuration of any cluster unit (primary or subordinate unit) these changes are synchronized to all cluster units.

4. You can now use the execute ha manage command to connect to any other cluster unit (including the primary unit). You can also use the exit command to return to the primary unit CLI.

About the HA cluster index and the execute ha manage command

When a cluster starts up, the FortiGate Clustering Protocol (FGCP) assigns a cluster index and an HA heartbeat IP address to each cluster unit based on the serial number of the cluster unit. The FGCP selects the cluster unit with the highest serial number to become the primary unit. The FGCP assigns a cluster index of 0 and an HA heartbeat IP address of 169.254.0.1 to this unit. The FGCP assigns a cluster index of 1 and an HA heartbeat IP address of 169.254.0.2 to the cluster unit with the second highest serial number. If the cluster contains more units, the cluster unit with the third highest serial number is assigned a cluster index of 2 and an HA heartbeat IP address of 169.254.0.3, and so on. You can display the cluster index assigned to each cluster unit using the get system ha status command. Also, when you use the execute ha manage command, you select a cluster unit to log into by entering its cluster index.

The cluster index and HA heartbeat IP address only change if a unit leaves the cluster or if a new unit joins the cluster. When one of these events happens, the FGCP resets the cluster index and HA heartbeat IP address of each cluster unit according to serial number in the same way as when the cluster first starts up.

Each cluster unit keeps its assigned cluster index and HA heartbeat IP address even as the units take on different roles in the cluster. After the initial cluster index and HA heartbeat IP addresses are set according to serial number, the FGCP checks other primary unit selection criteria such as device priority and monitored interfaces. Checking these criteria could result in selecting a cluster unit without the highest serial number to operate as the primary unit.

Even if the cluster unit without the highest serial number now becomes the primary unit, the cluster indexes and HA heartbeat IP addresses assigned to the individual cluster units do not change. Instead the FGCP assigns a second cluster index, which could be called the operating cluster index, to reflect this role change. The operating cluster index is 0 for the primary unit and 1 and higher for the other units in the cluster. By default both sets of cluster indexes are the same. But if primary unit selection selects the cluster unit that does not have the highest serial number to be the primary unit then this cluster unit is assigned an operating cluster index of 0. The operating cluster index is used by the FGCP only. You can display the operating cluster index assigned to each cluster unit using the get system ha status command. There are no CLI commands that reference the operating cluster index.

Even though there are two cluster indexes there is only one HA heartbeat IP address and the HA heartbeat address is not affected by a change in the operating cluster index.
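The numbering scheme described in this section can be reproduced in a few lines of Python. Treat this as a sketch under stated assumptions: the function names are hypothetical, serial numbers are compared as plain strings, and the ordering of the non-primary units in the operating index (by serial number) is an assumption, since the text only says they are numbered 1 and higher.

```python
def assign_cluster_indexes(serials):
    """Cluster index and HA heartbeat IP by descending serial number."""
    ordered = sorted(serials, reverse=True)  # assumes string order is enough
    return {
        serial: {"cluster_index": i, "heartbeat_ip": f"169.254.0.{i + 1}"}
        for i, serial in enumerate(ordered)
    }


def operating_indexes(serials, primary_serial):
    """Operating index: 0 for the acting primary, 1 and higher for the rest."""
    others = sorted((s for s in serials if s != primary_serial), reverse=True)
    return {serial: i for i, serial in enumerate([primary_serial] + others)}


units = ["FG50012204400045", "FG50012205400050"]
print(assign_cluster_indexes(units))
print(operating_indexes(units, primary_serial="FG50012204400045"))
```

In the second call the unit with the lower serial number is acting as primary, so it gets operating index 0 while its cluster index and heartbeat IP address from the first call stay unchanged.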

 

Using the execute ha manage command

When you use the CLI command execute ha manage <index_integer> to connect to the CLI of another cluster unit, the <index_integer> that you enter is the cluster index of the unit that you want to connect to.

 

Using get system ha status to display cluster indexes

You can display the cluster index assigned to each cluster unit using the CLI command get system ha status. The following example shows the information displayed by the get system ha status command for a cluster consisting of two FortiGate-5001SX units operating in active-passive HA mode with virtual domains not enabled and without virtual clustering.

get system ha status

Model: 5000

Mode: a-p

Group: 0

Debug: 0

ses_pickup: disable

Master:128 5001_slot_7 FG50012205400050 0

Slave :128 5001_slot_11 FG50012204400045 1

number of vcluster: 1

vcluster 1: work 169.254.0.1

Master:0 FG50012205400050

Slave :1 FG50012204400045

In this example, the cluster unit with serial number FG50012205400050 has the highest serial number and so has a cluster index of 0 and the cluster unit with serial number FG50012204400045 has a cluster index of 1. From the CLI of the primary (or master) unit of this cluster you can connect to the CLI of the subordinate (or slave) unit using the following command:

execute ha manage 1

This works because the cluster unit with serial number FG50012204400045 has a cluster index of 1.

The get system ha status command output shows two similar lists of indexes and serial numbers. The sixth and seventh lines of the command output list the cluster indexes assigned according to cluster unit serial number. These are the cluster indexes that you enter when using the execute ha manage command. The cluster indexes shown in the last two lines of the command output are the operating cluster indexes that reflect how the cluster units are actually operating in the cluster. In this example both sets of cluster indexes are the same.

The last three lines of the command output display the status of vcluster 1. In a cluster consisting of two cluster units operating without virtual domains enabled all clustering actually takes place in virtual cluster 1. HA is designed to work this way to support virtual clustering. If this cluster was operating with virtual domains enabled, adding virtual cluster 2 is similar to adding a new copy of virtual cluster 1. Virtual cluster 2 is visible in the get system ha status command output when you add virtual domains to virtual cluster 2.

The HA heartbeat IP address displayed on the vcluster 1 line is the HA heartbeat IP address of the cluster unit that is actually operating as the primary unit. For a default configuration this IP address will always be 169.254.0.1 because the cluster unit with the highest serial number will be the primary unit. This IP address changes if the operating primary unit is not the cluster unit with the highest serial number.
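To compare the two sets of indexes without reading the output by eye, you can parse it. The Python sketch below handles only the layout of the example above, with each field on its own line. The function name and regular expressions are assumptions based on this one sample, and other FortiOS versions may format the output differently.

```python
import re

SAMPLE = """\
Model: 5000
Mode: a-p
Group: 0
Debug: 0
ses_pickup: disable
Master:128 5001_slot_7 FG50012205400050 0
Slave :128 5001_slot_11 FG50012204400045 1
number of vcluster: 1
vcluster 1: work 169.254.0.1
Master:0 FG50012205400050
Slave :1 FG50012204400045
"""

# priority, host name, serial number, cluster index
UNIT = re.compile(r"^(Master|Slave)\s*:\s*(\d+)\s+(\S+)\s+(\S+)\s+(\d+)$")
# vcluster id, state, HA heartbeat IP address
VCLUSTER = re.compile(r"^vcluster\s+(\d+):\s*(\S+)\s+(\S+)$")
# operating cluster index, serial number
VMEMBER = re.compile(r"^(Master|Slave)\s*:\s*(\d+)\s+(\S+)$")


def parse_ha_status(text):
    """Return (cluster indexes by serial, per-vcluster details)."""
    cluster_index, vclusters, current = {}, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if m := UNIT.match(line):
            cluster_index[m.group(4)] = int(m.group(5))
        elif m := VCLUSTER.match(line):
            current = int(m.group(1))
            vclusters[current] = {"state": m.group(2),
                                  "heartbeat_ip": m.group(3),
                                  "operating_index": {}}
        elif (m := VMEMBER.match(line)) and current is not None:
            vclusters[current]["operating_index"][m.group(3)] = int(m.group(2))
    return cluster_index, vclusters


indexes, vclusters = parse_ha_status(SAMPLE)
print(indexes)
print(vclusters[1])
print("indexes match:", indexes == vclusters[1]["operating_index"])
```

The final comparison is the check this section describes: when it is false, the acting primary unit is not the unit with the highest serial number.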

 

Example: actual and operating cluster indexes do not match

This example shows get system ha status command output for the same cluster of two FortiGate-5001SX units. However, in this example the device priority of the cluster unit with the serial number FG50012204400045 is increased to 200. As a result the cluster unit with the lowest serial number becomes the primary unit. This means the actual and operating cluster indexes of the cluster units do not match.

get system ha status

Model: 5000

Mode: a-p

Group: 0

Debug: 0

ses_pickup: disable

Master:128 5001_slot_7 FG50012205400050 0

Slave :200 5001_slot_11 FG50012204400045 1

number of vcluster: 1

vcluster 1: work 169.254.0.2

Master:1 FG50012205400050

Slave :0 FG50012204400045

The actual cluster indexes have not changed but the operating cluster indexes have. Also, the HA heartbeat IP address displayed for vcluster 1 has changed to 169.254.0.2.

 

Virtual clustering example output

The get system ha status command output is the same if a cluster is operating with virtual clustering turned on but with all virtual domains in virtual cluster 1. The following get system ha status command output example shows the same cluster operating as a virtual cluster with virtual domains added to both virtual cluster 1 and virtual cluster 2. In this example the cluster unit with serial number FG50012204400045 is the primary unit for virtual cluster 1 and the cluster unit with serial number FG50012205400050 is the primary unit for virtual cluster 2.

get system ha status

Model: 5000

Mode: a-p

Group: 0

Debug: 0

ses_pickup: disable

Master:128 5001_slot_7 FG50012205400050 0

Slave :200 5001_slot_11 FG50012204400045 1

number of vcluster: 2

vcluster 1: work 169.254.0.2

Master:1 FG50012205400050

Slave :0 FG50012204400045

vcluster 2: standby 169.254.0.1

Master:0 FG50012205400050

Slave :1 FG50012204400045

This example shows three sets of indexes. The indexes in lines six and seven are still used by the execute ha manage command. The indexes on lines ten and eleven are for the primary and subordinate units in virtual cluster 1 and the indexes on the last two lines are for virtual cluster 2.

 

Monitoring cluster units for failover

If the primary unit in the cluster fails, the units in the cluster renegotiate to select a new primary unit. Failure of the primary unit results in the following:

  • If SNMP is enabled, the new primary unit sends HA trap messages. The messages indicate a cluster status change, HA heartbeat failure, and HA member down.
  • If event logging is enabled and HA activity event is selected, the new primary unit records log messages that show that the unit has become the primary unit.
  • If alert email is configured to send email for HA activity events, the new primary unit sends an alert email containing the log message recorded by the event log.
  • The cluster contains fewer FortiGate units. The failed primary unit no longer appears on the Cluster Members list.
  • The host name and serial number of the primary unit change. You can see these changes when you log into the web-based manager or CLI.
  • The cluster info displayed on the dashboard, cluster members list or from the get system ha status command changes.

 

If a subordinate unit fails, the cluster continues to function normally. Failure of a subordinate unit results in the following:

  • If event logging is enabled and HA activity event is selected, the primary unit records log messages that show that a subordinate has been removed from the cluster.
  • If alert email is configured to send email for HA activity events, the primary unit sends an alert email containing the log message recorded by the event log.
  • The cluster contains fewer FortiGate units. The failed unit no longer appears on the Cluster Members list.
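Two of the symptoms listed above, a missing cluster member and a changed primary unit serial number, are easy to detect by comparing two snapshots of the cluster. The Python sketch below assumes you already have the serial numbers from some source such as saved get system ha status output. The function name and the snapshot structure are hypothetical.

```python
def describe_cluster_change(before, after):
    """Compare two snapshots shaped like
    {'primary': <serial>, 'members': <set of serials>}."""
    changes = []
    for serial in sorted(before["members"] - after["members"]):
        changes.append(f"unit {serial} is no longer in the cluster")
    if before["primary"] != after["primary"]:
        changes.append(
            f"primary unit changed from {before['primary']} to {after['primary']}"
        )
    return changes


before = {"primary": "FG50012205400050",
          "members": {"FG50012205400050", "FG50012204400045"}}
after = {"primary": "FG50012204400045",
         "members": {"FG50012204400045"}}
for change in describe_cluster_change(before, after):
    print(change)
```

A subordinate unit failure would produce only the first kind of message, because the primary unit stays the same.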

 

Viewing cluster status from the CLI

Use the get system ha status command to display information about an HA cluster. The command displays general HA configuration settings. The command also displays information about how the cluster unit that you have logged into is operating in the cluster.

Usually you would log into the primary unit CLI using SSH or telnet. In this case the get system ha status command displays information about the primary unit first, and also displays the HA state of the primary unit (the primary unit operates in the work state). However, if you log into the primary unit and then use the execute ha manage command to log into a subordinate unit (or if you use a console connection to log into a subordinate unit), the get system ha status command displays information about this subordinate unit first, and also displays the HA state of this subordinate unit. The state of a subordinate unit is work for an active-active cluster and standby for an active-passive cluster.

For a virtual cluster configuration, the get system ha status command displays information about how the cluster unit that you have logged into is operating in virtual cluster 1 and virtual cluster 2. For example, if you connect to the cluster unit that is the primary unit for virtual cluster 1 and the subordinate unit for virtual cluster 2, the output of the get system ha status command shows virtual cluster 1 in the work state and virtual cluster 2 in the standby state. The get system ha status command also displays additional information about virtual cluster 1 and virtual cluster 2.

The command display includes the following fields.

Field               Description

Model               The FortiGate model number.

Mode                The HA mode of the cluster: a-a or a-p.

Group               The group ID of the cluster.

Debug               The debug status of the cluster.

ses_pickup          The status of session pickup: enable or disable.

load balance        The status of the load-balance-all keyword: enable or disable. Relevant to active-active clusters only.

schedule            The active-active load balancing schedule. Relevant to active-active clusters only.

Master, Slave       Master displays the device priority, host name, serial number, and cluster index of the primary (or master) unit. Slave displays the device priority, host name, serial number, and cluster index of the subordinate (or slave, or backup) unit or units. The list of cluster units changes depending on how you log into the CLI. Usually you would use SSH or telnet to log into the primary unit CLI. In this case the primary unit would be at the top of the list followed by the other cluster units. If you use execute ha manage or a console connection to log into a subordinate unit CLI, and then enter get system ha status, the subordinate unit that you have logged into appears at the top of the list of cluster units.

number of vcluster  The number of virtual clusters. If virtual domains are not enabled, the cluster has one virtual cluster. If virtual domains are enabled, the cluster has two virtual clusters.