Logical Domains with IPMP

From Peter Pap's Technowiki

IP Multipathing (IPMP) is a way to add redundancy to your Solaris networking. In essence, you can connect one physical or, in this case, virtual server to multiple switches, so that if one network interface goes down, the server is still accessible over the network. The idea is that you take two or more interfaces and make them part of an interface group, then assign an IP to one or more of those interfaces. In the event of a link failure on one interface, another interface in the group takes over its IP. For a full explanation of IPMP, see the Oracle Solaris System Administration Guide: IP Services.

There are two types of IPMP, Link-based and Probe-based, and both can be set up as either active/active or active/standby. You can use either for physical servers. For LDOMs it's a little less clear-cut. All versions of LDOM manager support Probe-based IPMP, but you need version 1.3 or higher for Link-based. You will also need to have patched the 'vnet' driver in the LDOM to the latest version so that it is link-state-aware.

Probe-based IPMP for LDOMs

In Probe-based IPMP, one or more of the interfaces are given a real IP which the server can use for external communication. Each interface in the IPMP group also has a virtual interface assigned to it, which is given a test IP. The test IP is used to probe the network by sending out ICMP ping packets on the virtual test interface to either manually specified or automatically generated targets. If the probes from one of the test IPs fail, that interface is marked as failed, and one of the other interfaces in the IPMP group picks up the 'real' IP.

In the sample 'ifconfig' output below, you can see there are two interfaces, vnet0 and vnet1:

 vnet0: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
         inet 10.10.10.40 netmask ffffff00 broadcast 10.10.10.255
         groupname testing
         ether 0:14:4f:fa:a2:6b
 vnet0:1: flags=209040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,CoS> mtu 1500 index 2
         inet 10.10.10.42 netmask ffffff00 broadcast 10.10.10.255
 vnet1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 3
         inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
         groupname testing
         ether 0:14:4f:fa:87:50
 vnet1:1: flags=209040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,CoS> mtu 1500 index 3
         inet 10.10.10.43 netmask ffffff00 broadcast 10.10.10.255


vnet0 is an interface created on vsw0, and vnet1 is on vsw1. Each vswitch is configured against a different physical interface, and those physical interfaces are plugged into different physical switches. vnet0 has the external IP 10.10.10.40 and the test IP 10.10.10.42 on its virtual interface, vnet0:1. vnet1 has the dummy IP 0.0.0.0, as we only want to use one external IP for data. However, vnet1 has the test IP 10.10.10.43 assigned on vnet1:1. This is an active/active setup, which in this case means that both interfaces can be used for outbound traffic, so in theory you can double your outbound throughput. If vnet0 fails, vnet1 will create a new virtual interface and assign it the IP 10.10.10.40, thereby keeping the box on the network. If vnet1 fails, only outbound traffic is affected, as vnet1 has no 'real' IP.
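If you want to read output like this at a glance, a small filter can reduce it to one line per interface. The following is only a sketch: the function name summarize_ipmp is made up for this page, and it assumes an awk that supports user-defined functions (on Solaris 10 that probably means nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk). It treats any interface flagged NOFAILOVER as a test address and 0.0.0.0 as "no data address".

```shell
# Sketch: summarize `ifconfig -a` output read from stdin.
# Prints: <interface> <address> <role> <group>
# summarize_ipmp is an illustrative name, not a Solaris command.
summarize_ipmp() {
    awk '
    # Print the record collected for the previous interface, if any.
    function emit() {
        if (name == "") return
        if (addr == "0.0.0.0") role = "none"
        printf "%s %s %s %s\n", name, addr, role, (grp == "" ? "-" : grp)
    }
    # An interface header line: "<name>: flags=...<...> mtu ..."
    $2 ~ /^flags=/ {
        emit()
        name = $1
        sub(/:$/, "", name)
        role = ($2 ~ /NOFAILOVER/) ? "test" : "data"
        addr = "-"
        grp = ""
    }
    $1 == "inet"      { addr = $2 }
    $1 == "groupname" { grp = $2 }
    END { emit() }
    '
}
```

Run it as `ifconfig -a | summarize_ipmp`. The group name only appears on the physical interfaces, so the test interfaces (vnet0:1, vnet1:1) show "-" in the last column.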

So how do you do this? Here's how:

1. Configure the Control/Service Domain with two vswitches:

 ldm add-vsw net-dev=e1000g0 inter-vnet-link=off primary-vsw0 primary
 ldm add-vsw net-dev=e1000g1 inter-vnet-link=off primary-vsw1 primary 

Each vswitch is configured against a separate physical network port. The network ports will be plugged into at least two different switches.

2. Create two or more virtual network devices for the LDOM, one against each vswitch:

 ldm add-vnet vnet0 primary-vsw0 ldmtest
 ldm add-vnet vnet1 primary-vsw1 ldmtest

3. After you've booted the LDOM, edit the /etc/hosts file and add entries for the real and test IPs:

 10.10.10.40   ldmtest
 10.10.10.42   ldmtest-0
 10.10.10.43   ldmtest-1

4. Edit the file /etc/hostname.vnet0 and add the following:

 ldmtest netmask + broadcast + group testing up addif ldmtest-0 netmask + broadcast + -failover deprecated up

This creates the real interface vnet0 with the IP specified by the hostname ldmtest as part of the group testing. It also creates the test interface, vnet0:1, with the IP specified by the hostname ldmtest-0. The -failover option tells the box not to fail over the test IP. The deprecated option tells the box not to use the test interface for any application data. The 'netmask +' and 'broadcast +' options tell the box to use the default values for these interfaces: the netmask is looked up in the netmasks database (/etc/netmasks) and the broadcast address is derived from it.

5. Edit/create the file /etc/hostname.vnet1 and add the following:

 group testing up addif ldmtest-1 netmask + broadcast + -failover deprecated up

This makes the vnet1 interface part of the IPMP group testing and creates a test interface vnet1:1 with the IP specified by the hostname ldmtest-1. Again, the -failover option tells the box not to fail over the test IP, and the deprecated option tells the box not to use the test interface for any application data. In this case, the vnet1 interface has no IP and does not get any incoming traffic, but can send outgoing traffic. If you want this interface to have an IP and accept inbound traffic, then add this to /etc/hostname.vnet1 instead:

 ldmtest-sec netmask + broadcast + group testing up addif ldmtest-1 netmask + broadcast + -failover deprecated up

and add an entry for ldmtest-sec in the /etc/hosts file.

6. Reboot the box. You could also use ifconfig to assign all this without rebooting. Figure that out yourself :-)

That's it.
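As a starting point for the no-reboot route in step 6, here is a dry-run sketch. It assumes, as is the case on Solaris 10 as far as I know, that the contents of /etc/hostname.<interface> are simply the arguments handed to ifconfig for that interface at boot. The function name print_ifconfig_cmds is made up for this page, and it only prints the commands so you can review them before running anything.

```shell
# Sketch: print the ifconfig commands that would apply hostname.<if> files
# without a reboot. Nothing is executed; review the output first.
# print_ifconfig_cmds is an illustrative name, not a Solaris tool.
print_ifconfig_cmds() {
    dir=$1            # directory holding the hostname.<if> files, normally /etc
    shift
    for ifname in "$@"; do
        file="$dir/hostname.$ifname"
        if [ ! -r "$file" ]; then
            echo "missing $file" >&2
            return 1
        fi
        # Plumb the interface, then pass the file contents to ifconfig as arguments.
        printf '%s\n' "ifconfig $ifname plumb"
        printf '%s\n' "ifconfig $ifname $(cat "$file")"
    done
}
```

For the example above you would call `print_ifconfig_cmds /etc vnet0 vnet1`. If an interface is already plumbed, the plumb line is unnecessary. Running the printed commands from the LDOM console rather than over the network is the safer choice, since you are reconfiguring the interface you might be logged in through.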

Link-Based IPMP for LDOMs

Link-Based IPMP is a little easier to configure and uses fewer IP addresses. It relies on the vswitch and the virtual network interface being link-state-aware, i.e. being aware when the state of the physical link has changed. In Link-Based IPMP there are no test IPs assigned to the interfaces and no probes are sent out. The vswitch and virtual network interface are simply notified by the OS/physical driver that the link state has changed. The example below sets up an active/active IPMP group, where one interface is used for incoming traffic, but both are used for outbound traffic.

To set this up:

1. Create/Modify two or more vswitches with linkprop=phys-state enabled so that they are link-state-aware:

To create:

 ldm add-vsw net-dev=e1000g0 inter-vnet-link=off linkprop=phys-state primary-vsw0 primary
 ldm add-vsw net-dev=e1000g1 inter-vnet-link=off linkprop=phys-state primary-vsw1 primary

To modify existing vswitches:

 ldm set-vsw linkprop=phys-state primary-vsw0 primary
 ldm set-vsw linkprop=phys-state primary-vsw1 primary

2. Add/modify at least two virtual network interfaces to the LDOM with linkprop=phys-state enabled so that they are link-state-aware:

To create:

 ldm add-vnet linkprop=phys-state vnet0 primary-vsw0 ldmtest
 ldm add-vnet linkprop=phys-state vnet1 primary-vsw1 ldmtest

To modify existing virtual interfaces:

 ldm set-vnet linkprop=phys-state vnet0 ldmtest
 ldm set-vnet linkprop=phys-state vnet1 ldmtest

3. Boot the LDOM and edit /etc/hostname.vnet0, replacing the contents with:

 ldmtest netmask + broadcast + group testing up

This gives the interface the IP defined by the hostname 'ldmtest' (as defined in /etc/hosts) with the default netmask and broadcast addresses, and makes it a member of the IPMP group 'testing'.

4. Edit/create the file /etc/hostname.vnet1 and make the contents:

 group testing up

This makes vnet1 a member of the IPMP group 'testing'. This interface will only be used for outbound traffic.

5. Reboot the machine. (Yes, you can do all of this with ifconfig.)

6. After reboot, the output of ifconfig should look something like:

 root@ldmtest /etc# ifconfig -a
 lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
         inet 127.0.0.1 netmask ff000000
 vnet0: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
         inet 10.10.10.30 netmask ffffff00 broadcast 10.10.10.255
         groupname testing
         ether 0:14:4f:fb:7c:28
 vnet1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 3
         inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
         groupname testing
         ether 0:14:4f:fb:a1:f7

And now you're done.
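To confirm the group is healthy, you can check the ifconfig output for group membership and for the FAILED flag, which ifconfig shows on an interface that IPMP has marked as failed. This is a sketch only: check_ipmp_group is a made-up name, the "at least two members" rule is my own assumption about what counts as healthy, and it needs an awk with user-defined functions (nawk or /usr/xpg4/bin/awk on Solaris 10).

```shell
# Sketch: check IPMP group health from `ifconfig -a` output on stdin.
# Succeeds only if the named group has at least two member interfaces
# and none of them is flagged FAILED.
# check_ipmp_group is an illustrative name, not a Solaris command.
check_ipmp_group() {
    group=$1
    awk -v want="$group" '
    # Tally the previous interface if it belongs to the wanted group.
    function tally() {
        if (name != "" && grp == want) {
            members++
            if (failed) bad++
        }
    }
    # An interface header line: "<name>: flags=...<...> mtu ..."
    $2 ~ /^flags=/ {
        tally()
        name = $1
        sub(/:$/, "", name)
        failed = ($2 ~ /FAILED/)
        grp = ""
    }
    $1 == "groupname" { grp = $2 }
    END {
        tally()
        printf "%s: %d member(s), %d failed\n", want, members, bad
        exit (members >= 2 && bad == 0) ? 0 : 1
    }
    '
}
```

Use it as `ifconfig -a | check_ipmp_group testing`. With the output shown in step 6 it reports two members and no failures. Pulling the cable on one of the physical ports (or disabling the switch port) and running it again is a simple way to see whether the link state actually reaches the LDOM.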