[lustre-devel] Is network interface fail-over on same server on the road map?

Wed Aug 8 20:55:00 PDT 2012

You are correct.  A node chooses a single LNet network (ib0, ib1, or eth0) over which to talk when the connection is established, and never changes its mind.  If the server becomes unresponsive, the client will try one network on the failover list.
The reason you can't fail over to a second network is that the server state in that case is not "just failed over and awaiting replay" but is instead "perfectly happy in normal operating mode", so the server doesn't know what to do with this new client that wants to do replay.
This could be fixed by more sophisticated handling on the server, or by a new LNet failover mode, but I don't think this is in anybody's plans at the moment.
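
To make the behaviour above concrete, here is a minimal Python sketch that models it using the mount string from the quoted message below. It is only an illustration of what this thread describes, not Lustre code: the function names parse_mount_spec and failover_candidates are hypothetical, and the NIDs are written with "@" (address@network), the way Lustre writes them.

```python
def parse_mount_spec(spec):
    """Split 'nid,nid:nid,nid:/fsname' into (per-server NID lists, fsname).

    Commas separate NIDs of the same server; colons separate failover servers.
    """
    nodes, sep, fsname = spec.rpartition(":/")
    if not sep:
        raise ValueError("expected '<nids>:/<fsname>'")
    servers = [group.split(",") for group in nodes.split(":")]
    return servers, fsname


def failover_candidates(servers, current_nid):
    """Return the NIDs a client would retry, per the behaviour described here.

    Assumption (from this thread): the client never retries another NID of the
    server it is already connected to, only NIDs of the other listed servers.
    """
    for i, nids in enumerate(servers):
        if current_nid in nids:
            return [nid for j, group in enumerate(servers) if j != i
                    for nid in group]
    raise ValueError("current NID not found in mount spec")


spec = ("192.168.1.1@o2ib,192.168.2.1@o2ib,10.0.0.1@tcp:"
        "192.168.1.2@o2ib,192.168.2.2@o2ib,10.0.0.2@tcp:/lustre")
servers, fsname = parse_mount_spec(spec)
tried = failover_candidates(servers, "192.168.1.1@o2ib")
never_tried = [nid for nid in servers[0] if nid != "192.168.1.1@o2ib"]
print("tried on failure:", tried)
print("never tried:     ", never_tried)
```

The "never tried" list is exactly the set of same-server NIDs (ib1 and eth0 on MDS1) that the original question asks about.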

On Aug 8, 2012, at 7:41 PM, Brian O'Connor <briano at sgi.com> wrote:

> 
> Hi
> 
> At present, AFAIK, a Lustre connection will not fail over between NIDs on the same server. For example, if you have two MDS servers, each with two IB interfaces and an Ethernet interface, configured as an HA pair, you can configure the client to *mount* using whichever NIDs it can see. But once the connection is made, if the network switch, cable, or HCA fails, the client cannot fail over to the other NID on the *same* server; it fails over to a NID on the other server in the HA pair (and the resources probably won't be there).
> 
> So my question is: is this feature on a road map, or is it just not compatible with other aspects of Lustre?
> 
> So, to try to be clear, assume the following:
> 
> Servers
> MDS1:  ib0=192.168.1.1/24, ib1=192.168.2.1/24, eth0=10.0.0.1/24
> MDS2:  ib0=192.168.1.2/24, ib1=192.168.2.2/24, eth0=10.0.0.2/24
> OSS1:  ib0=192.168.1.11/24, ib1=192.168.2.11/24, eth0=10.0.0.11/24
> ..
> OSS20: ib0=192.168.1.30/24, ib1=192.168.2.30/24, eth0=10.0.0.30/24
> 
> Clients
> 
> c1:    ib0=192.168.1.101/24, ib1=192.168.2.101/24, eth0=10.0.0.101/24
> ..
> c100:  ib0=192.168.1.200/24, ib1=192.168.2.200/24, eth0=10.0.0.200/24
> 
> I mount on the client with
> 
> mount -t lustre 192.168.1.1@o2ib,192.168.2.1@o2ib,10.0.0.1@tcp:192.168.1.2@o2ib,192.168.2.2@o2ib,10.0.0.2@tcp:/lustre /lustre
> 
> (and set up to load-balance odd/even clients to odd/even IPs on the servers)
> 
> Client "c1" mounts initially via 192.168.1.1@o2ib. If the client later fails to communicate on 192.168.1.1@o2ib, at present it will *not* try 192.168.2.1 or 10.0.0.1; it will try 192.168.1.2, 192.168.2.2, or 10.0.0.2 on the configured HA pair. This complicates the HA setup in that you have to monitor the networks and STONITH all the MDS/OSS nodes so that the resources are available on the HA pair when needed, and so far this has never worked out for me.
> 
> Is there an architectural reason that you can't fail over to another NID on the same server?
> 
> 
> 
> -- 
> Brian O'Connor
> -------------------------------------------------------------
> SGI Consulting
> Email: briano at sgi.com, Mobile +61 417 746 452
> Phone: +61 3 9963 1900, Fax:  +61 3 9963 1902
> 691 Burke Road, Camberwell, Victoria, 3124
> AUSTRALIA
> http://www.sgi.com/support/services
> -------------------------------------------------------------