[lustre-devel] Is network interface fail-over on same server on the road map?
Brian O'Connor
briano at sgi.com
Wed Aug 8 19:41:57 PDT 2012
Hi
At present, AFAIK, a Lustre connection will not fail over between NIDs
on the same server. For example, if you have two MDS servers configured
as an HA pair, each with two IB interfaces and an Ethernet interface, you
can configure the client to *mount* via whichever NID it can see. But
once the connection is made, if the network switch, cable or HCA fails,
the client cannot fail over to another NID on the *same* server;
it fails over to a NID on the HA partner (and the resources probably won't
be on the HA partner).
So my question is: is this feature on a road map, or is it just not
compatible with other aspects of Lustre?
So, to try and be clear, assume the following:
Servers
MDS1: ib0=192.168.1.1/24, ib1=192.168.2.1/24, eth0=10.0.0.1/24
MDS2: ib0=192.168.1.2/24, ib1=192.168.2.2/24, eth0=10.0.0.2/24
OSS1: ib0=192.168.1.11/24, ib1=192.168.2.11/24, eth0=10.0.0.11/24
..
OSS20: ib0=192.168.1.30/24, ib1=192.168.2.30/24, eth0=10.0.0.30/24
Clients
c1: ib0=192.168.1.101/24, ib1=192.168.2.101/24, eth0=10.0.0.101/24
..
c100: ib0=192.168.1.200/24, ib1=192.168.2.200/24, eth0=10.0.0.200/24
I mount on the client with
mount -t lustre \
  192.168.1.1@o2ib,192.168.2.1@o2ib,10.0.0.1@tcp:192.168.1.2@o2ib,192.168.2.2@o2ib,10.0.0.2@tcp:/lustre \
  /lustre
(and set things up to load-balance odd/even clients across odd/even IPs on the servers)
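To make the structure of that mount target explicit, here is a minimal Python sketch that splits it into one NID list per server plus the filesystem name. It assumes the usual convention that commas separate NIDs of the same server and colons separate failover servers, with NIDs written in address@network form. The function name `parse_mount_target` is made up for illustration; this is not Lustre's own parser.

```python
def parse_mount_target(target):
    """Split 'nid,nid:nid,nid:/fsname' into (per-server NID lists, fsname).

    Illustrative helper only: commas are taken to separate NIDs of one
    server, colons to separate failover servers.
    """
    servers_part, sep, fsname = target.rpartition(":/")
    if not sep or not servers_part or not fsname:
        raise ValueError(f"not a Lustre-style mount target: {target!r}")
    servers = [group.split(",") for group in servers_part.split(":")]
    return servers, fsname


target = (
    "192.168.1.1@o2ib,192.168.2.1@o2ib,10.0.0.1@tcp:"
    "192.168.1.2@o2ib,192.168.2.2@o2ib,10.0.0.2@tcp:/lustre"
)
servers, fsname = parse_mount_target(target)

# One line per server in the HA pair, listing every NID it is reachable on.
for index, nids in enumerate(servers, start=1):
    print(f"MDS{index}: {', '.join(nids)}")
print(f"fsname: {fsname}")
```

Read this way, the target names two servers with three NIDs each, which is the grouping the question below depends on.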
Client "c1" mounts initially via 192.168.1.1@o2ib. If the client later
fails to communicate on 192.168.1.1@o2ib, at present it will *not* try
192.168.2.1 or 10.0.0.1; it will try 192.168.1.2, 192.168.2.2 or
10.0.0.2 on the configured HA partner. This complicates the HA setup:
you have to monitor the networks and STONITH all the MDS/OSS so
that the resources are available on the HA partner when needed, and so far
this has never worked out for me.
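The difference between what I see and what I am asking for can be sketched as two retry orders. This is only a toy model of the behaviour described in this post, not Lustre's actual connection code; the function names are invented for illustration.

```python
# NIDs per server, taken from the example configuration in this post.
MDS1 = ["192.168.1.1@o2ib", "192.168.2.1@o2ib", "10.0.0.1@tcp"]
MDS2 = ["192.168.1.2@o2ib", "192.168.2.2@o2ib", "10.0.0.2@tcp"]
SERVERS = [MDS1, MDS2]


def find_server(servers, nid):
    """Return the index of the server that owns the given NID."""
    for index, nids in enumerate(servers):
        if nid in nids:
            return index
    raise ValueError(f"unknown NID: {nid}")


def observed_retry_order(servers, failed_nid):
    """Behaviour as described: only the HA partner's NIDs are tried."""
    home = find_server(servers, failed_nid)
    return [nid for i, nids in enumerate(servers) if i != home for nid in nids]


def requested_retry_order(servers, failed_nid):
    """Behaviour being asked about: same-server NIDs first, then the partner."""
    home = find_server(servers, failed_nid)
    same_server = [nid for nid in servers[home] if nid != failed_nid]
    return same_server + observed_retry_order(servers, failed_nid)


failed = "192.168.1.1@o2ib"
print("observed: ", observed_retry_order(SERVERS, failed))
print("requested:", requested_retry_order(SERVERS, failed))
```

With the requested order, a single failed switch, cable or HCA would be absorbed by the second IB rail or the Ethernet NID of the same server, and the HA partner would only be needed when the server itself is down.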
Is there an architectural reason that you can't fail over to another
NID on the same server?
--
Brian O'Connor
-------------------------------------------------------------
SGI Consulting
Email: briano at sgi.com, Mobile +61 417 746 452
Phone: +61 3 9963 1900, Fax: +61 3 9963 1902
691 Burke Road, Camberwell, Victoria, 3124
AUSTRALIA
http://www.sgi.com/support/services
-------------------------------------------------------------