Page 1 of 2
Results 1 to 10 of 13

Thread: multipath

  1. #1

    Default multipath

    Hi,
    I am trying to set up multipath with a failover policy on openSUSE 11. I have two qla2xxx HBAs installed and they appear to be working. Here is the output of the "multipath -l" command:
    ---------------
    SAN_dsk (WWIDnnn) dm-0 SUN,CSM200_R
    [size=1.0T][features=1 queue_if_no_path][hwhandler=1 rdac]
    \_ round-robin 0 [prio=-1][enabled]
     \_ 4:0:0:0 sdc 8:32 [active][undef]
    \_ round-robin 0 [prio=-1][active]
     \_ 5:0:0:0 sdd 8:48 [active][undef]
    ---------------

    While testing, I pulled one of the two connections to the SAN, and I/O failed over to the second HBA's connection.

    When I plug the cable back in, it does not fail back to the original connection... It stays in the failed state.

    Also, I noticed that the failed disk (sdd) comes back as a different device (sdg), which is probably why the connection does not fail back to the original HBA.

    But when I run "/sbin/service multipathd restart", the sdg disk shows as enabled in multipath -l...

    What am I missing here? Any ideas / pointers?

    Thanks

  2. #2
    goldie NNTP User

    Default Re: multipath

    > on openSuSE 11

    is this on openSUSE 11.0, 11.1 or 11.2?

    or SUSE Linux Enterprise Server (SLES)?

    from the level of the question i *guess* the latter...

    cat /etc/SuSE-release
    should be definitive..

    don't confuse what i write: you ARE welcome here and eventually
    someone who can help will probably wander by....but, for the most part
    the volunteer helpers here on the openSUSE side are dealing with less
    complex 'problems' (like the growing pains of n00bs fleeing Redmond
    nose rings, etc)...but, if you are running SLES 11 i guess you are
    likely to have a better answer from the folks you purchased from, over
    at forums.novell.com

    but, check back because there are a few folks here with the level of
    know-how your Q needs..

    --
    goldie
    Give a hacker a fish and you feed him for a day.
    Teach a man and you feed him for a lifetime.

  3. #3

    Default Re: multipath

    @daksh

    This is from memory, so please check multipath docs.

    You have set up failover on your array, but you do not have automatic failback set up. As far as I can remember it's a config parameter.

    And one more thing. Again, as far as I can remember, round-robin i/o policy only makes sense if you have a true active-active storage controller pair. For active-passive and ALUA controllers, round-robin policy decreases performance. I do not know your array, so rr may still be a good choice.
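
    For reference, the parameter in question lives in /etc/multipath.conf; here is a minimal sketch of a defaults section based on the multipath.conf(5) man page (the layout is illustrative only, not daksh's actual config):

    Code:
    defaults {
        # fail back to the preferred path group as soon as it is restored;
        # a number instead of "immediate" defers failback by that many seconds,
        # and "manual" (the default) disables automatic failback
        failback immediate
    }
    The same parameter can also be set per-array in a devices section.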


    HTH

    Milan

  4. #4

    Default Re: multipath

    I am using openSuSE version 11.0; here is the content of /etc/SuSE-release
    ------
    openSUSE 11.0 (i586)
    VERSION = 11.0
    ------


    @ Milan,
    I am reading the man page for multipath, which does not say anything about failback; perhaps it is time to start googling...
    We have Sun StorageTek 6140.

    Also, thanks for your tip about round-robin...

    Thanks

  5. #5

    Default Re: multipath

    Quote Originally Posted by daksh

    @ Milan,
    I am reading man pages for multipath which does not say anything about failback; perhaps, it is time to start googling...
    Ahem.
    Code:
    man multipath.conf
    
    failback    Tell the daemon to manage path group failback, or not to. 0 or immediate means immediate failback, values >0 mean deferred failback (in seconds), and manual means no failback. Default value is manual.
    Sorry, I am not familiar with your Sun box. Perhaps you should check at their site, and at device-mapper-related mailing lists & sites.

    Chances are, your storage is well known and device mapper & multipath have default configuration for it. Otherwise it's reading time.

    Have a nice day.

    Milan

  6. #6

    Default Re: multipath

    Found failback setting and set it in my multipath.conf
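
    One way to confirm the running daemon actually picked the setting up (assuming your multipath-tools build ships the interactive multipathd CLI) is:

    Code:
    multipathd -k"show config" | grep failback
    If that still shows failback "manual", multipathd is running with the old config and needs a restart.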

    Here are steps I just took to test failover and failback (after setting failback)...

    Code:
    multipath -l
    SAN_dsk (WWID-nnn) dm-0 SUN,CSM200_R
    [size=1.0T][features=1 queue_if_no_path][hwhandler=1 rdac]
    \_ round-robin 0 [prio=-1][enabled]
     \_ 4:0:0:0  sdc 8:32  [active][undef]
    \_ round-robin 0 [prio=-1][active]
     \_ 5:0:0:0  sdd 8:48  [active][undef]
    Now, I unplug cable from one of the HBA's to test fail over:

    Code:
    multipath -l
    SAN_dsk (WWID-nnn) dm-0 SUN,CSM200_R
    [size=1.0T][features=1 queue_if_no_path][hwhandler=1 rdac]
    \_ round-robin 0 [prio=-1][active]
     \_ 4:0:0:0  sdc 8:32  [active][undef]
    \_ round-robin 0 [prio=-1][enabled]
     \_ #:#:#:#  -   #:#   [failed][undef]
    I checked my data to make sure failover worked, and it did...
    I can still read and write to the disk on the SAN; so far so good.



    Now, a few minutes later, I plug the cable back into the HBA, so it should fail back...

    Code:
    SAN_dsk (WWID-nnn) dm-0 SUN,CSM200_R
    [size=1.0T][features=1 queue_if_no_path][hwhandler=1 rdac]
    \_ round-robin 0 [prio=-1][active]
     \_ 4:0:0:0  sdc 8:32  [active][undef]
    \_ round-robin 0 [prio=-1][enabled]
     \_ #:#:#:#  -   #:#   [failed][undef]
    But it does not look like failback worked. I waited a few minutes... Still the same output from multipath -l.

    Now, I restart multipathd service
    Code:
    /sbin/service multipathd restart
    Shutting down multipathd                                              done
    Starting multipathd                                                   done
    And, here is output of multipath -l after restarting multipathd

    Code:
    SAN_dsk (WWID-nnn) dm-0 SUN,CSM200_R
    [size=1.0T][features=1 queue_if_no_path][hwhandler=1 rdac]
    \_ round-robin 0 [prio=-1][enabled]
     \_ 4:0:0:0  sdc 8:32  [active][undef]
    \_ round-robin 0 [prio=-1][active]
     \_ 5:0:0:0  sde 8:64  [active][undef]
    So, it seems like failback only works after restarting multipathd, and sdd now appears as sde.
    Maybe the disk changing from sdd to sde is what causes failback to not work properly?

  7. #7

    Default Re: multipath

    daksh,

    please read following thread at dm-devel:

    [dm-devel] Question about dm-multipath and Sun StorageTek 6140

    and set up your box accordingly.

    Also, there might be other threads interesting to you at dm-devel.

    Best wishes

    Milan

  8. #8

    Default Re: multipath

    Just read the article and some more about LUN trespassing...
    Made changes to my config file, but I need to wait until I get the OK to test again (as I do not know for sure whether the changes I made will crash the machine).

    Thanks Milan for your help in this; I will post the results after I test.

  9. #9

    Default Re: multipath

    Tried testing failover and failback after making changes described here:
    [dm-devel] Question about dm-multipath and Sun StorageTek 6140

    But, same results as before.

    In my test, after failover happens successfully, I reconnect the cable, but failback does not happen until I restart the multipathd service.

    Strange, and it feels like I am missing something simple...

    Thanks

  10. #10

    Default Re: multipath

    > Strange, and it feels like I am missing something simple...
    Is your multipathd running? I mean the daemon (service), not the user command multipath.

    boot.multipathd will set up things during boot, but you do need running multipathd to take care of things.
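
    On openSUSE 11.0 that check would look something like this (SysV init commands as shipped on that release; adjust to your setup):

    Code:
    /sbin/service multipathd status      # is the daemon running right now?
    /sbin/chkconfig --list multipathd    # is it enabled for boot?
    /sbin/chkconfig multipathd on        # if not, enable it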

    I think it's worth checking.

    Milan

