#1  03-04-2008, 09:14 PM
Greg.A.Fischer@gmail.com
disappointing performance with Ethernet bonding on Linux

Hello,

Our group has a small, mostly homogeneous Linux cluster (5 boxes, two
dual-core Opterons per box, GigE interconnects on PCI-X bus) that
we're using to develop a parallelized engineering code. The code is
compiled against MPICH2. From what we're seeing right now, inter-node
communication (in the form of relatively few bulk transfers of data)
represents a large chunk of the code's execution time. In an attempt
to improve performance at a modest cost, we implemented NIC (channel)
bonding.

The results that I'm seeing, so far, aren't all that impressive. We
bought a round of dual-port GigE NICs, so we're bonding 3 NICs per
box. We have an 802.3ad-compliant switch (Linksys SLM2024), so we're
running bonding mode=4, but we've tried several of the others, to
little avail. The basic benchmarks we're running (subounce v.1.0) are
only showing sporadic ~10% improvements in bandwidth and latency.
Performance of our primary code of interest has improved by only
5%-20%, which is significantly less than I was expecting.

The contents of /proc/net/bonding/bond0 are at the end of this post.
At some point, I'll increase the maximum packet size (MTU), which is
currently at the default level, but it seems like something more
fundamental is wrong here. Should the "Number of Ports" perhaps read
higher than "1"?

Can anyone think of something we might be forgetting to do? Have I
misunderstood what channel bonding is capable of? Any experience or
pointers would be greatly appreciated. Let me know if more info would
be helpful.

Thanks,
Greg

***

[fischega@master BOUNCE]$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v2.6.1 (October 29, 2004)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 1
Actor Key: 17
Partner Key: 1
Partner Mac Address: 00:00:00:00:00:00

Slave Interface: eth0
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:e0:81:43:75:9c
Aggregator ID: 1

Slave Interface: eth2
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:1b:21:13:4c:e8
Aggregator ID: 2

Slave Interface: eth3
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:1b:21:13:4c:e9
Aggregator ID: 3

***
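
On the "Number of ports: 1" question: in the listing above each slave reports a different Aggregator ID (1, 2, 3) and the partner MAC address is all zeros, which is consistent with the switch not having negotiated LACP on those ports, so only one link would be carrying traffic. The following is a minimal Python sketch, not part of the original post, that parses text in the format shown above and lists which slaves share the active aggregator. The function names are hypothetical, and the parsing assumes the layout printed by this driver version.

```python
import re

def parse_bond_status(text):
    """Return (active_aggregator_id, {slave_name: aggregator_id})."""
    active_id = None
    slaves = {}
    current = None  # name of the slave section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Slave Interface:\s*(\S+)", line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r"Aggregator ID:\s*(\d+)", line)
        if m:
            if current is None:
                # Appears under "Active Aggregator Info", before any slave.
                active_id = int(m.group(1))
            else:
                slaves[current] = int(m.group(1))
    return active_id, slaves

def slaves_in_active_aggregator(text):
    """Names of slaves whose aggregator ID matches the active one."""
    active_id, slaves = parse_bond_status(text)
    return sorted(name for name, agg in slaves.items() if agg == active_id)

# Abbreviated copy of the output posted above.
SAMPLE = """\
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 1

Slave Interface: eth0
Aggregator ID: 1

Slave Interface: eth2
Aggregator ID: 2

Slave Interface: eth3
Aggregator ID: 3
"""

print(slaves_in_active_aggregator(SAMPLE))  # ['eth0']
```

With a working 802.3ad setup one would expect all three slaves to appear in that list and "Number of ports" to read 3.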
#2  03-04-2008, 11:23 PM
Georg Bisseling
Re: disappointing performance with Ethernet bonding on Linux

On 04.03.2008 at 22:14, <Greg.A.Fischer@gmail.com> wrote:

> Hello,
>
> Our group has a small, mostly homogeneous Linux cluster (5 boxes, two
> dual-core Opterons per box, GigE interconnects on PCI-X bus) that
> we're using to develop a parallelized engineering code. The code is
> compiled against MPICH2. From what we're seeing right now, inter-node
> communication (in the form of relatively few bulk transfers of data)
> represents a large chunk of the code's execution time. In an attempt
> to improve performance at a modest cost, we implemented NIC (channel)
> bonding.


Since I do not know too much about Ethernet bonding, I will try
to challenge your analysis.

If you have positively measured that
- bonding improves the bandwidth between the nodes
and that
- the small latency increase caused by bonding does not hurt
then my conclusion is that bandwidth may not be your problem.

How did you conclude that bandwidth is the problem? Did you check
whether the long time spent in MPI is instead caused by load imbalance?

Regards
Georg
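
The load-imbalance check suggested above can be sketched in a few lines. Ranks that finish their compute phase early sit in MPI calls waiting for the slowest rank, and that waiting is reported as "time in MPI" even though the network is idle. The Python sketch below is not from the thread; it assumes you have already recorded each rank's compute time for one phase (for example with MPI_Wtime around the compute section), and both the function name and the timings are hypothetical.

```python
def imbalance_report(compute_seconds):
    """Summarize load imbalance from per-rank compute times (seconds)."""
    slowest = max(compute_seconds)
    mean = sum(compute_seconds) / len(compute_seconds)
    return {
        # 0.0 means perfectly balanced; 0.6 means the slowest rank
        # takes 60% longer than the average rank.
        "imbalance": slowest / mean - 1.0,
        # Time each rank spends waiting for the slowest one.
        "idle_seconds": [slowest - t for t in compute_seconds],
    }

# Hypothetical timings for four ranks: one rank has twice the work.
report = imbalance_report([10.0, 10.0, 10.0, 20.0])
print(report["imbalance"])
print(report["idle_seconds"])
```

If the idle times account for most of the measured MPI time, adding network bandwidth will not help much, whatever the bonding mode.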






--
This signature was left intentionally almost blank.
http://www.this-page-intentionally-left-blank.org/
#3  03-05-2008, 01:05 AM
Rick Jones
Re: disappointing performance with Ethernet bonding on Linux

In comp.os.linux.networking Greg.A.Fischer@gmail.com wrote:
> Bonding Mode: IEEE 802.3ad Dynamic link aggregation


IIRC this means that any one "flow" will have the services of only one
link in the bond and a "flow" will be defined as going to a given
destination MAC address.

On the "inbound" side, it will be what the switch does - IIRC most
switches by default will also use destination MAC address.

Depending on the distribution of the MAC addresses in your cluster,
you may or may not get very good distribution of traffic among the
links in your bond.
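
To illustrate the point about flows, here is a minimal Python sketch of a layer-2 style transmit hash. It assumes the simplified rule "(source MAC XOR destination MAC) modulo slave count", applied here to the last octet only; the exact hash in a given driver or switch may differ. The source address is eth0's from the original post, the peer addresses are made up, and the function names are hypothetical.

```python
def last_octet(mac):
    """Return the final byte of a colon-separated MAC address as an int."""
    return int(mac.split(":")[-1], 16)

def layer2_slave_index(src_mac, dst_mac, slave_count):
    """Pick a slave by XOR-ing the MACs' last octets, modulo slave count."""
    return (last_octet(src_mac) ^ last_octet(dst_mac)) % slave_count

SRC = "00:e0:81:43:75:9c"  # eth0's address from the original post
PEERS = [                  # hypothetical addresses of the other nodes
    "00:e0:81:43:75:a0",
    "00:e0:81:43:75:a1",
    "00:e0:81:43:75:a2",
    "00:e0:81:43:75:a3",
]

for dst in PEERS:
    print(dst, "-> slave", layer2_slave_index(SRC, dst, 3))
```

Under this rule every frame between one pair of nodes lands on the same slave, so a two-node bandwidth benchmark cannot exceed a single link's speed no matter how many links are bonded. Only traffic to several peers at once has a chance of spreading across links, and then only if the addresses happen to hash to different slaves.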

rick jones
--
portable adj, code that compiles under more than one compiler
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
#4  03-06-2008, 08:53 AM
Heiko Bauke
Re: disappointing performance with Ethernet bonding on Linux

Hello,

On Tue, 4 Mar 2008 13:14:06 -0800 (PST)
Greg.A.Fischer@gmail.com wrote:

[...]
> In an attempt
> to improve performance at a modest cost, we implemented NIC (channel)
> bonding.
>
> The results that I'm seeing, so far, aren't all that impressive. We
> bought a round of dual-port GigE NICs, so we're bonding 3 NICs per
> box. We have an 802.3ad-compliant switch (Linksys SLM2024), so we're
> running bonding mode=4, but we've tried several of the others, to
> little avail. The basic benchmarks we're running (subounce v.1.0) are
> only showing sporadic ~10% improvements in bandwidth and latency.
> Performance of our primary code of interest has improved by only
> 5%-20%, which is significantly less than I was expecting.


I have never seen any throughput gain from channel bonding on
Gigabit Ethernet (in contrast to Fast Ethernet).

If you need bonding for MPI applications only, then I would suggest
using Open MPI [1]. This MPI implementation benefits from multiple NICs
without depending on the Linux bonding kernel module. For large
messages you should get an almost ideal speedup, but the latency will
not decrease (little or no speedup for short messages).


Heiko


[1] http://www.open-mpi.org

--
-- A good saying is the truth of an entire book
-- in a single sentence. (Theodor Fontane, 1819-1898)
-- Cluster Computing @ http://www.clustercomputing.de
-- Heiko Bauke @ http://www.mpi-hd.mpg.de/personalhomes/bauke