[j-nsp] ISIS metric of redistributed directly connected routes

Richard A Steenbergen

2009-03-01 17:53:24 UTC

Question about the default metric of a route being redistributed from
directly connected to isis...

Router A is connected to Router B (both Juniper) via a single link which
has an isis (level 2 only) interface metric of 10 on both sides. The
loopbacks are injected into isis via an interface passive command with
explicit metric of 0. Router B has an interface which is being
redistributed into isis using an export policy like so:

term DIRECT {
    from protocol direct;
    then accept;
}

But the observed metric value from router A on all of the routes being
"redistributed" in this way from router B is 20, even though the
interface metric between routers is only 10. The loopback route (which
isn't redistributed) has a correct value of 10. A show isis database
detail confirms that the other routers are receiving the "redistributed"
route with a metric of 10 before adding interface costs. After playing
around with it for a bit, I was able to reset this to 0 by changing the
above redistribution to:

term DIRECT {
    from protocol direct;
    then {
        metric 0;
        accept;
    }
}

Now the reason I noticed this in the first place was that there was also
a router A-C link where router C was doing the exact same thing, but
with a metric of 5 instead of 10, thus preventing router A from load
balancing properly.
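
To make the arithmetic concrete, here is a minimal Python sketch of how
router A would compare the two paths. It assumes the same prefix is
reachable through both B and C, and that the A-C link also carries an
interface metric of 10 (that value is not stated above). The function
names are made up for illustration.

```python
def path_cost(link_metric, advertised_metric):
    """Total cost seen by router A: link metric plus the metric the neighbor advertises."""
    return link_metric + advertised_metric


def best_next_hops(paths):
    """Return the next hops that share the lowest total cost (the ECMP set)."""
    lowest = min(paths.values())
    return sorted(hop for hop, cost in paths.items() if cost == lowest)


# Default behaviour: B advertises the direct route with 10, C with 5.
default = {"B": path_cost(10, 10), "C": path_cost(10, 5)}
# With "metric 0" set in the export policy on both routers.
fixed = {"B": path_cost(10, 0), "C": path_cost(10, 0)}

print(default, best_next_hops(default))
print(fixed, best_next_hops(fixed))
```

Under these assumptions the default case leaves only C as a next hop
(15 beats 20), while forcing the metric to 0 puts both B and C at a cost
of 10, so A can balance across them.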

After looking through other routers doing the same thing, the common
pattern seems to be the type or speed of the directly connected
interface being redistributed, with xe's being given a cost of 10, and
multi-xe ae's given costs 5 or 3. This pretty much screams some kind of
reference bandwidth, of say 100g being divided by 10g increments to get
10, 5, 3, though oddly enough it doesn't always match this pattern (for
example, I found a 2x10G AE being given metric 3). I do have isis
reference bandwidth configured, but with a value of 1000g not 100g
(mostly to set a high interface metric in the event that something is
accidentally left unconfigured), and changing the reference bandwidth
value seems to have no effect on the default isis route metrics. As far
as I knew reference bandwidth only affected the metric of interfaces
participating in isis, not the metric on the route as it is being
redistributed from directly connected into isis.
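
The reference-bandwidth guess can be sanity-checked with a few lines of
Python. This is only a sketch of the hypothesis: it assumes the metric is
the reference bandwidth divided by the interface bandwidth, truncated to
an integer. That matches the observed 10/5/3 values but is not documented
behavior for exported direct routes, and the function name is invented.

```python
def guessed_metric(reference_bw_gbps, interface_bw_gbps):
    """Hypothetical metric: reference bandwidth / interface bandwidth, truncated."""
    return reference_bw_gbps // interface_bw_gbps


for ref in (100, 1000):
    # 10g = single xe, 20g and 30g = 2x and 3x 10GE ae bundles
    metrics = {bw: guessed_metric(ref, bw) for bw in (10, 20, 30)}
    print(f"reference {ref}g -> {metrics}")
```

With a 100g reference this gives 10, 5 and 3 for 10g, 20g and 30g; with
the configured 1000g it would give 100, 50 and 33, which is not what is
being observed.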

Maybe I'm being dense, but I really can't say that I've ever seen this
documented anywhere... Can anybody point to anything explaining this
behavior?

--
Richard A Steenbergen <ras at e-gerbil.net> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)

Mark Tinka

2009-03-01 23:39:19 UTC

On Monday 02 March 2009 01:53:24 am Richard A Steenbergen

Post by Richard A Steenbergen
But the observed metric value from router A on all of the
routes being "redistributed" in this way from router B is
20, even though the interface metric between routers is
only 10. The loopback route (which isn't redistributed)
has a correct value of 10. A show isis database detail
confirms that the other routers are receiving the
"redistributed" route with a metric of 10 before adding
interface costs. After playing around with it for a bit,
I was able to reset this to 0...

This is the behaviour we have also seen in JunOS.

We use 'passive' rather than 'redistribute' for interfaces
we need to do this on.

In JunOS, we have seen that Loopback interfaces passively
introduced into the IS-IS topology have a correct metric of
0. So there is no ambiguity from another router's point of
view in the network, that reachability to the Loopback
address has a metric similar to that of the router's
physical interface(s) itself.

This is the same behaviour in IOS.

However, we've noticed that passively introducing physical
interfaces into the IS-IS topology always adds a cost of 10
to routes associated with that interface, which creates the
behaviour you're seeing.

Conversely, IOS will maintain routes whose physical
interfaces have been passively introduced into IS-IS with a
metric of 0.

We don't use the 'reference-bandwidth' command in our IS-IS
domain, so we fall back on the little documentation Juniper
have on this, which says that, with the exception of the Loopback
interface (which inherits a metric of 0), all other interfaces
default to a metric of 10 if the 'reference-bandwidth' command
is not used.

Perhaps Juniper folk on this list can elaborate further.

Cheers,

Mark.

Hannes Gredler

2009-03-02 08:29:22 UTC

hi richard,

guessing the "correct" export metric of a direct route
is a bit of a philosophical question ;-) - let me explain:

since JUNOS does not know which end(point) of the subnet you are
interested in, it advertises the worst case, which is the cost
to reach the far-end, the cost of "crossing" the interface.

for loopback interfaces (which do not have a notion of far-end)
the "local" cost (0) is advertised.

HTH,

/hannes

Richard A Steenbergen

2009-03-02 19:57:46 UTC

Post by Hannes Gredler
hi richard,
guessing the "correct" export metric of a direct route
since JUNOS does not know which end(point) of the subnet you are
interested in, it advertises the worst case, which is the cost
to reach the far-end, the cost of "crossing" the interface.
for loopback interfaces (which do not have a notion of far-end)
the "local" cost (0) is advertised.

So it guesses this based on what, a hardcoded reference bandwidth of
100g? That would explain the behavior I'm seeing, if combined with an
implementation which didn't automatically update when the speed of an ae
changes.

I do see value in setting a cost on a non-igp speaking interface which
is simply being redistributed, I use this for traffic engineering
purposes all the time (which is why I carry the edge interfaces in igp
in the first place, instead of just a loopback+NHS). But assigning an
arbitrary cost to non-igp speaking interfaces by default seems wrong,
especially when it is assigned based on interface speed and may cause
load balancing between equal cost paths to not work. Also, this doesn't
seem to happen when redistributing static routes which are routed to a
physical interface, which seems like another inconsistent application of
metrics which would cause inconsistent load balancing.

Is this documented anywhere though? I really don't think I've ever seen
that, nobody else who I asked expected to see this behavior, and Cisco
seems to set the cost to 0 on all interfaces which aren't speaking the
igp or don't have an explicit metric assigned to them. Now that I know
what's going on at least I can disable this behavior via policy, but it
seems like if you're going to do it then a) it should be better
documented, and b) a show route detail/extensive on the direct route
should show the cost it has assigned (since the only way to see this was
to look at the isis database for the individual routes in question, to
figure out why they didn't match). And maybe throw in c) ae bandwidth
changes should auto-update the cost, and d) the reference-bandwidth
should be configurable, for good measure. :)

Richard A Steenbergen

2009-03-03 13:42:22 UTC

Post by Hannes Gredler
much more simple - the default IS-IS metric for any interface
(i.e. if you do not have reference-bandwidth configured) is 10.

Except that I'm seeing some interfaces with an isis metric of 5 or 3,
which is the root of the entire problem (inconsistent costs). If they
were all set to 10 I wouldn't have noticed. :)

The reference bandwidth I have configured doesn't match up with these
results either. I'm using 1000g (the max possible value), to create a
high metric in case anyone ever accidentally left off a metric, so the
metric on a 10g interface there would be 100 not 10.

http://www.juniper.net/techpubs/software/junos/junos94/swconfig-routing/modifying-the-interface-metric.html#id-11111080

I saw that page, but honestly I would never have interpreted anything on
there to mean "interface ROUTES being exported INTO isis" rather than
the more conventional "isis speaking interfaces". And in fact it can't
be, or it would have been affected by reference bandwidth as mentioned
above.

| And maybe throw in c) ae bandwidth
| changes should auto-update the cost, and d) the reference-bandwidth
| should be configurable, for good measure. :)
i am not sure what you mean with c)

I found several instances of 10g interfaces with a metric of 10, 20g ae
interfaces with metric 5, 30g ae with a metric of 3, etc, but I also
found a 30g ae with a metric of 5. The only excuse I can come up with
for this behavior is something is being calculated with a reference
bandwidth of 100g and the isis metric isn't refreshing if the ae
bandwidth changes (i.e. another member is added to lacp). Of course it
could be something completely different, but that's the only theory I
have which fits the facts.

Hannes Gredler

2009-03-03 17:38:24 UTC

Post by Richard A Steenbergen
I found several instances of 10g interfaces with a metric of 10, 20g ae
interfaces with metric 5, 30g ae with a metric of 3, etc, but I also
found a 30g ae with a metric of 5. The only excuse I can come up with
for this behavior is something is being calculated with a reference
bandwidth of 100g and the isis metric isn't refreshing if the ae
bandwidth changes (i.e. another member is added to lacp). Of course it
could be something completely different, but that's the only theory I
have which fits the facts.

Richard A Steenbergen

2009-03-04 22:37:23 UTC

Post by Hannes Gredler

-you basically want the auto-bw knob to work when exporting
non-igp-if direct routes, plus you want the metric calculator
to also honor ae dynamic bandwidth information.

No, I expected non-igp-if direct routes to have a metric of 0 unless I
set something different, and was surprised when that wasn't the case.
You're telling me that this behavior is actually a feature, but I can't
find any documentation to support the metrics I've observed, so I'm
wondering if something is actually broken here. The only explanation I
can reverse engineer from the numbers I've observed so far is a
hard-coded reference bandwidth of 100g + not auto-updating the metric
when an ae changes speed.

What I actually want was accomplished by setting metric 0 in the export
policy on direct routes, I'm just trying to help you guys figure out if
this is actually a bug and/or a documentation glitch to save the next
guy the trouble that I had. :)

Post by Hannes Gredler
can you mark all the testcases with 'X'
that appears to be broken in your setup ?

I'm completely ignoring igp-if interfaces, I manually set my metrics on
those interfaces and I'm going to assume that auto-bw works correctly on
them.

What I CAN'T explain are the default metrics on non-igp-if direct
routes. I have a configured reference-bandwidth of 1000g, but after
checking about two dozen interfaces on my network I observed the
following default value behaviors:

* 10 on all of the non-igp-if 10GE interfaces I checked
* 5 on most of the non-igp-if 2x10GE AE interfaces I checked
* 3 on most of the non-igp-if 3x10GE AE interfaces I checked
* 5 on some non-igp-if 3x10GE AE interfaces that were probably initially
2x10GE AEs when the IGP came up.

This doesn't match the configured reference-bandwidth of 1000g. The only
thing it would match is a reference-bandwidth of 100g that doesn't
auto-update for AE members with dynamically changing bandwidth values.

This might or might not be the case, it's entirely possible that something
completely unrelated is causing these values and if I looked through
more interfaces I would find an exception to disprove the theory. But I
know for certain that neither "always set to 10" nor "always set to the
reference bandwidth" are correct. :)
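
The stale-metric theory can be modelled as follows. This is a
hypothetical Python sketch, not Junos behavior confirmed anywhere in this
thread: it assumes a 100g reference, integer truncation, and a metric
that is computed once when the route is first exported and never
recomputed. The class and attribute names are invented.

```python
REFERENCE_BW_GBPS = 100  # assumed hard-coded reference, per the theory above


class ExportedDirectRoute:
    """Hypothetical model of a direct route whose metric is cached at export time."""

    def __init__(self, member_links, member_bw_gbps=10):
        self.member_links = member_links
        self.member_bw_gbps = member_bw_gbps
        # The metric is computed once here and never refreshed.
        self.cached_metric = self.fresh_metric()

    def fresh_metric(self):
        """Metric that would result if it were recomputed from the current bandwidth."""
        return REFERENCE_BW_GBPS // (self.member_links * self.member_bw_gbps)

    def add_member(self):
        """Add a member link; the cached metric is deliberately left untouched."""
        self.member_links += 1


ae = ExportedDirectRoute(member_links=2)  # 2x10GE when the IGP came up
ae.add_member()                           # later grown to 3x10GE
print(ae.cached_metric, ae.fresh_metric())
```

In this model the 3x10GE bundle keeps advertising 5 (the value computed
at 20g) even though a fresh calculation at 30g would yield 3, which is
the pattern seen on the odd interfaces in the list above.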
