Discussion: [j-nsp] Core network design for an ISP
Matthew Crocker
2016-03-24 23:57:14 UTC
Permalink
Hello,

What is the current best practice for carrying full tables in MX-series routers? I have 3 new MX480s coming soon and will use them to rebuild my core network (currently a mix of MX240 & MX80 routers) - MPC-NG (w/ 20x1G & 10x10G MICs) & RE-S-X6-64G-BB.

I’m running MPLS now and have full tables in the default route instance. Does it make more sense (i.e. more secure core) to run full tables in a separate virtual-router? I’ve been doing this small ISP thing for 20+ years, Cisco before, Juniper now, I’ve always bashed my way through.

Looking for a book, NANOG presentation or guide on what is current best practice with state of the art gear.

MPLS? BGP? IS-IS? LDP? etc.

The network is a triangle (A -> B -> C -> A), MX480 at each POP, 10g connections between POPs, 10g connections to IX & upstreams. Most customers are fed redundantly from A & B

Thanks

-Matt


Luis Balbinot
2016-03-25 01:02:58 UTC
Permalink
A good practice on MX480s would be to keep upstream and downstream ports
on separate MPCs if possible. Depending on your config, the standard 256M
RLDRAM on some cards might be an issue in the not-so-near future. I'm not
sure how much RLDRAM those NG cards have, though.

I don't see any advantage in running full tables in a virtual-router,
especially if you have a 64GB RE.

For iBGP, consider multiple loopback addresses for different families. I'd
do v4 and v6 (6PE with MPLS) on one, and inet-vpn, l2vpn, etc. on
another. Even with the newer REs, a full table takes quite some time to
come up.
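As a rough sketch of that idea (all addresses and group names below are made up for illustration), each loopback address anchors one set of iBGP sessions:

interfaces {
    lo0 {
        unit 0 {
            family inet {
                address 192.0.2.1/32;      /* iBGP for inet + inet6 (6PE) */
                address 192.0.2.101/32;    /* iBGP for inet-vpn, l2vpn, ... */
            }
        }
    }
}
protocols {
    bgp {
        group ibgp-inet {
            type internal;
            local-address 192.0.2.1;
            family inet {
                unicast;
            }
            family inet6 {
                labeled-unicast {
                    explicit-null;         /* 6PE */
                }
            }
            neighbor 192.0.2.2;
        }
        group ibgp-vpn {
            type internal;
            local-address 192.0.2.101;
            family inet-vpn {
                unicast;
            }
            family l2vpn {
                signaling;
            }
            neighbor 192.0.2.102;
        }
    }
}

If one set of sessions has to be cleared or re-converged, the other plane keeps running undisturbed, which is presumably the maintenance win being described here.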

For IGP keep a flat area, no need to segment.

If starting from scratch, look at BGP-LU. Running an MX core is expensive
in terms of cost per port. You could run a much cheaper MPLS-only core in
the future with 100Gbps interfaces, at only a fraction of what a bunch of
MPC4 cards would cost.

For IXes I'd recommend a separate routing-instance. This will help you
avoid things like someone defaulting to you, and transit deviations.
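A hedged illustration of that (the instance name, policy names and addresses are placeholders, not a recommendation): put the IX-facing port and its eBGP sessions in their own instance, so nothing can default through it or reach your transit paths by accident:

routing-instances {
    ix {
        instance-type virtual-router;
        interface xe-2/0/0.0;                /* IX-facing port */
        protocols {
            bgp {
                group ix-peers {
                    type external;
                    import from-ix;          /* your usual peering import policy */
                    export to-ix;
                    neighbor 203.0.113.10 {
                        peer-as 64500;
                    }
                }
            }
        }
    }
}

How the learned routes then get towards the main table (rib-groups, a logical tunnel between instances, etc.) is a separate design choice.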

Luis
Post by Matthew Crocker
Hello,
What is the current best practice for carrying full tables in MX series
routers? I have 3 new MX480s coming soon and will use them to rebuild my
core network (currently a mix of MX240 & MX80 routers). MPC-NG (w/ 20x1g &
10x10g MICS )& RE-S-X6-64G-BB.
I’m running MPLS now and have full tables in the default route instance.
Does it make more sense (i.e. more secure core) to run full tables in a
separate virtual-router? I’ve been doing this small ISP thing for 20+
years, Cisco before, Juniper now, I’ve always bashed my way through.
Looking for a book, NANOG presentation or guide on what is current best
practice with state of the art gear.
MPLS? BGP? IS-IS? LDP? etc.
The network is a triangle (A -> B -> C -> A), MX480 at each POP, 10g
connections between POPs, 10g connections to IX & upstreams. Most
customers are fed redundantly from A & B
Thanks
-Matt
Mark Tinka
2016-03-25 08:32:49 UTC
Permalink
Post by Luis Balbinot
For iBGP consider multiple loopback addresses for different families. I'd
do v4 and v6 (6PE with MPLS) with one family and inet-vpn, l2vpn, etc on
another one. Even with newer REs a full table is taking quite some time to
come up.
I'd rather run native IPv6 than 6PE, just to decouple IPv6's fate from IPv4's.

We just had an issue where a router stopped forwarding IPv4 packets when
MPLS was enabled, due to a software defect. Luckily, we could still log
into the box remotely over IPv6, as IPv6 is run native.
Post by Luis Balbinot
For IGP keep a flat area, no need to segment.
If starting from scratch, look at BGP-LU. Running an MX core is expensive
in terms of cost per port. You could run a much cheaper MPLS-only core in
the future with 100Gbps interfaces at only a fraction of the cost of what a
bunch of MPC4 cards would cost.
That wouldn't work if he had to run native IPv6 and whatever cheap core
router he is using does not have enough FIB slots to support the growing
IPv6 table. However, if he is going to do 6PE (which I wouldn't), then this
is fine.
Post by Luis Balbinot
For IXs I'd recommend a separate routing-instance. This will help you avoid
stuff like someone defaulting to you and transit deviations.
Or a separate router, to keep things simple, if he can afford it.

Since the OP already has an outgoing MX80, he can dedicate that to
peering and not muck about with putting Internet traffic in a VRF if he's
not so inclined.

Mark.

Saku Ytti
2016-03-25 15:20:30 UTC
Permalink
Post by Luis Balbinot
A good practice on MX480s would be to keep upstream and downstream ports at
separate MPCs if possible. Depending on your config the standard 256M
RLDRAM from some cards might be an issue in the not so near future. I'm not
sure how much RLDRAM those NG cards have though.
It should be safe for at least a 1.5M-route IPv4 FIB + a reasonable IPv6 FIB.
That's a pretty distant future, unless you have large L3 MPLS VPN tables in
addition.

There are some other benefits to running separate MPCs for edge and core,
but it might not make financial sense. Obviously you want core
interfaces on separate MPCs, so having 3 MPCs in the smallest PoP, and
potentially just 1 interface in each core MPC, may be just too high a
premium for it.

I would not specifically plan on separate MPCs for edge+core, unless
I knew that I'm going to have large VPN tables and 1.5M won't be
enough.
--
++ytti
Raphael Mazelier
2016-03-25 15:28:08 UTC
Permalink
Post by Saku Ytti
Post by Luis Balbinot
A good practice on MX480s would be to keep upstream and downstream ports at
separate MPCs if possible. Depending on your config the standard 256M
RLDRAM from some cards might be an issue in the not so near future. I'm not
sure how much RLDRAM those NG cards have though.
It should be safe for at least 1.5M IPv4 FIB + reasonable IPv6 FIB.
It's pretty far future, unless you have large L3 MPLS VPN tables in
addition.
There are some other benefits running separate MPC for edge and core,
but it might not make financial sense. Obviously you want core
interfaces in separate MPCs, so having 3 MPC on smallest pop, and
potentially just 1 interface in each core MPC, may be just too high
premium for it.
I would not specifically plan on separate MPC for edge+core, unless
I'd knew that I'm going to have large VPN tables and 1.5M won't be
enough.
What's the point of separating upstream and downstream ports onto different
MPCs (apart from FIB size)?
--
Raphael Mazelier
Saku Ytti
2016-03-25 15:37:21 UTC
Permalink
What the point to separate upstream and downstream port on different MPC ?
(apart FIB size)
If you've cocked up your lo0/ddos-protection config (I have not yet seen a
network which hasn't), a customer-side attack won't bring your device
down if it's on a different MPC, as there is a built-in policer from
NPU => LC CPU, so the LC CPU can only offer a known amount of traffic to the
RE, which is not enough to congest you.
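For reference, those built-in policers are exposed (and tunable) under [edit system ddos-protection] on Trio MPCs; a hedged example of tightening one protocol group - the values are arbitrary, not a recommendation:

system {
    ddos-protection {
        protocols {
            icmp {
                aggregate {
                    bandwidth 1000;      /* packets per second towards the RE */
                    burst 1000;
                }
            }
        }
    }
}

'show ddos-protection protocols violations' is then a quick way to see which groups are actually being policed.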

It's a minor benefit and I wouldn't separate MPCs based on this alone. The
only reason I'd do edge/core MPC separation is if I'm anyhow going to have
enough MPCs/ports to pull it off without extra CAPEX; then it would be a
no-brainer.
--
++ytti
Raphael Mazelier
2016-03-25 15:46:20 UTC
Permalink
Post by Saku Ytti
It's minor benefit and I wouldn't separate MPCs based on this. Only
reason I'd do edge/core MPC separation if I'm anyhow going to have
enough MPC/ports to pull it off without extra CAPEX, then it would be
no brainer.
Interesting, but I have done the opposite, i.e. mixed edge and core links
on each MPC. The idea was to provide redundancy in case of a single MPC failure.
--
Raphael Mazelier
Saku Ytti
2016-03-25 15:55:20 UTC
Permalink
Interesting, but I have make the opposite, aka mixed edge and core link on
MPC. The idea was to provide redundancy in case of one MPC failure.
Yes, this is absolutely more important. So if you can buy just 2
MPCs, then for sure mix and match, so a single MPC failure does not kill
everything.

Only if, for organic reasons (lots of ports/capacity to the core), you can
put core and edge on different MPCs, do it; but don't 'waste' MPC
capacity to get it.
--
++ytti
Mark Tinka
2016-03-27 10:24:52 UTC
Permalink
Post by Raphael Mazelier
Interesting, but I have make the opposite, aka mixed edge and core
link on MPC. The idea was to provide redundancy in case of one MPC
failure.
This is what we do.

I can't tell you how many times it has saved us when MPC's die randomly.

Mark.

Saku Ytti
2016-03-27 10:30:32 UTC
Permalink
Post by Mark Tinka
Post by Raphael Mazelier
Interesting, but I have make the opposite, aka mixed edge and core
link on MPC. The idea was to provide redundancy in case of one MPC
failure.
This is what we do.
I can't tell you how many times it has saved us when MPC's die randomly.
I hope I was clear enough that this is absolutely more important than
edge/core separation. But if you organically have enough MPCs and
ports that you don't need to invest in more cards, then you should
separate edge+core as well.
But of course if your density is low, you're not gonna buy 4 MPCs just
to have this, and that is fine.
--
++ytti
Mark Tinka
2016-03-27 10:46:52 UTC
Permalink
Post by Saku Ytti
I hope I was clear enough that this is absolutely more important that
edge/core separation. But if you organically have enough MPCs and
ports, that you don't need to invest on more cards, then you should
separate edge+core as well.
But of course if your density is low, you're not gonna buy 4 MPCs just
to have this, and that is fine.
Agree.

With 10Gbps ports in the MPC's, and if you have an even number of MPC's,
we find that we naturally share MPC's between edge and core links, but
split the links across the MPC's.

We could end up with an MPC carrying only edge or core links at any given
time, but only if organic growth ends up with us adding only a single MPC
due to insufficient ports on a MIC in one of the pre-existing MPC's,
leading to an odd number of MPC's.

That said, we see that when we have to migrate from N x 10Gbps to
100Gbps for core-facing links, then we easily end up having separate
MPC's for the core (100Gbps ports), and separate MPC's for the edge
(10Gbps ports).

Mark.
Mark Tinka
2016-03-27 10:23:51 UTC
Permalink
Post by Raphael Mazelier
What the point to separate upstream and downstream port on different
MPC ? (apart FIB size)
I wouldn't put them on separate MPC's - I'd split them.

Two links to the core on separate MPC's + two links to customers on
separate MPC's.

That gives you both capacity and resiliency.

Mark.

Adam Vitkovsky
2016-03-25 17:42:59 UTC
Permalink
Hey Saku,
Saku Ytti
Sent: Friday, March 25, 2016 3:21 PM
Post by Luis Balbinot
A good practice on MX480s would be to keep upstream and downstream
ports at separate MPCs if possible. Depending on your config the
standard 256M RLDRAM from some cards might be an issue in the not so
near future. I'm not sure how much RLDRAM those NG cards have though.
It should be safe for at least 1.5M IPv4 FIB + reasonable IPv6 FIB.
It's pretty far future, unless you have large L3 MPLS VPN tables in addition.
There are some other benefits running separate MPC for edge and core, but
it might not make financial sense. Obviously you want core interfaces in
separate MPCs, so having 3 MPC on smallest pop, and potentially just 1
interface in each core MPC, may be just too high premium for it.
I would not specifically plan on separate MPC for edge+core, unless I'd knew
that I'm going to have large VPN tables and 1.5M won't be enough.
--
My understanding is that the MX does not (yet) support "selective VRF download" (I don't know the Juniper name for the feature).
Anyway, Cisco stopped using it as it was causing more problems than it solved.

Also, since you folks talk about converged networks, that is, mixing services and Internet on one network - have you tested how the kit performs in corner cases (DDoS)? I'd love to hear your experiences.
adam
Saku Ytti
2016-03-25 17:50:00 UTC
Permalink
On 25 March 2016 at 19:42, Adam Vitkovsky <***@gamma.co.uk> wrote:

Hey Adam,
Post by Adam Vitkovsky
My understanding is that MX does not support(yet) "selective VRF download" (don't know the juniper name for the feature)
Anyways Cisco stopped using it as it was causing more problems than it solved.
I believe Luis refers to FIB localisation, introduced in 12.3:
http://www.juniper.net/documentation/en_US/junos15.1/topics/concept/fib-localization-overview.html
Post by Adam Vitkovsky
Also since you folks talk about converged networks that is mixing services and internet on one network -have you tested how the kit performs in corner cases (DDoS), would love to hear your experiences.
I've tried to punish the MX in the lab quite a bit and have found issues in
ddos-protection behaviour, some very dramatic. But today, AFAIK, a
correctly configured MX is very robust against control-plane attacks,
much more so than the ASR9k. Out of the box, though, the ASR9k is much better
defended. And I've not yet read any lo0 filter anywhere which isn't
fundamentally broken, including the Cymru secure templates.
--
++ytti
Adam Vitkovsky
2016-03-25 19:39:04 UTC
Permalink


-----Original Message-----
Sent: Friday, March 25, 2016 5:50 PM
To: Adam Vitkovsky
Cc: Luis Balbinot; jnsp list
Subject: Re: [j-nsp] Core network design for an ISP
On 25 March 2016 at 19:42, Adam Vitkovsky
Hey Adam,
Post by Adam Vitkovsky
My understanding is that MX does not support(yet) "selective VRF
download" (don't know the juniper name for the feature) Anyways Cisco
stopped using it as it was causing more problems than it solved.
http://www.juniper.net/documentation/en_US/junos15.1/topics/concept/f
ib-localization-overview.html>
Hmm, interesting concept - then with this feature enabled, where would the VRF filter be executed: on the FIB-remote PFE or on the FIB-local PFE?
Post by Adam Vitkovsky
Also since you folks talk about converged networks that is mixing services
and internet on one network -have you tested how the kit performs in
corner cases (DDoS), would love to hear your experiences.
I've tried to punish MX in lab quite a bit and have found issues in ddos-
protection behaviour, some very dramatic. But today, AFAIK, correctly
configured MX is very robust against control-plane attacks, much more so
than ASR9k. But out-of-the-box ASR9k is much better defended. And I've not
yet read any lo0 filter anywhere which isn't fundamentally broken, including
cymry secure templates.
Sorry, I wasn't clear - I meant how the box performs when under a DDoS attack.

But yeah, I guess I know what you mean with regards to lo0 filters - I've been there. What I miss in Junos is the ability to say that only defined interfaces can be used to access the box. So one has to be very careful with the filter construction, as well as understand the lo0 filter applicability rules posted here recently.



adam







Saku Ytti
2016-03-25 19:56:27 UTC
Permalink
Post by Adam Vitkovsky
Post by Saku Ytti
http://www.juniper.net/documentation/en_US/junos15.1/topics/concept/f
ib-localization-overview.html>
Hmm interesting concept -then with this feature enabled where would the VRF filter be executed on FIB-remote PFE or FIB-local PFE?
I'm not a big fan, due to the potential for multiple NPUs being involved in
lookups and multiple fabric traversals. I'm not intimately familiar with
the feature, though.
Post by Adam Vitkovsky
Sorry I wasn’t clear I meant how the box performs when under DDoS attack.
Do you mean transit DDoS? With proper QoS, should be fine.
Post by Adam Vitkovsky
But yeah I guess I know what you mean with regards to lo0 filters I've been there, what I miss in Junos is the ability to say that only defined interfaces can be used to access the box. So one has to be very careful with the filter construction as well as understand the lo0 filter applicability rules posted here recently.
You could use interface-groups; they are mutually exclusive with some
forwarding filters, though. I've previously used interface-groups to
mark edge interfaces with 'privileged' access to the control plane, such
as DHCP.
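A rough sketch of that interface-group trick (the group number, interface and filter names are made up): tag the trusted edge port, then match the tag in the lo0 filter term that permits DHCP:

interfaces {
    ge-0/0/5 {
        unit 0 {
            family inet {
                filter {
                    group 10;                /* mark as 'privileged' edge */
                }
            }
        }
    }
}
firewall {
    family inet {
        filter protect-re {
            term dhcp-from-privileged-edge {
                from {
                    interface-group 10;
                    protocol udp;
                    destination-port [ 67 68 ];
                }
                then accept;
            }
        }
    }
}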
--
++ytti
Adam Vitkovsky
2016-03-25 20:52:38 UTC
Permalink
Sent: Friday, March 25, 2016 7:56 PM
On 25 March 2016 at 21:39, Adam Vitkovsky
http://www.juniper.net/documentation/en_US/junos15.1/topics/concept/f
Post by Adam Vitkovsky
Post by Saku Ytti
ib-localization-overview.html>
Hmm interesting concept -then with this feature enabled where would the
VRF filter be executed on FIB-remote PFE or FIB-local PFE?
I'm not big fan, due to the potential multiple NPUs involved in lookups and
multiple fabric travels. I'm not intimately familiar with the feature though.
Not a fan of VRF-based features or localization.
As far as I know you'll get lookups involving multiple NPUs either way, though I'm not aware of any multiple fabric traversals (apart from m-cast replication, god forbid :) )
Post by Adam Vitkovsky
Sorry I wasn’t clear I meant how the box performs when under DDoS
attack.
Do you mean transit DDoS? With proper QoS, should be fine.
Yeah, transit DDoS and how it flows through the chassis alongside VPN traffic. Well, "should be fine", but has anyone actually tested this, please?
Post by Adam Vitkovsky
But yeah I guess I know what you mean with regards to lo0 filters I've been
there, what I miss in Junos is the ability to say that only defined interfaces
can be used to access the box. So one has to be very careful with the filter
construction as well as understand the lo0 filter applicability rules posted
here recently.
You could use interface-groups, they are mutually exclusive with some
forwarding filters though. I've previously used interface-groups to mark edge
interfaces with 'privileged' access to control-plane, such like DHCP.
I'm not familiar with interface-groups, but I wouldn't want to restrict myself with such an elemental thing, I guess.

adam
Saku Ytti
2016-03-25 20:57:18 UTC
Permalink
Post by Adam Vitkovsky
Not a fan of VRF based features or localization
As far as I know you'll get involved with lookups on multiple NPUs either way, though I'm not aware of any multiple fabric travels (apart from m-cast replication god forbid :) )
As far as I understand, a packet coming from the core won't know which edge
interface to take, so it is sent to some edge NPU for a lookup which, if you're
lucky, has the egress interface; if not, at least we now know which edge NPU does.
Post by Adam Vitkovsky
Yeah transit DDoS and how it flows through the chassis along VPN traffic, well "should be fine" but have anyone tested this actually please?
No, not specifically. I just don't see how it's different from any
normal congestion.
--
++ytti
Paul S.
2016-03-29 13:02:34 UTC
Permalink
Hi Saku,

What would a good lo0 filter template look like, in your opinion then?
Post by Saku Ytti
And I've not yet read any lo0 filter anywhere which isn't
fundamentally broken, including cymry secure templates.
Mark Tees
2016-03-29 13:05:57 UTC
Permalink
There is a very nice example in the Doug Hanks MX book of what a
comprehensive lo0 filter looks like - complete, from memory, with
instructions on how to roll it out.
Post by Paul S.
Hi Saku,
What would a good lo0 filter template look like, in your opinion then?
Post by Saku Ytti
And I've not yet read any lo0 filter anywhere which isn't
fundamentally broken, including cymry secure templates.
--
Regards,

Mark L. Tees
Saku Ytti
2016-03-29 13:46:46 UTC
Permalink
Post by Mark Tees
There is a very nice example in the Doug Hanks MX book of what a
comprehensive lo0 filter looks like. Complete with instructions on how to
roll it out from memory.
I arrogantly stand behind my statement that all lo0 filters I've seen
are fundamentally broken.

This one is very hard to read, due to an unjustifiable level of abstraction.
I'll only quickly glance through the IPv4 part, and even more quickly through IPv6.

1. the example treats internal and external ICMP the same, causing false
positives for ICMP monitoring during an attack
2. ICMPv6 ND is not limited to hop-limit 255, meaning anyone on the Internet can
congest your next-hop resolution, not just a connected attacker
3. it treats 'tcp-established' as a magic toggle without verifying
source-address, allowing anyone to inject packets abusing potential
0-day packet-of-death parsing bugs
4. it does not consistently verify daddr, potentially allowing an L3 MPLS VPN
customer to hammer the control plane
5. it does not discriminate between the various BFD modes, which have different
security postures (single-hop can be limited to TTL 255, multihop can't; echo and
control are different)
6. it uses a 'port' match, allowing a crafted source port to reach any
destination port (BGP peers can reach the SSH port)
7. it does not discriminate between different OSPF operations; most
likely your OSPF will work with TTL==1, which is an additional hurdle for
an attacker
8. it does not limit VRRP to TTL 255
9. it does not limit copy protocols (HTTP, HTTPS, ...) to
connected/established state
10. it does not discriminate between basic and extended LDP discovery
11. it uses 'next-header' as a discard match; you should only use it as a
permit match, because it's easy to circumvent

That is just a 10-minute look. It's a very complicated approach, yet not a
particularly secure one. But at least it's less broken than the Cymru
secure template.


A few basic principles:
a) never use 'port'; all bidirectional TCP needs separate 'active' and 'passive' rules
b) never use 'prefix-list', always the directional source/destination variants
c) if you run L3 MPLS VPN, always verify 'destination-address'
d) have a long list of permits/allows, then a single discard at the end
e) if a standard makes a statement about TTL/hop-limit, use it; it's super
critical for ICMPv6 ND particularly
f) only use 'tcp-established' to make a rule more strict, not as some
handy catch-all return-traffic permitter
g) avoid a high level of abstraction; people need to be able to
review it, preferably fast - bitrot is a serious problem
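Not a template (Saku explicitly isn't sharing one), but a hedged fragment showing how a couple of those principles might look in practice for iBGP - the prefix-list names are placeholders: directional source/destination-prefix-list matching, destination-address verification, and separate 'active'/'passive' terms instead of a single 'port bgp' match:

firewall {
    family inet {
        filter protect-re {
            /* peer initiated the session towards us */
            term ibgp-passive {
                from {
                    source-prefix-list {
                        ibgp-peers;
                    }
                    destination-prefix-list {
                        my-loopbacks;
                    }
                    protocol tcp;
                    destination-port bgp;
                }
                then accept;
            }
            /* we initiated; only established return traffic from port 179 */
            term ibgp-active {
                from {
                    source-prefix-list {
                        ibgp-peers;
                    }
                    destination-prefix-list {
                        my-loopbacks;
                    }
                    protocol tcp;
                    source-port bgp;
                    tcp-established;
                }
                then accept;
            }
            /* ... more per-protocol accept terms ... */
            term discard-the-rest {
                then {
                    discard;
                }
            }
        }
    }
}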
--
++ytti
Saku Ytti
2016-03-29 14:12:15 UTC
Permalink
Great points. Would you care to share an example template implementing your suggestions?
I don't think I ethically can. I have three, but they've all been paid
for. I'm waiting to find some not-for-profit project so I can publish
something on GitHub.
--
++ytti
Raphael Mazelier
2016-03-29 19:51:26 UTC
Permalink
Post by Saku Ytti
That is just 10min look. It's very complicated approach yet not
particularly secure one. But at least it's less broken than Cymru
secure template.
Few basic principles
a) never use 'port', all bidir TCP needs 'active' and 'passive' rule separately
b) never use prefix-list, always directional source/desination
c) if you run l3 mpls vpn, always verify 'destination-address'
d) have long list of permit/allow, then single discard at the end
e) if standard makes statement about TTL/hop-limit, use it, it's super
critical for ICMPv6 ND particularly
f) only use 'tcp-established' to make rule more strict, not to have
some handy catch-all return traffic permitter
g) avoid high level of abstraction, people will need to be able to
review it, preferably fast, bitrot is serious problem
I have always found RE protection filters over-complicated and error-prone.
I stand by my very simple filter (8 terms), which is far from
perfect (and it breaks one of your rules), but at least it is understandable
and works in my environment.

The easy part is protecting from the outside; you can even use private
IPs on your core, or better, dedicate a public subnet that is not announced in the
DMZ.

The difficult part is protecting your core from your customers. And then
filtering BGP, VRRP, etc...

I think a collaborative repo on GitHub with input from different sources would be
helpful for all of us (I've grabbed many filters over the years, and
can publish them if someone is interested).
--
Raphael Mazelier
Mark Tinka
2016-03-27 10:22:44 UTC
Permalink
Post by Saku Ytti
It should be safe for at least 1.5M IPv4 FIB + reasonable IPv6 FIB.
It's pretty far future, unless you have large L3 MPLS VPN tables in
addition.
There are some other benefits running separate MPC for edge and core,
but it might not make financial sense. Obviously you want core
interfaces in separate MPCs, so having 3 MPC on smallest pop, and
potentially just 1 interface in each core MPC, may be just too high
premium for it.
I would not specifically plan on separate MPC for edge+core, unless
I'd knew that I'm going to have large VPN tables and 1.5M won't be
enough.
We do at least 2x MPC's on each edge router - place core-facing and
customer-facing links on both MPC's. If one MPC fails, you maintain
connectivity to both north and south network infrastructure, with just a
reduced amount of capacity.

Mark.
Raphael Mazelier
2016-03-25 16:01:40 UTC
Permalink
Post by Luis Balbinot
For iBGP consider multiple loopback addresses for different families. I'd
do v4 and v6 (6PE with MPLS) with one family and inet-vpn, l2vpn, etc on
another one. Even with newer REs a full table is taking quite some time to
come up.
Multiple loopbacks are always a good idea.
They make maintenance much less painful.
One loopback for inet, one for inet6, one for *vpn.
Also, a colleague of mine points out that it is good to separate the families
that support GRES from those that do not.
Post by Luis Balbinot
For IGP keep a flat area, no need to segment.
Agreed; a flat design with as few prefixes as possible.
Possibly also look at LFA (it does not cost much and it is nice to have a
pre-installed backup path).
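If LFA is of interest, the Junos knob is per interface under IS-IS; a minimal sketch (the interface name is just an example):

protocols {
    isis {
        interface xe-0/0/0.0 {
            link-protection;     /* pre-compute and install an LFA backup next hop */
        }
    }
}

'show isis backup coverage' then gives an idea of how many destinations actually end up with a usable backup path, which on a small triangle like the one described should be most of them.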
Post by Luis Balbinot
For IXs I'd recommend a separate routing-instance. This will help you avoid
stuff like someone defaulting to you and transit deviations.
OK, but this VRF would be leaked into the "dmz" VRF, so how do you avoid
that kind of leaking?
For the default leak, what about a static backup default route?
--
Raphael Mazelier

Mark Tinka
2016-03-27 10:27:14 UTC
Permalink
Post by Raphael Mazelier
Multiple loopbacks are always a good idea.
It make maintenance much more painless.
One loopback for inet, one for inet6, one for *vpn.
Also colleague of mine point that is good to separate familly who
support GRES from those who not.
I've used multiple Loopbacks for cases where RSVP tunnels need to be
strict and avoid the IGP/LDP path.

Otherwise, I've found a single Loopback to be just fine, although I can
see the case for the different Loopbacks that can be used for different
address families. I just feel it's over-engineering.

Mark.
Mark Tinka
2016-03-25 08:32:38 UTC
Permalink
Post by Matthew Crocker
Hello,
What is the current best practice for carrying full tables in MX series routers? I have 3 new MX480s coming soon and will use them to rebuild my core network (currently a mix of MX240 & MX80 routers). MPC-NG (w/ 20x1g & 10x10g MICS )& RE-S-X6-64G-BB.
If you're going for that many 10Gbps ports, consider 100Gbps instead;
although I'm not sure what your requirements for that would be.
Post by Matthew Crocker
I’m running MPLS now and have full tables in the default route instance. Does it make more sense (i.e. more secure core) to run full tables in a separate virtual-router? I’ve been doing this small ISP thing for 20+ years, Cisco before, Juniper now, I’ve always bashed my way through.
You'll get varying views on carrying the full table in a VRF or logical
system.

I find not doing this to be simple, but others on this list feel the
reverse. It's up to you.
Post by Matthew Crocker
Looking for a book, NANOG presentation or guide on what is current best practice with state of the art gear.
MPLS? BGP? IS-IS? LDP? etc.
Yes, all of those would be good.

Since you already have an MPLS network, keep it.

IS-IS is a great IGP. Definitely recommend it.

LDP is a simple way to distribute labels. But think about SR and SPRING
for the future as well.

Mark.

Raphael Mazelier
2016-03-25 14:37:19 UTC
Permalink
There is so much debate on how to construct a good network core,
but if you don't need special features, I would stay with something very
simple:

- IGP: IS-IS (over OSPF, because it doesn't rely on IP; more flexible,
more simple), carrying only loopbacks
- iBGP full mesh, with the DMZ in the main table/main VR (a VRF provides more
flexibility for a little added complexity)
- LDP for MPLS signalling (unless you really need FRR and/or QoS)

As always, KISS is a good approach :)
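A bare-bones sketch of that recipe on one of the three routers (addresses, the ISO NET and interface names are placeholders):

interfaces {
    xe-0/0/0 {
        description "to POP B";
        unit 0 {
            family inet {
                address 10.0.0.0/31;
            }
            family iso;
            family mpls;
        }
    }
    lo0 {
        unit 0 {
            family inet {
                address 192.0.2.1/32;
            }
            family iso {
                address 49.0001.1920.0200.2001.00;
            }
        }
    }
}
protocols {
    isis {
        level 1 disable;                   /* single flat level-2 area */
        interface xe-0/0/0.0 {
            point-to-point;
        }
        interface lo0.0 {
            passive;
        }
    }
    ldp {
        interface xe-0/0/0.0;
    }
    mpls {
        interface xe-0/0/0.0;
    }
    bgp {
        group ibgp {
            type internal;
            local-address 192.0.2.1;
            family inet {
                unicast;
            }
            neighbor 192.0.2.2;
            neighbor 192.0.2.3;
        }
    }
}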
--
Raphael Mazelier
Post by Matthew Crocker
Hello,
What is the current best practice for carrying full tables in MX series routers? I have 3 new MX480s coming soon and will use them to rebuild my core network (currently a mix of MX240 & MX80 routers). MPC-NG (w/ 20x1g & 10x10g MICS )& RE-S-X6-64G-BB.
I’m running MPLS now and have full tables in the default route instance. Does it make more sense (i.e. more secure core) to run full tables in a separate virtual-router? I’ve been doing this small ISP thing for 20+ years, Cisco before, Juniper now, I’ve always bashed my way through.
Looking for a book, NANOG presentation or guide on what is current best practice with state of the art gear.
MPLS? BGP? IS-IS? LDP? etc.
The network is a triangle (A -> B -> C -> A), MX480 at each POP, 10g connections between POPs, 10g connections to IX & upstreams. Most customers are fed redundantly from A & B
Thanks
-Matt
Brad Fleming
2016-03-25 14:52:02 UTC
Permalink
Might reach out to your Juniper SE. I believe they have some internal “gold configs” for different-sized ISPs that have been well tested internally. One of their configs might make a good base to start from.
Post by Matthew Crocker
Hello,
What is the current best practice for carrying full tables in MX series routers? I have 3 new MX480s coming soon and will use them to rebuild my core network (currently a mix of MX240 & MX80 routers). MPC-NG (w/ 20x1g & 10x10g MICS )& RE-S-X6-64G-BB.
I’m running MPLS now and have full tables in the default route instance. Does it make more sense (i.e. more secure core) to run full tables in a separate virtual-router? I’ve been doing this small ISP thing for 20+ years, Cisco before, Juniper now, I’ve always bashed my way through.
Looking for a book, NANOG presentation or guide on what is current best practice with state of the art gear.
MPLS? BGP? IS-IS? LDP? etc.
The network is a triangle (A -> B -> C -> A), MX480 at each POP, 10g connections between POPs, 10g connections to IX & upstreams. Most customers are fed redundantly from A & B
Thanks
-Matt
Saku Ytti
2016-03-25 15:07:28 UTC
Permalink
On 25 March 2016 at 01:57, Matthew Crocker <***@corp.crocker.com> wrote:

Hey,
Post by Matthew Crocker
I’m running MPLS now and have full tables in the default route instance. Does it make more sense (i.e. more secure core) to run full tables in a separate virtual-router? I’ve been doing this small ISP thing for 20+ years, Cisco before, Juniper now, I’ve always bashed my way through.
If you're gonna run L3 MPLS VPNs for whatever purpose, or might run
them in the future, I strongly recommend putting the Internet in a VRF. The global
table is an annoying special case, and doing route injection between the global
table and a VRF is a huge PITA in Junos. Having the Internet in a VRF completely
removes this problem.
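A hedged sketch of the Internet-in-a-VRF approach (the instance name, RD/RT and addresses are arbitrary): the eBGP sessions and customer ports live in the instance, and inet-vpn iBGP carries the routes between PEs:

routing-instances {
    internet {
        instance-type vrf;
        interface xe-0/1/0.0;                /* transit-facing port */
        route-distinguisher 64496:1;
        vrf-target target:64496:1;
        vrf-table-label;                     /* per-VRF label, IP lookup at egress */
        protocols {
            bgp {
                group transit {
                    type external;
                    neighbor 198.51.100.1 {
                        peer-as 64510;
                    }
                }
            }
        }
    }
}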
Post by Matthew Crocker
MPLS? BGP? IS-IS? LDP? etc.
Today I think you need a good reason and justification not to run MPLS,
and should default to running it. I would certainly run MPLS. As it is
greenfield, I'd try to see if running segment routing is an option
instead of LDP or RSVP. If SR is not an option, the decision between LDP
and RSVP would depend on whether you need strategic or tactical traffic
engineering. If you have sufficient capacity to carry all traffic on the
best path during normal operation and during a single failure, definitely no
RSVP; if you do not have sufficient capacity to carry all traffic on the
best path during normal operation, then definitely RSVP. If you can run
all traffic on the best path normally, but not during a single failure, then
LDP vs. RSVP might be debatable.
Even if you choose LDP, you probably want to enable RSVP on links
without configuring any LSPs, just in case, so you can do ad-hoc tactical
TE for specific needs. Maybe some PE<->PE pair requires two
non-fate-sharing paths. Or maybe your capacity planning cocked up and
you can't turn up a customer until some capacity delivery is done; you
might want to run this customer's traffic off the SPT while waiting for
capacity planning to catch up.
With SR you can cover both the LDP use cases and the tactical RSVP use cases
and not run any new protocol. Your core would run only one protocol,
the IGP.
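In configuration terms, 'RSVP armed but unused' is cheap, and a tactical LSP can be added only when a need shows up; a hedged sketch (the LSP, path names and addresses are made up):

protocols {
    rsvp {
        interface all;                       /* enabled, idle until an LSP exists */
    }
    mpls {
        /* added ad hoc, e.g. to steer one PE pair around a congested span */
        label-switched-path pe1-to-pe3-detour {
            to 192.0.2.3;
            primary via-pop-b;
        }
        path via-pop-b {
            192.0.2.2 loose;
        }
    }
}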

If you're going to have a real core (i.e. devices which do not connect
customers), then the core can be BGP-free, as long as your edge has an
iBGP full mesh or your route reflectors support ORR (optimal route
reflection).
I would start with RR-based iBGP from day 1, because RIB scale will likely hit you
before hardware FIB scale does. I would work very hard to do off-path RR
with vMX or equivalent, but I would absolutely require ORR to be there
for this solution to be acceptable.
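The RR side of that is just a cluster ID on the reflector (addresses below are placeholders); ORR support is then what lets an off-path reflector pick exits from each client's perspective rather than its own:

protocols {
    bgp {
        group rr-clients {
            type internal;
            local-address 192.0.2.10;
            cluster 192.0.2.10;              /* makes this box a route reflector */
            family inet {
                unicast;
            }
            neighbor 192.0.2.1;
            neighbor 192.0.2.2;
            neighbor 192.0.2.3;
        }
    }
}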

I'm a great supporter of separating the control plane from services, and
would run IPv6 as 6PE until one day the network is forklifted to IPv6-only and
IPv4 goes into 4PE. The only reason I might not run 6PE is if I ran SR.
The goal would be to keep signalling and state in the network to a minimum.
Software from all vendors is extremely bad, and the fewer codepaths you
need to exercise, the better.
--
++ytti
Mark Tinka
2016-03-27 10:20:58 UTC
Permalink
Post by Saku Ytti
If you're gonna run L3 MPLS VPN's for what ever purpose, or might run
in future, I strongly recommend putting Internet in VRF. Global table
is annoying special case and doing route injection between global
table and vrf is huge PITA in JunOS. Having Internet in VRF completely
removes this problem.
I find the opposite to be true, but as I mentioned before, there are
several members on this list that find Internet in a VRF to be better than
in the global table.
Post by Saku Ytti
Today I think you need good reason and justification not to run MPLS
and default to running it. I would certainly run MPLS. As it is
greenfield I'd try to see if running segment routing is an option
instead of LDP or RSVP. If SR is not an option, decision between LDP
and RSVP would depend on if you need strategic or tactical traffic
engineering. If you have sufficient capacity to carry all traffic in
best path during normal situation and single failure definitely no
RSVP, if you do not have sufficient capacity to carry all traffic in
best path during normal situation then definitely RSVP. If you can run
all traffic in best path, but not during single failure then LDP/RSVP
might be debatable.
Even if you choose LDP, you probably want to enable RSVP on links
without configuring any LSPs just in case you can do ad-hoc tactical
TE for specific needs. Like maybe some PE<->PE pair requires two
non-fate-sharing paths. Or maybe your capacity planning cocked up and
you can't turn-up customer until some capacity delivery is done, you
might want to run this customer's traffic on offSPT while waiting for
capacity planning to catch up.
With SR you can cover both LDP use-cases and tactical RSVP use-cases
and not run any new protocol. Your core would run only one protocol,
IGP.
We run LDP as standard, and RSVP tactically for customers that want specific
paths for their services.

RSVP is enabled on all core backbone links, as well as core-facing links
on edge routers. LSP's are provisioned as needed.
Post by Saku Ytti
If you're going to have real core (i.e. devices which do not connect
customers) then core can be BGP free, as long as your edge will have
iBGP full-mesh or your route reflectors support ORR (optimal route
reflection).
Unless you are native IPv6, which will necessitate BGP for IPv6 in the core.
Post by Saku Ytti
I would start with RR iBGP day1, because RIB scale likely will hit you
before hardware FIB scale. I would work very hard to do off-path RR
with vMX or equivalent, but I would absolutely require ORR to be there
for this solution to be acceptable.
I know ORR was coming to IOS XR (not sure if it has arrived yet).

I'm checking again to see if it's coming to the IOS XE/CSR1000v.
Post by Saku Ytti
I'm great supporter of separating control-plane from services and
would run IPv6 in 6PE until one day fork lift network IPV6 only and
put IPv4 in 4PE. Only reason why I might not run 6PE is if I'd run SR.
Goal would be to keep signalling and state in network to minimum.
Software from all vendors is extremely bad, and the less codepaths you
need to explore, the better.
I don't like 6PE due to fate-sharing, but I know a lot of people run it
because it reduces workload.

Mark.
Saku Ytti
2016-03-27 10:27:06 UTC
Permalink
Post by Mark Tinka
I don't like 6PE due to fate-sharing, but I know a lot of people run it
because it reduces workload.
I understand this argument, but I feel it's the opposite. Let's assume
there is a 5% chance of a failure occurring in some time frame, i.e. a 95%
chance of it not occurring. If your control plane depends on two separate
instances working to produce services, then you only have a ~90% chance
(0.95 * 0.95 = 0.9025) of no failure occurring, i.e. you have reduced
service availability by deploying more features.

Now you could argue that, sure, you maybe have a slightly higher chance
of failing one of them, but a much lower chance of failing all
of them at the same time; that may be true. But for me, having IPv6 up if
IPv4 is down is of zero value. The control-plane example you have is a
non-issue, because I'm going to need OOB to RS232 anyhow.

I would decouple my services from my control-plane signalling. And
crucially, in this particular example, if I'm running MPLS, native IPv6
is not an option: I need labeled paths to have the same IPv4 and IPv6
behaviour, RSVP applicability, convergence, BGP-free core, etc. The only
reason I would ever consider native IPv6 is if I'm also doing IPv4
lookups in the core.
--
++ytti
Mark Tinka
2016-03-27 10:43:12 UTC
Permalink
Post by Saku Ytti
I understand this argument, but I feel it's opposite. Let's assume
there is 5% chance of failure to occur in some time frame, i.e. 95% of
not occurring. If your control-plane depends on two separate instance
working to produce services then you have only 90% chance of failure
not occurring, i.e. you reduced service availability by deploying more
features.
Now you could argue that sure, you maybe have slightly higher chance
of failing one of them, but you have much lower chance of failing all
of them at same time, that may be true. But for me, having IPv6 up if
IPv4 is down is 0 value. The control-plane example you have is
non-issue, because I'm going to need OOB to RS232 anyhow.
We hit a severe Cisco bug on the ASR920 that broke MPLS. This broke IPv4
traffic, as it is encapsulated in MPLS.

We were still able to manage the box over IPv6, as IPv6 is native there -
there is currently no MPLS transport for it on the ASR920.

You may see this as a small, zero-value case for keeping both protocols
separate, but I see lots of value in it. OoB is not always a possibility,
as it becomes more difficult to do when you have a large Metro-E access
network, in locations that may have fibre but no phone lines, or poor
2G/3G/4G connectivity.

This is just one example of fate-sharing that quickly comes to mind on
this Easter Sunday. But my point is that if I can produce enough separation
for a moderate increase in human workload, while still fundamentally
keeping the actual protocol deployment simple (native IPv6 is simpler
than 6PE), why not go for it even if the gains may seem marginal on the
outside?

It's easier for me to add complexity later, than to add simplicity later.

Mark.
Saku Ytti
2016-03-27 11:06:10 UTC
Permalink
Post by Mark Tinka
We hit a severe Cisco bug on the ASR920 that broke MPLS. This broke IPv4
traffic as it is encapsulated in MPLS.
You might just as well have hit an IPv6 bug which crashes IOSd, ending in
the completely opposite conclusion. I think it's anecdotal. I think the math
supports fewer protocols and less state. With a sufficiently large sample
size, you're always going to lose with your design.
--
++ytti
Mark Tinka
2016-03-27 11:15:49 UTC
Permalink
Post by Saku Ytti
You just as well might have hit IPv6 bug which crashes iosd ending in
completely opposite conclusion. I think it's anecdotal. I think math
supports fewer protocols and states. With sufficient large sample
rate, you're always going to lose with your design.
True.

I could also have hit the same issue if MPLS were natively encapsulating
IPv6, assuming the bug was exposed to both planes. When you're stuffed,
you're stuffed.

My point is that I always want to be able to default to simplicity
first. For this specific case, disabling MPLS resolved the IPv4
forwarding problem, even though it broke MPLS services (a moot point
since signaling of [IPv4] LSP's failed anyway, with MPLS enabled).

I'm not sure what other failure vector or scenario could present itself
with a more complex design (even if it brought reduced network state and
control plane activity), but I'm willing to add a little more state in
the network if it means the end design is structurally simple, until it
has to get complex.

Mark.

Saku Ytti
2016-03-27 11:25:35 UTC
Permalink
Post by Mark Tinka
I could also have hit the same issue if MPLS were natively encapsulating
IPv6, assuming the bug was exposed to both planes. When you're stuffed,
you're stuffed.
Agreed - at the edge there is no escaping producing the edge service. But
in the core you're protected from defects in a large amount of edge-services
code if you make the core MPLS-only.
--
++ytti
Mark Tinka
2016-03-27 11:30:34 UTC
Permalink
Post by Saku Ytti
Agreed, in edge there is no escaping producing the edge service. But
in core you're protected from defects in large amount of edge services
code if you make core MPLS only.
I'm getting close.

LDPv6 (MPLS label distribution for IPv6) is now available in IOS XR.
Junos 16 will have support as well, if they stick to the schedule.

Mark.
Saku Ytti
2016-03-27 11:47:04 UTC
Permalink
Post by Mark Tinka
I'm getting close.
MPLSv6 for LDP is now available in IOS XR. Junos 16 will have support as
well, if they stick to the schedule.
I don't think this really delivers fewer states/protocols.
You're going to run another LDP session, exposing new LDP code. It's
probably going to be more fragile than either native IPv6 or 6PE.
I would not touch LDPv6, ever. I'd do 6PE or SR.
--
++ytti
Mark Tinka
2016-03-28 05:42:10 UTC
Permalink
Post by Saku Ytti
I think this is not really delivering the fewer state/protocols?
You're going to run another LDP session, exposing new LDP code. It's
probably going to more fragile than either native IPv6 or 6PE.
I would not touch LDPv6 ever. I'd do 6PE or SR.
I won't be deploying it on the early code, just testing. It'll probably
take me a year to 18 months to operationalize LDPv6.

While it will add yet another protocol, I can take BGPv6 out of the core.

I'm still looking at SR, but its deployment will depend on its level of
support and maturity compared to LDPv6 at the time I'm ready to add
either one.

Definitely won't do 6PE, but that's just me.

Mark.