Discussion:
[j-nsp] VPC mc-lag
Mehul Gajjar
2018-07-03 13:42:41 UTC
Permalink
Hello experts,

Can someone please help me understand the difference between Cisco vPC and
Juniper MC-LAG?




--

Cheers !!!
Mehul
Tomasz Mikołajek
2018-07-03 14:07:34 UTC
Permalink
Hello. Do you want to configure MC-LAG between Cisco and Juniper?

On Tue, 3 Jul 2018 at 15:43, Mehul Gajjar <***@gmail.com> wrote:

> Hello experts,
>
> Can someone please help me understand the difference between Cisco vPC and
> Juniper MC-LAG?
Mehul Gajjar
2018-07-03 14:36:14 UTC
Permalink
I need to know the difference between Cisco vPC and Juniper MC-LAG.

On Tuesday, July 3, 2018, Tomasz Mikołajek <***@gmail.com> wrote:

> Hello. Do you want to configure MC-LAG between Cisco and Juniper?

--

Cheers !!!
Mehul
Doug McIntyre
2018-07-03 14:45:22 UTC
Permalink
On Tue, Jul 03, 2018 at 08:06:14PM +0530, Mehul Gajjar wrote:
> I need to know the difference between Cisco vPC and Juniper MC-LAG.

Overall, they serve the same general function: they allow a pair of switches
to present a port-channel/aggregated-ethernet connection in a redundant
fashion.

The main difference on the Juniper side is that MC-LAG is one of several
ways to solve the same problem, alongside virtual-chassis, virtual-fabric,
etc.

So, the answer to your question is that each is just that vendor's way of
doing the same thing.
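
As a rough illustration of the Juniper side, an MC-LAG member chassis on
Junos looks something like the following (addresses, IDs and interface names
are made-up examples, and the exact knobs vary by platform and release);
Cisco's vPC reaches the same result with a vpc domain, peer link and peer
keepalive:

  # LACP facing the downstream device; both MC-LAG peers present the same system-id
  set interfaces ae0 aggregated-ether-options lacp active
  set interfaces ae0 aggregated-ether-options lacp system-id 00:01:02:03:04:05
  set interfaces ae0 aggregated-ether-options lacp admin-key 1

  # mc-ae ties this ae0 to its counterpart on the other chassis
  set interfaces ae0 aggregated-ether-options mc-ae mc-ae-id 1
  set interfaces ae0 aggregated-ether-options mc-ae chassis-id 0
  set interfaces ae0 aggregated-ether-options mc-ae mode active-active
  set interfaces ae0 aggregated-ether-options mc-ae status-control active

  # ICCP between the two peers (chassis-id 1 / status-control standby on the other box)
  set protocols iccp local-ip-addr 10.0.0.1
  set protocols iccp peer 10.0.0.2 redundancy-group-id-list 1
  set protocols iccp peer 10.0.0.2 liveness-detection minimum-interval 1000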
Mark Tinka
2018-07-03 20:19:56 UTC
Permalink
On 3/Jul/18 16:45, Doug McIntyre wrote:

> Overall, they serve the same general function: they allow a pair of switches
> to present a port-channel/aggregated-ethernet connection in a redundant
> fashion.
>
> The main difference on the Juniper side is that MC-LAG is one of several
> ways to solve the same problem, alongside virtual-chassis, virtual-fabric,
> etc.
>
> So, the answer to your question is that each is just that vendor's way of
> doing the same thing.

I've yet to hear of anyone trying to do MC-LAG between different vendors.

Sounds like a proper recipe for disaster, AFAIK, if it'll actually
launch off the pad...

Mark.
Gert Doering
2018-07-03 20:29:57 UTC
Permalink
Hi,

On Tue, Jul 03, 2018 at 10:19:56PM +0200, Mark Tinka wrote:
> I've yet to hear of anyone trying to do MC-LAG between different vendors.
>
> Sounds like a proper recipe for disaster, AFAIK, if it'll actually
> launch off the pad...

Since side A does not know that side B is actually an "MC-LAG" (if
side B does things right), I'm not sure where you expect problems.

A proper MC-LAG is not really that much different from "a LAG across
a stacked set of switches" or "a LAG across different line cards" - just
the coupling/learning protocol between the entities involved differs.
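
From the attached device's point of view it is just a standard LACP bundle
with one member link to each chassis; a minimal Junos-style sketch (interface
names are examples; EX/QFX use ether-options rather than gigether-options):

  set chassis aggregated-devices ethernet device-count 1
  # one member link towards each chassis of the MC-LAG/vPC pair
  set interfaces xe-0/0/0 gigether-options 802.3ad ae0
  set interfaces xe-0/0/1 gigether-options 802.3ad ae0
  set interfaces ae0 aggregated-ether-options lacp active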

gert
--
"If was one thing all people took for granted, was conviction that if you
feed honest figures into a computer, honest figures come out. Never doubted
it myself till I met a computer with a sense of humor."
Robert A. Heinlein, The Moon is a Harsh Mistress

Gert Doering - Munich, Germany ***@greenie.muc.de
Mark Tinka
2018-07-03 20:42:45 UTC
Permalink
On 3/Jul/18 22:29, Gert Doering wrote:

> Since side A does not know that side B is actually a "MC-LAG" (if
> side B does things right) not sure where you expect problems.
>
> A proper MC-LAG is not really that much different from "a LAG across
> a stacked set of switches" or "a LAG across different line cards" - just
> the coupling/learning protocol between the entities involved differs.

As far as the theory goes, yes, I'm with you...

But as with Segment Routing, who has actually done it? And more
importantly, done it well enough to be doing something else at 3AM
instead of waiting for a call from the NOC?

Mark.
Niall Donaghy
2018-07-04 08:58:57 UTC
Permalink
Hi Mark,

As for segment routing, several of our NREN partners have SR up and running in their backbones.
We in GÉANT (the backbone that connects these NRENs) are looking toward deploying SR across our entire backbone in the medium term.

Can't comment on MC-LAG though. :)

Br,
Niall

Niall Donaghy
Senior Network Engineer
GÉANT
T: +44 (0)1223 371393
M: +44 (0) 7557770303
Skype: niall.donaghy-dante
PGP Key ID: 0x77680027
nic-hdl: NGD-RIPE

Networks • Services • People
Learn more at www.geant.org



-----Original Message-----
From: juniper-nsp [mailto:juniper-nsp-***@puck.nether.net] On Behalf Of Mark Tinka
Sent: 03 July 2018 21:42
To: Gert Doering <***@greenie.muc.de>
Cc: juniper-***@puck.nether.net
Subject: Re: [j-nsp] VPC mc-lag



On 3/Jul/18 22:29, Gert Doering wrote:

> Since side A does not know that side B is actually a "MC-LAG" (if side
> B does things right) not sure where you expect problems.
>
> A proper MC-LAG is not really that much different from "a LAG across a
> stacked set of switches" or "a LAG across different line cards" - just
> the coupling/learning protocol between the entities involved differs.

As far as the theory goes, yes, I'm with you...

But like with Segment Routing, who has actually done it? And more importantly, done it to be doing anything else at 3AM in lieu of waiting for a call from the NOC?

Mark.
Mark Tinka
2018-07-04 09:09:36 UTC
Permalink
On 4/Jul/18 10:58, Niall Donaghy wrote:
> Hi Mark,
>
> As for segment routing, several of our NREN partners have SR up and running in their backbones.
> We in GÉANT (the backbone that connects these NRENs) are looking toward deploying SR across our entire backbone in the medium term.

Thanks, Niall. This will probably be the first deployment of SR in the
wild that I've heard of (I'm on the PC for several NOGs, and getting a
submission on SR from anyone other than a vendor has been the bane of my
PC existence since 2013).

I'll be in London this August. Can we have a beer? I'll buy...

Mark.
Niall Donaghy
2018-07-04 11:05:44 UTC
Permalink
Hi Mark,



Though I'm attached to our Cambridge office, I work from home in Ireland.
If I were in England I surely would accept a 's/beer/coffee/g'.
Let me know if you're ever in Belfast or Dublin?

I suggest you email me off-list with what information you're after.
I can maybe put you in touch with the relevant folk (not me, yet - mountains of work to do before we get to SR readiness; just theoretical at the moment).

Br,

Niall





From: Mark Tinka [mailto:***@seacom.mu]
Sent: 04 July 2018 10:10
To: Niall Donaghy <***@geant.org>; Gert Doering <***@greenie.muc.de>
Cc: juniper-***@puck.nether.net
Subject: Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)





On 4/Jul/18 10:58, Niall Donaghy wrote:

Hi Mark,

As for segment routing, several of our NREN partners have SR up and running in their backbones.
We in GÉANT (the backbone that connects these NRENs) are looking toward deploying SR across our entire backbone in the medium term.


Thanks, Niall. This will probably be the first deployment of SR in the wild that I've heard of (I'm on the PC for several NOG's, and getting a submission on SR from anyone other than a vendor has been the bane of my PC existence since 2013).

I'll be in London this August. Can we have a beer? I'll buy...

Mark.
Mark Tinka
2018-07-04 11:39:47 UTC
Permalink
Much obliged, Niall.

Mark.

On 4/Jul/18 13:05, Niall Donaghy wrote:
>
> I suggest you email me off-list with what information you're after.
>
> I can maybe put you in touch with the relevant folk (not me, yet -
> mountains of work to do before we get to SR readiness; just
> theoretical at the moment).
>
Aaron Gould
2018-07-04 21:25:09 UTC
Permalink
I'm concerned about how to go from my LDP environment to SR/SPRING, and about what happens if some of my gear doesn't support SR/SPRING. Is this LDP/SR mapping thing easy?


Aaron

> On Jul 4, 2018, at 6:05 AM, Niall Donaghy <***@geant.org> wrote:
>
> As for segment routing, several of our NREN partners have SR up and running in their backbones.
> We in GÉANT (the backbone that connects these NRENs) are looking toward deploying SR across our entire backbone in the medium term.

Gustav Ulander
2018-07-04 21:46:00 UTC
Permalink
Yeah, we were eyeing SR also, but like Aaron said, the whole mapping between LDP and SR was too big a risk for us, so it wasn't worth the pros of SR.
We weren't actually looking at including a controller, rather just getting it done at the same time we changed our P boxes.
So no real business case meant the risk was too big.
Has anyone actually managed to verify a business case with SR? I'm guessing those mentioned below did?

//gustav

-----Original Message-----
From: juniper-nsp <juniper-nsp-***@puck.nether.net> On Behalf Of Aaron Gould
Sent: 4 July 2018 23:25
To: Niall Donaghy <***@geant.org>
Cc: juniper-***@puck.nether.net
Subject: Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)

I'm concerned how to go from my LDP environment to SR/SPRING and what if some of my gear doesn't support SR/SPRING ? Is this LDP/SR mapping thing easy ?


Aaron

a***@netconsultings.com
2018-07-05 08:18:02 UTC
Permalink
> Of Gustav Ulander
> Sent: Wednesday, July 04, 2018 10:46 PM
>
> Has anyone actually managed to verify a business case with SR? Im guessing
> those mentioned bellow did?
>
As I see it, currently the only feasible business case for SR is if you have outgrown your scaling limits with regard to the amount of RSVP state, or you plan on deploying a solution that will not scale using "stateful" RSVP (in other words, only if you have to).
Clearly, if you are looking at SR you already have a valid business case for TE, and in my opinion the only business reason not to leverage the years of development and debugging done for RSVP, and to become a pioneer in SR instead, is if there's an absolute need.
If you're using RSVP solely for TE purposes, not to enforce QoS, and it's contained within a single AS/IGP domain, then it's fairly easy; with SR some of the complexity is still there, it's just moved around.

adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::


James Bensley
2018-07-05 07:46:07 UTC
Permalink
On 4 July 2018 at 22:25, Aaron Gould <***@gvtc.com> wrote:
> I'm concerned how to go from my LDP environment to SR/SPRING and what if some of my gear doesn't support SR/SPRING ? Is this LDP/SR mapping thing easy ?
>
>
> Aaron

Hi Aaron,

I think you're running Cisco gear too, right? So hopefully it's OK if I
supply you with a Cisco link. SR has been designed to explicitly
support an LDP to SR migration. To do this you need to use an SR
mapping server and mapping client. In terms of implementation though,
this is as simple as nominating one (or preferably more) of your boxes
that support both LDP and SR to be the mapping server and client. Here
is an IOS-XR example; it's literally a couple of lines of config:

https://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/segment-routing/configuration/guide/b-seg-routing-cg-asr9k/b-seg-routing-cg-asr9k_chapter_01001.html

SR mapping nodes that support both LDP and SR will allocate SIDs for
label mappings received from your LDP-only nodes and advertise them
through IGP extensions to your SR-only nodes; vice versa, they can map
SR to LDP. There is also no problem having SR and LDP running on the
same box: set your SRGB/SRLB appropriately and SR and LDP will allocate
labels in different ranges and not overlap.
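
On the Junos side, the SR-plus-LDP coexistence piece is roughly the
following (IS-IS assumed as the IGP; the SRGB start label, index range and
node index are arbitrary examples, and the SRGB is carved out away from the
dynamic label range so SR and LDP allocations don't collide):

  # LDP stays exactly as it is during the migration
  set protocols ldp interface ge-0/0/0.0
  set protocols ldp interface lo0.0

  # enable SR/SPRING in IS-IS with an explicit SRGB and a node SID
  set protocols isis source-packet-routing srgb start-label 800000 index-range 4096
  set protocols isis source-packet-routing node-segment ipv4-index 101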

SR has been designed such that if you have a TE-free, LDP
set-and-forget type deployment, you don't need a controller to deploy
it, and a controller-free migration is natively supported. So risks
relating to the SR technology itself should be minimal.

Having said all that - I'm not telling you this works perfectly and
without bugs; the usual caveats apply, YMMV, etc. It is new code and
not all the drafts are finalised, but vendors have implemented them even
though they are still subject to change, which we all know comes with
virtually guaranteed issues ;) I'm just saying all this because I've
been reading through all the drafts lately trying to evaluate SR like
everyone else. See this link for more details on LDP to SR migration:
https://tools.ietf.org/html/draft-ietf-spring-segment-routing-ldp-interop-13

Cheers,
James.
Aaron Gould
2018-07-05 12:04:04 UTC
Permalink
Thanks a lot James, that's very nice of you to explain all that to me and the community.

I have Cisco and Juniper network.

MX960 - (5 nodes) supercore
ACX5048 - (~40 nodes) distribution
ASR9k - (15 nodes) core
ME3600 - (~50 nodes) distribution


Aaron

> On Jul 5, 2018, at 2:46 AM, James Bensley <***@gmail.com> wrote:
>
> SR has been designed to explicitly support an LDP to SR migration. To do
> this you need to use an SR mapping server and mapping client. In terms of
> implementation though, this is as simple as nominating one (or preferably
> more) of your boxes that support both LDP and SR to be the mapping server
> and client.

James Bensley
2018-07-04 16:09:13 UTC
Permalink
On 4 July 2018 at 10:09, Mark Tinka <***@seacom.mu> wrote:
>
>
> On 4/Jul/18 10:58, Niall Donaghy wrote:
>> Hi Mark,
>>
>> As for segment routing, several of our NREN partners have SR up and running in their backbones.
>> We in GÉANT (the backbone that connects these NRENs) are looking toward deploying SR across our entire backbone in the medium term.
>
> Thanks, Niall. This will probably be the first deployment of SR in the
> wild that I've heard of (I'm on the PC for several NOG's, and getting a
> submission on SR from anyone other than a vendor has been the bane of my
> PC existence since 2013).

Hi Mark,

Walmart, Microsoft and Comcast all claim to have been running SR since 2016:

http://www.segment-routing.net/conferences/2016-sr-strategy-and-deployment-experiences/

Cheers,
James.
James Bensley
2018-07-04 16:28:40 UTC
Permalink
On 4 July 2018 at 17:09, James Bensley <***@gmail.com> wrote:
>
> Walmart, Microsoft and Comcast all claim to have been running SR since 2016:
>
> http://www.segment-routing.net/conferences/2016-sr-strategy-and-deployment-experiences/

Also

Clarence Filsfils from Cisco lists some of their customers who are
happy to be publicly named as running SR:

https://www.youtube.com/watch?v=NJxtvNssgA8&feature=youtu.be&t=11m50s

Cheers,
James.
Mark Tinka
2018-07-04 17:13:54 UTC
Permalink
On 4/Jul/18 18:28, James Bensley wrote:

> Also
>
> Clarence Filsfils from Cisco lists some of their customers who are
> happy to be publicly named as running SR:
>
> https://www.youtube.com/watch?v=NJxtvNssgA8&feature=youtu.be&t=11m50s

We've been struggling to get vendors to present deployments from their
customers when they submit talks around SR. So the SR talks end up
becoming updates on where SR is from a protocol development standpoint,
recaps for those that are new to SR, etc.

Perhaps those willing to talk about SR from the vendor community don't
have the "in" with their customers that folk like Clarence might, but
I'm not sure.

I'll reach out to Clarence and see if we can get him to talk about this
with one or two of his customers at an upcoming meeting.

Mark.
James Bensley
2018-07-05 08:15:21 UTC
Permalink
On 4 July 2018 at 18:13, Mark Tinka <***@seacom.mu> wrote:
>
>
> On 4/Jul/18 18:28, James Bensley wrote:
>
> Also
>
> Clarence Filsfils from Cisco lists some of their customers who are
> happy to be publicly named as running SR:
>
> https://www.youtube.com/watch?v=NJxtvNssgA8&feature=youtu.be&t=11m50s
>
>
> We've been struggling to get vendors to present deployments from their
> customers when they submit talks around SR. So the SR talks end up becoming
> updates on where SR is from a protocol development standpoint, recaps for
> those that are new to SR, e.t.c.
>
> Perhaps those willing to talk about SR from the vendor community do not have
> the in with their customers like folk like Clarence might, but I'm not sure.
>
> I'll reach out to Clarence and see if we can get him to talk about this with
> one or two of his customers at an upcoming meeting.

Hi Mark,

If you get any feedback you can publicly share I'm all ears!

As far as a greenfield deployment goes, I'm fairly convinced that SR
would be a good idea now; it would future-proof that deployment and
for our use case it does actually bring some benefits. To explain
further: we don't have one large contiguous AS or IGP; we build
regional MPLS networks, each on a private AS (and with a standalone
IGP+LDP), and use Inter-AS services to provide end-to-end services over
the core AS network between regional networks.

If we built a new regional network tomorrow these are the benefits I
see from SR over our existing IGP+LDP design:

- Obviously remove LDP which is one less protocol in the network: This
means less to configure, less for CoPPs/Lo0 filter, less for inter-op
testing as we're mixed vendor, less for operations to support.

- Easier to support: Now that labels are transported in the IGP I hope
that it will be easier to train support staff and to troubleshoot
MPLS-related issues. They don't need to check LDP is up; they should
see the SID for a prefix inside the IGP along with the prefix. No
prefix, then no SID, etc. I would ideally move all services into BGP
(so no more LDP-signalled pseudowires; a BGP-signalled-only service
model to unify all services as BGP signalled [L3 VPN, L2 VPN
VPWS/EVPN/VPLS/etc.]).

- Go IPv6 native: If using IS-IS as the IGP we should be able to go
IPv4-free (untested and I haven't researched it that much!).

- Bring label mapping into the IGP: No microloops during
re-convergence as we heavily use IP FRR rLFA.

- 100% rLFA coverage: TI-LFA covers the "black spots" we currently have.

- Remove LACP from the network: SR has some nice ECMP features. I'm
not going to start an ECMP vs LAG discussion (war?), but ECMP means we
don't need LACP, which again is one less protocol for inter-op testing,
less to configure, less to support, etc. It also keeps our p-t-p links
all the same instead of two kinds, p-t-p L3 or LAG bundle (also fewer
config templates).

- Remove microBFD sessions: In the case of LAGs, in the worst-case
scenario we would have LACP, uBFD, IGP, LDP and BGP running over a set
of links between PEs; we can chop that down to just BFD, IGP and BGP
with SR. If we wish, we can still have visibility of the ECMP paths or
we can use prefix-suppression and hide them (this goes against my IPv6
only item above as I think IS-IS is missing this feature?).


The downsides that I know of are;

- Need to up-skill staff: For NOC staff it should be easy, use this
command "X" to check for prefix/label, this command "Y" to check for
label neighborship. For design and senior engineers since we don't use
MPLS-TE it shouldn't be difficult, we're typically deploying
set-and-forget LDP regional networks so they don't need to know every
single detail of SR (he said, naively).

- New code: Obviously plenty of bugs exist, in the weekly emails I
receive from Cisco and Juniper with the latest bug reports many relate
to SR. But again, any established operator should have good testing
procedures in place for new hardware and software, this is no
different to all those times Sales sold something we don't actually
do. We should all be well versed in testing new code and working out
when it's low risk enough for us to deploy. Due to our lack of MPLS-TE
I see SR as fairly low risk.


I'd be very interested to hear your or anyone else's views on the
pros and cons of SR in a greenfield network (I don't really care about
brownfield right now because we have no problems in our existing
networks that only SR can fix).

Cheers,
James.
Mark Tinka
2018-07-05 08:40:36 UTC
Permalink
On 5/Jul/18 10:15, James Bensley wrote:

>
> If you get any feedback you can publicly share I'm all ears!

Will do.

I'm currently working on getting those that have deployed it in the wild
to do a preso at an upcoming conference.


> As far as a greenfield deployment goes I'm fairly convinced that SR
> would be a good idea now, it would future proof that deployment and
> for our use case it does actually bring some benefits.

If you are deploying greenfield, then you have a good opportunity here
to go with SR.

In our case, we have different boxes from Cisco, each with varying
support for SR. This makes things very tricky, and then we need to also
throw in our Juniper gear. For me, the potential pain isn't worth the
hassle, as we are not suffering in any way that makes the move to SR
overly compelling.


> - Go IPv6 native: If using ISIS as the IGP we should be able to go
> IPv4 free (untested and I haven't research that much!).

For me, this is the #1 use-case I was going for; to be able to natively
forward IPv6 packets inside MPLS, and remove BGPv6 from within my core.

I had a discussion about this with Saku on NANOG:

    http://seclists.org/nanog/2018/May/257

Where we left things was that while the spec allows for signaling of
IPv6 in the IGP, there is no clear definition and/or implementation of
MPLSv6 in the data plane today.

For me, I don't really care whether I get MPLSv6 via LDPv6 or SR. For
the moment, LDPv6 has varying support within Cisco, so it's currently
not a migration path. SR support for MPLSv6 is unknown at the moment,
and certainly not a priority either, which leaves me with no immediate
appetite for SR.


> - Remove LACP from the network: SR has some nice ECMP features, I'm
> not going to start an ECMP vs LAG discussion (war?) but ECMP means we
> don't need LACP which again is one less protocol for inter-op testing,
> less to configure, less to support etc.It also keeps our p-t-p links
> all they same instead of two kinds, p-t-p L3 or LAG bundle (also fewer
> config templates).

I feel your pain.

As a matter of course, we stopped using LACP for IP/MPLS backbone links.
We rely on ECMP, until it makes sense to move a circuit to 100Gbps.
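
For the archives, the Junos end of that is just equal IGP metrics on the
parallel links plus the usual load-balancing export policy (the policy name
is arbitrary; despite the keyword, per-packet here gives per-flow hashing):

  set policy-options policy-statement PFE-LB then load-balance per-packet
  set routing-options forwarding-table export PFE-LB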

Mark.
James Bensley
2018-07-07 11:03:25 UTC
Permalink
On 5 July 2018 at 09:40, Mark Tinka <***@seacom.mu> wrote:
>
> In our case, we have different boxes from Cisco, each with varying support
> for SR. This makes things very tricky, and then we need to also throw in our
> Juniper gear. For me, the potential pain isn't worth the hassle, as we are
> not suffering in any way that makes the move to SR overly compelling.

Previously I mentioned that we build out greenfield regional networks,
but the core that links them is of course brownfield. We have the same
problem there: mixed Cisco and Juniper, and a reasonable amount of
variance within those two vendor selections. As previously mentioned,
there are no requirements that we can only fix with SR, and the
benefits aren't worth the truckroll to get SR-capable kit and code
everywhere.

> - Go IPv6 native: If using ISIS as the IGP we should be able to go
> IPv4 free (untested and I haven't research that much!).
>
>
> For me, this is the #1 use-case I was going for; to be able to natively
> forward IPv6 packets inside MPLS, and remove BGPv6 from within my core.
>
> I had a discussion about this with Saku on NANOG:
>
> http://seclists.org/nanog/2018/May/257
>
> Where we left things was that while the spec allows for signaling of IPv6 in
> the IGP, there is no clear definition and/or implementation of MPLSv6 in the
> data plane today.

Ah, I remember that thread. It became quite long and I was very busy
so I lost track of it. Just read through it. I also looked at LDPv6 a
while back and saw it was not well supported so passed. For us 6PE
(and eventually 6vPE as we move to Internet in a VRF) "just works".
IPv6 native in SR isn't actually enough of a reason for me to migrate
to it I don't think.

You mentioned in the NANOG thread that you wanted to remove BGP from
your core - are you using 6PE or BGP IPv6-LU on every hop in the path?
I know you are a happy user of BGP-SD so I guess it's Internet in the
GRT for you?

Cheers,
James.
Mark Tinka
2018-07-07 11:16:17 UTC
Permalink
On 7/Jul/18 13:03, James Bensley wrote:

> Ah, I remember that thread. It became quite long and I was very busy
> so I lost track of it. Just read through it. I also looked at LDPv6 a
> while back and saw it was not well supported so passed. For us 6PE
> (and eventually 6vPE as we move to Internet in a VRF) "just works".
> IPv6 native in SR isn't actually enough of a reason for me to migrate
> to it I don't think.

LDPv6 implementation in IOS XR was a bit spotty 2 years ago. After our next
round of code upgrades later this year on Cisco and Juniper, I'll see
where we are and target to get LDPv6 going before we close out 2018.


> You mentioned in the NANOG thread that you wanted to remove BGP from
> your core - are you using 6PE or BGP IPv6-LU on every hop in the path?
> I know you are a happy user of BGP-SD so I guess it's Internet in the
> GRT for you?

I removed BGPv4 from the core back in 2008 (previous job). So all IPv4
traffic is forwarded inside MPLS, purely label-switched in the core.
This is just simple LDP signaling + MPLS forwarding toward a BGP next-hop.
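
In Junos terms that model is just iBGP on the PEs resolving its next-hops
against the LDP routes in inet.3, while the P routers run only the IGP and
LDP and carry no BGP at all; a rough sketch with made-up addresses:

  # PE: iBGP to the route reflectors; next-hops resolve via inet.3 (LDP LSPs)
  set protocols bgp group IBGP type internal local-address 192.0.2.1
  set protocols bgp group IBGP family inet unicast
  set protocols bgp group IBGP neighbor 192.0.2.11
  set protocols bgp group IBGP neighbor 192.0.2.12

  # P: IGP + LDP only, no BGP configured anywhere
  set protocols isis interface all
  set protocols ldp interface all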

For IPv6, as LDPv6 is not yet fully deployed around the network, we are
still carrying BGPv6 in the core. That is normal hop-by-hop IP
forwarding. We don't like 6PE or anything like that because it still
depends on IPv4; I'd much rather run both protocols ships-in-the-night.
That way, if one of them were to break, chances are the other should
still be good.

We don't do Internet in a VRF. Seems like too much headache, but that's
just me, as I know it's a very popular architecture with many operators
out there. We carry all routing in global, as you rightly point out,
making heavy use of iBGP and communities to create excitement :-).

We love BGP-SD -- it means we can deliver the same types of services on
all platforms in all sections of the network, be it in the data centre
or Metro, or be it on a big chassis or a tiny 1U router, or be it in a
large or small PoP.

Mark.
Alexandre Guimaraes
2018-07-07 12:00:54 UTC
Permalink
My two cents from usage.

My core network, P and PE, is 100% Juniper.

We started using VPLS, based on BGP sessions; at that time we were working at a maximum of 2 or 3 new provisions per day.
Then we won a big project contract and reached 90/100 per month.
VPLS became an issue on all fronts...

Planning: low port count - the price of 10G ports on MX, and rack space usage.

Provisioning: VLAN remapping, memory usage on the routers, and 2000/2500 circuits/customers per MX.

Troubleshooting: a headache to find the signaling problem when, for example, a fiber degrades, all BGP sessions start flapping, things go crazy and the impact increases by the minute.

Operating: the VPLS routing table becomes a pain in the ass when you use multipoint connections; for some devilish reason those multipoint sites become unreachable, and the VPLS table and all the routing tables become huge to analyze.

Regarding l2circuits using LDP:

We migrated every p2p VPLS to l2circuits, and with that...

Very fast provisioning, 6 lines of configuration per box.

Fast troubleshooting,
fewer MXes, more QFX/EX4550/ACX2200.

End-to-end circuits... ripping all those routes out of the tables, keeping the tables as clean as possible.

Today I can sleep entire nights, weeks and months without the problem of those MXes dying due to VPLS memory usage, or a misconfigured VLAN causing an L2 loop.

My effort today is to work only with l2circuits... that's the challenge I am facing now...

The EX4550 will be EOS soon... with no feature-for-feature replacement... maybe the ACX5448? Who knows.

l2circuits are clean and fast... with no mistakes.
I run RSVP/MPLS/IS-IS/LDP on every MX, EX4550, QFX and ACX2200 that I have. Everyone knows everyone, and l2circuits reach every part and every city where we have our network.
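
For anyone who hasn't built one, an LDP-signalled l2circuit on Junos really
is only a handful of lines per box; a rough sketch with a made-up interface,
VLAN, VC ID and neighbour loopback:

  set interfaces ge-0/0/2 vlan-tagging
  set interfaces ge-0/0/2 encapsulation vlan-ccc
  set interfaces ge-0/0/2 unit 100 encapsulation vlan-ccc
  set interfaces ge-0/0/2 unit 100 vlan-id 100
  set protocols ldp interface lo0.0
  set protocols l2circuit neighbor 192.0.2.2 interface ge-0/0/2.100 virtual-circuit-id 100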



Regards,
Alexandre

On 7 Jul 2018, at 08:16, Mark Tinka <***@seacom.mu> wrote:

> I removed BGPv4 from the core back in 2008 (previous job). So all IPv4
> traffic is forwarded inside MPLS, purely label-switched in the core.
> This is just simple LDP signaling + MPLS forwarding toward a BGP next-hop.
Saku Ytti
2018-07-07 12:16:46 UTC
Permalink
Hey Alexandre,

I feel your frustration, but to me it feels like there is very little
sharable knowledge in your experience. You seem to compare full-blown
VPLS, with a virtual switch and MAC learning, to a single LDP martini
pseudowire. You also seem to blame BGP pseudowires for BGP transport
flapping; clearly you'd have an equally bad time if the LDP transport
flapped, but at least in BGP's case you can have redundancy via multiple
RR connections, while with LDP you're reliant on a single session.

On Sat, 7 Jul 2018 at 15:01, Alexandre Guimaraes
<***@ascenty.com> wrote:
>
> Troubleshooting: a headache to find the signaling problem when, for
> example, a fiber degrades, all BGP sessions start flapping, things go
> crazy and the impact increases by the minute.
>
> We migrated every p2p VPLS to l2circuits, and with that...
>
> Very fast provisioning, 6 lines of configuration per box.



--
++ytti
Mark Tinka
2018-07-07 12:38:04 UTC
Permalink
On 7/Jul/18 14:16, Saku Ytti wrote:

> I feel your frustration, but to me it feels like there is very little
> sharable knowledge in your experience.

Hmmh, I thought there was a fair bit, explicit and inferred. I feel
Alexandre could have gone on more than his TL;DR post allowed for :-).


> You seem to compare full-blown
> VPLS, with virtual switch and MAC learning to single LDD martini. You
> also seem to blame BGP pseudowires for BGP transport flapping, clearly
> you'd have equally bad time if LDP transport flaps, but at least in
> BGP's case you can have redundancy via multiple RR connections, with
> LDP you're reliant on the single session.

Sounded like BGP flaps were one of several problems Alexandre described,
including an unruly BGP routing table, et al.

Not sure how relevant RR redundancy is per your argument, as ultimately,
a single customer needing an end-to-end pw is mostly relying on the
uptime of the PE devices at each end of their circuit, and liveliness of
the core. If those pw's are linked by an LDP thread, what would a 2nd
LDP-based pw (if that were sensibly possible) bring to the table?  I'm
not dissing BGP-based pw signaling in any way or form, but for that,
you'd need "Router + IGP + BGP + RR + RR" to be fine. With LDP-based
signaling for just this one customer, you only need "Router + IGP + LDP"
to be fine.

Personally, I've never deployed VPLS, nor had the appetite for it. It
just seemed like a handful on paper the moment it was first published,
not to mention the war stories around it from the brave souls that
deployed it back when VPLS was the buzzword that SDN is these days. It
certainly made the case for EVPN, which I still steer clear of until I
find a requirement that can't be solved any other way.

Again, no dis to anyone running VPLS; just much respect to you for all
your nerve :-).

Mark.
Alexandre Guimaraes
2018-07-07 15:45:33 UTC
Permalink
Saku,

Mark is correct: l2circuits are end-to-end services where uptime depends on the termination points.
Inside the backbone, every l2circuit runs over an LSP with FRR, so... from one point to another, the LSP can find the best route, put a second-best route in place (standby), as many routes as you want...

If we experience a degraded fiber/service, by checking the LSP we know where to look. And after disabling that segment of the network, the problem is solved.

Troubleshooting time decreased to only a few minutes, and this makes everyone happy.

VPLS is very good when you use one port per customer, 1/10Gb. When you have to set up 10Gb trunk ports and put lots of VLANs into a VC of 5 to 10 QFX switches, things go wild... one error, an L2 loop, will shut down every customer inside that distribution POD.

Bear in mind that layer 2 still lives inside the VPLS instance on the MX routers and drops down to the VC. You can loop easily; yes, here the ops team caused an L2 loop twice in one year, dropping more than 500 customers, with a big impact. After the second one, the order was to move to
l2circuits: MAC learning and so on stays on the CPE equipment, and the core stays clean.

I'm just sharing my operational experience and fears about the quality of the services offered... we provide last-mile connectivity for carriers like Verizon, Orange, Algar, British Telecom and so on, so quality of service and availability is the rule of the business. We also provide 10/40/100Gbps l2circuits for ISPs over here. So quality and uptime are the focus.

The ELS CLI gives me some problems: QinQ L2TP services don't work, and neither does RTG, which is why I'm still using the EX2200 and EX3300.

About those Aristas, I have no word on whether anyone uses them as a P router or whether they have xconnect/l2circuit services; not even the local Arista dealer knows... we have some Aristas working only at layer 2/3 for colocation services inside our datacenter facilities.



Regards,
Alexandre

On 7 Jul 2018, at 09:38, Mark Tinka <***@seacom.mu> wrote:



Not sure how relevant RR redundancy is per your argument, as ultimately, a single customer needing an end-to-end pw is mostly relying on the uptime of the PE devices at each end of their circuit, and liveliness of the core. If those pw's are linked by an LDP thread, what would a 2nd LDP-based pw (if that were sensibly possible) bring to the table? I'm not dissing BGP-based pw signaling in any way or form, but for that, you'd need "Router + IGP + BGP + RR + RR" to be fine. With LDP-based signaling for just this one customer, you only need "Router + IGP + LDP" to be fine.

Mark.
Saku Ytti
2018-07-07 16:26:54 UTC
Permalink
On Sat, 7 Jul 2018 at 18:45, Alexandre Guimaraes
<***@ascenty.com> wrote:

Hey Alexandre,

> Mark is correct, l2circuits are end-to-end services where uptime is based at the termination points,
> Inside the backbone, every l2circuits works under LSP with FRR, so... from on point to another, LSP can search the best route, put in place a second best route(standby), how many routes you want...

You can (and should) run your iBGP inside LSP too, so there is no
difference, except iBGP can be redundant.

Consider this

Edge1----Core1----Edge2
  |         |        |
  +-------Core2------+

Now imagine you have L2 transport between Edge1 and Edge2. With iBGP,
Edge[12] would receive the information redundantly over both core
interfaces; to lose signalling information, either end needs to lose
state on both BGP sessions.
With LDP it's a single session; if that becomes sufficiently lossy that
it flaps, it's done, you lose state.

So BGP just provides more signalling redundancy.
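
The BGP-signalled (Kompella) flavour being argued for here looks roughly
like this on a Junos PE, with the l2vpn NLRI riding the same redundant RR
sessions as everything else (instance name, RD/RT, site IDs and addresses
are made-up examples):

  # the l2vpn signalling rides the ordinary redundant iBGP sessions to the RRs
  set protocols bgp group IBGP type internal local-address 192.0.2.1
  set protocols bgp group IBGP family l2vpn signaling
  set protocols bgp group IBGP neighbor 192.0.2.11
  set protocols bgp group IBGP neighbor 192.0.2.12

  # a p2p pseudowire expressed as a two-site Kompella L2VPN instance
  set routing-instances EPIPE-100 instance-type l2vpn
  set routing-instances EPIPE-100 interface ge-0/0/2.100
  set routing-instances EPIPE-100 route-distinguisher 192.0.2.1:100
  set routing-instances EPIPE-100 vrf-target target:65000:100
  set routing-instances EPIPE-100 protocols l2vpn encapsulation-type ethernet-vlan
  set routing-instances EPIPE-100 protocols l2vpn site CE-A site-identifier 1
  set routing-instances EPIPE-100 protocols l2vpn site CE-A interface ge-0/0/2.100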

> If we experience a degraded fiber/service, by checking the LSP we know where to look. And after disabling that segment of the network, the problem is solved.
> Troubleshooting time decreased to only a few minutes, and this makes everyone happy.

What I was trying to say is that your example is an implementation
detail, not a fundamental problem of BGP-signalled pseudowires.

> VPLS is very good when you use one port per customer, 1/10Gb. When you have to set up 10Gb trunk ports and put lots of VLANs into a VC of 5 to 10 QFX switches, things go wild... one error, an L2 loop, will shut down every customer inside that distribution POD.

Here we are not comparing the same things; of course if you have L2
redundancy it's vulnerable to loops. The debate isn't whether you should
do ELAN or EPIPE, the debate is whether you should signal EPIPE with LDP
or BGP, and the examples you give making the BGP case look poorer are
implementation details, not fundamental problems with BGP-signalled
pseudowires.

--
++ytti
Mark Tinka
2018-07-07 19:58:40 UTC
Permalink
On 7/Jul/18 18:26, Saku Ytti wrote:

> You can (and should) run your iBGP inside LSP too, so there is no
> difference, except iBGP can be redundant.

I just let iBGP sessions form normally over IGP-mapped paths. Or am I
missing something.

While I know an IGP path can follow an LSP, short of IGP Shortcuts
(Autoroute Announce, as it was known in Cisco land), I think that would
be too much layering.


> Consider this
>
> Edge1----Core1----Edge2
>   |         |        |
>   +-------Core2------+
>
> Now imagine you have L2 transport between Edge1 and Edge2. With iBGP
> Edge[12] would receive information redundantly from both core
> interfaces, to lose signalling information either end needs to lose
> state to both BGP sessions.
> With LDP it's single session, if that becomes sufficiently lossy that
> it flaps, it's done, you lose state.
>
> So BGP just provides more signalling redundancy.

Hmmh - not sure I understand your use-case, Saku.

LDP forms sessions over the IGP. Whatever path is available via IGP is
what LDP will follow, which covers the signaling redundancy concern.

In most cases, LDP instability will be due to IGP instability.  IGP
instability will bring everything down. Worrying about LDP or BGP
signaling will be a luxury, at that point.


> What I was trying to tell, your example is implementation detail, not
> fundamental problem of BGP signalled pseudowire.

I think there are several times when the best of intentions in a
protocol have not materialized in practice.

What I would like to hear, though, is how you would overcome the
problems that Alexandre faced with how he, specifically, deployed
BGP-signaled VPLS.


>
> Here we are not comparing same things, of course if you have L2
> redundancy it's vulnerable to loops. The debate isn't should you do
> ELAN or EPIPE, the debate it should you signal EPIPE on LDP or BGP,
> and the examples you detail making the BGP case poorer are
> implementation detail, not fundamental problem in BGP signalled
> pseudowires.

How differently would you do it?

Mark.
Saku Ytti
2018-07-07 21:10:09 UTC
Permalink
On Sat, 7 Jul 2018 at 22:58, Mark Tinka <***@seacom.mu> wrote:

Hey Mark,

> I just let iBGP sessions form normally over IGP-mapped paths. Or am I missing something.

Alexandre's point, to which I agree, is that when you run them over an
LSP, you get all the convergence benefits of TE. But I can understand
why someone would specifically not want to run iBGP over an LSP,
particularly if they already do not run all traffic in LSPs, so it is
indeed an option for the operator. The main point was that it's not an
argument for using LDP-signalled pseudowires.

> Hmmh - not sure I understand your use-case, Saku.
>
> LDP forms sessions over the IGP. Whatever path is available via IGP is what LDP will follow, which covers the signaling redundancy concern.

If there are transport problems, as there were in Alexandre's
case, then you may have lossy transport, which normally does not mean
rerouting, so you drop 3 hellos, LDP goes down and the pseudowire goes
down. In the iBGP case, not only would you be running iBGP over both of
the physical links, but you'd also need to drop 6 hellos, which is
roughly 6 orders of magnitude less likely.

The whole point being that the argument 'we had a transport problem causing
BGP to flap' cannot be used as a rational reason to justify LDP
pseudowires.

> What I would like to hear, though, is how you would overcome the problems that Alexandre faced with how he, specifically, deployed BGP-signaled VPLS.

I would need to look at the details.

> How differently would you do it?

I would provision both p2mp and p2p with minimal differences, so as to
reduce complexity in provisioning. I would make p2p a special case of
p2mp, so that when there are exactly two attachment circuits, there
is no MAC learning.
However, if you do not do p2mp, you may have some stronger arguments
for LDP pseudowires; more so if you have some pure LDP edges, with no
BGP.

--
++ytti
Mark Tinka
2018-07-08 08:19:54 UTC
Permalink
On 7/Jul/18 23:10, Saku Ytti wrote:

> Alexandre's point, to which I agree, is that when you run them over
> LSP, you get all the convergency benefits of TE.

Unless you've got LFA (or BFD, for the poor man), in which case there is
no real incremental benefit.

We run BFD + LFA for IS-IS. We've never seen the need for RSVP-TE for
FRR requirements.
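
For context, the sort of thing meant by BFD + LFA under Junos IS-IS is roughly this (interface name and timers hypothetical):

# sub-second failure detection on the core-facing link (~900ms here)
set protocols isis interface ge-0/0/0.0 bfd-liveness-detection minimum-interval 300
set protocols isis interface ge-0/0/0.0 bfd-liveness-detection multiplier 3
# pre-compute and pre-install a loop-free alternate for destinations reached via this link
set protocols isis interface ge-0/0/0.0 link-protection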


> But I can understand
> why someone specifically would not want to run iBGP on LSP,
> particularly if they already do not run all traffic in LSPs, so it is
> indeed option for operator. Main point was, it's not an argument for
> using LDP signalled pseudowires.

We run all of our IPv4 and l2vpn pw's in (LDP-generated) LSP's. Not sure
if that counts...

I'm not sure whether there is a better reason for BGP- or LDP-signaled
pw's. I think folk just use what makes sense to them. I'm with Alexandre
in feeling that, at least in our case, BGP-based signaling for simple p2p
or p2mp pw's would be too fat.


> If there is some transport problems, as there were in Alexandre's
> case, then you may have lossy transport, which normally does not mean
> rerouting, so you drop 3 hellos and get LDP down and pseudowire down,
> in iBGP case not only would you be running iBGP on both of the
> physical links, but you'd also need to get 6 hellos down, which is
> roughly 6 orders of magnitude less likely.
>
> The whole point being using argument 'we had transport problem causing
> BGP to flap' cannot be used as rationale reason to justify LDP
> pseudowires.

So LDP will never be more stable than your IGP. Even with
over-configuration of LDP, it's still pretty difficult to mess it up so
badly that it's unstable all on its own.

If my IGP loses connectivity, I don't want a false sense of session
uptime with either LDP or BGP. I'd prefer they tear down immediately, as
that is easier to troubleshoot. What would be awkward is BGP or LDP
staying up but passing no traffic while they wait for their keepalives
and hellos to time out.


>
> I would provision both p2mp and p2p through minimal difference, as to
> reduce complexity in provisioning. I would make p2p special case of
> p2mp, so that when there are exactly two attachment circuits, there
> will be no mac learning.
> However if you do not do p2mp, you may have some stronger arguments
> for LDP pseudowires, more so, if you have some pure LDP edges, with no
> BGP.

Agreed that EVPN and VPLS better automate the provisioning of p2mp pw's.
However, this is something you can easily script for LDP as well; and
once it's up, it's up.

And with LDP building p2mp pw's, you are just managing LDP session
state. Unlike BGP, you do not also need to manage routing tables, etc.

Mark.
a***@netconsultings.com
2018-07-08 20:22:20 UTC
Permalink
> Of Mark Tinka
> Sent: Sunday, July 08, 2018 9:20 AM
>
Hi Mark,
two points

>
>
> On 7/Jul/18 23:10, Saku Ytti wrote:
>
> > Alexandre's point, to which I agree, is that when you run them over
> > LSP, you get all the convergency benefits of TE.
>
> Unless you've got LFA (or BFD, for the poor man), in which case there is no
> real incremental benefit.
>
> We run BFD + LFA for IS-IS. We've never seen the need for RSVP-TE for FRR
> requirements.
>
>
> > But I can understand
> > why someone specifically would not want to run iBGP on LSP,
> > particularly if they already do not run all traffic in LSPs, so it is
> > indeed option for operator. Main point was, it's not an argument for
> > using LDP signalled pseudowires.
>
> We run all of our IPv4 and l2vpn pw's in (LDP-generated) LSP's. Not sure if
> that counts...
>
> I'm not sure whether there is a better reason for BGP- or LDP-signaled pw's. I
> think folk just use what makes sense to them. I'm with Alexandre where I
> feel, at least in our case, BGP-based signaling for simple p2p or p2mp pw's
> would be too fat.
>
>
> > If there is some transport problems, as there were in Alexandre's
> > case, then you may have lossy transport, which normally does not mean
> > rerouting, so you drop 3 hellos and get LDP down and pseudowire down,
> > in iBGP case not only would you be running iBGP on both of the
> > physical links, but you'd also need to get 6 hellos down, which is
> > roughly 6 orders of magnitude less likely.
> >
> > The whole point being using argument 'we had transport problem causing
> > BGP to flap' cannot be used as rationale reason to justify LDP
> > pseudowires.
>
> So LDP will never be more stable than your IGP. Even with over-configuration
> of LDP, it's still pretty difficult to totally mess it up that it's unstable all on it
> own.
>
> If my IGP loses connectivity, I don't want a false sense of session uptime
> either with LDP or BGP. I'd prefer they tear-down immediately, as that is
> easier to troubleshoot. What would be awkward is BGP or LDP being up, but
> no traffic being passed, as they wait for their Keepalive Hello's to time out.
>
The only way you can be 100% sure about service availability is by
inserting test traffic onto the PW; that's why in Carrier Ethernet a good
practice is to use CFM, so you can not only take the l2ckt down if it is
broken but also pinpoint the culprit precisely, which in p2p L2 services
(with no MAC learning) is otherwise quite problematic.
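
As an illustration only (MD/MA names, MEP IDs, level and interface are all hypothetical), an up-MEP CFM session over a Junos l2ckt would look roughly like this; the CCMs ride the pseudowire end to end, and the action-profile takes the attachment circuit down when they stop arriving:

set protocols oam ethernet connectivity-fault-management action-profile PW-DOWN default-actions interface-down
set protocols oam ethernet connectivity-fault-management maintenance-domain CUST1 level 5
set protocols oam ethernet connectivity-fault-management maintenance-domain CUST1 maintenance-association PW100 continuity-check interval 1s
set protocols oam ethernet connectivity-fault-management maintenance-domain CUST1 maintenance-association PW100 mep 100 interface ge-0/0/1.100
set protocols oam ethernet connectivity-fault-management maintenance-domain CUST1 maintenance-association PW100 mep 100 direction up
set protocols oam ethernet connectivity-fault-management maintenance-domain CUST1 maintenance-association PW100 mep 100 remote-mep 200 action-profile PW-DOWN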

>
> >
> > I would provision both p2mp and p2p through minimal difference, as to
> > reduce complexity in provisioning. I would make p2p special case of
> > p2mp, so that when there are exactly two attachment circuits, there
> > will be no mac learning.
> > However if you do not do p2mp, you may have some stronger arguments
> > for LDP pseudowires, more so, if you have some pure LDP edges, with no
> > BGP.
>
> Agreed that EVPN and VPLS better automate the provisioning of p2mp pw's.
> However, this is something you can easily script for LDP as well; and once it's
> up, it's up.
>
> And with LDP building p2mp pw's, you are just managing LDP session state.
> Unlike BGP, you are not needing to also manage routing tables, e.t.c.
>
We have to distinguish here whether you're using BGP just for VC
endpoint reachability and VC label propagation (VPLS), or also to carry
end-host reachability information (EVPN); only in the latter do you need to
worry about the routing tables. In the former, the BGP function is exactly
the same as the function of a targeted LDP session - well, in the VC label
propagation bit anyway (not the auto-discovery bit, of course).
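
In Junos terms the split is just a different address family on the same iBGP session; a sketch with a hypothetical group name:

set protocols bgp group IBGP family l2vpn signaling   # BGP-VPLS / BGP-L2VPN: NLRI carries site IDs and label blocks only
set protocols bgp group IBGP family evpn signaling    # EVPN: NLRI also carries MAC/IP routes, so table size scales with end hosts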


adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::

Mark Tinka
2018-07-09 10:33:51 UTC
Permalink
On 8/Jul/18 22:22, ***@netconsultings.com wrote:

> The only way how you can be 100% sure about the service availability is
> inserting test traffic onto the PW, that's why in Carrier Ethernet a good
> practice is to use CFM so you can not only turn the L2ckt down if corrupted
> but also be able to pinpoint the culprit precisely which in p2p L2 services
> (with no mac learning) is quite problematic.

I think that problem is already at a much higher level. In this case, we
are mainly concerned about the stability of the underlying
infrastructure, never mind the stability of the service itself.


> We have to distinguish here whether you're using BGP just for the VC
> endpoint reachability and VC label propagation (VPLS) or also to carry
> end-host reachability information (EVPN) and only in the latter you need ot
> worry about the routing tables I the former the BGP function is exactly the
> same as the function of a targeted LDP session -well in the VC label
> propagation bit anyways (not the auto discovery bit of course).

Of course, LDP- or BGP-based signaling both deal with the same
requirement. How they go about it is, obviously, different.

Mark.
Mark Tinka
2018-07-07 20:10:17 UTC
Permalink
On 7/Jul/18 17:45, Alexandre Guimaraes wrote:

>
> The ELS CLI brings me some problems: QinQ L2TP services don't
> work, and the RTG doesn't work either, which is why I am still using the
> ex2200 and ex3300.

Why Juniper broke this is simply beyond me. It was working just fine...

I'm switching vendor because I think the Arista option is not only
superior, but also more scalable (much more buffer memory, more ports,
more port options, etc.).

Juniper needs to learn that they cannot behave in this way and expect to
keep our loyalty.


>
> About those Aristas, I haven't heard whether anyone uses them as a P
> router or whether they have xconnect/l2circuit services; even the local
> Arista dealer doesn't know.... we have some Aristas working only with
> layer 2/3 for colocation services inside our datacenter facilities.

I hear decent stories about folk using them as P routers.

I also hear good things about folk using them as peering or transit
routers.

Where I hear further development is needed is as a typical high-touch
edge router, which doesn't surprise me. On our end, we're giving Arista
at least 2 - 3 years to bring their software up to scratch in terms of
hardcore IP/MPLS routing vs. what Cisco, Juniper, Brocade and Nokia
(ALU) can do today. Of course, this depends on where Broadcom will be by
then vs. custom silicon, otherwise that's 3 years wasted waiting.

If your requirements are simple Ethernet + IP routing for ToR, I think
the Arista should be fine, but don't take my word for it.

Where we're currently very happy with them is in the core switching
area, with their 7508E switches. But that's just pure high-capacity,
Layer 2 Ethernet switching. No IP or MPLS applications.

Mark.
Aaron Gould
2018-07-07 17:37:15 UTC
Permalink
BGP-based VPLS isn't that much to deploy... It's actually easier than manual VPLS... It's similar to saying RR-based iBGP is easier than manually full-meshing an iBGP environment... Like ATM LANE was easier than full-meshing SPVCs, lol, God rest your soul ATM

Aaron

> On Jul 7, 2018, at 7:38 AM, Mark Tinka <***@seacom.mu> wrote:
>
>
>
>> On 7/Jul/18 14:16, Saku Ytti wrote:
>>
>> I feel your frustration, but to me it feels like there is very little
>> sharable knowledge in your experience.
>
> Hmmh, I thought there was a fair bit, explicit and inferred. I feel
> Alexandre could have gone on more than his TL;DR post allowed for :-).
>
>
>> You seem to compare full-blown
>> VPLS, with virtual switch and MAC learning, to single LDP martini. You
>> also seem to blame BGP pseudowires for BGP transport flapping, clearly
>> you'd have equally bad time if LDP transport flaps, but at least in
>> BGP's case you can have redundancy via multiple RR connections, with
>> LDP you're reliant on the single session.
>
> Sounded like BGP flaps was one of several problems Alexandre described,
> including an unruly BGP routing table, et al.
>
> Not sure how relevant RR redundancy is per your argument, as ultimately,
> a single customer needing an end-to-end pw is mostly relying on the
> uptime of the PE devices at each end of their circuit, and liveliness of
> the core. If those pw's are linked by an LDP thread, what would a 2nd
> LDP-based pw (if that were sensibly possible) bring to the table? I'm
> not dissing BGP-based pw signaling in any way or form, but for that,
> you'd need "Router + IGP + BGP + RR + RR" to be fine. With LDP-based
> signaling for just this one customer, you only need "Router + IGP + LDP"
> to be fine.
>
> Personally, I've never deployed VPLS, nor had the appetite for it. It
> just seemed like a handful on paper the moment it was first published,
> not to mention the war stories around it from the brave souls that
> deployed it when VPLS was the buzzword then that SDN is these days. It
> certainly made the case for EVPN, which I still steer clear from until I
> find a requirement that can't be solved any other way.
>
> Again, no dis to anyone running VPLS; just much respect to you for all
> your nerve :-).
>
> Mark.
Mark Tinka
2018-07-07 19:59:37 UTC
Permalink
On 7/Jul/18 19:37, Aaron Gould wrote:

> BGP based VPLS isn't that much to deploy... It's actually easier then manual VPLS ... It's similar to saying , RR-based ibgp is easier than manually full meshing an ibgp environment ....Like ATM LANE was easier then full meshing SPVC's , lol , God rest your soul ATM

I wasn't talking about BGP-based VPLS vs. Manual VPLS. I was talking
about VPLS, itself, in general.

Mark.
Mark Tinka
2018-07-07 12:37:59 UTC
Permalink
On 7/Jul/18 14:00, Alexandre Guimaraes wrote:

> My Usage Cent

Oh wow! Thanks for that, Alexandre. I love to hear such stories, as
that's what network operations is really all about - despite all the
fluff we get fed every day.


> Ex4550 will be EOS soon... with no replacement of features... maybe the ACX5448? Who knows

Slightly off-topic, we tried the EX4600 as a way to move away from the
now-EoS EX4550, but Juniper broke that badly with their evil ELS CLI. So
we are moving to Arista's 7280R, and getting rid of all the EX4550's we
have.

I know the Arista does have some MPLS capability, but that is not our
use-case. If you want to consider it for that, then I'd suggest giving
Arista a call and testing.

Have you considered the Cisco ASR920 if you are running MPLS pw's on
your EX4550's?

Mark.
Alexandre Guimaraes
2018-07-07 16:03:11 UTC
Permalink
> Have you considered the Cisco ASR920 if you are running MPLS pw's on your EX4550's?


Yes! But... on the EX4550 we have 32 ports of 1/10Gb plus, using expansion slots, more 1/10Gb or 40Gb ports, running l2circuits, QinQ L2TP, VLAN translation, RTG, local-interface switching and so on...

We eat 1/10Gb ports; the ASR920 doesn't help us with that.

I am waiting for the local Juniper team to test the new ACX5448 with the services that I run, to see which way I will go. I will keep everyone posted...

Att
Alexandre
Aaron Gould
2018-07-07 18:01:04 UTC
Permalink
I have an ACX5448 powered on in my lab, ready to play with... We can share test stories as we proceed... I think mine has Junos 18.x and I haven't done anything with it yet.

I love my dual-EX4550 data center virtual chassis. However, I enabled an MPLS VRF on one recently and didn't feel very good about proceeding... so I will revert to pure Ethernet switching. I put 40 gig into it and an AE for a fat 80 gig pipe to the PE... actually my Facebook FNA, Google GGC, Netflix OCA and Akamai AANP, plus private data center stuff, are all connected to that EX4550 virtual chassis, x2 for site diversity and redundancy. That's how solid they are. I depend on them for all that content... and have for about 5 years... pretty much never a problem. Rock solid.

Aaron

> On Jul 7, 2018, at 11:03 AM, Alexandre Guimaraes <***@ascenty.com> wrote:
>
>
>> Have you considered the Cisco ASR920 if you are running MPLS pw's on your EX4550's?
>
>
> Yes! But... Ex4550 we have 32 ports 1/10Gb, using expansion slots, more 1/10Gb or 40Gb ports. L2circuits, QinQ L2TP, vlan translation, rtg local-interface switching and so on...
>
> We eat 1/10Gb ports, ASR920 didn’t help us with that.
>
> I am waiting local Juniper team test the new ACX5448 with services that I run, to see what way I will follow, I will keep everyone posted....
>
> Att
> Alexandre
Mark Tinka
2018-07-07 20:02:53 UTC
Permalink
On 7/Jul/18 20:01, Aaron Gould wrote:

>
> I love my dual-ex4550 data center virtual chassis's However, I enabled mpls vrf on one recently and didn't feel very good about proceeding... So I will revert to pure ethernet switching, I put 40 gig into it and AE for fat 80 gig pipe to PE...actually my Facebook FNA, Google ggc, Netflix oca and Akamai aanp, plus private data center stuff are all connected to that EX4550 virtual chassis x2 for site diversity and redundancy. That's how solid they are. I depend on them for all that content... And have for about 5 years... Pretty much never a problem. Rock Solid.

We've been super happy with the EX4550, save for two issues:

* VC bandwidth doesn't scale well. You have to be careful about having
a large EX4550-based VC, either in terms of member nodes or
bandwidth being switched.

* Buffer memory is very low.

The EX4600 only took the portfolio up to about 12MB of buffer memory,
which is peanuts. But, as I've mentioned before, the back-breaker was
that ELS debacle.

So bye-bye rock solid EX4550, hello Arista. Juniper royally messed with
the pooch on this one.

No point in us trying to maintain a platform that is now EoS, and very
soon, will be EoL.

Mark.
Aaron Gould
2018-07-07 21:54:56 UTC
Permalink
Thanks Mark, I haven't been aware of any buffer deficiency in my 4550's. If something adverse is occurring, I'm not aware of it.

Thanks for the warning about large VCs... I don't really intend on going past the (2) stacked. After we outgrow them, I'll move on.


Aaron

> On Jul 7, 2018, at 3:02 PM, Mark Tinka <***@seacom.mu> wrote:
>
>
>
> On 7/Jul/18 20:01, Aaron Gould wrote:
>
>> I love my dual-ex4550 data center virtual chassis's However, I enabled mpls vrf on one recently and didn't feel very good about proceeding... So I will revert to pure ethernet switching, I put 40 gig into it and AE for fat 80 gig pipe to PE...actually my Facebook FNA, Google ggc, Netflix oca and Akamai aanp, plus private data center stuff are all connected to that EX4550 virtual chassis x2 for site diversity and redundancy. That's how solid they are. I depend on them for all that content... And have for about 5 years... Pretty much never a problem. Rock Solid.
>
> We've been super happy with the EX4550, save for two issues:
> VC bandwidth doesn't scale well. You have to be careful about having a large EX4550-based VC, either in terms of member nodes or bandwidth being switched.
> Buffer memory is very low.
> The EX4600 only took the portfolio up to about 12MB of buffer memory, which is peanuts. But, as I've mentioned before, the back-breaker was that ELS debacle.
>
> So bye-bye rock solid EX4550, hello Arista. Juniper royally messed with the pooch on this one.
>
> No point in us trying to maintain a platform that is now EoS, and very soon, will be EoL.
>
> Mark.
Mark Tinka
2018-07-08 08:34:24 UTC
Permalink
On 7/Jul/18 23:54, Aaron Gould wrote:

> Thanks Mark, I haven't been aware of any buffer deficiency in my
> 4550's.  If something adverse is occurring, I'm not aware.

The EX4550 has only 4MB of shared buffer memory. The EX4600 has only 12MB.

You need the "set class-of-service shared-buffer percent 100" command to
ensure some ports don't get starved of buffer space (which will manifest
as dropped frames, e.t.c.).

The Arista 7280R series switches have 4GB of buffer space on the
low-end, all the way to 8GB, 12GB, 16GB, 24GB and 32GB as you scale up.


>
> Thanks for the warning about large VC... I don't really intend on
> going past the (2) stacked.  After we outgrow it, I'll move on.

We are dropping the VC idea going forward. It's simpler to just have
enough, predictable bandwidth between a switch and the router.

Mark.
joel jaeggli
2018-07-09 05:52:23 UTC
Permalink
On 7/8/18 01:34, Mark Tinka wrote:
>
>
> On 7/Jul/18 23:54, Aaron Gould wrote:
>
>> Thanks Mark, I haven't been aware of any buffer deficiency in my
>> 4550's.  If something adverse is occurring, I'm not aware.
>
> The EX4550 has only 4MB of shared buffer memory. The EX4600 has only 12MB.
>
> You need the "set class-of-service shared-buffer percent 100" command to
> ensure some ports don't get starved of buffer space (which will manifest
> as dropped frames, e.t.c.).
>
> The Arista 7280R series switches have 4GB of buffer space on the
> low-end, all the way to 8GB, 12GB, 16GB, 24GB and 32GB as you scale up.

As they are a matrix of cell forwarders, either attached to each other or
to a fabric, it's probably more proper to think of that as 4GB of packet
buffer per ASIC. There is in fact a small amount onboard the ASIC and
then the large dollop of off-board GDDR5. While memory is still shared
among the locally attached ports, it's unlikely that anything is going to
starve.

>
>>
>> Thanks for the warning about large VC... I don't really intend on
>> going past the (2) stacked.  After we outgrow it, I'll move on.
>
> We are dropping the VC idea going forward. Simpler to just have enough
> bandwidth between a switch and the router that you can predict.
>
> Mark.
Mark Tinka
2018-07-09 10:46:44 UTC
Permalink
On 9/Jul/18 07:52, joel jaeggli wrote:

> As they are a matrix of cell forwarders either attached to each other or
> to a fabric it's probably more proper to think of that as 4GB of packet
> buffer per asic. There is in fact a small amount onboard the asic and
> then the large dollop of offboard gddr5. while memory is still shared
> among the locally attached ports it's unlikely that anything is going to
> starve.

Indeed.

Mark.
Mark Tinka
2018-07-07 20:16:07 UTC
Permalink
On 7/Jul/18 18:03, Alexandre Guimaraes wrote:

> Yes! But... Ex4550 we have 32 ports 1/10Gb, using expansion slots, more 1/10Gb or 40Gb ports. L2circuits, QinQ L2TP, vlan translation, rtg local-interface switching and so on...
>
> We eat 1/10Gb ports, ASR920 didn’t help us with that.

Agreed - the ASR920 lacks port density. But, it does have the features,
which come at a decent price.

Depending on how things pan out with Broadcom in the few short years to
come, I think this will be a particularly good area for Arista to pick
all their competitors off, should they come right with their IP/MPLS
software implementations.

I feel the established/traditional equipment vendors are too busy
producing half-baked Broadcom-based solutions just to have a "cheap"
option to deal with customers considering Arista or white boxes; and
focusing more on pushing their heavily-bloated "data centre" switches at
massive $$ premiums. Slowly but surely, Arista (or anyone else copying
their model) will rise to fill the gap.

Mark.
Alexandre Guimaraes
2018-07-07 21:10:13 UTC
Permalink
Saku,

Indeed, iBGP will be redundant and resilient, yes... but at a cost: 90 seconds (default timers) of unavailability and another 1-3 minutes to get back online. I know, we can change timers, use BFD and so on...

I used that before... but....

Not everyone has an MX960 or MX480 handling BGP in every part of the network. I don't... I have QFX, hundreds of them. Now imagine that on some MX you have 5/6 full routing tables coming from upstreams or peering partners, and then you experience a flap between two of those MXes exchanging full routing tables for an entire night...
At some point the routing engines get angry and stop updating routes (normally, the MX has a baaad routing-update rate). Doomsday has arrived!

Everyone goes crazy, angry customers start blaming, services inside VPLS and the VPLS itself take losses, bla bla bla...

Degraded fibers keep flapping the light on/off for less than the 30/90 seconds, so there are no iBGP alarms. No one knows what's going on...

As I said: VPLS saved me from the dark (another operations story: once upon a time we used the Portugal Telecom IP/MPLS solution). Now l2circuits brighten my days. I can sleep!

By the way, I am still using VPLS/iBGP for point-to-multipoint services.

att
Alexandre

On 7 Jul 2018, at 17:16, Mark Tinka <***@seacom.mu> wrote:

>
>
> On 7/Jul/18 18:03, Alexandre Guimaraes wrote:
>
>> Yes! But... Ex4550 we have 32 ports 1/10Gb, using expansion slots, more 1/10Gb or 40Gb ports. L2circuits, QinQ L2TP, vlan translation, rtg local-interface switching and so on...
>>
>> We eat 1/10Gb ports, ASR920 didn’t help us with that.
>
> Agreed - the ASR920 lacks port density. But, it does have the features, which come at a decent price.
>
> Depending on how things pan out with Broadcom in the few short years to come, I think this will be a particularly good area for Arista to pick all their competitors off, should they come right with their IP/MPLS software implementations.
>
> I feel the established/traditional equipment vendors are too busy producing half-baked Broadcom-based solutions just to have a "cheap" option to deal with customers considering Arista or white boxes; and focusing more on pushing their heavily-bloated "data centre" switches at massive $$ premiums. Slowly but surely, Arista (or anyone else copying their model) will rise to fill the gap.
>
> Mark.
Saku Ytti
2018-07-07 21:21:11 UTC
Permalink
Hey,


> Not everyone have MX960, MX480 handling BGP in every part of the network, I don’t have... I have QFX, hundreds of them. Now imagine in some MX, you have 5/6 full routing table coming from upstream or peerings partners. Now experience a flap between two of those MX exchanging full routing table for a entire night....
> At some point, routing engines become angry and stop updating routes(normally, MX have a baaad routing update rate). Doomsday have arrived!

I understand that, but I feel like this is specific to your
implementation, not a general thing. Obviously BGP should not be
flapping, and neither should LDP: if all the iBGP sessions carrying the
pseudowire label are flapping, of course there is an outage; if LDP is
flapping, of course there is an outage.
If you have a full-mesh iBGP topology, you have no redundancy and a
single flap is an outage, which is quite scary (nor can you change AFIs
without an outage). But in an RR scenario, a single iBGP flap does not
mean a customer-observable outage.
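
As an illustration of that redundancy (addresses and group name hypothetical), it is just a normal iBGP group pointed at two route reflectors, with the pseudowire/VPLS NLRI carried on both sessions:

set protocols bgp group IBGP type internal
set protocols bgp group IBGP local-address 192.0.2.1
set protocols bgp group IBGP family l2vpn signaling
# two RRs: losing either session leaves the VC routes intact via the other
set protocols bgp group IBGP neighbor 192.0.2.201
set protocols bgp group IBGP neighbor 192.0.2.202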

--
++ytti
Aaron Gould
2018-07-07 21:58:37 UTC
Permalink
Sweet dreams Alexandre.... I can see you counting l2circuits now... I mean sheep... I mean l2circuits...

Aaron

> On Jul 7, 2018, at 4:10 PM, Alexandre Guimaraes <***@ascenty.com> wrote:
>
> Saku,
>
> Indeed. iBGP will be redundant and resilient, yes... with a cost, 90 seconds (timers) of unavailability and more 1-3 minutes to get back online. I know, we can change timers, bfd and so on...
>
> I used that before... but....
>
> Not everyone have MX960, MX480 handling BGP in every part of the network, I don’t have... I have QFX, hundreds of them. Now imagine in some MX, you have 5/6 full routing table coming from upstream or peerings partners. Now experience a flap between two of those MX exchanging full routing table for a entire night....
> At some point, routing engines become angry and stop updating routes(normally, MX have a baaad routing update rate). Doomsday have arrived!
>
> Everyone gets crazy, angry customers blaming, services inside vpls, vpls getting loss bla bla bla....
>
> Degraded fibers keep flapping lights on/off less than 30/90 seconds, no iBGP alarms. No one knows what’s going on...
>
> As I said: VPLS save me from the dark(another Operation History: Once Upon time: we used Portugal Telecom IP/MPLS solution), Now L2circuits now enlightened my days. I can sleep!
>
> By the way, I still using VPLS/iBGP for point multipoint services.
>
> att
> Alexandre
>
> On 7 Jul 2018, at 17:16, Mark Tinka <***@seacom.mu> wrote:
>
>>
>>
>>> On 7/Jul/18 18:03, Alexandre Guimaraes wrote:
>>>
>>> Yes! But... Ex4550 we have 32 ports 1/10Gb, using expansion slots, more 1/10Gb or 40Gb ports. L2circuits, QinQ L2TP, vlan translation, rtg local-interface switching and so on...
>>>
>>> We eat 1/10Gb ports, ASR920 didn’t help us with that.
>>
>> Agreed - the ASR920 lacks port density. But, it does have the features, which come at a decent price.
>>
>> Depending on how things pan out with Broadcom in the few short years to come, I think this will be a particularly good area for Arista to pick all their competitors off, should they come right with their IP/MPLS software implementations.
>>
>> I feel the established/traditional equipment vendors are too busy producing half-baked Broadcom-based solutions just to have a "cheap" option to deal with customers considering Arista or white boxes; and focusing more on pushing their heavily-bloated "data centre" switches at massive $$ premiums. Slowly but surely, Arista (or anyone else copying their model) will rise to fill the gap.
>>
>> Mark.
Alexandre Guimaraes
2018-07-07 21:18:55 UTC
Permalink
Saku, I just forgot:

When we use l2circuits, you remove a layer of routing-protocol troubleshooting. In just a few commands you know what's going on.

In a flap, the BGP session will only be dropped after the timers are reached.

RSVP/ISIS/LDP will be affected immediately. Also, ISIS is the fundamental key to everything over here...

With BGP, you have to check everything twice, including filters everywhere, in case someone changed this or that.



att
Alexandre

On 7 Jul 2018, at 17:16, Mark Tinka <***@seacom.mu> wrote:

>
>
> On 7/Jul/18 18:03, Alexandre Guimaraes wrote:
>
>> Yes! But... Ex4550 we have 32 ports 1/10Gb, using expansion slots, more 1/10Gb or 40Gb ports. L2circuits, QinQ L2TP, vlan translation, rtg local-interface switching and so on...
>>
>> We eat 1/10Gb ports, ASR920 didn’t help us with that.
>
> Agreed - the ASR920 lacks port density. But, it does have the features, which come at a decent price.
>
> Depending on how things pan out with Broadcom in the few short years to come, I think this will be a particularly good area for Arista to pick all their competitors off, should they come right with their IP/MPLS software implementations.
>
> I feel the established/traditional equipment vendors are too busy producing half-baked Broadcom-based solutions just to have a "cheap" option to deal with customers considering Arista or white boxes; and focusing more on pushing their heavily-bloated "data centre" switches at massive $$ premiums. Slowly but surely, Arista (or anyone else copying their model) will rise to fill the gap.
>
> Mark.
Saku Ytti
2018-07-07 21:24:03 UTC
Permalink
Hey Alexandre,


> When we use l2circuits, you remove some layer of routing protocol troubleshooting. In just few command you know what’s going on.
> In a flap, BGP session will be dropped after timers reached.
>
> RSVP/ISIS/LDP will be affect immediately. Also ISIS is the fundamental key of everything over here....
> With BGP, you have to check everything twice, including filters everywhere if someone change this or change that.

All these protocols have hello timers: LDP, ISIS, RSVP, BGP. And each
of them you'd like to configure to trigger from events without delay
when possible, instead of relying on timers. Indeed, you can have the BGP
next-hop invalidated the moment the IGP informs it, allowing rapid
convergence.

--
++ytti
Alexandre Guimaraes
2018-07-07 22:05:52 UTC
Permalink
Saku,

You are correct, they do. Too many things to troubleshoot. I always try to keep my configuration simple and clean. Things have to be fast, reliable and available - 200% uptime, no downtime - at a good price.

This is part of a QFX5110 protocols configuration. It represents part of a 100Gbps P ring of QFX5110s in a specific area, collecting all the traffic that will jump to another city. The rest of the configuration is system login access and filters:

set protocols rsvp interface et-0/0/30.0 bandwidth 100g
set protocols rsvp interface et-0/0/31.0 bandwidth 100g
set protocols rsvp interface ae16.0 bandwidth 40g
set protocols mpls log-updown syslog
set protocols mpls optimize-aggressive
set protocols mpls optimize-timer 21600
set protocols mpls interface et-0/0/31.0
set protocols mpls interface et-0/0/30.0
set protocols mpls interface ae16.0
set protocols isis interface et-0/0/30.0 level 1 metric 10
set protocols isis interface et-0/0/30.0 level 2 metric 10
set protocols isis interface et-0/0/31.0 level 1 metric 10
set protocols isis interface et-0/0/31.0 level 2 metric 10
set protocols isis interface ae16.0 level 1 metric 30
set protocols isis interface ae16.0 level 2 disable
set protocols isis interface lo0.0 passive
set protocols ldp interface lo0.0

> show mpls lsp | match transit
Transit LSP: 181 sessions, 19 detours

I am not here to make my words the rule; I am just sharing my real-world deployment and operations knowledge and experience.


Att
Alexandre


-----Original Message-----
From: Saku Ytti <***@ytti.fi>
Sent: Saturday, 7 July 2018 18:24
To: Alexandre Guimaraes <***@ascenty.com>
Cc: Mark Tinka <***@seacom.mu>; Juniper List <juniper-***@puck.nether.net>
Subject: Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)

Hey Alexandre,


> When we use l2circuits, you remove some layer of routing protocol troubleshooting. In just few command you know what’s going on.
> In a flap, BGP session will be dropped after timers reached.
>
> RSVP/ISIS/LDP will be affect immediately. Also ISIS is the fundamental key of everything over here....
> With BGP, you have to check everything twice, including filters everywhere if someone change this or change that.

All these protocols have hello timers, LDP, ISIS, RSVP, BGP. And each of them you'd like to configure to trigger from events without delay when possible, instead of relying on timers. Indeed you can have BGP next-hop invalidated the moment IGP informs it, allowing rapid convergence.

--
++ytti
Mark Tinka
2018-07-08 15:26:43 UTC
Permalink
On 8/Jul/18 00:05, Alexandre Guimaraes wrote:

> I am not here to make my words the rule, I am just sharing my -Real World Deployment and Operation- knowledge and experience.

Thanks for sharing, Alexandre. This honest description of your
experiences is what I really like.

Mark.
Mark Tinka
2018-07-08 08:23:30 UTC
Permalink
On 7/Jul/18 23:24, Saku Ytti wrote:

> All these protocols have hello timers, LDP, ISIS, RSVP, BGP. And each
> of them you'd like to configure to trigger from events without delay
> when possible, instead of relying on timers. Indeed you can have BGP
> next-hop invalidated the moment IGP informs it, allowing rapid
> convergence.

For several years now, we've been happy with BFD offering this
capability, primarily to IS-IS.

In my experience, as long as IS-IS (or your favorite IGP) is stable,
upper-level protocols will be just as happy (of course, notwithstanding
environmental factors such as a slow CPU, exhausted RAM, DoS attacks,
link congestion, etc.).

Mark.
a***@netconsultings.com
2018-07-08 20:35:36 UTC
Permalink
> Of Mark Tinka
> Sent: Sunday, July 08, 2018 9:24 AM
>
>
>
> On 7/Jul/18 23:24, Saku Ytti wrote:
>
> > All these protocols have hello timers, LDP, ISIS, RSVP, BGP. And each
> > of them you'd like to configure to trigger from events without delay
> > when possible, instead of relying on timers. Indeed you can have BGP
> > next-hop invalidated the moment IGP informs it, allowing rapid
> > convergence.
>
> For several years now, we've been happy with BFD offering this capability,
> primarily to IS-IS.
>
> In my experience, as long as IS-IS (or your favorite IGP) is stable, upper level
> protocols will be just as happy (of course, notwithstanding environmental
> factors such as a slow CPU, exhausted RAM, DoS attacks, link congestion,
> e.t.c.).
>
Hold on, gents.
You are still talking about multi-hop TCP sessions, right? Sessions that
carry information that is ephemeral to the underlying transport network - why
would you want those sessions to ever go down as a result of anything going on
in the underlying transport network? That's a leaky abstraction, not good
in my opinion.
You just reroute the multi-hop control-plane TCP session around the failed
link and move on; a failed/flapping link should remain solely a data-plane
problem, right?
So in this particular case the VC label remains the same, no matter that the
transport labels change in reaction to the failed link.
The PW should go down only if one of the entities it's bound to goes
down, be it an interface or a bridge-domain at either end (or a whole PE for
that matter) - and not because there's a problem somewhere in the core.

adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::

James Bensley
2018-07-09 08:55:46 UTC
Permalink
On 8 July 2018 21:35:36 BST, ***@netconsultings.com wrote:
>Hold on gents,
>You are still talking about multi-hop TCP sessions, right? Sessions
>that
>carry information that is ephemeral to the underlying transport network
>-why
>would you want those session ever go down as a result of anything going
>on
>in the underlying transport network -that's a leaky abstraction , not
>good
>in my opinion.
>You just reroute the multi-hop control-plane TCP session around the
>failed
>link and move on, failed/flapping link should remain solely a
>data-plane
>problem right?
>So in this particular case the VC label remains the same no matter that
>transport labels change in reaction to failed link.
>The PW should go down only in case any of the entities it's bound to
>goes
>down be it a interface or a bridge-domain at either end (or a whole PE
>for
>that matter) -and not because there's a problem somewhere in the core.

I was having the exact same thoughts. LDP or BGP signalled - it should
be independent of IGP link flaps. Saku raises a good point that with
BGP signalling we can have multiple RRs, meaning that losing one
doesn't mean that the signalling state is lost from the network (so the
service stays up); however, if there is only one ingress PE, that SPoF
undermines the multiple RRs. With LDP we can signal backup pseudowires
(haven't tried with BGP?) - there is a service disruption whilst the
LDP session is detected as dead - but it does work if you have two
ingress PEs and two egress PEs and set up a crisscross topology of
pseudowires/backup-pseudowires.
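
For the archives, the LDP-side knob referred to here is the l2circuit backup neighbour; a rough sketch with hypothetical addresses:

set protocols l2circuit neighbor 192.0.2.2 interface ge-0/0/1.100 virtual-circuit-id 100
# backup PW towards a second egress PE; "standby" pre-signals it so failover is quicker
set protocols l2circuit neighbor 192.0.2.2 interface ge-0/0/1.100 backup-neighbor 192.0.2.3 virtual-circuit-id 101
set protocols l2circuit neighbor 192.0.2.2 interface ge-0/0/1.100 backup-neighbor 192.0.2.3 standby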

Cheers,
James.
Mark Tinka
2018-07-09 10:50:24 UTC
Permalink
On 9/Jul/18 10:55, James Bensley wrote:

> I was having the exact same thoughts. LDP or BGP signaled - it should
> be independent of IGP link flaps. Saku raises a good point that with
> BGP signaled we can have multiple RR's meaning that loosing one
> doesn't mean that the server state is lost from the network (so the
> service stays up) however, if there is one ingress PE that SPoF
> undermines multiple RR's. With LDP we can signal backup pseudowires
> (haven't tried with BGP?) - there is a service disruption whilst the
> LDP session is detected as dead - but it does work if you have two
> ingress PEs and two egress PEs and set up a crisscross topology of
> pseudowires/backup-pseudowires.

An unstable IGP generally implies physical link instability (hardware
resources and software bugs notwithstanding, of course).

I don't understand how having multiple RRs for redundant iBGP sessions
deals with an unstable IGP. If the IGP is unstable, it's unstable.

Perhaps the greater discussion to have is how to deal with an unstable
IGP. Is it a flapping link, and if so, do we take it out of operation if
it's bouncing rather than going hard down?

Mark.
Mark Tinka
2018-07-09 10:36:11 UTC
Permalink
On 8/Jul/18 22:35, ***@netconsultings.com wrote:

> Hold on gents,
> You are still talking about multi-hop TCP sessions, right? Sessions that
> carry information that is ephemeral to the underlying transport network -why
> would you want those session ever go down as a result of anything going on
> in the underlying transport network -that's a leaky abstraction , not good
> in my opinion.
> You just reroute the multi-hop control-plane TCP session around the failed
> link and move on, failed/flapping link should remain solely a data-plane
> problem right?
> So in this particular case the VC label remains the same no matter that
> transport labels change in reaction to failed link.
> The PW should go down only in case any of the entities it's bound to goes
> down be it a interface or a bridge-domain at either end (or a whole PE for
> that matter) -and not because there's a problem somewhere in the core.

We all wake up every day to keep the backbone up and growing.

A broken backbone will break pw signalling.

Mark.
Michael Hare via juniper-nsp
2018-07-09 14:09:50 UTC
Permalink
Great thread.

I want to emphasize (and perhaps ask Saku for clarification) the following statement.

>>All these protocols have hello timers, LDP, ISIS, RSVP, BGP. And each
>>of them you'd like to configure to trigger from events without delay
>>when possible, instead of relying on timers. Indeed you can have BGP
>>next-hop invalidated the moment IGP informs it, allowing rapid
>>convergence.

When I was a bit greener with our MPLS network I would experience the same concern as Alexandre when a dual-connected customer lost a PE, in that I would experience loss of service while waiting for BGP to time out. I briefly went down the wrong path (IMHO) of BFD everywhere, including on directly connected links (yes, I know BFD can help with control-plane issues, but 99%+ of the time for us it is a 'switch-in-the-middle' problem). I even put multihop BFD on my iBGP sessions, which I later removed. The correct configuration (at least in my experience) was to invalidate BGP next hops, as Saku points out, and keep the BGP session up (assuming the outage is less than the normal 60s - 90s timers), so if the flap was brief there was no delay re-establishing BGP and repopulating the RIB.

In our network, this means the following

set routing-options resolution rib inet.0 import limit-inet0-resolution
set policy-options policy-statement limit-inet0-resolution term reject-routes from prefix-list-filter limit-inet0-resolution exact
set policy-options policy-statement limit-inet0-resolution term reject-routes then reject
set policy-options policy-statement limit-inet0-resolution then accept
set policy-options prefix-list sync_lists-limit-inet0-resolution 0.0.0.0/0
set policy-options prefix-list sync_lists-limit-inet0-resolution $any_less_specific_routes_for_your_loopbacks

Are others doing this?

FWIW, we're doing pseudowire, redundant pseudowire, L3VPN and multipoint E-VPN (in that order of preference). Regarding services, like others, I avoid mac learning if at all possible, and have strict mac limits on our E-VPN routing instances.

-Michael

>>-----Original Message-----
>>From: juniper-nsp [mailto:juniper-nsp-***@puck.nether.net] On Behalf
>>Of Saku Ytti
>>Sent: Saturday, July 07, 2018 4:24 PM
>>To: ***@ascenty.com
>>Cc: Juniper List <juniper-***@puck.nether.net>
>>Subject: Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-
>>lag)
>>
>>Hey Alexandre,
>>
>>
>>> When we use l2circuits, you remove some layer of routing protocol
>>troubleshooting. In just few command you know what’s going on.
>>> In a flap, BGP session will be dropped after timers reached.
>>>
>>> RSVP/ISIS/LDP will be affect immediately. Also ISIS is the fundamental key
>>of everything over here....
>>> With BGP, you have to check everything twice, including filters everywhere
>>if someone change this or change that.
>>
>>All these protocols have hello timers, LDP, ISIS, RSVP, BGP. And each
>>of them you'd like to configure to trigger from events without delay
>>when possible, instead of relying on timers. Indeed you can have BGP
>>next-hop invalidated the moment IGP informs it, allowing rapid
>>convergence.
>>
>>--
>> ++ytti
Saku Ytti
2018-07-09 16:12:34 UTC
Permalink
Hey Michael,

On Mon, 9 Jul 2018 at 17:09, Michael Hare <***@wisc.edu> wrote:

> When I was a bit greener with our MPLS network I would experience the same concern as Alexandre when a dual connected customer lost a PE, in that I would experience loss of service waiting for BGP to timeout. I briefly went down the wrong path (IMHO) of BFD everywhere, including on directly connected links (yes, I know BFD can help for control plane issues, but 99%+ of the time for us it is 'switch-in-the-middle' problem). I even put multihop BFD on my iBGP sessions, which I later removed. The correct configuration (at least in my experience) was to invalidate BGP next hops, as Saku points out, and keep the BGP session up (assuming outage is less than 60s - 90s and normal timers), so I had no delay in re-estalbishing BGP and repopulating RIB in the flap was brief.

This is highly subjective, but I do not like BFD. I would only
consider BFD when L1 liveness detection does not work (L2 in between or
such). I think the set of issues undetected by L1 liveness detection is
far smaller than the set of issues caused by BFD false positives.

I like your solution of limiting resolution, because this is
essentially the same as making BGP convergence depend entirely on IGP
convergence, for ~free. Potentially add ADD-PATH and PIC and you can
have a redundant route already programmed in HW, requiring no SW
convergence to switch to the backup path.
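
For completeness, the ADD-PATH half of that on Junos is roughly the following (group name hypothetical); the PIC/pre-programmed backup part is platform-specific, so treat this purely as a sketch:

# ask for more than just the single best path (the RR side needs the matching add-path send)
set protocols bgp group IBGP family inet unicast add-path receive
set protocols bgp group IBGP family inet unicast add-path send path-count 2
# many BGP prefixes share one indirect next-hop, so an IGP event is repaired in one FIB update
set routing-options forwarding-table indirect-next-hop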

--
++ytti
Mark Tinka
2018-07-08 08:21:23 UTC
Permalink
On 7/Jul/18 23:18, Alexandre Guimaraes wrote:

> When we use l2circuits, you remove some layer of routing protocol troubleshooting. In just few command you know what’s going on.
>
> In a flap, BGP session will be dropped after timers reached.
>
> RSVP/ISIS/LDP will be affect immediately. Also ISIS is the fundamental key of everything over here....
>
> With BGP, you have to check everything twice, including filters everywhere if someone change this or change that.

Sage...

Protein...

Mark.
Andy Koch
2018-07-09 21:52:32 UTC
Permalink
On 7/7/18 7:37 AM, Mark Tinka wrote:
> Slightly off-topic, we tried the EX4600 as a way to move away from the
> now-EoS EX4550,

Hi Mark,

You mentioned a couple times in the thread that the EX4550 is EoS. I am not finding any reference to that on the Juniper site. In fact, they still tout the switch. Do you have a link to the EoS/EoL notices?

Thanks,
Andy

Andy Koch
Hoyos Consulting LLC
ofc: +1 608 616 9950
***@hoyosconsulting.com
http://www.hoyosconsulting.com

Mark Tinka
2018-07-09 22:33:04 UTC
Permalink
On 9/Jul/18 23:52, Andy Koch wrote:

>  
>
> You mentioned a couple times in the thread that the EX4550 is EoS.  I
> am not finding any reference to that on the Juniper site.  In fact,
> they still tout the switch.  Do you have a link to the EoS/EoL notices?

It was during a discussion with our Juniper SE. They insisted that they
had stopped pricing up the EX4550, as the EX4600 was the replacement
unit, despite it having slightly fewer ports.

He could have been wrong. Frankly, I was too bored with the entire
situation to follow up.

Mark.
a***@netconsultings.com
2018-07-08 19:28:09 UTC
Permalink
> Of Alexandre Guimaraes
> Sent: Saturday, July 07, 2018 1:01 PM
>
Hi Alexandre,
With the level of detail you provided, I'm afraid it seems like some of your troubles are rooted in somewhat suboptimal design choices.

> My Usage Cent
>
> My core Network, P and PE, are 100% Juniper
>
> We start using VPLS, based in BGP sessions, at that time we was working at
> maximum of 2 or 3 new provisions per day.
> We won a big project contract, we reach 90/100 per month.
> VPLS become a issue in all fronts...
>
> Planning/ low ports - price of 10G ports using MX and rack space usage
>
This is a good business case for an aggregation network built out of, say, those EX switches you mentioned, aggregating low-speed customer links into bundles of 10/40GE links towards the PEs.
This then allows you to use the potential of a PE slot fully, as dictated by the fabric, making better use of the chassis.
The carrier Ethernet features on the PE that allow you to realize such L2 service aggregation are flexible VLAN tag manipulation (push/pop/translate 1/2 tags) and per-interface VLAN ranges.
Although the EX switches don't support a per-interface VLAN range, I still think that ~4000 customers (or service VLANs) per aggregation switch is enough.
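
To illustrate the kind of tag manipulation meant here, a hypothetical Junos PE-side sketch (interface, unit and VLAN numbers made up) that pops the outer service tag on ingress and pushes it back on egress for a QinQ hand-off into an l2ckt:

set interfaces xe-0/0/10 flexible-vlan-tagging
set interfaces xe-0/0/10 encapsulation flexible-ethernet-services
set interfaces xe-0/0/10 unit 100 encapsulation vlan-ccc
set interfaces xe-0/0/10 unit 100 vlan-tags outer 100 inner 200
# strip/restore the outer (service) tag so only the customer tag crosses the PW
set interfaces xe-0/0/10 unit 100 input-vlan-map pop
set interfaces xe-0/0/10 unit 100 output-vlan-map push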

> Provisioning... vlan remap, memory usage of the routers and 2000/2500
> circuits/customers per MX
>
Templates and automation in provisioning will really make the difference once you go past a certain scale or customer-onboarding rate.

> Tshoot, a headache to find the signaling problem when, for example: fiber
> degraded, all BGP sessions start flapping and the things become crazy and
> the impact increase each minute.
>
I think BGP sessions are no different from LSP sessions in this regard, maybe just routed differently (not PE-to-PE but PE-to-RR).
Maybe running BFD on your core links for rapid problem detection, plus interface hold-down or dampening to stabilize the network, could have helped with this.
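
As a rough illustration of the hold-down idea (interface and values purely hypothetical), Junos can suppress a bouncing link before the IGP ever sees it come back:

# report loss of light immediately, but wait 2 seconds before declaring the link up again
set interfaces xe-0/0/0 hold-time up 2000 down 0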

> Operating, vpls routing table become a pain is the ass when you use
> multipoint connections and with Lucifer reason, those multipoint become
> unreachable and the vpls table and all routing tables become ruge to analyze.
>
On the huge routing-table sizes:
I think the problem of huge tables is something we all have to bear when in the business of L2/L3 VPN services.
But in Ethernet services only p2mp and mp2mp services require standard l2-switch-like MAC learning and thus exhibit this scaling problem; there's no need for MAC learning in p2p services.
So I guess you could have just disabled MAC learning on the instances that were intended to support p2p services.
Also, it's good practice to limit, contractually, how many resources each VPN customer can use - in L2 services that is, for instance, the number of MACs per interface or per bridge-domain, etc.
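
On Junos VPLS instances, the enforcement side of that contractual limit is typically something like this (instance name and numbers hypothetical):

# cap the whole VPLS instance's MAC table
set routing-instances CUST-VPLS protocols vpls mac-table-size 500
# and cap each attachment circuit individually, dropping frames from excess MACs
set routing-instances CUST-VPLS protocols vpls interface-mac-limit 100 packet-action drop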

> Regarding L2circuits using LDP.
>
But hey, I'm glad it worked out for you with the LDP-signalled PWs, and yes, I do agree the config is simpler for LDP.

adam

a***@netconsultings.com
2018-07-05 08:56:40 UTC
Permalink
> Of James Bensley
> Sent: Thursday, July 05, 2018 9:15 AM
>
> - 100% rLFA coverage: TI-LFA covers the "black spots" we currently have.
>
Yeah, that's an interesting use case you mentioned, one that I hadn't
considered: no TE need, but an FRR need.
But I guess if it was business-critical to get those blind spots
FRR-protected then you would have done something about it already, right?
So I guess it's more of a nice-to-have; now, is it enough to
expose the business to additional risk?
Like, for instance, yes you'd test the feature to death to make sure it
works under any circumstances (it's the very heart of the network after
all; if that breaks, everything breaks), but the problem I see is then
going to the next release a couple of years later - since SR is a new
thing it would have a ton of new stuff added to it by then, resulting in
a higher potential for regression bugs in comparison to LDP or RSVP,
which have been around forever and where every new release is basically
just bug fixes.

adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::


James Bensley
2018-07-06 13:04:04 UTC
Permalink
On 5 July 2018 09:56:40 BST, ***@netconsultings.com wrote:
>> Of James Bensley
>> Sent: Thursday, July 05, 2018 9:15 AM
>>
>> - 100% rLFA coverage: TI-LFA covers the "black spots" we currently have.
>>
>Yeah that's an interesting use case you mentioned, that I haven't
>considered, that is no TE need but FRR need.
>But I guess if it was business critical to get those blind spots
>FRR-protected then you would have done something about it already
>right?

Hi Adam,

Yeah correct, no mission critical services are affected by this for us, so the business obviously hasn't allocated resource to do anything about it. If it was a major issue, it should be as simple as adding an extra backhaul link to a node or shifting existing ones around (to reshape the P space and Q space to "please" the FRR algorithm).
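
(For context, the per-link LFA knob I'm referring to is roughly the following on Junos; a sketch only -the rLFA/TI-LFA variants are additional options in the same area and the exact statements vary by release:)

protocols {
    isis {
        backup-spf-options {
            /* remote LFA, where the release supports it */
            remote-backup-calculation;
        }
        interface xe-0/1/0.0 {
            point-to-point;
            /* compute a loop-free alternate for this link */
            link-protection;
        }
    }
}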

>So I guess it's more like it would be nice to have, now is it enough to
>expose the business to additional risk?
>Like for instance yes you'd test the feature to death to make sure it works
>under any circumstances (it's the very heart of the network after all if
>that breaks everything breaks), but the problem I see is then going to a
>next release couple of years later -since SR is a new thing it would have a
>ton of new stuff added to it by then resulting in higher potential for
>regression bugs with comparison to LDP or RSVP which have been around since
>ever and every new release to these two is basically just bug fixes.

Good point, I think it's worth breaking that down into two separate points/concerns:

Initial deployment bugs:
We've done stuff like pay for a CPoC with Cisco, then deployed, then had it all blow up, then paid Cisco AS to assess the situation, only to be told it's not a good design :D So we just assume a default/safe view now that no amount of testing will protect us. We ensure we have backout plans if something immediately blows up, heightened reporting for issues that take 72 hours to show up, and change freezes to cover issues that take a week to show up etc. etc. So I think as far as an initial SR deployment goes, all we can do is our best with regards to being cautious, just as we would with any major core changes. So I don't see the initial deployment as any more risky than other core projects we've undertaken like changing vendors, entire chassis replacements, code upgrades between major versions etc.

Regression bugs:
My opinion is that in the case of something like SR, which is being deployed based on early drafts, regression bugs are potentially a bigger issue than the initial deployment. I hadn't considered this. Again though, I think it's something we can reasonably prepare for. Depending on the potential impact to the business you could go as far as standing up a new chassis next to an existing one, but on the newer code version, run them in parallel, migrate services over slowly, and keep the old one up for a while before you take it down. You could do something as simple as physically replacing the routing engine, keeping the old one on site for a bit so you can quickly swap back. Or just drain the links in the IGP, downgrade the code, and then un-drain the links, if you've got some single-homed services on there. If you have OOB access and plan all the rollback config in advance, we can operationally support the risks, no differently to any other major core change.
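
(To make the "drain, downgrade, un-drain" idea concrete, this is roughly what I'd stage on the Junos side -hostname, interface and timers are all made up:)

[edit]
lab@pe1# set protocols isis overload timeout 300
lab@pe1# set protocols isis interface xe-0/1/0.0 level 2 metric 60000
lab@pe1# commit confirmed 5

Overload pulls the whole box out of the transit path, the high metric drains just one link instead, and commit confirmed rolls the change back by itself after 5 minutes if you lose the box mid-change.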

Probably the hardest part is assessing what the risk actually is. How do you know what level of additional support, monitoring and people you will need? If you under-resource a rollback of a major failure, and fuck the rollback too, you might need some new pants :)

Cheers,
James.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
a***@netconsultings.com
2018-07-08 20:57:52 UTC
Permalink
> From: James Bensley [mailto:***@gmail.com]
> Sent: Friday, July 06, 2018 2:04 PM
>
> <snip>
>
> Probably the hardest part is assessing what the risk actually is? How to know
> what level of additional support, monitoring, people, you will need. If you
> under resource a rollback of a major failure, and fuck the rollback too, you
> might need some new pants :)
>
Well yes, I suppose one could actually look at it like any other major project -an upgrade to a new SW release, or a migration from LDP to RSVP-TE, or adding a second plane -or all three together.
And apart from the tedious and rigorous testing (god, there's got to be a better way of doing SW validation testing), you made me think about scoping the fallback and contingency options in case things don't work out.
These huge projects are always carried out in a number of stages, each broken down into several individual steps; all this is to ease the deployment but also to scope the fallout in case things go south.
Like in migrations from LDP to RSVP: you go intra-POP first, then inter-POP between a pair of POPs, and so on, using small incremental steps, and all this time the fallback option is the good old LDP -maybe even well after the project is done, until the operational confidence is high enough or till the next code upgrade. And I think a similar approach can be used to de-risk an SR rollout.
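
(As a concrete picture of "LDP stays as the fallback": on Junos both label protocols happily coexist on the same core links, and RSVP-signalled paths win in inet.3 by default, so tearing the LSPs down simply drops you back onto LDP. A rough sketch, names and addresses invented:)

protocols {
    rsvp {
        interface xe-0/1/0.0;
    }
    mpls {
        /* the new RSVP-TE LSP under evaluation */
        label-switched-path to-PE2 {
            to 192.0.2.2;
        }
        interface xe-0/1/0.0;
    }
    ldp {
        /* the good old fallback stays configured throughout */
        interface xe-0/1/0.0;
        interface lo0.0;
    }
}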


adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::


_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Alexandre Guimaraes
2018-07-08 22:16:04 UTC
Permalink
Adam,

Important observation: I prefer to keep my PWs working even when a lot of segments of the network are affected by fiber cuts and so on...

Since I migrated my BGP VPLS services to l2circuits, my problems today are almost zero.

No matter what happens, the business order for everyone is to keep everything running 24/7/365 with zero downtime, no matter what.... planned maintenance doesn't count, since it is planned.

VPLS services, as I said before, caused two outages in one year due to L2 loops created by the operations team; after hours with no progress in finding the loop origin, I was called (escalated) to solve the problem.

That is what I mean from my experience: uptime, availability, quality of service and so on....

I was a Cisco CCxx for many years, with eyes blind to one vendor only.... even that vendor causes downtime, "brand" and all! My Cisco env goes down!!? Oh yes, it's a Cisco!!! Am I ok with that? Not anymore! I want peace, happy customers, and to sell more.

With the time that I have today, I can study new tech, run some lab tests, and ask for this or that from different vendors.

Today, I can sleep well without the fear that someone will loop something, or that some equipment will crash due to CPU/memory problems.

And yes, I am a Network Warrior! But now.... a warrior tech. Like Call Of Duty Infinity Warfare!

:)

att
Alexandre

On Jul 8, 2018, at 17:58, "***@netconsultings.com" <***@netconsultings.com> wrote:

>> <snip>
> Well yes I suppose one could actually look at it as on any other major project like upgrade to a new SW release, or migration from LDP to RSVP-TE or adding a second plane -or all 3 together.
> And apart from the tedious and rigorous testing (god there's got to be a better way of doing SW validation testing) you made me think about scoping the fallback and contingency options in case things down work out.
> These huge projects are always carried out in number of stages each broken down to several individual steps all this is to ease out the deployment but also to scope the fallout in case things go south.
> Like in migrations from LDP to RSVP you go intra-pop first then inter-pop between a pair of POPs and so on using small incremental steps and all this time the fallback option is the good old LDP maybe even well after the project is done until the operational confidence is high enough or till the next code upgrade. And I think a similar approach can be used to de-risk an SR rollout.
>
>
> adam
>
> netconsultings.com
> ::carrier-class solutions for the telecommunications industry::
>
>
> _______________________________________________
> juniper-nsp mailing list juniper-***@puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
a***@netconsultings.com
2018-07-09 09:58:53 UTC
Permalink
> From: Alexandre Guimaraes [mailto:***@ascenty.com]
> Sent: Sunday, July 08, 2018 11:16 PM
>
> Adam,
>
> Important observation, I prefer keep my pw working even a lot of segments
> of the network are affected by fiber cut and so on...
>
> When I migrate my BGP VPLS services to l2circuits, my problems today is
> almost Zero.
>
I don't think this was down to BGP vs. tLDP, because both of these are TCP sessions carrying the PW VC label, and in both cases the underlying LDP is used to carry information about the transport labels.
It was probably down to the inadequate use of bridge domains on the PEs.
Or do you see this as an overall trend in your network, please? That is, faulty fibre causes BGP session problems but not targeted-LDP session problems?

> No matter what happens, business order for everyone is to keep everything
> running 24/7/365 with zero downtime no matter what.... planned
> maintenance doesn’t count, since is planned.
>
> VPLS services, as I said before, cause two outages in one year due l2 loop
> caused by operation team, after hours with no progress to find the loop
> origin, I was called (escalated) to solve the problem.
>
This one I'd blame on bridge-domains.
Yes, L2 is tricky and dumb -that's why I hate it- but if configured correctly you should not run into L2 loops.
Maybe the STP BPDUs were not making it through the PEs for some reason, resulting in the inability of the L2 network to block the loop?
Also, each bridge-domain should be configured with tight "storm-control", i.e. rate-limiting of BUM traffic.
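
On the EX/QFX side that's roughly the following (a sketch, percentages arbitrary, and the syntax differs between switch generations); on MX the equivalent is typically done with flood filters/policers per bridge-domain:

forwarding-options {
    storm-control-profiles LIMIT-BUM {
        all {
            /* cap BUM traffic to a small fraction of port bandwidth */
            bandwidth-percentage 5;
        }
    }
}
interfaces {
    ge-0/0/10 {
        unit 0 {
            family ethernet-switching {
                storm-control LIMIT-BUM;
            }
        }
    }
}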

adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::

_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Mark Tinka
2018-07-09 10:54:10 UTC
Permalink
On 9/Jul/18 11:58, ***@netconsultings.com wrote:

> That is faulty fibre causes BGP session problems but not targeted-LDP
> session problems?

Faulty fibre will affect any control plane sessions.

I think what Alexandre was trying to say is that troubleshooting the
network where issues could be fibre-related is easier with LDP than with
BGP, because with BGP, you now have to check iBGP sessions, routing
policies (if you have them), BGP routing tables, e.t.c. This could
divert your attention for longer than is necessary before you realize
the fibre could be the issue.

Mark.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Alexandre Guimaraes
2018-07-09 12:28:55 UTC
Permalink
That’s correct Mark!

I don't need to run BGP in every router or switch that I have, and I don't have VPLS-capable equipment in every location, but I do have l2circuit-capable equipment in every location, because our business is to provide L2 services to carriers and corporations.
When a VPLS p2mp or mp2mp service is needed, we use l2circuits back to a routing POP (point of presence) and VPLS to bridge all the routing POPs. With IP transit L3 customers, peering partners and upstream providers, it is the same.
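
A Martini l2circuit like that is only a few lines per site on the Juniper side; a rough sketch (addresses, interfaces and VC IDs are invented):

interfaces {
    ge-0/0/5 {
        encapsulation ethernet-ccc;
        unit 0 {
            family ccc;
        }
    }
}
protocols {
    l2circuit {
        neighbor 192.0.2.10 {
            /* targeted-LDP signalled PW up to the routing POP */
            interface ge-0/0/5.0 {
                virtual-circuit-id 1234;
            }
        }
    }
}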

We don't have small users (xDSL, FTTx, GPON), so we don't need to take care of that type of usage.

We can troubleshoot BGP fast and rule out BGP problems after a few minutes, but the Operations Team and NOC Team can't, or they will get stuck at some point.

Less complexity, faster to solve the problem.



att
Alexandre

On Jul 9, 2018, at 07:54, Mark Tinka <***@seacom.mu> wrote:



On 9/Jul/18 11:58, ***@netconsultings.com wrote:

That is faulty fibre causes BGP session problems but not targeted-LDP session problems?

Faulty fibre will affect any control plane sessions.

I think what Alexandre was trying to say is that troubleshooting the network where issues could be fibre-related is easier with LDP than with BGP, because with BGP, you now have to check iBGP sessions, routing policies (if you have them), BGP routing tables, e.t.c. This could divert your attention for longer than is necessary before you realize the fibre could be the issue.

Mark.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
a***@netconsultings.com
2018-07-09 15:25:46 UTC
Permalink
> From: Mark Tinka [mailto:***@seacom.mu]
> Sent: Monday, July 09, 2018 11:54 AM
>
>> On 9/Jul/18 11:58, ***@netconsultings.com wrote:
>> That is faulty fibre causes BGP session problems but not targeted-LDP session problems?
>
> Faulty fibre will affect any control plane sessions.

Well, that really depends on the type of fault; let me explain:

Fault A) The flapping rate is within the dampening reuse limit of the core interface, so the link stays down.
In a network where there's redundancy between two edge PE nodes,
a failed fibre somewhere in the network should not affect just any control plane protocol.
Quite the opposite.
Does it affect IS-IS or LDP "sessions" (more appropriately, adjacencies)? Yes, it does.
But does it affect BGP or targeted-LDP (basically TCP) sessions between the two edge PE nodes -or the BGP sessions from the PEs to the RRs? No, it should not (if there's redundancy in the network).
And by the same token it should not cause downtime on the PW between the two edge PEs (if there's redundancy in the network)
-yes, the PW will now be routed via an alternate path circumventing the faulty fibre, but it will remain UP (unless it is bound to a TE path forcing it to stay and fail), and if the switchover is based on FRR the customer should hardly notice anything.

Fault B) The flapping rate is outside the dampening reuse limit of the core interface, so the link goes up and down at dampening-reuse-time intervals.
This is a nasty one; how nasty depends on your interface state dampening scheme.
But with FRR it should be just an inconvenience, without much effect on the customer traffic carried by the PW that keeps constantly switching paths (maybe resulting in some out-of-order packets if the primary and backup paths are not symmetric).
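
By interface state dampening I mean things like the hold-up timer below (and/or the dedicated interface damping feature, where the platform has it) -a rough sketch, timers arbitrary:

interfaces {
    xe-0/1/0 {
        /* delay declaring the link up so a flapping port cannot drag the IGP with it */
        hold-time up 5000 down 0;
    }
}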

Fault C) The link never fails.
It exhibits a packet drop rate high enough to screw your SLAs, but not high enough to take the BFD session down (which would require affecting 3 hellos in a row).
Maybe the remedy for this one could be LFM sessions instead of BFD, if your network suffers from low-quality fibres -LFM should take into account the frame and symbol error counts for all data passing through the interface, not just the LFM PDUs, and you can then use those counts as a threshold to bring the link down.
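
Roughly what I mean, as a hedged Junos sketch -the thresholds are arbitrary and the exact event/threshold statement names vary by platform and release:

protocols {
    oam {
        ethernet {
            link-fault-management {
                action-profile DIRTY-LINK-DOWN {
                    event {
                        link-event-rate {
                            /* errored-frame rate threshold before we act */
                            frame-error 50;
                        }
                    }
                    action {
                        /* take the interface down rather than leave it "up but useless" */
                        link-down;
                    }
                }
                interface xe-0/1/0 {
                    link-discovery active;
                    pdu-interval 1000;
                    apply-action-profile DIRTY-LINK-DOWN;
                }
            }
        }
    }
}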


adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::



_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Mark Tinka
2018-07-10 10:33:11 UTC
Permalink
On 9/Jul/18 17:25, ***@netconsultings.com wrote:

> Well that really depends on the type of fault, let me explain:

All agreed.

My point was that if there is enough redundancy inside the core network
to deal with fibre failures that can keep iBGP sessions up, it will also
keep LDP sessions up, but most importantly, traffic will continue to flow.

The issue I have is when sessions remain up (because of high Keepalive
timers), but there is no actual data plane. This is why I am saying that
it would be remiss of us to give the community the impression that
session uptime is better than what happens at the transport level,
particularly where link redundancy may not be clear enough to abstract
the relationship between both.

Mark.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Saku Ytti
2018-07-10 10:39:02 UTC
Permalink
On Tue, 10 Jul 2018 at 13:33, Mark Tinka <***@seacom.mu> wrote:

> My point was that if there is enough redundancy inside the core network
> to deal with fibre failures that can keep iBGP sessions up, it will also
> keep LDP sessions up, but most importantly, traffic will continue to flow.

I'd say this is true if you compare full-mesh iBGP and LDP; if you
compare RR iBGP and LDP, it's not true: RR iBGP has signalling
redundancy that LDP does not have.

--
++ytti
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Mark Tinka
2018-07-10 11:14:13 UTC
Permalink
On 10/Jul/18 12:39, Saku Ytti wrote:

> I'd say this is true if you compare full-mesh iBGP and LDP, if you
> compare RR iBGP and LDP, it's not true, RR iBGP has signalling
> redundancy LDP does not have.

Agreed on that point.

Perhaps an option for the LDP heads is LDP Session Protection, which is
supported by the major vendors. I don't use it in my network, though.

Again, my main focus is maintaining a super stable IGP, one that can
converge quickly enough for the upper-layer protocols (and data plane)
not to notice. Features in upper-layer protocols that speed up
convergence are also exploited in my network to ensure data plane
changes happen as quickly (but as simply) as possible, e.g., the
Indirect Next Hop feature.
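
For reference, the Junos knob I mean is the one below, where it isn't already the platform default (a sketch only; defaults differ by platform and release):

routing-options {
    forwarding-table {
        /* many prefixes share one indirect next-hop, so a path change
           is one FIB update rather than one per prefix */
        indirect-next-hop;
    }
}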

As an example, below is a revenue-generating p2p LDP-based EoMPLS
circuit that has been up 46.5 weeks:

***@device-re0# run show l2circuit connections neighbor aaa.bb.c.129
extensive
<snip>
...
Neighbor: aaa.bb.c.129
    Interface                 Type  St     Time last up          # Up trans
    ae2.62(vc 22)             rmt   Up     Aug 18 23:59:44 2017           1
<snip>
...
    Connection History:
        Aug 18 23:59:44 2017  status update timer
        Aug 18 23:59:43 2017  PE route changed
        Aug 18 23:59:43 2017  Out lbl Update                    299776
        Aug 18 23:59:43 2017  In lbl Update                     299824
        Aug 18 23:59:43 2017  loc intf up                       ae2.62

{master}[edit]
***@device-re0#

***@device-re0# run show ldp session aaa.bb.c.129 extensive
Address: aaa.bb.c.129, State: Operational, Connection: Open, Hold time: 27
<snip>
...
  Up for 46w3d 11:12:55
  Last down 46w3d 11:12:59 ago; Reason: received notification from peer
  Capabilities advertised: p2mp, make-before-break
  Capabilities received: p2mp, make-before-break
  Protection: disabled
  Session flags: none
<snip>
...
{master}[edit]
***@device-re0#

Have the backbone links carrying this pw been 46.5 weeks stable? That's
a firm NO!

Mark.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Saku Ytti
2018-07-10 11:22:32 UTC
Permalink
On Tue, 10 Jul 2018 at 14:14, Mark Tinka <***@seacom.mu> wrote:

> Perhaps an option for the LDP heads is LDP Session Protection, which is supported by the major vendors. I don't use it in my network, though.

I don't think so. This is to establish loopback-to-loopback LDP in addition to
link LDP, meaning it'll help with core LDP, but not pseudowire LDP.
And you should use it :)

--
++ytti
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Mark Tinka
2018-07-10 12:53:02 UTC
Permalink
On 10/Jul/18 13:22, Saku Ytti wrote:

> I don't think so. This is to establish loop-to-loop LDP in addition to
> link LDP, meaning it'll help with core LDP, but not pseudowire LDP.

Never used it, so can't say for sure.

Seems like the Cisco implementation supports protection of VRF's:

   
https://www.cisco.com/c/en/us/td/docs/ios/12_0s/feature/guide/fssespro.html#wp1052325

> And you should use it :)

I'll evaluate whether it will add joy or not. For now, since we rebuilt
the network in 2014, it's been humming without it.

Mark.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Pavel Lunin
2018-07-08 23:13:27 UTC
Permalink
Hi experts,

I had a pleasure time reading the whole thread. Thanks, folks !

Honestly, I also (a bit like Saku) feel that Alexandre's case is more about
throwing the *unneeded* complexity away than about BGP vs. LDP.

The whole story of Kompella-style signaling for L2VPN and VPLS is
auto-discovery in a multi-point VPN service case.

But yes, there is a whole bunch of reasons why multi-point L2 VPN sucks,
and when bridged it sucks 10x more. So if you can throw it away, just throw
it away and you won't need to discuss how to signal it and auto-discover
remote sites.

And yes, as pseudo-wire data plane is way simpler than VPLS, depending on
your access network design, you can [try to] extend it end-to-end, all the
way to the access switch and [maybe, if you are lucky] dramatically
simplify your NOC's life.

However, a p2p pseudo-wire service is a kind of rare thing these days. There
are [quite a lot of] those poor folks who were never asked whether bridged
L2 VPN (aka VPLS) is needed in the network they operate. They don't have much
choice.

BGP signaling is the coolest part of the VPLS hell (some minimal magic is
required though). In general I agree with the idea that iBGP stability is
all about making the underlying stuff simple and clean (IGP, BFD, Loss of
Light, whatever). Who said "policies"? For VPLS BGP signaling? Please don't.

And yes, switching frames between fancy full-feature PEs is just half of
the game. The autodiscovery beauty breaks when the frames say bye bye to
the MPLS backbone and meet the ugly access layer. Now you need to switch it
down to the end-point and this often ends up in old good^W VLAN
provisioning. But it's not about BGP, it's about VPLS. Or rather about
those brave folks, who build their services relying on all these
ethernet-on-steroid things.

--
Kind regards,
Pavel


On Sun, Jul 8, 2018 at 10:57 PM, <***@netconsultings.com> wrote:

> <snip>
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Mark Tinka
2018-07-09 10:45:24 UTC
Permalink
On 9/Jul/18 01:13, Pavel Lunin wrote:

> And yes, as pseudo-wire data plane is way simpler than VPLS, depending on
> your access network design, you can [try to] extend it end-to-end, all the
> way to the access switch and [maybe, if you are lucky] dramatically
> simplify your NOC's life.

We run MPLS all the way into the access, on ASR920's. So pw's are
end-to-end, and the Provisioning/NOC teams only need to look at the end
boxes. I found the whole idea of centralized "gateways" in the core to
be a bit clunky.


> However p2p pseudo-wire service is a kind of rare thing these days. There
> are [quite a lot of] those poor folks who were never asked whether bridged
> L2 VPN (aka VPLS) is needed in the network, they operate. They have no much
> choice.

This is a number I'd like to, someday, actually qualify. When VPLS was
the buzzword in 2009, everyone was jumping on to it. I'd like to know
how many of those have continued with it, moved over to EVPN, moved to
l3vpn, moved to plain-old Internet or moved to LDP-based p2p and p2mp
solutions.

My previous employer was heavy on VPLS when I joined, using it both for
services (to deliver customer VPN's) and as a backbone (an overlay to
carry other traffic). By the time I'd left, we'd moved most VPN services
across to LDP-based p2p/p2mp, and only had the Broadband Subscriber
backhaul running over VPLS (this was before PWHE).


> And yes, switching frames between fancy full-feature PEs is just half of
> the game. The autodiscovery beauty breaks when the frames say bye bye to
> the MPLS backbone and meet the ugly access layer. Now you need to switch it
> down to the end-point and this often ends up in old good^W VLAN
> provisioning. But it's not about BGP, it's about VPLS. Or rather about
> those brave folks, who build their services relying on all these
> ethernet-on-steroid things.

Couldn't agree more.

I know a number of mobile networks that use VPLS as a backbone to handle
their IP data traffic. I've always been curious what that's like to manage.

Mark.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Pavel Lunin
2018-07-09 22:46:21 UTC
Permalink
> We run MPLS all the way into the access, on ASR920's. So pw's are
> end-to-end, and the Provisioning/NOC teams only need to look at the end
> boxes. I found the whole idea of centralized "gateways" in the core to be a
> bit clunky.
>


I have no doubt that you know how to run MPLS in the access smoothly.
However, choosing the right gear for this role has always been a hard job.
Those folks who chose Brocade CES some 5-7 years ago, where are they now?

The problem is that most real-world networkers have not enough
understanding of MPLS internals, or time, or both to check all those
hardware and software limits and rather look at the vendor's specs in terms
of "supported/not supported". This approach works _relatively_ well in many
cases like choosing a classic switch or a firewall or even an MX/ASR-like
full-feature PE. But for the MPLS in the access you need to tear the guts
out of your vendor, test everything yourself in all possible scenarios and
still be extremely suspicious about every single thing. Moreover a lot of
people have some commercial/political limits in choosing hardware.

So, while MPLS in the access looks like a good idea, and there are people
who manage to run it well, I know more failure than success stories.

> However p2p pseudo-wire service is a kind of rare thing these days. There
> are [quite a lot of] those poor folks who were never asked whether bridged
> L2 VPN (aka VPLS) is needed in the network, they operate. They have no much
> choice.
>
>
> This is a number I'd like to, someday, actually qualify. When VPLS was the
> buzzword in 2009, everyone was jumping on to it. I'd like to know how many
> of those have continued with it, moved over to EVPN, moved to l3vpn, moved
> to plain-old Internet or moved to LDP-based p2p and p2mp solutions.
>

Good question, indeed. In my opinion there are still a lot of folks out
there who build DC networks with vPC, FEX, VirtualChassis, Fusion etc,
which is in the end the good old VLANs in a vendor-packaged black magic box.
Sooner or later those VLANs need to go across multiple sites. It's nearly
improbable that, having such a design, you'll manage to build an
EVPN-VXLAN-hipster-buzz-based DCI. So VPLS is still their best friend. I've
seen some of them who understand that it's evil, and some who believe that
it's OK; both had no choice.

However, my original point was rather about pseudo-wires than VPLS. I mean,
I don't see a lot of pseudo-wires in the wild, mostly because a PW is kind
of hard to sell. Customers can be of two types: those who love Metro
Ethernet and those who don't. It's true both for real customers, whose
requirements are amplified by the sales people, and for internal infrastructure
folks.

Those who love L2 because "it's better and easier" usually don't know what
a pseudowire is. And they just don't care. "Like a switch" is what they are
looking for.

Those who avoid metro-ethernet just don't need pseudowires, certainly
automesh Kompella-style. L3VPN works well for them, or they buy L1 between
their routers, or go EVPN.

A pseudo-wire is a kind of side application in my experience, even though
technically it's simple and powerful. Not that it doesn't exist as a
commercial service, but mostly used for internal infrastructure needs on an
occasional basis.

So I tend to think, that if your business can make money out of
pseudo-wires, it's not about your network design, you are just lucky ;)
Aaron Gould
2018-07-10 00:32:02 UTC
Permalink
My entire Ethernet cellular backhaul architecture is based on a pair of pseudowires per cell tower. We are making money off of PWs.
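
(If anyone wants a picture: one common way to build a redundant PW pair on Junos is a primary l2circuit with a backup neighbor -a rough sketch only, with invented addresses and VC IDs, and the standby/hot-standby knobs vary a bit by release.)

protocols {
    l2circuit {
        neighbor 192.0.2.11 {
            interface ge-0/0/7.0 {
                virtual-circuit-id 500;
                /* signal PW status so the far end knows which leg is live */
                pseudowire-status-tlv;
                backup-neighbor 192.0.2.12 {
                    virtual-circuit-id 500;
                    hot-standby;
                }
            }
        }
    }
}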

Aaron

> On Jul 9, 2018, at 5:46 PM, Pavel Lunin <***@gmail.com> wrote:
>
> <snip>

_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Mark Tinka
2018-07-10 07:36:05 UTC
Permalink
On 10/Jul/18 02:32, Aaron Gould wrote:

> My entire Ethernet cellular backhaul architecture is based on a pair of pseudowires per cell tower. We are making money off of PWs.

That I get.

What I was keen on understanding are those MNO's that use VPLS as the
backbone, i.e., as in a network-wide LAN.

Mark.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Pavel Lunin
2018-07-10 11:54:01 UTC
Permalink
On Tue, Jul 10, 2018 at 2:32 AM, Aaron Gould <***@gvtc.com> wrote:

> My entire Ethernet cellular backhaul architecture is based on a pair of
> pseudowires per cell tower. We are making money off of PWs.
>
>
Yep, mobile backhaul is a classic example. But it's also a case where the
business model permits the network to be designed as such, and not
vice-versa.

I still won't call you lucky though, as you also need to deliver clocking
there ;)
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Mark Tinka
2018-07-10 07:35:09 UTC
Permalink
On 10/Jul/18 00:46, Pavel Lunin wrote:


>
>
> I have no doubt that you know how to run MPLS in the access smoothly.
> However, choosing the right gear for this role has always been a hard
> job. Those folks who chose Brocade CES some 5-7 years ago, where are
> they now?
>
> The problem is that most real-world networkers have not enough
> understanding of MPLS internals, or time, or both to check all those
> hardware and software limits and rather look at the vendor's specs in
> terms of "supported/not supported". This approach works _relatively_
> well in many cases like choosing a classic switch or a firewall or
> even an MX/ASR-like full-feature PE. But for the MPLS in the access
> you need to tear the guts out of your vendor, test everything yourself
> in all possible scenarios and still be extremely suspicious about
> every single thing. Moreover a lot of people have some
> commercial/political limits in choosing hardware.
>
> So, while MPLS in the access looks like a good idea, and there are
> people who manage to run it well, I know more failure than success
> stories.

Truth!

I was there, back in 2008, beating the guts out of Juniper, Cisco and
Brocade, to build me a box that had a 1U form factor, had between 20 -
40 1Gbps ports, supported IP/MPLS in full, and was priced around
US$5,000 - US$8,000 per unit. Boy, was that an experience. I needed to
migrate a network that was carrying nearly 6,000 l2vpn/l3vpn services
from classic Huawei- and Cisco-driven 802.1Q-based Metro-E to IP/MPLS.

For the existing platforms in 2008:

* The Cisco ME6500 was more of a switch than a router.

* The Cisco 3750ME was more of a switch than a router, although it had
some MPLS tendencies.

* The MX (960 and 480, at the time), were the right gear, but too big
and pricey.

By the start of 2009:

* Juniper promised they'd have a 1U, 48-port version of the MX80. I
was happy.

* Cisco had shipped me a test board (no protective chassis at the
time) of the ME3600X.

* I began testing the Brocade CES/CER2000 NetIron.

By the end of 2009:

* Juniper disappointed me with their lack of vision on where things
were going in the Metro. When I say that the MX204 is their first
real attempt at having something that can work in the Metro, I'm not
just talking :-)...

* We dumped the NetIron because while it was working well, and had a
large FIB for the period (512,000 IPv4 entries), we wanted to stick
with 2 vendors. They also didn't support QPPB, which we wanted at
the time.

* We settled on the ME3600X. Because we insisted on QPPB, we managed
to convince Cisco to re-spin the Nile ASIC before the box was FCS.

If I'm not mistaken, we were the first network in the world, at the
time, to go full IP/MPLS in the Access, at a large scale (some 100+
nodes by the end of 2013).

So yes, the battle was long and hard-fought. I suppose the biggest
hurdle to overcome was the temptation to slip back into classic
802.1Q-based Metro-E deployments, just with newer kit. I spent a whole
year convincing the business that that was the wrong approach. Wasn't
easy, but in 2018, I believe that operation is now several hundred units
large (on the ASR920, I believe), and managing thousands of l2vpn/l3vpn
Enterprise customers.


>
> Good question, indeed. In my opinion there are still a lot of folks
> out there who build DC networks with vPC, FEX, VirtualChassis, Fusion
> etc, which is finally the old good VLANs in a vendor packaged black
> magic box. Sooner or later those VLANs need to go across multiple
> sites. It's nearly improbable that having such a design, you'll mange
> to build a EVPN-VXLAN-hipster-buzz-based DCI. So VPLS is still their
> best friend. I've seen some of them who understand that it's evil, and
> some who believe that it's OK, both had no choice.
>
> However my original point was rather about pseudo-wires than VPLS. I
> mean, I don't see a lot of pseudo-wires in the wild. Mostly because PW
> is a kind of hard to sell. Customers can be of two types: those who
> love Metro Ethernet and those who don't. It's true for real customers,
> whose requirements are amplified by the sales people, and internal
> infrastructure folks.
>
> Those who love L2 because "it's better and easier" usually don't know
> what a pseudowire is. And they just don't care. "Like a switch" is
> what they are looking for.
>
> Those who avoid metro-ethernet just don't need pseudowires, certainly
> automesh Kompella-style. L3VPN works well for them, or they buy L1
> between their routers, or go EVPN.
>
> A pseudo-wire is a kind of side application in my experience, even
> though technically it's simple and powerful. Not that it doesn't exist
> as a commercial service, but mostly used for internal infrastructure
> needs on an occasional basis.
>
> So I tend to think, that if your business can make money out of
> pseudo-wires, it's not about your network design, you are just lucky ;)

In our market (primarily Africa, and parts of Europe as end points),
customers that require point-to-point connectivity are divided into 2
categories; those that are technically-astute, and those that only know
enough to get what they want.

The technical customers will be concerned about whether the transport is
MPLS- or DWDM-based, whether it's Ethernet or SDH, how policing and
queuing is done, e.t.c. The non-technical customers will just want a
scalable point-to-point service, across a medium that they can afford,
which tends to be Ethernet.

Generally, customers interested in DWDM would be those that need
anything over 10Gbps, or are the type that have a strong desire to run a
"clean" backbone, even if that's not their core business.

Customers that take our EoMPLS service have never been interested in
whether it's provisioned via LDP, BGP, VPLS, EVPN, e.t.c. And this
includes a bunch of well-known global companies that know their IP/MPLS.

Maybe it's just our side of the seas.

Mark.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
a***@netconsultings.com
2018-07-10 19:43:33 UTC
Permalink
> Of Mark Tinka
> Sent: Tuesday, July 10, 2018 8:35 AM
> To: Pavel Lunin
> Cc: juniper-nsp
> Subject: Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-
> lag)
>
>
> In our market (primarily Africa, and parts of Europe as end points), customers
> that require point-to-point connectivity are divided into 2 categories; those
> that are technically-astute, and those that only know enough to get what
> they want.
>
> The technical customers will be concerned about whether the transport is
> MPLS- or DWDM-based, whether it's Ethernet or SDH, how policing and
> queuing is done, e.t.c. The non-technical customers will just want a scalable
> point-to-point service, across a medium that they can afford, which tends to
> be Ethernet.
>
> Generally, customers interested in DWDM would be those that need
> anything over 10Gbps, or are the type that have a strong desire to run a
> "clean" backbone, even if that's not their core business.
>
> Customers that take our EoMPLS service have never been interested in
> whether it's provisioned via LDP, BGP, VPLS, EVPN, e.t.c. And this includes a
> bunch of well-known global companies that know their IP/MPLS.
>
> Maybe it's just our side of the seas.
>
Yep, same experience,

Although some of the customers (especially the big carriers) really knew
what they wanted and how they wanted it :)
So is the market still blooming the same way, or has it saturated over time,
please?

adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::

_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Mark Tinka
2018-07-10 20:08:53 UTC
Permalink
On 10/Jul/18 21:43, ***@netconsultings.com wrote:

> So is the market still blooming the same way or has it saturated over time
> please?

Connectivity (you know, the "dumb-pipe" talk) is very big business in
Africa right now; and growing massively YoY.

In the more mature markets like South Africa, we are moving away from
classic l3vpn's to basic Internet Access, IP Transit and p2p/p2mp
Ethernet. The market is embracing cloud, so VPN's are no longer all the
rage. Ethernet circuits are mainly to drive connectivity between
branches, and not to "guarantee" QoS to internal resources - those are
now "cloudified" services.

Some ISP's are still trying to hold on to l3vpn's, and using SD-WAN as a
crutch to maintain that revenue. But this won't last long, since the
major cloud providers are deploying presence in the region.

The future is very bright.

Mark.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
a***@netconsultings.com
2018-07-10 19:33:37 UTC
Permalink
> Of Pavel Lunin
> Sent: Monday, July 09, 2018 11:46 PM
>
> However my original point was rather about pseudo-wires than VPLS. I
> mean, I don't see a lot of pseudo-wires in the wild. Mostly because PW is a
> kind of hard to sell. Customers can be of two types: those who love Metro
> Ethernet and those who don't. It's true for real customers, whose
> requirements are amplified by the sales people, and internal infrastructure
> folks.
>
> Those who love L2 because "it's better and easier" usually don't know what a
> pseudowire is. And they just don't care. "Like a switch" is what they are
> looking for.
>
> Those who avoid metro-ethernet just don't need pseudowires, certainly
> automesh Kompella-style. L3VPN works well for them, or they buy L1
> between their routers, or go EVPN.
>
> A pseudo-wire is a kind of side application in my experience, even though
> technically it's simple and powerful. Not that it doesn't exist as a commercial
> service, but mostly used for internal infrastructure needs on an occasional
> basis.
>
> So I tend to think, that if your business can make money out of pseudo-
> wires, it's not about your network design, you are just lucky ;)
>
Speaking for the Carrier Ethernet market here,

There was a huge market potential several years ago for p2p PWs (I don't
really know what the situation is nowadays).
It all started the same way as the shift from leased lines to Frame-Relay.
Leased lines where expensive and not very profitable for service providers
-so instead of selling a leased-line between point A and B to one customer
SPs put a FR switch at each end and sold it to 10 customers.
Same happened several years ago with operators who owned their own fibres or
had enough lambdas -you can't make good money selling wavelengths -and
besides it's not like people need SONET or SDH -everyone converged on
Ethernet, so what do you do? You stick a PE router at each end of the lambda
and you sell it to 10 customers instead.
Customers can't tell the difference because this thing is like an Ethernet
cable between point A and point B -and you sell it at a much lower price
point in comparison to wavelength so it becomes very attractive to a lot of
folks who could not afford a dedicated wavelength or fibre.
The p2p PW was the killer app for providing services to all the smaller
SPs that wanted to expand their MPLS backbones across the country or even to
other countries but did not quite qualify for dedicated fibre or lambda to
some remote places that we had covered. But this was also used by large SPs
like at&t, etc., to extend their (mostly) PE-CE links to places where they
did not have their own infrastructure.
ENNIs to other Carrier-Ethernet providers and of course MEF standardization
made it really easy to stretch these PW services all over the place.
And we all just used simple p2p PWs: no MAC learning, just what goes in
goes out, including L2CPs.
This was carriers providing services to other carriers.

Now with corporate customers or DC folks the story was always "yeah this
l2vpn stuff is hot right now we need that too", but if you talked to them it
turned out what they really needed was just a series of p2p links -no one
wanted to have mac limits imposed on them or be haunted by the complexity of
mp2mp and large l2 domains.

So where we used what someone might call VPLS (a bunch of PWs into a BD) was
primarily for internal services for l2 backhauling.

As you can see my experience is quite the opposite, that is, not really much
in the way of mp2mp or p2mp VPLS-style services, but a whole lot of p2p PWs all over
the place.

adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::








_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Mark Tinka
2018-07-10 20:16:07 UTC
Permalink
On 10/Jul/18 21:33, ***@netconsultings.com wrote:

> Speaking for the Carrier Ethernet market here,
>
> There was a huge market potential several years ago for p2p PWs (I don't
> really know what the situation is nowadays).
> It all started the same way as the shift from leased lines to Frame-Relay.
> Leased lines where expensive and not very profitable for service providers
> -so instead of selling a leased-line between point A and B to one customer
> SPs put a FR switch at each end and sold it to 10 customers.
> Same happened several years ago with operators who owned their own fibres or
> had enough lambdas -you can't make good money selling wavelengths -and
> besides it's not like people need SONET or SDH -everyone converged on
> Ethernet, so what do you do? You stick a PE router at each end of the lambda
> and you sell it to 10 customers instead.
> Customers can't tell the difference cause this thing is like an Ethernet
> cable between point A and point B -and you sell it at a much lower price
> point in comparison to wavelength so it becomes very attractive to a lot of
> folks who could not afford a dedicated wavelength or fibre.
> The p2p PW were the killer app, for providing services to all the smaller
> SPs that wanted to expand their MPLS backbones across the country or even to
> other countries but did not quite qualify for dedicated fibre or lambda to
> some remote places that we had covered. But this was also used by large SPs
> like at&t, etc.. to extend their (mostly) PE-CE links to places where they
> did not have their own infrastructure.
> ENNIs to other Carrier-Ethernet providers and of course MEF standardization
> made it really easy to stretch these PW services all over the place.
> And we all did use just simple p2p PWs, no mac learning, just simple what
> goes in goes out including L2CPs.
> This was carriers providing services to other carriers.
>
> Now with corporate customers or DC folks the story was always "yeah this
> l2vpn stuff is hot right now we need that too", but if you talked to them it
> turned out what they really needed was just a series of p2p links -no one
> wanted to have mac limits imposed on them or be haunted by the complexity of
> mp2mp and large l2 domains.
>
> So where we used what someone might call VPLS (bunch of PW into a BD) was
> primarily for internal services for l2 backhauling.
>
> As you can see my experience is quite the opposite, that is no really much
> of mp2mp or p2mp VPLS style services, but a whole lot of p2p PWs all over
> the place.

From an African perspective, leasing backhaul in Europe is cheap. The
cost of 10Gbps or 100Gbps EoDWDM makes so much sense that there is no need
to build our own routes. In such an instance, EoMPLS wouldn't work for
us. Also, it's a lot cheaper for an operator to deliver 10Gbps or
100Gbps via EoDWDM than via EoMPLS, to another operator.

With Africa growing so fast right now, B-end circuits are all over the
place. So delivering anywhere from 2Mbps - 10Gbps via p2p LDP-based
EoMPLS pw's is big business as well. Like I mentioned before, most
customers don't really care how it works, as long as the capability and
the price is right.

Africa is not yet at a stage where a simple enterprise business can call
an operator for an EoDWDM service. But it is at a place where they can
call an operator for an EoMPLS service, because those work out much
better at various bandwidth options, unlike pure EoDWDM.

Mark.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Aaron Gould
2018-07-05 13:08:02 UTC
Permalink
I really like the simplicity of my ldp-based l2vpn's... eline and elan

You just made me realize how that would change if I turned off ldp.

So, SR isn't able to signal those l2circuits, and manual vpls instances ?
... I would have to do all that with bgp ? I use bgp in some cases for
rfc4762, but not for simple martini l2circuits.

My entire cell backhaul environment is based on ldp based pseudowires.
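
For reference, a minimal Junos-style sketch of that kind of LDP-signalled
martini l2circuit (illustrative only -- the interface names, neighbor
loopback and VC ID below are made-up values, not taken from any real
config):

    protocols {
        ldp {
            interface lo0.0;            # targeted LDP to the remote PE loopback
            interface ge-0/1/0.0;       # core-facing link also running LDP
        }
        l2circuit {
            neighbor 192.0.2.2 {        # remote PE loopback
                interface ge-0/0/2.100 {
                    virtual-circuit-id 100;   # must match on both ends
                }
            }
        }
    }
    interfaces {
        ge-0/0/2 {
            flexible-vlan-tagging;
            encapsulation flexible-ethernet-services;
            unit 100 {
                encapsulation vlan-ccc;   # circuit cross-connect for the PW
                vlan-id 100;
            }
        }
    }

The transport label underneath can come from LDP, RSVP or SR, but the
pseudowire (VC) label above is still signalled by targeted LDP -- which is
exactly the piece that SR on its own does not replace.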

-Aaron



_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Saku Ytti
2018-07-06 09:59:02 UTC
Permalink
Hey Aaron,

> So, SR isn't able to signal those l2circuits, and manual vpls instances ?

Correct.

> ... I would have to do all that with bgp ? I use bgp in some cases for
> rfc4762, but not for simple martini l2circuits.

Correct.

Why do you prefer to dynamically start PtP LDP for pseudowires,
instead of just using the existing BGP session? Particularly if you
already offer a VPLS product, why insist on provisioning two entirely
different stacks? Provisioning two stacks is programmatically very
complicated; your best option would be to provision point-to-point
pseudowires as a VPLS with two sites and MAC learning disabled, which
gives the lowest overall complexity possible.
The LDP argument might make sense in an environment where you do not
need BGP at the edge at all. Otherwise, I think it is just not
justifiable.
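
As a rough illustration of that point, a BGP-signalled p2p l2vpn
(Kompella, RFC 6624-style) on Junos only needs the l2vpn family added to
the existing iBGP session plus a routing instance. All addresses, names
and IDs below are illustrative:

    protocols {
        bgp {
            group ibgp {
                type internal;
                local-address 192.0.2.1;
                family l2vpn signaling;   # PW signalling rides the existing session
                neighbor 192.0.2.10;      # e.g. a route reflector
            }
        }
    }
    routing-instances {
        ELINE-CUST1 {
            instance-type l2vpn;
            interface ge-0/0/2.100;
            route-distinguisher 192.0.2.1:100;
            vrf-target target:65000:100;
            protocols {
                l2vpn {
                    encapsulation-type ethernet-vlan;
                    site SITE-A {
                        site-identifier 1;
                        interface ge-0/0/2.100 {
                            remote-site-id 2;   # far-end site
                        }
                    }
                }
            }
        }
    }

Because the service only has two sites (or is a two-site VPLS with MAC
learning disabled), there is no flooding or MAC state to worry about, and
the provisioning model is the same RD/RT pattern as every other BGP-based
service.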

--
++ytti
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
James Bensley
2018-07-06 13:28:35 UTC
Permalink
On 5 July 2018 14:08:02 BST, Aaron Gould <***@gvtc.com> wrote:
>I really like the simplicity of my ldp-based l2vpn's... eline and elan
>
>You just made me realize how that would change if I turned off ldp.
>
>So, SR isn't able to signal those l2circuits, and manual vpls instances
>?
>... I would have to do all that with bgp ? I use bgp in some cases for
>rfc4762, but not for simple martini l2circuits.
>
>My entire cell backhaul environment is based on ldp based pseudowires.

Hi Aaron,

Yes that would be a change in your existing setup, but only if you turned off LDP. SR fully supports (on paper at least!) running LDP and SR simultaneously, so you wouldn't need a big-bang approach and a hard switch if you were to move to BGP-signalled services and/or SR. However, I don't think SR is designed to be run alongside LDP long term either, and I'm sure bugs will pop up. If you could use LDP only for signalling L2 VPNs somehow, and SR for transport LSP signalling, you wouldn't need to migrate. I think on Juniper you might be able to raise the "preference" (Administrative Distance in Cisco parlance) of LDP separately from the IGP, but I don't think you can do that on Cisco?
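
For what it's worth, the coexistence piece is small on the Juniper side. A rough sketch, with made-up SRGB and index values, of enabling SR in IS-IS while leaving LDP running during a migration:

    protocols {
        isis {
            source-packet-routing {
                srgb start-label 800000 index-range 4096;   # example SRGB, not a recommendation
                node-segment ipv4-index 101;                # this PE's node-SID index
            }
        }
        ldp {
            interface all;    # LDP stays up until services are re-signalled
        }
    }

With both label sources present, which one a given prefix actually uses comes down to route preference in inet.3, so that selection is worth lab-testing per platform rather than trusting defaults.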

I'm ranting a bit here, but I'd personally look to move to all BGP signalled services if I was moving to SR. You have one protocol for IGP transport (SR extended OSPF or SR extended IS-IS) and one protocol for all service transport signalling (BGP). We (the industry) have our lovely L3 VPNs already, with standard BGP communities, RTs and RDs and then a bunch of policies and route reflectors to efficiently control route distribution and label allocation. We also have high-availability of that information through RR clusters and features like BGP Add-Path and PIC. We also have good scalability from signalled services using FAT and Entropy labels.

Now with BGP signalled EVPN using MPLS for transport instead of VXLAN, we have again RTs and RDs and communities et al. This means we can use similar policies on the same RR's to control route (MAC or GW) and label distribution efficiently and only to those who exactly need to carry the extra state. We get to use the same HA and scalability benefits too. Even with BGP signalled and BGP based auto discovery for ELINE services, we control who has that AFI/SAFI combo enabled cleanly. With LDP, the configuration and control are both fully distributed to the PEs. Not a major issue, but "BGP for everything" helps to keep the design, implementation and limitations of all our services more closely aligned.
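
To make that concrete, here is a minimal sketch (illustrative names and values) of EVPN as just another address family next to the existing VPN families on the same session:

    protocols {
        bgp {
            group ibgp {
                type internal;
                local-address 192.0.2.1;
                family inet-vpn unicast;   # existing L3 VPN family
                family evpn signaling;     # EVPN reuses the same session and RRs
                neighbor 192.0.2.10;
            }
        }
    }
    routing-instances {
        EVPN-100 {
            instance-type evpn;
            vlan-id 100;
            interface ge-0/0/3.100;
            route-distinguisher 192.0.2.1:200;
            vrf-target target:65000:200;
            protocols {
                evpn {
                    interface ge-0/0/3.100;
                }
            }
        }
    }

The same RT/RD import/export policies and the same route reflectors then control who learns the MAC and gateway routes, exactly as they do for L3 VPN today.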

If you're also using FlowSpec, BMP, BGP-LS, BGP-MDT etc, it makes sense to me to keep capitalising on that single signaling protocol for all services.

Cheers,
James.

P.s. sorry, on a plane so I've got time to kill.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Aaron Gould
2018-07-06 15:03:59 UTC
Permalink
Thanks, appreciate the thoughts/insights

BGP-based ELINE ... Is this Kompella you speak of? If so, it seems like a lot for a simple/quick p2p pw

Aaron

> On Jul 6, 2018, at 8:28 AM, James Bensley <***@gmail.com> wrote:
>
>
>
>> On 5 July 2018 14:08:02 BST, Aaron Gould <***@gvtc.com> wrote:
>> I really like the simplicity of my ldp-based l2vpn's... eline and elan
>>
>> You just made me realize how that would change if I turned off ldp.
>>
>> So, SR isn't able to signal those l2circuits, and manual vpls instances
>> ?
>> ... I would have to do all that with bgp ? I use bgp in some cases for
>> rfc4762, but not for simple martini l2circuits.
>>
>> My entire cell backhaul environment is based on ldp based pseudowires.
>
> Hi Aaron,
>
> Yes that would be a change in your existing setup but only if you turned off LDP. SR fully supports (on paper at least!) running LDP and SR simultaneously so you wouldn't need a big bang approach and have to hard switch if you were to move to BGP signalled services and/or SR. However, I don't think SR is designed to be run along side LDP long term either. I'm sure bugs will pop up, if you can use LDP for only signalling L2 VPNs somehow and SR for transport LSP signalling you wouldn't need to migrate. I think on Juniper you might be able to raise the "preference" (Administrative Distance is Cisco parlance) of LDP separate from the IGP but I don't think you can do that on Cisco?
>
> I'm ranting a bit here, but I'd personally look to move to all BGP signalled services if I was moving to SR. You have one protocol for IGP transport (SR extended OSPF or SR extended IS-IS) and one protocol for all service transport signalling (BGP). We (the industry) have our lovely L3 VPNs already, with standard BGP communities, RTs and RDs and then a bunch of policies and route reflectors to efficiently control route distribution and label allocation. We also have high-availability of that information through RR clusters and features like BGP Add-Path and PIC. We also have good scalability from signalled services using FAT and Entropy labels.
>
> Now with BGP signalled EVPN using MPLS for transport instead of VXLAN, we have again RTs and RDs and communities et al. This means we can use similar policies on the same RR's to control route (MAC or GW) and label distribution efficiently and only to those who exactly need to carry the extra state. We get to use the same HA and scalability benefits too. Even with BGP signalled and BGP based auto discovery for ELINE services, we control who has that AFI/SAFI combo enabled cleanly. With LDP, the configuration and control are both fully distributed to the PEs. Not a major issue, but "BGP for everything" helps to keep the design, implementation and limitations of all our services more closely aligned.
>
> If you're also using FlowSpec, BMP, BGP-LS, BGP-MDT etc, it makes sense to me to keep capitalising on that single signaling protocol for all services.
>
> Cheers,
> James.
>
> P.s. sorry, on a plane so I've got time to kill.

_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Mark Tinka
2018-07-06 16:26:12 UTC
Permalink
On 6/Jul/18 17:03, Aaron Gould wrote:


> Thanks, appreciate the thoughts/insights
>
> BGP-based ELINE ... Is this Kompella you speak of ? If so, seems like a lot for a simply/quick p2p pw

There have been plenty of threads arguing for/against BGP-based vs.
LDP-based pw signaling.

Technically, modern kit should be able to scale well with either
solution. IMHO, it's just a case of what you find simpler.

Some folk find BGP-based signaling simpler than LDP-based signaling, and
vice versa.

Both are mature solutions today that will work and continue to be
well-supported.

We find LDP-based signaling simpler than having this in BGP, considering
all that is needed is just to set up a simple p2p pw. We don't do VPLS,
and don't do l3vpn's either, so we don't rely that much on BGP for anything
else beyond vanilla iBGP/eBGP routing.

Mark.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Aaron Gould
2018-07-06 16:49:08 UTC
Permalink
Thanks Mark,

To be clear I'm not arguing the difference and preference of RFC 4761 and RFC 4762 BGP or LDP signaled multi point VPLS...

...i'm strictly speaking about eline point to point....

Aaron

> On Jul 6, 2018, at 11:26 AM, Mark Tinka <***@seacom.mu> wrote:
>
>
>
> On 6/Jul/18 17:03, Aaron Gould wrote:
>
>
>> Thanks, appreciate the thoughts/insights
>>
>> BGP-based ELINE ... Is this Kompella you speak of ? If so, seems like a lot for a simply/quick p2p pw
>
> There have been plenty of threads arguing for/against BGP-based vs. LDP-based pw signaling.
>
> Technically, modern kit should be able to scale well with either solution. IMHO, it's just a case of what you find simpler.
>
> Some folk find BGP-based signaling simpler than LDP-based signaling, and vice versa.
>
> Both are mature solutions today that will work and continue to be well-supported.
>
> We find LDP-based signaling simpler than having this in BGP, considering all that is needed is just to setup a simple p2p pw. We don't do VPLS, and don't do l3vpn's either, so don't rely that much on BGP for anything else beyond vanilla iBGP/eBGP routing.
>
> Mark.

_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Mark Tinka
2018-07-06 18:01:35 UTC
Permalink
On 6/Jul/18 18:49, Aaron Gould wrote:

> Thanks Mark,
>
> To be clear I'm not arguing the difference and preference of RFC 4761 and RFC 4762 BGP or LDP signaled multi point VPLS...
>
> ...i'm strictly speaking about eline point to point....

BGP can also be used for p2p services. In our case, I feel that'd be
overkill, as p2p makes up the majority of the Ethernet circuits we sell.

Mark.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Gustav Ulander
2018-07-05 22:21:33 UTC
Permalink
Hello James.
Interesting feedback, thank you.

//Gustav

-----Original Message-----
From: juniper-nsp <juniper-nsp-***@puck.nether.net> On Behalf Of James Bensley
Sent: 5 July 2018 10:15
To: juniper-***@puck.nether.net
Subject: Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)

On 4 July 2018 at 18:13, Mark Tinka <***@seacom.mu> wrote:
>
>
> On 4/Jul/18 18:28, James Bensley wrote:
>
> Also
>
> Clarence Filsfils from Cisco lists some of their customers who are
> happy to be publicly named as running SR:
>
> https://www.youtube.com/watch?v=NJxtvNssgA8&feature=youtu.be&t=11m50s
>
>
> We've been struggling to get vendors to present deployments from their
> customers when they submit talks around SR. So the SR talks end up
> becoming updates on where SR is from a protocol development
> standpoint, recaps for those that are new to SR, e.t.c.
>
> Perhaps those willing to talk about SR from the vendor community do
> not have the in with their customers like folk like Clarence might, but I'm not sure.
>
> I'll reach out to Clarence and see if we can get him to talk about
> this with one or two of his customers at an upcoming meeting.

Hi Mark,

If you get any feedback you can publicly share I'm all ears!

As far as a greenfield deployment goes I'm fairly convinced that SR would be a good idea now, it would future proof that deployment and for our use case it does actually bring some benefits. To explain further; we don't have one large contiguous AS or IGP, we build regional MPLS networks, each on a private AS (and with a stand alone
IGP+LDP) and use Inter-AS services to provide end-to-end services over
the core AS network between regional networks.

If we built a new regional network tomorrow these are the benefits I see from SR over our existing IGP+LDP design:

- Obviously remove LDP which is one less protocol in the network: This means less to configure, less for CoPPs/Lo0 filter, less for inter-op testing as we're mixed vendor, less for operations to support.

- Easier to support: Now that labels are transported in the IGP, I hope it would be easier to train support staff and to troubleshoot MPLS-related issues. They don't need to check that LDP is up; they should see the SID for a prefix inside the IGP along with the prefix. No prefix, then no SID, etc. I would ideally move all services into BGP (so no more LDP-signalled pseudowires; BGP-signalled services only, to unify all services as BGP-signalled [L3 VPN, L2 VPN VPWS/EVPN/VPLS/etc.]).

- Go IPv6 native: If using ISIS as the IGP we should be able to go
IPv4-free (untested, and I haven't researched that much!).

- Bring label mapping into the IGP: No microloops during re-convergence as we heavily use IP FRR rLFA.

- 100% rLFA coverage: TI-LFA covers the "black spots" we currently have.

- Remove LACP from the network: SR has some nice ECMP features. I'm not going to start an ECMP vs LAG discussion (war?), but ECMP means we don't need LACP, which again is one less protocol for inter-op testing, less to configure, less to support, etc. It also keeps our p-t-p links all the same instead of two kinds, p-t-p L3 or LAG bundle (also fewer config templates).

- Remove microBFD sessions: In the case of LAGs, in the worst-case scenario we would have LACP, uBFD, IGP, LDP and BGP running over a set of links between PEs; with SR we can chop that down to just BFD, IGP and BGP. If we wish, we can still have visibility of the ECMP paths, or we can use prefix-suppression and hide them (this goes against my IPv6-only item above, as I think IS-IS is missing this feature?).
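
A rough sketch of the slimmed-down p-t-p/ECMP version of that on Junos (interface names and timers are illustrative); a single BFD session per L3 member link replaces LACP plus micro-BFD on a bundle:

    protocols {
        isis {
            interface xe-0/0/0.0 {
                point-to-point;
                bfd-liveness-detection {
                    minimum-interval 300;   # one BFD session per link
                    multiplier 3;
                }
            }
            interface xe-0/0/1.0 {
                point-to-point;
                bfd-liveness-detection {
                    minimum-interval 300;
                    multiplier 3;
                }
            }
        }
    }
    policy-options {
        policy-statement pplb {
            then {
                load-balance per-packet;    # Junos wording for per-flow ECMP
            }
        }
    }
    routing-options {
        forwarding-table {
            export pplb;    # install all equal-cost next hops in the FIB
        }
    }

The two parallel L3 links then load-share via ECMP with no LACP, no uBFD and no bundle configuration, which is the trade-off described above.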


The downsides that I know of are:

- Need to up-skill staff: For NOC staff it should be easy: use command "X" to check for a prefix/label, and command "Y" to check for label neighborship. For design and senior engineers, since we don't use MPLS-TE it shouldn't be difficult; we're typically deploying set-and-forget LDP regional networks, so they don't need to know every single detail of SR (he said, naively).

- New code: Obviously plenty of bugs exist; in the weekly emails I receive from Cisco and Juniper with the latest bug reports, many relate to SR. But again, any established operator should have good testing procedures in place for new hardware and software; this is no different to all those times Sales sold something we don't actually do. We should all be well versed in testing new code and working out when it's low-risk enough for us to deploy. Due to our lack of MPLS-TE I see SR as fairly low risk.


I'd be very interested to hear your or anyone else's views on the pros and cons of SR in a greenfield network (I don't really care about brownfield right now because we have no problems in our existing networks that only SR can fix).

Cheers,
James.
Mark Tinka
2018-07-04 17:12:06 UTC
Permalink
On 4/Jul/18 18:09, James Bensley wrote:

> Hi Mark,
>
> Walmart, Microsoft and Comcast all claim to have been running SR since 2016:
>
> http://www.segment-routing.net/conferences/2016-sr-strategy-and-deployment-experiences/

Thanks, James. This is helpful.

I don't know anyone at Walmart, but I have some friends at Microsoft I
can ping to verify.

Mark.
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Aaron Gould
2018-07-03 19:01:55 UTC
Permalink
lot of info here...

https://www.packetmischief.ca/2012/07/20/4-types-of-port-channels-and-when-theyre-used/

seems like you might also want to ask the question ...

Here's some info on interop between cisco vpc and juniper mc-lag

https://supportforums.cisco.com/t5/other-data-center-subjects/nexus-7k-vpc-connecting-to-juniper-mx-series/td-p/1498544

- Aaron


_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Aaron Gould
2018-07-03 19:04:48 UTC
Permalink
Please forgive the confusing strayed statement I left in there about "seems
like you might.... blah blah".... I meant to delete that...

-Aaron


_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp