Discussion:
[j-nsp] Traffic delayed
james list
2018-10-02 16:37:49 UTC
Dear experts

I’ve a strange issue.

Our customer replaced two L2/L3 switches (C6500), which ran a pure L2 and L3
(HSRP) environment, with a pair of new MX9k routers. The MXs provide the same
L2 and L3 services, but they use MPLS/VPLS to transport the L3/L2 frames.
The access switches are QFX5k, connected to the MX MPLS PEs.

Now the main issue: roughly every 30 minutes (sometimes 28, sometimes 33,
sometimes 30) the customer detects some frames received with a delay of
3-600 milliseconds. The customer is a trading venue.

It seems that something slows down forwarding. I know Juniper separates
forwarding and control, but I was thinking of OSPF LSA refresh or something
similar, since the interval is around 30 minutes.
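
One way to test the timer hypothesis is to measure the gaps between delay
events and see how tightly they cluster. The following is a minimal Python
sketch; the timestamps are hypothetical values chosen only to reproduce the
28/33/30-minute gaps described in this post, not measurements from the
customer's network.

```python
from datetime import datetime
from statistics import mean, pstdev


def inter_event_minutes(timestamps):
    """Return the gaps, in minutes, between consecutive delay events."""
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


# Hypothetical observation times of delayed frames (illustration only).
events = [
    "2018-10-02T09:00:00",
    "2018-10-02T09:28:00",
    "2018-10-02T10:01:00",
    "2018-10-02T10:31:00",
]

gaps = inter_event_minutes(events)
print("gaps (min):", gaps)
print("mean gap (min):", round(mean(gaps), 2))
print("spread (min):", round(pstdev(gaps), 2))
```

A small spread around a fixed value would point towards a periodic timer; a
wide spread would suggest something driven by traffic instead.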

Can anybody help me work out what the main cause could be?

Thanks in advance

Cheers,

James
_______________________________________________
juniper-nsp mailing list juniper-***@puck.nether.net
https://puck.nether
Tom Beecher
2018-10-02 18:33:55 UTC
You now have switches with completely different buffer depths than the ones
you used to have. You probably want to look into that.
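
To put buffer depth in perspective, the delay a queue adds is roughly the
number of buffered bytes divided by the drain rate. The sketch below is only
arithmetic under stated assumptions: the 10 Gb/s link speed is an assumption
(the thread does not give port speeds), and the 3, 300 and 600 ms values are
illustrative readings of the "3-600 milliseconds" reported above.

```python
def queuing_delay_ms(buffered_bytes, link_bps):
    """Time to drain buffered_bytes at link_bps, in milliseconds."""
    return buffered_bytes * 8 * 1000 / link_bps


def buffer_needed_bytes(delay_ms, link_bps):
    """Bytes that must be queued to add delay_ms of delay at link_bps."""
    return delay_ms * link_bps / 8 / 1000


LINK_BPS = 10_000_000_000  # assumption: 10 Gb/s egress port

for delay_ms in (3, 300, 600):  # illustrative delays, not measurements
    megabytes = buffer_needed_bytes(delay_ms, LINK_BPS) / 1e6
    print(f"{delay_ms} ms of queuing at 10 Gb/s needs about {megabytes} MB")
```

Comparing these figures with the buffer actually available on each platform
is one way to judge whether queuing alone could explain a given delay.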
james list
2018-10-02 18:58:24 UTC
Can you elaborate?
Why would the issue occur only every 30 minutes?
James Bensley
2018-10-02 19:18:38 UTC
Post by james list
Can you elaborate?
Why would the issue occur only every 30 minutes?
Seeing as you have an all-Juniper setup, I don't think there is a need
to cross-post to two lists simultaneously. If you feel there is a
need, please post to the two lists separately as not all subscribers
will be subscribed to both lists.

What basic troubleshooting have you done so far? What have you ruled out?

The very first thing you should have done is try to replicate the
issue in the lab. Can you replicate it?

If yes, have you tried a code upgrade to see if this fixes anything?
Or changing any settings?

If not, and you've only got the issue in production, can you enable
some logging to see if there is anything in the logs when the issue
happens? Do you see any packet drops on interfaces when the issue
happens? CPU spikes? Anything?
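
Checking what is in the logs "when the issue happens" can be done
mechanically once both sets of timestamps are available. The following Python
sketch lists the log messages that fall within a few seconds of each delay
event. All times and messages are hypothetical placeholders, and the 5-second
window is an arbitrary assumption; it also assumes the device and measurement
clocks are synchronised.

```python
from datetime import datetime, timedelta


def logs_near_events(event_times, log_entries, window_s=5):
    """For each delay event, list log messages within +/- window_s seconds."""
    window = timedelta(seconds=window_s)
    matches = {}
    for event in event_times:
        event_dt = datetime.fromisoformat(event)
        matches[event] = [
            message
            for stamp, message in log_entries
            if abs(datetime.fromisoformat(stamp) - event_dt) <= window
        ]
    return matches


# Hypothetical data: neither the times nor the messages come from a device.
delay_events = ["2018-10-02T09:28:00", "2018-10-02T10:01:00"]
log_entries = [
    ("2018-10-02T09:27:58", "example message A"),
    ("2018-10-02T09:45:00", "example message B"),
    ("2018-10-02T10:01:03", "example message C"),
]

for event, messages in logs_near_events(delay_events, log_entries).items():
    print(event, "->", messages)
```

A message that shows up next to every delay event, and nowhere else, would be
a strong lead; an empty result for every event suggests more verbose logging
is needed.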

So far you haven't provided any data at all on the problem or what you
have tried to do to resolve it, before coming to the list.

Cheers,
James.
james list
2018-10-02 22:21:53 UTC
I posted to both lists because we had Cisco and now have Juniper, so this
may be something a Cisco guru knows about as well.

Unfortunately I do not have an MX960 in the lab. There are no CPU spikes and
no relevant logs.

An upgrade is not on the roadmap, since we are on 16.1.

Cheers
james list
2018-10-03 07:09:50 UTC
Flow control is disabled on the access QFX5100.

On Wed, 3 Oct 2018 at 06:50, Eldon Koyle wrote:
Have you checked flow control counters?
--
Eldon
Richard McGovern
2018-10-02 19:33:47 UTC
There is no such product as an MX9K. Is your product some form of MX or an EX9200 of some type?

In either case, we would need to know exactly which modules within the MX or EX product you are using or are involved.

When the CAT6K was in use, was the QFX5100 already at the edge? I assume the QFX5100 is just L2?



james list
2018-10-04 18:34:01 UTC
Since the access switches are QFX5100s in a Virtual Chassis, does anybody
know whether the IS-IS instance that manages the Virtual Chassis does
something every 30 minutes that could cause delay?

Cheers
Richard McGovern
2018-10-04 18:58:34 UTC
It does not.

Do you know if the delay is from the QFX5100, the MX, or both? Do you know which queue this traffic goes into on each switch/router?

james list
2018-10-05 04:30:32 UTC
I know the path perfectly. There is no QoS configured, and from what we can
see there is no congestion (also because congestion would not occur only
every 30 minutes). I do not have a tap on each cable, and there are also
port channels...
Richard McGovern
2018-10-05 12:03:37 UTC
Just FYI, there is a default QoS/CoS config on any Juniper product. For example, I believe the QFX5100 is set with 5% Network Control (Strict Queue 0) and 95% Best Effort (Queue 3?), or at least most EX products default to this.

I do not believe any Juniper product ‘holds on to packets’ without something else getting in the way, that is, some sort of congestion somewhere.
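
Congestion "someplace" can be easy to miss when counters are averaged over
long intervals, because a short burst may exceed the port rate while the
average stays low. The Python sketch below illustrates that arithmetic only.
The 1 Gb/s link, the 1 ms windows and the byte counts are all hypothetical
values chosen for the example, not data from this network.

```python
def utilization(offered_bytes, window_ms, link_bps):
    """Offered load per window as a fraction of what the link can carry."""
    capacity_bytes = link_bps * window_ms / 8 / 1000
    return [count / capacity_bytes for count in offered_bytes]


# Hypothetical offered load towards one egress port, per 1 ms window.
# Nine quiet windows followed by one burst from several converging senders.
samples = [10_000] * 9 + [250_000]

per_window = utilization(samples, window_ms=1, link_bps=1_000_000_000)
average = sum(per_window) / len(per_window)
peak = max(per_window)

print(f"average utilization: {average:.1%}")
print(f"peak 1 ms utilization: {peak:.1%}")
```

In this made-up case the average looks harmless while one window is offered
twice what the port can send, so the excess has to be queued or dropped.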

I’d suggest you open a TAC case and have them assist you.


James Bensley
2018-10-05 06:20:56 UTC
Post by james list
Since the access switches are QFX5100s in a Virtual Chassis, does anybody
know whether the IS-IS instance that manages the Virtual Chassis does
something every 30 minutes that could cause delay?
Cheers
As per my previous message, you should see such an event in the logs. Have you enabled verbose logging in the IS-IS trace options (and any other services running on the devices)?

Cheers,
James.
NK NSP
2018-10-02 21:54:13 UTC
What kind of traffic is delayed? Unicast or multicast? MAC table entries
usually age out based on traffic, and flooding may occur when they time out.
You can check whether any ARP entries are expiring and need to be refreshed
at 30-minute intervals. For multicast, check whether any IGMP joins or
prunes are happening around the same time.
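
The timer check suggested here can be sketched as a comparison between the
observed interval and a list of candidate timers. In the Python sketch below,
the 28/33/30-minute gaps come from the original post; the OSPF value is the
LSRefreshTime constant from RFC 2328, while the ARP and MAC aging values are
placeholders that must be replaced with what is actually configured on the
devices (a platform may also refresh LSAs on a different interval than the
RFC constant). The 3-minute tolerance is an arbitrary assumption.

```python
# Candidate timers in minutes. Only the OSPF value is taken from a standard;
# the other two are hypothetical and must be read from the real configuration.
CANDIDATE_TIMERS_MIN = {
    "OSPF LSA refresh (RFC 2328 LSRefreshTime)": 30,
    "ARP aging (hypothetical configured value)": 20,
    "MAC aging (hypothetical configured value)": 5,
}


def matching_timers(observed_gaps_min, timers, tolerance_min=3):
    """Return the timers whose period is close to the mean observed gap."""
    mean_gap = sum(observed_gaps_min) / len(observed_gaps_min)
    return [
        name
        for name, period in timers.items()
        if abs(mean_gap - period) <= tolerance_min
    ]


observed = [28, 33, 30]  # gaps in minutes, as reported in the original post
print(matching_timers(observed, CANDIDATE_TIMERS_MIN))
```

A match only narrows the search; it does not show that the timer causes the
delay, which still needs to be confirmed on the devices.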
james list
2018-10-03 07:01:10 UTC
It's unicast; we're checking it. Thanks.