Hello all…I wanted to pick some brains and see if anyone can shed some light on a network issue I am having.
Topology is: (see attached pdf)
3 (double stacked) EX3400 access switches ---- connected to a double stacked building core switch (4300) – connected to a stacked campus core (4600)
All 3 access switches are connected via 2 SFP+ 10G fiber links (trunks) to the 4300, which is connected to the campus core in the same way. This is the same for many of our school buildings.
Building core does the routing/inter-VLAN routing, DHCP forwarding, etc. Access switch links are trunked and aggregated.
3400 - 18.1R3.3
What's happening is that devices on our desktop/phone data subnet are getting terribly degraded throughput, and that's if they can get a DHCP address to begin with. Our AP subnet(s) do not seem to be as affected, but we are still getting reports of disconnected users. We have other wired subnets for various devices that seem to be a bit better, but not what they should be: DHCP timeouts and the like. I get a large number of duplicate ping replies when pinging some phones/desktops on the data VLAN from the building core console, if they respond at all. The other subnets do not experience the duplicate pings, only the voice/data subnet.
I've taken off the J-Web generated phone/desktop port profile settings (QoS etc.), so now the ports are just "plain" access ports with a data and voice VLAN. No difference.
I've checked all the uplinks involved for dropped packets, as well as some access ports. Nothing is dropped at the interface level.
The 4300s seem to have high utilization at times (show chassis routing-engine) and behave sluggishly at the SSH console, sometimes kicking me out.
This is a new school building, and we've just begun patching in devices (cameras, APs, smartscreens, etc.) in preparation for the opening, but there is not a whole lot of traffic at all. Links are barely utilized, so I'm stumped as to what is causing this degradation. The config is very basic and derived from a proven config from the other buildings. The other buildings have 4300 cores with EX3300 access switches, with older Junos being the only real difference. I'm wondering if the Junos version could be to blame, although it is on the recommended list for the 4300.
I thought I would reach out here first, and I appreciate any light you can shed on this.
Thanks in advance..
Is this happening in all 3 access virtual chassis?
Is it happening for a specific VLAN?
Jastro ... I think I found the issue. Some bizarre wiring to an elevator phone created a network loop. I found it by wiresharking the wired data/phone VLAN, which was the one with the worst degradation (for now obvious reasons), and saw tons of ARPs from the offending device, so that's what it was. Sorry it didn't turn out to be a more technically informative post for this great forum.
Thanks to everyone who gave this a look !
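For anyone chasing a similar loop, here is a rough sketch of the "who is sending all the ARPs" check described above. It is not the exact procedure used in this thread. It assumes the capture has already been exported from Wireshark into (source MAC, EtherType) pairs; the function name `top_arp_senders` and the sample MAC addresses are made up for illustration.

```python
from collections import Counter

# EtherType value that identifies ARP frames on Ethernet.
ETHERTYPE_ARP = 0x0806


def top_arp_senders(frames, limit=3):
    """Return the source MACs that sent the most ARP frames.

    frames: iterable of (src_mac, ethertype) tuples. This input shape is
    an assumption for the sketch, e.g. rows exported from a capture.
    """
    counts = Counter(
        src_mac for src_mac, ethertype in frames if ethertype == ETHERTYPE_ARP
    )
    return counts.most_common(limit)


# Hypothetical sample data: one device floods ARP, the others look normal.
sample_frames = [
    ("aa:aa:aa:aa:aa:01", 0x0806),
    ("aa:aa:aa:aa:aa:01", 0x0806),
    ("aa:aa:aa:aa:aa:01", 0x0806),
    ("aa:aa:aa:aa:aa:01", 0x0806),
    ("bb:bb:bb:bb:bb:02", 0x0806),
    ("bb:bb:bb:bb:bb:02", 0x0800),  # IPv4 frame, ignored by the ARP count
    ("cc:cc:cc:cc:cc:03", 0x0800),
]

if __name__ == "__main__":
    for mac, count in top_arp_senders(sample_frames):
        print(f"{mac}  {count} ARP frames")
```

A single MAC with a count far above the rest is the one to trace back to a port, which is roughly how the elevator phone stood out here.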
No problem Dennis,
I'm glad you found the root cause.