The network card that we have on the ESXi controller is the Intel(R) Ethernet Controller X710 for 10GbE SFP+, and that card does not fully support the DCBX protocols.
It seems the 2 commands that are most likely to have caused the problems are:
1). set protocols dcbx interface all
2). set protocols layer2-control bpdu-block disable-timeout 300
We will see how the remaining ESXi hosts work on the Juniper EX4650 after they are migrated as well. Thus far, 2 of the 5 ESXi hosts that had bad performance problems with the EX4650 (but not with the EX4550) have now been working very well on the Juniper EX4650 for about 30 days.
According to ChatGPT: with that command removed, the EX4650 no longer attempted DCBX negotiation on its interfaces. As a result, the X710 NICs, which do not fully support DCBX, are presumably no longer disrupted by the negotiation attempts.
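For anyone checking the same thing, this is roughly how I would confirm that DCBX is really off after removing the command (the interface name is just an example; DCBX runs over LLDP, so the LLDP neighbor output is worth a look as well):
show configuration protocols dcbx
show dcbx neighbors interface xe-0/0/0
show lldp neighbors
If the dcbx stanza is gone and no DCBX neighbors are listed, the switch is no longer attempting the negotiation.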
Original Message:
Sent: 10-16-2025 17:32
From: KJS ADMIN
Subject: Slow performance on EX4650
Any other advice? The only command that I think the switch should really need to have is
>set routing-options nonstop-routing<enter>
for future update purposes. The other commands are not really necessary.
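For reference, my understanding (worth verifying against the 23.4 release notes, so treat this as a sketch) is that a nonstop software upgrade (NSSU) on a Virtual Chassis wants GRES plus nonstop routing, with nonstop bridging covering the Layer 2 protocol state:
set chassis redundancy graceful-switchover
set system commit synchronize
set routing-options nonstop-routing
set protocols layer2-control nonstop-bridging
The upgrade itself is then started with "request system software nonstop-upgrade" plus the package path.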
------------------------------
KJS ADMIN
Original Message:
Sent: 10-15-2025 11:42
From: RandyS
Subject: Slow performance on EX4650
Thank you for the details in this.
------------------------------
Randy Shulse
Original Message:
Sent: 10-14-2025 13:54
From: KJS ADMIN
Subject: Slow performance on EX4650
We ended up fixing the performance problem by looking at how another EX4650 was configured and being used at our backup site.
Since the backup switch did not have this performance problem, I used it as a baseline: if the backup switch did not have a specific command, I removed that command from the production switch we are testing. After removing those commands, there were no more problems on the production switch.
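For anyone doing a similar comparison, this is roughly how I pulled both configurations into set format for an offline diff (the file names are just examples):
show configuration | display set | save /var/tmp/prod-ex4650.set
show configuration | display set | save /var/tmp/backup-ex4650.set
I then copied the two files off the switches with scp and compared them with a normal diff tool.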
The backup network is a good testing base. These are the commands I removed from the new production switch that we want to use:
1). I removed: set routing-options nonstop-routing
2). I removed: set system syslog file messages match " "
3). I removed the following 2 commands:
a. set interfaces vlan unit 0 family inet
b. set interfaces vlan unit 1 family inet
4). I removed: set protocols dcbx interface all
5). I removed the following commands:
set interfaces irb unit 0 family inet dhcp vendor-id Juniper-ex4650-48y-8c-XH3722230774
set interfaces irb unit 0 family inet6 dhcpv6-client client-type stateful
set interfaces irb unit 0 family inet6 dhcpv6-client client-ia-type ia-na
set interfaces irb unit 0 family inet6 dhcpv6-client client-identifier duid-type duid-ll
set interfaces irb unit 0 family inet6 dhcpv6-client vendor-id Juniper:ex4650-48y-8c:XH3722230774
6). I removed:
set system phone-home server https://redirect.juniper.net
set system phone-home rfc-compliant
7). I removed the following commands:
set system processes general-authentication-service traceoptions file radius
set system processes general-authentication-service traceoptions flag all
8). I added a command that was missing from the production switch:
set system radius-options attributes nas-ip-address #...
9). I removed the following commands:
set system services netconf ssh
set system services netconf rfc-compliant
set system services netconf yang-compliant
10). I removed the following commands:
set protocols layer2-control nonstop-bridging
set protocols layer2-control bpdu-block disable-timeout 300
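For each item above, the removal is just the matching delete in configuration mode; as a sketch with two of the commands (commit confirmed gives an automatic rollback in case a change cuts off access):
delete protocols dcbx interface all
delete protocols layer2-control bpdu-block disable-timeout
commit confirmed 10
commit
The second commit confirms the change; without it, the switch rolls back on its own after 10 minutes.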
------------
After removing the above commands the performance problem went away on the production switch.
I then re-added all of the above commands to the backup switch (which runs an older OS version) and I was not able to reproduce the performance problem.
------------
When troubleshooting the problem on the production EX4650 (which is not working correctly), the performance problem was triggered by initiating VMware vMotion from the SAN. My plan is to add 1 command back at a time during a maintenance window and to test the performance, as sketched below.
Then I will move the remainder of the ESXi hosts and connections from the EX4550 to the EX4650.
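The rough workflow I plan to use for each re-added command (DCBX is shown only as an example) is:
set protocols dcbx interface all
commit confirmed 5
...trigger a vMotion and watch the performance...
commit
If the problem comes back, I simply let the commit confirmed time out (or run rollback 1 and commit) to return to the previous configuration.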
------------------------------
KJS ADMIN
Original Message:
Sent: 09-19-2025 17:34
From: KJS ADMIN
Subject: Slow performance on EX4650
I think this might be a problem with the cable routing between the two switches. I will try to
1). Update the JUNOS.
2). Temporarily Disable the VMware DRS.
3). Move all of the cables over from all of the ESXi hosts.
a. And the storage array.
4). Evaluate the performance before I trigger vMotion.
5). Then trigger vMotion and see what happens.
I suspect that the vMotion, iSCSI, and storage array connections will not have much if any packet loss if they are connected on the same switch.
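To measure that, I plan to watch the error and drop counters on the uplinks and host-facing ports before and after triggering vMotion, roughly like this (the interface name is just an example):
show interfaces xe-0/0/47 extensive | match "error|drop"
show interfaces queue xe-0/0/47
monitor interface xe-0/0/47
Climbing input/output errors or queue drops during a vMotion would point at the switch side rather than at ESXi.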
------------------------------
KJS ADMIN
Original Message:
Sent: 09-18-2025 09:46
From: Olivier Benghozi
Subject: Slow performance on EX4650
How did you configure the 1Gb/s ports on the 4650?
------------------------------
Olivier Benghozi
Original Message:
Sent: 09-09-2025 13:12
From: KJS ADMIN
Subject: Slow performance on EX4650
We are trying to migrate from a Juniper EX4550 (EX4550-32F) running version 15.1R7-S12 in a Virtual Chassis to a Juniper EX4650-48Y-8C running version 23.4R2.13, also configured as a Virtual Chassis.
When we moved the existing connections over to the EX4650 we experienced some really slow performance, so we moved the cables back to the EX4550 and the problem went away. A summary of what we did and tried is shown below. My question is how to diagnose and fix the slow performance problem on the Juniper EX4650.
1). The new EX4650 was set up in a test environment where the Virtual Chassis was formed and the JUNOS version was installed.
2). The new EX4650 also received its initial configuration in the test lab environment, where it was able to connect to the network for management purposes.
3). The new EX4650 was also configured pretty much exactly like the EX4550, except its uplink ports are in a LAG using interfaces 0/0/47 and 1/0/47 instead of 0/0/31 and 1/0/31 (on the EX4550).
a. Also, every port on the EX4650 is configured with the "family ethernet-switching storm-control default" command, while the EX4550 does not have this configuration on any of its ports (see the sketch after this list).
b. I tested the routable VLANs, and the connections worked with ping testing and by connecting a test ESXi host with a spare 10 GB cable.
4). I then racked and mounted the EX4650 in a server rack directly below the EX4550 used in production, to help with a smooth deployment and more testing.
5). I did move 2 more connections over for testing before the migration.
a. Two separate 1 GB heartbeat connections for some storage arrays would not work well on the EX4650.
b. We were using Ethernet to 1 GB Direct Attach Copper (DAC) jumper cables on the EX4550.
c. I then connected the same 1 GB Ethernet connections (without the DAC adapter) to a Juniper EX4300 GB switch, and the same connections worked well.
6). For additional testing, I then moved 2 x 10 GB DAC jumper cables that were currently being used on the EX4550 over to the EX4650.
a. Those 2 test connections worked well and they were directly connected to the ESXi Hosts from the EX4650.
b. All of the DAC connections on the EX4550, except for the 2 heartbeat connections, used the SFPP-PC015, and that model DAC is not listed on the EX4650 hardware compatibility web page: https://apps.juniper.net/home/ex4650-48y/hardware-compatibility
c. Those same DACs were moved over to the EX4650 switch and the performance was not very good. Note: the FS P/N changed from SFPP-PC015 to SFP-10G-PC015, but the product is the same. https://www.fs.com/products/39781.html
7). After we moved the same DACs back to the EX4550, the performance for the servers was good again.
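One difference worth testing is the storm-control configuration in 3a above; a minimal sketch of how to check it and temporarily remove it from one port (xe-0/0/0 is just an example; on this Junos version the profile itself typically lives under forwarding-options):
show configuration forwarding-options storm-control-profiles
show configuration interfaces xe-0/0/0 | display set | match storm-control
delete interfaces xe-0/0/0 unit 0 family ethernet-switching storm-control
commit confirmed 5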
I suspect the problem with the Juniper EX4650 connections is that we should use DACs that are supported, tested, and listed on the EX4650 hardware compatibility web page, rather than the 10 GB Ethernet DACs that are currently connected on the EX4550. I read that if you use an unsupported DAC transceiver the network performance is unpredictable. The interface configuration on the two switches looks to be the same.
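To see how the switch actually identifies those DACs, I would start with the following (the interface name is an example; passive DACs usually report no optical diagnostics, so an empty optics reading by itself is normal):
show chassis hardware
show interfaces diagnostics optics xe-0/0/47
show chassis hardware lists the transceiver/cable part numbers the switch reads out of each port, which should show whether the EX4650 recognizes the SFPP-PC015 cables at all.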
------------------------------
KJS ADMIN
------------------------------