SRX300 high CPU usage

  • 1.  SRX300 high CPU usage

    Posted 02-22-2021 10:31

    Dear Juniper lovers,

    We have an SRX300 running 18.2R3.4 with high CPU spikes.
    Only the flowd process is high.
    In the syslog, we see a lot of these messages: network_pkt_is_micro_bfd
    I am not sure it is related, but I would like to resolve it anyway.
    We don't have BFD configured on the BGP sessions...

    Does anybody know what this message means and how to get rid of it?

    Thanks a lot




    ------------------------------
    SIBRECHT MINJAUW
    ------------------------------


  • 2.  RE: SRX300 high CPU usage

     
    Posted 02-22-2021 19:38
    flowd is the main security firewall process, so its CPU usage could be high simply because it is processing traffic.

    Run this to see which other processes might be contributing to the high CPU:
    show system processes extensive | except 0.0
    BFD can be configured for any number of protocols.  To see if it is enabled anywhere on the system, try this command:

    show configuration | display set | match bfd

    This should return any reference to BFD in the config.

    ------------------------------
    Steve Puluka BSEET - Juniper Ambassador
    IP Architect - DQE Communications Pittsburgh, PA (Metro Ethernet & ISP)
    http://puluka.com/home
    ------------------------------



  • 3.  RE: SRX300 high CPU usage

    Posted 02-23-2021 03:18
    Hello Steve,

    Thanks a lot for your feedback. 

    Below is the result:
    show configuration | display set | match bfd

    root@NAxxxxxxxxxx>

    root@NAxxxxxxxxx>

    There is no BFD configured for any protocol, so I thought the syslog message meant something else.
    I already restarted the BGP process, but the BFD messages are still present. I also checked whether the BGP peer is configured with BFD, which it is not.

    Hostname: NAxxxxxxx
    Model: srx300
    Junos: 18.2R3.4
    JUNOS Software Release [18.2R3.4]

    Below is the result:

    show system processes extensive | except 0.0
    last pid: 93971; load averages: 1.25, 0.86, 0.76 up 118+21:55:35 09:15:34
    185 processes: 18 running, 154 sleeping, 13 waiting

    Mem: 558M Active, 215M Inact, 1817M Wired, 565M Cache, 112M Buf, 809M Free
    Swap: 792M Total, 792M Free


    PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
    2107 root 123 0 1803M 1124M CPU1 1 3444.9 92.48% flowd_octeon_hm
    2107 root 105 0 1803M 1124M RUN 0 3444.9 30.81% flowd_octeon_hm
    21 root 155 52 0K 16K RUN 0 1776.3 1.46% idle: cpu0
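
The two flowd_octeon_hm rows above share one PID and differ only in the C (CPU core) column, so it can help to read WCPU per core rather than per process. A minimal Python sketch that groups the pasted rows by core; the function name `wcpu_by_core` is illustrative, and it assumes the column layout is exactly the one shown in this paste (real output can vary by Junos release):

```python
from collections import defaultdict

# Rows copied from the output above.
TOP_OUTPUT = """\
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
2107 root 123 0 1803M 1124M CPU1 1 3444.9 92.48% flowd_octeon_hm
2107 root 105 0 1803M 1124M RUN 0 3444.9 30.81% flowd_octeon_hm
21 root 155 52 0K 16K RUN 0 1776.3 1.46% idle: cpu0
"""


def wcpu_by_core(text):
    """Group (command, WCPU percent) pairs by the C (CPU core) column."""
    per_core = defaultdict(list)
    for line in text.splitlines()[1:]:      # skip the header row
        fields = line.split(None, 10)       # COMMAND may contain spaces
        if len(fields) < 11:
            continue                        # not a process row
        core = int(fields[7])               # assumed position of column C
        wcpu = float(fields[9].rstrip("%"))
        per_core[core].append((fields[10], wcpu))
    return dict(per_core)


result = wcpu_by_core(TOP_OUTPUT)
for core in sorted(result):
    print(core, result[core])
```

Read this way, the 92.48% belongs to the flowd thread on core 1 and the 30.81% to the flowd thread on core 0.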

    This is an SRX300, and the throughput should be at least 500 Mbit/s according to the datasheet, but we cannot send even 200 Mbit/s without saturating the CPU (it is a 500 Mbit/s Ethernet circuit). Is there a way to lower the CPU load and increase the throughput? And how can we determine whether it is legitimate traffic?

    Thanks a lot for your support.
    I can't find the 




    ------------------------------
    SIBRECHT MINJAUW
    ------------------------------



  • 4.  RE: SRX300 high CPU usage

     
    Posted 02-23-2021 05:53
    It is possible that your traffic profile is closer to a small-packet flow than to IMIX, which would put the limit closer to 200 Mbit/s than 500 Mbit/s.  But I'm not aware of a way to check that on the SRX.
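
To make the small-packet point concrete: a firewall's work scales roughly with packets per second, not bits per second. The Python sketch below compares packet rates at the two bit rates mentioned in this thread. The 7:4:1 mix of 40/576/1500-byte packets is a commonly cited "simple IMIX" and is an assumption here, not necessarily the mix used in the datasheet; Ethernet framing overhead is ignored.

```python
def packets_per_second(bits_per_second, packet_bytes):
    """Packet rate needed to carry a given bit rate at a fixed packet size."""
    return bits_per_second / (packet_bytes * 8)


def average_packet_size(mix):
    """Weighted mean packet size for a {size_bytes: packet_count} mix."""
    total_packets = sum(mix.values())
    return sum(size * count for size, count in mix.items()) / total_packets


# Assumed "simple IMIX": 7 x 40 B, 4 x 576 B, 1 x 1500 B (illustrative only).
SIMPLE_IMIX = {40: 7, 576: 4, 1500: 1}
imix_avg = average_packet_size(SIMPLE_IMIX)

small_200 = packets_per_second(200_000_000, 64)        # 200 Mbit/s of 64-byte packets
imix_500 = packets_per_second(500_000_000, imix_avg)   # 500 Mbit/s of assumed IMIX

print(f"average IMIX packet size: {imix_avg:.1f} bytes")
print(f"200 Mbit/s at 64 bytes:   {small_200:,.0f} pps")
print(f"500 Mbit/s at IMIX:       {imix_500:,.0f} pps")
```

Under these assumptions, 200 Mbit/s of 64-byte packets requires more packets per second than 500 Mbit/s of IMIX traffic, which is one way a box could hit its limit well below the datasheet figure.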

    Another statistic to check is whether you are approaching the SRX300's maximum of 64k sessions:
    show security flow statistics

    The error message seems odd.  Perhaps a device directly connected to your SRX is configured for BFD, so the SRX sees these packets and raises the error.  I can't find any public docs on the error, so I'm not sure what it indicates.

    ------------------------------
    Steve Puluka BSEET - Juniper Ambassador
    IP Architect - DQE Communications Pittsburgh, PA (Metro Ethernet & ISP)
    http://puluka.com/home
    ------------------------------



  • 5.  RE: SRX300 high CPU usage

    Posted 02-23-2021 10:26
    Be aware that flowd is meant to take up all available CPU on core 1, whereas all management etc. runs on core 0 (the SRX300 has two cores, 0 and 1).

    To see how loaded flowd actually is, you need to look at "show security monitoring performance spu", which gives you the CPU load for the last minute in one-second intervals. That SPU value can also be extracted via SNMP if you need monitoring for longer-term trending.

    Example from a non-loaded SRX300:

    user@fw> show system processes extensive | except 0.0
    213 processes: 21 running, 177 sleeping, 2 zombie, 13 waiting

    Mem: 741M Active, 307M Inact, 1914M Wired, 505M Cache, 112M Buf, 496M Free
    Swap:


    PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
    2120 root 123 0 1919M 1184M CPU1 1 6741.7 93.21% flowd_octeon_hm
    2120 root 29 0 1919M 1184M RUN 0 6741.7 12.74% flowd_octeon_hm


    user@fw> show security monitoring performance spu
    fpc 0 pic 0
    Last 60 seconds:
    0: 0 1: 0 2: 1 3: 0 4: 0 5: 0
    6: 2 7: 1 8: 1 9: 0 10: 0 11: 0
    12: 3 13: 0 14: 1 15: 2 16: 0 17: 0
    18: 0 19: 0 20: 0 21: 0 22: 0 23: 3
    24: 4 25: 2 26: 1 27: 1 28: 1 29: 0
    30: 0 31: 0 32: 0 33: 0 34: 0 35: 2
    36: 1 37: 0 38: 2 39: 0 40: 0 41: 0
    42: 1 43: 0 44: 0 45: 0 46: 1 47: 0
    48: 1 49: 4 50: 1 51: 0 52: 0 53: 0
    54: 1 55: 0 56: 0 57: 0 58: 0 59: 1

    user@fw>

    I have no idea what "network_pkt_is_micro_bfd" is about - which process logs this error?



    ------------------------------
    --
    Jonas Hauge Klingenberg - Juniper Ambassador
    ------------------------------



  • 6.  RE: SRX300 high CPU usage

    Posted 02-24-2021 04:10
    Thanks Jonas.
    There is no process mentioned in the syslog messages for "network_pkt_is_micro_bfd".
    Please find the complete message below. It looks like a bug to me, as there is no BFD configured, not even on the neighbours.

    CPU / Performance:

    PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
    2107 root 122 0 1803M 1124M CPU1 1 3476.9 90.97% flowd_octeon_hm
    2107 root 76 0 1803M 1124M RUN 0 3476.9 46.58% flowd_octeon_hm
    21 root 155 52 0K 16K RUN 0 1789.6 38.33% idle: cpu0


    show chassis routing-engine
    Routing Engine status:
    Temperature 43 degrees C / 109 degrees F
    CPU temperature 60 degrees C / 140 degrees F
    Total memory 4096 MB Max 1188 MB used ( 29 percent)
    Control plane memory 2400 MB Max 816 MB used ( 34 percent)
    Data plane memory 1696 MB Max 356 MB used ( 21 percent)
    5 sec CPU utilization:
    User 25 percent
    Background 0 percent
    Kernel 37 percent
    Interrupt 0 percent
    Idle 38 percent
    Model RE-SRX300

    On the routing engine, there are 5 types of CPU utilization: User, Background, Kernel, Interrupt, Idle.
    Do you know what is covered under these types?

    CPU 0 handles the management traffic - which traffic does this include? SNMP, telnet, SSH? And also BGP?

    Do the values under "show security monitoring performance spu" only cover CPU 1?

    show security monitoring performance spu
    fpc 0 pic 0
    Last 60 seconds:
    0: 29 1: 30 2: 28 3: 26 4: 26 5: 27
    6: 30 7: 29 8: 27 9: 27 10: 26 11: 25
    12: 23 13: 22 14: 20 15: 23 16: 24 17: 22
    18: 23 19: 21 20: 19 21: 19 22: 21 23: 21
    24: 23 25: 24 26: 27 27: 25 28: 24 29: 26
    30: 20 31: 19 32: 24 33: 28 34: 19 35: 24
    36: 30 37: 23 38: 19 39: 27 40: 26 41: 29
    42: 32 43: 33 44: 22 45: 22 46: 19 47: 21
    48: 30 49: 37 50: 30 51: 29 52: 38 53: 41
    54: 40 55: 37 56: 37 57: 40 58: 38 59: 33
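
For a quick read of the 60 samples above, a small Python sketch that parses the pasted SPU output and summarises it. It assumes the "second: value" layout shown in this paste; the function name `parse_spu` is illustrative.

```python
import re
from statistics import mean

# Output copied from the command above.
SPU_OUTPUT = """\
fpc 0 pic 0
Last 60 seconds:
0: 29 1: 30 2: 28 3: 26 4: 26 5: 27
6: 30 7: 29 8: 27 9: 27 10: 26 11: 25
12: 23 13: 22 14: 20 15: 23 16: 24 17: 22
18: 23 19: 21 20: 19 21: 19 22: 21 23: 21
24: 23 25: 24 26: 27 27: 25 28: 24 29: 26
30: 20 31: 19 32: 24 33: 28 34: 19 35: 24
36: 30 37: 23 38: 19 39: 27 40: 26 41: 29
42: 32 43: 33 44: 22 45: 22 46: 19 47: 21
48: 30 49: 37 50: 30 51: 29 52: 38 53: 41
54: 40 55: 37 56: 37 57: 40 58: 38 59: 33
"""


def parse_spu(text):
    """Return the per-second SPU values, ordered by second index."""
    pairs = re.findall(r"(\d+):\s+(\d+)", text)   # "index: value" pairs
    samples = {int(second): int(value) for second, value in pairs}
    return [samples[second] for second in sorted(samples)]


values = parse_spu(SPU_OUTPUT)
print(f"samples={len(values)} min={min(values)} "
      f"avg={mean(values):.1f} max={max(values)}")
# prints: samples=60 min=19 avg=26.7 max=41
```

So over this minute the SPU load stayed between 19 and 41, averaging about 27, even though the flowd thread on core 1 shows around 90% WCPU in the process list.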

    I am already running a firewall filter on the loopback to protect the routing engine.



    timestamp: February 24th 2021, 09:53:51.458
    @version: 1
    _id: jTc-03cB-Siqf431mzJJ
    _index: logstash-junoslogs-2021.02.55
    _index_name: syslog
    _score: -
    _type: doc
    asn: -
    host: NAXXXXXXX
    junos-ts: February 24th 2021, 09:58:51.651
    log-category: -
    log-type: -
    message: <167>1 2021-02-24T09:58:51.651+01:00 NAXXXXXX - - - - network_pkt_is_micro_bfd IP null
    syslog_facility: user-level
    syslog_facility_code: 1
    syslog_severity: notice
    syslog_severity_code: 5
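
The `<167>` at the start of the message field is the syslog PRI value, which RFC 5424 defines as facility × 8 + severity. A minimal Python sketch to decode it; the helper name `decode_pri` and the short facility labels are illustrative:

```python
# Facility and severity names in RFC 5424 numeric order.
FACILITIES = [
    "kern", "user", "mail", "daemon", "auth", "syslog", "lpr", "news",
    "uucp", "cron", "authpriv", "ftp", "ntp", "audit", "alert", "clock",
    "local0", "local1", "local2", "local3",
    "local4", "local5", "local6", "local7",
]
SEVERITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]


def decode_pri(pri):
    """Split a syslog PRI value into (facility code, name, severity code, name)."""
    facility, severity = divmod(pri, 8)
    return facility, FACILITIES[facility], severity, SEVERITIES[severity]


print(decode_pri(167))
# prints: (20, 'local4', 7, 'debug')
```

By this decoding the message was sent as local4/debug, which does not match the user-level/notice values in the syslog_facility and syslog_severity fields above. One possible explanation, which is only an assumption, is that the log pipeline filled in defaults rather than parsing the PRI. If the message really is debug-level, it may be informational output rather than an error.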


    ------------------------------
    SIBRECHT MINJAUW
    ------------------------------