Cisco ISSU
Category: Networking — Tags: Cisco, ISSU, VSS — redwards @ 7:15 pm
During the building and configuration of a number of brand spanking new Cisco 6509s with Sup720-3C, I decided to convert them to VSS mode before updating the IOS (mainly due to laziness and only having console access). The conversion worked a treat; however, when I came to upgrade the IOS with ISSU I hit a glitch: the standby sup got itself into a reboot loop.
The error messages seen:
*Jul 26 14:38:21.291: %PFREDUN-SW1_SP-4-PHYSMEM_MISMATCH: Asymmetrical redundant configuration: Active SP has (1048576/8192K) memory, Standby has (1048576/65536K).
*Jul 26 14:38:21.291: %PFREDUN-SW1_SP-4-PHYSMEM_MISMATCH: Asymmetrical redundant configuration: Active RP has (1048576/8192K) memory, Standby has (1048576/65536K).
*Jul 26 14:38:22.307: %PFREDUN-SW1_SP-6-ACTIVE: Standby initializing for RPR mode
*Jul 26 14:38:23.856: %ISSU-SW1_SP-3-FSM_MISMATCH_MTU: ISSU nego failed for client ISSU ifs client(110) entity_id 113 session 393699 due to mismatch of mtu size 72 & 36. : ios-base : (PID=12311, TID=17) : -Traceback=(s72033-advipservicesk9_wan-5-dso-bnp.so+0x94088) ([27:-3]4-dso-b+0x1ED79C) ([37:0]+0x1ED81C) ([27:-3]5-dso-bnp+0x908A4) ([39:0]+0x90B70) ([39:0]+0x91090) ([27:-3]6-dso-b+0x136110) ([27:-9]3+0x39F400) ([37:0]+0x36BF88) ([27:-9]7+0x1AA8A0) ([37:0]+0x1AA878)
*Jul 26 14:38:23.892: %ISSU-SW1_SP-4-FSM_INCOMP: Version of local ISSU client ISSU ifs client(110) in session 393699 is incompatible with remote side.
*Jul 26 14:38:23.892: %RFS-SW1_SP-3-START_NEGO_SESSION: RFS nego (393699:590199) to [issu:rfs:Secondary RFS Server Port:0x4050000] failed: [ISSU_RC_NEGO_ERROR]
After trying a couple of things (and not having patience) I decided to break out into ROMMON and reload with a config-register of 0x2142 so that I could stop the reboot loop and put VSS into standalone mode. This worked a treat, and I had both 6500s in standalone mode with a blank configuration (wr erase), but I was unable to do anything about upgrading the image because ISSU seemed to be stuck in its upgrade process:
Router#sh issu state detail
Slot = 5
RP State = Active
ISSU State = System Reset
Boot Variable = bootdisk:s72033-advipservicesk9_wan-vz.122-33.SXI2a.bin,12;
Operating Mode = sso
Primary Version = bootdisk:s72033-advipservicesk9_wan-vz.122-33.SXI2a.bin
Secondary Version = bootdisk:s72033-advipservicesk9_wan-vz.122-33.SXI2a.bin
Current Version = bootdisk:s72033-advipservicesk9_wan-vz.122-33.SXI2a.bin
Variable Store = PrstVbl
% Standby information is not available because it is in 'DISABLED' state
Router(config)#boot system flash sup-bootdisk:s72033-advipservicesk9_wan-vz.122-33.SXI2a.bin
% ISSU process is in progress; Boot variable can not be updated.
Router#issu abortversion
The system is without a fully initialized peer and Service impact will occur. Proceed with abort? [confirm]
% ISSU process can be aborted only from [ Load Version ] or [ Run Version ] or [ Load Version - Switchover ] or [ Run Version - Switchover ] state
After trying a couple of commands and looking on Google I was no closer to getting the box back into a state that I could do anything with... until it dawned on me: “I wonder if a clear command exists for ISSU to reset the flag”. Well, yes it does:
Router#clear issu state
Router#sh issu state detail
Slot = 5
RP State = Active
ISSU State = Init
Boot Variable = bootdisk:s72033-advipservicesk9_wan-vz.122-33.SXI2a.bin,12;
Operating Mode = sso
Primary Version = N/A
Secondary Version = N/A
Current Version = bootdisk:s72033-advipservicesk9_wan-vz.122-33.SXI2a.bin
Variable Store = PrstVbl
% Standby information is not available because it is in 'DISABLED' state
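If you wanted to script this check rather than eyeball it, a minimal Python sketch could look like the one below. It assumes you have already captured the text of 'sh issu state detail' by whatever means you normally use; the function names are illustrative and not part of any Cisco tool, and the sample strings are trimmed copies of the output shown here.

```python
def parse_issu_state(output: str) -> dict:
    """Turn the 'Key = Value' lines of 'sh issu state detail' into a dict."""
    state = {}
    for line in output.splitlines():
        if line.lstrip().startswith("%"):
            continue  # informational lines such as '% Standby information ...'
        key, sep, value = line.partition("=")
        if sep:
            state[key.strip()] = value.strip()
    return state


def issu_is_idle(output: str) -> bool:
    """True when no ISSU upgrade is flagged as in progress."""
    return parse_issu_state(output).get("ISSU State") == "Init"


# Trimmed samples based on the output before and after 'clear issu state'.
STUCK = """\
Slot = 5
RP State = Active
ISSU State = System Reset
Operating Mode = sso
% Standby information is not available because it is in 'DISABLED' state
"""

CLEARED = """\
Slot = 5
RP State = Active
ISSU State = Init
Operating Mode = sso
% Standby information is not available because it is in 'DISABLED' state
"""

print("stuck idle?  ", issu_is_idle(STUCK))
print("cleared idle?", issu_is_idle(CLEARED))
```

The only thing the sketch relies on is that an idle system reports 'ISSU State = Init', as seen in the output after the clear command.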
Once I put the config-register back and added the required boot statement, I was at a point where I could reconvert to VSS.
Interesting 6500 bug
Category: Networking — Tags: Cisco — redwards @ 8:33 pm
During some recent work we discovered a very interesting bug with the Cisco 6500, specifically the WS-X6708-10GE and WS-X6716-10GE cards. Installing one of these cards into a 6509 or 6513 chassis can cause an existing module to reload, so if you happen to have chosen one of slots 9-13 you have an issue. The module eight slots above the one you install the card into (that is, slot N-8) may reload. So if you place the card in slot 9, slot 1 may reload. The interesting example I can think of is a 6513 with the card installed in slot 13: which slot is 8 above? Yes, slot 5, which will be your supervisor card!
The bug details can be found below, however I like the workaround of powering the switch down to replace a hot swappable card.
CSCsz13049 W2.b: OIR 6708 to create bus stall in slot 12 cause reset on slot 4
Externally found severe (Sev2) bug: R-Resolved
The release notes are:
Problem:
———
When a WS-X6708-10GE module or a WS-X6716-10GE module is inserted into a 6509 or 6513 chassis it may cause the module in slot N-8 to reload.
For eg. Insert into slot 9 may reset module in slot 1
Insert into slot 10 may reset module in slot 2
so on ..
Insert into slot 13 may reset module in slot 5
There is no problem when removing the module from slots 9-13
Conditions:
———–
1. The 6708/6716 module should be inserted in slots 9 to 13.
2. Module insertion is done slowly.
Workaround :
————
1. Use slots other than 9 to 13 when inserting a 6708/6716 module.
or
2. Insert the module when the switch is powered down.
As a workaround provided to previous customers that saw the exact problem, please refer to the following steps:
When seating the module, initially slide the module into place, but do not let the locking arms close more than 45 degrees. At this point the module is ready to be seated, but is not contacting the backplane.
When ready to fully seat the module, close the locking arms from the > 45 degree angle to fully closed. This must be done in less than 5 seconds, which should be easy if the arms are closed through the remaining angle at a steady rate.
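The slot arithmetic in the release note is simple enough to sanity-check before a change window. The sketch below is only an illustration of the N-8 rule quoted above; the function name and the slot range constant are my own naming, not anything from Cisco.

```python
# Slots where inserting a 6708/6716 may trigger the bug (9 to 13 inclusive).
RISKY_SLOTS = range(9, 14)


def module_at_risk(insert_slot: int):
    """Return the slot that may reload (N-8), or None if the slot is safe."""
    if insert_slot in RISKY_SLOTS:
        return insert_slot - 8
    return None


for slot in range(1, 14):
    victim = module_at_risk(slot)
    if victim is None:
        print(f"slot {slot:2}: no other module affected")
    else:
        print(f"slot {slot:2}: module in slot {victim} may reload")
```

Run against your own chassis layout, this makes it obvious whether the slot at risk holds anything you care about.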
Config Register
Category: Networking — Tags: Cisco — redwards @ 4:39 pm
The other week I came across a very useful command after one of the Sup720 cards in one of our 6509s failed. The card was replaced, but during boot-up it would keep dropping back to ROMMON; however, when you typed ‘boot’ it would load as expected. When I was asked to have a look, my immediate thought was ‘config register’ settings.
I carried out the usual configuration checks to ensure the config register was set to 0x2102 (tried from both ROMMON and IOS). After a reload the same problem occurred, even though the output of the various IOS ‘sh’ commands (sh boot, sh ver) and the commands from ROMMON showed everything as OK. I then started to think we had a problem with the IOS image (even though I knew it booted when I tried manually), or that the new card was faulty.
After a bit of digging and head scratching I discovered that I was getting misleading information from the ‘sh boot’ command, as it did not give the answer for the Sup720 card. This was confirmed by issuing ‘remote command switch show boot’, which showed the config register was set to 0x0. Issuing a config-register 0x2102 and rebooting resolved the issue.
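The behaviour makes sense once you look at the bits. The lowest four bits of the config register are the boot field, and a boot field of 0 tells the box to stay at ROMMON, which is exactly what 0x0 was doing here. As a rough illustration, the Python sketch below decodes just the boot field and bit 6 (ignore the startup configuration, the bit that 0x2142 sets). The function name and description strings are illustrative, and the meaning of boot field 1 varies by platform, so check the platform documentation for the full layout.

```python
def decode_confreg(value: int) -> dict:
    """Decode the boot field and the 'ignore startup-config' bit."""
    boot_field = value & 0x000F  # lowest four bits
    if boot_field == 0x0:
        boot_action = "stay at ROMMON"
    elif boot_field == 0x1:
        boot_action = "boot first image in flash (platform dependent)"
    else:
        boot_action = "boot using the 'boot system' statements"
    return {
        "boot_field": boot_field,
        "boot_action": boot_action,
        "ignore_startup_config": bool(value & 0x0040),  # bit 6
    }


for reg in (0x2102, 0x2142, 0x0):
    print(f"0x{reg:04X}", decode_confreg(reg))
```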
Image upgrade on SSLM
Category: Networking — Tags: Cisco, SLB, SSLM, SSM — redwards @ 9:33 pm
I have been hitting a bug with the version of code we are running on our Cisco SSLM, which is holding up a project rollout; as a result TAC have provided a version in which it is fixed. I decided to give it a whirl in the test lab, but spent a rather long time trying to work out how you upgrade the images on these modules, with very little luck. It turns out it is straightforward (aren't all things when you know how to do them).
The procedure that I followed was;
1) Back up the configuration on both the 6500 and the SSLM. I also tend to take a copy of the current running image in case reversion is required.
2) Put the SSLM into maintenance mode. This is achieved with the following command from the 6500:
SLB-Switch#hw-module module 7 reset cf:1
Device BOOT variable for reset = <cf:1>
Warning: Device list is not verified.
Proceed with reload of module?[confirm]
% reset issued for module 7
You will then notice that if you do a ‘show module’ the status will initially move to ‘Other’ before the card type displays (MP).
SLB-Switch#sh mod 7
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  7     1 SSL Module                             WS-SVC-SSL-1       XXXXXXX
Mod MAC addresses                      Hw     Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  7 xxxx.xxxx to xxxx.xxxx             3.2    7.2(1)       2.1(4)       Other
Mod  Online Diag Status
---- -------------------
   7 Unknown
SLB-Switch# sh mod 7
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  7     1 SSL Module (MP)                        WS-SVC-SSL-1       XXXXXXX
Mod MAC addresses                      Hw     Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  7 0014.a90c.c53a to 0014.a90c.c541   3.2    7.2(1)       2.1(2)m      Ok
Mod  Online Diag Status
---- -------------------
   7 Pass
The logs will also show something similar to this.
Jul 28 15:33:10.003: SP: OS_BOOT_STATUS(7) MP OS Boot Status: finished booting
Jul 28 15:33:37.079: %DIAG-SP-6-RUN_MINIMUM: Module 7: Running Minimal Diagnostics...
Jul 28 15:33:43.111: %SNMP-5-MODULETRAP: Module 7 [Up] Trap
Jul 28 15:33:43.115: %MLS_RATE-4-DISABLING: The Layer2 Rate Limiters have been disabled.
Jul 28 15:33:42.855: %DIAG-SP-6-DIAG_OK: Module 7: Passed Online Diagnostics
Jul 28 15:33:43.143: %OIR-SP-6-INSCARD: Card inserted in slot 7, interfaces are now online
Jul 28 15:26:16.507: %SNMP-5-MODULETRAP: Module 7 [Down] Trap
Jul 28 15:26:16.507: SP: The PC in slot 7 is shutting down. Please wait ...
Jul 28 15:26:40.331: SP: PC shutdown completed for module 7
Jul 28 15:26:40.343: %C6KPWR-SP-4-DISABLED: power to module in slot 7 set off (Reset)
Jul 28 15:27:30.882: SP: OS_BOOT_STATUS(7) AP OS Boot Status: finished booting
Jul 28 15:28:35.337: %DIAG-SP-6-RUN_MINIMUM: Module 7: Running Minimal Diagnostics...
Jul 28 15:28:36.085: %DIAG-SP-6-DIAG_OK: Module 7: Passed Online Diagnostics
Jul 28 15:28:36.343: %SNMP-5-MODULETRAP: Module 7 [Up] Trap
Jul 28 15:28:36.347: %MLS_RATE-4-DISABLING: The Layer2 Rate Limiters have been disabled.
Jul 28 15:28:36.373: %OIR-SP-6-INSCARD: Card inserted in slot 7, interfaces are now online
Jul 28 15:30:32.691: %SNMP-5-MODULETRAP: Module 7 [Down] Trap
Jul 28 15:30:32.691: SP: The PC in slot 7 is shutting down. Please wait ...
Jul 28 15:30:53.555: SP: PC shutdown completed for module 7
Jul 28 15:30:53.563: %C6KPWR-SP-4-DISABLED: power to module in slot 7 set off (Reset)
3) We then install the image by retrieving it from a TFTP server. In the pclc#7 part, simply replace 7 with the slot number of the module you are updating.
SLB-Switch#copy tftp: pclc#7-fs:
Address or name of remote host []? 192.168.1.1
Source filename []? c6svc-ssl-k9y9.2-1-12-6.bin
Destination filename [c6svc-ssl-k9y9.2-1-12-6.bin]?
Accessing tftp://192.168.1.1/c6svc-ssl-k9y9.2-1-12-6.bin...
Loading c6svc-ssl-k9y9.2-1-12-6.bin from 128.253.20.23 (via GigabitEthernet4/2): !!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!! etc..
[OK - 18507249 bytes]
18507249 bytes copied in 158.988 secs (116407 bytes/sec)
SLB-Switch#
The logs will look something like this. Note that it takes around 5 minutes to upgrade the image. The switch doesn’t really tell you that, although if you try to restart the module you will get an error message back. I just kept checking the logs until I saw the desired confirmation.
Jul 28 15:38:03.483: %SVCLC-SP-5-STRRECVD: mod 7: <Application upgrade has started>
Jul 28 15:38:03.483: %SVCLC-SP-5-STRRECVD: mod 7: <Do not reset the module till upgrade completes!!>
Jul 28 15:43:32.035: %SVCLC-SP-5-STRRECVD: mod 7: <Application upgrade has succeeded>
Jul 28 15:43:32.035: %SVCLC-SP-5-STRRECVD: mod 7: <You can now reset the module>
4) Once you see the message that the upgrade has been successful, you can reload the module for it to take effect.
SLB-Switch#hw-module module 7 reset
Device BOOT variable for reset = <empty>
Warning: Device list is not verified.
Proceed with reload of module?[confirm]
% reset issued for module 7
The logs will show something like this (similar to previously):
Jul 28 15:44:14.079: %SNMP-5-MODULETRAP: Module 7 [Down] Trap
Jul 28 15:44:14.075: SP: The PC in slot 7 is shutting down. Please wait ...
Jul 28 15:44:36.619: SP: PC shutdown completed for module 7
Jul 28 15:44:36.631: %C6KPWR-SP-4-DISABLED: power to module in slot 7 set off (Reset)
Jul 28 15:45:27.095: SP: OS_BOOT_STATUS(7) AP OS Boot Status: finished booting
5) You can then do a ‘show module’ and confirm that your shiny new version of code is running.
SLB-Switch# sh mod 7
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  7     1 SSL Module                             WS-SVC-SSL-1       xxxxxx
Mod MAC addresses                      Hw     Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  7 xxxx.xxxxx to xxxx.xxxxx           3.2    7.2(1)       2.1(12)      Ok
Mod  Online Diag Status
---- -------------------
   7 Pass
Don’t panic if the process takes a while. It surprised me how long it took in the lab, especially as we had pretty much no configuration.
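To put a number on "a while", you can work out the upgrade time from the two SVCLC log messages. The Python sketch below does that using the timestamps from my lab logs; the function names are illustrative, and because the log lines carry no year, a placeholder year is added purely so the timestamp can be parsed.

```python
from datetime import datetime, timedelta

# Upgrade messages as they appeared in the switch log during the lab upgrade.
LOG = """\
Jul 28 15:38:03.483: %SVCLC-SP-5-STRRECVD: mod 7: <Application upgrade has started>
Jul 28 15:38:03.483: %SVCLC-SP-5-STRRECVD: mod 7: <Do not reset the module till upgrade completes!!>
Jul 28 15:43:32.035: %SVCLC-SP-5-STRRECVD: mod 7: <Application upgrade has succeeded>
Jul 28 15:43:32.035: %SVCLC-SP-5-STRRECVD: mod 7: <You can now reset the module>
"""


def log_time(line: str) -> datetime:
    """Parse the leading 'Jul 28 15:38:03.483' timestamp of a log line."""
    stamp = line.split(": ", 1)[0]
    # The year is a placeholder; only the difference between times matters.
    return datetime.strptime("2000 " + stamp, "%Y %b %d %H:%M:%S.%f")


def upgrade_duration(log: str):
    """Time between the 'started' and 'succeeded' messages, or None."""
    started = finished = None
    for line in log.splitlines():
        if "upgrade has started" in line:
            started = log_time(line)
        elif "upgrade has succeeded" in line:
            finished = log_time(line)
    if started is None or finished is None:
        return None
    return finished - started


print(upgrade_duration(LOG))
```

For the lab run above this comes out at just under five and a half minutes, which matches the "around 5 minutes" I saw.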
