XC40 Installed Patches

This page lists the eFixes installed on each Cray system and the date each eFix was installed.

Legend
CLE Cray Linux Environment
PE Programming Environment
SLE SUSE Linux Enterprise
SMW System Management Workstation
SXN Sonexion Storage System

List of Current Patches

To see the installed CLE and SMW patches and their installation dates, system administrators can run xtshowrev -p on the SMW.

You can view the installed PE releases and their installation dates from any login node by running:

$ ls -ltr /opt/cray/pe/modulefiles/cdt/*
-rw-r--r-- 1 root root 2417 Apr 20  2016 /opt/cray/pe/modulefiles/cdt/16.04
-rw-r--r-- 1 root root 2389 Jul  7  2016 /opt/cray/pe/modulefiles/cdt/16.06
-rw-r--r-- 1 root root 2389 Jul 21  2016 /opt/cray/pe/modulefiles/cdt/16.07
-rw-r--r-- 1 root root 2389 Aug 13 11:45 /opt/cray/pe/modulefiles/cdt/16.08
-rw-r--r-- 1 root root 2374 Oct 10 21:01 /opt/cray/pe/modulefiles/cdt/16.10
-rw-r--r-- 1 root root 2374 Nov 14 19:11 /opt/cray/pe/modulefiles/cdt/16.11
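
If only the newest PE release is of interest, the listing above can be reduced to a single version string. The following is a minimal sketch, not a site-provided tool: the function name latest_cdt is hypothetical, and it assumes GNU sort (for the -V version-sort option) and that the non-hidden files in the directory are all named by release, as in the listing above.

```shell
# Hypothetical helper: print the highest-numbered cdt modulefile.
# $1 is the modulefile directory; it defaults to the path shown above.
# Plain `ls` skips hidden files such as .version, and `sort -V`
# (GNU coreutils) orders names as version numbers, so 16.11 < 17.02.
latest_cdt() {
  ls "${1:-/opt/cray/pe/modulefiles/cdt}" | sort -V | tail -n 1
}

# Usage on a login node: latest_cdt
```

Sorting by name rather than by modification time is a deliberate choice here: a modulefile can be touched after installation, whereas the release number does not change.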

 

Date Patchset Fixes Obsoletes Notes
14 June 2016 PE_16.04    
07 July 2016 PE_16.06    
21 July 2016 PE_16.07    
13 August 2016 PE_16.08    
10 October 2016 PE_16.10    
19 October 2016 SXN_2.1.SU-002
  • ldlm: wrong evict during failover
  • remove 'lctl notransno' and 'lctl readonly' commands from XYMNTR stop operation
  • update trinity code to handle new versioning schema
  • correct lustre_config of primary MGS
  • Lustre i/o hung waiting for page (bug)
  • wakeup osp_precreate_reserve on umount
  • properly update last_rcvd file
  • fix MDT->OST create/reconnection race
  • IOR data compare error during OST failover test (due to readonly remount) (bug 839072)
  • correctly apply Pacemaker config updates in SU installs
  • OSS crash, ASSERTION(lock->l_export == opd->opd_exp) failed (bug 826317)
  • server crash ASSERTION(flock->blocking_export != 0) failure (bug 829693)
  • IOR job hang during OST failover/failback (bug 838602)
  • ldlm: soft lockup in ldlm_plain_compat_queue
  • remove force option from XYMNTR Lustre lazy unmount path
  • ensure correct Lustre upcall setting
  • increase timeout for crash memory dump procedure
  • suppress repetitive errors while communicating w/GEM
   
14 November 2016 PE_16.11    
30 January 2017 SXN_2.1.SU-004
  • Sonexion_System_Update_Bundle_with_Release_Notes_2.1.0-004_S-2533.pdf
Obsoletes: SXN_2.1.SU-002
06 February 2017 PE_17.02    
07 March 2017 CLE_6.0.UP03
  • README
  • Errata
  • What's New
   
07 March 2017 SMW_8.0.UP03
  • README
  • Errata
   
07 March 2017 PE_17.03    
08 March 2017 CLE_6.0.UP03.PS02
  • 846158 - File creates fail intermittently when the dwcfs cache is full
   
08 March 2017 CLE_6.0.UP03.PS03
  • NETARIES-54 Add function to initialize all the fields of a memory handle
  • CLEDUMP-13 cdump fails with errors when attempting to dump a Broadwell node with 1TB of memory
   
08 March 2017 CLE_6.0.UP03.PS04
  • 847735 - kdwfs: dd gets 'No space left on device' after writing to about half ...
  • 847792 - Service ("batch") nodes become unresponsive handle
   
08 March 2017 CLE_6.0.UP03.PS05
  • 847594 - elogin ethernets fail to order correctly
   
08 March 2017 CLE_6.0.UP03.PS06
  • 847273 - WARNING: ../fs/btrfs/extent-tree.c:3731
  • 847656 - security updates SLES12 SP0 - Set btrfs_free_reserved_data_space_noquota
   
08 March 2017 CLE_6.0.UP03.PS07

PS01 fixes

  • HPC NVIDIA 375.x: Package GLVND GLX and EGL libraries
  • 844859 - Missing NVIDIA libraries in CLE6UP02 for EGL

PS07 fixes

  • JIRA: LINUX-361
   
08 March 2017 SMW_8.0.UP03.PS02

PS01 fixes

  • 847494 - 1601051705 xtwarmswap of 16 blades failed after 1/5 10:46 HSN

PS02 fixes

  • 847464 - Remote PMDB Makefile builds from "redwood" repos rather than "8.0up03"

 

   
09 March 2017 SXN_GOBI.1.24a
  • Resolved issue of a missed signal causing device timeout and eventual STONITH
   
20 March 2017 SMW_8.0.UP03.PS04
  • 845960 - KNL nodes fail to come out of reset
  • 848317 - SKL: xthwerrlog reporting bogus connector location on some errors
Obsoletes: SMW_8.0.UP03.PS02
20 March 2017 SMW_8.0.UP03.PS05
  • KNL BIOS 6670
   
27 March 2017 CLE_6.0.UP03.PS09
  • 848130 - CAPMC failing to update SDB database during user demand provisioning
   
27 March 2017 CLE_6.0.UP03.PS11
  • 849367 - xfs_create should free_eofblocks after flushing inodes on ENOSPC
Obsoletes: CLE_6.0.UP03.PS04
27 March 2017 CLE_6.0.UP03.PS12

PS10 fixes

  • 848966 - Data warp miscompare issues after UP03 upgrade
  • 849373 - Short writes aren't handled properly by DVS
  • 849371 - kdwfs: wrong error value returned from substripe deferred create failures

PS12 fixes

  • 849230 - Compute Ansible "[ids | task idsautofs, IDS AutoFS Handoff]" Failed ...
Obsoletes: CLE_6.0.UP03.PS02
27 March 2017 SMW_8.0.UP03.PS06
  • 847785 - erd lost connection, or BC in bad state; may be related to SWO
  • 847638 - capmc get_mcdram_capabilities command failing => xtremoted failed : open file descriptor limit reached
  • 846326 - unexpected link failures after warmswaps of adjacent blades
Obsoletes: SMW_8.0.UP03.PS04
03 April 2017 CLE_6.0.UP03.PS14
  • 838211 - Failed to start NTP Server Daemon
   
10 April 2017 CLE_6.0.UP03.PS15
  • 848384 - Jobstats ID lookup performance degradation
   
17 April 2017 PE_17.04    
08 May 2017 SMW_8.0.UP03.PS09

PS08 fixes

  • 848868 - sn 2574 Non-critical -- [INC0097699] -- Cori (sn2574) 16 quad node are down for reason: "Not responding"

PS09 fixes

  • 850437 - 20170418 xthwerrlog enters infinite loop.
  • 848861 - 20170219 Anomalous CPU temperature reading (90C) on nodes with zero power draw is causing cab fans to run at high speed.
Obsoletes: SMW_8.0.UP03.PS06
22 May 2017 PE_17.05      
05 June 2017 SMW_8.0.UP03.PS10
  • 850010 - 1704041530 Excessive KNL uncorrectable MCDRAM errors after BIOS 6933 installed.
Obsoletes: SMW_8.0.UP03.PS05