Wednesday, August 8, 2018

z/VSE VTAPE Performance and Tuning



We recently had a customer contact us asking about z/VSE VTAPE throughput. 

They asked, "When I FTP data to the machine running our Java VTAPE server I get much higher throughput than my backups using z/VSE VTAPE can achieve. Why?"

Answering this question produced a nice set of network tuning tips and tricks for z/VSE, collected in this article.


How z/VSE VTAPE works ...


You cannot get FTP-class throughput using z/VSE's VTAPE because of its design. Based on Wireshark traces, we deduce that the Java VTAPE server and the TAPESRVR job always transfer 'chunks' of tape data, 1MB by default, each followed by an application level ACK or handshake sequence.

For example, in the case of writing to a virtual tape (perhaps doing a LIBR backup) the process begins by reading a chunk, 1MB by default, to examine/process the volume labels. This is followed by a series of chunks of tape data sent to the VTAPE server. Between each chunk is an application level ACK or handshake sequence. Likely, the application level ACK from the Java VTAPE server is telling the TAPESRVR job in z/VSE that the data transferred was successfully written to disk. In part, it is the latency of this application level ACK or handshake sequence that causes VTAPE to be so slow.
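To get a feel for why the per-chunk handshake hurts, here is a rough back-of-the-envelope model in Python. It is only a sketch: the link speed and ACK latency are made-up figures for illustration, not measurements, and the function name is my own. The only number taken from the description above is the 1MB default chunk.

```python
def effective_throughput_mb_s(chunk_mb, link_mb_s, ack_latency_s):
    """Rough throughput of a stop-and-wait chunked transfer.

    Each chunk takes chunk_mb / link_mb_s seconds on the wire, then the
    sender sits idle for ack_latency_s waiting for the application level ACK.
    """
    time_per_chunk = chunk_mb / link_mb_s + ack_latency_s
    return chunk_mb / time_per_chunk


# Assumed figures for illustration only: a 100 MB/s link and 20 ms per ACK
# (network round trip plus the server writing the chunk to disk).
for chunk_mb in (1, 4, 8):
    rate = effective_throughput_mb_s(chunk_mb, link_mb_s=100.0, ack_latency_s=0.020)
    print(f"{chunk_mb}MB chunks: about {rate:.1f} MB/s")
```

In this simple model the wait for the ACK is a fixed cost per chunk, so anything that shortens the wait or makes each chunk larger raises the effective rate.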

Tuning Tips and Tricks ...



OK, maybe you cannot change the design of the VTAPE application, but what about tuning the environment for best throughput?

Let's start by admitting that not all of these steps will improve throughput. Some may not even be necessary in your environment but some may help and may help a lot. All of these suggestions are about reducing latency in the VTAPE application or in your network. After all, latency is the bane of network throughput.


Linux vs. Windows


Let's start by recommending Linux over Windows for running the Java VTAPE server. Our experience shows the Java VTAPE server running under Linux to be faster than when using a Windows machine. Perhaps you are a 'Windows shop' and you have been looking for an opportunity to add a Linux machine to your data center ... Well, here is your chance.


Many customers are now beginning to use a Linux on Z image to run their Java VTAPE server, also a very good idea. In fact, with more and more customers using SSL/TLS for all data transfers, using a Linux on Z image for your Java VTAPE server is an excellent idea. Linux on Z using Hipersocket networking is very fast and SSL/TLS is unneeded since data transferred to and from z/VSE never leaves the Z box.



Java Interpret vs. Dynamic Compilation 

Most x86 (32-bit or 64-bit) machines will have a version of Java installed. This version of Java will likely be running in mixed mode. Mixed mode indicates dynamic compilation of Java byte codes is available and will be used. If your Java installation is running in interpreted mode, this can cause big slowdowns in throughput and high CPU usage.

The java -version command allows you to verify this.


zPDT3:/tmp # java -version
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (IcedTea 3.6.0) (build 1.8.0_151-b12 suse-18.1-x86_64)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)



Recently, however, after installing SLES 12 SP2, I found the following Java packages installed ...

  SLES 12 SP2 
   java-1_8_0-ibm
   java-1_8_0-openjdk
   java-1_8_0-openjdk-headless 

  All of the above installed by default 

The java -version command displayed this output ...

jcb@sles12sp2:~> java -version
openjdk version "1.8.0_101"
OpenJDK Runtime Environment (IcedTea 3.1.0) (suse-14.3-s390x)
OpenJDK 64-Bit Zero VM (build 25.101-b13, interpreted mode) 


The java-1_8_0-openjdk package was being used and was running in interpreted mode. This was causing throughput problems and very high CPU usage.

Using yast to remove the java-1_8_0-openjdk *and* java-1_8_0-openjdk-headless packages resulted in the SLES 12 SP2 image using the java-1_8_0-ibm package.

jcb@sles12sp2:~> java -version
java version "1.8.0"
Java(TM) SE Runtime Environment (build pxz6480sr3-20160428_01(SR3))
IBM J9 VM (build 2.8, JRE 1.8.0 Linux s390x-64 Compressed References 20160427_301573 (JIT enabled, AOT enabled)
J9VM - R28_Java8_SR3_20160427_1620_B301573
JIT  - tr.r14.java.green_20160329_114288
GC   - R28_Java8_SR3_20160427_1620_B301573_CMPRSS
J9CL - 20160427_301573)
JCL - 20160421_01 based on Oracle jdk8u91-b14


The Java VTAPE server now runs compiled and is much faster with far lower CPU usage. Testing showed a big increase in throughput, with the duration of VTAPE backup jobs reduced by 50%.
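If you have several VTAPE server machines to check, the test can be scripted. The Python sketch below simply looks for the mode strings seen in the java -version outputs above ("interpreted mode", "mixed mode", "JIT enabled"). The function names are my own, and other JVMs may word their banner differently, so treat an 'unknown' result as "go and look by hand".

```python
import subprocess


def jvm_execution_mode(version_output):
    """Classify `java -version` output as 'interpreted', 'compiled' or 'unknown'."""
    text = version_output.lower()
    if "interpreted mode" in text:
        return "interpreted"
    # HotSpot reports "mixed mode"; the IBM J9 VM reports "JIT enabled".
    if "mixed mode" in text or "jit enabled" in text:
        return "compiled"
    return "unknown"


def current_jvm_mode(java="java"):
    """Run `java -version` and classify the JVM found on this machine."""
    # `java -version` writes its banner to stderr, not stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return jvm_execution_mode(result.stderr + result.stdout)
```

Calling current_jvm_mode() on the SLES 12 SP2 image before removing the openjdk packages would have reported 'interpreted'.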

MTU Sizing


Most networks use the standard Ethernet MTU size of 1500 bytes. However, with the advent of Gigabit networking, Jumbo Ethernet frames appeared. Jumbo Ethernet frames can contain up to 9000 bytes of data. z/VSE's IJBOSA OSA Express driver, provided by IBM, supports Ethernet frames up to 9000 bytes and a Hipersocket MFS (Maximum Frame Size) of up to 64K. I have found that Linux systems do not like an MTU size of 9000. However, Linux is very happy with an Ethernet MTU of 8960 (8K plus 768 bytes) and a Hipersocket MTU size of 56K. IPv6/VSE, from BSI, also supports these MTU sizes.

Since Jumbo Ethernet frames contain about 6x as much data as a standard Ethernet frame, throughput can improve dramatically. A Hipersocket MTU of 56K carries roughly 38x as much data per frame.
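The arithmetic behind those ratios is easy to reproduce. This Python sketch compares the MTU sizes mentioned above against the standard 1500 byte MTU and counts how many frames it takes to move 1MB. It ignores IP/TCP header overhead, so the numbers are approximate, and the helper name is my own.

```python
STANDARD_MTU = 1500          # standard Ethernet MTU, in bytes
JUMBO_MTU = 8960             # the Linux-friendly jumbo MTU (8K plus 768 bytes)
HIPERSOCKET_MTU = 56 * 1024  # 56K Hipersocket MTU


def frames_needed(payload_bytes, mtu):
    """Frames needed to carry payload_bytes, ignoring header overhead."""
    return -(-payload_bytes // mtu)  # ceiling division


ONE_MB = 1024 * 1024
for name, mtu in [("standard", STANDARD_MTU),
                  ("jumbo", JUMBO_MTU),
                  ("hipersocket", HIPERSOCKET_MTU)]:
    ratio = mtu / STANDARD_MTU
    print(f"{name:12} MTU {mtu:6}: {ratio:5.1f}x standard, "
          f"{frames_needed(ONE_MB, mtu)} frames per 1MB")
```

Fewer frames per megabyte means fewer interrupts and less per-frame processing on both ends of the link.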


Ethernet

If your z/VSE system's OSA Express adapter is connected to a Gigabit switch and the machine running the Java VTAPE server is also connected to the same switch, then using Jumbo Ethernet frames may be possible. Verify that the switch and the NIC being used by the Java VTAPE machine support Jumbo Ethernet frames. If you are using managed Gigabit Ethernet switches, remember that frame size can often be configured for each switch port.

Throughput using Jumbo Ethernet Frames can be much higher.



Hipersockets

Based on customer information, I have been told that CSI's TCP/IP for VSE uses 32K MTUs when using Hipersocket links. In this case, defining the Hipersocket link with an MFS of 40K may help throughput. Using a z/VSE MTU of 32K with a Hipersocket MFS of 64K has been shown to cause slowdowns. Likely this is due to forcing Hipersockets to transfer 64K when only 32K of data is being transferred.

BSI's IPv6/VSE supports maximum size, 64K, MFS definitions for Hipersocket links. Customers have reported Gigabit+ throughput using Hipersocket links.


Linux on Z will choose an MTU size just less than the MFS size. In the case of 64K MFS, Linux will use a 56K MTU. IPv6/VSE will use a 56K MTU also.


For example, this BSTTFTPC batch FTP job transferred a 100MB file in 0.673 seconds.

BSTT023I   100M BYTES IN  0.673 SECS. RATE   152M/SEC 

Remember, Hipersockets is a CPU function. So, limiting the available CPU on any machine using Hipersockets can and likely will have an effect on throughput.



TCP Window Scaling


TCP window scaling allows TCP sockets to use larger windows of data. The TCP window is the maximum amount of data that can be sent without being acknowledged by the remote host. CSI's TCP/IP for VSE, I have been told, supports only standard 64K fixed size TCP windows. Our IPv6/VSE product fully supports TCP window scaling. IPv6/VSE's large TCP windows range from 1MB to 8MB in size. This allows IPv6/VSE to send far more data without waiting for the remote host to send an acknowledgement. Since TCP acknowledgements are cumulative, a remote host can acknowledge a large amount of received data with a single acknowledgement.

IPv6/VSE's TCP window scaling can dramatically increase throughput. Reductions in run times of data transfer jobs can be as high as 95%.
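The reason window size matters so much is that a single TCP connection can have at most one window of data in flight per round trip. The Python sketch below works out that ceiling for the window sizes mentioned above. The 10 ms round trip time is an assumed figure for illustration only; plug in your own measured RTT.

```python
def max_throughput_mb_s(window_bytes, rtt_seconds):
    """Upper bound for one TCP connection: one window per round trip."""
    return window_bytes / rtt_seconds / 1_000_000


RTT = 0.010  # assumed 10 ms round trip, for illustration only

for label, window in [("64K fixed window", 64 * 1024),
                      ("1MB scaled window", 1024 * 1024),
                      ("8MB scaled window", 8 * 1024 * 1024)]:
    print(f"{label:18}: at most {max_throughput_mb_s(window, RTT):7.1f} MB/s")
```

On a low latency link the 64K ceiling may never be reached, which is why window scaling pays off most when the round trip time is noticeable.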

Device and Cache Sizing



The Java VTAPE server reads and writes virtual tape data from and to file system files. These files may reside on a local SSD or hard drive, or on a remote SAN or Samba share. The speed at which the Java VTAPE server can read and write file data makes a difference in VTAPE throughput. So, the faster the Java VTAPE server can access the device, the better.


In addition, sizing the amount of memory available to the Linux or Windows system will help. On both Linux and Windows, memory not actively being used by an application is generally available to cache file data.


Ensuring the Linux or Windows system has plenty of system memory for caching of file data and using a fast local storage device can really help VTAPE throughput. x86 Intel Linux and Windows images generally have 4GB (or more) of memory these days. Linux on Z, however, can require additional calculations to optimize the size of memory available.


As a starting point for Linux on Z images, try 1GB plus the average size of your virtual tape files. For example, if your VTAPE files average 1GB in size, try starting with 1GB + 1GB = 2GB of Linux on Z system memory. Tune up or down from there.
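The rule of thumb above is simple enough to script if you have a list of your tape file sizes. This is just a sketch of that 1GB-plus-average calculation in Python. The function name is my own, and the result is only a starting point for tuning.

```python
GIB = 1024 ** 3


def starting_memory_bytes(tape_file_sizes, base_bytes=GIB):
    """Starting point for Linux on Z memory: 1GB plus the average tape file size."""
    if not tape_file_sizes:
        return base_bytes
    average = sum(tape_file_sizes) / len(tape_file_sizes)
    return int(base_bytes + average)


# Example: three virtual tape files of 0.5GB, 1GB and 1.5GB average out to 1GB.
sizes = [GIB // 2, GIB, GIB + GIB // 2]
print(f"Start with about {starting_memory_bytes(sizes) / GIB:.1f}GB")
```

The sizes could be collected with os.path.getsize() over the directory holding your virtual tape files.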


From the z/VSE TAPESRVR side, the utility you are using to create the virtual output tape is important. For example, trying to improve the performance of a LIBR BACKUP job can be very difficult. Why? z/VSE libraries have fixed length records of 1024 (1K) bytes. While the LIBR program can read multiple blocks in a single I/O, this only occurs if the blocks are contiguous. Accessing a z/VSE library is a very I/O intensive process. Using FCOPY or IDCAMS backup facilities can provide much faster access to data on disk. 3rd party backup and restore utilities are optimized for best disk access performance but still may have options for improving read or write access. In-house applications (e.g., COBOL) should be reviewed for opportunities to improve read or write access. Specifying VSAM buffer space/counts in the job's JCL can have a dramatic impact on performance and improve throughput.


Linux Sockets


The Linux system configuration file /etc/sysctl.conf is used to modify various system defaults. In the case of a Linux system used to host the Java VTAPE server, changing the Linux system defaults for TCP window scaling buffer sizes may improve performance.


net.core.rmem_default = 4096000
net.core.rmem_max = 16777216
net.core.wmem_default = 4096000
net.core.wmem_max = 16777216 


Modifications made to /etc/sysctl.conf can be activated by restarting the Linux image or by using the sysctl -p /etc/sysctl.conf command.
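If you manage more than one Linux image, a small script can tell you which of these settings are missing or lower than the values suggested above. The Python sketch below parses sysctl.conf style text and compares it against those four values. The function names are my own, and the suggested values are simply the ones from this article, not universal recommendations.

```python
# Values suggested in this article.
SUGGESTED = {
    "net.core.rmem_default": 4096000,
    "net.core.rmem_max": 16777216,
    "net.core.wmem_default": 4096000,
    "net.core.wmem_max": 16777216,
}


def parse_sysctl_conf(text):
    """Parse 'key = value' lines, skipping blank lines and # or ; comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def below_suggested(settings, suggested=SUGGESTED):
    """Return the keys that are missing or set lower than the suggested value."""
    low = []
    for key, wanted in suggested.items():
        current = settings.get(key)
        if current is None or not current.isdigit() or int(current) < wanted:
            low.append(key)
    return low
```

For example, below_suggested(parse_sysctl_conf(open("/etc/sysctl.conf").read())) lists what is still to be set in the file. The values currently in effect can be read from files such as /proc/sys/net/core/rmem_max.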

z/VSE VTAPE Buffer Size 



The default 'chunk' size used by VTAPE on z/VSE systems is 1MB. This size can be changed by using the SIR VTAPEBUF=nnM command. The minimum is 1M and the maximum 15M. I do not believe there is any official documentation on this command. It is, however, referenced indirectly in the z/VSE Hints and Tips manual.


SIR VTAPEBUF=1M (default)


By default, under z/VSE 5.1+, IPv6/VSE uses 4M TCP windows. Try ...

SIR VTAPEBUF=4M


Additional performance may be achieved by adding the SHIFT 7 command to the BSTTINET/BSTT6NET stack startup commands under z/VSE 5.1+. This results in using 8M TCP windows. In this case, try ...

SIR VTAPEBUF=8M 


Remember, increasing the VTAPEBUF size also increases the amount of 31-bit System GETVIS required by the TAPESRVR partition.
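The pattern in the examples above is to match the VTAPE chunk size to the TCP window size while staying inside the 1M to 15M limits. Here is a tiny Python sketch of that rule of thumb. The helper name is my own, and it does nothing more than build the command text for you to enter.

```python
def vtapebuf_command(tcp_window_mb):
    """Build a SIR VTAPEBUF command sized to the TCP window, kept within 1M..15M."""
    size = max(1, min(15, int(tcp_window_mb)))
    return f"SIR VTAPEBUF={size}M"


print(vtapebuf_command(4))   # default IPv6/VSE 4M windows
print(vtapebuf_command(8))   # 8M windows with SHIFT 7
```

Whether matching the two sizes gives the best result on your system is something only testing will show.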




VTAPE Compression

Using the file suffix .zaws invokes zip style data compression within the TAPESRVR partition. While this option does reduce the size of the .aws tape files, it also increases the amount of CPU used by the TAPESRVR partition by a factor of 2x to 3x. It will also reduce the throughput of the VTAPE transfer.

If you do want to use compressed VTAPE files, ensure you have plenty of CPU available on the machine running the TAPESRVR job. And, using the fastest CPU you have available to run the Java VTAPE server will help too.

QDIO or Hipersocket Queue Buffers


Linux on Z QDIO Ethernet or Hipersocket buffers can be changed, but this is usually not necessary.
On our SLES 12 SP2 image ...


sles12sp2:/sys/bus/ccwgroup/drivers/qeth/0.0.0360 # cat buffer_count
128
sles12sp2:/sys/bus/ccwgroup/drivers/qeth/0.0.0360 # cat inbuf_size
64k


128 x 64K buffers is plenty.

On z/VSE the number of QDIO/Hipersocket input buffers has been configurable since z/VSE 5.1, and the number of output buffers since z/VSE 6.1. See the TCP/IP Support manual for more details.

Additional input / output queue buffers may improve TCP/IP performance.

The input / output queue buffers can be configured in the IJBOCONF phase. You may use the skeleton SKOSACFG in ICCF library 59 to configure input / output queue buffers. The skeleton also describes the syntax of the configuration statements.


The z/VSE default is 8 x 64K input buffers and 8 x 64K output buffers. This default amount requires 1MB of page fixed 31-bit partition GETVIS in the IPv6/VSE BSTTINET/BSTT6NET stack partition.

Each additional 8 x 64K input and 8 x 64K output buffers requires 1MB of additional page fixed 31-bit partition GETVIS. So, if you change the default buffer counts, remember to change your // SETPFIX statement in your stack partition JCL.

For z/VSE, increasing the input/output buffer count to 16 or 32 might be useful.
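To work out how much extra page fixed storage a larger buffer count needs, you can use the figures above: each buffer is 64K, so 8 input plus 8 output buffers come to 1MB. The Python sketch below just does that multiplication. The function name is my own, and it covers only the queue buffers, not anything else your stack partition page fixes.

```python
BUFFER_BYTES = 64 * 1024  # each QDIO/Hipersocket queue buffer is 64K
MB = 1024 * 1024


def pfix_bytes(input_buffers, output_buffers):
    """Page fixed 31-bit partition GETVIS needed for the queue buffers."""
    return (input_buffers + output_buffers) * BUFFER_BYTES


for count in (8, 16, 32):
    needed = pfix_bytes(count, count)
    print(f"{count} input + {count} output buffers: {needed // MB}MB page fixed")
```

Use the result to adjust the // SETPFIX statement in the stack partition JCL.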


Linux Fast Path (LFP)


For most workloads the default parameters used during LFP startup should be fine. However, there are a couple of values to monitor.

INITIALBUFFERSPACE   = 512K
MAXBUFFERSPACE       = 4M
IUCVMSGLIMIT         = 1024


These values can be monitored by running // EXEC IJBLFPOP,PARM='INFO nn'

*** BUFFER MANAGER ***                          
  CURRENTLY USED MEMORY .......... : 524,160    
  INITIAL MEMORY SIZE ............ : 524,288    
  MAXIMUM MEMORY SIZE ............ : 4,194,304  


If CURRENTLY USED MEMORY is close to MAXIMUM MEMORY SIZE then you should probably increase the MAXBUFFERSPACE setting. CURRENTLY USED MEMORY may grow over time (up to MAXBUFFERSPACE), as tasks require buffers depending on the socket-workload. LFP allocates more buffers (if still below MAXBUFFERSPACE) as needed, but will never shrink the buffer space once allocated. All LFP buffer storage is allocated from 31-bit System GETVIS. 

When LFP is low on buffers, this might cause delays because tasks are put into a wait until buffers become available again from other tasks or, in the worst case, socket calls fail because no more buffers are available.

If lots of concurrent tasks use LFP, you might also watch these lines ...

TASKS WAITING FOR MSGLIMIT ..... : 0
TIMES IN WAIT FOR MSGLIMIT ..... : 0
IUCV MSGLIMIT EXCEEDED ......... : 0


If you see TASKS WAITING FOR MSGLIMIT, increase the IUCV message limit.
Remember, this will likely increase buffer usage so you might want to increase buffer space as well.
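If you capture the INFO output regularly, the two checks described above can be automated. The Python sketch below parses report lines of the form shown earlier and flags the two conditions. The function names and the 90% "close to maximum" threshold are my own assumptions, and the parser only understands the line layout shown in this article.

```python
import re


def parse_info_counters(info_output):
    """Map each 'NAME ..... : 1,234' line of the INFO report to an integer."""
    counters = {}
    for line in info_output.splitlines():
        match = re.match(r"\s*(.+?)\s*\.+\s*:\s*([\d,]+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2).replace(",", ""))
    return counters


def lfp_warnings(counters, threshold=0.9):
    """Return tuning hints; threshold is an assumed 'close to maximum' cutoff."""
    warnings = []
    used = counters.get("CURRENTLY USED MEMORY")
    maximum = counters.get("MAXIMUM MEMORY SIZE")
    if used is not None and maximum and used >= threshold * maximum:
        warnings.append("consider increasing MAXBUFFERSPACE")
    if counters.get("TASKS WAITING FOR MSGLIMIT", 0) > 0:
        warnings.append("consider increasing IUCVMSGLIMIT")
    return warnings
```

With the sample output shown above, lfp_warnings() returns an empty list because the buffer usage is well below the maximum and no tasks are waiting.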

LFP Tuning Tips courtesy of Ingo Frantzki at the z/VSE laboratory. 

Summary


As with all performance and tuning, will all of these tips and tricks help your VTAPE throughput? It depends. Your mileage will vary.

Remember, make one change at a time and evaluate the results before making more changes. Be prepared to 'back out' any change if testing results are not satisfactory.

Well, there you have it. If anyone reading this has thoughts or suggestions, just send me an email (jeff@bsiopti.com) and I will incorporate the information into this article.


Jeff Barnard

Networking and Security
Barnard Software, Inc.



Wednesday, June 21, 2017

BSI IPv6/VSE Build 257 and IBM IPv6/VSE 1.3 Preview


Coming soon from BSI and IBM


BSI IPv6/VSE Build 257
IBM IPv6/VSE 1.3


Highlights


Improved BSTTFTPS FTP Server Security

  3 security methods, simplifies migrations
    BSI Standard Security
    IBM BSSTISX Security
    IBM RACROUTE Security

Improved PDF Generation

  Based on Open Source TXT2PDF product
  Many options to control the conversion and output appearance
  Entirely REXX based
  Full REXX source provided

SSH Secure Copy/Command Facility

  BSTTSCPY SSH Secure Copy Facility uses a Linux/Windows
  Pass-through image to facilitate an SSH connection to
  remote hosts providing for secure file transfer and
  command execution using SSH.

Encrypted Password Support

  Passwords no longer stored as clear text on your system
  Encrypted Password Member used by
    BSTTFTPC Batch FTP Client
    BSTTMTPC Batch Email Client
    BSTTREXC Batch Remote Execution Client
  Encrypted BSTTSCTY.T FTP Security Member used by
    BSTTFTPS FTP Server

BSTTFTPC Batch FTP Client Support for EPIC Catalog Access

  CA-EPIC or BIM/CSI-EPIC
  Eliminates the need to know block/record sizes

BSTTGSTI Utility for TLS Client Certificates

  Extracts SSL/TLS client Certificate information for TN3270E sessions
  For use with multi-factor authentication and signon to CICS

Explicit FTPS and SMTPS Support

  Support for AUTH TLS and STARTTLS commands
  BSTTFTPC Batch FTP Client
  BSTTFTPS FTP Server
  BSTTMTPC Batch Email Client

Implicit FTPS Support still available using BSTTATLS and BSTTPRXY

BSTTPRXY Server Load Balancing

  Automatically balances connections to multiple servers
  Up to 8 PRXY statements per BSTTPRXY partition
  E.g., Run up to 32 FTP Server connections per BSTTPRXY partition
  E.g., Balance CICS TS web connections between multiple listeners

Performance Improvements

  Greatly increased parallel time in the BSTTINET/BSTT6NET stacks
    Improves NP Ratio
  Dramatically reduced CPU overhead in SSL/TLS
    Up to 80% less CPU to create a secure socket
    Up to 25% less CPU processing SSL/TLS requests


Planned availability is fourth quarter, 2017.


Contact Teri at BSI for more information
teri@bsitcpip.com



Sunday, March 20, 2016

CICS TS Workload Management Using OPTI-WORKLOAD

Workload Management

Recently we had a customer request that we update our OPTI-WORKLOAD product to support the latest versions of z/VSE and CICS TS. The updates for supporting the latest z/VSE version were fairly easy, and the customer reports that the z/VSE batch workload management works very well on their systems. The customer has a very large number of batch jobs that run throughout the day (and night). Using OPTI-WORKLOAD's facilities allowed them to have multiple balance groups and to have each group balanced based on the resource usage of each job.

However, updating the CICS/VSE Workload Manager to support CICS TS proved to be more difficult. And, in the process, we learned a bit so we thought we would pass on the information.

Before diving into CICS TS Workload Management I will review some of the basic features of Barnard Software, Inc.'s OPTI-WORKLOAD product.

VSE/ESA and z/VSE Balancing

The priority of the partitions/dynamic classes can be set and changed by using the AR PRTY command. When a dynamic class contains more than one allocated dynamic partition, the partitions within the dynamic class are balanced (time sliced). The time slice value can be modified via the MSECS command.

Partition Balancing

The partition balancing routine inspects the CPU time for each partition of the balancing group and decides how to rearrange the priority of the partitions. VSE/ESA 2.1 (and newer) running the Turbo dispatcher offers a new balancing algorithm. If partition balancing is specified for static and dynamic classes (via equal signs in the PRTY command), static and dynamic partitions will receive the same time slice. Without the Turbo dispatcher a dynamic class (with all its partitions) got the same time slice as a static partition.

OPTI-WORKLOAD Workload Management


OPTI-WORKLOAD introduces the concept of performance groups to VSE/ESA 1.3 (or higher) systems. Performance groups are a group of static partitions and/or dynamic classes with similar performance goals. Once a performance group has been identified, OPTI-WORKLOAD monitors the velocity of work being done by each partition and class. At specified intervals the partition or class achieving the lowest velocity of work is given the highest dispatching priority of those partitions and classes in the performance group. This results in a more equitable allocation of resources and greater overall system throughput.

Velocity of Work

Velocity of work is calculated by monitoring the number of times the partition is actively using the CPU, actively waiting on I/O, delayed waiting on the CPU and delayed waiting on other resources. Velocity is expressed as the percentage of the time a partition has the resources that it needs available to it. Velocity is not a measure of the quality of the work being done, it is simply a measure of how much of the time resources needed by a partition are available to it.
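As a rough illustration, velocity can be thought of as "using" samples divided by all samples. The Python sketch below uses that common using / (using + delayed) form. It is my simplification for this article, not the exact formula inside OPTI-WORKLOAD, and the function names and sample counts are made up.

```python
def velocity_percent(using_cpu, using_io, delayed_cpu, delayed_other):
    """Share of samples in which the partition had the resources it needed."""
    using = using_cpu + using_io
    total = using + delayed_cpu + delayed_other
    if total == 0:
        return None  # no samples collected yet
    return 100.0 * using / total


def lowest_velocity(samples_by_partition):
    """Name of the partition achieving the lowest velocity of work."""
    velocities = {name: velocity_percent(*counts)
                  for name, counts in samples_by_partition.items()}
    measured = {name: v for name, v in velocities.items() if v is not None}
    return min(measured, key=measured.get)


# Made-up sample counts: (using CPU, using I/O, delayed CPU, delayed other)
group = {"F4": (30, 30, 20, 20), "F5": (10, 20, 50, 20), "F6": (40, 40, 10, 10)}
print(lowest_velocity(group))
```

In a performance group, the partition picked by lowest_velocity() is the one that would be given the highest dispatching priority at the next interval.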

CICS Workload Management


OPTI-WORKLOAD provides a special CICS Workload Management feature. The CICS Workload Manager uses Velocity calculations to manage CICS transaction priorities. Proper CICS Workload Management results in dramatic improvements in CICS response times and throughput.

The CICS Workload Manager uses Velocity calculations to increase the priority of CICS transactions with low velocity-of-work (or transactions ‘starving’ for resources) and to decrease the priority of CICS transactions with high velocity-of-work (or transactions monopolizing resources).

The CICS Workload Manager defines a transaction as the work done between sync point requests and measures the amount of work done by counting the number of KCP requests made by the transaction. On CICS systems where applications do not perform sync point requests for themselves, CICS will automatically issue a sync point request at end of task or when the transaction enters a terminal wait. If a transaction uses more than the specified number of work units (the CICSINT threshold value), it is moved down by one priority increment. At the same time, if a transaction is delayed by a higher priority transaction using all available resources, the delayed transaction is moved up by one priority increment. As transactions enter the CICS system, their initial priority defines their starting position in the CICS Workload Manager's management scheme.

The CICSINT threshold value is normally set to a value about 50% higher than the number of work units used by an average transaction. In this way most CICS transactions enter and leave the system with little, if any, management required. However, when longer running transactions enter the system, the priority of these transactions will float downward while the priorities of transactions competing for resources will float upward. By managing the priorities of the various CICS transactions active in a system, the CICS Workload Manager maximizes throughput, resulting in reduced and more consistent response times.

CICS Workload Management Thresholds


Five OPTI-WORKLOAD THRESHOLD commands are used to customize the CICS Workload Management feature. The CICS Workload Manager extracts these threshold values from the OPTI-WORKLOAD partition. Therefore, the OPTI-WORKLOAD partition should be active before the CICS Workload Management feature is initialized. If the OPTI-WORKLOAD partition is not active, initialization will complete and the CICS Workload Manager will enter ‘quiet’ or ‘inactive’ mode. The CICS Workload Manager will synchronize with the OPTI-WORKLOAD partition when it becomes active.

The THRESHOLD CICSINT nnnn specifies the CICS Workload interval. The default value is 100 work units. Work units are defined as the work done between DFHKCP requests.

The THRESHOLD PRTYMIN nnn specifies the minimum transaction priority to be managed by the CICS Workload Management feature. Any transaction with a priority less than the minimum will not be managed. The default value is 2.

The THRESHOLD PRTYMAX nnn specifies the maximum transaction priority to be managed by the CICS Workload Management feature. Any transaction with a priority greater than the maximum will not be managed. The default value is 4.

The THRESHOLD RUNAWAY nnn specifies the Workload Runaway Task multiplier. Any transaction using more than the RUNAWAY * CICSINT work units is considered a Workload Runaway Task. For example, if the CICSINT is 250 work units and the RUNAWAY is 100, any transaction using more than 25000 work units would be a Workload Runaway Task. The default is 100.

The THRESHOLD TRIVIAL nnn specifies the trivial transaction work unit limit. Any transaction using fewer work units than the specification is considered trivial. Trivial transactions are excluded from the non-trivial transaction statistics displayed in the CICS Workload Manager termination statistics. The default is zero.
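To make the five thresholds concrete, here they are gathered into one small Python structure using the default values listed above, together with the runaway and trivial tests as described. This is only an illustration of the rules in this article. The class and function names are my own and are not part of OPTI-WORKLOAD.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    cicsint: int = 100   # work units per CICS Workload interval
    prtymin: int = 2     # lowest transaction priority that is managed
    prtymax: int = 4     # highest transaction priority that is managed
    runaway: int = 100   # Workload Runaway Task multiplier
    trivial: int = 0     # trivial transaction work unit limit


def classify(work_units, t):
    """Label a transaction by the work units it has used."""
    if work_units > t.runaway * t.cicsint:
        return "runaway"
    if work_units < t.trivial:
        return "trivial"
    return "normal"


# The example from the text: CICSINT 250 and RUNAWAY 100 give a 25000 unit limit.
t = Thresholds(cicsint=250)
print(classify(25001, t), classify(25000, t))
```

With the default TRIVIAL of zero, no transaction is ever classed as trivial.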

CICS TS Workload Management

To begin, the CICS TS Performance Guide describes CICS TS transaction priorities like this ...

The overall priority is determined by summing the priorities in all three definitions for any given task, with the maximum priority being 255.

Priority = Terminal priority + Operation priority + Transaction Priority

The value of the system initialization parameter, PRTYAGE also influences
the dispatching order, for example, PRTYAGE=1000 causes the task's
priority to increase by 1 every 1000ms it spends on the ready queue.

This is a nice description and basically incorrect.

With the help of Eugene (Gene) Hudders, a CICS TS internals expert and author of the C/Trek CICS TS performance monitor, I found out basically how the CICS TS dispatcher actually works.

The 1-byte task priority is initially calculated as described above. When PRTYAGE=0 (disabled), the 1-byte task priority is converted to an 8-byte priority field with the 1-byte priority in the low order byte. With PRTYAGE=nnnn, the 1-byte priority is converted in some way and combined with other information (e.g., transaction arrival time) and stored as an 8-byte priority field.

This 8-byte priority field becomes the value used to control dispatching of the transaction for the transaction's remaining lifetime.

The CICS TS dispatcher keeps 3 queues of tasks within a CICS TS image. The 1st queue is the Ready queue (also called the Private queue). This is a queue of ready to run tasks in priority sequence. The 2nd queue is the Dispatchable queue (also called the Public queue). This is a queue of dispatchable tasks ordered by queue arrival time. The 3rd queue is the overall task queue. Basically this queue is all tasks that are in the system.

The CICS TS dispatcher will dispatch tasks from the Ready queue in priority sequence until the queue is empty and "from time to time" (IBM's term) it will add tasks from the Dispatchable queue as it looks for more work. When a task on the waiting/suspend queue is ready to run, it is added to the Dispatchable queue. And the dispatch process continues ...

Now, this is my own interpretation of how the dispatcher works and some might argue that the 3rd queue is not really a queue at all or that the 2nd queue (Public) is not used on z/VSE (since it has only one QR TCB) but I think it is pretty close to the process the dispatcher uses.

One more note about the 8-byte transaction priority field: this value can, at times, be modified by CICS TS to ensure quick or delayed dispatching. For example, when a task is first attached (started), its priority is set to a 'very high' value to ensure the initial dispatch of the task is done quickly. This ensures the attach process, including the calls to any Task Related User Exit (Start of Task TRUE), is completed very quickly. After this the task priority is returned to normal. At other times, the dispatcher may add a delay to the dispatch time. For example, if available storage becomes less than certain values, the dispatcher may add a task to the Dispatchable queue but add some number of milliseconds to its arrival time (remember the Dispatchable queue is ordered by arrival time). This effectively delays dispatching.

Are there problems with this design?

Overall the design is quite good. Certainly far better than the old CICS/VSE dispatcher. But there are a couple of issues ...

One issue with CICS TS PRTYAGE is that it only increases the priority of a task. To be effective, a workload manager must also understand that heavy resource usage tasks (that should be batch applications) do exist and, when running, must have their priority reduced to allow other light resource tasks a chance to run.

Also, you need to use a very low setting for the PRTYAGE parameter because if you use the default setting of 32768 (32.768 seconds) then the task's priority would not be increased for 32.768 seconds. If a task had to wait to be dispatched for 32 seconds, you have a bigger problem than the PRTYAGE parameter can solve. In fact, in z/OS this parameter's default value has been lowered from 32768 to 1000. I would go further and suggest a value between 250 and 500.
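A quick calculation shows how little the default setting does. The Python sketch below follows the documented description quoted earlier (priority goes up by 1 for every PRTYAGE milliseconds on the ready queue). As noted, the real dispatcher works on an 8-byte field, so this is a simplification. The function name and the cap at 255 are my own assumptions.

```python
def aged_priority(base_priority, wait_ms, prtyage_ms):
    """Priority after wait_ms on the ready queue, per the documented rule."""
    if prtyage_ms <= 0:  # PRTYAGE=0 disables priority aging
        return base_priority
    # Assumed cap: the documented maximum task priority is 255.
    return min(255, base_priority + wait_ms // prtyage_ms)


# A task with priority 100 that has waited 5 seconds to be dispatched.
for prtyage in (32768, 1000, 500, 250):
    print(f"PRTYAGE={prtyage:5}: priority {aged_priority(100, 5000, prtyage)}")
```

With PRTYAGE=32768 a five second wait earns no increase at all, while a value of 250 or 500 moves the task up noticeably over the same wait.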

Remember, as the CICS TS dispatcher ages a transaction, the priority of the transaction increases until the next successful dispatch of the task. At that point the dispatcher resets the priority back to normal. The effect of PRTYAGE on the priority is temporary. While this makes sense (at least to me), it does show that PRTYAGE has little effect on dispatching unless the CICS TS system is very busy and transactions are backed up waiting to run.


OPTI-WORKLOAD to the Rescue

The CICS Workload Manager in OPTI-WORKLOAD will both decrease and increase the priority of a CICS transaction based, not on time, but on the amount of resources used (work done) by the transaction. These changes in the transaction's priority are permanent until or unless the CICS Workload Manager decides to change the priority again.

The CICS Workload Manager can manage the workload with or without PRTYAGE active. 

When using the CICS Workload Manager you specify the number of work units that are considered normal. This is the CICSINT value. If a transaction uses less than CICSINT work units, nothing is done to its priority. If the transaction does exceed CICSINT work units and its initial priority falls in the management range (PRTYMIN to PRTYMAX), then the transaction will have its priority reduced by one. Over time a long running or heavy resource usage transaction will see its priority gradually lowered until it reaches the PRTYMIN value. At the same time, transactions that are waiting to run but unable to run because of high resource usage transactions will see their priority gradually increased until it reaches the PRTYMAX value. The net result is that transactions within CICS will have their priority managed up or down based on resource usage, achieving maximum throughput.
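The management rule just described can be written down as one small step function. The Python sketch below is my own illustration of that rule, not the product's code. The function name, the 'delayed' flag and the per-step bookkeeping are assumptions made to keep the example short.

```python
def adjust_priority(priority, work_units, delayed, cicsint, prtymin, prtymax):
    """One management step for a transaction.

    work_units: work units used since the last adjustment
    delayed:    True if the transaction was held up by heavier work
    """
    if not prtymin <= priority <= prtymax:
        return priority                      # outside the managed range
    if work_units > cicsint:
        return max(prtymin, priority - 1)    # heavy user floats down
    if delayed:
        return min(prtymax, priority + 1)    # starved work floats up
    return priority


# A heavy transaction starting at 100 with CICSINT=250, PRTYMIN=50, PRTYMAX=150.
priority = 100
for _ in range(3):
    priority = adjust_priority(priority, 300, False, 250, 50, 150)
print(priority)
```

Because each step moves the priority by only one, a long running transaction sinks gradually rather than being pushed straight to the bottom of the managed range.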

With the CICS Workload Manager active long running (batch like) transactions no longer block or 'lock out' normal or short running transactions.

Defining Transaction Priorities

Priority              Comment
_____________________________________________________________
0                     Lowest
1
...
...
PRTYMIN            \  Minimum managed by the Workload Manager
...                 \
...                  |- Range managed by the Workload Manager
...                 /
PRTYMAX            /  Maximum managed by the Workload Manager
...
...
254
255                   Highest 

There are many ways to define the initial priority of a transaction within CICS. Some basic recommendations are ...

#1, Avoid 0 and 255. These priorities are the lowest and highest, and are often used by CICS housekeeping and utility transactions.

#2, Long running, high resource usage transactions should be defined with a low priority.

#3, Short running, low resource usage transactions should be defined with a high priority.

#4, Mixed usage transactions are difficult to manage without help.

Sometimes obeying these rules is easy. For example, a bank might have a Savings Balance (SBAL) transaction that simply reads a single customer record and displays some information about the customer's account. This is a transaction that should have a high priority. On the other hand, when the branch manager is balancing all the tellers with a TBAL transaction that reads the entire teller file, you have a transaction that needs a low priority.

Sometimes obeying the rules is not so easy. Perhaps you have a 'master' MENU transaction that invokes many functions and may even transfer control to many other transactions. Yet all of these transactions run with the priority of the 'master' MENU transaction. Another example might be a TCP/IP socket transaction that listens for connections. When a connection is made the socket is given to a child transaction for processing. Yet the child transaction might have the same transaction ID as the listener transaction. With these transactions, some are low resource usage transactions and some are high resource usage transactions. Proper management of their priorities is very difficult unless you have a CICS Workload Manager.

When using the OPTI-WORKLOAD CICS Workload Manager the whole process becomes quite simple. For example, define your CICSINT typical transaction work units, PRTYMIN (perhaps 50) and PRTYMAX (perhaps 150).

Rule #2, These transactions get a priority between 1-49
Rule #3, These transactions get a priority between 151-254
Rule #4, These transactions get a priority of 100 and are managed.
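
As a rough illustration of the bands above, the following Python sketch classifies a priority value. The PRTYMIN and PRTYMAX values are the "perhaps 50" and "perhaps 150" examples from the text, and the function name is hypothetical. It is not part of OPTI-WORKLOAD or CICS.

```python
# Assumed example values from the text ("perhaps 50" and "perhaps 150").
PRTYMIN = 50
PRTYMAX = 150


def classify_priority(priority):
    """Return the band a CICS transaction priority (0-255) falls into."""
    if not 0 <= priority <= 255:
        raise ValueError("CICS priorities range from 0 to 255")
    if priority in (0, 255):
        return "avoid (rule #1)"
    if priority < PRTYMIN:
        return "low, not managed (rule #2)"
    if priority > PRTYMAX:
        return "high, not managed (rule #3)"
    # Everything from PRTYMIN through PRTYMAX is in the managed range.
    return "managed (rule #4)"


for value in (20, 100, 200):
    print(value, classify_priority(value))
```

A mixed-usage transaction defined with priority 100 lands in the managed band, while the 1-49 and 151-254 bands stay outside the managed range.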

With proper CICS Workload Management, CICS transaction response times are reduced, stabilized and far more consistent.

Well there you have it. Enjoy.

Jeff Barnard
Barnard Software, Inc.










Thursday, January 21, 2016

Comparing SSL/TLS Facilities for z/VSE


We have recently received a number of questions about the SSL/TLS functionality available for IPv6/VSE. These questions are generally about the differences between IPv6/VSE SSL/TLS functionality and the SSL/TLS functionality available in the SSL for VSE feature of TCP/IP for VSE.

The question of SSL/TLS support is far more complex than just having an SSL/TLS client that is able to connect to z/VSE.

Most of the information presented here is also in the IBM publication z/VSE 6.1 TCP/IP Support SC34-2706-00 which provides information about SSL/TLS features available for both IPv6/VSE and TCP/IP for VSE. However, I will try to lay out the information in a way that will allow you to easily compare the types of available SSL/TLS support.

For reference, when we talk about SSL/TLS sockets we are referring to a secure and trusted connection. The word secure means the connection is encrypted. The word trusted means the connection has been authenticated.

Authentication itself means the certificate presented to the SSL/TLS client by the server has been checked against the Certificate Authority's certificate to verify correctness, expiration, etc. Optionally, the SSL/TLS client can present a certificate to the SSL/TLS server allowing the server to authenticate the client too. This means that the SSL/TLS client and the SSL/TLS server are known to each other and have verified they are authorized to establish a connection.

The article will also refer to "Control Program Assist for Cryptographic Function" (CPACF). This is a feature of System z machines. The feature is disabled by default. After ordering and installing a System z machine, you must contact IBM to have this feature enabled. The feature provides hardware micro/millicode assist instructions to dramatically improve the performance of encryption and message digest/hash functions. CPACF features discussed in this article require using a z10 or newer machine.

Certification


To begin, let's look at the support itself. IPv6/VSE uses the IBM z/VSE port of the Open Source package OpenSSL. The OpenSSL package used by IBM has US government FIPS 140-2 certification and IBM itself updates the OpenSSL port with security patches and makes these updates available to customers in a timely fashion. US government certification of OpenSSL is a time-consuming and expensive process where every part of the OpenSSL code is tested for correctness.

The SSL for VSE feature of TCP/IP for VSE was developed by Connectivity Systems, Inc. (CSI) and is exclusive to TCP/IP for VSE. I could find no information about the certification status of SSL for VSE.

A certified SSL/TLS solution is important because the certification process tests to make sure the SSL/TLS connections are made correctly and that no information is leaked in the process. Certification is far more than just being able to establish a connection or having someone tell you "Hey! I got connected!".

SSL/TLS Connection Types

IPv6/VSE supports SSLv3, TLSv1 and TLSv1.2 using RSA and Diffie-Hellman key exchange.
SSL for VSE supports SSLv3 and TLSv1 using RSA key exchange.

While SSLv3 is supported, it should not be used. SSLv3 is completely broken and is little, if any, better than using plain text socket connections. The IBM OpenSSL port plans to discontinue SSLv3 support in the near future.

While TLSv1 is secure, IPv6/VSE supports TLSv1.2 connections. TLSv1.2 is the current TLS standard and is more secure than TLSv1.

While both SSL/TLS solutions support RSA keys, IPv6/VSE supports RSA key sizes up to 4096 bits in length in both software and hardware.

While the SSL for VSE feature of TCP/IP for VSE Version 2.1 supports these RSA key sizes too, RSA key sizes of 2048 and 4096 bits require Crypto Express2 (or later) adapter hardware. The SSL for VSE feature of TCP/IP for VSE Version 1.4/1.5 does not support RSA key sizes greater than 1024 bits.

Since RSA key sizes less than 2048 bits are now deprecated and insecure, both the SSL for VSE feature of TCP/IP for VSE Version 2.1 and Crypto Express2 (or later) adapter hardware are required for secure sockets using SSL for VSE.

IPv6/VSE also supports using the more secure Diffie-Hellman (DH) key exchange. DH key exchange is a different method of managing session keys and is more secure than RSA key exchange. SSL for VSE does not support this type of key exchange.

Why is DH Important?

The following diagram shows an SSL session key exchange using RSA.

The weak point is that the secret session key is part of the SSL session data. The key is sent from the client to the server, encrypted with the server's public RSA key. If the server's private RSA key is compromised, stolen, or broken, the session key is no longer secure and it is possible to decrypt and read the complete session data.

Another method of exchanging the session key is Diffie-Hellman. Using Diffie-Hellman, the session key is never sent over the network and is therefore never part of the network session data. The following example shows how the session key is negotiated using DH. Therefore, the common term is not “exchanging a session key”, but rather “agreeing on a common session key” through the DH key agreement process.


The important point is that the encryption key is created independently on both sides. Therefore, it is not possible to reveal the session key from a given recorded network session later. For simplification, the example does not show how the two communication partners are authenticated. In practice, DH is mostly used together with certificate authentication by using RSA.
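
As a minimal sketch of the key agreement just described, the following Python snippet uses toy numbers (p = 23, g = 5) that are illustrative assumptions and far too small for real use. The function names are hypothetical and are not part of OpenSSL or IPv6/VSE. Authentication is left out here as well.

```python
import secrets


def dh_public(g, p, private):
    """Public value that is sent over the network."""
    return pow(g, private, p)


def dh_shared(other_public, p, private):
    """Session secret computed locally; it is never transmitted."""
    return pow(other_public, private, p)


def demo(p=23, g=5):
    # Each side picks its own private value in the range 1..p-2.
    client_private = secrets.randbelow(p - 2) + 1
    server_private = secrets.randbelow(p - 2) + 1

    # Only these two public values cross the network.
    client_public = dh_public(g, p, client_private)
    server_public = dh_public(g, p, server_private)

    # Each side combines its own private value with the other's public value.
    client_secret = dh_shared(server_public, p, client_private)
    server_secret = dh_shared(client_public, p, server_private)
    return client_secret, server_secret


client_secret, server_secret = demo()
print("both sides agree:", client_secret == server_secret)
```

Both sides arrive at the same value although only the public values were exchanged, which is why a recorded session does not contain the session key.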

DH key exchange is the most secure method currently available and is fully supported by IPv6/VSE. Again, SSL for VSE does not support DH key exchange.

Encryption Methods

Supported by SSL for VSE ...
01 SSL_RSA_WITH_NULL_MD5
02 SSL_RSA_WITH_NULL_SHA 
08 SSL_RSA_EXPORT_WITH_DES40_CBC_SHA 
09 SSL_RSA_WITH_DES_CBC_SHA 
0A SSL_RSA_WITH_3DES_EDE_CBC_SHA 
2F TLS_RSA_WITH_AES_128_CBC_SHA 
35 TLS_RSA_WITH_AES_256_CBC_SHA 

Of these encryption methods 01, 02, 08 and 09 should not be used because they are insecure.

Supported by IPv6/VSE ...
0A DES-CBC3-SHA 
2F AES128-SHA 
35 AES256-SHA 
3C AES128-SHA256 
3D AES256-SHA256 
16 EDH-RSA-DES-CBC3-SHA
33 DHE-RSA-AES128-SHA
39 DHE-RSA-AES256-SHA 
67 DHE-RSA-AES128-SHA256
6B DHE-RSA-AES256-SHA256 
C012 ECDHE-RSA-DES-CBC3-SHA
C013 ECDHE-RSA-AES128-SHA
C014 ECDHE-RSA-AES256-SHA
C027 ECDHE-RSA-AES128-SHA256

Note: The z/VSE 6.1 TCP/IP Support manual states that IPv6/VSE supports methods 09 and C011, but these algorithms are insecure and no longer supported.

IPv6/VSE supports far more encryption methods using both software and hardware.

Message Digest/Hash Algorithms

Each message (packet) transferred using SSL/TLS includes a message digest or hash value. This value is used to verify the data has not been altered. A Message Digest is also used to secure certificates and ensure the certificate has not been altered. Therefore the algorithm used to calculate this value is very important.

The message digest MD5 is completely broken/insecure and should no longer be used.
The SHA1 family of message digests can be used for secure sockets but should not be used to authenticate certificates. The difference is how long the hash value is used. For secure sockets the hash value is used only for the duration of an SSL/TLS transfer. For certificates the value is used for a long time, perhaps years. This gives hackers far too much time to hack/break the digest key.
The SHA2 family of message digests is current and secure.
The SHA3 family of message digests are also secure and the industry is moving to SHA3.

IPv6/VSE supports using SHA1 and SHA2. It is planned to add SHA3 in the near future. These values are supported using either software or hardware (CPACF).
SSL for VSE supports SHA1. SHA2 support was added with the introduction of TCP/IP for VSE 2.1 under z/VSE 6.1. However, the CPACF hardware on a z10 (or newer machine) must be available and enabled for SHA2 to be used.

The industry is now using SHA2 and moving to SHA3 to secure certificates and communications.
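
To make the idea of "verifying the data has not been altered" concrete, here is a small sketch using Python's standard hashlib module. It is a simplification: real SSL/TLS records use a keyed digest, and the sample data and function names below are assumptions for illustration only.

```python
import hashlib
import hmac


def fingerprint(data, algorithm="sha256"):
    """Hex digest of the data using the named algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


def is_unaltered(data, expected, algorithm="sha256"):
    """True when the data still matches the digest computed earlier."""
    return hmac.compare_digest(fingerprint(data, algorithm), expected)


def digest_bits(algorithm):
    """Length of the digest in bits."""
    return hashlib.new(algorithm).digest_size * 8


original = b"PAYROLL RECORD 0001"  # made-up sample data
sent_digest = fingerprint(original)

print("unchanged:", is_unaltered(original, sent_digest))
print("tampered: ", is_unaltered(b"PAYROLL RECORD 0002", sent_digest))
for name in ("sha1", "sha256", "sha3_256"):
    print(name, digest_bits(name), "bits")
```

Changing a single byte of the data produces a different digest, so the receiver can detect the alteration.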

Sample Connection

Google Chrome was used to create this blog entry. I used the information available from Chrome to display the type of SSL/TLS connection.


So, Chrome created a TLSv1.2 secure socket using ECDHE_ECDSA. This is an Elliptic Curve cryptographic session created using DH key exchange. After the key exchange process completes, a 128-bit AES key is used to encrypt each message. IPv6/VSE fully supports this type of TLS session while SSL for VSE does not.

Viewing the certificate information shows that the certificate's fingerprint or message digest hash uses both SHA-1 and SHA-2 (SHA-256). This allows both older (less secure) and newer (more secure) clients and servers to authenticate the certificate.



Summary

Maybe the real question is simple.

Can I get a secure and trusted socket using IPv6/VSE? Yes, easily. Many options exist to provide secure communications using both software and hardware.

Can I get a secure and trusted socket using the SSL for VSE feature of TCP/IP for VSE?

Maybe. Only if you ...
  1. Are using TCP/IP for VSE 2.1 and
  2. Running on a z10 (or newer) machine and
  3. Have CPACF  available and enabled and
  4. Have a Crypto Express2 (or later) adapter

Basically, if you need truly secure and trusted socket support I strongly recommend you move to IPv6/VSE.



Credits

IBM publication z/VSE 6.1 TCP/IP Support SC34-2706-00

The diagrams in the "Why is DH Important?" section and the text around them were lifted directly from the IBM manual with minor edits to the text by me. Actual credit here is likely due to Joerg Schmidbauer at the IBM z/VSE lab in Germany. Joerg is the person who did the actual port of OpenSSL to z/VSE. Very impressive stuff if you ask me.





Wednesday, August 26, 2015

IPv6/VSE SSH Secure Copy for z/VSE


BSTTSCPY SSH Secure Copy Facility


Over the years, Barnard Software, Inc., has received a number of requests to provide SSH or SSH-like functionality. However, VSE/ESA and z/VSE do not provide the basic foundation for this type of function.

At the same time we have wondered “What exactly would you do with SSH on z/VSE?” It is a good question since z/VSE does not have a 'shell' or interactive command environment. When we ask this question, more often than not we hear “Well, we have to transfer data to someone that requires we use SSH.”

For this we can provide a solution.

The IPv6/VSE BSTTSCPY SSH Secure Copy Facility uses a Linux Pass-through image to facilitate an SSH connection to remote hosts providing for secure file transfer using SSH to and from z/VSE.

SSH


SSH is the worldwide standard for secure access to systems.
Secure Shell, or SSH, is a cryptographic (encrypted) network protocol for initiating text-based shell sessions on remote machines in a secure way.

This allows a user to run commands on a machine's command prompt without them being physically present near the machine. It also allows a user to establish a secure channel over an insecure network in a client-server architecture, connecting an SSH client application with an SSH server. Common applications include remote command-line login and remote command execution, but any network service can be secured with SSH. The protocol specification distinguishes between two major versions, referred to as SSH-1 and SSH-2.

The most visible application of the protocol is for access to shell accounts on Unix-like operating systems, but it sees use on Windows as well. In 2015 Microsoft announced that they would include native support for SSH in a future release.

SSH was designed as a replacement for Telnet and other insecure remote shell protocols such as the Berkeley rsh and rexec protocols, which send information, notably passwords, in plaintext, rendering them susceptible to interception and disclosure using packet analysis. The encryption used by SSH is intended to provide confidentiality and integrity of data over an unsecured network, such as the Internet.

Secure Copy


Secure copy or SCP is a means of securely transferring computer files between a local host and a remote host. It is based on the Secure Shell (SSH) protocol.

SFTP vs. FTPS


FTPS (also known as FTP-ES, FTP-SSL and FTP Secure) is an extension to the commonly used File Transfer Protocol (FTP) that adds support for the Transport Layer Security (TLS) and the Secure Sockets Layer (SSL) cryptographic protocols.

FTPS should not be confused with the SSH File Transfer Protocol (SFTP), an incompatible secure file transfer subsystem for the Secure Shell (SSH) protocol. It is also different from FTP over SSH, the practice of tunneling FTP through an SSH connection.

In the past, CSI and IBM have written manuals describing a “Secure FTP Facility” for z/VSE. This facility is FTPS (FTP using SSL). It is not SFTP (the SSH File Transfer Protocol). IPv6/VSE provides FTPS (FTP over SSL) also.

The Secure Copy facility provided by IPv6/VSE is not SFTP or FTPS.

Secure Copy Concepts


The following diagram shows how the BSTTSCPY Secure Copy Facility transfers data to and from z/VSE using a Linux Pass-through image.

BSTTSCPY using a Linux Pass-through Image


This is the basic overview of the IPv6/VSE Secure Copy Facility and the Linux Pass-through Image.















The BSTTSCPY application running on z/VSE connects to the bsttscpyd (BSTTSCPY Daemon) running on the Linux Pass-through image. From there, the bsttscpyd uses SSH to connect to the destination remote host. Data transferred from BSTTSCPY running on z/VSE to the bsttscpyd is clear text. The data transferred by SSH is, of course, encrypted.

BSTTSCPY Using Linux on System z















This is the recommended configuration.

In this configuration we suggest using a HiperSockets connection between z/VSE and the Linux Pass-through image. This is very fast. This configuration also guarantees no clear text data ever leaves the System z machine.

Linux Fast Path (LFP)


IBM's Linux Fast Path (LFP) can also be used in this configuration. Using LFP, BSTTSCPY can communicate with bsttscpyd running on the Linux Pass-through image using IUCV.
LFP also provides access to z/VM IP Assist which can be used to access the network on supported System z hardware, providing access to bsttscpyd running on an x86_64 Linux Pass-through image.

















BSTTSCPY Using x86_64 Intel


















If you do not have a Linux on System z machine available to run the bsttscpyd, you still can use this feature. You can use one of these options.
  1. An x86_64 Intel Linux machine
  2. A 64-bit Windows 7 (or newer) machine
    Running either ...
    1. 64-bit Cygwin
    2. Virtual Box
      Running an x86_64 Intel Linux image


Some customers have suggested that this is not a 'secure' configuration and I have been mystified by these comments. 

A good network administrator can easily make this configuration completely secure.

First, the subnet used by the BSTTSCPY facility in z/VSE would be different than the usual production subnet. E.g., If the production subnet is 192.168.0.0/16 then the subnet used by the BSTTSCPY facility might be 172.16.1.0/24.

Second, the NICs used by the System z machine and the PC would be connected to the same layer 2 switch. This means traffic from these systems would never go outside of the switch being used.

Next, traffic from these systems would use a special/unique VLAN.

And, this is the key. By using a special VLAN for this traffic, it is logically isolated from all other traffic on the LAN. This provides excellent security for the data transfers.
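
The subnet separation in the first step can be sanity-checked with Python's standard ipaddress module. This is only a sketch: the two subnets are the examples from the text, while the host addresses are hypothetical.

```python
import ipaddress

# Example subnets taken from the text above.
production = ipaddress.ip_network("192.168.0.0/16")
transfer = ipaddress.ip_network("172.16.1.0/24")

# Hypothetical addresses for the z/VSE stack and the Linux Pass-through image.
zvse_host = ipaddress.ip_address("172.16.1.10")
linux_host = ipaddress.ip_address("172.16.1.20")


def subnets_are_separate(first, second):
    """True when the two networks share no addresses."""
    return not first.overlaps(second)


print("subnets separate:", subnets_are_separate(production, transfer))
print("z/VSE on transfer subnet:", zvse_host in transfer)
print("Linux on transfer subnet:", linux_host in transfer)
print("z/VSE on production subnet:", zvse_host in production)
```

Keeping both endpoints on the dedicated subnet, and off the production subnet, is what the VLAN then enforces at layer 2.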


Why Use a Linux Pass-through Image?


The SSH connections from the Linux Pass-through image use public key authentication. Public key authentication allows you to log in to a remote host via the SSH protocol without a password and is more secure than password-based authentication.

Password authentication is not supported and can not be used with the BSTTSCPY Secure Copy facility.

There are several benefits to using a Linux Pass-through image.
  1. SSH is basic to all Linux OS installations.
  2. SSH and Linux are Open Source
  3. Support and updates are provided by the Linux distribution 
    E.g., SUSE, Red Hat.
  4. FIPS 140-2 Certification of OpenSSH and OpenSSL
  5. All cryptographic overhead is offloaded to the Linux Pass-through image.
    CPU overhead of cryptographic functions can be very high.
  6. No data is stored on the Linux Pass-through image.
The last item is critical. The Linux Pass-through image is used only for SSH (and its functionality). No data is stored on the Linux Pass-through image at any time.

The Linux Pass-through image can be a Linux on System z (zLinux) image, an x86-64 Intel Linux image or a Windows system hosting a Linux Pass-through image. When using a Windows host both Cygwin and VirtualBox Linux images are supported.


Linux Pass-through Image


Once you have access to the Linux Pass-through image, you will want to create the user that will run the bsttscpyd daemon. This can be root but it is not required. Since no data is stored on the Linux Pass-through image the user used can be a normal user.

Authentication


The SSH connections from the Linux Pass-through image to destination remote hosts use public key authentication. Public key authentication allows you to log in to a remote host via the SSH protocol without a password and is more secure than password-based authentication.

Password authentication is not supported and can not be used with the BSTTSCPY Secure Copy facility.

SSH keys provide a more secure way of logging into a virtual private server with SSH than using a password alone. While a password can eventually be cracked with a brute force attack, SSH keys are nearly impossible to decipher by brute force alone. Generating a key pair provides you with two long strings of characters: a public and a private key. You can place the public key on any server, and then unlock it by connecting to it with a client that already has the private key. When the two match up, the system unlocks without the need for a password.


BSTTSCPY


The basic structure of the z/VSE BSTTSCPY application is similar to the IPv6/VSE BSTTFTPC application. Remember, SSH transfers all data in binary form. So, if translation of the data is necessary you must tell BSTTSCPY to handle this function.

BSTTSCPY requires IPv6/VSE Build 256pre17 (or later).

Feature code 'S' is required for use of the IPv6/VSE BSTTSCPY application. If your IPv6/VSE license key does not have feature code 'S' in it, you will need to contact Barnard Software, Inc. for an updated license key.

The IPv6/VSE BSTTSCPY application (like BSTTFTPC, BSTTMTPC, etc.) requires a minimum 8M partition for execution.

BSTTSCPY can use the IPv6/VSE BSTTINET/BSTT6NET TCP/IP stacks as well as the TCP/IP for VSE TCP/IP stack.

The Basic Process


Identify the stack and connect to the bsttscpyd you want to access.

Define the INPUT or OUTPUT data.

Specify options. E.g., TYPE A (Convert to ASCII) etc. Most of the options used for a BSTTFTPC FTP client data transfer can be used with BSTTSCPY also.

Define the destination remote host, userid and port.

STOR or RETR the data.

And, finally QUIT. 

Basic JCL


// EXEC BSTTSCPY,SIZE=BSTTSCPY          

ID nn                                   

OPEN ...                       

*                                       
INPUT ...
TYPE A
*                                       
PORT 22                                 
HOST user@host                  
STOR file.name                        
*                                       
QUIT                                    
/*                                      


Just like BSTTFTPC, BSTTSCPY commands are used in pairs. The INPUT command is paired with the STOR command and the OUTPUT command is paired with the RETR command.
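
If you generate BSTTSCPY command decks from a script, a simple pre-submission check of this pairing rule might look like the Python sketch below. It is a hypothetical helper, not part of IPv6/VSE. It assumes one command per line, treats lines starting with * or / as comments or JCL, and assumes each INPUT or OUTPUT must be followed by its partner before the next pair begins.

```python
# Pairing rule from the text: INPUT goes with STOR, OUTPUT goes with RETR.
PAIRS = {"INPUT": "STOR", "OUTPUT": "RETR"}


def check_pairs(deck):
    """Return a list of pairing problems found in a command deck."""
    problems = []
    pending = None  # the INPUT or OUTPUT still waiting for its partner
    for raw in deck.splitlines():
        line = raw.strip()
        if not line or line.startswith("*") or line.startswith("/"):
            continue  # skip blanks, comments and JCL statements
        verb = line.split()[0].upper()
        if verb in PAIRS:
            if pending:
                problems.append(f"{pending} has no matching {PAIRS[pending]}")
            pending = verb
        elif verb in PAIRS.values():
            if pending is None:
                problems.append(f"{verb} without a preceding INPUT or OUTPUT")
            elif PAIRS[pending] != verb:
                problems.append(
                    f"{pending} is paired with {verb}, expected {PAIRS[pending]}"
                )
            pending = None
    if pending:
        problems.append(f"{pending} has no matching {PAIRS[pending]}")
    return problems


sample = """ID nn
OPEN ...
INPUT ...
TYPE A
PORT 22
HOST user@host
STOR file.name
QUIT
"""
print(check_pairs(sample))
```

An empty list means every INPUT or OUTPUT found its partner. The sample deck mirrors the Basic JCL above, with the elided operands left as they are.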

IPv6/VSE for VSE/ESA and z/VSE


More information about the IPv6/VSE SSH Secure Copy facility is available in the IPv6/VSE SSH Secure Copy Supplement Guide. This manual is part of the IPv6/VSE download available from the BSI website.