Mainframe Connector API reference

The following table lists the BigQuery, Cloud Storage, and other Google Cloud commands that you can use with Mainframe Connector.

Product Command Description Supports remote transcoding
BigQuery commands bq export Use this command to create a binary file. The command accepts a COPYBOOK DD as input.

Note: The bq export command fails requests to export large Bigtable tables. To avoid errors, add the -allowLargeResults flag to the bq export command when you want to export large tables.
Yes
bq load Use this command to load data into a table. For more information, see bq load. No
bq mk Use this command to create BigQuery resources, such as built-in tables or external tables, that need partitioning and clustering to be set up. For more information, see bq mk. No
bq query Use this command to create a query job that runs the specified SQL query.

Use the --follow=true flag to generate a report that displays the results of a select query. To write this report to a file on the mainframe, define a DD statement named AUDITL that points to the file that should contain the audit logs report. Don't use the --follow flag if you want normal logging behavior.

Some queries return a large number of rows, sometimes millions. To keep the output human readable, the number of lines displayed is capped. To control the number of rows displayed, use the --report_row_limit flag. For example, use --report_row_limit 10 to limit the results to 10 lines. By default, the number of lines displayed is limited to 30.

For more information, see bq query.
Yes
bq rm Use this command to permanently delete a BigQuery resource. As this command permanently deletes a resource, we recommend that you use it with caution. For more information, see bq rm. No
Cloud Storage commands scp Use this command to copy text or binary data to Cloud Storage. You can use the simple binary copy mode to copy a dataset from IBM z/OS to Cloud Storage unmodified as part of a data pipeline. Optionally, you can convert the character encoding from extended binary coded decimal interchange code (EBCDIC) to ASCII UTF-8, and add line breaks.

You can also use this command to copy application source code defined in job control language (JCL).
No
gsutil utility gsutil cp Use this command to transcode a dataset and write it to Cloud Storage in the Optimized Row Columnar (ORC) file format. The command reads the data from the INFILE dataset, and the record layout from the COPYBOOK DD. The command then opens a configurable number of parallel connections to the Cloud Storage API and transcodes the COBOL dataset to the columnar and GZIP compressed ORC file format. You can expect a compression ratio of about 35%.

Optionally, you can use this command to interact with the Mainframe Connector gRPC service running on a VM on the mainframe. To do so, set the SRVHOST and SRVPORT environment variables, or provide the hostname and port number using command line options. When the gRPC service is used, the input dataset is first copied to Cloud Storage by the Mainframe Connector, and then a remote procedure call (RPC) is made to instruct the gRPC service to transcode the file.

The gsutil cp command also supports some performance tuning capabilities. For more information, see Performance tuning configuration for the gsutil cp command.
Yes
gsutil rm Use this command to delete buckets or objects within a bucket. For more information, see rm - Remove objects. No
gszutil utility gszutil The gszutil utility runs using the IBM JZOS Java SDK and provides a shell emulator that accepts gsutil and BigQuery command line invocations using JCL.

The gszutil utility extends the functionality of the gsutil utility by accepting a schema in the form of a COPYBOOK DD and using it to transcode COBOL datasets directly to ORC before uploading them to Cloud Storage. The gszutil utility also lets you run BigQuery query and load commands using JCL.

The gszutil utility works with the gRPC server, which helps you reduce the million instructions per second (MIPS) consumption. We recommend using the gszutil utility in your production environment to convert binary files in Cloud Storage to the ORC format.
No
Other Commands gcloud pubsub topics send Use this command to send a message to a Pub/Sub topic. You can provide the message using the command line, or using a dataset. No
gcloud dataflow flex-template run Use this command to trigger the execution of a Dataflow flex template. The command runs a job from the specified flex template path. For more information, see gcloud dataflow flex-template run. No
curl Use this command to make an HTTP request to a web service or a REST API. No
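
As a small illustration of the gRPC option described in the gsutil cp row above, the following shell fragment sets the two environment variables named there. The host name and port number are placeholders chosen for this sketch, not defaults of Mainframe Connector, and how the variables reach your job depends on how the command is launched in your environment.

```shell
# Placeholder values for illustration only; substitute the address of
# your own Mainframe Connector gRPC service.
SRVHOST="transcoder.example.internal"   # hypothetical host name
SRVPORT="51770"                         # hypothetical port number
export SRVHOST SRVPORT
```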

Performance tuning configuration for the gsutil cp command

Mainframe Connector supports the following performance tuning configuration for the gsutil cp command.

  • Use the --parallelism flag to set the number of threads. The default value is 1 (single threaded).
  • Use the --maxChunkSize argument to set the maximum size of each chunk. Each chunk is written to its own Optimized Row Columnar (ORC) file. Increase this value to reduce the number of chunks created, at the cost of larger memory requirements during the transcoding process. The default value is 128 MiB. For the accepted value formats, see Parse the maxChunkSize argument.
  • Use the --preload_chunk_count argument to set the number of chunks to preload into memory while all workers are busy. This argument can improve performance at the cost of memory. The default value is 2.
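
To make the trade-off in the --maxChunkSize bullet concrete, the sketch below estimates how many chunks, and therefore ORC files, a dataset produces. It assumes that every chunk except possibly the last is filled to the maximum size, which the text above does not state explicitly. The estimate_chunks helper and the 10 GiB dataset size are hypothetical.

```shell
# Hypothetical helper: ceiling division of dataset size by chunk size.
# Assumes every chunk except possibly the last is filled to the maximum.
estimate_chunks() {
  dataset_bytes=$1
  max_chunk_bytes=$2
  echo $(( ($dataset_bytes + $max_chunk_bytes - 1) / $max_chunk_bytes ))
}

MIB=1048576
DATASET_BYTES=$(( 10240 * $MIB ))   # an assumed 10 GiB input dataset

estimate_chunks "$DATASET_BYTES" $(( 128 * $MIB ))   # prints 80
estimate_chunks "$DATASET_BYTES" $(( 256 * $MIB ))   # prints 40
```

Under these assumptions, doubling the chunk size halves the number of Cloud Storage objects that must be created and finalized, while each worker holds a larger chunk in memory.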

Execution example

gsutil cp \
  --replace \
  --parser_type=copybook \
  --parallelism=8 \
  --maxChunkSize=256MiB \
  gs://$BUCKET/test.orc

This example assumes a large file, so it uses 8 threads, the point at which line rate is reached. If you have enough memory, we recommend that you increase the chunk size to 256 MiB or even 512 MiB, because larger chunks reduce the overhead of creating and finalizing Cloud Storage objects. For small files, using fewer threads and smaller chunks might produce better results.

Parse the maxChunkSize argument

The maxChunkSize flag accepts values in the form of an amount and a unit of measurement, for example 5 MiB. You can use whitespace between the amount and the unit.

You can provide the value in the following formats:

  • Java format: b/k/m/g/t, for byte, kibibyte, mebibyte, gibibyte, and tebibyte respectively
  • International format: KiB/MiB/GiB/TiB, for kibibyte, mebibyte, gibibyte, and tebibyte respectively
  • Metric format: b/kb/mb/gb/tb, for byte, kilobyte, megabyte, gigabyte, and terabyte respectively

Data size parsing is case insensitive. You can't specify fractional amounts. For example, use 716 KiB instead of 0.7 MiB.
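
The rules above can be pictured with a small shell function. This is only an illustrative sketch, not the parser that Mainframe Connector uses: the parse_size name is hypothetical, and it assumes that metric units are powers of 1000 while the Java and international units are powers of 1024.

```shell
# Illustrative parser for the size formats listed above. This is NOT the
# Mainframe Connector implementation. Assumption: metric units are powers
# of 1000; Java and international units are powers of 1024.
parse_size() {
  # Parsing is case insensitive and whitespace is optional.
  input=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '[:space:]')
  # Split into a whole-number amount (leading zeros dropped) and a unit.
  amount=$(printf '%s\n' "$input" | sed -n 's/^0*\([0-9][0-9]*\)[a-z][a-z]*$/\1/p')
  unit=$(printf '%s\n' "$input" | sed -n 's/^[0-9][0-9]*\([a-z][a-z]*\)$/\1/p')
  if [ -z "$amount" ] || [ -z "$unit" ]; then
    echo "invalid size: $1" >&2   # for example, a fractional amount
    return 1
  fi
  case $unit in
    b)     factor=1 ;;
    k|kib) factor=1024 ;;
    m|mib) factor=1048576 ;;
    g|gib) factor=1073741824 ;;
    t|tib) factor=1099511627776 ;;
    kb)    factor=1000 ;;
    mb)    factor=1000000 ;;
    gb)    factor=1000000000 ;;
    tb)    factor=1000000000000 ;;
    *)     echo "unknown unit: $unit" >&2; return 1 ;;
  esac
  echo $(( $amount * $factor ))
}

parse_size "5 MiB"    # prints 5242880
parse_size "716 kib"  # prints 733184
parse_size "256m"     # prints 268435456
```

In this sketch, a value such as 0.7 MiB fails the whole-number check and returns a non-zero status, which mirrors the rule that fractional amounts are not accepted.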