SAP Datasphere federated queries

As a data analyst, you can query relational data in SAP Datasphere from BigQuery using federated queries.

BigQuery SAP Datasphere federation lets BigQuery query data residing in SAP Datasphere in real time, without copying or moving data.

To run a SQL query in SAP Datasphere, specify that SQL query within BigQuery in a EXTERNAL_QUERY function. The results are then transferred from SAP Datasphere to BigQuery.

Limitations

  • You can only query relational views that are exposed for consumption. Other objects in SAP Datasphere are not accessible to the query federated through EXTERNAL_QUERY.
  • The federated query latency might be noticeably higher than the same query if it was executed directly in SAP Datasphere.
  • The first query that uses SAP Datasphere connection in a given project might take more than a minute to run.
  • No additional SQL pushdowns are supported for SAP Datasphere.
  • The SAP Datasphere SQL query must specify aliases for columns that contain function results.
  • When the usage of Compute Engine API in the query project is restricted by VPC Service Controls, the federated query will fail.

Before you begin

Ensure that your BigQuery administrator has created a SAP Datasphere connection and shared it with you.

Required roles

To get the permissions that you need to query SAP Datasphere, ask your administrator to grant you the BigQuery Connection User (roles/bigquery.connectionUser) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Query data

To send a federated query to SAP Datasphere from a GoogleSQL query, use the EXTERNAL_QUERY function.

The following example is a federated query that joins a table in SAP Datasphere named ORDERS and a table in BigQuery named mydataset.customers.

SELECT c.customer_id, c.name, rq.first_order_date
FROM mydataset.customers AS c
LEFT OUTER JOIN EXTERNAL_QUERY(
  'connection_id',
  '''SELECT CUSTOMER_ID, MIN(ORDER_DATE) AS first_order_date
     FROM ORDERS
     GROUP BY CUSTOMER_ID''') AS rq
  ON rq.customer_id = c.customer_id
GROUP BY c.customer_id, c.name, rq.first_order_date;

View a SAP Datasphere table schema

The following examples use the EXTERNAL_QUERY function to retrieve database metadata from the SYS schema in SAP Datasphere.

-- List all views in a schema.
SELECT * FROM EXTERNAL_QUERY(
  'connection_id',
  '''SELECT VIEW_NAME FROM SYS.VIEWS
     WHERE SCHEMA_NAME = 'MY_SCHEMA'''');
-- List all columns in a view.
SELECT * FROM EXTERNAL_QUERY(
  'connection_id',
  '''SELECT COLUMN_NAME, DATA_TYPE_NAME
     FROM SYS.VIEW_COLUMNS
     WHERE SCHEMA_NAME = 'MY_SCHEMA' AND
           VIEW_NAME = 'my_view'
     ORDER BY POSITION''');

Pricing

The cost of running a federated query is based on three factors:

  • The compute cost of executing the query in SAP Datasphere.
  • The bandwidth cost of transferring the query results from SAP Datasphere to BigQuery.
  • The compute cost of executing the query in BigQuery.

Any SAP Datasphere related costs depend on the type of SAP service you use. To limit the bandwidth cost, we recommend that you write the query in the EXTERNAL_QUERY so that it excludes all columns and rows that are not needed to compute the final result.

There is no additional cost for running federated queries in BigQuery. For more information about BigQuery pricing, see Pricing.

What's next