Source

```yaml
id: pinot-to-bigquery
namespace: company.team
tasks:
  - id: query_pinot
    type: io.kestra.plugin.jdbc.pinot.Query
    url: jdbc:pinot://192.168.0.106:9000
    sql: |
      SELECT lat, lon, name, address, CAST(location_st_point AS STRING) AS location_st_point
      FROM starbucksStores
    fetchType: STORE
  - id: write_to_csv
    type: io.kestra.plugin.serdes.csv.IonToCsv
    from: "{{ outputs.query_pinot.uri }}"
  - id: load_bigquery
    type: io.kestra.plugin.gcp.bigquery.Load
    from: "{{ outputs.write_to_csv.uri }}"
    serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
    projectId: my_project
    destinationTable: my_project.my_dataset.my_table
    format: CSV
    csvOptions:
      fieldDelimiter: ","
      skipLeadingRows: 1
    schema:
      fields:
        - name: lat
          type: FLOAT64
        - name: lon
          type: FLOAT64
        - name: name
          type: STRING
        - name: address
          type: STRING
        - name: location_st_point
          type: STRING
```
About this blueprint
Tags: SQL, GCP
This flow extracts data from an Apache Pinot table, writes it to a CSV file, and then loads the CSV data into BigQuery.
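Once the flow is saved, you can run it from the Kestra UI or trigger it programmatically. As a minimal sketch, assuming a local Kestra instance on the default port 8080, an execution can be created via the executions API:

```bash
# Trigger an execution of this flow by namespace and flow ID
# (assumes Kestra is reachable at localhost:8080).
curl -X POST http://localhost:8080/api/v1/executions/company.team/pinot-to-bigquery
```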
Note that we cast the bytes column location_st_point to STRING while extracting the data from Pinot, and alias it so the CSV header matches the BigQuery schema.
The GCP credentials are read from the GCP_SERVICE_ACCOUNT_JSON secret via the secret() function. The BigQuery table will get created at runtime using the schema provided in the flow code.
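As a minimal sketch, assuming you run open-source Kestra, where secrets are supplied as base64-encoded environment variables prefixed with SECRET_, you could provide the secret like this (service-account.json is an illustrative path to your GCP key file):

```bash
# Assumption: open-source Kestra reads the secret GCP_SERVICE_ACCOUNT_JSON
# from an environment variable named SECRET_GCP_SERVICE_ACCOUNT_JSON
# whose value is base64-encoded.
export SECRET_GCP_SERVICE_ACCOUNT_JSON=$(base64 < service-account.json | tr -d '\n')
```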
You can set up Apache Pinot locally with Docker using:

```bash
docker pull apachepinot/pinot:latest
docker run -p 9000:9000 apachepinot/pinot:latest QuickStart -type batch
```
The QuickStart mode automatically loads several example datasets; one of them, starbucksStores, is used in this flow.
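To verify that the QuickStart has finished loading, you can list the cluster's tables through the Pinot controller's REST API (the port matches the Docker command above); the response should include starbucksStores:

```bash
# List the tables registered in the Pinot cluster.
curl http://localhost:9000/tables
```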