Observe

This section describes the observability features of PQS, which are designed to help you monitor the health and performance of the application.

Approach to observability

PQS opted to incorporate OpenTelemetry APIs to provide its observability features. All three sources of signals (traces, metrics, and logs) can be exported to various backends by providing appropriate configuration defined by OpenTelemetry protocols and guidelines. This makes PQS flexible in terms of observability backends, allowing users to choose what fits their needs and established infrastructure without being overly prescriptive.

To have PQS emit observability data, an OpenTelemetry Java Agent must be attached to the JVM running PQS. OpenTelemetry’s documentation page on Java Agent Configuration [1] has all the necessary information to get started.

As a frequently requested shortcut (metrics only, served over the Prometheus exposition endpoint embedded in PQS), the following snippet can help you get started. For more details, refer to the official documentation:

$ export OTEL_SERVICE_NAME=pqs
$ export OTEL_TRACES_EXPORTER=none
$ export OTEL_LOGS_EXPORTER=none
$ export OTEL_METRICS_EXPORTER=prometheus
$ export OTEL_EXPORTER_PROMETHEUS_PORT=9090
$ export JDK_JAVA_OPTIONS="-javaagent:path/to/opentelemetry-javaagent.jar"
$ ./scribe.jar pipeline ledger postgres-document ...

PQS Docker images already come pre-configured this way, but users are free to override these values as they see fit for their environments.
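
For example, when running the PQS Docker image, these defaults can be overridden by passing environment variables to the container. A minimal sketch (the image reference <pqs-image> is a placeholder for the actual coordinates; see Download):

$ docker run \
    -e OTEL_METRICS_EXPORTER=prometheus \
    -e OTEL_EXPORTER_PROMETHEUS_PORT=9090 \
    -e OTEL_TRACES_EXPORTER=none \
    -e OTEL_LOGS_EXPORTER=none \
    <pqs-image> pipeline ledger postgres-document ...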

Logging

Log level

Set the log level with --logger-level. Possible values are All, Fatal, Error, Warning, Info (default), Debug, Trace:

--logger-level=Debug

Per-logger log level

Use --logger-mappings to adjust the log level for individual loggers. For example, to exclude noisy Netty network traffic from an otherwise detailed log:

--logger-mappings-io.netty=Warning \
--logger-mappings-io.grpc.netty=Trace
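
Put together, a pipeline invocation that raises the overall verbosity while adjusting the Netty loggers might look as follows (a sketch; ledger and datastore arguments are elided):

$ ./scribe.jar pipeline ledger postgres-document \
    --logger-level=Debug \
    --logger-mappings-io.netty=Warning \
    --logger-mappings-io.grpc.netty=Trace \
    ...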

Log pattern

With --logger-pattern, use one of the predefined patterns, such as Plain (default), Standard (the standard format used in DA applications), or Structured, or define your own. See Log Format Configuration [2] for more details.

To use a custom format, provide its string representation, for example:

--logger-pattern="%highlight{%fixed{1}{%level}} [%fiberId] %name:%line %highlight{%message} %highlight{%cause} %kvs"

Log format for console output

Use --logger-format to set the log format for console output. Possible values are Plain (default) or Json. These formats apply to the pipeline command.

Log format for file output

Use --logger-format to set the log format for file output. Possible values are Plain (default), Json, PlainAsync, and JsonAsync. These formats apply to the interactive commands, such as prune. With PlainAsync and JsonAsync, log entries are written to the destination file asynchronously.

Destination file for file output

Use --logger-destination to set the path to the destination file (default: output.log) for interactive commands, such as prune.
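
For example, an interactive prune run that writes JSON log entries asynchronously to a custom destination might look as follows (a sketch; the file path is illustrative and the prune arguments are elided):

$ ./scribe.jar prune \
    --logger-format=JsonAsync \
    --logger-destination=/var/log/pqs/prune.log \
    ...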

Log format and log pattern combinations

  • Plain / Plain

    00:00:23.737 I [zio-fiber-0] com.digitalasset.scribe.pipeline.pipeline.Impl:34 Starting pipeline on behalf of 'Alice_1::12209982174bbaf1e6283234ab828bcab9b73fbe313315b181134bcae9566d3bbf1b'  application=scribe
    00:00:24.658 I [zio-fiber-0] com.digitalasset.scribe.pipeline.pipeline.Impl:61 Last checkpoint is absent. Seeding from ACS before processing transactions with starting offset '00000000000000000b'  application=scribe
    00:00:25.043 I [zio-fiber-895] com.digitalasset.zio.daml.ledgerapi.package:201 Contract filter inclusive of 1 templates and 0 interfaces  application=scribe
    00:00:25.724 I [zio-fiber-0] com.digitalasset.scribe.pipeline.pipeline.Impl:85 Continuing from offset '00000000000000000b' and index '0' until offset '00000000000000000b'  application=scribe
    
  • Plain / Standard

    component=scribe instance_uuid=5f707d27-8188-4a44-904e-2f98ee9f4177 timestamp=2024-01-16T23:42:38.902+0000 level=INFO correlation_id=tbd description=Starting pipeline on behalf of 'Alice_1::1220c6d22d46d59c8454bd245e5a3bc238e5024d37bfd843dbad6885674f3a9673c5'  scribe=application=scribe
    component=scribe instance_uuid=5f707d27-8188-4a44-904e-2f98ee9f4177 timestamp=2024-01-16T23:42:39.734+0000 level=INFO correlation_id=tbd description=Last checkpoint is absent. Seeding from ACS before processing transactions with starting offset '00000000000000000b'  scribe=application=scribe
    component=scribe instance_uuid=5f707d27-8188-4a44-904e-2f98ee9f4177 timestamp=2024-01-16T23:42:39.982+0000 level=INFO correlation_id=tbd description=Contract filter inclusive of 1 templates and 0 interfaces  scribe=application=scribe
    component=scribe instance_uuid=5f707d27-8188-4a44-904e-2f98ee9f4177 timestamp=2024-01-16T23:42:40.476+0000 level=INFO correlation_id=tbd description=Continuing from offset '00000000000000000b' and index '0' until offset '00000000000000000b'  scribe=application=scribe
    
  • Plain / Custom

    --logger-pattern=%timestamp{yyyy-MM-dd'T'HH:mm:ss} %level %name:%line %highlight{%message} %highlight{%cause} %kvs
    
    2024-01-16T23:55:52 INFO com.digitalasset.scribe.pipeline.pipeline.Impl:34 Starting pipeline on behalf of 'Alice_1::1220444f494b31c0a40c2f393edac3f5900325028c6f810a203a0334cd830ec230c8'  application=scribe
    2024-01-16T23:55:53 INFO com.digitalasset.scribe.pipeline.pipeline.Impl:61 Last checkpoint is absent. Seeding from ACS before processing transactions with starting offset '00000000000000000b'  application=scribe
    2024-01-16T23:55:53 INFO com.digitalasset.zio.daml.ledgerapi.package:201 Contract filter inclusive of 1 templates and 0 interfaces  application=scribe
    2024-01-16T23:55:53 INFO com.digitalasset.scribe.pipeline.pipeline.Impl:85 Continuing from offset '00000000000000000b' and index '0' until offset '00000000000000000b'  application=scribe
    
  • Json / Standard

    {"component":"scribe","instance_uuid":"03c263a0-6e3d-416e-b7f2-0e56b9e34841","timestamp":"2024-01-17T00:04:12.537+0000","level":"INFO","correlation_id":"tbd","description":"Starting pipeline on behalf of 'Alice_1::1220f03ed424480ab4487d88230fc033f3910f4cb4492fea68535a5760744b53dabe'","scribe":{"application":"scribe"}}
    {"component":"scribe","instance_uuid":"03c263a0-6e3d-416e-b7f2-0e56b9e34841","timestamp":"2024-01-17T00:04:13.551+0000","level":"INFO","correlation_id":"tbd","description":"Last checkpoint is absent. Seeding from ACS before processing transactions with starting offset '00000000000000000b'","scribe":{"application":"scribe"}}
    {"component":"scribe","instance_uuid":"03c263a0-6e3d-416e-b7f2-0e56b9e34841","timestamp":"2024-01-17T00:04:13.935+0000","level":"INFO","correlation_id":"tbd","description":"Contract filter inclusive of 1 templates and 0 interfaces","scribe":{"application":"scribe"}}
    {"component":"scribe","instance_uuid":"03c263a0-6e3d-416e-b7f2-0e56b9e34841","timestamp":"2024-01-17T00:04:14.659+0000","level":"INFO","correlation_id":"tbd","description":"Continuing from offset '00000000000000000b' and index '0' until offset '00000000000000000b'","scribe":{"application":"scribe"}}
    
  • Json / Structured

    {"timestamp":"2024-01-17T00:08:25+0000","level":"INFO","thread":"zio-fiber-0","location":"com.digitalasset.scribe.pipeline.pipeline.Impl:34","message":"Starting pipeline on behalf of 'Alice_1::122077c6b00e952ff694e2b25b6f5eb9582f815dfe793e2da668b119481a1dd5acdc'","application":"scribe"}
    {"timestamp":"2024-01-17T00:08:26+0000","level":"INFO","thread":"zio-fiber-0","location":"com.digitalasset.scribe.pipeline.pipeline.Impl:61","message":"Last checkpoint is absent. Seeding from ACS before processing transactions with starting offset '00000000000000000b'","application":"scribe"}
    {"timestamp":"2024-01-17T00:08:26+0000","level":"INFO","thread":"zio-fiber-882","location":"com.digitalasset.zio.daml.ledgerapi.package:201","message":"Contract filter inclusive of 1 templates and 0 interfaces","application":"scribe"}
    {"timestamp":"2024-01-17T00:08:26+0000","level":"INFO","thread":"zio-fiber-0","location":"com.digitalasset.scribe.pipeline.pipeline.Impl:85","message":"Continuing from offset '00000000000000000b' and index '0' until offset '00000000000000000b'","application":"scribe"}
    
  • Json / Custom

    --logger-pattern=%label{timestamp}{%timestamp{yyyy-MM-dd'T'HH:mm:ss}} %label{level}{%level} %label{location}{%name:%line} %label{description}{%message} %label{cause}{%cause} %label{scribe}{%kvs}
    
    {"timestamp":"2024-01-17T00:16:31","level":"INFO","location":"com.digitalasset.scribe.pipeline.pipeline.Impl:34","description":"Starting pipeline on behalf of 'Alice_1::1220ee13431ac437d454ea59d622cfc76599e0846a3caf166b4306d47b1bf83944a6'","scribe":{"application":"scribe"}}
    {"timestamp":"2024-01-17T00:16:33","level":"INFO","location":"com.digitalasset.scribe.pipeline.pipeline.Impl:61","description":"Last checkpoint is absent. Seeding from ACS before processing transactions with starting offset '00000000000000000b'","scribe":{"application":"scribe"}}
    {"timestamp":"2024-01-17T00:16:34","level":"INFO","location":"com.digitalasset.zio.daml.ledgerapi.package:201","description":"Contract filter inclusive of 1 templates and 0 interfaces","scribe":{"application":"scribe"}}
    {"timestamp":"2024-01-17T00:16:35","level":"INFO","location":"com.digitalasset.scribe.pipeline.pipeline.Impl:85","description":"Continuing from offset '00000000000000000b' and index '0' until offset '00000000000000000b'","scribe":{"application":"scribe"}}
    

    Note that you need to use %label{your_label}{format} to describe a JSON attribute-value pair.

Application metrics

Assuming PQS exposes metrics as described above, you can access the following metrics at http://localhost:9090/metrics. Each metric is accompanied by # HELP and # TYPE comments, which describe the meaning of the metric and its type, respectively.

Some metric types have additional constituent parts exposed as separate metrics. For example, a histogram metric type tracks max, count, sum, and actual ranged buckets as separate time series. Metrics are labeled where it makes sense, providing additional context such as the type of operation or the template/choice involved.
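
For example, you can inspect a metric and the constituent series of a histogram with curl (a sketch; the exact names and labels in your output may differ from the conceptual names below):

$ curl -s http://localhost:9090/metrics | grep watermark_ix
$ curl -s http://localhost:9090/metrics | grep -E 'jdbc_conn_use(_bucket|_count|_sum|_max)?'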

Conceptual list of metrics (refer to actual metric names in the Prometheus output):

Type       Name                                       Description
---------  -----------------------------------------  -----------------------------------------------------------
gauge      watermark_ix                               Current watermark index (transaction ordinal number for consistent reads)
counter    pipeline_events_total                      Processed ledger events
histogram  jdbc_conn_use                              Latency of database connection usage
histogram  jdbc_conn_isvalid                          Latency of database connection validation
histogram  jdbc_conn_commit                           Latency of database connection commit
histogram  total_tx_handling_latency                  Total latency of transaction handling in PQS (observed in LAPI to committed in DB)
gauge      tx_lag_from_ledger_wallclock               Lag from ledger (wall-clock delta, in ms, from command completion to receipt by pipeline)
histogram  pipeline_convert_acs_event                 Latency of converting ACS events
histogram  pipeline_convert_transaction               Latency of converting transactions
histogram  pipeline_prepare_batch_latency             Latency of preparing batches of statements
histogram  pipeline_execute_batch_latency             Latency of executing batches of statements
histogram  pipeline_progress_watermark_latency        Latency of watermark progression
histogram  pipeline_wp_acs_events_size                Number of in-flight units of work in the pipeline_wp_acs_events wait point
histogram  pipeline_wp_acs_statements_size            Number of in-flight units of work in the pipeline_wp_acs_statements wait point
histogram  pipeline_wp_acs_batched_statements_size    Number of in-flight units of work in the pipeline_wp_acs_batched_statements wait point
histogram  pipeline_wp_acs_prepared_statements_size   Number of in-flight units of work in the pipeline_wp_acs_prepared_statements wait point
histogram  pipeline_wp_events_size                    Number of in-flight units of work in the pipeline_wp_events wait point
histogram  pipeline_wp_statements_size                Number of in-flight units of work in the pipeline_wp_statements wait point
histogram  pipeline_wp_batched_statements_size        Number of in-flight units of work in the pipeline_wp_batched_statements wait point
histogram  pipeline_wp_prepared_statements_size       Number of in-flight units of work in the pipeline_wp_prepared_statements wait point
histogram  pipeline_wp_watermarks_size                Number of in-flight units of work in the pipeline_wp_watermarks wait point
counter    pipeline_wp_acs_events_total               Number of units of work processed in the pipeline_wp_acs_events wait point
counter    pipeline_wp_acs_statements_total           Number of units of work processed in the pipeline_wp_acs_statements wait point
counter    pipeline_wp_acs_batched_statements_total   Number of units of work processed in the pipeline_wp_acs_batched_statements wait point
counter    pipeline_wp_acs_prepared_statements_total  Number of units of work processed in the pipeline_wp_acs_prepared_statements wait point
counter    pipeline_wp_events_total                   Number of units of work processed in the pipeline_wp_events wait point
counter    pipeline_wp_statements_total               Number of units of work processed in the pipeline_wp_statements wait point
counter    pipeline_wp_batched_statements_total       Number of units of work processed in the pipeline_wp_batched_statements wait point
counter    pipeline_wp_prepared_statements_total      Number of units of work processed in the pipeline_wp_prepared_statements wait point
counter    pipeline_wp_watermarks_total               Number of units of work processed in the pipeline_wp_watermarks wait point
counter    app_restarts_total                         Number of times recoverable failures forced the pipeline to restart
gauge      grpc_up                                    Indicator whether the gRPC channel is up and operational
gauge      jdbc_conn_pool_up                          Indicator whether the JDBC connection pool is up and operational
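
Once these metrics are scraped by a Prometheus server, they can be queried and alerted on. For example, to watch the current processing lag (a sketch assuming a Prometheus server reachable at prometheus:9090 and the conceptual metric name above):

$ curl -s 'http://prometheus:9090/api/v1/query' \
    --data-urlencode 'query=max(tx_lag_from_ledger_wallclock)'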

Grafana dashboard

Based on the metrics described above, it is possible to build a comprehensive dashboard to monitor PQS. A vendor-supplied Grafana dashboard for PQS can be downloaded from the artifacts repository (see Download). You may want to use it as a starting point for your own.

grafana/v9.4.0/dashboard.json
grafana/v10.4.0/dashboard.json
grafana/v11.0.0/dashboard.json
[Screenshot: Grafana PQS dashboard]
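
The dashboard can be imported through the Grafana UI, or provisioned with Grafana's HTTP API. A sketch (assumes a service account token in $GRAFANA_TOKEN and wraps the downloaded JSON in the payload the endpoint expects):

$ curl -X POST http://grafana:3000/api/dashboards/db \
    -H "Authorization: Bearer $GRAFANA_TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"dashboard\": $(cat grafana/v11.0.0/dashboard.json), \"overwrite\": true}"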

Health check

The health of the PQS process can be monitored via the /livez health check endpoint. The endpoint is served on the configured network interface (--health-address) and TCP port (--health-port); the default is 127.0.0.1:8080.

$ curl http://localhost:8080/livez
{"status":"ok"}

Tracing of pipeline execution

PQS instruments the most critical parts of its operations with tracing to provide insights into the execution flow and performance. Traces can be exported to various OpenTelemetry backends by providing appropriate configuration, for example:

$ export OTEL_TRACES_EXPORTER=otlp
$ export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
$ export OTEL_EXPORTER_OTLP_ENDPOINT="http://otel-collector:4317"
$ export JDK_JAVA_OPTIONS="-javaagent:path/to/opentelemetry-javaagent.jar"
$ ./scribe.jar pipeline ledger postgres-document ...

The following root spans are emitted by PQS:

  • process metadata and schema: interactions that happen when PQS starts up and ensures its datastore is ready for operations

  • initialization routine: interactions that happen when PQS establishes its offset range boundaries (including seeding from the ACS, if requested) on startup

  • consume com.daml.ledger.api.v1.TransactionService/GetTransactions and consume com.daml.ledger.api.v1.TransactionService/GetTransactionTrees: [Daml SDK v2.x] timeline of processing a ledger transaction, from its delivery over gRPC to its persistence in the datastore

  • consume com.daml.ledger.api.v2.UpdateService/GetUpdates and consume com.daml.ledger.api.v2.UpdateService/GetUpdateTrees: [Daml SDK v3.x] timeline of processing a ledger transaction, from its delivery over gRPC to its persistence in the datastore

  • execute datastore transaction: interactions when a batch of transactions is persisted to the datastore

  • advance datastore watermark: interactions when the latest consecutive watermark is persisted to the datastore

All spans are enriched with contextual information through OpenTelemetry’s attributes and events where appropriate, and it is worth becoming familiar with this contextual data. Due to the technical nature of asynchronous and parallel execution, PQS heavily employs span links [3] to highlight causal relationships between independent traces. Modern trace visualisation tools leverage this information to provide a usable representation of, and navigation through, the traces involved.

Below is an example of causal trace data that spans the receipt of a transaction from the Ledger API all the way to its becoming visible through PQS’ SQL API in Postgres.

[Image: trace of consuming a transaction from the Ledger API]
Span #110
Trace ID       : 042ce1ffa24b34b38472933ac8209d54
Parent ID      :
ID             : d5c0071e1d9bbf76
Name           : consume com.daml.ledger.api.v1.TransactionService/GetTransactionTrees
Kind           : Consumer
Start time     : 2024-11-06 03:16:43.004004 +0000 UTC
End time       : 2024-11-06 03:16:43.004193 +0000 UTC
Status code    : Unset
Status message :
Attributes:
     -> messaging.operation.name: Str(consume)
     -> messaging.batch.message_count: Int(1)
     -> messaging.destination.name: Str(com.daml.ledger.api.v1.TransactionService/GetTransactionTrees)
     -> messaging.system: Str(canton)
     -> messaging.operation.type: Str(process)

Span #123
Trace ID       : 042ce1ffa24b34b38472933ac8209d54
Parent ID      : d5c0071e1d9bbf76
ID             : 9d60e1f4c42dce76
Name           : export transaction tree
Kind           : Internal
Start time     : 2024-11-06 03:16:43.004134 +0000 UTC
End time       : 2024-11-06 03:16:43.024574 +0000 UTC
Status code    : Unset
Status message :
Attributes:
     -> daml.effective_at: Str(2024-11-06T03:16:42.827847Z)
     -> daml.command_id: Str(3563113460)
     -> daml.events_count: Int(3)
     -> daml.workflow_id: Empty()
     -> daml.transaction_id: Str(122056219af2a73f913e1c2f0ce4422c156bc9cfdb5e5d49baaee0053bf3787f4a97)
     -> daml.offset: Str(000000000000000261)
Events:
SpanEvent #0
     -> Name: canonicalizing transaction tree
     -> Timestamp: 2024-11-06 03:16:43.004809542 +0000 UTC
SpanEvent #1
     -> Name: canonicalized transaction tree
     -> Timestamp: 2024-11-06 03:16:43.005138375 +0000 UTC
SpanEvent #2
     -> Name: converting canonical transaction to domain model
     -> Timestamp: 2024-11-06 03:16:43.005690625 +0000 UTC
SpanEvent #3
     -> Name: converted canonical transaction to domain model
     -> Timestamp: 2024-11-06 03:16:43.006170917 +0000 UTC
SpanEvent #4
     -> Name: released transaction model into batch
     -> Timestamp: 2024-11-06 03:16:43.015018459 +0000 UTC
SpanEvent #5
     -> Name: prepared SQL statements for transaction model
     -> Timestamp: 2024-11-06 03:16:43.015437 +0000 UTC
SpanEvent #6
     -> Name: flushed transaction model SQL to datastore
     -> Timestamp: 2024-11-06 03:16:43.019356042 +0000 UTC
SpanEvent #7
     -> Name: advanced datastore watermark
     -> Timestamp: 2024-11-06 03:16:43.024570334 +0000 UTC
     -> Attributes::
          -> index: Int(384)
          -> offset: Str(000000000000000261)
Links:
SpanLink #0
     -> Trace ID: 839da768a12333920b709410fb73911a
     -> ID: 276627b6e10f62c5
     -> TraceState:
     -> Attributes::
          -> target: Str(↥ ledger submission)
SpanLink #1
     -> Trace ID: 76c58361d46c08761c37ef5821e8fb78
     -> ID: 6051f05f10af0399
     -> TraceState:
     -> Attributes::
          -> target: Str(↧ persist to datastore)
SpanLink #2
     -> Trace ID: 71e67e2420deeef36ef3efacea6399dc
     -> ID: 161b5911e7a0ec18
     -> TraceState:
     -> Attributes::
          -> target: Str(↧ advance watermark)
[Image: trace of executing a datastore transaction]
Span #115
Trace ID       : 76c58361d46c08761c37ef5821e8fb78
Parent ID      :
ID             : 81f5f42361aa93ee
Name           : execute datastore transaction
Kind           : Internal
Start time     : 2024-11-06 03:16:43.015931 +0000 UTC
End time       : 2024-11-06 03:16:43.020991 +0000 UTC
Status code    : Unset
Status message :

Span #111
Trace ID       : 76c58361d46c08761c37ef5821e8fb78
Parent ID      : 81f5f42361aa93ee
ID             : 4bf8484e99999c64
Name           : acquire connection
Kind           : Internal
Start time     : 2024-11-06 03:16:43.016475 +0000 UTC
End time       : 2024-11-06 03:16:43.016688 +0000 UTC
Status code    : Unset
Status message :

Span #113
Trace ID       : 76c58361d46c08761c37ef5821e8fb78
Parent ID      : 81f5f42361aa93ee
ID             : 6051f05f10af0399
Name           : execute batch
Kind           : Internal
Start time     : 2024-11-06 03:16:43.016828 +0000 UTC
End time       : 2024-11-06 03:16:43.019494 +0000 UTC
Status code    : Unset
Status message :
Attributes:
     -> scribe.batch.models_count: Int(37)
Links:
SpanLink #0
     -> Trace ID: 33736b299a690b885c2314b9b17bde05
     -> ID: aba3d1dd6024ff71
     -> TraceState:
     -> Attributes::
          -> offset: Str(00000000000000025c)
          -> target: Str(↥ incoming transaction)
SpanLink #1
     -> Trace ID: 17f3edce9565defd379bf3ab8243f86d
     -> ID: 076afe5b4aac1212
     -> TraceState:
     -> Attributes::
          -> offset: Str(00000000000000025d)
          -> target: Str(↥ incoming transaction)
SpanLink #2
     -> Trace ID: 646ae61de95731c7726a6caee2d69ee9
     -> ID: bca9f5c28de74c90
     -> TraceState:
     -> Attributes::
          -> offset: Str(00000000000000025e)
          -> target: Str(↥ incoming transaction)
SpanLink #3
     -> Trace ID: 9ebd4d4f288b8b338f4192c0d7ea1b8c
     -> ID: a1d145fa9d76d5b3
     -> TraceState:
     -> Attributes::
          -> offset: Str(00000000000000025f)
          -> target: Str(↥ incoming transaction)
SpanLink #4
     -> Trace ID: e0716f968b5019a450da04317ea8f776
     -> ID: a75658ce89441bee
     -> TraceState:
     -> Attributes::
          -> offset: Str(000000000000000260)
          -> target: Str(↥ incoming transaction)
SpanLink #5
     -> Trace ID: 042ce1ffa24b34b38472933ac8209d54
     -> ID: 9d60e1f4c42dce76
     -> TraceState:
     -> Attributes::
          -> offset: Str(000000000000000261)
          -> target: Str(↥ incoming transaction)

Span #112
Trace ID       : 76c58361d46c08761c37ef5821e8fb78
Parent ID      : 6051f05f10af0399
ID             : 00419239933510fa
Name           : execute SQL
Kind           : Internal
Start time     : 2024-11-06 03:16:43.016855 +0000 UTC
End time       : 2024-11-06 03:16:43.019162 +0000 UTC
Status code    : Unset
Status message :
Attributes:
     -> scribe.__contracts.rows_count: Int(9)
     -> scribe.__exercises.rows_count: Int(3)
     -> scribe.__events.rows_count: Int(12)
     -> scribe.__archives.rows_count: Int(1)
     -> scribe.__transactions.rows_count: Int(6)

Span #114
Trace ID       : 76c58361d46c08761c37ef5821e8fb78
Parent ID      : 81f5f42361aa93ee
ID             : 9872ff55adc9e370
Name           : commit transaction
Kind           : Internal
Start time     : 2024-11-06 03:16:43.019916 +0000 UTC
End time       : 2024-11-06 03:16:43.020742 +0000 UTC
Status code    : Unset
Status message :
[Image: trace of advancing the datastore watermark]
Span #124
Trace ID       : 71e67e2420deeef36ef3efacea6399dc
Parent ID      :
ID             : 161b5911e7a0ec18
Name           : advance datastore watermark
Kind           : Internal
Start time     : 2024-11-06 03:16:43.021507 +0000 UTC
End time       : 2024-11-06 03:16:43.024872 +0000 UTC
Status code    : Unset
Status message :
Attributes:
     -> scribe.watermark.offset: Str(000000000000000261)
     -> scribe.watermark.ix: Int(384)
Links:
SpanLink #0
     -> Trace ID: 76c58361d46c08761c37ef5821e8fb78
     -> ID: 6051f05f10af0399
     -> TraceState:
     -> Attributes::
          -> target: Str(↥ persist to datastore)

Span #116
Trace ID       : 71e67e2420deeef36ef3efacea6399dc
Parent ID      : 161b5911e7a0ec18
ID             : 33ab3918ebfe138d
Name           : acquire connection
Kind           : Internal
Start time     : 2024-11-06 03:16:43.022009 +0000 UTC
End time       : 2024-11-06 03:16:43.022222 +0000 UTC
Status code    : Unset
Status message :

Span #6
Trace ID       : 71e67e2420deeef36ef3efacea6399dc
Parent ID      : 161b5911e7a0ec18
ID             : 1a66240dfd597654
Name           : UPDATE scribe.__watermark
Kind           : Client
Start time     : 2024-11-06 03:16:43.022737084 +0000 UTC
End time       : 2024-11-06 03:16:43.023134917 +0000 UTC
Status code    : Unset
Status message :
Attributes:
     -> db.operation: Str(UPDATE)
     -> db.sql.table: Str(__watermark)
     -> db.name: Str(scribe)
     -> db.connection_string: Str(postgresql://postgres-scribe:5432)
     -> server.address: Str(postgres-scribe)
     -> server.port: Int(5432)
     -> db.user: Str(pguser)
     -> db.statement: Str(update __watermark set "offset" = ?, ix = ?;)
     -> db.system: Str(postgresql)

Span #117
Trace ID       : 71e67e2420deeef36ef3efacea6399dc
Parent ID      : 161b5911e7a0ec18
ID             : f0b24fb074fe41f8
Name           : commit transaction
Kind           : Internal
Start time     : 2024-11-06 03:16:43.023629 +0000 UTC
End time       : 2024-11-06 03:16:43.024157 +0000 UTC
Status code    : Unset
Status message :

Trace context propagation

PQS is an intermediary between a ledger instance and downstream applications that prefer to access data through SQL rather than in a streaming manner directly from the Ledger API. Despite forming a pipeline between two data storage systems (Canton and PostgreSQL), PQS stores the original ledger transaction’s trace context (see also Open Tracing in Ledger API Client Applications) for the purposes of propagation, rather than its own. This allows downstream applications to decide for themselves how they want to connect to the original submission’s trace (as a child span, or as a new trace connected through span links).

select "offset",
       (trace_context).trace_parent,
       (trace_context).trace_state
from __transactions limit 1;
       offset       |                      trace_parent                       |   trace_state
--------------------+---------------------------------------------------------+-----------------
 0000000000000000bb | 00-f35923baa38cc520a1fc3aec6771380b-b4cf363cbf5efa6a-01 | foo=bar,baz=qux
Span #85
    Trace ID       : f35923baa38cc520a1fc3aec6771380b
    Parent ID      : d3300bedd4c64511
    ID             : b4cf363cbf5efa6a
    Name           : MessageDispatcher.handle
    Kind           : Internal
    Start time     : 2024-11-05 04:01:40.808 +0000 UTC
    End time       : 2024-11-05 04:01:40.822694083 +0000 UTC
    Status code    : Unset
    Status message :
Attributes:
     -> canton.class: Str(com.digitalasset.canton.participant.protocol.EnterpriseMessageDispatcher)
↑↑↑ span context propagated through transaction/tree stream in Ledger API

↓↓↓ following parent's links chain leads us to the root span of original submission
Span #19
    Trace ID       : f35923baa38cc520a1fc3aec6771380b
    Parent ID      :
    ID             : de3aed62b5fb43ce
    Name           : com.daml.ledger.api.v1.CommandService/SubmitAndWaitForTransaction
    Kind           : Server
    Start time     : 2024-11-05 04:01:40.569 +0000 UTC
    End time       : 2024-11-05 04:01:40.866904459 +0000 UTC
    Status code    : Unset
    Status message :
Attributes:
     -> rpc.method: Str(SubmitAndWaitForTransaction)
     -> daml.submitter: Str()
     -> rpc.service: Str(com.daml.ledger.api.v1.CommandService)
     -> net.peer.port: Int(38640)
     -> net.transport: Str(ip_tcp)
     -> daml.workflow_id: Str()
     -> daml.command_id: Str(3498760027)
     -> rpc.system: Str(grpc)
     -> net.peer.ip: Str(172.18.0.15)
     -> daml.application_id: Str(appid)
     -> rpc.grpc.status_code: Int(0)

Accessing data stored in PQS’ __transactions.trace_context column allows any application to re-create the propagated trace context [4] and use it with their runtime’s instrumentation library.
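
For example, a downstream job could read the stored trace parent with psql and attach it as a W3C traceparent header to its own outbound HTTP calls, thereby joining the original trace (a sketch; database connection details and the target URL are illustrative):

$ TRACEPARENT=$(psql -At -c "select (trace_context).trace_parent from __transactions limit 1")
$ curl -H "traceparent: $TRACEPARENT" http://downstream-app/api/...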

Diagnostics

PQS is capable of exporting diagnostic telemetry snapshots. This data export archive contains essential troubleshooting information such as:

  • application thread dumps (over a period of time)

  • application metrics (over a period of time)

Getting this archive is as easy as accessing the socket with the netcat tool (the example assumes the diagnostics port has been set to 9091, as described below):

$ nc localhost 9091 > health-dump.zip
$ unzip health-dump.zip
Archive:  health-dump.zip
  inflating: metrics.openmetrics
  inflating: threads-20250307-105606.zip

The available configuration sources are listed below; for each setting, the system property takes priority over the environment variable, which takes priority over the default value.

  • da.diagnostics.enabled / DA_DIAGNOSTICS_ENABLED (default: true)
    Enables/disables diagnostics data collection and exposition

  • da.diagnostics.host / DA_DIAGNOSTICS_HOST (default: 127.0.0.1)
    Hostname or IP address to use for binding the exposition socket

  • da.diagnostics.port / DA_DIAGNOSTICS_PORT (default: 0)
    Port to use for binding the exposition socket (0 = random port)

  • da.diagnostics.dump.path / DA_DIAGNOSTICS_DUMP_PATH (default: <empty>)
    Directory to write a diagnostics dump to on graceful shutdown (the path must be an existing writable directory)

  • da.diagnostics.metrics.interval / DA_DIAGNOSTICS_METRICS_INTERVAL (default: PT10S)
    Metrics collection interval, in ISO 8601 duration format

  • da.diagnostics.metrics.buffer.size / DA_DIAGNOSTICS_METRICS_BUFFER_SIZE (default: 60)
    Number of samples to store for each monitored metric (rolling window)

  • da.diagnostics.metrics.tags / DA_DIAGNOSTICS_METRICS_TAGS (default: <empty>)
    Comma-separated list of additional labels to enrich each metric with during exposition (for example, job=myapp,env=staging,deployed=20250101)

  • da.diagnostics.threads.interval / DA_DIAGNOSTICS_THREADS_INTERVAL (default: PT1M)
    Thread dump collection interval, in ISO 8601 duration format

  • da.diagnostics.threads.buffer.size / DA_DIAGNOSTICS_THREADS_BUFFER_SIZE (default: 10)
    Number of thread dumps to store (rolling window)
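
For example, to pin the exposition socket to port 9091 (as assumed by the netcat example above) and tag the collected metrics, you could start PQS as follows (a sketch):

$ export DA_DIAGNOSTICS_PORT=9091
$ export DA_DIAGNOSTICS_METRICS_TAGS="job=pqs,env=staging"
$ ./scribe.jar pipeline ledger postgres-document ...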