Optimize

Proper capacity planning is essential to ensure that provisioned resources match the expected workload.
Host infrastructure

The following are the minimum requirements for the host operating system running any kind of production workload. Treat them as a starting point for determining the resources a particular workload requires:

PQS host:
- Memory: 4GB
- CPU: 4 cores

PostgreSQL:
- Memory: 8GB
- CPU: 8 cores
Importance of deliberate PostgreSQL tuning

It is hard to predict a client's needs without knowing their traffic shapes, usage patterns, and so on.
If PQS data is used heavily in read scenarios with a large active ACS, it is advisable to increase the RAM available to PostgreSQL to maximise its cache capabilities.

The number of CPU cores is strongly correlated with the maximum number of connections: greater parallelism in query execution requires more simultaneous connections, and therefore more CPUs.

Note that OLAP-like queries (reporting-style workloads with many aggregations, joins, and large result sets) call for additional resources that are typically unnecessary in OLTP-like workloads. In practice, this means the number of CPUs should match the maximum number of connections, to avoid process scheduling contention in the presence of concurrent long-running queries.
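As a rough way to observe these pressures under load, the standard PostgreSQL statistics views can be queried. This is a sketch; the thresholds mentioned in the comments are illustrative assumptions, not PQS guidance:

-- Buffer cache hit ratio across all databases; a persistently low value
-- (for example, well below 0.99 in an OLTP-style workload) suggests the
-- RAM available for caching is undersized for the working set.
select sum(blks_hit)::float / nullif(sum(blks_hit) + sum(blks_read), 0)
    as cache_hit_ratio
from pg_stat_database;

-- Backends currently executing a query; if this regularly approaches the
-- CPU core count, concurrent long-running queries will contend for CPU.
select count(*) as active_backends
from pg_stat_activity
where state = 'active';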
By default, the PostgreSQL Docker image ships untuned, with default settings sized for hardware (roughly, an HDD on a Raspberry Pi) that is inadequate for most common contemporary scenarios. You need to actively tune PostgreSQL for the hardware it actually runs on.

As a starting point, use an online calculator [1] to derive PostgreSQL startup settings from your hardware parameters.

In a Docker container environment, pass the configuration parameters as startup arguments, for example:
services:
  postgres:
    command:
      - postgres
      - -c
      - max_connections=100
      - -c
      - shared_buffers=1GB
      - -c
      - effective_cache_size=3GB # and so on
AWS Aurora comes pre-configured [2] with sensible defaults that take the underlying provisioned cloud resources into account, but do check for unreasonable overrides (using the query below or the Aurora management UI).
select name, setting, source from pg_settings;
Can PQS run against AWS Aurora?

PQS has been proven in production environments to work within expected parameters with AWS Aurora (Serverful) as a datastore, provided default settings are in use (for both PQS and Aurora).

AWS Aurora Serverless has not been tested yet.
Java Virtual Machine (JVM) configuration

Appropriate JVM configuration is important to ensure that PQS has enough resources to run efficiently. At a minimum, the following should be considered for any realistic deployment:
-XX:+AlwaysPreTouch
-XX:-UseAdaptiveSizePolicy
## containers:
-XX:InitialRAMPercentage=75.0
-XX:MaxRAMPercentage=75.0
## host/vm:
-Xms4g
-Xmx4g
In resource-constrained scenarios an out-of-memory error may occur. To diagnose this, collect a heap dump for analysis by adding the following JVM parameters:
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/path/to/heap.dump
PostgreSQL configuration

Users should at least consider the PostgreSQL configuration items that are relevant to the workloads the database is expected to satisfy (see postgresql.conf [3]).
Where high performance is required, a Database Administrator needs to tune PostgreSQL for the intended workload and infrastructure, and iteratively adjust the configuration while the system is under a simulated workload.
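For example, under a simulated workload the most expensive statements can be identified with the pg_stat_statements extension. This sketch assumes the extension is installed and enabled; the column names shown are those of PostgreSQL 13 and later:

-- Top 10 statements by average execution time
select query, calls, mean_exec_time
from pg_stat_statements
order by mean_exec_time desc
limit 10;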
Host environment

PQS requires write access to a local cache directory (configured through --source-ledger-cachedir, default /tmp/scribe) in order to temporarily cache Daml packages. The size of this cache is proportional to the combined size of all Daml packages observable on the ledger. The cache is ephemeral, so it does not need to persist beyond a single execution. Containerized environments should configure a disk-based mount, as the cache is not important for overall performance.
Processing pipeline configuration

If you wish to have more detailed logging for diagnosis, you can raise the --logger-level parameter to DEBUG or TRACE. However, be aware that this generates a lot of log output and may negatively impact performance. It is therefore recommended that you de-scope particularly verbose components (such as Netty) to INFO level (see Logging).
Setting the Ledger API queue length is a trade-off between memory usage and performance. The default value is 128; it can be increased to deliver more stable performance, at the cost of additional memory. Note that the buffer consumes memory equal to the combined size of the transactions held in its rolling window:
--source-ledger-buffersize 1024
The anticipated workload might necessitate increasing the number of JDBC connections utilized by PQS (default 16). This can be controlled via:
--target-postgres-maxconnections 64
It is paramount to ensure that PostgreSQL's max_connections [9] parameter is set to no less than the total number of connections requested by all clients of the database.
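The effective limit on the PostgreSQL side can be verified with a simple query, for example:

show max_connections;

-- or, including where the value was set from:
select name, setting, source from pg_settings where name = 'max_connections';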
Query analysis
This section briefly discusses optimizing the PQS database to make the most of the capabilities of PostgreSQL. The topic is broad, and there are many resources available. Refer to the PostgreSQL documentation for more information.
PQS makes extensive use of JSONB columns to store ledger data. Familiarity with JSONB is essential to optimize queries.
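As an illustration of the typical operators involved, consider the following hypothetical query; the template and field names are assumptions for the example, not part of PQS:

-- '->' descends into a JSONB object, '->>' extracts a leaf value as text;
-- casts are needed to filter or sort numerically.
select payload->>'owner' as owner,
       (payload->>'amount')::numeric as amount
from active('MyModule:Token')
where payload->>'currency' = 'USD';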
To get an explanation of how a query performs, prefix the query text with explain analyze. This helps verify that the query executes as expected, using the indexes you expect it to.
explain analyze
select * from active('MyModule:MyTemplate');
To learn more about interpreting the query planner output, refer to the series of articles [10]; to visualize plans, use the online tool [11].
Indexing
Indexes are an important tool for improving the performance of queries with JSONB content. Users are expected to create JSONB-based indexes to optimize their model-specific queries, where additional read efficiency justifies the inevitable write-amplification. Simple indexes can be created using the following helper function:
call create_index_for_contract('token_wallet_holder_idx',
    'register:DA.Register:Token',
    '(payload->''wallet''->>''holder'')',
    'hash');
In this example, the index supports equality comparisons on the wallet holder. It has the additional advantage that the result of the JSONB extraction is cached/materialized and does not have to be recomputed for every access.
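To confirm the index is actually picked up, run a query that filters on the same expression under explain analyze; the party literal below is a placeholder:

explain analyze
select * from active('register:DA.Register:Token')
where payload->'wallet'->>'holder' = 'Alice';

If the plan reports a sequential scan rather than an index scan on token_wallet_holder_idx, bear in mind that the planner may legitimately prefer a sequential scan on small tables.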
PostgreSQL provides several index types. Each index type uses a different algorithm that is best suited to different types of queries. The table below provides a basic explanation of where they can be used. For a more thorough understanding, consult the chapter on indexes [12] in the PostgreSQL manual.
| Index Type | Comment |
|---|---|
| Hash | Compact. Useful only for filters that use equality (=). |
| B-tree | Can be used in filters that use <, <=, =, >=, >, and for sorting. |
| GIN | Useful for subset operators. |
| BRIN | Efficient for tables where rows are already physically sorted for a particular column. |
More sophisticated indexes can be created using the standard PostgreSQL syntax.
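For example, a GIN index supporting JSONB containment filters could be created directly. The relation name contract_table below is a hypothetical placeholder, as the appropriate target relation depends on your PQS schema:

-- jsonb_path_ops is smaller and faster than the default GIN operator
-- class, but supports only the containment operator (@>).
create index token_payload_gin_idx
    on contract_table
    using gin (payload jsonb_path_ops);

-- Enables efficient containment filters such as:
select * from contract_table
where payload @> '{"currency": "USD"}';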
Testing

Any modified settings need to be independently assessed and tuned. Users should establish a performance testing and benchmarking environment in order to validate the performance of PQS on a given workload. Note that the following variables are extremely relevant to overall PQS performance characteristics:

- Network latency
- Ledger transaction throughput
- Ledger transaction sizes
- Contract data sizes
Changing any of these variables requires re-testing to ensure that the impact on system performance is understood, and within acceptable tolerances.
There are no “one size fits all” settings for PQS, so it is important to understand the workload and empirically test the configuration to ensure that it meets the performance requirements of your specific use case.