SQL Results Caching
The user chooses the look-aside cache (e.g. Redis, GemFire), and the proxy provides the caching and invalidation logic. For business intelligence applications like Tableau, customers have seen response times improve by more than 10x. No application changes are required.
The Heimdall proxy supports true multi-user pooling. Operators can safely scale without overwhelming Greenplum resources. There are two Heimdall techniques that reduce connection overhead:
- Connection Pooling: Multiple client connections are associated with a single Greenplum connection. Unlike PgBouncer, we support 2-tier pooling: per-user and global connection limits.
- Connection Multiplexing: An extension of pooling in which the proxy dispatches individual queries or transactions to connections in the connection pool.
As client connections are often idle, multiplexing allows more “active” client connections to Greenplum. The net result is 1) lower total memory, 2) more active queries processed, and 3) reduced database costs.
|Feature|Heimdall|PgBouncer|
|---|---|---|
|Multiplexing (transaction pooling)|Yes|Yes|
|Per-User Connection Limits|Yes|Yes|
|Global Connection Limits|Yes|No|
|LDAP Group Sync|Yes|No|
|LDAP Group Policies|Yes|No|
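The pooling and multiplexing behavior described above can be sketched in a few lines. This is an illustrative toy, not Heimdall's implementation: the class `MultiplexingPool`, its method names, and the choice to raise an error when a limit is hit are all assumptions made for the example. It only shows the idea that a backend connection is held for the duration of one transaction, under both a per-user and a global limit.

```python
from collections import defaultdict


class MultiplexingPool:
    """Toy pool: many clients share a few backend connections.

    A backend connection is held only while a transaction runs, and both a
    per-user and a global limit are enforced (two-tier pooling).
    All names here are hypothetical.
    """

    def __init__(self, global_limit, per_user_limit):
        self.global_limit = global_limit
        self.per_user_limit = per_user_limit
        self.in_use = defaultdict(int)  # user -> backend connections held now

    def acquire(self, user):
        """Reserve a backend connection for one transaction, or return False."""
        if sum(self.in_use.values()) >= self.global_limit:
            return False  # global tier exhausted
        if self.in_use[user] >= self.per_user_limit:
            return False  # per-user tier exhausted
        self.in_use[user] += 1
        return True

    def release(self, user):
        """Give the backend connection back once the transaction ends."""
        if self.in_use[user] > 0:
            self.in_use[user] -= 1

    def run_transaction(self, user, work):
        """Run `work` on a pooled connection; idle clients hold nothing."""
        if not self.acquire(user):
            # Assumption: this sketch fails fast; a real proxy might queue.
            raise RuntimeError("connection limit reached")
        try:
            return work()
        finally:
            self.release(user)
```

Because a client occupies a backend connection only inside `run_transaction`, any number of idle clients can stay connected to the proxy without consuming Greenplum connections.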
- Improved application response times due to fewer commits
- Improved DML scale
Heimdall’s transparent failover is a must-have for any enterprise that needs Greenplum to be always-on. Heimdall detects a write master failure and seamlessly fails over to the standby. How are we different? Upon a failure, Heimdall queues the current connection and transparently fails over to the standby master; our proxy persists the application connection. This helps avoid application errors and exceptions during a failover.
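To make the failover idea concrete, here is a minimal sketch of a proxy that keeps the client-facing session alive while it swaps backends. The classes `Backend`, `BackendDown`, and `FailoverProxy` are hypothetical stand-ins, and the sketch only replays a single statement; how Heimdall actually detects failures or handles in-flight transactions is not described in this text.

```python
class BackendDown(Exception):
    """Raised by a backend that can no longer serve queries."""


class Backend:
    """Stand-in for a database node (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def execute(self, sql):
        if not self.healthy:
            raise BackendDown(self.name)
        return f"{self.name}: {sql}"


class FailoverProxy:
    """Holds the application connection and hides a master failure from it."""

    def __init__(self, master, standby):
        self.active = master
        self.standby = standby

    def execute(self, sql):
        try:
            return self.active.execute(sql)
        except BackendDown:
            if self.standby is None:
                raise  # nothing left to fail over to
            # Promote the standby and replay the held query, so the
            # application sees a result instead of an exception.
            self.active, self.standby = self.standby, None
            return self.active.execute(sql)
```

The application keeps calling `proxy.execute(...)` on the same object before and after the master fails, which is the sense in which the connection is persisted.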
Hybrid Transactional Analytical Processing (HTAP)
As explained in our technical blog, companies choose Pivotal Greenplum for its large-scale MPP (Massively Parallel Processing). However, there are some drawbacks:
- Higher latency: many nodes must coordinate to generate an answer
- No support for materialized views
- Frequent queries slow down processing
The ideal solution to these drawbacks, such as Heimdall Data, requires no code changes. Heimdall uses a separate Postgres database to maximize Pivotal Greenplum performance. One benefit is materialized view support.
Fast materialized views are very important in analytics environments. When reports are generated or dashboards are viewed, a subset of data is pulled from the back-end data store and processed. Heimdall provides the following functionality:
- Queries against a materialized view are routed to an alternate database (e.g. Postgres) acting on behalf of Pivotal Greenplum. Postgres answers queries, offloading Greenplum.
- Flexibly configured triggers provide an auto-refresh of the view. Heimdall is aware of updated views from Pivotal Greenplum and when data was loaded that may impact the view. The net result is faster reports and a lighter load on Greenplum.
Heimdall Data for Pivotal Greenplum allows customers to deploy applications for both OLTP and OLAP workloads, often called HTAP (Hybrid Transactional/Analytical Processing).
Why Heimdall Data for Pivotal?
Heimdall Data is a database proxy that improves Pivotal Greenplum performance by up to 100x. Heimdall is deployed between the application and the data source to improve query response times through features such as query caching, connection pooling, batching of singleton DML, and materialized view offload.
How is Heimdall Data deployed?
1) VM, Docker container, or standalone instance between the application and database
2) Sidecar process, agent, or JDBC driver installed on each application instance to maximize performance and reduce latency
Deployment requires no application or database changes. Just change the connection string or network settings of the application to route through the Heimdall Database Proxy. SQL performance visibility and optimization are managed from a Heimdall Data central console.
What are other Heimdall for Pivotal Use Cases?
Problem: Queries issued by a client sometimes need to trigger actions that synchronize or update data so that it stays fresh. This can be used in conjunction with query routing to aggregate or batch data for loading into a large-scale data warehouse, for example. In this case, the inserts may be made against a fast, small front-end database such as Postgres; then, after a period during which no more updates occur, the data is synchronized as a whole into the large-scale data warehouse.
HTAP Solution: Heimdall is implementing a generic system whereby actions (SQL calls or external programs) can be triggered while processing a query. The trigger can be executed before the matching SQL, after it, or in parallel, where “parallel” means in another thread. Further, the parallel execution can be delayed, with repeated calls being aggregated into a single call.
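The delayed, aggregated form of a parallel trigger resembles a debounce: each new call postpones the action, and only one execution happens once calls stop arriving. The sketch below illustrates that pattern with Python's standard `threading.Timer`. The class `DelayedTrigger` and its methods are hypothetical names chosen for this example and do not reflect Heimdall's configuration or internals.

```python
import threading


class DelayedTrigger:
    """Runs `action` once, `delay` seconds after the most recent fire() call.

    The action runs in another thread, and repeated calls that arrive
    before the delay expires are folded into a single execution.
    """

    def __init__(self, action, delay):
        self.action = action
        self.delay = delay
        self._lock = threading.Lock()
        self._timer = None

    def fire(self):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()  # fold this call into the pending one
            self._timer = threading.Timer(self.delay, self.action)
            self._timer.start()

    def wait(self):
        """Block until the most recently scheduled timer thread has finished."""
        with self._lock:
            timer = self._timer
        if timer is not None:
            timer.join()
```

For example, if a view refresh is wired to `fire()` and five matching inserts arrive in quick succession, the refresh runs once after the last insert rather than five times.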
Summary: Heimdall Data optimizes Greenplum performance for both query reads and writes.
Batching DML Operations: Data is inserted row-by-row into a front-end database and aggregated into a bulk insert into the data warehouse when there is a lull in insert activity. This keeps overhead on the back-end low, even while large amounts of individual row data are being inserted, potentially by many clients. Use case: when IoT devices feed large amounts of data into a data warehouse, a front-end database can be used as a buffer, with a periodic flush of the data into the warehouse, without additional logic being needed at the application layer.
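A rough sketch of the buffering idea follows. It makes several simplifying assumptions: an in-memory list stands in for the front-end database, an in-memory SQLite database stands in for the warehouse, and the `readings` table, the `InsertBuffer` class, and the `idle_seconds` parameter are invented for illustration. The point is only that many single-row inserts become one bulk load with one commit after a lull.

```python
import sqlite3
import time


class InsertBuffer:
    """Collects single-row inserts and writes them to the warehouse in bulk."""

    def __init__(self, warehouse, idle_seconds, clock=time.monotonic):
        self.warehouse = warehouse
        self.idle_seconds = idle_seconds
        self.clock = clock  # injectable so the lull can be simulated
        self.rows = []
        self.last_insert = None

    def insert(self, device_id, reading):
        """Accept one row; nothing is sent to the warehouse yet."""
        self.rows.append((device_id, reading))
        self.last_insert = self.clock()

    def flush_if_idle(self):
        """Bulk-load buffered rows if no insert arrived for idle_seconds."""
        if not self.rows or self.clock() - self.last_insert < self.idle_seconds:
            return 0
        with self.warehouse:  # one transaction, one commit for the whole batch
            self.warehouse.executemany(
                "INSERT INTO readings (device_id, reading) VALUES (?, ?)",
                self.rows,
            )
        flushed = len(self.rows)
        self.rows.clear()
        return flushed


warehouse = sqlite3.connect(":memory:")  # stand-in for the data warehouse
warehouse.execute("CREATE TABLE readings (device_id TEXT, reading REAL)")
```

A caller would invoke `flush_if_idle()` periodically; it returns the number of rows loaded, or 0 while inserts are still arriving.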
Automatic Materialized View Management:
Heimdall serves as a traffic manager:
- Queries against a materialized view are routed to an alternate database (e.g. Postgres) acting on behalf of Pivotal Greenplum. Postgres answers queries, offloading Greenplum. Also, frequent queries of small data sets are better served by a general-purpose Postgres database than by Greenplum. Heimdall routes traffic to the appropriate data source for the best performance.
- Heimdall triggers a refresh of the view automatically. Heimdall is aware of updated views from Pivotal Greenplum and when data was loaded that may impact the view. The net result is faster reports and a lighter load on Greenplum.
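The routing decision in the list above can be illustrated with a small function. This is a deliberately crude sketch: the view names in `MATERIALIZED_VIEWS` are hypothetical, and the regular expression is a stand-in for real SQL parsing, which a production proxy would need. It sends a read to Postgres only when every table it references is a known materialized view.

```python
import re

# Hypothetical names of views that are materialized on the Postgres side.
MATERIALIZED_VIEWS = {"daily_sales_mv", "top_customers_mv"}


def route(sql):
    """Return 'postgres' for reads that touch only known materialized views."""
    if not sql.lstrip().lower().startswith("select"):
        return "greenplum"  # writes and DDL always go to the warehouse
    # Crude table extraction: identifiers that follow FROM or JOIN.
    tables = re.findall(
        r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", sql, flags=re.IGNORECASE
    )
    # Drop any schema prefix and compare case-insensitively.
    names = {t.split(".")[-1].lower() for t in tables}
    if names and names <= MATERIALIZED_VIEWS:
        return "postgres"
    return "greenplum"
```

A query that joins a materialized view with a base table still goes to Greenplum in this sketch, since the alternate database is assumed to hold only the views.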
How does Heimdall help Greenplum with frequent singleton transactions?
How does query caching work?
Our proxy uses a grid-cache as a look-aside results cache. We provide the caching and invalidation logic to intelligently offload SQL traffic from Greenplum and cache into the grid-cache of your choice (Redis, Amazon ElastiCache, Pivotal GemFire, Hazelcast, GridGain).
For this use case, Heimdall transparently stores SQL results in the cache and serves the results for a matching query from the cache.
- Heimdall treats GemFire as a key/value store: Key = query and Value = result set
- Heimdall is NOT writing SQL into GemFire nor is GemFire used as a Read-through or Write-through cache.
- No code changes are required for automatic cache invalidation.
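The points above can be sketched as a look-aside cache keyed by the SQL text, with invalidation driven by writes that pass through the proxy. Everything here is an assumption made for illustration: a Python dict stands in for the grid-cache, `ResultCache` and `run_query` are hypothetical names, and table-level invalidation based on a simple regular expression is one possible strategy, not a description of Heimdall's actual invalidation logic.

```python
import re


class ResultCache:
    """Look-aside cache: key = SQL text, value = result set."""

    def __init__(self, run_query):
        self.run_query = run_query   # callable that sends SQL to the database
        self.results = {}            # stand-in for Redis, GemFire, etc.
        self.queries_by_table = {}   # table name -> cached SQL strings

    @staticmethod
    def _tables(sql):
        """Crudely extract table names that follow FROM, JOIN, INTO or UPDATE."""
        pattern = r"\b(?:from|join|into|update)\s+([A-Za-z_]\w*)"
        return {t.lower() for t in re.findall(pattern, sql, flags=re.IGNORECASE)}

    def execute(self, sql):
        if sql.lstrip().lower().startswith("select"):
            if sql in self.results:
                return self.results[sql]   # cache hit: database not touched
            rows = self.run_query(sql)     # cache miss: ask the database
            self.results[sql] = rows
            for table in self._tables(sql):
                self.queries_by_table.setdefault(table, set()).add(sql)
            return rows
        # A write goes to the database, then invalidates every cached
        # result that read one of the tables it touched.
        outcome = self.run_query(sql)
        for table in self._tables(sql):
            for cached_sql in self.queries_by_table.pop(table, set()):
                self.results.pop(cached_sql, None)
        return outcome
```

The application only ever calls `execute(sql)`, which is why no code changes are needed for either caching or invalidation in this model.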
How does Heimdall's Automated Failover help Pivotal Greenplum?
What concurrency levels can you expect with the Heimdall solution?
What transactional databases does Heimdall support?
Heimdall supports any SQL database (Postgres, MySQL, SQL Server, Oracle, etc.).