Query Results Caching

The user chooses the look-aside cache (e.g. Redis, Hazelcast, GemFire), and the proxy provides the caching and invalidation logic. For business intelligence applications such as Tableau, customers have seen response times improve by more than 10x. No application changes are required.
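The look-aside pattern can be sketched as follows. This is an illustrative simplification, not Heimdall's implementation: a plain dict stands in for the external grid cache (Redis, Hazelcast, etc.), and the table-to-key index shows one way invalidation on writes can work.

```python
import hashlib

class LookAsideCache:
    """Illustrative look-aside results cache: check the cache before the
    database, store misses, and track which tables each cached result
    depends on so a write to a table can invalidate stale entries."""

    def __init__(self):
        self.store = {}      # stands in for Redis/Hazelcast/GemFire
        self.by_table = {}   # table name -> set of cache keys to invalidate

    def _key(self, sql):
        return hashlib.sha256(sql.encode()).hexdigest()

    def query(self, sql, tables, run_on_db):
        k = self._key(sql)
        if k in self.store:          # cache hit: the database is not touched
            return self.store[k]
        result = run_on_db(sql)      # cache miss: execute on the database
        self.store[k] = result
        for t in tables:
            self.by_table.setdefault(t, set()).add(k)
        return result

    def invalidate(self, table):
        """Called when a DML statement touches `table`."""
        for k in self.by_table.pop(table, ()):
            self.store.pop(k, None)
```

A proxy doing this transparently is what lets read-heavy BI dashboards see large speedups: repeated identical queries never reach the database.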

Connection Pooling

The Heimdall proxy supports true multi-user pooling. Operators can safely scale without overwhelming Greenplum resources. There are two Heimdall techniques that reduce connection overhead:

  • Connection Pooling: Multiple client connections are mapped onto a shared set of Greenplum connections. Unlike PgBouncer, Heimdall supports two-tier pooling: per-user and global connection limits.
  • Connection Multiplexing: An extension of pooling in which the proxy dispatches individual queries or transactions to connections in the pool. Because client connections are often idle, multiplexing allows more “active” client connections to Greenplum. The net result is 1) lower total memory use, 2) more active queries processed, and 3) reduced database costs.
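The multiplexing idea above can be sketched in a few lines. This is a minimal illustration, not Heimdall's pooling code: each statement borrows a backend connection only for its own duration, so many idle client sessions can share a small fixed pool.

```python
import queue

class MultiplexingPool:
    """Sketch of connection multiplexing: many client sessions share a
    much smaller pool of database connections; each query borrows a
    connection only while it runs, so idle clients hold no backend
    resources."""

    def __init__(self, make_conn, size):
        self.pool = queue.Queue()
        for _ in range(size):
            self.pool.put(make_conn())   # pool size is fixed up front

    def execute(self, sql, timeout=None):
        conn = self.pool.get(timeout=timeout)   # borrow per statement
        try:
            return conn(sql)
        finally:
            self.pool.put(conn)                 # return immediately after use
```

With a pool of, say, 2 backend connections, hundreds of mostly-idle clients can still issue queries; only concurrently *executing* statements compete for backends.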

Batch Processing

The Heimdall Proxy improves database write performance by batching DML operations (e.g. INSERTs) against a table and running them under a single transaction. Batching DML operations results in:
  1. Improved application response times due to fewer commits
  2. Improved DML scale
Ideal use case:
Inserting, updating, or deleting a large amount of data at once on a single thread. Heimdall can process the whole batch much faster than running the same queries individually outside of a transaction.
Not so ideal use case:
Concurrent writes and reads against the same table on the same thread, since everything will block until the DML operation completes anyway.
If an exception occurs in the transaction, Heimdall determines which query caused the failure, removes that query from the batch, reports the exception in the logs, and reprocesses the batch without it.
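The retry behavior described above can be sketched like this. It is a simplified illustration (using SQLite so it is self-contained), not Heimdall's code: the batch runs under one transaction; if a statement fails, the whole attempt rolls back, the offending statement is dropped and recorded, and the remainder is reprocessed.

```python
import sqlite3

def run_batch(conn, statements):
    """Run DML statements as a single transaction; on failure, drop the
    offending statement, record the error, and reprocess the rest."""
    pending = list(statements)
    errors = []
    while pending:
        bad = None
        try:
            with conn:                       # one commit for the whole batch
                for i, sql in enumerate(pending):
                    try:
                        conn.execute(sql)
                    except sqlite3.Error as exc:
                        bad = i
                        errors.append((sql, str(exc)))  # "report in logs"
                        raise                # roll back this attempt
            return errors                    # whole batch committed at once
        except sqlite3.Error:
            pending.pop(bad)                 # reprocess without the bad query
    return errors
```

Fewer commits is where the speedup comes from: one fsync-backed commit covers the whole batch instead of one per row.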

Automated Persistent Connection Failover

Heimdall’s transparent failover is a must-have for any enterprise that needs Greenplum to be always-on. Heimdall detects a write-master failure and seamlessly fails over to the standby. How are we different? Upon a failure, Heimdall queues up the current request and transparently fails over to the standby master; our proxy persists the application connection. This helps eliminate application errors and exceptions during a failover.
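The "persist the connection, replay the request" idea can be shown with a small sketch. This is illustrative only, assuming each backend is a callable that raises `ConnectionError` when down; it is not Heimdall's failover logic.

```python
import time

def execute_with_failover(sql, backends, hold_seconds=0.0):
    """Sketch of persistent-connection failover: on a backend failure the
    in-flight request is held (queued) and replayed against the next
    backend, so the client-facing connection never surfaces an error."""
    last_error = None
    for backend in backends:                 # e.g. (primary, standby)
        try:
            return backend(sql)
        except ConnectionError as exc:
            last_error = exc
            time.sleep(hold_seconds)         # hold the request during failover
    raise last_error
```

The key point is that the failure is handled between the proxy and the backends; the application-side connection and its pending request survive the switch.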

Hybrid Transactional Analytical Processing (HTAP)

As explained in our technical blog, companies choose Greenplum for its large-scale MPP (Massively Parallel Processing). However, there are some drawbacks:

  1.  Higher latency: many nodes must coordinate to generate an answer
  2.  No support for materialized views
  3.  Frequent queries slow down processing

The ideal solution addresses these drawbacks without requiring code changes, which is exactly what the Heimdall Proxy provides. Heimdall uses a separate Postgres database to maximize Greenplum performance. One benefit is materialized view support.

Fast materialized views are very important in analytics environments. When reports are generated or dashboards are viewed, a subset of data is pulled from the back-end data store and processed. Heimdall provides the following functionality:

  • Queries against a materialized view are routed to an alternate database (e.g. Postgres) acting on behalf of Greenplum. Postgres answers queries, offloading Greenplum.
  • Flexibly configured triggers provide an auto-refresh of the view. Heimdall is aware of updated views from Greenplum and when data was loaded that may impact the view. The net result is faster reports and a lighter load on Greenplum.
In this way, materialized views can be used effectively, with the load of repeated queries distributed to a paired Postgres server, for example against a data set extracted from the analytics database for dashboard use.
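The routing decision can be sketched as follows. This is a toy illustration, not Heimdall's router: real matching would parse the SQL rather than substring-scan it, and the backends here are placeholder callables.

```python
def route(sql, matviews, postgres, greenplum):
    """Sketch of materialized-view offload routing: queries referencing a
    known view go to the paired Postgres instance; everything else goes
    to Greenplum."""
    lowered = sql.lower()
    target = postgres if any(v.lower() in lowered for v in matviews) else greenplum
    return target(sql)
```

Dashboard queries hitting the view never reach Greenplum, while ad-hoc analytics against raw tables are routed through unchanged.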

Visit our technical blog.

Heimdall Data for Greenplum allows customers to deploy applications for both OLTP and OLAP workloads, often called HTAP (Hybrid Transactional/Analytic Processing).

Frequently Asked Questions

Why Heimdall Data for Greenplum?

The Heimdall Proxy improves Greenplum performance by up to 100x. Our proxy is deployed between the application and the data source to improve query response times (e.g. query caching, connection pooling, batching singleton DML, materialized view offload).

How is Heimdall Data deployed?

Heimdall Data is a transparent Database Proxy deployed in two ways:

1) VM, Docker Container or Standalone instance between the application and database

2) Sidecar process, agent, or JDBC driver installed on each application instance to maximize performance and reduce latency.

Deployment requires no application or database changes. Just change the application's connection string or networking settings to route through the Heimdall Database Proxy. SQL performance visibility and optimization are managed through the Heimdall Data central console.
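As an example of what "just change the connection string" means in practice, only the host (and possibly port) in the application's DSN changes. The hostnames below are placeholders for illustration, not Heimdall defaults.

```python
# Before: the application connects straight to Greenplum.
# dsn = "host=greenplum.example.internal port=5432 dbname=analytics user=app"

# After: the same DSN, pointed at the proxy; the application code,
# drivers, and SQL are untouched.
dsn = "host=heimdall-proxy.example.internal port=5432 dbname=analytics user=app"
```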

What are other Heimdall for Greenplum Use Cases?

HTAP (Hybrid Transactional / Analytical Processing):

Problem:  Queries issued by a client may need to trigger actions that perform data synchronization and/or updates to refresh data. This can be used in conjunction with query routing, for example to aggregate or batch data for loading into a large-scale data warehouse. In this case, inserts may be made against a fast front-end database such as Postgres; then, after a period during which no more updates occur, the data is synchronized as a whole into the large-scale data warehouse.

HTAP Solution:  Heimdall implements a generic system whereby actions (SQL calls or external programs) can be triggered while processing a query, and the trigger can execute either before the matching SQL, after it, or in parallel. “In parallel” means in another thread. Further, the parallel execution can be delayed, with repeated calls aggregated into a single call.
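The delayed, aggregated trigger can be sketched with a debounce pattern. This is an illustration of the idea, not Heimdall's trigger engine: each matching query reschedules the action, so a burst of calls collapses into one execution after activity quiets down.

```python
import threading

class DelayedTrigger:
    """Sketch of a delayed, aggregated parallel trigger: each matching
    query schedules the action in another thread; repeated calls inside
    the delay window are coalesced so the action (e.g. one bulk sync)
    runs once after updates stop arriving."""

    def __init__(self, action, delay):
        self.action, self.delay = action, delay
        self._timer, self._lock = None, threading.Lock()

    def fire(self):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()         # aggregate repeated calls
            self._timer = threading.Timer(self.delay, self.action)
            self._timer.start()              # runs in another thread
```

This matches the scenario above: many inserts against the fast front-end each call `fire()`, and the warehouse sync runs once, after the lull.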

Summary: The Heimdall Proxy optimizes Greenplum performance for both query reads and writes.

Batching DML Operations:  Inserting data row-by-row, with aggregation into a bulk insert into the data warehouse when there is a lull in insert activity.  This keeps back-end overhead low, even while large amounts of individual row data are being inserted, potentially by many clients.  Use case: IoT devices feeding large amounts of data into a data warehouse. A front-end database acts as a buffer, with periodic flushes of the data into the warehouse, and no additional logic is needed at the application layer.
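The buffering behavior can be sketched as a small accumulator. This is illustrative only: a Python list stands in for the front-end database, and the flush is triggered here by a size threshold (a lull timer, as described above, could call `flush()` the same way).

```python
class WriteBuffer:
    """Sketch of front-end buffering for warehouse loads: rows accumulate
    in a small buffer (standing in for the front-end database) and are
    flushed as one bulk insert once a size threshold is hit."""

    def __init__(self, bulk_insert, threshold):
        self.bulk_insert, self.threshold = bulk_insert, threshold
        self.rows = []

    def add(self, row):
        self.rows.append(row)
        if len(self.rows) >= self.threshold:
            self.flush()

    def flush(self):
        if self.rows:
            self.bulk_insert(list(self.rows))   # one bulk load, not N inserts
            self.rows.clear()
```

The warehouse sees a few large loads instead of a stream of singleton inserts, which is exactly where MPP systems like Greenplum are efficient.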

Automatic Materialized View Management:  

Heimdall serves as a traffic manager:

  • Queries against a materialized view are routed to an alternate database (e.g. Postgres) acting on behalf of Greenplum. Postgres answers queries, offloading Greenplum. Also, frequent queries of small data sets are better served by a general-purpose Postgres database than by Greenplum. Heimdall routes traffic to the appropriate data source for the best performance.
  • Heimdall triggers a refresh of the view automatically. Heimdall is aware of updated views from Greenplum and when data was loaded that may impact the view. The net result is faster reports and a lighter load on Greenplum.

How does Heimdall help Greenplum with frequent singleton transactions?

Heimdall transparently batches singleton transactions. Details of the solution are found here.

How does query caching work?

Our proxy uses a grid-cache as a look-aside results cache. We provide the caching and invalidation logic to intelligently offload SQL traffic from Greenplum and cache into the grid-cache of your choice (Redis, Amazon ElastiCache, Pivotal GemFire, Hazelcast, GridGain).

For this use case, Heimdall is transparently storing SQL results into cache and serving out results from the corresponding query.

  • Heimdall treats GemFire as a Key/Value pair: Key = query and Value = Result set
  • Heimdall is NOT writing SQL into GemFire nor is GemFire used as a Read-through or Write-through cache.
  • No code changes are required for automatic cache invalidation.
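The key/value mapping described above can be sketched as a key function. This is an illustration of the concept, not Heimdall's implementation: the whitespace-normalized query text plus any bind parameters hashes to the cache key, and the serialized result set would be stored as the value.

```python
import hashlib

def cache_key(sql, params=()):
    """Sketch of the Key = query, Value = result-set mapping: normalize
    the query text so trivially different spellings share one cache
    entry, then hash query + parameters into a fixed-size key."""
    normalized = " ".join(sql.split()).lower()
    return hashlib.sha256(repr((normalized, params)).encode()).hexdigest()
```

Parameterized queries get distinct entries per parameter set, while formatting differences in the SQL text do not fragment the cache.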

How does Heimdall's Automated Failover help Greenplum?

Heimdall supports persistent connection failover from the master to the standby for both unplanned and planned (i.e. maintenance upgrade) outages. Failover is configured at the SQL level. Heimdall monitors the health of each Greenplum instance. If Heimdall detects a failure on the primary node, the current request is queued up, avoiding potential application errors or exceptions. A new backend connection is automatically established to the standby master, and the existing request is completed. Failover is transparent to the application and user. Heimdall supports many databases, including Greenplum, SQL Server, MySQL, Oracle, and Postgres.

What concurrency levels can you expect with the Heimdall solution?

The speed of the application and the achievable concurrency depend on many factors, including hardware, software configuration, number of users, data volumes, etc. However, with Heimdall, we have seen Greenplum performance improve by up to 100x.

What transactional databases does Heimdall support?

Heimdall supports any SQL database (Postgres, MySQL, SQL Server, Oracle, etc.).

How is Heimdall priced?

Heimdall is priced by Greenplum CPU Core. For specific Heimdall Data for Pivotal pricing, please email sales@heimdalldata.com