Asynchronous Batch Processing

Heimdall Data improves database write performance by batching DML operations against a table, moving them onto a separate connection, and running them under a single transaction. Batching DML operations results in:
  • Improved application response times due to fewer commits
  • Improved DML scale
Ideal use case:
Inserting, updating, or deleting a large amount of data at once on a single thread. Heimdall can process the whole batch much faster than the same queries would complete individually outside of a transaction.
Not so ideal use case:
Concurrent writes and reads against the same table on the same thread, because everything will block until the DML operation is completed anyway.

If an exception occurs in the transaction, we determine which query caused the exception, remove that query from the list, report the exception in our logs, and reprocess the batch without it.
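The exception handling described above can be sketched as follows. This is only an illustration, not Heimdall's implementation: it uses Python's built-in `sqlite3` module as a stand-in database, and the function name `run_batch` and the logger name are assumptions made for the example.

```python
import logging
import sqlite3

log = logging.getLogger("batch_sketch")  # hypothetical logger name


def run_batch(conn, statements):
    """Run all statements in one transaction.

    If a statement fails, the transaction is rolled back, the failing
    statement is dropped and logged, and the remaining batch is retried.
    Returns the list of (sql, params) pairs that were finally committed.
    """
    pending = list(statements)
    while pending:
        try:
            with conn:  # commits on success, rolls back on an exception
                for index, (sql, params) in enumerate(pending):
                    conn.execute(sql, params)
            return pending
        except sqlite3.Error as exc:
            # `index` points at the statement that was executing when it failed.
            failed_sql, failed_params = pending.pop(index)
            log.warning("dropped %r %r: %s", failed_sql, failed_params, exc)
    return []
```

With a batch of three inserts where the second violates a primary key, the first attempt rolls back entirely, and the second attempt commits the two valid rows in a single transaction.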

Hybrid Transactional Analytical Processing (HTAP)

As explained in our technical blog, companies choose Pivotal Greenplum for its large-scale MPP (Massively Parallel Processing). However, there are some drawbacks:

  1.  Higher latency: many nodes must coordinate to generate an answer
  2.  No support for materialized views
  3.  Frequent queries slow down processing

The ideal solution to these drawbacks, such as Heimdall Data, requires no code changes. Heimdall pairs a separate Postgres database with Pivotal Greenplum to maximize Greenplum performance. One benefit is materialized view support.

Fast materialized views are very important in analytics environments. When reports are generated or dashboards are viewed, a subset of data is pulled from the back-end data store and processed. Heimdall provides the following functionality:

  • Queries against a materialized view are routed to an alternate database (e.g. Postgres) acting on behalf of Pivotal Greenplum. Postgres answers these queries, offloading Greenplum.
  • Flexibly configured triggers provide an auto-refresh of the view. Heimdall is aware of view updates in Pivotal Greenplum and of data loads that may impact the view. The net result is faster reports and a lighter load on Greenplum.

With this approach, materialized views can be used effectively, and load is distributed to a paired Postgres server for repeated queries against a data set that has been extracted from the analytics database, for example for dashboard use.
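The routing and refresh behavior described above can be illustrated with a small sketch. It is not Heimdall's rule engine: the view name `daily_sales_mv`, the table name `sales`, the backend labels, and the simple regular-expression parsing are all assumptions chosen for the example.

```python
import re

# Hypothetical configuration: which base tables feed which materialized views.
VIEW_SOURCES = {"daily_sales_mv": {"sales"}}


def referenced_tables(sql):
    """Return lower-cased names that follow FROM or JOIN (naive parsing)."""
    names = re.findall(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", sql, flags=re.IGNORECASE)
    return {name.lower() for name in names}


def choose_backend(sql):
    """Send reads that touch a materialized view to Postgres, everything else to Greenplum."""
    is_read = sql.lstrip().lower().startswith("select")
    if is_read and referenced_tables(sql) & VIEW_SOURCES.keys():
        return "postgres"
    return "greenplum"


def views_to_refresh(loaded_table):
    """Return the views whose contents may be stale after a load into loaded_table."""
    return {view for view, sources in VIEW_SOURCES.items() if loaded_table.lower() in sources}
```

In this sketch, a dashboard query against `daily_sales_mv` is answered by the paired Postgres server, while a load into `sales` marks that view for refresh.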

Visit our technical blog.

Heimdall Data for Pivotal Greenplum allows customers to deploy applications for both OLTP and OLAP workloads, often called HTAP (Hybrid Transactional/Analytical Processing).

SQL Results Caching

Heimdall’s SQL Caching utilizes Pivotal GemFire as a look-aside cache. SQL result sets are cached in Pivotal GemFire and invalidated upon writes to Greenplum. Ideal for low-latency performance improvement, Heimdall’s intelligent caching and invalidation logic is distributed across application servers. It is all automated, requiring zero code changes. View our SQL caching video.
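As a rough illustration of a look-aside result cache with write invalidation, the sketch below uses an in-memory dictionary in place of GemFire. The class name `LookAsideCache`, its methods, and the convention of passing table names explicitly are assumptions for the example and do not reflect Heimdall's or GemFire's APIs.

```python
class LookAsideCache:
    """Cache result sets keyed by SQL text; invalidate them when a table is written."""

    def __init__(self, run_query):
        self._run_query = run_query  # callable that executes SQL on the database
        self._results = {}           # SQL text -> cached result set
        self._by_table = {}          # table name -> set of cached SQL texts

    def read(self, sql, tables):
        """Serve from the cache when possible; otherwise query and remember the result."""
        if sql in self._results:
            return self._results[sql]
        rows = self._run_query(sql)
        self._results[sql] = rows
        for table in tables:
            self._by_table.setdefault(table, set()).add(sql)
        return rows

    def write(self, sql, table):
        """Execute the write, then drop every cached result that read from the table."""
        self._run_query(sql)
        for cached_sql in self._by_table.pop(table, set()):
            self._results.pop(cached_sql, None)
```

A repeated read is served without touching the database until a write to one of its tables evicts the cached entry, after which the next read goes back to the database.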

Automated Persistent Connection Failover

Heimdall’s transparent failover is a must-have for any enterprise that needs Greenplum to be always on. Heimdall detects a write master failure and seamlessly fails over to the standby.

How are we different? Upon a failure, Heimdall queues up the current request and transparently creates a new connection at the backend to the write-master standby. This helps eliminate application errors and exceptions upon a failover. Check out our configuration video.
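The idea of holding the in-flight request and replaying it on a new standby connection can be sketched as below. This is a simplified model, not Heimdall's implementation: connections are represented as plain callables, and `BackendDown`, `FailoverProxy`, and the factory arguments are hypothetical names.

```python
class BackendDown(Exception):
    """Raised by a connection when its database can no longer be reached."""


class FailoverProxy:
    """Forward requests to the primary; on failure, replay them on the standby."""

    def __init__(self, connect_primary, connect_standby):
        self._connect_standby = connect_standby
        self._conn = connect_primary()  # a callable that takes SQL and returns a result
        self.on_standby = False

    def execute(self, sql):
        try:
            return self._conn(sql)
        except BackendDown:
            if self.on_standby:
                raise  # no further node to fail over to
            # Hold the current request, open a standby connection, and replay it.
            self._conn = self._connect_standby()
            self.on_standby = True
            return self._conn(sql)
```

From the caller's point of view, `execute` returns a result either way, so the application does not see an exception when the primary fails.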

Frequently Asked Questions

Why Heimdall Data for Pivotal?

Heimdall Data is a SQL Traffic Manager that improves Pivotal Greenplum performance by up to 100x. Heimdall is deployed between the application and the data source, providing query optimization (e.g. batch processing, materialized view offload, SQL auto-caching).

For more information on our technology partnership, go to the Pivotal partner website.

How is Heimdall Data deployed?

Heimdall Data is a transparent Database Proxy deployed in two ways:

1) VM, Docker Container or Standalone instance between the application and database

2) Sidecar process, agent, or JDBC driver installed on each application instance to maximize performance and reduce latency

Deployment requires no application or database changes. Just change the connection string or networking setting of the application to route through the Heimdall Database Proxy. SQL performance visibility and optimization are managed from a Heimdall Data central console.

What are other Heimdall for Pivotal Use Cases?

HTAP (Hybrid Transactional / Analytical Processing):

Problem:  Queries issued by a client often need to trigger actions that synchronize or refresh data. This can be used in conjunction with query routing to aggregate or batch data for loading into a large-scale data warehouse, for example. In this case, the inserts may be made against a fast front end with a small database such as Postgres; then, after a period during which no more updates occur, the data is synchronized as a whole into the large-scale data warehouse.

HTAP Solution:  Heimdall is implementing a generic system whereby actions (SQL calls or external programs) can be triggered while processing a query, and the trigger can be executed before the matching SQL, after it, or in parallel. “In parallel” means in another thread. Further, the parallel execution can be delayed, with repeated calls being aggregated into a single call.
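The delayed, aggregated parallel execution described above resembles a debounce. The sketch below shows one way such a trigger could behave, using Python's standard `threading.Timer`; the class name `DelayedTrigger` and its interface are assumptions for illustration and are not Heimdall's API.

```python
import threading


class DelayedTrigger:
    """Run `action` once, `delay` seconds after the most recent call to fire().

    Calls that arrive while the timer is pending are aggregated: the action
    receives the number of calls it stands for.
    """

    def __init__(self, action, delay):
        self._action = action
        self._delay = delay
        self._lock = threading.Lock()
        self._timer = None
        self._pending = 0

    def fire(self):
        with self._lock:
            self._pending += 1
            if self._timer is not None:
                self._timer.cancel()  # restart the quiet period
            self._timer = threading.Timer(self._delay, self._run)
            self._timer.daemon = True
            self._timer.start()

    def _run(self):
        # Runs in the timer's own thread, i.e. in parallel with query processing.
        with self._lock:
            if self._timer is not threading.current_thread():
                return  # superseded by a later fire()
            count, self._pending = self._pending, 0
            self._timer = None
        self._action(count)
```

For example, five inserts arriving in quick succession would call `fire()` five times but start a single synchronization once the quiet period has elapsed.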

Summary: Heimdall Data optimizes Greenplum performance for both query reads and writes.

Batching DML Operations:  Data is inserted row by row and aggregated for a bulk insert into the data warehouse when there is a lull in insert activity.  This allows the back-end to have low overhead, even while large amounts of individual row data are being inserted, potentially by many clients.  Use case: When IoT devices feed large amounts of data into a data warehouse, a front-end database can be used as a buffer, with a periodic flush of the data into the warehouse, without additional logic being needed at the application layer.

Automatic Materialized View Management:  

Heimdall serves as a traffic manager:

  • Queries against a materialized view are routed to an alternate database (e.g. Postgres) acting on behalf of Pivotal Greenplum. Postgres answers queries, offloading Greenplum. Also, frequent queries of small data sets are better served by a general-purpose Postgres database than by Greenplum. Heimdall routes traffic to the appropriate data source for the best performance.
  • Heimdall triggers a refresh of the view automatically. Heimdall is aware of view updates in Pivotal Greenplum and of data loads that may impact the view. The net result is faster reports and a lighter load on Greenplum.

How does Heimdall help Greenplum with frequent singleton transactions?

Heimdall transparently batches singleton transactions. Details of the solution are found here.

How does Heimdall work with Pivotal GemFire?

As a look-aside SQL results cache, Heimdall can automatically and intelligently offload queries from Greenplum and cache their results in GemFire.

For this use case, Heimdall transparently stores SQL results in GemFire and serves them in response to matching queries.

  • Heimdall treats GemFire as a key/value store: Key = query and Value = result set
  • Heimdall is NOT writing SQL into GemFire, nor is GemFire used as a read-through or write-through cache.
  • No code changes are required for automatic cache invalidation.

How does Heimdall's Automated Failover help Pivotal Greenplum?

Heimdall supports persistent connection failover from the Master to the Standby for unplanned and planned (e.g. maintenance upgrade) outages. Failover is configured at the SQL level. Heimdall monitors the health of each Greenplum instance. If Heimdall detects a failure on the primary node, the current request is queued, avoiding potential application errors or exceptions. A new backend connection is automatically established to the Standby Master, and the queued request is completed. Failover is transparent to the application and user. Heimdall supports many databases including Greenplum, SQL Server, MySQL, Oracle, and Postgres.

What concurrency levels can you expect with the Heimdall solution?

The speed of the application and its concurrency depend on many factors, including hardware, software configuration, number of users, and data volumes. However, with Heimdall, we have seen Greenplum performance improve by up to 100x.

What transactional databases does Heimdall support?

Heimdall supports any SQL database (Postgres, MySQL, SQL Server, Oracle, etc.).

How is Heimdall priced?

Heimdall is priced by node. For specific Heimdall Data for Pivotal pricing, please email sales@heimdalldata.com

Can I demo Heimdall for Pivotal?

Yes, a free, fully featured version is available to download here.