- How It Works
- Before You Start
- Step 1: Heimdall Installation
- Step 2: Logging In
- Step 3: Heimdall Configuration
- Step 4: Application Configuration
How It Works
The Heimdall Data Access Platform (HDAP) consists of two components: the Heimdall Central Manager (HCM) and the Heimdall Data Access Layer (HDAL). These two components work together to provide your application with a wide variety of functionality, including:
- load balancing & HA;
- query redirection;
- SQL security.
The HCM provides configuration management, logging, analytics, and, in some configurations, centralized HA failover. The HDAL provides all other functionality and scales to hundreds of nodes. The two components follow a classic data plane/management plane split, so the data plane keeps operating without interruption if the management plane becomes unavailable. The HDAL can operate either as a proxy (centralized or distributed) or as a JDBC driver. When possible, the JDBC driver should be used for improved performance and lower overhead, as well as greater compatibility with a wider variety of databases. In proxy mode, two deployment options are supported:
- A distributed mode, where a proxy resides on each application server, for optimal performance, or
- A centralized mode, with one proxy servicing many application servers.
The second (centralized) configuration, with the HCM and HDAL on the same server, is the easiest to bring online for evaluation purposes and is the one described in this guide.
Before You Start
In order to complete this quick start guide, you will need:
- Database type: In proxy mode, Heimdall supports recent versions of MySQL (5.5+ and compatible, including Amazon Aurora) and PostgreSQL (9.4+). In JDBC mode, any database with a JDBC-compliant driver is supported; primary testing has covered MS SQL Server and Oracle.
- How to configure your application’s database settings. For a Java application, this most often includes the JDBC URL and JDBC driver class. For non-Java environments, this will include the hostname and port of the database, as well as the “database” instance the application connects to.
- Database authentication credentials. If an application uses multiple credentials to access the database, this is supported in JDBC mode, but currently not in Proxy mode.
- A server/VM/Docker host to install the HDAP/HCM on. The general requirement is a system that can run Java 6+ and has at least two cores and 3 GB of RAM available. The preferred OS is any of the common Linux environments (Ubuntu, CentOS, Red Hat, Amazon Linux, and Oracle Linux); Ubuntu 16.04 LTS is the most tested and therefore the preferred environment. In AWS environments, the smallest suggested instance type is a t2.medium.
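The JDBC URL and the hostname/port/database triple mentioned above carry the same information in two forms. As a rough illustration, the sketch below splits a URL-style JDBC string into its parts. The function name `split_jdbc_url` and the example hostnames are made up for this illustration, and it only handles the `jdbc:<subprotocol>://host:port/database` shape used by the MySQL and PostgreSQL drivers; other drivers (for example SQL Server or Oracle) use different URL layouts.

```python
from urllib.parse import urlsplit

def split_jdbc_url(jdbc_url):
    """Split a URL-style JDBC string into (subprotocol, host, port, database).

    Hypothetical helper for illustration; not part of Heimdall.
    """
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError("not a JDBC URL: " + jdbc_url)
    # Drop the "jdbc:" prefix so the remainder parses like an ordinary URL,
    # e.g. mysql://db.example.com:3306/shop
    parts = urlsplit(jdbc_url[len("jdbc:"):])
    database = parts.path.lstrip("/") or None
    return parts.scheme, parts.hostname, parts.port, database

print(split_jdbc_url("jdbc:mysql://db.example.com:3306/shop"))
print(split_jdbc_url("jdbc:postgresql://10.0.0.5/odoo"))
```

When no port appears in the URL, the port comes back as `None`, meaning the driver's default port applies.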
Step 1: Heimdall Installation
Please follow the instructions on the download page to install the system in your environment. These instructions will install the HDAP, including the HCM and HDAL and any required dependencies.
Step 2: Logging In
Once the HCM has been installed, the first step is to log in. Connect to your server’s IP on port 8087 (http://<ip>:8087), and you should be presented with a login screen as shown:
Various help links can be found at the bottom of the page, including help for logging in, this quick-start guide, and a link to our YouTube channel, with video guides that may assist in using Heimdall.
The default username will be “admin” and the password will be the AWS Instance ID, or “heimdall” if not in AWS. For additional login help, please see Logging in.
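If the login page does not load, it can help to confirm that the HCM port is reachable before troubleshooting credentials. The following is a minimal sketch using only the Python standard library; the helper names `hcm_url` and `port_open` are illustrative, and the only value taken from this guide is port 8087.

```python
import socket

HCM_PORT = 8087  # HCM web port as given in this guide

def hcm_url(host):
    """Build the HCM login URL for a given server address."""
    return "http://%s:%d" % (host, HCM_PORT)

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (192.0.2.10 is a documentation-only placeholder address):
#   print(hcm_url("192.0.2.10"), port_open("192.0.2.10", HCM_PORT))
```

A `False` result usually points at a firewall or security-group rule rather than at Heimdall itself.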
Step 3: Heimdall Configuration
The steps below provide instructions using the detailed configuration pages. Within the GUI, there is also a Wizard that provides a step-by-step walk-through for configuring the system and generates customized instructions based on the options selected.
In order for an application to use the HDAL, several steps need to be performed to prepare the configuration:
Configure a JDBC Driver (generally optional)
The driver configuration provides the back-end access to the database via JDBC. Even when using the proxy, this is required, as the wire-line protocol is converted to JDBC for connectivity to the back-end. By default, the system comes with a small library of drivers, so this step can often be skipped if an appropriate driver is already loaded. In proxy mode, only the official MySQL or PostgreSQL drivers should be used, as internal functions unique to them are leveraged as part of the proxy functionality. A driver can be reused between many data sources, and only needs to be set up once for a given driver version. Example MySQL configuration:
Configure the Data Source
A data source points to a driver to use, and then provides information about the actual connection, including the JDBC URL to connect to the back-end database, username and password to use for monitoring, and a monitoring query, normally “SELECT 1” or “SELECT 1 FROM DUAL”. The test source button can be used to verify that the settings are correct and that a connection can be established to the server.
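The choice between the two monitoring queries depends on the database: Oracle traditionally requires a `FROM` clause, hence `SELECT 1 FROM DUAL`, while MySQL, PostgreSQL, and SQL Server accept a bare `SELECT 1`. A small sketch of that decision follows; `monitor_query` is a hypothetical helper and the URLs are placeholders, not values taken from a Heimdall install.

```python
def monitor_query(jdbc_url):
    """Pick a lightweight health-check query from the JDBC subprotocol.

    Illustrative only: the HCM lets you type the query directly.
    """
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError("not a JDBC URL: " + jdbc_url)
    subprotocol = jdbc_url.split(":")[1].lower()
    return "SELECT 1 FROM DUAL" if subprotocol == "oracle" else "SELECT 1"

print(monitor_query("jdbc:mysql://db.example.com:3306/shop"))
print(monitor_query("jdbc:oracle:thin:@//db.example.com:1521/ORCL"))
```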
Additionally, the data source allows configuration of JDBC parameters that should be used to provide expected behavior from the JDBC driver to match the upstream driver, in particular when using the MySQL or PostgreSQL proxy. The sample data sources provide examples for Odoo (PostgreSQL) and WordPress (MySQL). The resulting configuration properties will be printed in the log on a configuration change (see Debugging below). Example of a data source configuration:
Configure the Virtual Database (VDB)
Finally, a VDB should be configured. The only required item to set in a VDB is to select what access mode to use (JDBC, MySQL Proxy, or PostgreSQL Proxy) and what default data source to use:
When the proxy mode is configured, this will allow mixed JDBC+proxy access, but if JDBC is selected, then ONLY JDBC access is allowed.
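The access rule above can be summarized as a small lookup table. This is only a restatement of the paragraph in code form; the mode names mirror the GUI labels quoted in this guide, while the client labels (`jdbc`, `mysql-wire`, `postgresql-wire`) and the function name are invented for the sketch.

```python
# VDB access mode -> kinds of client connection it accepts
ACCESS_MODES = {
    "JDBC": {"jdbc"},                                  # JDBC only
    "MySQL Proxy": {"jdbc", "mysql-wire"},             # mixed JDBC + proxy
    "PostgreSQL Proxy": {"jdbc", "postgresql-wire"},   # mixed JDBC + proxy
}

def client_allowed(vdb_mode, client_kind):
    """Return True if a client of the given kind may use a VDB in this mode."""
    return client_kind in ACCESS_MODES[vdb_mode]

print(client_allowed("MySQL Proxy", "jdbc"))   # proxy mode still accepts JDBC
print(client_allowed("JDBC", "mysql-wire"))    # JDBC mode rejects wire-protocol clients
```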
If using the MySQL or PostgreSQL proxy, the proxy behavior needs to be configured. For a consolidated configuration, where the HCM and HDAL are on the same system, the “Management Server Proxy” option in the VDB settings can be enabled so that the management server starts and restarts the proxy as necessary. In Proxy mode, an option to “Restart Proxy” is also available, which allows all proxies connected to that VDB to be restarted from the HCM. Example proxy configuration with an HCM managed MySQL proxy:
If the HDAL and the Application are installed together, the “Localhost Only” option can be selected, which will trigger any proxy to ONLY listen to localhost connections. This is recommended in a distributed deployment to improve security.
Finally, in most cases, the “Proxy-auth Enabled” option should be selected, which allows you to specify the credentials the application will use to authenticate to the proxy. If this option is not selected, then any combination of username and password will be accepted. In proxy mode, the connection from the proxy to the database will be as configured in the data source.
Best practices for VDB & Proxy Configuration:
- Make the VDB name match the database instance name you are connecting to on the back-end. In some cases, queries can be formed that use the front-end database name, and if this doesn’t match the back-end database instance, then the queries will fail.
- Match the username used to connect to the proxy to the username used in the data source. Again, this helps ensure that there isn’t a mismatch at the query level that results in unexpected behavior.
- Use the Localhost option whenever possible to secure your proxy install. Be aware that in many applications, use of the name “localhost” will trigger attempts to connect via Unix sockets, while use of “127.0.0.1” for a hostname will leverage TCP connections. As Heimdall listens on TCP, this is important for application connections to work properly.
- When possible, use the same port for the proxy as the database, EXCEPT when multiple proxies need to run on the same server, such as on the server the HCM is operating on. As only one proxy can run on the same port, this case requires using a unique port per proxy.
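Two of the practices above are easy to check mechanically: that no two proxies on one host share a port, and that the application connects with `127.0.0.1` rather than `localhost`. The sketch below assumes a made-up inventory format of `(vdb_name, host, port)` tuples; neither function is a Heimdall API.

```python
from collections import defaultdict

def port_conflicts(proxies):
    """Return {(host, port): [vdb names]} for endpoints claimed by more than one proxy.

    proxies: iterable of (vdb_name, host, port) tuples (hypothetical format).
    """
    by_endpoint = defaultdict(list)
    for name, host, port in proxies:
        by_endpoint[(host, port)].append(name)
    return {ep: names for ep, names in by_endpoint.items() if len(names) > 1}

def tcp_host(host):
    """Map 'localhost' to '127.0.0.1' so clients use TCP instead of a Unix socket."""
    return "127.0.0.1" if host == "localhost" else host

proxies = [
    ("shop", "hcm-server", 3306),
    ("blog", "hcm-server", 3306),   # clashes with "shop"
    ("crm", "hcm-server", 3307),
]
print(port_conflicts(proxies))
print(tcp_host("localhost"))
```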
In the HDAL, caches are managed on a per-VDB level, so each VDB can have different behaviors. For caching to actually occur, however, two conditions must be met:
- The VDB must have a cache enabled, configured, and initialized;
- A Rule list must exist, be enabled, and have a cache policy configured (see Rule Configuration below).
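When results are not being cached, it is usually because one of these two conditions is not met. The check can be written out as a small predicate; the dictionary layout used here (`cache`, `enabled`, `rules`, `action`) is an assumed shape for illustration and does not reflect Heimdall's actual configuration format.

```python
def caching_can_occur(vdb, rule_lists):
    """Return True only if both conditions from the list above hold."""
    cache = vdb.get("cache", {})
    # Condition 1: the VDB cache is enabled, configured, and initialized.
    cache_ready = all(cache.get(k) for k in ("enabled", "configured", "initialized"))
    # Condition 2: at least one enabled rule list contains a cache rule.
    rule_ready = any(
        rl.get("enabled") and any(r.get("action") == "cache" for r in rl.get("rules", []))
        for rl in rule_lists
    )
    return bool(cache_ready and rule_ready)

vdb = {"cache": {"enabled": True, "configured": True, "initialized": True}}
rule_lists = [{"enabled": True, "rules": [{"action": "cache", "ttl_ms": 60000}]}]
print(caching_can_occur(vdb, rule_lists))
print(caching_can_occur(vdb, []))
```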
At a minimum, simply checking “Enable Query Caching” and selecting “Local only”, and specifying a size (in bytes) is sufficient to configure and initialize the cache into a usable state. In most cases however, the “auto-tune cache” and “Serialize” options will be desired, as shown:
This is a usable configuration in the case where there is only one HDAL instance, as no synchronized invalidation is necessary between HDAL instances, which is a common situation in evaluation. The auto-tune option allows the HDAL to decide what potentially cacheable content should NOT be cached for optimal performance, as certain types of queries often deliver a very low cache hit rate and thus do not benefit from caching. The Serialize option is necessary to properly manage memory usage in Java, and also dramatically reduces the amount of memory used by each response, but at the cost of slower performance when returning a result. In some cases, disabling the Serialize option MAY provide better performance, but care must be taken when doing so, and garbage collection stats should be monitored over time.
When using more than one HDAL instance, it is recommended that the Hazelcast or Redis grid cache be used. At this time, while still supported, the Memcached and JCache interfaces should be considered deprecated.
To configure Hazelcast, at a minimum, an API Cache Name must be selected. It defaults to “heimdall” and uniquely identifies the cache map that will be used to store the VDB’s cache. Additionally, it is recommended that the Serialize and Grid Cache Offload options be selected, as shown:
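To make the auto-tune idea concrete, here is one way such a decision could be made: track the hit rate per query pattern and stop caching patterns that rarely hit. This is an illustration of the general technique only. Heimdall's actual auto-tune logic is not described in this guide, and the class name, thresholds, and pattern strings below are assumptions.

```python
class HitRateTracker:
    """Track cache hit rates per query pattern and flag low-value patterns."""

    def __init__(self, min_samples=100, min_hit_rate=0.05):
        self.min_samples = min_samples      # lookups needed before judging a pattern
        self.min_hit_rate = min_hit_rate    # below this, caching is not worthwhile
        self.stats = {}                     # pattern -> [hits, lookups]

    def record(self, pattern, hit):
        entry = self.stats.setdefault(pattern, [0, 0])
        if hit:
            entry[0] += 1
        entry[1] += 1

    def should_cache(self, pattern):
        hits, lookups = self.stats.get(pattern, (0, 0))
        if lookups < self.min_samples:
            return True                     # not enough data yet, keep caching
        return hits / lookups >= self.min_hit_rate

tracker = HitRateTracker(min_samples=4, min_hit_rate=0.5)
for hit in (False, False, False, True):
    tracker.record("SELECT * FROM sessions WHERE id = ?", hit)
for hit in (True, True, True, False):
    tracker.record("SELECT * FROM products WHERE sku = ?", hit)
print(tracker.should_cache("SELECT * FROM sessions WHERE id = ?"))
print(tracker.should_cache("SELECT * FROM products WHERE sku = ?"))
```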
The “Cache Configuration File” option allows specifying the Hazelcast XML configuration file, relative to the install directory of the HDAL. This configuration file is documented here, and is optional for most simple and evaluation deployments. In AWS, as multicast discovery is not supported, AWS discovery must at a minimum be configured via this XML file. This is documented here. On HDAL initialization, the Hazelcast configuration in use will be printed out to assist in debugging (see Debugging below).
For Redis installs, the process is very similar, but the Redis server hostname, port, and (optionally) password are specified:
The Redis Cluster API is necessary in some instances, such as with AWS ElastiCache Redis clusters, but is not necessary with Redis Labs Enterprise clusters.
For all grid caches, the “Grid Cache Offload” option is provided. It enables an important performance feature: a local in-HDAL cache in addition to the grid cache. This gives a multi-tier cache, where the hottest content is stored locally while all content is ingested into the grid cache in the background. On a cold restart, the application can repopulate its local cache from the grid cache, significantly reducing or preventing an overload on the database during large-scale restarts. There can still be a performance penalty when using a remote cache, however, if cached content has a low hit rate, resulting in frequent traversals of the network. The “Auto-Tune Cache” option will help mitigate this effect.
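The multi-tier behavior can be sketched with a small local LRU cache in front of a shared store. In this sketch a plain dictionary stands in for the grid cache (Hazelcast or Redis), the write to the grid happens inline rather than in the background, and the class name and capacity are made up; it shows the lookup order and the cold-restart behavior, not Heimdall's implementation.

```python
from collections import OrderedDict

class TwoTierCache:
    """Local LRU tier in front of a shared 'grid' tier (any dict-like object)."""

    def __init__(self, grid, local_capacity=2):
        self.grid = grid
        self.local = OrderedDict()
        self.local_capacity = local_capacity

    def _remember_locally(self, key, value):
        self.local[key] = value
        self.local.move_to_end(key)
        while len(self.local) > self.local_capacity:
            self.local.popitem(last=False)     # evict the least recently used entry

    def get(self, key):
        if key in self.local:                  # hottest content: no network hop
            self.local.move_to_end(key)
            return self.local[key]
        if key in self.grid:                   # fall back to the grid tier
            value = self.grid[key]
            self._remember_locally(key, value)
            return value
        return None                            # miss: caller queries the database

    def put(self, key, value):
        self._remember_locally(key, value)
        self.grid[key] = value                 # a real system would do this asynchronously

grid = {}
first = TwoTierCache(grid)
first.put("SELECT 1", [(1,)])

restarted = TwoTierCache(grid)                 # cold restart: local tier is empty
print(restarted.get("SELECT 1"))               # served from the grid, not the database
print("SELECT 1" in restarted.local)
```

After the restart, the first lookup is answered by the grid tier and the entry is promoted into the local tier, which is the effect the paragraph above describes.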
Rule Configuration
Once an application is verified to be operating correctly via the driver without any advanced functions in place, additional features can be enabled. The most common of these are caching and logging. To add caching in a single-server install, or one where grid caching has been configured, the first step is to add a new rule list with a rule that matches everything (i.e. an empty regular expression), set the action to cache, and then set a TTL to a reasonable value. As Heimdall handles invalidation automatically in most cases, such a rule will attempt to cache everything, and only serve content deemed fresh. Optionally, a rule can be set to only operate outside of transactions, which can help guarantee the desired results are returned to the application.
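An empty regular expression matches every query because it finds a zero-length match at the start of any string. The sketch below shows that behavior together with a TTL-based freshness check and the outside-of-transactions option. `CacheRule` and `TtlCache` are hypothetical classes written for this example; automatic invalidation, which Heimdall handles internally, is not modeled here.

```python
import re
import time

class CacheRule:
    """A cache rule: regex to match, TTL, and optional transaction restriction."""

    def __init__(self, pattern, ttl_seconds, outside_transactions_only=False):
        self.regex = re.compile(pattern)
        self.ttl_seconds = ttl_seconds
        self.outside_transactions_only = outside_transactions_only

    def applies(self, sql, in_transaction=False):
        if self.outside_transactions_only and in_transaction:
            return False
        return self.regex.search(sql) is not None

class TtlCache:
    """Serve an entry only while it is younger than its TTL."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.entries = {}                      # sql -> (result, expires_at)

    def put(self, sql, result, ttl_seconds):
        self.entries[sql] = (result, self.clock() + ttl_seconds)

    def get(self, sql):
        entry = self.entries.get(sql)
        if entry is not None and entry[1] > self.clock():
            return entry[0]
        return None                            # missing or stale

match_all = CacheRule("", ttl_seconds=60, outside_transactions_only=True)
print(match_all.applies("SELECT * FROM orders"))                       # empty regex matches
print(match_all.applies("SELECT * FROM orders", in_transaction=True))  # skipped in a transaction
```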
The next rule to add to the rule list is a log rule. If logging of SQL is not specified at the VDB level, in the logging section, then this allows one to selectively log traffic for analytics. If both a VDB level and a rule level log directive is found, the entry will only be logged once. Example of a rule-list with a simple log and cache rule configured:
Step 4: Application Configuration
Please see the detailed document on how to install the HDAL into an application.
Debugging
To diagnose issues in Heimdall, please see Heimdall Logging, including how to enable the debug option at the VDB level. Additionally, please see our ongoing list of common problems and resolutions.