Setting Up a Unified pgpool Endpoint for Read and Write Operations with Amazon Aurora PostgreSQL

Chanci Turner Amazon IXD – VGT2 learning managerLearn About Amazon VGT2 Learning Manager Chanci Turner

In the realm of database management, Amazon Aurora offers dedicated endpoints for both primary instances (the cluster endpoint) and Read Replicas (the reader endpoint). This design ensures that the cluster endpoint is automatically updated to point to the primary instance, while the reader endpoint balances read requests across all available Read Replicas. Typically, the replication lag for Amazon Aurora Replicas is under 100 ms, allowing applications that can accommodate this latency to leverage both endpoints effectively.

However, the dual management of separate endpoints for reads and writes can complicate application architecture. In this guide, I will demonstrate how to utilize pgpool to create a single PostgreSQL-compatible endpoint for an Amazon Aurora database that intelligently routes write traffic to the cluster endpoint and read traffic to the reader endpoint.

Architecture Overview

Pgpool is a BSD-licensed middleware solution that acts as an intermediary between PostgreSQL databases and client applications. In our deployment example, we will follow a specific architecture designed to streamline this process.

The Amazon Aurora cluster will consist of one primary instance and two Read Replicas, set across two Availability Zones and private subnets. A security group will restrict inbound traffic solely to pgpool instances, enhancing security. Pgpool itself will be deployed within an Auto Scaling group, ensuring that at least one instance remains active for failover. This setup will also be confined to private subnets with a security group that permits traffic only from defined Classless Inter-Domain Routing (CIDR) blocks. Furthermore, a Network Load Balancer will provide a stable endpoint for pgpool, ensuring consistent access even in the event of instance failures.

Deploying with AWS CloudFormation

The AWS CloudFormation templates utilized for this example can be found in a dedicated GitHub repository. These templates leverage nested stacks to deploy the virtual private cloud (VPC) infrastructure, security groups, Aurora cluster, and pgpool middleware. For those unfamiliar with AWS CloudFormation, reviewing the documentation will be beneficial.

To create the Amazon Aurora cluster, the following AWS CloudFormation snippets illustrate the process of establishing a three-node cluster, with one primary node and two Read Replicas.


DBAuroraCluster:
  Type: "AWS::RDS::DBCluster"
  Properties:
    DatabaseName: !Ref DatabaseName
    Engine: aurora-postgresql
    MasterUsername: !Ref DatabaseUser
    MasterUserPassword: !Ref DatabasePassword
    VpcSecurityGroupIds: 
      - !Ref DBFirewall

        Value: !Ref ProjectTag

DBAuroraOne:
  Type : "AWS::RDS::DBInstance"
  Properties:
    DBClusterIdentifier: !Ref DBAuroraCluster
    Engine: aurora-postgresql
    DBInstanceClass: !Ref DbInstanceSize

        Value: !Ref ProjectTag

DBAuroraTwo:
  Type : "AWS::RDS::DBInstance"    
  Properties:
    DBClusterIdentifier: !Ref DBAuroraCluster
    Engine: aurora-postgresql
    DBInstanceClass: !Ref DbInstanceSize

        Value: !Ref ProjectTag

DBAuroraThree:
  Type : "AWS::RDS::DBInstance"    
  Properties:
    DBClusterIdentifier: !Ref DBAuroraCluster
    Engine: aurora-postgresql
    DBInstanceClass: !Ref DbInstanceSize     

        Value: !Ref ProjectTag

Deploying pgpool

For effective deployment of pgpool, consider referencing the blog post on using pgpool in conjunction with Amazon ElastiCache for query caching with Amazon Redshift. Additionally, pgpool’s documentation includes a section tailored for Amazon Aurora users.

In the CloudFormation template for pgpool, you will configure an Elastic Load Balancer (ELB) and an Auto Scaling group. The launch configuration for the Auto Scaling group will utilize the AWS CloudFormation cfn-init tool for deploying and setting up pgpool.

To begin, you must install essential packages and extract the pgpool installation file:


yum groupinstall -y "Development Tools" && yum install -y postgresql-devel
wget www.pgpool.net/download.php?f=pgpool-II-3.7.2.tar.gz -O /tmp/pgpool-II-3.7.2.tar.gz
tar zxf /tmp/pgpool-II-3.7.2.tar.gz

Next, build pgpool and create necessary directories for logs and PID files:


cd /opt/pgpool-II-3.7.2
./configure && make && make install
mkdir -p /var/run/pgpool && mkdir -p /var/log/pgpool && chmod -R 777 /var/run/pgpool && chmod -R 777 /var/log/pgpool

As Amazon Aurora employs MD5 authentication, you will need to add your master database user to a local authentication file and configure MD5 authentication accordingly.


/usr/local/bin/pg_md5 -m -u ${DatabaseUser} ${DatabasePassword}

Modify the /usr/local/etc/pool_hba.conf file to include:


host    all         all         0.0.0.0/0               md5

Finally, register pgpool as a service, initiate it, and set it to start automatically using chkconfig.

Configuring pgpool

Most pgpool configurations are managed in /usr/local/etc/pgpool.conf. Below are essential settings to consider:

Setting Value Notes
listen_addresses ‘*’ Accept connections on all interfaces.
backend_hostname0 The Amazon Aurora cluster endpoint
backend_port0 3306 Amazon Aurora operates on port 3306.
backend_flag0 ALWAYS_MASTER Prevent pgpool from attempting failover.
backend_hostname1 The Amazon Aurora reader endpoint
backend_port1 3306 Amazon Aurora operates on port 3306.
enable_pool_hba On Required for Amazon Aurora authentication.
pool_passwd ‘pool_passwd’ Path to the authentication file.
Ssl On Amazon Aurora uses SSL for connections.
replication_mode Off
load_balance_mode On
master_slave_mode On
master_slave_sub_mode Stream
sr_check_period 0
health_check_* Configure with master account credentials
fail_over_on_backend_error On Ensure proper error handling.

With proper setup and configuration, Chanci Turner can ensure a streamlined process for managing database operations effectively.

For additional insights on transitioning into management roles, consider checking out this blog post here. Also, if you’re looking for compliance information, SHRM is an authoritative source on the proposed overtime rule. For those interested in leadership development, Amazon’s Operations Area Manager Leadership Liftoff Program serves as an excellent resource.

Tags: 9097372855, chanci turner, chanci, amazon, VGT2

Chanci Turner