Integrating Amplify DataStore with Pre-existing SQL Databases: Enhancing Offline and Synchronization Capabilities for Your Application

Chanci Turner Amazon IXD – VGT2 learning

Author: Chanci Turner and Alex Johnson

Date: April 22, 2021

Categories: Amazon Aurora, Amazon RDS, AWS Amplify, AWS AppSync, AWS Lambda, Best Practices, Database, Front-End Web & Mobile, Technical How-to, Thought Leadership, Top Posts

In this blog post, Chanci Turner – Principal Solution Architect – AWS AppSync, explores how to effectively utilize Amplify DataStore, a library that enables developers to manage shared and distributed data seamlessly, even in offline scenarios. Utilizing the Amplify CLI, developers can effortlessly create new applications that interface with AWS AppSync while establishing Amazon DynamoDB for their DataStore needs. However, there are instances when developers aim to leverage DataStore to create offline-first applications that also utilize existing databases in their AWS infrastructure or on-premises setups. This article will guide you through the process of utilizing the Amplify CLI to construct an AppSync API for your DataStore application that connects to an Aurora MySQL database via an AppSync Lambda resolver and an Amazon RDS Proxy.

Getting Started

We will establish a backend that interacts with two tables, Posts and Comments, in a pre-existing Aurora MySQL database. These tables support a blog and contain existing data. Notably, the database is restricted to VPC access and does not allow public entry. A configured RDS Proxy with IAM database authentication will enable a Lambda function to connect to the database.

CREATE TABLE Posts (
  id BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
  title TEXT NOT NULL,
  content LONGTEXT,
  post_status VARCHAR(20),
  createdAt DATETIME(3) NOT NULL,
  updatedAt DATETIME(3) NOT NULL,
  PRIMARY KEY (id)
);

CREATE TABLE Comments (
  id BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
  postID BIGINT(20) UNSIGNED NOT NULL, 
  content TEXT,
  createdAt DATETIME(3) NOT NULL,
  updatedAt DATETIME(3) NOT NULL,
  PRIMARY KEY (id),
  KEY (postID),
  CONSTRAINT `comments_posts_ibfk_1` FOREIGN KEY (`postID`) REFERENCES `Posts`(`id`)
);

Preparing the Database

To enhance version control, we need to add metadata columns to these tables. Introducing a _datastore_uuid column will store the ID generated by DataStore when a new item is created. This column will be mapped back to the id field when the data is retrieved by the client. It also plays a crucial role in updating operations, as it helps in retrieving and locking a row. The original id field will not be returned to the client.

Further details about the columns _version, _deleted, _lastChangedAt, and _ttl can be found in the Conflict Detection and Sync documentation. Additionally, I will update the createdAt and updatedAt columns to automatically record the current timestamp on creation and modification.

ALTER TABLE Posts
ADD `_datastore_uuid` VARCHAR(36) NOT NULL DEFAULT '0',
ADD `_version` INT(11) UNSIGNED DEFAULT 1,
ADD `_deleted` TINYINT(1) DEFAULT 0,
ADD `_lastChangedAt` DATETIME(3) NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
ADD `_ttl` DATETIME(3) NULL DEFAULT NULL,

ADD INDEX (`_datastore_uuid`),

CHANGE createdAt createdAt TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
CHANGE updatedAt updatedAt TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3);

ALTER TABLE Comments
ADD `_datastore_uuid` VARCHAR(36) NOT NULL DEFAULT '0',
ADD `_version` INT(11) UNSIGNED DEFAULT 1,
ADD `_deleted` TINYINT(1) DEFAULT 0,
ADD `_lastChangedAt` DATETIME(3) NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
ADD `_ttl` DATETIME(3) NULL DEFAULT NULL,

ADD INDEX (`_datastore_uuid`),

CHANGE createdAt createdAt TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
CHANGE updatedAt updatedAt TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3);

Two new Delta Sync tables will be established to monitor changes to our versioned data sources. These Delta Sync tables will optimize the process of incremental updates. When the client reconnects, it will query the AppSync API to synchronize any changes that occurred during its offline period. AppSync will, in turn, invoke the Lambda resolver to fetch the relevant data. The Lambda function will then ascertain whether to synchronize from the primary tables or to return change records from the Delta Sync tables for a more efficient update.

CREATE TABLE `DeltaSyncPosts` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `fromId` bigint(20) unsigned NOT NULL,
  `title` text NOT NULL,
  `content` longtext,
  `post_status` varchar(20) DEFAULT NULL,
  `createdAt` timestamp(3) NOT NULL,
  `updatedAt` timestamp(3) NOT NULL,
  
  `_datastore_uuid` varchar(36) NOT NULL DEFAULT '0',
  `_version` int(11) unsigned DEFAULT '1',
  `_deleted` tinyint(1) DEFAULT '0',
  `_lastChangedAt` datetime(3) NOT NULL,
  `_ttl` datetime(3) NOT NULL,
  
  PRIMARY KEY (`id`),
  KEY `_datastore_uuid` (`_datastore_uuid`),
  KEY `fromId` (`fromId`,`_lastChangedAt`,`_version`)
);

CREATE TABLE `DeltaSyncComments` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `fromId` bigint(20) unsigned NOT NULL,
  `postID` bigint(20) unsigned NOT NULL,
  `content` text,
  `createdAt` timestamp(3) NOT NULL,
  `updatedAt` timestamp(3) NOT NULL,
  
  `_datastore_uuid` varchar(36) NOT NULL DEFAULT '0',
  `_version` int(11) unsigned DEFAULT '1',
  `_deleted` tinyint(1) DEFAULT '0',
  `_lastChangedAt` datetime(3) NOT NULL,
  `_ttl` datetime(3) NOT NULL,
  
  PRIMARY KEY (`id`),
  KEY `_datastore_uuid` (`_datastore_uuid`),
  KEY `fromId` (`fromId`,`_lastChangedAt`,`_version`),
  KEY `comments_posts_ibfk_2` (`postID`),
  CONSTRAINT `comments_posts_ibfk_2` FOREIGN KEY (`postID`) REFERENCES `Posts` (`id`)
);

Lastly, I will configure an event to remove deleted items on a daily schedule. This can be turned off or adjusted as needed. For your Aurora MySQL database, ensure that the event_scheduler is enabled in your parameter group.

delimiter |

CREATE EVENT process_deleted_items
    ON SCHEDULE
      EVERY 1 DAY
    COMMENT 'purge deleted items'
    DO
      BEGIN
        DELETE FROM Comments
        WHERE
            _deleted = TRUE
            AND _ttl <= CURRENT_TIMESTAMP(3);
        DELETE FROM DeltaSyncComments 
        WHERE
            _ttl <= CURRENT_TIMESTAMP(3);
        DELETE FROM Posts 
        WHERE
            _deleted = TRUE
            AND _ttl <= CURRENT_TIMESTAMP(3);
        DELETE FROM DeltaSyncPosts 
        WHERE
            _ttl <= CURRENT_TIMESTAMP(3);
    END |

delimiter ;

Configuring the Backend with the Amplify CLI

The latest version of the Amplify CLI simplifies the process of setting up a DynamoDB-based Delta Sync data source compatible with the Amplify DataStore client. I will leverage this functionality to create a GraphQL schema representing my data model, implementing the necessary GraphQL operations expected by the DataStore client: sync queries and create, update, and delete mutations. Subsequently, I will connect these fields to my database tables using an AppSync Lambda resolver created with the amplify @function directive.

To get started, I will initiate a new Amplify project in my application directory with amplify init, following the on-screen prompts. Next, I will configure my functions, utilizing mysql2 to connect to the database, which will be set up in a Lambda layer for shared access.

> amplify add function
? Select which capability you want to add: Lambda layer (shared code & resource used across functions)
? Provide a name for your Lambda layer: mysql2
? Select up to 2 compatible runtimes: NodeJS
? The current AWS account will always have access to this layer.
Optionally, configure who else can access this layer. (Hit <Enter> to skip)

For more insights on managing transitions in your career, check out this article, advice to your younger self. Additionally, for those interested in understanding employment law compliance, you can refer to the guidance provided by SHRM. Lastly, if you’re looking for a helpful resource on setting up your project, consider watching this YouTube video, which provides excellent guidance.

Integrating Amplify DataStore with Pre-existing SQL Databases: Enhancing Offline and Synchronization Capabilities for Your Application

Getting Started

Preparing the Database

Configuring the Backend with the Amplify CLI

Related Topics: