In a previous entry (Automated Testing of Amazon Neptune Data Access with Apache TinkerPop Gremlin), we explored the advantages of unit testing your Apache TinkerPop Gremlin queries and discussed how to incorporate these tests into your CI/CD pipeline. The article highlighted some challenges users encounter when trying to utilize Amazon Neptune as the endpoint for their unit testing queries, including the necessity of being connected to the internet and the requirement to connect to the VPC (as Neptune can currently only be accessed from within its hosting VPC). Additionally, using a local Apache TinkerPop Gremlin Server can lead to cost savings during unit testing. The prior post also demonstrated how TinkerGraph, contained within a Gremlin Server, could help mitigate these testing issues.
In this post, I will expand on the methods discussed previously and illustrate how to leverage TinkerGraph for unit testing your transactional workloads. Furthermore, we’ll delve into using TinkerGraph in embedded mode. While embedded mode necessitates the use of Java, it significantly streamlines the testing environment since there is no need to run a server as a separate process.
The following diagrams depict the architectural distinctions between executing a query against Neptune versus running a query against an embedded graph.
- Typical Architecture for Querying Neptune: In this setup, the query is tunneled through EC2 to Neptune.
- Architecture for Querying an Embedded Graph: Here, the query is executed locally, simplifying the environment.
This article assumes that you are utilizing Java, granting you access to the embedded version of TinkerGraph. For further details on how to use the remote version of TinkerGraph within a Docker container, refer to the earlier post. Remember that embedded transactions offer more capabilities than remote transactions, so it’s wise to only test features available for remote transactions, which are utilized when connecting to Neptune.
Overview of Transactions in TinkerGraph and Neptune
Historically, a notable limitation of using TinkerGraph for testing was its lack of transaction support. Transactions are critical for ensuring correctness when altering the underlying database, and this behavior could not be tested with TinkerGraph. However, with the addition of the transactional TinkerGraph, known as TinkerTransactionGraph, in version 3.7.0, this situation has improved, making TinkerGraph a viable option in most scenarios.
It’s important to note that there are significant differences between the transaction semantics of TinkerTransactionGraph and Neptune; thus, certain scenarios should not be tested with TinkerTransactionGraph. Instead, these should be included in your comprehensive testing suite that operates against Neptune.
Firstly, TinkerTransactionGraph only guarantees protection against dirty reads, thus maintaining a read committed isolation level. In contrast, Neptune provides robust safeguards against dirty reads, phantom reads, and non-repeatable reads. Therefore, unit tests should be crafted with the understanding that only dirty reads are prevented.
Secondly, TinkerTransactionGraph employs a form of optimistic locking, meaning that if two transactions try to modify the same element, the second transaction throws an exception. Conversely, Neptune implements pessimistic locking (using a wait-lock approach) and allows for a maximum wait time when trying to acquire a resource. In tests that run against TinkerTransactionGraph, you’ll need to account for the optimistic locking behavior by catching TransactionException and retrying as necessary.
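The catch-and-retry pattern described above can be sketched without any graph dependency. The following is a minimal illustration, not TinkerPop code: `ConflictException`, `OptimisticRetry`, and `withRetry` are hypothetical names introduced here, and `ConflictException` merely stands in for the TransactionException that the transactional TinkerGraph raises on a conflicting write.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the conflict error raised by the graph provider.
// Against TinkerTransactionGraph you would catch TinkerPop's TransactionException here instead.
class ConflictException extends RuntimeException {
    ConflictException(String message) {
        super(message);
    }
}

final class OptimisticRetry {
    /**
     * Runs one unit of work, retrying when it reports a conflict.
     * The work is expected to begin, commit, and roll back its own transaction.
     */
    static <T> T withRetry(int maxAttempts, Supplier<T> work) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        ConflictException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.get();
            } catch (ConflictException e) {
                last = e; // another transaction modified the same element; try again
            }
        }
        throw last; // every attempt conflicted
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated workload: conflicts on the first two attempts, then succeeds.
        Boolean added = withRetry(5, () -> {
            if (++calls[0] < 3) {
                throw new ConflictException("simulated conflict");
            }
            return Boolean.TRUE;
        });
        System.out.println("added=" + added + " after " + calls[0] + " attempts");
    }
}
```

Bounding the number of attempts keeps a test from looping forever if the conflict is caused by a bug rather than by genuine contention. Because Neptune waits for locks instead of failing fast, a wrapper like this mostly matters for the TinkerGraph side of the test suite.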
Moreover, there are variances in Gremlin support between TinkerGraph and Neptune. For more insights, refer to the earlier article on automated testing of Amazon Neptune data access as well as the Gremlin standards compliance in Amazon Neptune.
TinkerGraph Unit Testing Examples
Let’s walk through a straightforward airport service example.
Prerequisites
To run these examples against the transactional TinkerGraph directly, you need to include the tinkergraph-gremlin artifact in your build. For Maven users, the following dependency should be added to your pom file:
    <dependency>
        <groupId>org.apache.tinkerpop</groupId>
        <artifactId>tinkergraph-gremlin</artifactId>
        <version>3.7.0</version>
        <scope>test</scope>
    </dependency>
Version 3.7.0 is referenced here as an example because it’s the first version featuring transactional TinkerGraph. The version you choose should align with the TinkerPop version supported by your Neptune engine version.
Example Airport Service
Below is a potential interface for such a service:
    public interface AirportService {
        public boolean addAirport(Map<String, Object> airportData);
        public boolean addRoute(String fromAirport, String toAirport, int distance);
        public Map<String, Object> getAirportData(String airportCode);
        public int getRouteDistance(String fromAirportCode, String toAirportCode);
        public boolean hasRoute(String fromAirport, String toAirport);
        public boolean removeAirport(String airportCode);
        public boolean removeRoute(String fromAirportCode, String toAirportCode);
    }
Next, we’ll examine the implementation for the addRoute method, including some class fields:
    import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
    import org.apache.tinkerpop.gremlin.structure.Transaction;
    import org.apache.tinkerpop.gremlin.structure.Vertex;

    public class NorthAmericanAirportService implements AirportService {
        private GraphTraversalSource g;

        public NorthAmericanAirportService(GraphTraversalSource g) {
            this.g = g;
        }

        /**
         * Adds a route between two airports.
         *
         * @param fromAirportCode The airport code of the airport where the route begins.
         * @param toAirportCode The airport code of the airport where the route ends.
         * @param distance The distance between the two airports.
         * @return True if the route was added; false otherwise.
         */
        public boolean addRoute(String fromAirportCode, String toAirportCode, int distance) {
            Transaction tx = g.tx();
            GraphTraversalSource gtx = tx.begin(); // Explicitly starting the transaction.

            // This try-catch-rollback approach is recommended with TinkerPop transactions.
            try {
                // next() throws if no airport has the given code, which triggers the rollback below.
                final Vertex fromV = gtx.V().has("code", fromAirportCode).next();
                final Vertex toV = gtx.V().has("code", toAirportCode).next();
                // Store the distance on the edge; the property key "dist" is an assumption.
                gtx.addE("route").from(fromV).to(toV).property("dist", distance).next();
                tx.commit();
                return true;
            } catch (Exception e) {
                tx.rollback();
                return false;
            }
        }

        // The remaining AirportService methods are omitted from this excerpt.
    }
We might want to create two unit tests for this method: one for a non-existent airport, which should fail, and one for valid airports, which should succeed. Notice how the instance variable g facilitates switching between different graph providers:
    import org.apache.tinkerpop.gremlin.driver.Cluster;
    import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
    import org.junit.Before;
    import org.junit.BeforeClass;

    public class AirportServiceTest {
        // In this example, "STAGING_ENV" is used to determine whether to test against TinkerGraph or Amazon Neptune.
        private static final boolean STAGING_ENV = (null != System.getProperty("STAGING_ENV"));
        private static Cluster cluster;
        private GraphTraversalSource g;

        @BeforeClass
        public static void setupServerCluster() {
            // A driver cluster is only needed when the tests run against Neptune.
            if (STAGING_ENV) {
                cluster = Cluster.build().addContactPoint("your-neptune-cluster").enableSsl(true).create();
            }
        }

        @Before
        public void setupGraph() {
            if (STAGING_ENV) {
                // Set up code for Neptune
            } else {
                // Set up code for TinkerGraph
            }
        }
    }
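The two scenarios described above (a route between existing airports succeeds, a route to a non-existent airport fails and rolls back) could be exercised against an embedded graph roughly as follows. This is a sketch under stated assumptions: it requires tinkergraph-gremlin 3.7.0 or later on the classpath, the class name `EmbeddedRouteDemo`, its helper methods, and the sample codes SEA, YVR, and XXX are made up for illustration, and the route logic is inlined (without the distance) rather than calling NorthAmericanAirportService so that the example stands alone.

```java
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Transaction;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerTransactionGraph;

final class EmbeddedRouteDemo {
    /** Opens an embedded transactional graph and commits two sample airports. */
    static GraphTraversalSource newSeededGraph() {
        GraphTraversalSource g = TinkerTransactionGraph.open().traversal();
        Transaction tx = g.tx();
        GraphTraversalSource gtx = tx.begin();
        gtx.addV("airport").property("code", "SEA").iterate();
        gtx.addV("airport").property("code", "YVR").iterate();
        tx.commit();
        return g;
    }

    /** Same begin / try / commit / catch / rollback shape as the addRoute method in the post. */
    static boolean addRoute(GraphTraversalSource g, String fromCode, String toCode) {
        Transaction tx = g.tx();
        GraphTraversalSource gtx = tx.begin();
        try {
            // next() throws when no vertex matches, which sends control to the rollback.
            Vertex fromV = gtx.V().has("code", fromCode).next();
            Vertex toV = gtx.V().has("code", toCode).next();
            gtx.addE("route").from(fromV).to(toV).iterate();
            tx.commit();
            return true;
        } catch (Exception e) {
            tx.rollback();
            return false;
        }
    }

    /** Counts committed route edges, so a test can confirm a failed call left nothing behind. */
    static long routeCount(GraphTraversalSource g) {
        return g.E().hasLabel("route").count().next();
    }

    public static void main(String[] args) {
        GraphTraversalSource g = newSeededGraph();
        System.out.println("valid airports:   " + addRoute(g, "SEA", "YVR"));
        System.out.println("unknown airport:  " + addRoute(g, "SEA", "XXX"));
        System.out.println("routes in graph:  " + routeCount(g));
    }
}
```

In a JUnit test class such as AirportServiceTest, the body of `newSeededGraph` would fit in the TinkerGraph branch of `setupGraph`, with each scenario becoming its own test method. Opening a fresh graph per test keeps the tests independent of each other.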