Visualizing Your AWS Infrastructure with Amazon Neptune and AWS Config | AWS Database Blog


If you run critical applications on AWS, your infrastructure may span multiple accounts with intricate interconnections. Understanding your current setup can feel overwhelming when you're faced with long lists of resources and their relationships; an intuitive visualization of that complexity is far more useful. As organizations move more vital workloads to AWS, the growth of cloud assets makes a comprehensive, contextual view of your cloud inventory essential for operational excellence.

Having clear visibility into your cloud resources helps you plan, anticipate, and manage risks associated with your infrastructure. For instance, knowing the workloads tied to a specific instance family is crucial when considering a migration to a different instance family. A knowledge graph detailing all affected workloads can facilitate this transition, streamlining the entire process. To meet this rising demand for asset management and reporting, AWS Config serves as a valuable tool. It identifies AWS resources in your account and constructs a relationship map between them.

In this article, we harness the power of Amazon Neptune alongside AWS Config to gain insights into our AWS landscape and map out these connections. We also utilize an open-source tool to visualize the data stored in Neptune. Amazon Neptune is a fully managed graph database service capable of managing billions of relationships within highly interconnected datasets, providing millisecond query responses. Meanwhile, AWS Config allows you to assess, audit, and evaluate your AWS resource configurations. With AWS Config, you can review configuration changes, explore detailed resource histories, and verify compliance with your internal standards.

Prerequisites

Before diving in, ensure AWS Config is enabled in your account and that the stream for AWS Config is active. This setup allows you to receive notifications whenever a new resource is created, along with its relationships.
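For context, the files AWS Config delivers contain configuration items shaped roughly like the following sketch (the field names are AWS Config's; the values here are hypothetical). The relationships list is what lets us build edges between resources later:

```python
# A hypothetical AWS Config configuration item (abbreviated)
config_item = {
    'resourceId': 'i-0abc123',                       # becomes the vertex id
    'resourceType': 'AWS::EC2::Instance',            # becomes the vertex label
    'configurationItemStatus': 'ResourceDiscovered',
    'relationships': [                               # becomes the graph edges
        {
            'resourceId': 'vpc-0def456',
            'resourceType': 'AWS::EC2::VPC',
            'name': 'Is contained in Vpc',
        },
    ],
}

# Deleted resources arrive with status 'ResourceDeleted' and need
# special handling when the graph is updated
is_deleted = config_item['configurationItemStatus'] == 'ResourceDeleted'
```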

Solution Overview

The process involves several key steps:

  1. Activate AWS Config in your AWS account and establish an Amazon Simple Storage Service (Amazon S3) bucket to store all configuration logs.
  2. Utilize Amazon S3 Batch Operations in conjunction with AWS Lambda to populate the Neptune graph with the current AWS Config inventory and develop the relationship map. An AWS Lambda function will also trigger whenever a new AWS Config file is uploaded to the S3 bucket, ensuring the Neptune database is updated with any changes.
  3. Users authenticate through Amazon Cognito and make requests to an Amazon API Gateway endpoint.
  4. A static website interfaces with an AWS Lambda function accessed through the proxy and exposed to the internet via Amazon API Gateway.
  5. The AWS Lambda function queries the graph in Amazon Neptune, returning data to the application for visualization.

Resources referenced in this post, along with code samples and HTML files, can be found in the amazon-neptune-aws-config-visualization GitHub repository.

Enable AWS Config in Your AWS Account

If you haven’t activated AWS Config yet, you can set it up through the AWS Management Console. If it’s already enabled, take note of the S3 bucket storing all configuration history and snapshot files.

Set Up a Neptune Cluster

Your next step is to provision a new Neptune cluster within a VPC. For comprehensive guidance, refer to the Neptune user guide. After setting up the cluster, note the cluster endpoint and port; you need both to insert data into the cluster and to query it. For visualization, we use vis.js, an open-source library that renders graph data in the browser.
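vis.js renders a network from two arrays, nodes and edges. A minimal sketch of reshaping query results into that structure (the helper name `to_visjs` and the sample records are ours, for illustration only):

```python
def to_visjs(vertices, edges):
    """Reshape graph query results into the {nodes, edges} dict vis.js expects."""
    return {
        'nodes': [{'id': v['id'], 'label': v['label']} for v in vertices],
        'edges': [{'from': e['from'], 'to': e['to'], 'label': e['label']} for e in edges],
    }

# Example: one EC2 instance contained in a VPC
payload = to_visjs(
    [{'id': 'i-0abc123', 'label': 'Instance'}, {'id': 'vpc-0def456', 'label': 'Vpc'}],
    [{'from': 'i-0abc123', 'to': 'vpc-0def456', 'label': 'Is contained in Vpc'}],
)
```

The front end can then pass `payload['nodes']` and `payload['edges']` straight to a `vis.Network` instance.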

Configure a Lambda Function to Trigger on AWS Config File Delivery

Once your cluster is established, create a Lambda function that activates when AWS Config sends a file to Amazon S3.

Create a directory named configparser_lambda and execute the following commands to install the necessary packages for your function:

pip3.6 install --target ./package gremlinpython
pip3.6 install --target ./package requests

Next, create a file named configparser_lambdafunction.py in the directory and open it in a text editor. Insert the following code into the configparser_lambdafunction.py file:

from __future__ import print_function
import boto3
import json
import os, sys
from io import BytesIO
import gzip
from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.traversal import T
import requests
import urllib.parse

CLUSTER_ENDPOINT = os.environ['CLUSTER_ENDPOINT']
CLUSTER_PORT = os.environ['CLUSTER_PORT']

# Making the remote connection to Neptune outside for reuse across invocations
remoteConn = DriverRemoteConnection('wss://' + CLUSTER_ENDPOINT + ":" + CLUSTER_PORT + '/gremlin','g',)
graph = Graph()
g = graph.traversal().withRemote(remoteConn)

def run_sample_gremlin_websocket():
    # remoteConn stays open for reuse across invocations
    return g.V().hasLabel('Instance').toList()

def run_sample_gremlin_http():
    URL = 'https://' + CLUSTER_ENDPOINT + ":" + CLUSTER_PORT + '/gremlin'
    # Submit the traversal as a Gremlin-Groovy string over HTTP
    query = "g.V().hasLabel('Instance').valueMap().with('~tinkerpop.valueMap.tokens').toList()"
    r = requests.post(URL, data=json.dumps({'gremlin': query}))
    return r

def get_all_vertex():
    vertices = g.V().count()
    print(vertices)

def insert_vertex_graph(vertex_id, vertex_label):
    node_exists_id = g.V(str(vertex_id)).toList()
    if node_exists_id:
        return
    result = g.addV(str(vertex_label)).property(T.id, str(vertex_id)).next()

def insert_edge_graph(edge_id, edge_from, edge_to, to_vertex_label, edge_label):
    insert_vertex_graph(edge_to, to_vertex_label)

    edge_exists_id = g.E(str(edge_id)).toList()
    if edge_exists_id:
        return

    result = g.V(str(edge_from)).addE(str(edge_label)).to(g.V(str(edge_to))).property(T.id, str(edge_id)).next()

def parse_vertex_info(vertex_input):
    # Vertex needs id and label before insertion into Neptune
    id = vertex_input['resourceId']
    label = vertex_input['resourceType']
    itemStatus = vertex_input['configurationItemStatus']

    if itemStatus == "ResourceDeleted":
        # Adding a second vertex with the same id would fail, so make sure
        # the vertex exists and flag it as deleted instead
        insert_vertex_graph(id, label)
        g.V(str(id)).property('status', str(itemStatus)).next()
        return

    insert_vertex_graph(id, label)

def parse_edge_info(edge_input):
    itemStatus = edge_input['configurationItemStatus']

    if itemStatus == "ResourceDeleted":
        return
    # Edge requires id, from, to and label to be valid. Each configuration
    # item carries a 'relationships' list; we turn each entry into an edge.
    edge_from = edge_input['resourceId']
    for relationship in edge_input.get('relationships', []):
        edge_to = relationship['resourceId']
        edge_id = edge_from + ':' + edge_to
        insert_edge_graph(edge_id, edge_from, edge_to,
                          relationship['resourceType'], relationship['name'])
By following this structured approach, you can build a robust system for visualizing your AWS infrastructure and the relationships between its resources.
