In our previous blog post, “How Cloud-based Data Mesh Technology Can Facilitate Financial Regulatory Data Collection,” we presented a framework that allows financial institutions to share data with regulators, maintaining operational flexibility while adapting to evolving data requirements. By constructing a regulatory data mesh using AWS Data Exchange nodes, participants can avoid the constraints of rigid data schema consistency. Instead, each participant can independently and gradually implement changes in response to ongoing adjustments in reporting requirements.
Within a banking institution, the transfer of precise data—whether from trading desks, business units, or operating subsidiaries to risk, finance, and treasury—is vital for effective decision-making, pricing, and the optimization of limited resources like capital, funding, and liquidity. This data serves not only as the foundation for internal business decisions but also for external regulatory reporting. However, the efficient integration of accounting and business data from various trading, banking, and leasing asset classes continues to pose significant challenges.
The ramifications of inefficient data transfer on an organization’s cost structure are substantial. A recent study indicated that knowledge workers spend around 40% of their time searching for and compiling data. The Bank of England reported that 57% of resources allocated to regulatory reporting are tied to process flow, largely due to the manual procedures prevalent in banks. McKinsey estimated that UK banks incur annual costs between GBP 2 billion and GBP 4.5 billion to fulfill these obligatory reporting requirements.
In this article, we explore how data mesh principles can similarly enhance data flow within banks and other financial institutions. We focus on the interactions between subsidiaries, business units, or individual trading desks and central control functions like risk, finance, and treasury (RFT). Without effective, scalable, and adaptable mechanisms to maintain context, consistency, quality, lineage, governance, and ownership, trust in data often stems from those responsible for its collection and preparation rather than the data itself. This results in unnecessary costs in RFT functions, driven by process complexity and an excessive analytical focus on the nature of the data instead of the insights it can provide.
Understanding Data Boundaries
In tackling these challenges, AWS customers have rediscovered the intrinsic connection between data boundaries and organizational structure.
A “boundary” broadly delineates the internal functions and components of an entity from its surrounding environment—much as a cell membrane separates a cell’s internal structure from its environment. The concepts of “high cohesion” and “loose coupling” accompany the idea of a boundary. High cohesion is achieved when all the components and mechanisms needed for the entity to function on its own are contained within its boundary. Loose coupling occurs when the boundary shields external parties from the entity’s internal workings, exposing only those services intended for external use.
Following AWS Well-Architected best practices, business units should be mapped as distinct bounded entities onto the underlying cloud infrastructure (for example, AWS landing zones or AWS accounts), as illustrated in Figure 1.
In the context of banking, Figure 1 highlights the organizational structures made transparent when transitioning to AWS. The advantages of this approach include:
- Security controls: Different business units may require varied security profiles, necessitating tailored control policies and mechanisms.
- Isolation: An account serves as a unit of security protection, containing potential risks and threats without affecting other accounts.
- Data isolation: Isolating data stores within an account limits access to those who can manage that data.
- Team differentiation: Various teams have unique responsibilities and resource needs, and should function independently within their accounts.
- Business processes: Distinct business units or products may have different purposes, warranting separate accounts to cater to specific needs.
For instance, in its AWS re:Invent 2020 presentation, “Nationwide’s Journey to a Governed Data Lake on AWS,” Nationwide demonstrated data processing and cataloging aligned with its business units, supported by a centralized data discovery service that connects these federated data sources.
Without an appropriate mechanism in place, the data boundaries between business units remain implicit, lacking the necessary structure to foster trust in the data’s flow between producers and consumers. The inclusion of AWS Data Exchange clarifies these data boundaries, thereby establishing an intra-organizational data mesh that addresses this concern.
Utilizing AWS Data Exchange, each business unit (as a data producer) can publish data when ready, adhering to an agreed reporting schedule while maintaining high internal cohesion. Upon notification of a published update by AWS Data Exchange, each data consumer can access the published data as needed, without requiring coordination with data producers—creating a loosely coupled environment. Each dataset published via AWS Data Exchange is self-describing, allowing each data consumer’s ETL pipeline to determine the schema of the consumed dataset and adapt to it as necessary. This process is further streamlined through the integration of AWS Data Exchange with AWS Glue DataBrew.
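Because each published dataset is self-describing, a consumer’s ETL pipeline can compare the schema it receives against the schema it last processed and adapt without coordinating with the producer. The sketch below is a minimal illustration in plain Python; the CSV payload and column names are hypothetical:

```python
import csv
import io

def read_schema(csv_payload: str) -> list:
    """Return the column names declared in a CSV dataset's header row."""
    reader = csv.reader(io.StringIO(csv_payload))
    return next(reader)

def schema_drift(expected: list, received: list) -> dict:
    """Compare the schema a consumer expects with the one just published."""
    return {
        "added": [c for c in received if c not in expected],
        "removed": [c for c in expected if c not in received],
    }

# Hypothetical example: the producer has added a 'desk_id' column.
expected_columns = ["trade_id", "notional", "currency"]
published = "trade_id,notional,currency,desk_id\n1001,5000000,USD,FX-3\n"

drift = schema_drift(expected_columns, read_schema(published))
print(drift)  # {'added': ['desk_id'], 'removed': []}
```

A real pipeline would branch on the drift report—for example, backfilling a newly added column with nulls or alerting when an expected column disappears—rather than failing outright when the producer evolves its schema.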
AWS Data Exchange integrates with AWS Identity and Access Management (IAM), providing robust governance and security tools that offer fine-grained access controls over who can access and modify data on both the producer and consumer sides. Automated audit trails generated by AWS CloudTrail enhance process transparency. Moreover, since data publishers operate independently, they can use different processes and technologies for data collection, cleansing, and curation, with the sole requirement being publication through AWS Data Exchange.
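To make the fine-grained access control concrete, the helper below composes a read-only IAM policy document scoped to a single dataset for a consumer. The `dataexchange:*` actions are real AWS Data Exchange IAM actions, but the dataset ARN and account ID are placeholders, and a production policy would be tailored to your organization’s needs:

```python
import json

# Real AWS Data Exchange read-side IAM actions.
READ_ONLY_ACTIONS = [
    "dataexchange:GetDataSet",
    "dataexchange:GetRevision",
    "dataexchange:GetAsset",
    "dataexchange:ListDataSetRevisions",
]

def consumer_policy(data_set_arn: str) -> str:
    """Build a read-only IAM policy document scoped to one dataset."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": READ_ONLY_ACTIONS,
            "Resource": data_set_arn,
        }],
    }, indent=2)

# Placeholder ARN for illustration only.
print(consumer_policy(
    "arn:aws:dataexchange:us-east-1:123456789012:data-sets/EXAMPLE"
))
```

Attaching a policy like this to a consumer role limits that business unit to reading the one dataset it subscribes to, while every access attempt is still recorded by AWS CloudTrail.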
From a business standpoint, the advantages of an intra-organizational data mesh can be encapsulated as follows:
- Each business unit (operating unit, subsidiary, trading desk, finance, risk, treasury) acts as an independent autonomous data publisher and/or consumer.
- Each data publisher is accountable for the consistency and quality of the datasets they publish.
- The act of publishing a dataset is an intentional decision made by the owner.
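The points above translate into a simple publication flow: a producer stages data, imports it into a new revision, and then finalizes the revision as the deliberate act of publication. The sketch below composes the request shape for AWS Data Exchange’s `CreateJob` API; the dataset ID, revision ID, bucket, and key are hypothetical, and the boto3 calls are shown only as comments since they require live AWS credentials:

```python
def s3_import_job(data_set_id: str, revision_id: str, bucket: str, key: str) -> dict:
    """Compose the CreateJob request a producer would send to AWS Data Exchange
    to import an S3 object as an asset in a new revision."""
    return {
        "Type": "IMPORT_ASSETS_FROM_S3",
        "Details": {
            "ImportAssetsFromS3": {
                "DataSetId": data_set_id,
                "RevisionId": revision_id,
                "AssetSources": [{"Bucket": bucket, "Key": key}],
            }
        },
    }

# A producer would then run something like (not executed here):
#   import boto3
#   dx = boto3.client("dataexchange")
#   job = dx.create_job(**s3_import_job(data_set_id, revision_id, bucket, key))
#   dx.start_job(JobId=job["Id"])
#   # Finalizing the revision is the deliberate act of publication:
#   dx.update_revision(DataSetId=data_set_id, RevisionId=revision_id, Finalized=True)
```

Until `Finalized` is set, consumers see nothing—which is precisely what makes publication an intentional decision by the data owner rather than a side effect of an internal pipeline run.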