I’ve been working recently with some data that doesn’t naturally fit into my AWS HealthLake datastore. I have some additional information captured in a DynamoDB table that would be useful to blend with HealthLake but on its own is not an FHIR resource. I pondered on this for a while and came up with the idea of piping DynamoDB stream changes to S3 so that I could then pick up with AWS Glue. In this article, I want to show you an approach to building a partitioned S3 bucket from DynamoDB. Refining that further with Glue jobs, tables and crawlers will come later.
I’ve been wanting to spend more time lately talking about AWS HealthLake. And then more specifically, Fast Healthcare Interoperable Resources (FHIR) which is the foundation for interoperability in healthcare information systems. I believe very strongly that Serverless is for more than just client and user-driven workflows. I wrote extensively about it here but I wanted to take a deeper dive into building out streams of dataflows. I’ve been using this pattern for quite some time in production, so let’s have a look at EventBridge Pipes enriching DynamoDB Streams.
In my previous article I wrote about a Callback Pattern with AWS Step Functions built upon the backbone of HealthLake’s export. As much as I went deep with code on the Callback portion, I felt that I didn’t give the HealthLake side of the equation enough run. So this article is that adjustment. Managing exports with AWS HealthLake.
What is HealthLake
AWS HealthLake is a HIPAA-eligible service that provides FHIR APIs that help healthcare and life sciences companies securely store, transform, transact, and analyze health data in minutes to give a chronological view at the patient and population-level. – AWS
My words on that are that HealthLake is a FHIR-compliant database that gives a developer a robust set of APIs to build patient-centered applications. You can use HealthLake for building transactional applications, analyze large volumes of data, store structured and semi-structured information and build analytics and reports.
When building with HealthLake I find it fits in one of two places.
- As the transactional center for your Healthcare application. It is highly patient centered, very scalable and contains APIs for working with each resource. In addition, it provides SMART on FHIR capabilities that make it nice choice for building an application on top of.
- As the aggregation point for many external and internal systems in a LakeHouse style architecture for interopability and reporting. When you’ve got a distributed system with various datababases and you need your data reunited in one location. HealthLake does that. Or if you are pulling in data from various external sources, HealthLake can do that too. I wrote about doing this with Serverless a while back.
Some operations in a system function asynchronously. Many times, those same operations must also happen to be responsible for coordinating external workflows to provide an overall status on the execution of the main workflow. A natural fit for this problem with AWS is to use Step Functions and make use of the Callback pattern. In this article, I’m going to walk through an example of the Callback pattern while using AWS’ HealthLake and its export capabilities as the backbone for the async job. Welcome to the AWS Step Functions Callback Pattern.
Callback Workflow Solution Architecture
Let’s first start with the overarching architecture diagram. The general premise of the solution is that AWS’ HealthLake allows the export of all resources “since the last time”. By using Step Functions, Lambdas, SQS, DynamoDB, S3, Distributed Maps and EventBridge I’m going to build the ultimate Serverless Callback workflow. I feel like outside of Kinesis and SNS, I’ve touched them all in this one.
There’s quite a bit going on in here so I’m going to break it down into segments which will be:
- Triggering the State Machine
- Record Keeping and Run Status
- Running the Export and initiating the Callback
- Polling the Export and Restarting the State Machine
- Working the results
- Wrapping Up
- Dealing with Failure
Hang tight, there’s going to be a bunch of code and lots of detail. If you want to jump to code, it’s down at the bottom here
Building Serverless applications can feel a bit overwhelming when you are first getting started. Sure, Event-Driven Systems have been around for many years but this notion of using managed services to “assemble” solutions vs a more traditional “plugin” style architecture might throw you for a loop. I haven’t created a series yet, so this is my first attempt at that. My goal is to walk you through the design considerations when Building Serverless Applications with AWS.
- Data Storage Choices
- Building the Application (Fargate/Containers vs Lambda)
- Handling Events
- Exposing the API (if there is one)
- Securing it all, including the API
- Debugging and Troubleshooting in Production
This is an ambitious list, but when I think about what it takes to put together a Serverless application, these are the concepts and decisions that I often end up counseling or guiding developers new to the paradigm. So let’s dig in.