Reveal API Architecture (Draft)

Document Revision

Date         Version   Description of change   Author
2021-12-06   1.0       Initial document        Trevlen Bahadur

Overview

Reveal’s mission statement and core value proposition is to facilitate health interventions in locations of need, using geospatial intelligence data as a driving factor. In practice, these implementations are conducted in areas of low connectivity, in which users adopt asynchronous work patterns. Reveal pilot implementations have highlighted the need (amongst many others) for improvements that better deliver on this value proposition.

To deliver on the mission statement, Reveal’s architecture, and in particular its core backend components, requires a design pattern that better facilitates the core use case of asynchronous work. Whilst not a panacea, a shift to a more traditional event-driven architecture satisfies many of these needs.

 

The following summarizes the benefits of implementing an event-driven architecture:

  • Fault tolerance, scalability, simplified maintenance, versatility, and other benefits of loose coupling: Applications and components in an event-driven architecture are not dependent on each other’s availability; they can be independently updated, tested, and deployed without interruption of service, and when one component goes down a backup can be brought online. Event persistence enables ‘replaying’ of past events, which can help recover data or functionality after an event-consumer outage. Components can be scaled easily and independently of each other across the network, and developers can revise or enrich applications and systems by adding and removing event producers and consumers.

  • Asynchronous messaging: Event-driven architecture enables components to communicate asynchronously—producers publish event messages, on their own schedule, without waiting for consumers to receive them (or even knowing if consumers received them). In addition to simplifying integration, this improves the application experience for users. A user completing a task in one component can move on to the next task without waiting, regardless of any downstream integrations between that component and others in the system.

  • Powerful real-time response and analytics: Real-time situational awareness means that business decisions, whether manual or automated, can be made using all of the available data that reflects the current state of the systems. 
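As a concrete illustration of the asynchronous pattern described above, the following is a minimal sketch of a decoupled producer and consumer, assuming Spring AMQP over RabbitMQ; the exchange, routing key, and queue names are illustrative and not part of the current Reveal codebase.

    import org.springframework.amqp.rabbit.annotation.RabbitListener;
    import org.springframework.amqp.rabbit.core.RabbitTemplate;
    import org.springframework.stereotype.Component;

    // Producer: publishes the event and returns immediately; it neither knows
    // nor cares whether any consumer is currently online.
    @Component
    public class PlanEventProducer {
        private final RabbitTemplate rabbitTemplate;

        public PlanEventProducer(RabbitTemplate rabbitTemplate) {
            this.rabbitTemplate = rabbitTemplate;
        }

        public void publishPlanCreated(String planId) {
            // "reveal.events" and "plan.created" are illustrative names.
            rabbitTemplate.convertAndSend("reveal.events", "plan.created", planId);
        }
    }

    // Consumer: processes events at its own pace and can be redeployed or
    // scaled out without interrupting the producer.
    @Component
    class PlanEventConsumer {
        @RabbitListener(queues = "plan-queue")
        public void onPlanCreated(String planId) {
            // Downstream processing happens here, decoupled from the producer.
        }
    }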

Goals

  1. Simplify and restructure the backend

  2. Eliminate data loss / data misinterpretation

  3. Design to enable faster delivery of new business requirements including reporting

  4. Ensure current system functionality remains after changes are applied

  5. Ensure current reporting is still available

  6. Track data over time 

  7. Create processes to accommodate historic data

Specifications

Use Cases

The use cases depicted below were considered when creating this document.

Current Challenges and Approach

Current Challenges

The Reveal application is based on a FHIR data model. The two front-end clients (the mobile client and the React web app) integrate with the backend against a FHIR-based API. The challenges with this architecture are:

  1. There is significant data transformation required during integration to align with the FHIR models and API. There are a number of integrations that would be useful to more fully support use cases; modifying Reveal to more readily allow such integrations is necessary.

    1. This alignment, which is currently done, significantly increases the complexity of otherwise trivial tasks, e.g. save a plan, save a client, etc.

      1. Examples include, but are not limited to:

        1. FHIR Path Evaluation

        2. For each event processed, the application currently converts between FHIR and OpenSRP structures (in both directions) every time a relationship between entities needs to be retrieved and/or maintained.

    2. This, in turn, makes it complex to introduce new features into the application, because significant time must be spent understanding the current process in order to identify where changes need to be made to address the new feature set.

    3. Teams will need to dedicate a significant amount of time to understanding the FHIR specifications, and will need to design with FHIR, along with the requested requirement, as the core objectives; ideally the business request should be the priority.

    4. Because the FHIR specification is an evolving initiative, the Reveal application will also need to be maintained and aligned from a FHIR perspective.

      1. Whilst this is still a priority, the business functionality should not be hindered by the initiative; rather, the application’s core functional aspects should be independent from the FHIR facility.

      2. FHIR’s evolution, and the complexity of aligning it to Reveal, may have resulted in APIs and data structures which do not fit the FHIR specifications entirely.

  2. In order to maintain the ability to expose the FHIR API, the application stores data in a partially schema-less/unstructured manner.

    1. This results in the need for the ETL process which stages and transforms data so that it is manageable for application consumption and reporting purposes.

    2. The ETL process still produces generic structures which require significant analysis to produce reports. There is currently a need to maintain materialized views, based on complex queries refreshed via scheduled tasks, from which reports are drawn.

    3. Minor errors/misinterpretations in extracting information from the persisted FHIR structures may also result in reporting gaps and misleading insights.

Approach

  1. Decouple FHIR concerns from Reveal core functional aspects.

    1. Restructure the data layer of the application by aligning data structures as per the operational aspects of Reveal business processes.

    2. Provide an independent FHIR interface which primarily facilitates interoperability. 

      1. The FHIR interface will publish its capability and should be enhanced on a needs basis.

    3. Create a dedicated API for system operations.

  2. Redefine data/event consumption on the backend

    1. Store and maintain data with explicit relationships across entities e.g. Plan has structures, Plan has clients etc.

      1. This is currently available, however only after the ETL process has processed and restructured the data.

      2. It is far more favourable to have data structured in a reportable manner, to enable real-time behaviour and ease of report creation.

  3. Reduce the need for an ETL process.

    1. To reduce the need for ETL, metadata facilities will be built into the data structures (leveraging the JSON/schema-less capability of Postgres).

    2. Introduce aggregation processes triggered by events. Aggregation processes will run independently from core processing.

    3. An ETL process should only be created to provide non-standard reports / data elements where data structures do not have the relationships to facilitate direct extraction.

  4. Introduce an event driven API.

    1. Introduce data consumption processes which are managed by system queues (a sketch of this pattern follows this list).

      1. Queues will be assigned to entity domains e.g. Plan queue, Patient queue etc.

        1. This will allow the creation of processes which are focused within their domain, simplifying each process.

        2. Specific domains can be broken out of the application into their own components, each of which can then have its own managed life cycle.

        3. If new domains/dimensions need to be added to the system, for example for a use case such as delivery of medical items to clinics, a new domain would need to be introduced: perhaps “equipment”.

          1. The process would then be to introduce an independent “equipment” queue and have it triggered by other driver queues, e.g. task/plan.

    2. This is aimed at better managing race conditions, given the first-in-first-out nature of queue messages. By consuming items in order as they exit the queue, the chances of concurrency concerns are significantly reduced.

      1. Race conditions are an unfortunate reality of the Reveal application: because field workers are disconnected from the backend, users can be actioning the same tasks at the same time without the opportunity for synchronization.

      2. An internal component will be created to specifically manage race conditions and give business the ability to define how best to handle the condition.

    3. Handling events within queues will aid efficient load balancing, since additional instances can be introduced to consume from the queues, thereby distributing the load.

  5. Introduce a process to consume Location data.

  6. Enable practitioner data to be centralised and still managed by Keycloak.
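As referenced in point 4 above, the domain-queue handover could look like the following minimal sketch, again assuming Spring AMQP; the PlanService interface, queue names, and routing keys are illustrative assumptions rather than agreed designs.

    import java.util.List;

    import org.springframework.amqp.rabbit.annotation.RabbitListener;
    import org.springframework.amqp.rabbit.core.RabbitTemplate;
    import org.springframework.stereotype.Component;

    // Hypothetical service-layer interface; the real methods would be agreed
    // with the project team.
    interface PlanService {
        void applyPlanEvent(String planId);
        List<String> targetedStructures(String planId);
    }

    @Component
    public class PlanQueueProcessor {
        private final PlanService planService;
        private final RabbitTemplate rabbitTemplate;

        public PlanQueueProcessor(PlanService planService, RabbitTemplate rabbitTemplate) {
            this.planService = planService;
            this.rabbitTemplate = rabbitTemplate;
        }

        // Consume plan events in first-in-first-out order; ordered consumption
        // limits the window for race conditions between concurrent submissions.
        @RabbitListener(queues = "plan-queue")
        public void process(String planId) {
            // All business rules live in the service layer.
            planService.applyPlanEvent(planId);

            // Hand over to the next domain queue once plan-domain work is done,
            // e.g. trigger task generation for each structure the plan targets.
            for (String structureId : planService.targetedStructures(planId)) {
                rabbitTemplate.convertAndSend("reveal.events", "task.generate", structureId);
            }
        }
    }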

Current and Target Architecture

High level changes highlighted:

  1. Current: OpenSRP server exposes a partial FHIR API

Target: Reveal Server (the OpenSRP server, renamed) exposing a system API and a HAPI FHIR API.

  2. Current: API and event processes facilitated by a wide range of OpenSRP and FHIR libraries

Target: In-house built service layers to enable the system API and HAPI FHIR API, along with the above-mentioned event processes using message queues.

  3. Current: ETL process with two database schemas (core and reveal)

Target: ETL removed, with a single Reveal schema

  4. Current: Queries to Superset rendered by React

Target: Embedded Superset charts (or similar tooling or libraries that offer comparable functionality)

  5. Target: Schedule to consume data for location information (expanding to other batch data sets on a needs basis)

Data Architecture

The following diagram depicts the conceptual data entity relationships that exist within the Reveal (and OpenSRP) domain. As part of the API rewrite initiative the business relationships across the entities must still exist to ensure that the application is able to facilitate current business operations, although there are minor conceptual changes introduced. These changes are introduced to enforce relationships across entities which may not have originally existed in the OpenSRP implementation and to redefine concepts that result in inefficiencies.

Conceptual Data Model

 

Conceptual changes to data model 

  1. Structure is a location

In the OpenSRP implementation the structure entity is an independent entity which houses its own location geojson data. In the proposed model, structure is viewed as a dependent entity: it will be dependent on the location it resides on and will not hold geojson information itself. The system will instead fetch the geojson information from the location it is linked to.
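A minimal JPA sketch of this dependency follows; the entity, table, and column names are illustrative assumptions.

    import javax.persistence.Entity;
    import javax.persistence.FetchType;
    import javax.persistence.Id;
    import javax.persistence.JoinColumn;
    import javax.persistence.ManyToOne;

    @Entity
    class Location {
        @Id
        private String identifier;

        private String geoJson; // the geometry lives on the location only

        public String getGeoJson() { return geoJson; }
    }

    @Entity
    public class Structure {
        @Id
        private String identifier;

        // A structure depends on the location it resides on and stores
        // no geojson of its own.
        @ManyToOne(fetch = FetchType.LAZY)
        @JoinColumn(name = "location_identifier", nullable = false)
        private Location location;

        // Geometry is resolved through the associated location.
        public String getGeoJson() {
            return location.getGeoJson();
        }
    }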

Proposed Entity Relation Diagram

The following diagram depicts the entity relationships across the database entities. It must be noted that each entity and certain linking tables will be equipped with a metadata facility. At row level, the entity metadata will be used to record metrics and specific “ad-hoc” requirement elements, for example per-structure transitions over time. At table level, metadata will be used for entity aggregations, for example counts over different dimensions. It should also be noted that the illustration primarily focuses on the relationships across the entities; it is expected that the outstanding fields will be attained through a collaborative approach with the project team to reach an agreed-upon structure.
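One possible shape for the row-level metadata facility is sketched below, assuming Hibernate together with the hibernate-types library for mapping Postgres jsonb columns; the Task entity and its fields are illustrative.

    import com.vladmihalcea.hibernate.type.json.JsonBinaryType;
    import org.hibernate.annotations.Type;
    import org.hibernate.annotations.TypeDef;

    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.persistence.Table;
    import java.util.HashMap;
    import java.util.Map;

    @Entity
    @Table(name = "task")
    @TypeDef(name = "jsonb", typeClass = JsonBinaryType.class)
    public class Task {
        @Id
        private String identifier;

        private String status;

        // Row-level metadata: free-form, queryable jsonb used to record
        // metrics and ad-hoc elements (e.g. per-structure transitions over
        // time) without a schema migration for each new element.
        @Type(type = "jsonb")
        @Column(name = "metadata", columnDefinition = "jsonb")
        private Map<String, Object> metadata = new HashMap<>();
    }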

Implications of adopting the proposed data model

  1. The above proposal is loosely a normalised version of the original database structure, with a strong focus on enforcing entity relationships at the database level. The rationale for taking this approach is to ensure that the database expresses the relationships of the entities and enforces them during development.

  2. The benefits of taking the approach are: 

    1. The relationships are apparent when analysing the data, which will make drawing reports easier.

    2. Entity relationships do not live in code but rather with data. This is beneficial because insights can be drawn from data without referencing the code. It will also set the logical boundaries around entities during code development.

  3. The normalised structures (entities joined with linking tables) do pose both a benefit and a drawback. 

    1. The drawback is that development will be slightly more complicated when dealing with entities and their associated linking tables, as opposed to the entity in isolation. With current ORM tools, care must be taken not to create cyclic relationships and massively linked objects in the application. This happens when an entity is created (in code) and linked to its associated tables: upon querying the entity, the hierarchy of associated linked tables is also queried and loaded into memory. This creates inefficiencies in the application, and it must be evaluated on a per-query basis whether the associated entity objects should be preloaded or loaded upon request. ORM tools have facilities to mitigate this issue (see the sketch after this list).

    2. The advantage of the approach is that the normalised tables provide a high degree of flexibility whilst still enforcing relationships between entities. As depicted in the conceptual data model there are many inter-relationships across entities, e.g. plan-client, plan-structure, plan-location, plan-task, task-structure, task-client. As per the business process: when a plan (MDA as an example) is created, a list of locations is selected, i.e. the relationship plan-location. This selection results in a list of structures to be targeted, i.e. plan-structure. Subsequently, tasks will need to be generated for each applicable structure, i.e. task-structure, etc. The normalised approach caters for these inter-relationships and can support future use cases.
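Regarding the loading concern described in the drawback above, lazy fetching is the usual ORM mitigation; a minimal JPA sketch with illustrative entities:

    import java.util.List;

    import javax.persistence.Entity;
    import javax.persistence.FetchType;
    import javax.persistence.Id;
    import javax.persistence.ManyToOne;
    import javax.persistence.OneToMany;

    @Entity
    public class Plan {
        @Id
        private String identifier;

        // Lazy fetching (the JPA default for collections) means querying a
        // plan does not automatically load the whole hierarchy of linked
        // tables into memory; the collection is fetched only when accessed.
        @OneToMany(mappedBy = "plan", fetch = FetchType.LAZY)
        private List<PlanStructure> structures;
    }

    @Entity
    class PlanStructure {
        @Id
        private String identifier;

        @ManyToOne(fetch = FetchType.LAZY)
        private Plan plan;
    }

    // Where preloading is genuinely needed it can be requested per query, e.g.
    // SELECT p FROM Plan p JOIN FETCH p.structures WHERE p.identifier = :id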

Application Components

The following diagram depicts the internal components of the Reveal server. 

Application Components Descriptions

  • API - This is the set of endpoints which will facilitate the system operations, e.g. create plan, create structure, create location. It will also provide the read capability, e.g. task sync, event sync, etc. The endpoints in these APIs will need to match the current endpoints exposed by the OpenSRP server, however the payload structures will differ (in the end state). The payload structures will be aligned to the new entity structures defined in this project. This is to maintain the current functionality residing on the mobile client. The authentication aspect is expected to remain the same, i.e. using the Spring Security configuration to validate that the token supplied by the consumer is valid before the service layers are invoked.

  • Service layer - The service layer will be the logical layer which houses the business rules related to entities. All processes that interact with data entities should reach the data layer through the service layer. The service layer should ultimately be the final decision point on whether CRUD happens against an entity. The service layer should be aligned with the entity model, i.e. Plan Service, Task Service, Event Service, etc., however specific services should be created for processes which span across entities. Examples of this are plan condition evaluation, event evaluation, and data aggregators.

  • Event Processor - The event processor is the layer which will coordinate the write processes. Similar to the service layer, the event processor will be modelled against the entities, e.g. plan queue, client queue, task queue, etc. The diagram depicts the queues as internal components within the Reveal server; however, it must be noted that this is the logical interface that needs to be built, and which interfaces with an external queuing system, e.g. Kafka or RabbitMQ. The queue process must interface with the service layer to perform business logic, and it should operate within its domain, i.e. within the plan queue perform plan actions. Once done with its task within the specific domain, it must determine (with the use of a service layer method) the next step and hand over to the next queue. As depicted, each domain entity queue has an associated aggregation process. The intention here is that an aggregation process can subscribe to the queue (topic subscription), yet in this instance not consume the item off the queue, but rather get a copy of the event so that aggregation can be performed (see the sketch after this list). This will allow the application to aggregate data at row level and can assist with, for example, time-series data persistence, in a decoupled fashion.

  • Global aggregator - The global aggregator will be responsible for aggregation across data entities and across rows. Examples are counting the number of completed tasks within a plan or number of patients registered within a jurisdiction. It will interface with the Service layer for all business rules and CRUD operations. 

  • Schedule - The scheduling component will be responsible for all scheduled and batch actions (location first, expanding to other batch data sets on a needs basis).

  • Reveal FHIR translator - This component will be responsible for converting the Reveal entities to FHIR compliant objects. It will read the entities via the service layer and restructure the object in accordance with the FHIR standard.

  • HAPI FHIR API - The HAPI FHIR API will be responsible for the interoperability with products that use the FHIR standard for data exchange.
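The “copy of the event” behaviour described for the event processor’s aggregation processes maps naturally onto consumer groups if Kafka is chosen as the queuing system. A minimal sketch, assuming Spring for Apache Kafka, with illustrative topic and group names:

    import org.springframework.kafka.annotation.KafkaListener;
    import org.springframework.stereotype.Component;

    @Component
    public class TaskEventListeners {

        // Core processing: this consumer group works the task domain.
        @KafkaListener(topics = "task-events", groupId = "task-processor")
        public void process(String event) {
            // Domain processing via the task service layer.
        }

        // Aggregation: a separate consumer group receives its own copy of
        // every event on the same topic, so row-level aggregation (e.g.
        // time-series persistence) runs decoupled from core processing
        // without removing events from the core processor's view.
        @KafkaListener(topics = "task-events", groupId = "task-aggregator")
        public void aggregate(String event) {
            // Update counts and time-series metadata here.
        }
    }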

System Processes

Security Considerations

As it stands, the current security implementation appears to be sufficient to protect the API and should remain as-is. A related concern, although not strictly a security one, is that user onboarding currently needs to happen in Keycloak as well.
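If onboarding is to be centralised rather than duplicated, one option is for Reveal’s onboarding flow to drive Keycloak programmatically. A hedged sketch using the Keycloak Admin Client, where the server URL, realm names, and credentials are placeholders:

    import org.keycloak.admin.client.Keycloak;
    import org.keycloak.admin.client.KeycloakBuilder;
    import org.keycloak.representations.idm.UserRepresentation;

    public class KeycloakOnboarding {

        public void createUser(String username) {
            // Placeholder connection details; in practice these come from
            // configuration, not hard-coded values.
            Keycloak keycloak = KeycloakBuilder.builder()
                    .serverUrl("https://auth.example.org/auth")
                    .realm("master")
                    .clientId("admin-cli")
                    .username("admin")
                    .password("change-me")
                    .build();

            UserRepresentation user = new UserRepresentation();
            user.setUsername(username);
            user.setEnabled(true);

            // Create the user in the Reveal realm as part of the same
            // onboarding flow, avoiding a separate manual Keycloak step.
            keycloak.realm("reveal").users().create(user);
        }
    }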

Access to production data is only granted to individuals that need it, when they need it. Production data is isolated from data used in testing setups. If data is to be moved from a production environment to testing environments, it will be anonymized. Sensitive information placed in service logs will be minimized. Access to monitoring services that ingest service logs is limited to team members who require it. Data at rest will be encrypted; this applies to both data stores (e.g. databases) and data backups. Audit logs of when production data is accessed, and by whom, shall be kept.

The extensive Azure network infrastructure has built-in protections against DDoS (distributed denial-of-service) attacks to safeguard resources against volumetric or protocol layer attacks. Azure DDoS Protection has the operational capacity to scale protection to the largest of workloads and experience protecting Microsoft services such as Xbox and O365.

Configuration

The currently high level of configuration required by Reveal must be better documented, and functionality to reduce it must be built into the platform. The current platform relies heavily on knowledgeable technical resources to script the configuration, which makes the system very cumbersome to implement.

Infrastructure and Deployment 



The following components will make up the initial containerization of the Reveal platform:

  • OpenSRP server

  • Reveal Web

  • Reveal ETL

  • Keycloak

  • MariaDB

  • PostgreSQL

  • Redis

  • Nginx

These components will be built into a docker-compose file and Docker build scripts. Once the system has been stabilized as a Dockerised platform, the same will be scripted so the platform can also be deployed to a Minikube instance.
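A skeletal docker-compose sketch of these services is shown below; the image names, versions, and wiring are placeholders rather than the final build scripts.

    version: "3.8"
    services:
      opensrp-server:
        image: reveal/opensrp-server:latest   # placeholder image names throughout
        depends_on: [postgres, keycloak, redis]
      reveal-web:
        image: reveal/reveal-web:latest
      reveal-etl:
        image: reveal/reveal-etl:latest
        depends_on: [postgres]
      keycloak:
        image: jboss/keycloak:15.0.2
        depends_on: [mariadb]
      mariadb:
        image: mariadb:10.5
      postgres:
        image: postgres:13
      redis:
        image: redis:6
      nginx:
        image: nginx:1.21
        ports: ["443:443"]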

End state hosting costing model

 

The above table represents potential hosting costs per month across multiple implementations.

Proof-of-concepts

Reveal user base as an identity provider for Reveal

Location Parent Relationship

HAPI FHIR API

Resources

Use Reveal user base as the identity provider for Keycloak:

Server Developer Guide


