System Architecture
Explore the key components of the Integrated Award Environment (IAE) system architecture and how critical technical considerations were incorporated into it.
Overview
The modernized application architecture of SAM.gov has been decomposed into independent micro-frontend Angular applications and Spring Boot-powered microservices. Each of these components is responsible for a specific SAM.gov function and can be updated, tested, and deployed quickly because it is limited in scope and has few dependencies on other services or packages.
The individual components are connected through a custom reverse proxy implementation based on Spring Cloud Gateway. This gateway receives all requests for the application and routes each one to the correct micro-frontend or API Docker cluster. Each micro-frontend has its own routes, which are defined by the business requirements, and all of our microservices are located behind the API context. Some application functionality, such as the SAM.gov login services or reporting software, resides at a custom context configured in the gateway.
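The routing behavior described above can be illustrated with a minimal, language-neutral sketch in Python. It is not the Spring Cloud Gateway configuration used by SAM.gov: the path prefixes, target names, and the longest-prefix matching rule are all assumptions chosen only to show how one entry point can fan out to micro-frontends, the API context, and a custom context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str  # path prefix handled by this route (hypothetical value)
    target: str  # name of the upstream cluster (hypothetical value)


# Hypothetical route table: one API context, one custom context, and
# per-micro-frontend routes. None of these paths are taken from SAM.gov.
ROUTES = [
    Route("/api/", "api-cluster"),
    Route("/login/", "login-services"),
    Route("/search/", "search-mfe"),
    Route("/workspace/", "workspace-mfe"),
    Route("/", "home-mfe"),
]


def resolve(path: str, routes=ROUTES) -> str:
    """Return the target of the longest route prefix that matches the path."""
    matches = [r for r in routes if path.startswith(r.prefix)]
    if not matches:
        raise LookupError(f"no route for {path}")
    return max(matches, key=lambda r: len(r.prefix)).target
```

With this table, a request for `/api/entities/123` resolves to the API cluster, while any path without a more specific prefix falls through to the home micro-frontend.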
User authentication is provided by an external provider, Login.gov. We host services that set up the initial authentication parameters for the browser. The user is then redirected to Login.gov to complete the login process and is returned to SAM.gov, where a session is established for the remainder of the user's interactions. We also use a third-party tool, API Umbrella, to provide direct authentication for other applications that want to integrate with our API offering. Once an application has been approved, it is provided with API keys that can be used for programmatic access to SAM.gov APIs.
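As a rough illustration of programmatic access with an API key, the sketch below builds a request URL that carries the key as a query parameter. The base URL, the parameter name `api_key`, and the `q` search parameter are assumptions made for the example; the documentation for each API should be consulted for the actual endpoint and the way the key is supplied.

```python
from urllib.parse import urlencode


def build_request_url(base_url: str, api_key: str, **params: str) -> str:
    """Append query parameters and the API key to a base URL."""
    if not api_key:
        raise ValueError("an API key is required for programmatic access")
    # The parameter name "api_key" is an assumption for this sketch.
    query = urlencode({**params, "api_key": api_key})
    return f"{base_url}?{query}"


# Hypothetical endpoint and placeholder key, used only for illustration.
example_url = build_request_url("https://api.example.gov/v1/items", "DEMO_KEY", q="abc")
```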
Because SAM.gov has complex and unique user authorization requirements, a custom role management service was developed to provide role-based authorization and permissions for accessing objects such as entities, opportunities, and organizations. The role management service lets users self-manage their organizations through role requests, approvals, and removals.
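The request, approval, and removal cycle can be sketched with a small in-memory model. This is a hypothetical stand-in, not the SAM.gov role management service: the class name, method names, the `admin` role, and the rule that only an administrator of the same organization may approve a request are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RoleManager:
    """In-memory stand-in for a role management service (hypothetical API)."""

    granted: set = field(default_factory=set)  # (user, role, org) tuples
    pending: set = field(default_factory=set)  # requests awaiting approval

    def request(self, user: str, role: str, org: str) -> None:
        """Record a role request that still needs approval."""
        self.pending.add((user, role, org))

    def approve(self, approver: str, user: str, role: str, org: str) -> None:
        """Grant a pending request; assumes only an org admin may approve."""
        if (approver, "admin", org) not in self.granted:
            raise PermissionError(f"{approver} cannot approve roles for {org}")
        if (user, role, org) not in self.pending:
            raise KeyError("no matching pending request")
        self.pending.remove((user, role, org))
        self.granted.add((user, role, org))

    def remove(self, user: str, role: str, org: str) -> None:
        """Revoke a role if the user holds it."""
        self.granted.discard((user, role, org))

    def is_allowed(self, user: str, role: str, org: str) -> bool:
        """Check whether the user currently holds the role in the org."""
        return (user, role, org) in self.granted
```

In this model a role only takes effect after approval, so a user who has merely requested a role is still denied access.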
Reporting is a key pillar of SAM.gov, exposing the vast amount of information that it manages. A third-party reporting tool, MicroStrategy, provides users with pre-built reports and an interface for building more complex report queries on the data contained within the SAM.gov ecosystem. Users then run the reports and visualize the data in the way that best suits their needs.
The sections below explain the various aspects of the architecture in greater detail, including the high-level architecture view, component interactions, business architecture, authentication, and the technology stack.
The high-level architecture view of SAM.gov shows the components of the system.
Figure (1) below represents the baseline system architecture. This diagram gives an overview of the users of the system, development tools and pipeline, production environment, operations infrastructure, and platform configuration. Specific approved software products are enumerated to denote the tools in place that enable system functionality. The system is currently in transition, hosting both legacy and modernized solution components connected by different integration mechanisms.
The top of the figure shows how the various system users interact with the system, including through the Login Services (Login.gov for user authentication and Okta for system account authentication). These users include regular web portal users, administrators, interfacing systems (including contract writing systems), and personnel who manage the system, such as developers and testers.
The left-hand side of the figure shows how the system is made. Stakeholders and product owners create ideas that are placed and prioritized in backlogs. Development teams using Agile processes turn these ideas into code committed to GitHub. Code changes in GitHub trigger CI/CD Pipeline jobs to build, test, package, and deploy the code. The CI/CD Pipeline is enabled by the GitHub code repository, Jenkins deployment management, and container instances running on Docker. System changes are built, tested, and packaged as they move through the environments until they are deployed to the production environment. Development tools used to facilitate these changes include Angular (UI development), Postgres (database), JavaScript, Spring Boot (API development), GitHub (code repository), JAWS (508 testing), and Cucumber/Selenium (testing). The system's operating environments include Charlie (for load and performance testing), dev, test, staging, and production.
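The build, test, package, and deploy sequence can be sketched as an ordered list of stages that stops at the first failure. This is a simplified model and not the Jenkins pipeline definition itself; the function name and the convention that each step is a callable returning `True` on success are assumptions for the example.

```python
from typing import Callable, Dict, List

# Stage order taken from the description above.
STAGES = ["build", "test", "package", "deploy"]


def run_pipeline(steps: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run stages in order and return the ones that completed successfully.

    Execution stops at the first stage whose step returns False, so a
    failed test run prevents packaging and deployment.
    """
    completed = []
    for stage in STAGES:
        if not steps[stage]():
            break
        completed.append(stage)
    return completed
```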
The center of the figure shows the wealth of Business Services for the user including, but not limited to, Home, Search, Data Bank, Data Services, Data Entry, Help, System Accounts, and Workspace. These business services leverage the Front End and microservices including, but not limited to, SAM Front-End, Entity/Exclusion MFE, Search MFE, System Accounts MFE, Admin MFE, and CMS MFE to communicate with the APIs within the Technical Services layer (i.e., entity registration APIs, opportunities APIs, exclusion APIs, federal hierarchy APIs, etc.). An API may consist of several REST-based endpoints. Each microservice encapsulates and manages the data that it creates. The business services are further enabled by a layer of containerized technical services APIs that support program-wide technical functions such as access management (user role management and system account management). Besides using the dataset encapsulated in each microservice, the APIs use the data enabled by the technical backbone: MicroStrategy services for Reports functionality and Elasticsearch for search functionality. The APIs interact with a Data Services layer through integration services for messaging (Kinesis) and batch transfer (ETL). The Data Services layer includes Elasticsearch, a Postgres database, and a reporting LDE Redshift database. It feeds data into the technical backbone of the Technical Services layer so that the suite of system APIs can serve the functionality of the business services. The Elasticsearch database consolidates the data for the individual system functional domains referenced above (i.e., entity registration, opportunities, exclusion, federal hierarchy, etc.) for read-only operations.
The bottom of the figure shows the platform layer that hosts the site, enabling all aspects of the system described thus far. The platform for the system resides at FCS. FCS hosts the Amazon Web Services (AWS) platform and also provides its own suite of services. The AWS services include both AWS Infrastructure as a Service and Platform as a Service. The Infrastructure as a Service (IaaS) offering includes EC2 compute, object storage (S3 bucket), network VPC, and AWS monitoring (CloudWatch). The AWS Platform as a Service (PaaS) offerings include RDS Database as a Service (which feeds into the Postgres database), Search Engine as a Service (Elasticsearch), data messaging (Kinesis), ElastiCache (Redis), and data warehousing (Redshift). The AWS services work in tandem with FCS's independent services, which are grouped into FCS Platform as a Service, FCS Monitoring, and FCS Security. FCS Platform as a Service enables the Search Engine as a Service (Elasticsearch), the Reports engine (MicroStrategy), application logging (Splunk), the API services engine (API Umbrella), infrastructure-level logging (Logstash), and static code scanning (SonarQube). FCS Monitoring enables security scanning on images (Twistlock) and penetration testing (OWASP). DataDog is leveraged by IAE for application performance monitoring, and CloudFront is the solution leveraged for CDN.
There are system-wide components, not shown explicitly in the above figure, that provide audit logging and debugging tools including, but not limited to, the logging solution and ServiceHub. Together, the components described above enable the functionality of the system, facilitate deployments that implement system changes and enhancements, and securely manage the system to prevent malicious activity. For users, these functions are largely transparent and enable regular, day-to-day usage of the system. As denoted in the architecture, however, a complex set of environments, technical/data layers, security layers, and platform layers all work in tandem to provide an integrated user experience.
Key interactions between these components, and selected sub-components, are shown in the figure below.
The above figure captures the interactions among SAM.gov components. The top captures the major capabilities available through the SAM.gov user interface, which include Home/Help, Search, Data Bank, Data Services, and data entry/workspace for the various business domains that SAM.gov supports. Login.gov is the identity provider (IDP) for SAM; users are directed to it when they attempt to sign in to SAM. The Search capability is built on OpenSearch, the Data Bank reports are built on MicroStrategy, which uses a Redshift cluster as its data source, the help content uses FSD, and the extracts under Data Services pull data from S3. The workspace and data entry capabilities leverage multiple APIs such as Contract Opportunities, Assistance Listings, Wage Determination (which gets data from WDOL via email), Federal Hierarchy, UEI Manage, Entity Hierarchy, and System Accounts. The ISAM environment provides data entry for SAM registration information and stores all the sensitive data. It is a separate VPCaaS account, and data is synchronized between the two accounts using Kinesis and APIs. The notifications to users (not captured in the diagram) comprise Alerts, Requests, Feeds, and Emails, which have their own microservices to manage that data. The contract writing system (CWS) accounts are managed by the system accounts API, which leverages Okta for authentication; authorization is built into the API. The list of APIs available to the CWS and other consumers of SAM's data can be found at https://open.gsa.gov/api/
The baseline data architecture gives an overview of the flow of data between the various domains involved in the system. It shows the interaction between the domains in SAM.gov, legacy SAM, and external systems. The main legacy systems include legacy SAM, FPDS, and WDOL. The transactional data layer consists of the AWS RDS (Postgres) databases for each domain, such as FAL, wage determination, opportunities, and all common services. The application interface of SAM.gov retrieves data via an OpenSearch index cluster that is created over entities, exclusions, FAL, wage determination, opportunities, and common services. The data from legacy SAM and FPDS flows via the ETL layer to the OpenSearch cluster. An AWS Kinesis stream is set up to feed data to the search index cluster in order to improve the performance of the entity and exclusion search. The WDOL data is loaded into the Postgres database via XML extracts on a periodic basis. The common services, which include Watchlist, Alerts, Feeds, Feedback, Role Management, Reference Data, and Federal Hierarchy, feed data to all the domains, including legacy SAM, all the domains in the transactional layer, and the data warehouse in AWS Redshift. The data from Federal Hierarchy, Opportunities, and Assistance Listings is stored in AWS Redshift, which is used by MicroStrategy to build reports displayed on the UI.
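The idea of a stream feeding a consolidated, read-only search index can be sketched as applying ordered change events to a dictionary keyed by domain and record identifier. This is a conceptual model only and does not use the Kinesis or OpenSearch client libraries; the event fields (`domain`, `id`, `op`, `doc`) and the two operations are assumptions for the example.

```python
from typing import Dict, List, Tuple

Index = Dict[Tuple[str, str], dict]


def apply_events(index: Index, events: List[dict]) -> Index:
    """Apply ordered change events to a copy of the search index.

    Each event is a dict with hypothetical fields: "domain", "id",
    "op" ("upsert" or "delete"), and "doc" for upserts.
    """
    updated = dict(index)  # leave the caller's index untouched
    for event in events:
        key = (event["domain"], event["id"])
        if event["op"] == "delete":
            updated.pop(key, None)
        else:  # treat anything else as an upsert
            updated[key] = event["doc"]
    return updated
```

Because later events overwrite earlier ones for the same key, the index converges on the latest state of each record as long as events are applied in order.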
The users of the system are broadly divided into two categories: human accounts and system (service) accounts. The system authenticates the two types of users with separate authentication modules, Login.gov and Okta respectively.
The IAE technology stack follows the Platform-as-a-Service (PaaS) model. As shown in Figure (3) PaaS Stack below, the IAE Tenant (i.e., the SAM.gov system) comprises the Applications and associated Data layers, which run as containerized services hosted on the FCS Platform as a Service (PaaS). The FCS Platform is grouped into two areas: the Runtime and Operating System (OS) layers, which are a Container as a Service (CaaS) offering, and the underlying Hypervisor, Storage, Network, and Hardware layers, which are an Infrastructure as a Service (IaaS) offering running on Amazon Web Services (AWS). The AWS services run in the US East (Northern Virginia) Region. The platform inherits security controls and services available from AWS. The platform was previously known as the Business Service Platform (BSP).
- Route 53
- EC2
- VPC
- Application Load Balancer (ALB)/ELB
- OpenSearch
- ElastiCache (Redis)
- Key Management Service (KMS)
- Kinesis
- Simple Storage Service (S3)
- Redshift
- RDS
- CloudFront
- CloudFormation
- GitHub Enterprise
- Jenkins
- Artifactory
- Marketplace Application (customer built by FCS)
- Splunk
- ForgeRock (FCS Auth)
- Windows Data Jumphost
- Citrix Nginxs
- Docker Enterprise (Mirantis Cluster)
- API Umbrella