Data Capabilities
Capabilities are discrete functional components that enable a business or technical activity. These capabilities can be used in concert to enable various plays within the playbook. They are grouped into Services that are security pre-approved, with associated security controls that accelerate the Authority to Operate (ATO) of our tenant applications. For data, the defined services are Data Governance, Data Management, and Data Analytics.
Maturity Index

1. Data Catalog (Data Governance)
Description
The Data Catalog is an organized inventory of data assets in the organization. It uses metadata to help FAS manage its data, and helps data professionals collect, organize, access, and enrich metadata to support data discovery and governance.
The data catalog stores data set and attribute-level metadata and enables data stewards to create and maintain that metadata for existing and new data sets.
Key Capability | Description |
---|---|
Data Search & Discovery | Find relevant information within large volumes of enterprise data, contextualize it, and determine how the data can be accessed and used |
Curation & Governance | Ensure analytics and insights are derived from the best, most trusted data. By applying governance at the point of data use, the data catalog reduces misuse of data and ensures compliance with agency and regulatory policies. |
Collaboration & Analysis | Through wiki-like articles, ratings, reviews, and conversations, the data catalog facilitates collaboration among an increasingly global and remote workforce. |
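To make the search and discovery capability concrete, here is a minimal sketch of keyword search over an in-memory metadata inventory. The data set names, tags, and steward values are hypothetical examples, not actual catalog content.

```python
# Minimal sketch: keyword search over data set metadata in an in-memory
# catalog. All data set names, tags, and stewards here are hypothetical.

CATALOG = [
    {"name": "vendor_master", "description": "Authoritative list of vendors",
     "tags": ["procurement", "master-data"], "steward": "FAS Data Office"},
    {"name": "lease_transactions", "description": "Monthly lease payment records",
     "tags": ["real-estate", "finance"], "steward": "FAS Data Office"},
]

def search_catalog(keyword, catalog=CATALOG):
    """Return data sets whose name, description, or tags mention the keyword."""
    kw = keyword.lower()
    return [
        entry for entry in catalog
        if kw in entry["name"].lower()
        or kw in entry["description"].lower()
        or any(kw in tag for tag in entry["tags"])
    ]

print([e["name"] for e in search_catalog("lease")])   # → ['lease_transactions']
```

A production catalog would index far richer metadata (lineage, quality scores, access policies), but the same lookup pattern applies.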
Maturity
FCS Product Offerings
- Data Governance
Technologies

Additional Documentation
* Data Catalog Capability

2. Data Quality (Data Governance)
Description
The Data Quality Service provides the capabilities needed to assess the validity, accuracy, completeness, correctness, and timeliness of data. The service supports data users as they evaluate new data sets, as well as production applications and data pipelines as they perform create, read, update, and delete (CRUD) operations and process data.
Key Capability | Description |
---|---|
Data Profiling | Generate descriptive metadata about a data set (e.g., schema, data types, field lengths, value distributions, valid values) |
Rule Definition | Specify data quality rules based on prescriptive (e.g., business) and descriptive (e.g., technical) constraints, and specify applicability (e.g., full data set, sampling) |
Rule Execution | Invoke rules through data pipelines/orchestration solutions and support corrective data quality, including logging of data corrections and rule execution results |
Rule Lifecycle Management | Modify rules and track changes over time. |
DQ Results Reporting/Notification | Includes DQ dashboard results and alerts/notifications for users and systems |
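The rule definition and execution capabilities can be sketched as named predicates applied per record, with pass/fail counts collected for DQ reporting. The field names, formats, and rules below are hypothetical examples.

```python
# Minimal sketch of rule definition and execution: each rule is a named
# predicate applied per record, with results tallied for DQ reporting.
# Field names and rule conditions are hypothetical.

import re

RULES = {
    "ssn_format": lambda rec: bool(re.fullmatch(r"\d{3}-\d{2}-\d{4}", rec.get("ssn", ""))),
    "amount_positive": lambda rec: rec.get("amount", 0) > 0,
    "state_valid": lambda rec: rec.get("state") in {"VA", "MD", "DC"},
}

def run_rules(records, rules=RULES):
    """Execute every rule against every record; return pass/fail counts per rule."""
    results = {name: {"passed": 0, "failed": 0} for name in rules}
    for rec in records:
        for name, rule in rules.items():
            results[name]["passed" if rule(rec) else "failed"] += 1
    return results

records = [
    {"ssn": "123-45-6789", "amount": 250.0, "state": "VA"},
    {"ssn": "123456789", "amount": -5.0, "state": "TX"},
]
print(run_rules(records))
```

A real DQ engine would add sampling, corrective actions, and persisted execution logs, but the shape of the rule/executor split carries over.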
Maturity
Technologies
* TBD

Additional Documentation
* TBD

3. Master Data (Data Governance)
Description
The Master Data Service provides capabilities to rationalize core data domains and create authoritative data sets that can be exposed and leveraged across systems and business domains.
Key Capability | Description |
---|---|
Unique Identifier Creation and Management | Create/apply unique IDs to drive consistent identification/linking of master data elements across systems |
Data Standardization | Apply consistent formatting and correct data inconsistencies in master data elements (e.g., address formatting standardization) |
Exact Matching | Identify master data relationships across systems based on byte-for-byte matching values |
Fuzzy Matching | Identify potential master data relationships across systems based on similar values and/or complex logic across multiple attributes |
Recommendations | Show potential master data record matches across systems and allow users to determine whether they are valid matches. |
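The difference between exact and fuzzy matching can be illustrated with the standard library's `difflib`; a dedicated MDM tool would use more sophisticated algorithms. The vendor names below are hypothetical.

```python
# Sketch of exact vs. fuzzy matching using difflib from the standard library.
# Record values and the match threshold are illustrative only.

from difflib import SequenceMatcher

def exact_match(a, b):
    """Byte-for-byte comparison."""
    return a == b

def fuzzy_score(a, b):
    """Similarity ratio between 0.0 and 1.0, case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def recommend_matches(candidate, records, threshold=0.85):
    """Return records similar enough to be flagged for steward review."""
    return [r for r in records if fuzzy_score(candidate, r) >= threshold]

records = ["Acme Corporation", "ACME Corp.", "Apex Industries"]
print(exact_match("Acme Corporation", records[0]))   # → True
print(recommend_matches("Acme Corporation", records))
```

Tuning the threshold trades recall for precision; records scoring in a middle band are typically routed to a steward for manual confirmation, as the Recommendations capability describes.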
Maturity
Technologies
* TBD

Additional Documentation
* TBD

4. Data Lifecycle (Data Governance)
Description
The Data Lifecycle Service provides the mechanism to manage data storage in alignment with data retention, archiving, and purge requirements to reduce data sprawl and storage costs.
Key Capability | Description |
---|---|
Lifecycle Definition | Define the conditions under which data can be retained, archived, and/or purged. |
Time-Driven Lifecycle | Move data to lower-tiered storage based on elapsed calendar time since the data was created or modified. |
Utilization-Driven Lifecycle | Move data to lower-tiered storage based on elapsed calendar time since the data was last accessed. |
Intelligent Tiering | Move data between storage tiers based on utilization/access patterns. |
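A time-driven lifecycle definition can be expressed declaratively, as in an S3 lifecycle configuration. The sketch below builds such a rule as a plain dictionary; the prefix and day counts are hypothetical, and in practice the dict would be applied to a bucket with boto3's `put_bucket_lifecycle_configuration`.

```python
# Sketch: a time-driven lifecycle rule in the shape of an S3 lifecycle
# configuration. Prefix and retention periods are hypothetical examples.

def lifecycle_rule(prefix, archive_after_days, purge_after_days):
    """Archive objects to Glacier after one threshold, purge after another."""
    return {
        "ID": f"lifecycle-{prefix.strip('/')}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": archive_after_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": purge_after_days},
    }

# Hypothetical policy: archive raw data after 90 days, purge after ~7 years.
config = {"Rules": [lifecycle_rule("raw/", 90, 2555)]}
print(config["Rules"][0]["ID"])   # → lifecycle-raw
```

Expressing the policy as data rather than code is what lets the storage platform enforce it continuously without manual intervention.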
Maturity
FCS Product Offerings
- Data Lake
- Data Warehouse
Technologies



Additional Documentation
* AWS S3 Lifecycle Management

5. Lineage (Data Governance)
Description
The Data Lineage Service enables understanding, recording, and visualizing data as it flows from data sources to consumption. This includes all transformations the data underwent along the way: how the data was transformed, what changed, and why.

Data lineage shows the history of the data you are looking at today, detailing where it originated and how it may have changed over time. It is a reflection of the data life cycle: the source, what processes or systems may have altered it, and how it arrived at its current location and state.
Key Capability | Description |
---|---|
Lineage Mapping | Graphical representation of the data flow between source and target |
Lineage Details | Description of the data transformations applied through each step of the data processing pipeline. |
Design-Time Lineage | Lineage based on the intended process flow when the data pipeline was being created. |
Run-Time Lineage | Lineage based on the actual data pipeline execution. |
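Run-time lineage is essentially a graph of target-to-source edges annotated with the transformation applied. This minimal sketch records such edges and traces a data set back to its original sources; the data set names and transformations are hypothetical.

```python
# Sketch: run-time lineage recorded as edges (target -> source) with the
# transformation applied; tracing upstream answers "where did this come from?"
# Data set names and transformation descriptions are hypothetical.

LINEAGE = {
    "report.monthly_spend": [("warehouse.spend_fact", "aggregate by month")],
    "warehouse.spend_fact": [("lake.raw_invoices", "dedupe + currency conversion")],
    "lake.raw_invoices": [],   # original source: no upstream edges
}

def trace_upstream(dataset, lineage=LINEAGE):
    """Walk the lineage graph back to the original sources."""
    path = []
    for source, transform in lineage.get(dataset, []):
        path.append((dataset, source, transform))
        path.extend(trace_upstream(source, lineage))
    return path

for target, source, transform in trace_upstream("report.monthly_spend"):
    print(f"{target} <- {source} ({transform})")
```

A lineage tool captures these edges automatically from pipeline execution; the traversal above is what powers both lineage mapping (the graph) and lineage details (the transformation annotations).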
Maturity
FCS Product Offerings
Technologies

Additional Documentation
* Data Catalog Capability

6. Reference Data (Data Governance)
Description
The Reference Data Service provides a means to manage bounded, common data sets across data domains to drive consistency. Reference data is slowly changing by nature and is used to group or organize other data. Within OLAP models, reference data is often represented through dimension tables.

Managing reference data centrally ensures the ability to consistently group and organize data, which enables easier cross-domain analytics.
Key Capability | Description |
---|---|
Reference Data Inventory | Store and manage reference data sets centrally. |
Change Notification | Create systematic alerts when reference data records are created, modified, or deleted. |
Reference Data Harmonization | Standardization of multi-source reference data through business rules applied as transformation logic. |
Maturity
FCS Product Offerings
Technologies

Additional Documentation
* TBD

7. Data Policy (Data Governance)
Description
The Data Policy service provides a centralized location to define and manage the rules for user interaction with data. Stewards can map the rules to specific data sets and identify which policies are applied to which data and user groups.
Key Capability | Description |
---|---|
Policy Definition | Specify rules, conditions, and warnings mapped to data sets and elements. |
Policy Execution | Manage user access to and interaction with data based on the defined rules, consistent with the policy definition. |
Policy Audit | Detailed view of a policy definition and how it is applied to specific data sets and elements. |
Maturity
FCS Product Offerings
Technologies

Additional Documentation
* TBD

8. Sensitive Data Detection (Data Governance)
Description
The Sensitive Data Detection service provides an automated means to identify data elements that require additional data protection or special handling based on organizational or regulatory rules.
Key Capability | Description |
---|---|
Pattern Matching | Identification of sensitive data elements based on attribute structure/format. |
Metadata Matching | Identification of sensitive data elements based on attribute name or definition. |
Rule Definition | Creation of detection rules based on business-defined conditions. |
Catalog Integration | Automated updating of data catalog with tags for sensitive data attributes. |
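Pattern matching and metadata matching can be combined in a single detector: value formats suggest one set of tags, attribute names another. The patterns, column names, and tag vocabulary below are illustrative assumptions, not an agency standard.

```python
# Sketch combining pattern matching (value format) and metadata matching
# (attribute name) to suggest sensitive-data tags for a column. Patterns,
# name hints, and sample data are illustrative only.

import re

VALUE_PATTERNS = {
    "SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "EMAIL": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}
NAME_HINTS = {"ssn": "SSN", "social_security": "SSN", "email": "EMAIL"}

def detect_sensitive(column_name, sample_values):
    """Return the set of sensitivity tags suggested for a column."""
    tags = set()
    for hint, tag in NAME_HINTS.items():
        if hint in column_name.lower():       # metadata matching
            tags.add(tag)
    for tag, pattern in VALUE_PATTERNS.items():
        if sample_values and all(pattern.match(v) for v in sample_values):
            tags.add(tag)                      # pattern matching
    return tags

print(detect_sensitive("applicant_ssn", ["123-45-6789", "987-65-4321"]))  # → {'SSN'}
```

The catalog integration capability would then push these suggested tags into the data catalog automatically, where stewards can confirm or reject them.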
Maturity
Technologies
* TBD

Additional Documentation
* TBD

9. Scheduling and Orchestration (Data Management)
Description
The Scheduling and Orchestration Service provides the ability to set up recurring executions of data pipelines/processes based on time parameters or conditions. This service reduces the need for manual intervention and can be used in conjunction with infrastructure provisioning capabilities.
Key Capability | Description |
---|---|
Time-Based Schedule Creation | Configuration of recurring job executions based on time conditions (e.g., time of day, day of week, first day of the month) |
Condition-Based Schedule Creation | Configuration of recurring job executions based on specific conditions being true. This could include a dependency on another job, a specific file being delivered, or a notification from another system. |
Job Execution Retry | Automatically restart a job that does not complete successfully. |
Point-of-Failure Restartability | In the event of a process failure, the ability to restart the job from the point where the failure occurred rather than rerunning the entire process. |
Job Branching and Merging | Complex orchestration that allows jobs to initiate other jobs, wait for other jobs to complete before executing, and feed processing details into subsequent jobs. |
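The job execution retry capability can be sketched as a wrapper that re-invokes a failing job within a bounded attempt budget. The flaky job below is a hypothetical stand-in for a real pipeline step.

```python
# Sketch of job-execution retry: re-invoke a failing job a bounded number of
# times before giving up. The job function is a hypothetical stand-in.

def run_with_retry(job, max_attempts=3):
    """Call job() until it succeeds or the attempt budget is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception as exc:
            print(f"attempt {attempt} failed: {exc}")
            if attempt == max_attempts:
                raise

attempts = {"count": 0}

def flaky_job():
    """Simulated job that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient failure")
    return "done"

print(run_with_retry(flaky_job))   # → done (after two failed attempts)
```

An orchestrator typically adds a backoff delay between attempts and distinguishes retryable (transient) from non-retryable failures; both refinements layer cleanly onto this loop.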
Maturity
FCS Product Offerings
Technologies
* Linux cron / crontab

Additional Documentation
* TBD

10. Data Model (Data Management)
Description
The Data Model service provides a means to manage and map key organizational data and relationships between that data and represent those relationships graphically. This is key for supporting data governance, management, and design activities. Integration of the Data Modeling solution and Data Catalog is key to ensure consistent data management.
Key Capability | Description |
---|---|
Model Creation | Define a model including key entities/tables, attributes/fields, and relationships. |
Attribute Management | Configuration of attributes, including defining business and technical metadata. |
Relationship Management | Define how different entities are related based on the attributes that each entity contains. |
Constraint Management | Establish rules for attributes (e.g., key values, valid values, nullability, format). |
Data Definition Language (DDL) Generation | Creation of scripts from the data model that can be used to create/modify database objects. |
Reverse Engineering | Generation of a data model from a database's DDL. |
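DDL generation follows directly from a model definition: each attribute's type and constraint metadata is translated into a column clause. The entity and attribute names below are hypothetical.

```python
# Sketch of DDL generation: emit a CREATE TABLE statement from a simple
# model definition. Entity, attribute, and type choices are hypothetical.

MODEL = {
    "table": "vendor",
    "attributes": [
        {"name": "vendor_id", "type": "INTEGER", "nullable": False, "primary_key": True},
        {"name": "vendor_name", "type": "VARCHAR(100)", "nullable": False},
        {"name": "duns_number", "type": "CHAR(9)", "nullable": True},
    ],
}

def generate_ddl(model):
    """Translate attribute and constraint metadata into column clauses."""
    cols = []
    for attr in model["attributes"]:
        clause = f"{attr['name']} {attr['type']}"
        if not attr.get("nullable", True):
            clause += " NOT NULL"
        if attr.get("primary_key"):
            clause += " PRIMARY KEY"
        cols.append(clause)
    return f"CREATE TABLE {model['table']} (\n  " + ",\n  ".join(cols) + "\n);"

print(generate_ddl(MODEL))
```

Reverse engineering is the inverse mapping: parsing existing DDL back into the model structure so legacy databases can be brought under the same governance.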
Maturity
Technologies
* TBD

Additional Documentation
* TBD

11. Data Sharing (Data Management)
Description
The Data Sharing service provides systematic means for data owners to expose data to interested parties through controlled interfaces.
Key Capability | Description |
---|---|
Direct Access | Provide data consumers direct access to the data storage layer through defined access controls based on the sensitivity of the data and the permissions of the user. |
Data Abstraction | Creation of a semantic layer to manage data access and provide a managed view of the data to consumers who do not have direct access to the underlying data. |
Maturity
Technologies
* TBD

Additional Documentation
* TBD

12. Data Exchange (Data Management)
Description
The Data Exchange service provides a means to deliver authoritative copies of data to downstream users/systems to support local application processing and/or analytics.
Key Capability | Description |
---|---|
Bulk Data Transfer | Creation of an authoritative copy of data that can be delivered to the consumer for reuse. This is done through the creation of batch files delivered to a specified location or through the creation of database replicas for one-time or ongoing (change data capture) data transfer. |
Application Programming Interface (API) | Brokered real-time synchronous interface between data owner and consumer based on a request/response paradigm whereby the consumer makes a specific request for data to the data owner based on a predefined data specification. |
Event Publication | Brokered real-time asynchronous interface where the data owner publishes notifications of changes in state or the actual state change of data to a centralized queue where consumers can monitor the queue for data of interest and consume and process the data as the events occur. |
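The event publication pattern can be sketched with the standard library's queue as a stand-in for a managed message broker; the topic name and payload shape are hypothetical.

```python
# Sketch of brokered, asynchronous event publication: the data owner publishes
# state changes to a topic; consumers drain the topic as events arrive. The
# standard library's queue stands in for a managed broker; payloads are
# hypothetical.

import queue

broker = {"vendor-updates": queue.Queue()}

def publish(topic, event):
    """Data owner pushes a state-change notification onto the topic."""
    broker[topic].put(event)

def consume(topic):
    """Consumer drains and returns every pending event on the topic."""
    events = []
    q = broker[topic]
    while not q.empty():
        events.append(q.get())
    return events

publish("vendor-updates", {"vendor_id": 42, "change": "address updated"})
publish("vendor-updates", {"vendor_id": 7, "change": "record created"})
print(len(consume("vendor-updates")))   # → 2
```

The decoupling is the point: the publisher needs no knowledge of its consumers, which is what distinguishes event publication from the request/response API pattern above it.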
Maturity
FCS Product Offerings
- Data Lake
- Database Migration
Technologies


Additional Documentation
* TBD

13. Data Processing (Data Management)
Description
The Data Processing service enables integration, standardization, organization, and derivation of data to make it easier to consume downstream. It supports data integration to manipulate and consolidate data from disparate sources into a useful form, giving users easy and reliable access that meets the information needs of all applications, users, and business processes. It helps produce a unified view from which actionable information can be gleaned.
Key Capability | Description |
---|---|
Extract Transform Load (ETL) | Access and pull data from sources, apply transformations, refine and publish data for downstream consumption. |
Extract Load Transform (ELT) | Access and pull data from sources, persist a copy of the source data for additional refinement, apply transformations, refine and publish data for downstream consumption. |
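The ETL pattern reduces to three composable steps. This minimal sketch uses in-memory lists in place of real source and target systems; the source records and field names are hypothetical.

```python
# Minimal ETL sketch: extract rows from a source, apply transformations, and
# load into a target store. Source data and field names are hypothetical;
# in-memory lists stand in for real source/target systems.

SOURCE = [
    {"id": 1, "amount": "100.50", "state": "va"},
    {"id": 2, "amount": "75.00", "state": "md"},
]
TARGET = []

def extract():
    """Pull rows from the source system."""
    return list(SOURCE)

def transform(rows):
    """Standardize types and formatting before publication."""
    return [
        {"id": r["id"], "amount": float(r["amount"]), "state": r["state"].upper()}
        for r in rows
    ]

def load(rows, target=TARGET):
    """Publish refined rows for downstream consumption."""
    target.extend(rows)

load(transform(extract()))
print(TARGET[0])   # → {'id': 1, 'amount': 100.5, 'state': 'VA'}
```

ELT reorders the same steps: the raw extract is persisted in the target first, and transformation happens afterward inside the target platform, which preserves the untouched source copy for later refinement.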
Maturity
FCS Product Offerings
- Extract Transform Load (ETL)
- Data Processing Cluster
- Database Migration
Technologies



Additional Documentation
* Data Integration Play

14. Data Storage (Data Management)
Description
The Data Storage service provides the ability to store, manage, and expose data for data consumers to access, query, explore, analyze, and use that data to generate new insights and reports.
Key Capability | Description |
---|---|
Unstructured Data Storage | Capturing and persisting data in a scalable manner to enable centralized storage of cross-domain data for further downstream processing and consumption. Unstructured data storage can handle any file/object type and store it in a cost-effective manner with easy ingestion and access methods. |
Structured Data Storage | Capturing and persisting conformed data organized in a business context to support ease of data exploration, analytics and reporting. Structured data storage enforces data design specifications such as schema to improve quality and usability of the data. |
Data Access | Query/interact with data through standard interfaces based on user roles and data protection policies. |
Data Protection | Encrypt data to further protect it from unnecessary exposure/access. |
Maturity
FCS Product Offerings
- Data Lake
- Data Warehouse
Technologies


Additional Documentation
* Data Warehousing with AWS Redshift

15. Self Service (Data Analytics)
Description
Self-Service provides capabilities to allow users to query data through a command line interface that supports ANSI standard SQL and manipulate, integrate, and transform data to derive new insights. This service is intended to allow business users to generate new insights and prototype data pipelines.
Key Capability | Description |
---|---|
Query Creation | Writing of custom SQL against analytic data stores to explore the data and generate insights |
Query Optimization and Editing | Refactoring of queries based on new business requirements or to improve performance based on systematic recommendations (e.g., an explain plan) |
Query Version Control | Saving/persisting versions of a query, tracking changes, and potentially branching/merging code across users |
Extract Transform Load (ETL) | Access and pull data from sources, apply transformations, and refine and publish data to support localized analytics/reporting |
Extract Load Transform (ELT) | Access and pull data from sources, persist a copy of the source data for additional refinement, apply transformations, and refine and publish data to support localized analytics/reporting. |
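Query creation against an analytic store can be illustrated with the standard library's `sqlite3` as a stand-in for the actual data store; the table, columns, and figures are hypothetical.

```python
# Sketch of self-service querying with ANSI-style SQL, using the standard
# library's sqlite3 in place of an analytic data store. Table name, columns,
# and amounts are hypothetical.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spend (agency TEXT, fiscal_year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO spend VALUES (?, ?, ?)",
    [("GSA", 2023, 120.0), ("GSA", 2024, 150.0), ("DOI", 2024, 90.0)],
)

# Ad hoc aggregation a business user might prototype before it becomes a
# governed pipeline.
rows = conn.execute(
    "SELECT agency, SUM(amount) AS total FROM spend "
    "WHERE fiscal_year = 2024 GROUP BY agency ORDER BY total DESC"
).fetchall()
print(rows)   # → [('GSA', 150.0), ('DOI', 90.0)]
```

Because the query is plain ANSI-style SQL, the same statement a user prototypes here can generally be promoted into a managed pipeline with little rework.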
Maturity
Technologies





Additional Documentation
* TBD

16. Computational Service (Data Analytics)
Description
The Computational service provides a means for scalable, parallelized complex data processing and compute. It is intended to support core capabilities for advanced data processing to support analytics, data science, and machine learning (ML).
Key Capability | Description |
---|---|
Apache Spark-Based Processing | Leverages Spark's in-memory processing to improve scale and parallelization for large-scale data processing. |
Multi-Language Support | Use Python, Scala, or Java to write Spark-based data processes. |
Library Integrations | Extend data science functionality through common open-source libraries. |
EMR Studio / Jupyter Notebooks | Integrated development environment (IDE) for data scientists and data engineers to develop, visualize, and debug data engineering and data science applications written in R, Python, Scala, and PySpark. |
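The core pattern Spark scales out (partition the data, map over partitions in parallel, then reduce the partial results) can be sketched without Spark itself. The thread pool below is a deliberate stand-in for Spark executors, used only to illustrate the shape of the computation.

```python
# Sketch of the partition -> parallel map -> reduce pattern that Spark scales
# out across a cluster, illustrated with a standard-library thread pool as a
# stand-in for Spark executors.

from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def partition(data, n):
    """Split data into n roughly equal chunks."""
    size = max(1, len(data) // n)
    return [data[i:i + size] for i in range(0, len(data), size)]

def partial_sum(chunk):
    """The 'map' stage: each worker computes over its own partition."""
    return sum(chunk)

data = list(range(1, 101))
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(partial_sum, partition(data, 4)))

# The 'reduce' stage: combine per-partition results into one answer.
total = reduce(lambda a, b: a + b, partials)
print(total)   # → 5050
```

In PySpark the same computation would be a one-liner over an RDD or DataFrame; the value of the managed service is that the partitioning and parallel execution happen across a cluster instead of local threads.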
Maturity
FCS Product Offerings
Technologies


Additional Documentation
* EMR User Guide

17. Business Intelligence (Data Analytics)
Description
The Business Intelligence service includes all facets of standard reporting, dashboarding and data visualization capabilities including authoring, publication, lifecycle management, and access to reporting and visualization artifacts.
Key Capability | Description |
---|---|
Pixel-perfect Reporting | Structured reporting conformed to exact specifications to meet organizational or regulatory requirements. |
Standard Reporting | Structured tabular reports where the user can interact with the data to filter, drill up/down/across, and explore the underlying data. |
Visualization/Dashboards | Interactive reports including charts, visual representations, and graphs. |
Maturity
FCS Product Offerings
Technologies


Additional Documentation
* Data Visualization Play

18. AI/ML Lifecycle (Data Analytics)
Description
The AI/ML Lifecycle service enables data scientists to manage all facets of model creation and execution through standardized tools and methods aligned with best practices for model management and DevSecOps approaches.
Key Capability | Description |
---|---|
Data Acquisition and Refinement | Import/access data and standardize it for input to a machine learning model. |
Model Development | Create and refine models. |
Model Training | Refine models through additional input data and refactoring. |
Model Testing | Validate model outcomes and functionality. |
Model Versioning | Retain model versions, including input data, code, and output data, for development and compliance requirements. |
Model Promotion | Migrate approved models to execute in a production environment and/or integrate with production applications. |
Model Monitoring | Recurring validation of models to identify data and/or model drift. |
Model Refactoring | Update/retrain models to ensure they continue to produce appropriate outcomes. |
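The versioning and promotion capabilities can be sketched as a model registry: each trained version is retained with its metadata, and only a version meeting a quality bar is promoted to production. The model names, artifact paths, and accuracy threshold are hypothetical.

```python
# Sketch of model versioning and promotion: a registry retains every trained
# version with its metadata; promotion to production is gated on a quality
# bar. Model names, artifacts, and metrics are hypothetical.

REGISTRY = {}

def register_model(name, version, artifact, metrics):
    """Record a trained model version (starts in the 'staging' stage)."""
    REGISTRY.setdefault(name, {})[version] = {
        "artifact": artifact, "metrics": metrics, "stage": "staging",
    }

def promote(name, version, min_accuracy=0.9):
    """Promote a version to production only if it meets the quality bar."""
    entry = REGISTRY[name][version]
    if entry["metrics"]["accuracy"] < min_accuracy:
        raise ValueError("model does not meet promotion criteria")
    entry["stage"] = "production"
    return entry

register_model("lease-classifier", "1.0", "model-v1.bin", {"accuracy": 0.87})
register_model("lease-classifier", "1.1", "model-v2.bin", {"accuracy": 0.93})
promote("lease-classifier", "1.1")
print(REGISTRY["lease-classifier"]["1.1"]["stage"])   # → production
```

Retaining superseded versions alongside their metrics is what makes the compliance and rollback requirements in the table achievable; model monitoring would periodically re-evaluate the production version's metrics against fresh data to detect drift.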