Technical Approach
This section explores the overall technical approach for GSAFleet.gov, including some of the innovative practices and tools leveraged during the move into the FCS cloud ecosystem, such as Redshift, StreamSets, GraphQL, and Kinesis.
High-Level Approach
Business Benefit: Reduced technical debt enables faster application O&M and reduces long-term resource requirements.
The GSAFleet.gov application uses a microservices architecture that follows the Domain-Driven Design pattern. Each microservice aligns closely with a business domain and provides the functionality related to that domain. Under this design, a single functional transaction may involve multiple microservices. Each microservice that needs persistent data maintains its own data storage, independent of the storage maintained by the other microservices, and it creates, reads, updates, and deletes only the data pertaining to it. For example, a customer microservice works only with customer metadata (address, etc.).

This diagram depicts the overall architecture for GSAFleet.gov: multiple services working together implement the functionality of the system, and each service relies on its own database.
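As a purely illustrative sketch of this data-ownership rule (none of the class, table, or field names below come from the Fleet codebase), a customer-facing service might encapsulate its own datastore like this:

```python
# Illustrative sketch only: a service that owns its customer data end to end.
# All names here are hypothetical and do not reflect the GSAFleet.gov codebase.
import sqlite3  # stand-in for the service's dedicated database
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    customer_id: str
    name: str
    address: str


class CustomerService:
    """Creates, reads, updates, and deletes customer metadata only."""

    def __init__(self, db_path: str = "customer_service.db"):
        # Each microservice connects only to its own database.
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(customer_id TEXT PRIMARY KEY, name TEXT, address TEXT)"
        )

    def upsert(self, customer: Customer) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO customer VALUES (?, ?, ?)",
            (customer.customer_id, customer.name, customer.address),
        )
        self.conn.commit()

    def get(self, customer_id: str) -> Optional[Customer]:
        row = self.conn.execute(
            "SELECT customer_id, name, address FROM customer WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        return Customer(*row) if row else None
```

Other services (vehicles, orders, billing) would hold their own separate stores and never reach into this one; a transaction that spans domains calls each service's API instead.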
Innovations
The modernization of GSAFleet.gov has included several innovative practices and tools that may be useful in future modernization efforts. Explore those practices and tools in the sections below.
Innovative Practices
Business Benefit: Enables both strategic and tactical day-to-day decision making by streamlining and simplifying the analysis and reporting of transactional data.
Fleet vehicles serve a wide array of users and functions, from mail delivery to law enforcement. Hence, each class of users and vehicles has unique needs for GSAFleet.gov and its data. The data flow in the current Fleet architecture is shown schematically in the diagram below.
Fleet Data Analytics synchronizes transactional data into the cloud-based analytical Redshift database, which is used as a data source for the reporting dashboards developed in MicroStrategy.
The semantic layer structures the data in a form suitable for analytical workloads. In addition, the semantic layer defines a common business vocabulary. This is particularly useful because different business units might use the same term for different entities; the universal semantic layer ensures everyone is on the same page and avoids confusion when analyzing data.
A sample MicroStrategy report is shown below.
Telematics enables Fleet to drive the key business outcomes listed below. For more information on how Fleet is implementing telematics, click here.
Support current fleet and transition to the electric future
- Compare data on the fleet's daily requirements over time to match vehicles with suitable Electric Vehicle (EV) replacements.
- Report on EV battery levels and trends to optimize charging strategies and cut costs and carbon.
Keep everyone safe on the road
- Provide advanced safety features to report on vehicle usage and safety habits, including seat belt use, speeding, harsh acceleration, braking, turning, etc.
- Review driver safety scorecards or receive instant notification when a driver breaks safety rules.
- Reconstruct collision events to investigate safety issues in real-time.
Cut asset procurement and maintenance expenses
- Achieve efficient allocation and utilization of vehicles.
- Reduce procurement costs and ongoing maintenance by using assets more productively.
- Expand solutions further and predict maintenance requirements before issues arise.
About Geotab
Geotab Offers:
- LTE connected devices
- Data gathering (w/ SLA)
- Capabilities
  - Diagnostics
  - Collision Detection
  - Alerts / Coaching
  - Location
- Extended Options
  - Features
    - Python API (see the sketch after this list)
    - Wifi Hotspot
  - Optionally
    - Imaging
    - Analytics
    - Rugged Housing
- Management Software
- IOX expansion
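As a purely illustrative example of the Python API mentioned above, the sketch below uses Geotab's open-source mygeotab client; the credentials, database name, and result handling are placeholders and are not part of the Fleet configuration.

```python
# Hypothetical sketch using the open-source mygeotab client (pip install mygeotab).
# Credentials and database are placeholders for illustration only.
import mygeotab

api = mygeotab.API(
    username="user@example.gov",   # placeholder
    password="********",           # placeholder
    database="fleet_demo",         # placeholder MyGeotab database
)
api.authenticate()

# Pull a handful of registered telematics devices from the platform.
devices = api.get("Device", resultsLimit=5)
for device in devices:
    print(device.get("name"), device.get("serialNumber"))
```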
Conceptual Architecture
The following diagram shows the conceptual architecture for Fleet's IoT implementation.

Data from each vehicle are accumulated by the telematics data provider, and client software downloads the data into a PostgreSQL database. Next, a process selects recent data for each parameter from that PostgreSQL database and inserts them into a parameter-specific table in Redshift. A separate process moves old data from Redshift into an S3 bucket, and AWS Glue catalogs the data in that bucket. A subset of the telematics data is copied into the MySQL database so GSAFleet.gov can access it. A Lambda function sends an email notification if recent data are not available.
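A hedged sketch of such a freshness check is shown below. It assumes the check runs against a Redshift table and that the email is delivered through an Amazon SNS topic; the table name, threshold, and environment variables are placeholders rather than details of the Fleet implementation.

```python
# Illustrative Lambda handler: alert if no recent telematics rows have arrived.
# Table name, topic, threshold, and connection details are placeholders.
import datetime
import os

import boto3
import redshift_connector  # Amazon's Python driver for Redshift

MAX_AGE = datetime.timedelta(hours=6)  # assumed freshness threshold


def handler(event, context):
    conn = redshift_connector.connect(
        host=os.environ["REDSHIFT_HOST"],
        database=os.environ["REDSHIFT_DB"],
        user=os.environ["REDSHIFT_USER"],
        password=os.environ["REDSHIFT_PASSWORD"],
    )
    cursor = conn.cursor()
    cursor.execute("SELECT MAX(recorded_at) FROM telematics.odometer_readings")
    latest = cursor.fetchone()[0]

    if latest is None or datetime.datetime.utcnow() - latest > MAX_AGE:
        # Publishing to an SNS topic with email subscribers delivers the notification.
        boto3.client("sns").publish(
            TopicArn=os.environ["ALERT_TOPIC_ARN"],
            Subject="Telematics data freshness alert",
            Message=f"No new telematics data since {latest}.",
        )
```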
Business Benefit: Stronger protection of business and user data during application testing, while also meeting legal requirements for protecting that data.
Only the production environment is allowed to host real production data. How, then, does one perform testing if the test environment has no real data? Many aspects of the data workflow and of the application's business functionality depend on data of a particular form, so mock data often cannot meet the testing needs. To address this problem, production data can be copied into the test environment in a disguised form. Roughly speaking, if a variable's value in production is ABC, it can be stored in the testing environment as XYZ. The software still handles the variable the same way the production system does, but the stored and displayed value is not real. The challenge is to mask the data consistently between tables and across databases, so that the same contract number is still the same after masking.
This is how the data masking is performed. The Fleet business line has determined the fields that need to be masked, and that list of fields serves as the input to the data masking pipeline. The masking process uses StreamSets as the ETL tool, together with open-source encryption libraries that are FIPS 140-2 compliant, to dynamically mask these fields across tables and schemas and write them to a staged instance in a temporary database. Once the masking is done, snapshots of the prepared masked databases are copied into the test environment. The masked data in the testing environment can then be copied into the legacy test databases using a special loading process, and the same data are consolidated into the masked consolidated database. From there, the consolidated masked data are propagated into the GSAFleet.gov test system via the StreamSets data pipelines. In this way, all the databases in the test environment contain the same masked data.
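The essential property is deterministic masking: the same clear value always produces the same masked value. The sketch below illustrates that property with a keyed HMAC; it is not the StreamSets pipeline or the specific FIPS 140-2 validated library Fleet uses, and the field values and key are placeholders.

```python
# Minimal illustration of consistent masking; not Fleet's actual masking code.
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-key-from-a-secrets-manager"  # placeholder


def mask_value(value: str, length: int = 12) -> str:
    """Return a repeatable masked token for a sensitive value.

    Because the digest depends only on the key and the input, the same contract
    number masks to the same token in every table and schema, preserving joins.
    """
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length].upper()


# The same source value always yields the same masked value...
assert mask_value("CONTRACT-12345") == mask_value("CONTRACT-12345")
# ...while different values yield different tokens.
assert mask_value("CONTRACT-12345") != mask_value("CONTRACT-99999")
```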
Innovative Tools
Business Benefit: Facilitating decisions during the data discovery and pipeline development phases by providing the ability to assess data quality.
Data profiling allows a business data expert or data engineer to assess a particular piece of data in terms of quality and to see the range of values a particular variable takes. Profiling information helps data engineers decide how to design the part of the database schema where the variable will be stored. For this purpose, Sweetviz reports are hosted in Streamlit, a Python-based web tool.
A sample data profiling report is shown above. It shows the distribution of data values in text or graph form, as appropriate, along with how many records are missing a value and how unique the existing values of each field are. These reports can help in assessing the data from a business perspective or for database engineering purposes.
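A minimal sketch of that pattern is shown below; the input file and column contents are placeholders, and the real reports are generated from Fleet's own data sources.

```python
# Illustrative Streamlit page that embeds a Sweetviz profiling report.
# The CSV stands in for whatever query or extract is actually being profiled.
import pandas as pd
import streamlit as st
import streamlit.components.v1 as components
import sweetviz as sv

df = pd.read_csv("vehicles_sample.csv")  # placeholder input

st.title("Data profiling report")

# Sweetviz computes value distributions, missing-value counts, and uniqueness per column.
report = sv.analyze(df)
report.show_html("profile.html", open_browser=False)

# Embed the generated HTML report inside the Streamlit page.
with open("profile.html", "r", encoding="utf-8") as f:
    components.html(f.read(), height=1000, scrolling=True)
```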
Business Benefit: Saving time and cost on data pipeline development and maintenance by using a tool that allows building and maintaining data pipelines through a graphical user interface with drag-and-drop functionality and advanced monitoring capabilities.
While the modernization project is ongoing, data produced in the legacy Fleet systems have to be made available to the new GSAFleet.gov system. Likewise, data originating from GSAFleet.gov are needed in the legacy systems. Multiple pipelines must therefore be implemented between the two systems, and such pipelines need enterprise-level reliability and maintainability.
StreamSets is an off-the-shelf data migration tool that allows building simple data pipelines using a graphical user interface with drag-and-drop features. A pipeline can be customized to any level of complexity, up to and including separately designed code as a step in the pipeline. Moreover, one can create a template with carefully designed data validations, error handling, and other necessary features; a new pipeline can then be based on that template, reducing development time and improving pipeline quality and maintainability.
A prepared pipeline can be executed on a schedule, or it can run continuously using change-data-capture log processing from the source database. It can also be executed manually as required, for example to bulk load data from source to target.
There is a separate pipeline for each source table, migrating the data into a target table in the new system. The reverse data pipelines synchronize data from the new system into the legacy system and also into the consolidated database. The latter synchronization is necessary because a record initially created in the new system may be written back into legacy; legacy can then update that record, and the update needs to propagate into the consolidated database to be read from the new system. The consolidated database must therefore hold the original record before it can be updated from legacy, and this pipeline system accomplishes that.
The software provides a report for each pipeline, displaying the number of records migrated, the number of exceptions (if any), and other relevant information.
Business Benefit: The legacy Fleet systems relied on secure yet dated asynchronous communication tools between systems and government organizations. Given the need both to transfer more data and to transfer that data in near-instantaneous timeframes, a more sophisticated streaming technology was required. Streaming also accelerates data processing, so business insights arrive sooner and fraud and other threats can be addressed more quickly.
Amazon Kinesis is a fully managed service that enables users to collect, process, and analyze real-time streaming data at scale. It provides a simple and cost-effective way to handle large amounts of streaming data in real-time and can be used for a variety of use cases, including data processing, machine learning, real-time analytics, and more.
Amazon Kinesis Streams is one of the core components of the Amazon Kinesis service. It is a scalable and durable data streaming service that allows users to collect and process large amounts of data from multiple sources in real-time. With Kinesis Streams, users can ingest data in real-time, process it, and store it in a distributed manner. The data is partitioned across multiple shards, which allows for parallel processing of data streams.
Kinesis Streams supports multiple data producers and consumers, allowing for a wide variety of use cases. For example, it can be used to capture data from social media feeds, IoT devices, mobile apps, and web applications. Once the data is ingested, it can be processed using various AWS services, such as AWS Lambda, AWS EMR, or Amazon Kinesis Data Analytics. Additionally, Kinesis Streams integrates with other AWS services, such as Amazon S3 and Amazon Redshift, allowing for seamless data transfer and storage.
Kinesis Streams provides several key features that make it an ideal solution for handling streaming data at scale. These include:
- Scalability: Kinesis Streams can handle data streams of any size, from small to extremely large, without any manual intervention. Users can easily scale up or down their data streams based on their needs.
- Durability: Kinesis Streams is designed to be highly available and durable, ensuring that data is never lost. Data is automatically replicated across multiple Availability Zones (AZs), providing fault tolerance and disaster recovery.
- Real-time processing: Kinesis Streams allows for real-time processing of data streams, providing insights and analytics in near real-time. This makes it ideal for use cases that require real-time decision-making, such as fraud detection or predictive maintenance.
- Cost-effective: Kinesis Streams is a cost-effective solution for handling streaming data at scale. Users only pay for the resources they use, and there are no upfront costs or minimum fees.
Kinesis Streams also provides several tools and APIs that make it easy to use and integrate with other AWS services. These include the Kinesis Producer Library (KPL), which enables users to easily produce data to Kinesis Streams, and the Kinesis Client Library (KCL), which provides a simple way to consume and process data from Kinesis Streams.
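For a simpler starting point than the KPL/KCL, the AWS SDK can produce and consume records directly. The sketch below uses boto3 with a placeholder stream name and payload; it is illustrative and not part of the Fleet implementation.

```python
# Minimal boto3 producer/consumer sketch for Kinesis Data Streams.
# Stream name, region, and payload are placeholders.
import json

import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")
STREAM = "example-telematics-stream"

# Produce one record; records with the same partition key land on the same shard.
kinesis.put_record(
    StreamName=STREAM,
    Data=json.dumps({"vehicle_id": "G-1234", "odometer": 45210}).encode("utf-8"),
    PartitionKey="G-1234",
)

# Consume from the first shard, starting at the oldest record it retains.
shard_id = kinesis.describe_stream(StreamName=STREAM)["StreamDescription"]["Shards"][0]["ShardId"]
iterator = kinesis.get_shard_iterator(
    StreamName=STREAM, ShardId=shard_id, ShardIteratorType="TRIM_HORIZON"
)["ShardIterator"]
for record in kinesis.get_records(ShardIterator=iterator, Limit=10)["Records"]:
    print(json.loads(record["Data"]))
```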
In addition to Kinesis Streams, Amazon Kinesis also provides two other components: Kinesis Firehose and Kinesis Analytics.
- Kinesis Firehose: This is a fully managed service that allows users to easily deliver streaming data to AWS services, such as S3, Redshift, or Elasticsearch. It provides a simple way to load and transform streaming data in real-time, without requiring any manual intervention.
- Kinesis Analytics: This is a fully managed service that allows users to easily perform real-time analytics on streaming data. It provides a simple and powerful way to run SQL queries on streaming data, and can be used for a variety of use cases, such as real-time dashboards, anomaly detection, and more.
In summary, Amazon Kinesis Streams is a scalable, durable, and cost-effective solution for handling large amounts of streaming data in real-time. It provides a variety of key features, such as scalability, durability, and real-time processing, and integrates seamlessly with other AWS services. With Kinesis Streams, users can easily collect, process, and analyze real-time streaming data, enabling them to make timely and informed decisions.
Business Benefit: Faster software development process that enables faster delivery of new features to customers
GraphQL is a powerful technology that offers many benefits to GSA's business, foremost by improving API development and management. By reducing API complexity, improving performance, and enabling better collaboration between front-end and back-end developers, GraphQL is quickly becoming a go-to solution in systems development today.
The following are several key features of GraphQL that make it such a valuable technology for businesses:
Increased Flexibility - GraphQL allows developers to request only the data they need, reducing network usage and improving load times (a sketch of such a query follows this list). This flexibility also enables developers to make changes more easily and efficiently.
Better Developer Experience - With GraphQL, developers have access to a self-documenting API that is easy to understand and work with, reducing development time and increasing the quality of the final product.
Improved Performance - By minimizing the amount of data transferred over the network, GraphQL can improve API performance and provide a faster and more responsive user experience.
Reduced API Complexity - With GraphQL, developers can access data using a single endpoint, making it easier to manage the API layer and reducing overall complexity.
Better Collaboration - GraphQL provides a shared language for defining data requirements, enabling front-end and back-end developers to work together more closely and efficiently.
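To make the flexibility and single-endpoint points concrete, the hypothetical sketch below sends one query to a single endpoint and names exactly the fields it wants; the URL, types, and field names are illustrative and are not taken from the Fleet schema.

```python
# Illustrative GraphQL request; endpoint and schema are placeholders.
import requests

GRAPHQL_URL = "https://api.example.gov/graphql"  # placeholder single endpoint

# The client asks only for the fields it needs; nothing else crosses the network.
query = """
query VehicleSummary($tag: String!) {
  vehicle(tag: $tag) {
    tag
    make
    model
    fuelType
  }
}
"""

response = requests.post(
    GRAPHQL_URL,
    json={"query": query, "variables": {"tag": "G-1234"}},
    timeout=10,
)
response.raise_for_status()
print(response.json()["data"]["vehicle"])
```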
Fleet, through all of its modernization efforts, has adopted GraphQL across its entire suite of microservices in its data access layer. With GraphQL, Fleet has been able to streamline its data layer API development and management processes, reducing complexity and improving overall performance. The technology has enabled better collaboration between its development teams, leading to increased productivity and better outcomes for the business. Overall, GraphQL has proven to be a valuable tool for Fleet as it prepares a legacy of stable APIs for the years to come.
Business Benefit: All Fleet data are available for reporting and analysis from a single source of truth, regardless of the storage solution at the data source. This eliminates the need to fetch and stitch together data from different sources. The solution also saves cost by taking advantage of the pay-per-use model.
Transactional data in GSAFleet.gov are stored in relational databases across multiple schemas that belong to individual services. At the same time, Fleet uses telematics data that are collected in an AWS S3 bucket. All these data may need to be included in reports and should be available for analytics workloads.
Redshift Serverless provides a way to access all the data for analytical processing without having to manage data warehouse infrastructure.

As shown above, the use of federated schemas allows querying data across operational RDS databases and the data lake stored in S3 buckets and cataloged by AWS Glue.
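A hedged sketch of that setup is shown below, assuming the standard Redshift syntax for federated queries and for external schemas backed by the Glue Data Catalog; every endpoint, ARN, database, and table name is a placeholder rather than the Fleet environment.

```python
# Illustrative setup of federated schemas in Redshift Serverless via SQL.
# All identifiers, endpoints, and ARNs below are placeholders.
import redshift_connector

conn = redshift_connector.connect(
    host="example-wg.123456789012.us-east-1.redshift-serverless.amazonaws.com",
    database="dev",
    user="admin",
    password="********",
)
cursor = conn.cursor()

# Expose an operational PostgreSQL (RDS/Aurora) schema as an external schema...
cursor.execute("""
    CREATE EXTERNAL SCHEMA IF NOT EXISTS vehicles_ops
    FROM POSTGRES DATABASE 'vehicles' SCHEMA 'public'
    URI 'vehicles-db.abc123.us-east-1.rds.amazonaws.com'
    IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-federated'
    SECRET_ARN 'arn:aws:secretsmanager:us-east-1:123456789012:secret:vehicles-creds'
""")

# ...and the Glue Data Catalog (telematics data in S3) as another.
cursor.execute("""
    CREATE EXTERNAL SCHEMA IF NOT EXISTS telematics_lake
    FROM DATA CATALOG DATABASE 'telematics'
    IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-spectrum'
""")

# A single query can now join live transactional rows with data-lake history.
cursor.execute("""
    SELECT v.tag, t.odometer, t.recorded_at
    FROM vehicles_ops.vehicle v
    JOIN telematics_lake.odometer_readings t ON t.vehicle_tag = v.tag
    LIMIT 10
""")
print(cursor.fetchall())
```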
Business Benefit: We are moving to Buildah because Kubernetes version 1.24 no longer supports Docker as the container runtime interface (CRI). We were using Docker to build container images and as the CRI to run containers. We are switching the CRI to containerd to run containers and to Buildah to build container images.
Buildah is a command-line tool for building Open Container Initiative (OCI) container images without requiring a full container runtime or daemon to be installed. It can also be used alongside container engines such as Docker or Podman, allowing users to build container images without a full container runtime environment.
Buildah was created to provide finer-grained control over images and over the creation of image layers. Its commands are more detailed than Podman's, which uses the same underlying code for building as Buildah.
One of the main advantages of Buildah is that it does not depend on a daemon, such as Docker or CRI-O, and does not require root privileges.
This allows for greater flexibility in building and managing container images. Additionally, Buildah provides a command-line tool that replicates all the commands found in a Dockerfile, allowing for the creation of container images from the command line or a shell script.
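As a minimal sketch of that workflow, assuming an existing Dockerfile and registry credentials already configured via `buildah login`, a small helper script might run the build and push steps like this; the image name and registry are placeholders.

```python
# Illustrative build-and-push helper; image name and registry are placeholders.
import subprocess

IMAGE = "registry.example.gov/fleet/sample-service:latest"  # placeholder tag

# `buildah bud` builds an OCI image from an ordinary Dockerfile, with no Docker daemon.
subprocess.run(["buildah", "bud", "-f", "Dockerfile", "-t", IMAGE, "."], check=True)

# Push the finished image to the registry referenced by its tag.
subprocess.run(["buildah", "push", IMAGE], check=True)
```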
Overall, Buildah offers a lightweight and flexible alternative to traditional container image building tools and provides greater control and flexibility in building and managing container images.