Dromt
Dromt is a platform for the remote, seamless control and management of a fleet of drones. It lets users operate multiple drones (from different vendors) concurrently to complete a predetermined task, streaming real-time feedback and video to the operator, who can control the drones live through a web interface.
The project has the following requirements:
- The ability to control multiple drones simultaneously
- The exchange of events and data between drones and operators who may not be in close proximity and may be behind NATs
- Cloud-based architecture for scalability and resiliency, suitable for use in various critical scenarios (e.g., rescue, firefighting, etc.)
- Fast and reliable performance to enable real-time control of the drones
Infrastructure Architecture
We designed the Dromt application and infrastructure to be cloud-based and scalable, using AWS and Kubernetes. The entire infrastructure lives inside an AWS VPC, with a Kubernetes cluster deployed via EKS that runs Dromt itself: a cloud-native, event-driven application built on Kafka. I was responsible for designing and developing both the application and the infrastructure.
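The post does not specify how Kafka is deployed or how its topics are laid out. As a minimal sketch, assuming Kafka runs in-cluster under the Strimzi operator (an assumption on my part; a managed service such as Amazon MSK would work similarly), a telemetry topic could be declared alongside the rest of the Kubernetes manifests:

```yaml
# Hypothetical topic declaration, assuming Kafka is managed in-cluster
# by the Strimzi operator (the post does not say how Kafka is deployed).
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: drone-telemetry            # hypothetical topic name
  namespace: kafka
  labels:
    strimzi.io/cluster: dromt-kafka   # hypothetical Kafka cluster name
spec:
  partitions: 12        # one consumer per partition allows parallel processing
  replicas: 3           # survive the loss of a broker or availability zone
  config:
    retention.ms: 86400000   # keep raw telemetry for 24 hours
```

Declaring topics as custom resources keeps partition counts and retention settings under version control, which matters as telemetry volume grows with the fleet.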
A Network Load Balancer exposes the application to the outside world. The application consists of a REST API and a web UI, both hosted on a Kubernetes cluster of EC2 Spot Instances. I used Karpenter to scale the cluster automatically based on the application's load, and the AWS Node Termination Handler to drain pods when AWS reclaims a Spot Instance.
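The exact Karpenter configuration isn't included here; with Karpenter's current v1 API, a Spot-only NodePool along these lines would capture the setup just described (names and limits are illustrative):

```yaml
# Hypothetical Karpenter NodePool restricted to Spot capacity.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: dromt-spot
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]          # Spot Instances only, as described above
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]   # a broad pool improves Spot availability
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  limits:
    cpu: "200"                       # hypothetical cap on total cluster vCPUs
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
```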
I also set up Prometheus and Grafana to monitor the cluster and the application and to send alerts to the organization's Slack channel; these monitoring services run within the same VPC and communicate with the application over AWS PrivateLink. In addition, the application uses AWS S3 to store media (such as pictures and videos), AWS RDS to store metadata about drones and operators, and Redis to store session data.
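The alerting configuration isn't shown here; with the standard Prometheus Alertmanager Slack integration, routing alerts to a Slack channel looks roughly like this (the channel name and webhook URL are placeholders):

```yaml
# Minimal Alertmanager sketch; channel and webhook are placeholders.
route:
  receiver: slack-notifications
  group_by: [alertname, namespace]
receivers:
  - name: slack-notifications
    slack_configs:
      - channel: '#dromt-alerts'                          # hypothetical channel
        api_url: 'https://hooks.slack.com/services/XXX'   # placeholder webhook
        send_resolved: true    # also notify when an alert clears
        title: '{{ .CommonAnnotations.summary }}'
```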
Furthermore, I configured an ECS cluster to run batch jobs, such as generating 3D models from collected media.
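How these batch jobs are launched isn't detailed. One plausible shape, assuming Fargate and hypothetical resource names, is a RunTask input file passed to the AWS CLI (`aws ecs run-task --cli-input-yaml file://run-task.yaml`):

```yaml
# Hypothetical input for `aws ecs run-task --cli-input-yaml`; the cluster,
# task definition, and network values are placeholders.
cluster: dromt-batch
launchType: FARGATE          # assumption; the post doesn't say EC2 vs Fargate
taskDefinition: photogrammetry:3
count: 1
networkConfiguration:
  awsvpcConfiguration:
    subnets: ["subnet-0abc1234"]         # private subnet inside the VPC
    securityGroups: ["sg-0abc1234"]
overrides:
  containerOverrides:
    - name: photogrammetry
      command: ["--input", "s3://dromt-media/missions/123/",
                "--output", "s3://dromt-media/models/123/"]
```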
Continuous Integration and Deployment (CI/CD)
To ensure the quality of the code and the reliability of deployments for critical applications, it is crucial to have a CI/CD pipeline in place. I configured two GitLab CI/CD pipelines for this purpose: one for the application and one for the infrastructure.
Infrastructure
The infrastructure pipeline consists of several stages (a condensed sketch of the pipeline file follows the list):
- When a developer pushes code to the GitLab repository, the GitLab CI/CD pipeline checks the validity and formatting of the IaC code and performs static security analysis using Snyk.
- It also checks for significant changes in the infrastructure price using Infracost and includes this information in the merge request.
- Once the request is approved and merged, the pipeline uses Terraform to deploy the infrastructure.
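Condensed into a `.gitlab-ci.yml`, the pipeline above might look like the following; job names, image tags, and the exact Snyk and Infracost invocations are illustrative rather than taken from the actual repository:

```yaml
# Sketch of the infrastructure pipeline described above.
stages: [validate, cost, deploy]

validate:
  stage: validate
  image: hashicorp/terraform:1.9
  script:
    - terraform fmt -check -recursive
    - terraform init -backend=false
    - terraform validate
    - snyk iac test .      # assumes the Snyk CLI is installed in the job image

cost:
  stage: cost
  image: infracost/infracost:latest
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  script:
    - infracost breakdown --path . --format json --out-file infracost.json
    # Posts the cost diff as a comment on the merge request:
    - infracost comment gitlab --path infracost.json
        --repo $CI_PROJECT_PATH
        --merge-request $CI_MERGE_REQUEST_IID
        --gitlab-token $GITLAB_TOKEN

deploy:
  stage: deploy
  image: hashicorp/terraform:1.9
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH   # only after merge
  script:
    - terraform init
    - terraform apply -auto-approve
```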
Application
The application pipeline consists of several stages (a sketch of the deployment manifest follows the list):
- When a developer pushes code to the GitLab repository, the GitLab CI/CD pipeline checks the code's validity and formatting and performs static security analysis using Snyk.
- It builds the application using Docker and pushes the image to the Harbor repository. If the commit is tagged, it also pushes the image to the AWS ECR repository.
- During the designated maintenance window, ArgoCD deploys the application using the Helm chart. It deploys the application to the staging Kubernetes cluster for untagged commits and to the production cluster for tagged commits.
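The ArgoCD manifests aren't included here. A staging `Application` tracking untagged commits might look like the sketch below (the repository URL, chart path, and names are hypothetical); the production counterpart would pin `targetRevision` to a release tag, and the maintenance window maps naturally onto ArgoCD's sync windows, configured on the `AppProject`:

```yaml
# Hypothetical ArgoCD Application for the staging cluster; the production
# Application would be identical but pin targetRevision to a release tag.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: dromt-staging
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitlab.example.com/dromt/dromt-chart.git  # placeholder
    path: chart
    targetRevision: main        # untagged commits on the default branch
    helm:
      valueFiles: [values-staging.yaml]
  destination:
    server: https://kubernetes.default.svc
    namespace: dromt
  syncPolicy:
    automated:
      prune: true       # remove resources deleted from the chart
      selfHeal: true    # revert manual drift in the cluster
```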
This workflow lets developers test the application in the staging cluster before deploying it to the production cluster, reducing the risk of human error during deployment.