Simplifying Ephemeral Environments: A Deep Dive into Dynamic Testing
In today’s software development landscape, it’s essential to test new service versions without causing disruptions. In this article, I’ll share our experience with implementing ephemeral environments, which have proven to be a game-changer for dynamic testing and deployment.
What are Ephemeral Environments?
For our proof-of-concept, we focused on creating “ephemeral environments” for individual services rather than the entire environment. The idea was to run multiple versions of a service simultaneously and split the traffic between them. This approach, reminiscent of a long-lived canary deployment, allowed us to test new features or fixes in a production-like setting without impacting the main service. By keeping the environment stable and only changing the service, we could ensure that our tests were as realistic as possible while minimizing risk.
Setting Boundaries
To implement ephemeral environments effectively, we established clear boundaries:
- No modifications to databases
- No changes to scripts acting on the main service
- No alterations to configurations related to the main service
Everything else had to behave as normal, allowing us to test the new service version as if it were fully deployed. We limited this functionality to our lower, non-production environments for safety.
Helm Chart Modifications
To support our ephemeral environment setup, we needed to update our Helm chart. The goal was to allow an isEphemeral=true flag that would prevent the creation of non-service-specific resources when deploying an ephemeral version of a service. Here’s how we modified our Helm chart:
We added a new value to our values.yaml file:
isEphemeral: false
In our template files, we used conditional statements to control resource creation:
{{- if not .Values.isEphemeral }}
# Non-service-specific resources
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "myapp.fullname" . }}-config
# ... rest of the ConfigMap definition
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ include "myapp.fullname" . }}-data
# ... rest of the PVC definition
{{- end }}
# Service-specific resources (always created)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "myapp.fullname" . }}
# ... rest of the Deployment definition
We updated our NOTES.txt to inform users about the ephemeral deployment:
{{- if .Values.isEphemeral }}
This is an ephemeral deployment of {{ .Chart.Name }}.
Non-service-specific resources have not been created.
{{- else }}
This is a standard deployment of {{ .Chart.Name }}.
{{- end }}
With these changes, we could deploy an ephemeral version of our service using:
helm install my-service ./myapp --set isEphemeral=true
This would create a deployment without the non-service-specific resources, allowing us to test the service in isolation.
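Helm renders its templates with Go’s text/template package (plus extra helper functions), so the isEphemeral conditional can be tried out on its own. The following is a minimal sketch rather than part of our chart: the render helper and the hard-coded chart name myapp are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// notesTemplate mirrors the NOTES.txt conditional from the chart.
const notesTemplate = `{{- if .Values.isEphemeral }}
This is an ephemeral deployment of {{ .Chart.Name }}.
Non-service-specific resources have not been created.
{{- else }}
This is a standard deployment of {{ .Chart.Name }}.
{{- end }}`

// render is a hypothetical helper that evaluates the template with the
// given isEphemeral value, standing in for what Helm does at install time.
func render(isEphemeral bool) (string, error) {
	tmpl, err := template.New("notes").Parse(notesTemplate)
	if err != nil {
		return "", err
	}
	data := map[string]interface{}{
		"Values": map[string]interface{}{"isEphemeral": isEphemeral},
		"Chart":  map[string]interface{}{"Name": "myapp"}, // assumed chart name
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return strings.TrimSpace(sb.String()), nil
}

func main() {
	for _, flag := range []bool{true, false} {
		out, err := render(flag)
		if err != nil {
			panic(err)
		}
		fmt.Printf("isEphemeral=%v:\n%s\n\n", flag, out)
	}
}
```

The same if/else shape guards the ConfigMap and PersistentVolumeClaim in the template files, only with the condition negated.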
The Technical Implementation
To implement our ephemeral environment solution, we developed a multi-faceted approach that leverages custom header injection, Kubernetes routing, and careful service configuration. This section will delve into the technical details of our implementation, showcasing how we achieved dynamic routing and seamless integration of ephemeral services within our existing infrastructure.
Our implementation consists of three main components:
- Custom Header Injection: A Go-based proxy that injects a unique identifier into incoming requests.
- Kubernetes Routing: Using Traefik IngressRoute resources (a Kubernetes custom resource) to direct traffic based on the injected headers.
- Service Configuration: Modifications to our services to ensure proper handling and propagation of the custom headers.
Let’s explore each of these components in detail.
Custom Header Injection
The core of our solution is custom header injection. Here’s how we achieved it:
- Developed a proxy in Go to inject a custom header into requests
- Linked the header to a Jira ticket ID (e.g., ABC-123)
- Ensured the API gateway forwarded this header
- Required all internal services to propagate the header in their calls
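import (
	"io"
	"net/http"
	"strings"
)

// getJiraToken fetches the response body served at the given host.
func getJiraToken(host string) (string, error) {
	resp, err := http.Get("http://" + host)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// extractJiraID returns the first label of the host name,
// for example "abc-123" from "abc-123.example.com".
func extractJiraID(host string) string {
	parts := strings.SplitN(host, ".", 2)
	return parts[0]
}

// injectHeader sets X-Ephemeral-Id on every request before passing it on.
func injectHeader(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		jiraID := extractJiraID(r.Host)
		// The token is fetched but not used in this simplified example;
		// it is assigned to _ because Go rejects unused variables.
		_, _ = getJiraToken(r.Host)
		r.Header.Set("X-Ephemeral-Id", jiraID)
		next.ServeHTTP(w, r)
	})
}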
This code snippet is a simplified example. In a production setting, you would include error handling, logging, and configuration management to ensure robustness and maintainability.
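The last requirement in the list above, propagating the header on internal calls, is the easiest one to miss. One way to approach it is to copy the header from the incoming request onto each outgoing one. This is a sketch under our own assumptions, not the code running in the services described here: the helper names propagateEphemeralID and downstreamID and both URLs are hypothetical, while http.NewRequestWithContext and the Header methods are standard net/http calls.

```go
package main

import (
	"fmt"
	"net/http"
)

const ephemeralHeader = "X-Ephemeral-Id"

// propagateEphemeralID builds an outgoing GET request to url and copies the
// ephemeral ID from the incoming request, if one is present.
func propagateEphemeralID(in *http.Request, url string) (*http.Request, error) {
	out, err := http.NewRequestWithContext(in.Context(), http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	if id := in.Header.Get(ephemeralHeader); id != "" {
		out.Header.Set(ephemeralHeader, id)
	}
	return out, nil
}

// downstreamID simulates an incoming request carrying incomingID (empty
// means no header) and reports the ID the outgoing request would carry.
func downstreamID(incomingID string) (string, error) {
	in, err := http.NewRequest(http.MethodGet, "http://my-service.svc.cluster.local/orders", nil)
	if err != nil {
		return "", err
	}
	if incomingID != "" {
		in.Header.Set(ephemeralHeader, incomingID)
	}
	out, err := propagateEphemeralID(in, "http://inventory.svc.cluster.local/stock")
	if err != nil {
		return "", err
	}
	return out.Header.Get(ephemeralHeader), nil
}

func main() {
	for _, id := range []string{"ABC-123", ""} {
		got, err := downstreamID(id)
		if err != nil {
			panic(err)
		}
		fmt.Printf("incoming %q -> outgoing %q\n", id, got)
	}
}
```

Requests without the header are passed through untouched, so normal traffic never picks up an ephemeral ID by accident.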
Kubernetes Routing
With the custom header in place, we leveraged Kubernetes routing resources to:
- Intercept requests with the matching header
- Forward these requests to the ephemeral service
- Maintain normal routing for all other requests
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: ephemeral-ingressroute
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`my-service.svc.cluster.local`) && HeadersRegexp(`X-Ephemeral-Id`, `\w+-\d+`)
      kind: Rule
      services:
        - name: ephemeral-service
          port: 80
This approach allowed us to deploy an ephemeral service deep in the request chain and target it without affecting the main service. In practice, you would also implement security measures such as TLS and authentication to protect your services.
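To make the routing rule concrete, the sketch below reproduces its decision in Go: a request whose X-Ephemeral-Id value matches the `\w+-\d+` pattern goes to the ephemeral service, and everything else goes to the main one. It only illustrates the matching logic and is not how the router implements it; the main-service backend name is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// ephemeralIDPattern is the same expression used in the IngressRoute match.
// MatchString is unanchored, so in this sketch the pattern may match
// anywhere in the header value.
var ephemeralIDPattern = regexp.MustCompile(`\w+-\d+`)

// selectBackend returns the service that should receive a request carrying
// the given X-Ephemeral-Id value ("" means the header is absent).
func selectBackend(ephemeralID string) string {
	if ephemeralIDPattern.MatchString(ephemeralID) {
		return "ephemeral-service"
	}
	return "main-service" // assumed name for the regular backend
}

func main() {
	for _, id := range []string{"ABC-123", "", "not a ticket"} {
		fmt.Printf("%q -> %s\n", id, selectBackend(id))
	}
}
```

Note that the pattern accepts any ticket-shaped ID. If several ephemeral versions run side by side, each route would presumably need to match its own ID.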
Automation and Integration
To streamline the deployment of ephemeral environments, we integrated these processes into our CI/CD pipelines. Automated scripts handle the creation and teardown of these environments, ensuring minimal manual intervention and rapid iteration. This integration allows for seamless updates and testing cycles, reducing the time from development to deployment.
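As a rough illustration of what such a script might assemble, the sketch below builds the Helm command lines for creating and removing one ephemeral release. The release-naming scheme (service name plus lower-cased ticket ID) and the function names are assumptions, not our pipeline’s actual code; `helm install --set` and `helm uninstall` are the standard Helm subcommands.

```go
package main

import (
	"fmt"
	"strings"
)

// releaseName derives a hypothetical per-ticket release name such as
// "my-service-abc-123". The ticket ID is lower-cased because Helm release
// names are restricted to lowercase characters.
func releaseName(service, ticketID string) string {
	return service + "-" + strings.ToLower(ticketID)
}

// installArgs returns the helm arguments for deploying the ephemeral release.
func installArgs(service, chartPath, ticketID string) []string {
	return []string{"install", releaseName(service, ticketID), chartPath, "--set", "isEphemeral=true"}
}

// uninstallArgs returns the helm arguments for tearing the release down.
func uninstallArgs(service, ticketID string) []string {
	return []string{"uninstall", releaseName(service, ticketID)}
}

func main() {
	fmt.Println("helm", strings.Join(installArgs("my-service", "./myapp", "ABC-123"), " "))
	fmt.Println("helm", strings.Join(uninstallArgs("my-service", "ABC-123"), " "))
}
```

Deriving the release name from the ticket ID means the teardown step needs nothing beyond the ID that already travels in the X-Ephemeral-Id header.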
Benefits and Use Cases
Ephemeral environments offer several advantages:
- Risk Mitigation: Test new features without impacting the main service
- Parallel Testing: Run multiple versions simultaneously for comparison
- Realistic Testing: Test in a production-like environment
- Rapid Iteration: Quickly deploy and test changes
Common use cases include:
- Feature flag testing
- Performance comparisons
- Gradual rollouts
- A/B testing
Challenges and Considerations
While powerful, implementing ephemeral environments comes with challenges:
- Complexity: Requires careful orchestration and routing. We use tools like Helm and Terraform to manage configurations and deployments efficiently.
- Resource Management: Running multiple versions can be resource-intensive. We use resource quotas and monitoring tools like Prometheus and Grafana to manage this effectively. It’s crucial to balance resource allocation to avoid over-provisioning.
- Data Consistency: Ensuring data integrity across versions is crucial. We employ data synchronization strategies and use mock data where necessary. Consider using database snapshots or versioned data sets to maintain consistency.
- Monitoring: Requires robust monitoring to track performance across versions. We have integrated logging and monitoring solutions to provide real-time insights. Tools like Prometheus and Grafana are essential for visualizing performance metrics.
- Security and Compliance: Ensuring secure environments and compliance with data protection regulations is crucial. We conduct regular security audits and ensure all environments adhere to compliance standards. Implementing role-based access control (RBAC) and network policies can enhance security.
Conclusion
Our ephemeral environment implementation is essentially a dynamic routing overlay that intercepts and routes specific service requests based on the injected header. This allows us to test new service versions without impacting the live system, providing a powerful tool for continuous integration and deployment.
As software architectures become more complex, solutions like ephemeral environments will play an increasingly crucial role in maintaining agility and reliability in software development and deployment processes. By addressing automation, resource management, and security, we ensure these environments are both effective and sustainable.
In conclusion, while the journey to implementing ephemeral environments can be challenging, the benefits they offer in terms of flexibility, risk mitigation, and rapid iteration make them an invaluable asset in modern software development. By continuously refining our processes and tools, we aim to maximize the potential of these environments and drive innovation forward.
Further Reading
To deepen your understanding of the concepts and technologies discussed in this article, consider exploring these additional resources:
- Kubernetes Ingress: Official Kubernetes documentation on Ingress concepts and implementation.
- Go net/http Package: Documentation for Go’s HTTP client and server implementations.
- Ephemeral Environments: Martin Fowler’s in-depth article on the concept and benefits of ephemeral environments.
- Canary Deployments on Kubernetes: Google Cloud’s guide on implementing canary deployments in Kubernetes.
- CI/CD with Kubernetes: GitLab’s comprehensive guide on integrating CI/CD pipelines with Kubernetes.
- Prometheus and Grafana: Documentation for these popular monitoring and visualization tools.
- Kubernetes Security Best Practices: Official guide on securing Kubernetes deployments.
- Kubernetes Resource Management: Detailed information on managing compute resources in Kubernetes.
- Feature Toggles: Martin Fowler’s article on implementing feature flags, a common use case for ephemeral environments.
These resources provide additional context, best practices, and in-depth explanations that complement the specific implementation described in this article.