At Vodafone, I worked on an internal platform that automated the provisioning of EKS clusters with standardized networking, security, and compliance configurations. The platform's backend was a suite of AWS Lambda microservices — and in late 2023, I led the effort to rewrite them from Python to Go. The cost savings were dramatic.
The Architecture
Before diving into the migration, it helps to understand what we were working with:
- Frontend: A Node.js application running on AWS Elastic Beanstalk, exposed on a custom DNS endpoint for Vodafone and secured by AWS WAF v2. This layer was not part of the Go migration.
- Backend: Python-based AWS Lambda microservices exposed via API Gateway, also integrated with WAF v2 for request filtering and rate limiting. This was the target of the refactor.
- Database: DynamoDB for all persistent state — pipeline configurations, cluster metadata, provisioning status, and audit logs.
- Infrastructure as Code: The entire stack was defined and managed with Terraform.
What the Platform Did
The core business logic was to create and manage AWS CodePipelines that would, in turn, provision EKS clusters. Teams across Vodafone would submit requests through the frontend, and the backend Lambda functions would orchestrate:
- Validating cluster specifications against organizational standards
- Creating CodePipelines with the correct networking topology (VPC, subnets, security groups)
- Managing various cluster types (dev, staging, production) with appropriate sizing
- Tracking provisioning state and emitting status updates back to the frontend via DynamoDB
- Enforcing compliance guardrails — tagging policies, encryption requirements, IAM boundaries
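To make the guardrail step concrete, here is a minimal sketch of spec validation in Go. The field names, the vf- naming rule, and the mandatory tags are all illustrative assumptions, not the platform's actual rules:

```go
package main

import (
	"fmt"
	"strings"
)

// Spec is an illustrative cluster request; the real field names differed.
type Spec struct {
	Name      string
	Env       string // dev | staging | production
	Encrypted bool   // encryption-at-rest flag
	Tags      map[string]string
}

// validate applies a few guardrails of the kind the platform enforced:
// naming conventions, allowed environments, encryption, mandatory tags.
// It returns all violations rather than failing on the first one.
func validate(s Spec) []string {
	var violations []string
	if !strings.HasPrefix(s.Name, "vf-") { // hypothetical naming rule
		violations = append(violations, "name must start with vf-")
	}
	switch s.Env {
	case "dev", "staging", "production":
	default:
		violations = append(violations, "unknown environment: "+s.Env)
	}
	if !s.Encrypted {
		violations = append(violations, "encryption at rest is mandatory")
	}
	for _, tag := range []string{"owner", "cost-centre"} { // hypothetical tag policy
		if _, ok := s.Tags[tag]; !ok {
			violations = append(violations, "missing mandatory tag: "+tag)
		}
	}
	return violations
}

func main() {
	v := validate(Spec{
		Name: "vf-dev-eu1", Env: "dev", Encrypted: true,
		Tags: map[string]string{"owner": "platform", "cost-centre": "1234"},
	})
	fmt.Println(len(v), "violations")
}
```

Returning the full violation list (instead of the first error) let the frontend show teams everything wrong with a request in one round trip.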
The Problem
Our Python Lambda functions were becoming a significant cost and performance bottleneck. Here's what we were dealing with:
- Cold start times: Python Lambdas with their dependency bundles (boto3 overrides, custom SDK layers, validation libraries) were taking 3–5 seconds on cold start — painful for an API Gateway-fronted service where users expect sub-second responses.
- Memory allocation: Each Lambda needed 512MB–1GB of memory configured just to handle moderate payloads without timing out. Lambda pricing is directly proportional to memory × duration, so this was expensive.
- Deployment package sizes: Python packages with all dependencies bundled were 80–120MB per function, slowing down CI/CD and increasing cold start latency further.
- Execution duration: Python's single-threaded nature meant that orchestration-heavy functions (creating CodePipelines, validating specs, writing to DynamoDB) were sequentially chaining AWS SDK calls, driving up billed duration.
With dozens of Lambda functions handling hundreds of daily requests, the monthly AWS bill for this platform alone was becoming hard to justify.
Why Go?
We evaluated several alternatives — Rust, Java with GraalVM native images, and Go. Here's why Go won:
- First-class AWS Lambda support: The aws-lambda-go runtime library is maintained by AWS itself. It's lean, well-documented, and produces the fastest cold starts of any Lambda runtime outside of custom runtimes on provided.al2.
- Goroutines for concurrent AWS SDK calls: The platform's core logic involves orchestrating multiple AWS API calls (CodePipeline, EKS, IAM, DynamoDB). With goroutines and errgroup, we could fan out these calls concurrently — something Python's asyncio could theoretically do, but never cleanly with boto3.
- Static binary compilation: No runtime dependencies, no Lambda layers, no pip dependency hell. One compiled binary, deployed as a zip. Deployment packages went from 100MB+ to single-digit MBs.
- Predictable performance: No GIL bottleneck, minimal GC pauses, consistent execution duration under load — exactly what you want when Lambda bills you per millisecond.
In serverless, the best language isn't the one with the most features — it's the one with the smallest deployment package, the fastest cold start, and the lowest memory ceiling.
The Migration Strategy
We deliberately avoided a "big bang" rewrite. Instead, we followed a disciplined, function-by-function approach over 4 months:
Phase 1: Identify High-Cost Functions
We pulled Lambda cost and performance data from CloudWatch and AWS Cost Explorer. The top functions by monthly spend were the orchestration-heavy ones — those creating CodePipelines, validating cluster specs, and managing DynamoDB state. These consumed roughly 70% of the platform's total Lambda spend. They became our first migration candidates.
Phase 2: API-Compatible Rewrite
For each Lambda function, we wrote a Go implementation that accepted the
exact same API Gateway event payload and returned identical response
structures. This was critical — the Node.js frontend didn't change at all. We used
the AWS SDK for Go v2 (aws-sdk-go-v2) for all service interactions and
kept the DynamoDB schema untouched.
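The contract-preservation idea can be sketched as follows. To stay self-contained, this sketch uses minimal local mirrors of the API Gateway proxy event and response shapes (the real code used the events types from aws-lambda-go), and the clusterSpec payload fields are hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Request and Response mirror the subset of the API Gateway proxy
// event/response shapes this sketch needs; the real code used the
// github.com/aws/aws-lambda-go/events types.
type Request struct {
	Path string `json:"path"`
	Body string `json:"body"`
}
type Response struct {
	StatusCode int               `json:"statusCode"`
	Headers    map[string]string `json:"headers"`
	Body       string            `json:"body"`
}

// clusterSpec is an illustrative payload; the real spec fields differed.
type clusterSpec struct {
	Name string `json:"name"`
	Env  string `json:"env"`
}

// handler keeps the Python function's contract: same event in, same
// response shape out, so the Node.js frontend needs no changes at all.
func handler(req Request) (Response, error) {
	var spec clusterSpec
	if err := json.Unmarshal([]byte(req.Body), &spec); err != nil {
		return Response{StatusCode: 400, Body: `{"error":"invalid spec"}`}, nil
	}
	out, _ := json.Marshal(map[string]string{"status": "accepted", "cluster": spec.Name})
	return Response{
		StatusCode: 200,
		Headers:    map[string]string{"Content-Type": "application/json"},
		Body:       string(out),
	}, nil
}

func main() {
	resp, _ := handler(Request{Path: "/clusters", Body: `{"name":"dev-eu1","env":"dev"}`})
	fmt.Println(resp.StatusCode, resp.Body)
}
```

Pinning the event and response shapes first also gave us something mechanical to diff during the canary phase: identical inputs had to produce byte-identical bodies.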
Phase 3: Canary Deployments
We used Lambda aliases and weighted routing to deploy Go functions alongside their Python counterparts. Initially, 10% of API Gateway traffic was routed to the Go variant via alias weights. We validated response parity, latency profiles, and error rates in CloudWatch before increasing the weight.
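In Terraform, the weighted alias looks roughly like this. It's a sketch assuming the Go rewrite is published as version "7" of the same function; resource names and version numbers are illustrative:

```hcl
resource "aws_lambda_alias" "live" {
  name             = "live"
  function_name    = aws_lambda_function.create_pipeline.function_name
  function_version = "6" # stable Python version

  routing_config {
    additional_version_weights = {
      "7" = 0.10 # send 10% of invocations to the Go version
    }
  }
}
```

Raising the canary weight was then a one-line change in version control, reviewed and applied like any other infrastructure change.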
Phase 4: Gradual Cutover
Once we confirmed behavioral parity, we shifted traffic using weighted aliases — 10%, then 50%, then 100%. Only after 48 hours at full traffic with clean metrics did we decommission the Python functions and remove the old deployment packages from S3.
The Results
After migrating the core Lambda functions over 4 months, here's what we measured:
| Metric | Python | Go | Improvement |
|---|---|---|---|
| Lambda memory allocation | 512MB–1GB | 128MB | 75–87% reduction |
| Cold start time | 3–5s | 80–150ms | 97% faster |
| Deployment package | 80–120MB | 8–12MB | ~90% smaller |
| Avg execution duration | 800ms | 120ms | 85% faster |
| Monthly Lambda + API GW spend | Baseline | 0.35× baseline | 65% savings |
The 65% cost reduction came from a combination of factors:
- Lower memory allocation: Lambda pricing is memory × duration. Dropping from 512MB–1GB to 128MB per function was the single biggest cost lever.
- Shorter execution duration: Go's concurrent AWS SDK calls via goroutines slashed orchestration time. Functions that took 800ms in Python completed in 120ms in Go.
- Near-instant cold starts: With 80–150ms cold starts, we eliminated the need for provisioned concurrency — which alone was a meaningful line item on the bill.
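The memory × duration effect can be checked with back-of-the-envelope arithmetic. This sketch uses AWS's published x86 Lambda compute price of roughly $0.0000166667 per GB-second and the worst-case Python configuration from the table; it covers compute only, ignoring request charges and the free tier:

```go
package main

import "fmt"

func main() {
	// Approximate AWS Lambda x86 compute price: ~$0.0000166667 per GB-second.
	const perGBSecond = 0.0000166667

	// Per-invocation compute cost = memory (GB) × billed duration (s) × rate.
	pythonCost := 1.0 * 0.800 * perGBSecond // 1GB at 800ms average
	goCost := 0.125 * 0.120 * perGBSecond   // 128MB at 120ms average

	fmt.Printf("python: $%.10f per invocation\n", pythonCost)
	fmt.Printf("go:     $%.10f per invocation\n", goCost)
	fmt.Printf("ratio:  %.0fx cheaper on compute\n", pythonCost/goCost)
}
```

The compute-only ratio comes out far larger than the 65% total saving because API Gateway charges, request charges, and functions that were already small dilute the headline number.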
Key Takeaways
1. In serverless, language choice is a cost multiplier. Unlike traditional servers where you pay for uptime regardless, Lambda charges per invocation × memory × duration. A language that halves your memory requirement and execution time delivers 4× cost reduction on compute alone.
2. Go's AWS SDK v2 is excellent. The aws-sdk-go-v2 library is modular — you import only the service clients you need. Combined with Go's type system, it caught entire classes of bugs at compile time that Python would have swallowed until runtime.
3. DynamoDB + Go is a natural pairing. The attributevalue package for marshaling/unmarshaling Go structs to DynamoDB items is clean and type-safe. No more juggling boto3 resource vs client APIs.
4. Developer velocity didn't drop. This was our biggest concern. After a 2–3 week ramp-up period, the team was shipping Go Lambda functions at the same pace as Python. Go's simplicity and strong tooling (go fmt, go vet, built-in testing and benchmarking) actually improved our development workflow.
5. Terraform made the cutover seamless. Because the entire infrastructure was defined in Terraform, swapping a Lambda function's runtime from Python to Go was a config change — update the handler, the runtime, and the deployment artifact path. No manual console work, fully auditable in version control.
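The runtime swap in Terraform was roughly the following. This is a sketch with illustrative names; by late 2023, new Go functions targeted the provided.al2 custom runtime with a bootstrap binary:

```hcl
resource "aws_lambda_function" "create_pipeline" {
  function_name = "eks-provisioner-create-pipeline" # illustrative name

  # Before: runtime = "python3.11", handler = "app.handler", memory_size = 512
  runtime     = "provided.al2" # Go ships as a static binary on the custom runtime
  handler     = "bootstrap"
  memory_size = 128

  filename         = "${path.module}/build/create_pipeline.zip"
  source_code_hash = filebase64sha256("${path.module}/build/create_pipeline.zip")
  role             = aws_iam_role.lambda_exec.arn # existing execution role, unchanged
}
```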
Should You Migrate?
The honest answer: it depends.
If you're running orchestration-heavy, high-throughput Lambda functions behind API Gateway — Go is almost certainly worth the investment. The infrastructure savings alone will justify the engineering effort within a quarter.
If you're building data pipelines, ML inference services, or rapid prototypes — Python is still the right tool. Don't fight the ecosystem.
The key insight is this: language choice directly impacts your serverless costs at scale. In a pay-per-use model where you're billed per GB-second of memory and per millisecond of execution, even small per-function improvements compound into massive reductions when multiplied across thousands of daily invocations.
The 65% we saved isn't an anomaly. It's what happens when you match the right language to the right workload.
Have questions about migrating Lambda functions to Go, or want to discuss serverless cost optimization strategies? Get in touch or connect with me on LinkedIn.