At Vodafone, I worked on an internal platform that automated the provisioning of EKS clusters with standardized networking, security, and compliance configurations. The platform's backend was a suite of AWS Lambda microservices — and in late 2023, I led the effort to rewrite them from Python to Go. The cost savings were dramatic.
The Architecture
Before diving into the migration, it helps to understand what we were working with:
- Frontend: A Node.js application running on AWS Elastic Beanstalk, exposed on a custom DNS endpoint for Vodafone and secured by AWS WAF v2. This layer was not part of the Go migration.
- Backend: Python-based AWS Lambda microservices exposed via API Gateway, also integrated with WAF v2 for request filtering and rate limiting. This was the target of the refactor.
- Database: DynamoDB for all persistent state — pipeline configurations, cluster metadata, provisioning status, and audit logs.
- Infrastructure as Code: The entire stack was defined and managed with Terraform.
What the Platform Did
The core business logic was to create and manage AWS CodePipelines that would, in turn, provision EKS clusters. Teams across Vodafone would submit requests through the frontend, and the backend Lambda functions would orchestrate:
- Validating cluster specifications against organizational standards
- Creating CodePipelines with the correct networking topology (VPC, subnets, security groups)
- Managing various cluster types (dev, staging, production) with appropriate sizing
- Tracking provisioning state and emitting status updates back to the frontend via DynamoDB
- Enforcing compliance guardrails — tagging policies, encryption requirements, IAM boundaries
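To make the guardrail step concrete, here is a minimal sketch of spec validation in Go. The field names, the vf- naming rule, and the mandatory tags are all illustrative assumptions, not the platform's actual rules:

```go
package main

import (
	"fmt"
	"strings"
)

// Spec is an illustrative cluster request; the real field names differed.
type Spec struct {
	Name      string
	Env       string // dev | staging | production
	Encrypted bool   // encryption-at-rest flag
	Tags      map[string]string
}

// validate applies a few guardrails of the kind the platform enforced:
// naming conventions, allowed environments, encryption, mandatory tags.
// It returns all violations rather than failing on the first one.
func validate(s Spec) []string {
	var violations []string
	if !strings.HasPrefix(s.Name, "vf-") { // hypothetical naming rule
		violations = append(violations, "name must start with vf-")
	}
	switch s.Env {
	case "dev", "staging", "production":
	default:
		violations = append(violations, "unknown environment: "+s.Env)
	}
	if !s.Encrypted {
		violations = append(violations, "encryption at rest is mandatory")
	}
	for _, tag := range []string{"owner", "cost-centre"} { // hypothetical tag policy
		if _, ok := s.Tags[tag]; !ok {
			violations = append(violations, "missing mandatory tag: "+tag)
		}
	}
	return violations
}

func main() {
	v := validate(Spec{
		Name: "vf-dev-eu1", Env: "dev", Encrypted: true,
		Tags: map[string]string{"owner": "platform", "cost-centre": "1234"},
	})
	fmt.Println(len(v), "violations")
}
```

Returning the full violation list (instead of the first error) let the frontend show teams everything wrong with a request in one round trip.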
The Problem
Our Python Lambda functions were becoming a significant cost and performance bottleneck. Here's what we were dealing with:
- Cold start times: Python Lambdas with their dependency bundles (boto3 overrides, custom SDK layers, validation libraries) were taking 3–5 seconds on cold start — painful for an API Gateway-fronted service where users expect sub-second responses.
- Memory allocation: Each Lambda needed 512MB–1GB of memory configured just to handle moderate payloads without timing out. Lambda pricing is directly proportional to memory × duration, so this was expensive.
- Deployment package sizes: Python packages with all dependencies bundled were 80–120MB per function, slowing down CI/CD and increasing cold start latency further.
- Execution duration: Python's single-threaded nature meant that orchestration-heavy functions (creating CodePipelines, validating specs, writing to DynamoDB) were sequentially chaining AWS SDK calls, driving up billed duration.
With dozens of Lambda functions handling hundreds of daily requests, the monthly AWS bill for this platform alone was becoming hard to justify.
Why Go?
We evaluated several alternatives — Rust, Java with GraalVM native images, and Go. Here's why Go won:
- First-class AWS Lambda support: The aws-lambda-go runtime library is maintained by AWS itself. It's lean, well-documented, and produces the fastest cold starts of any Lambda runtime outside of custom runtimes on provided.al2.
- Goroutines for concurrent AWS SDK calls: The platform's core logic involves orchestrating multiple AWS API calls (CodePipeline, EKS, IAM, DynamoDB). With goroutines and errgroup, we could fan out these calls concurrently — something Python's asyncio could theoretically do, but never cleanly with boto3.
- Static binary compilation: No runtime dependencies, no Lambda layers, no pip dependency hell. One compiled binary, deployed as a zip. Deployment packages went from 100MB+ to single-digit MBs.
- Predictable performance: No GIL bottleneck, minimal GC pauses, consistent execution duration under load — exactly what you want when Lambda bills you per millisecond.
In serverless, the best language isn't the one with the most features — it's the one with the smallest deployment package, the fastest cold start, and the lowest memory ceiling.
The Migration Strategy
We deliberately avoided a "big bang" rewrite. Instead, we followed a disciplined, function-by-function approach over 4 months:
Phase 1: Identify High-Cost Functions
We pulled Lambda cost and performance data from CloudWatch and AWS Cost Explorer. The top functions by monthly spend were the orchestration-heavy ones — those creating CodePipelines, validating cluster specs, and managing DynamoDB state. These consumed roughly 70% of the platform's total Lambda spend. They became our first migration candidates.
Phase 2: API-Compatible Rewrite
For each Lambda function, we wrote a Go implementation that accepted the
exact same API Gateway event payload and returned identical response
structures. This was critical — the Node.js frontend didn't change at all. We used
the AWS SDK for Go v2 (aws-sdk-go-v2) for all service interactions and
kept the DynamoDB schema untouched.
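The contract-preservation idea can be sketched as follows. To stay self-contained, this sketch uses minimal local mirrors of the API Gateway proxy event and response shapes (the real code used the events types from aws-lambda-go), and the clusterSpec payload fields are hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Request and Response mirror the subset of the API Gateway proxy
// event/response shapes this sketch needs; the real code used the
// github.com/aws/aws-lambda-go/events types.
type Request struct {
	Path string `json:"path"`
	Body string `json:"body"`
}
type Response struct {
	StatusCode int               `json:"statusCode"`
	Headers    map[string]string `json:"headers"`
	Body       string            `json:"body"`
}

// clusterSpec is an illustrative payload; the real spec fields differed.
type clusterSpec struct {
	Name string `json:"name"`
	Env  string `json:"env"`
}

// handler keeps the Python function's contract: same event in, same
// response shape out, so the Node.js frontend needs no changes at all.
func handler(req Request) (Response, error) {
	var spec clusterSpec
	if err := json.Unmarshal([]byte(req.Body), &spec); err != nil {
		return Response{StatusCode: 400, Body: `{"error":"invalid spec"}`}, nil
	}
	out, _ := json.Marshal(map[string]string{"status": "accepted", "cluster": spec.Name})
	return Response{
		StatusCode: 200,
		Headers:    map[string]string{"Content-Type": "application/json"},
		Body:       string(out),
	}, nil
}

func main() {
	resp, _ := handler(Request{Path: "/clusters", Body: `{"name":"dev-eu1","env":"dev"}`})
	fmt.Println(resp.StatusCode, resp.Body)
}
```

Pinning the event and response shapes first also gave us something mechanical to diff during the canary phase: identical inputs had to produce byte-identical bodies.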
Phase 3: Canary Deployments
We used Lambda aliases and weighted routing to deploy Go functions alongside their Python counterparts. Initially, 10% of API Gateway traffic was routed to the Go variant via alias weights. We validated response parity, latency profiles, and error rates in CloudWatch before increasing the weight.
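In Terraform, the weighted alias looks roughly like this. It's a sketch assuming the Go rewrite is published as version "7" of the same function; resource names and version numbers are illustrative:

```hcl
resource "aws_lambda_alias" "live" {
  name             = "live"
  function_name    = aws_lambda_function.create_pipeline.function_name
  function_version = "6" # stable Python version

  routing_config {
    additional_version_weights = {
      "7" = 0.10 # send 10% of invocations to the Go version
    }
  }
}
```

Raising the canary weight was then a one-line change in version control, reviewed and applied like any other infrastructure change.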
Phase 4: Gradual Cutover
Once we confirmed behavioral parity, we shifted traffic using weighted aliases — 10%, then 50%, then 100%. Only after 48 hours at full traffic with clean metrics did we decommission the Python functions and remove the old deployment packages from S3.
The Results
After migrating the core Lambda functions over 4 months, here's what we measured:
| Metric | Python | Go | Improvement |
|---|---|---|---|
| Lambda memory allocation | 512MB–1GB | 128MB | 75–87% reduction |
| Cold start time | 3–5s | 80–150ms | 97% faster |
| Deployment package | 80–120MB | 8–12MB | ~90% smaller |
| Avg execution duration | 800ms | 120ms | 85% faster |
| Monthly Lambda + API GW spend | Baseline | 0.35× baseline | 65% savings |
The 65% cost reduction came from a combination of factors:
- Lower memory allocation: Lambda pricing is memory × duration. Dropping from 512MB–1GB to 128MB per function was the single biggest cost lever.
- Shorter execution duration: Go's concurrent AWS SDK calls via goroutines slashed orchestration time. Functions that took 800ms in Python completed in 120ms in Go.
- Near-instant cold starts: With 80–150ms cold starts, we eliminated the need for provisioned concurrency — which alone was a meaningful line item on the bill.
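The memory × duration effect can be checked with back-of-the-envelope arithmetic. This sketch uses AWS's published x86 Lambda compute price of roughly $0.0000166667 per GB-second and the worst-case Python configuration from the table; it covers compute only, ignoring request charges and the free tier:

```go
package main

import "fmt"

func main() {
	// Approximate AWS Lambda x86 compute price: ~$0.0000166667 per GB-second.
	const perGBSecond = 0.0000166667

	// Per-invocation compute cost = memory (GB) × billed duration (s) × rate.
	pythonCost := 1.0 * 0.800 * perGBSecond // 1GB at 800ms average
	goCost := 0.125 * 0.120 * perGBSecond   // 128MB at 120ms average

	fmt.Printf("python: $%.10f per invocation\n", pythonCost)
	fmt.Printf("go:     $%.10f per invocation\n", goCost)
	fmt.Printf("ratio:  %.0fx cheaper on compute\n", pythonCost/goCost)
}
```

The compute-only ratio comes out far larger than the 65% total saving because API Gateway charges, request charges, and functions that were already small dilute the headline number.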
Key Takeaways
1. In serverless, language choice is a cost multiplier. Unlike traditional servers where you pay for uptime regardless, Lambda charges per invocation × memory × duration. A language that halves your memory requirement and execution time delivers 4× cost reduction on compute alone.
2. Go's AWS SDK v2 is excellent. The aws-sdk-go-v2 library is modular — you import only the service clients you need. Combined with Go's type system, it caught entire classes of bugs at compile time that Python would have swallowed until runtime.
3. DynamoDB + Go is a natural pairing. The attributevalue package for marshaling/unmarshaling Go structs to DynamoDB items is clean and type-safe. No more juggling boto3 resource vs client APIs.
4. Developer velocity didn't drop. This was our biggest concern. After a 2–3 week ramp-up period, the team was shipping Go Lambda functions at the same pace as Python. Go's simplicity and strong tooling (go fmt, go vet, built-in testing and benchmarking) actually improved our development workflow.
5. Terraform made the cutover seamless. Because the entire infrastructure was defined in Terraform, swapping a Lambda function's runtime from Python to Go was a config change — update the handler, the runtime, and the deployment artifact path. No manual console work, fully auditable in version control.
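The runtime swap in Terraform was roughly the following. This is a sketch with illustrative names; by late 2023, new Go functions targeted the provided.al2 custom runtime with a bootstrap binary:

```hcl
resource "aws_lambda_function" "create_pipeline" {
  function_name = "eks-provisioner-create-pipeline" # illustrative name

  # Before: runtime = "python3.11", handler = "app.handler", memory_size = 512
  runtime     = "provided.al2" # Go ships as a static binary on the custom runtime
  handler     = "bootstrap"
  memory_size = 128

  filename         = "${path.module}/build/create_pipeline.zip"
  source_code_hash = filebase64sha256("${path.module}/build/create_pipeline.zip")
  role             = aws_iam_role.lambda_exec.arn # existing execution role, unchanged
}
```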
Should You Migrate?
The honest answer: it depends.
If you're running orchestration-heavy, high-throughput Lambda functions behind API Gateway — Go is almost certainly worth the investment. The infrastructure savings alone will justify the engineering effort within a quarter.
If you're building data pipelines, ML inference services, or rapid prototypes — Python is still the right tool. Don't fight the ecosystem.
The key insight is this: language choice directly impacts your serverless costs at scale. In a pay-per-use model where you're billed per GB-second of memory and per millisecond of execution, even small per-function improvements compound into massive reductions when multiplied across thousands of daily invocations.
The 65% we saved isn't an anomaly. It's what happens when you match the right language to the right workload.
Have questions about migrating Lambda functions to Go, or want to discuss serverless cost optimization strategies? Get in touch or connect with me on LinkedIn.