When I joined the infrastructure team at Eiger, the AWS bill was growing every month without a clear explanation. Here is what we did to reverse that — and sustain it.
The Initial Assessment
The first step was getting visibility. AWS Cost Explorer and Trusted Advisor gave us a starting picture, but the real insight came from enabling Cost Allocation Tags and grouping spend by service and environment.
What we found:
- A significant number of EC2 instances were over-provisioned relative to actual CPU and memory utilisation
- No Reserved Instances or Savings Plans had been purchased for steady-state workloads
- Unattached EBS volumes were quietly accumulating charges
- S3 had no lifecycle policies — old logs from years back were sitting in Standard storage
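To give a feel for the grouping step, here is a minimal sketch in Python. The services, environments, and figures are all fabricated; in practice the rows would come from a Cost Explorer export (or a `get_cost_and_usage` call) once cost allocation tags are active.

```python
from collections import defaultdict

# Fabricated rows in the shape you might export from Cost Explorer
# once tagging is in place: (service, environment tag, monthly cost).
rows = [
    ("AmazonEC2", "prod", 4200.0),
    ("AmazonEC2", "dev",  1800.0),
    ("AmazonS3",  "prod",  950.0),
    ("AmazonS3",  "dev",   120.0),
]

# Group spend by environment to see where the money actually goes.
by_env = defaultdict(float)
for service, env, cost in rows:
    by_env[env] += cost

print(dict(by_env))  # prints {'prod': 5150.0, 'dev': 1920.0}
```

The same two lines of grouping work for any tag dimension (team, application), which is exactly why retrofitting tags later is so painful.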
Right-Sizing EC2
We deployed CloudWatch agents to collect memory metrics, since EC2's default CloudWatch metrics cover CPU but not memory. After 30 days of data collection:
- Several instances were candidates for downsizing by one or two instance sizes
- A handful of workloads moved from fixed instance types to burstable T3s
- Dev/test environments went onto a scheduled stop/start cycle outside business hours
The scheduled stop/start automation alone accounted for a meaningful portion of the eventual savings.
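As a rough illustration of why the scheduler matters, here is a back-of-envelope calculation. The schedule (12 hours a day, 5 days a week) and the $0.10/hour rate are illustrative figures, not our actual numbers.

```python
def stopped_hours_per_month(run_hours_per_day=12, run_days_per_week=5,
                            hours_per_month=730):
    """Hours per month an instance spends stopped under a
    business-hours schedule (illustrative defaults)."""
    weekly_run = run_hours_per_day * run_days_per_week   # 60 h/week
    run_fraction = weekly_run / (24 * 7)                 # ~0.36
    return hours_per_month * (1 - run_fraction)

# A dev instance at a hypothetical $0.10/hour on-demand rate:
monthly_saving = stopped_hours_per_month() * 0.10
print(round(monthly_saving, 2))  # prints 46.93
```

Roughly 64% of the hours in a month fall outside that schedule, so every stopped dev/test instance gives back close to two-thirds of its on-demand cost.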
Reserved Instances and Savings Plans
Once we had a clearer picture of the steady-state baseline, we committed:
- 1-year Reserved Instances for production databases (known, stable workloads)
- Compute Savings Plans for the variable EC2 fleet
- A proportion of the fleet kept on-demand to preserve flexibility for scaling events
The key discipline here is buying for your baseline, not your peaks.
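That discipline can be made concrete: pick the commitment level from a low percentile of hourly usage rather than from the maximum. A sketch with fabricated usage numbers:

```python
def baseline_commitment(hourly_usage, percentile=0.10):
    """Commitment level the fleet sits at or above ~90% of the time.
    Committing here, rather than at the peak, keeps the purchased
    capacity fully utilised."""
    ordered = sorted(hourly_usage)
    return ordered[int(len(ordered) * percentile)]

# A fabricated week of hourly instance counts: a steady base of 10
# with daytime peaks of 14 and 22.
usage = [10] * 100 + [14] * 40 + [22] * 28

print(baseline_commitment(usage))  # prints 10
print(max(usage))                  # prints 22 -- the peak; don't buy for this
```

Buying at 10 keeps the commitment busy around the clock; buying at 22 would leave more than half of it idle most hours.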
Storage
S3 lifecycle policies were absent across most buckets. We implemented:
- Transition to S3 Standard-IA after 30 days for infrequently accessed logs
- Transition to S3 Glacier after 90 days
- Delete after defined retention periods for ephemeral environments
- S3 Intelligent-Tiering on buckets with unpredictable access patterns
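The tiering rules above translate fairly directly into an S3 lifecycle configuration. This sketch shows the shape boto3's `put_bucket_lifecycle_configuration` expects; the `logs/` prefix and the 365-day expiry are illustrative values, not our actual policy.

```python
# Shape matches the LifecycleConfiguration argument of boto3's
# s3.put_bucket_lifecycle_configuration. The transition day counts
# mirror the policy described above; prefix and expiry are illustrative.
lifecycle = {
    "Rules": [
        {
            "ID": "log-tiering",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            # Hypothetical retention period for illustration only.
            "Expiration": {"Days": 365},
        }
    ]
}
```

One rule per access pattern keeps the policies reviewable; a single bucket-wide rule tends to surprise someone eventually.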
We also audited EBS volumes — anything unattached for more than 7 days was flagged for deletion after verification.
Automation and Governance
The savings only stick if there is something preventing drift. We built:
- Lambda functions to delete EBS volumes left unattached past a grace period
- SNS alerts for anomalous daily spend increases
- An anomaly monitor in AWS Cost Anomaly Detection
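The core of the volume-cleanup Lambda is simple enough to sketch. Everything here is illustrative: the records mimic entries from boto3's `ec2.describe_volumes()`, and `CreateTime` stands in for "time spent unattached". A production version should record when a volume was first seen unattached (for example via a tag), since `CreateTime` would wrongly flag an old volume detached only yesterday.

```python
from datetime import datetime, timedelta, timezone

GRACE = timedelta(days=7)

def stale_unattached(volumes, now):
    """Volumes in the 'available' (unattached) state older than the
    grace period. CreateTime is a simplifying proxy -- see note above."""
    return [
        v["VolumeId"] for v in volumes
        if v["State"] == "available" and now - v["CreateTime"] > GRACE
    ]

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
vols = [  # fabricated records
    {"VolumeId": "vol-old",  "State": "available",
     "CreateTime": now - timedelta(days=30)},
    {"VolumeId": "vol-new",  "State": "available",
     "CreateTime": now - timedelta(days=2)},
    {"VolumeId": "vol-used", "State": "in-use",
     "CreateTime": now - timedelta(days=90)},
]
print(stale_unattached(vols, now))  # prints ['vol-old']
```

The real function then snapshots and deletes the flagged volumes; keeping a final snapshot makes the "after verification" step reversible.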
The total result: £32K+ annual saving, a 26.6% reduction in the AWS bill, with no degradation in availability or performance.
What I Would Do Differently
Start the tagging earlier. Getting clean cost allocation data retroactively is painful. If you are building a new environment, tag from day one — per environment, per application, per team.
Also: get engineering buy-in before optimising. Some of the largest savings came from engineers volunteering workloads they knew were over-sized. The team knows where the waste is.