
Cloud FinOps Framework: AWS Cost Intelligence Dashboard, Budgets, and Cost Anomaly Detection for Enterprise Cost Governance

Architecting a FinOps framework that reduced cloud costs by 30% and delivered predictable spend - featuring Cost Intelligence Dashboard, automated anomaly detection, chargeback mechanisms, and executive-level cost visibility.

Milan Dangol

Sr DevOps & DevSecOps Engineer

Jun 22, 2025
11 min read

Introduction

Cloud costs can quickly spiral out of control without proper governance. I inherited a multi-account AWS environment spending nearly $2M annually with no cost visibility, no accountability, and monthly billing surprises. I built a FinOps framework that brought costs under control and changed the culture around cloud spending.

The transformation delivered:

  • 30% cost reduction ($540K annual savings)
  • Predictable monthly spend with 95% forecast accuracy
  • Per-team chargeback creating cost accountability
  • Automated anomaly detection catching issues in hours, not weeks

Architecture Overview

flowchart TB
    subgraph Sources["Cost Data Sources"]
        CUR[Cost & Usage Report]
        ORG[AWS Organizations]
        TAGS[Resource Tags]
    end
    subgraph Processing["Data Processing"]
        ATHENA[Athena Queries]
        GLUE[Glue ETL Jobs]
        LAMBDA[Lambda Processors]
    end
    subgraph Analysis["Cost Analysis"]
        CID[Cost Intelligence Dashboard]
        ANOMALY[Cost Anomaly Detection]
        BUDGETS[AWS Budgets]
        FORECAST[Cost Forecasting]
    end
    subgraph Reporting["Reporting & Actions"]
        QS[QuickSight Dashboards]
        SNS[SNS Notifications]
        SLACK[Slack Integration]
        TICKETS[Automated Tickets]
    end
    subgraph Governance["Governance"]
        POLICIES[Cost Policies]
        QUOTAS[Service Quotas]
        TAGGING[Tagging Standards]
    end
    Sources --> Processing
    Processing --> Analysis
    Analysis --> Reporting
    Governance --> Sources
    style Sources fill:#1a1a2e,stroke:#00d9ff,stroke-width:2px,color:#fff
    style Processing fill:#264653,stroke:#2a9d8f,stroke-width:2px,color:#fff
    style Analysis fill:#f77f00,stroke:#fff,stroke-width:2px,color:#fff
    style Reporting fill:#2a9d8f,stroke:#fff,stroke-width:2px,color:#fff
    style Governance fill:#9b5de5,stroke:#fff,stroke-width:2px,color:#fff

Cost & Usage Report Setup

# cost-reporting/cur.tf

resource "aws_cur_report_definition" "enterprise" {
  report_name                = "enterprise-cost-usage-report"
  time_unit                  = "HOURLY"
  format                     = "Parquet"
  compression                = "Parquet"
  additional_schema_elements = ["RESOURCES", "SPLIT_COST_ALLOCATION_DATA"]

  s3_bucket = aws_s3_bucket.cur.id
  s3_region = "us-east-1"
  s3_prefix = "cur"

  additional_artifacts = ["ATHENA"]

  report_versioning = "OVERWRITE_REPORT"

  refresh_closed_reports = true
}

resource "aws_s3_bucket" "cur" {
  bucket = "company-cost-usage-reports"

  tags = {
    Purpose = "Cost & Usage Reports"
    Compliance = "required"
  }
}

resource "aws_s3_bucket_policy" "cur" {
  bucket = aws_s3_bucket.cur.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "AllowCURDelivery"
        Effect = "Allow"
        Principal = {
          Service = "billingreports.amazonaws.com"
        }
        Action = [
          "s3:GetBucketAcl",
          "s3:GetBucketPolicy"
        ]
        Resource = aws_s3_bucket.cur.arn
      },
      {
        Sid    = "AllowCURWrite"
        Effect = "Allow"
        Principal = {
          Service = "billingreports.amazonaws.com"
        }
        Action   = "s3:PutObject"
        Resource = "${aws_s3_bucket.cur.arn}/*"
      }
    ]
  })
}

# Athena setup for CUR queries
resource "aws_athena_workgroup" "cur" {
  name = "cur-analysis"

  configuration {
    enforce_workgroup_configuration    = true
    publish_cloudwatch_metrics_enabled = true

    result_configuration {
      output_location = "s3://${aws_s3_bucket.athena_results.bucket}/output/"

      encryption_configuration {
        encryption_option = "SSE_S3"
      }
    }

    engine_version {
      selected_engine_version = "Athena engine version 3"
    }
  }

  tags = {
    Team = "finops"
  }
}

resource "aws_glue_catalog_database" "cur" {
  name = "cur_database"

  description = "Cost and Usage Report database"
}
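
With the report flowing and the Glue database in place, it's worth confirming the data is actually queryable before building dashboards on top of it. A minimal boto3 sketch, assuming the database and workgroup names from the Terraform above (the polling loop is simplified):

# scripts/verify_cur.py (illustrative; names assume the Terraform above)
import time
import boto3

athena = boto3.client("athena")

def run_query(sql: str) -> list:
    """Run a query in the cur-analysis workgroup and return the result rows."""
    qid = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": "cur_database"},
        WorkGroup="cur-analysis",
    )["QueryExecutionId"]

    # Poll until the query finishes (production code should use backoff)
    while True:
        state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(2)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Query {qid} ended in state {state}")
    return athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]

# Top ten services by unblended cost
rows = run_query("""
    SELECT line_item_product_code, SUM(line_item_unblended_cost) AS cost
    FROM cost_and_usage_report
    GROUP BY 1 ORDER BY 2 DESC LIMIT 10
""")
for row in rows[1:]:  # the first row is the header
    print(row["Data"][0]["VarCharValue"], row["Data"][1]["VarCharValue"])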

Cost Intelligence Dashboard

# quicksight/cid.tf

# Deploy Cost Intelligence Dashboard using CloudFormation
resource "aws_cloudformation_stack" "cid" {
  name = "cost-intelligence-dashboard"

  template_url = "https://aws-well-architected-labs.s3.amazonaws.com/Cost/Labs/400_Cost_Intelligence_Dashboard/cid-cfn.yaml"

  parameters = {
    QuickSightUserName        = var.quicksight_admin_user
    CURBucket                 = aws_s3_bucket.cur.id
    CURDatabaseName           = aws_glue_catalog_database.cur.name
    CURTableName              = "cost_and_usage_report"
    OptimizationDataCollectionAccountID = var.management_account_id
  }

  capabilities = ["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"]

  tags = {
    Dashboard = "CID"
    Team      = "finops"
  }
}

# QuickSight data source
resource "aws_quicksight_data_source" "athena_cur" {
  data_source_id = "athena-cur"
  name           = "Athena CUR Data Source"
  type           = "ATHENA"

  parameters {
    athena {
      work_group = aws_athena_workgroup.cur.id
    }
  }

  ssl_properties {
    disable_ssl = false
  }

  aws_account_id = data.aws_caller_identity.current.account_id

  permission {
    principal = aws_quicksight_group.finops.arn
    actions = [
      "quicksight:DescribeDataSource",
      "quicksight:DescribeDataSourcePermissions",
      "quicksight:PassDataSource",
      "quicksight:UpdateDataSource",
      "quicksight:UpdateDataSourcePermissions"
    ]
  }
}

# Custom analysis for executive dashboard
resource "aws_quicksight_analysis" "executive_summary" {
  analysis_id = "executive-cost-summary"
  name        = "Executive Cost Summary"

  source_entity {
    source_template {
      arn = aws_quicksight_template.executive.arn

      data_set_references {
        data_set_arn         = aws_quicksight_data_set.monthly_costs.arn
        data_set_placeholder = "monthlycosts"
      }
    }
  }

  aws_account_id = data.aws_caller_identity.current.account_id

  permissions {
    principal = aws_quicksight_group.executives.arn
    actions = [
      "quicksight:RestoreAnalysis",
      "quicksight:UpdateAnalysisPermissions",
      "quicksight:DeleteAnalysis",
      "quicksight:DescribeAnalysisPermissions",
      "quicksight:QueryAnalysis",
      "quicksight:DescribeAnalysis",
      "quicksight:UpdateAnalysis"
    ]
  }
}
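
One operational detail the dashboards depend on: SPICE datasets go stale unless something refreshes them. A small scheduled Lambda can trigger an ingestion after each CUR delivery; a sketch, with a hypothetical dataset ID:

# lambda/spice_refresh.py (sketch; the dataset ID is hypothetical)
import uuid
import boto3

quicksight = boto3.client("quicksight")
ACCOUNT_ID = boto3.client("sts").get_caller_identity()["Account"]

def lambda_handler(event, context):
    """Kick off a SPICE ingestion so dashboards reflect fresh CUR data."""
    response = quicksight.create_ingestion(
        AwsAccountId=ACCOUNT_ID,
        DataSetId="monthly-costs",      # hypothetical dataset ID
        IngestionId=str(uuid.uuid4()),  # must be unique per ingestion
    )
    return {"status": response["IngestionStatus"]}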

Cost Anomaly Detection

flowchart TD
    subgraph Detection["Anomaly Detection Flow"]
        COLLECT[Collect hourly cost data]
        ML[ML Model analyzes patterns]
        DETECT[Detect anomalies]
        COLLECT --> ML
        ML --> DETECT
    end
    subgraph Evaluation["Anomaly Evaluation"]
        THRESHOLD{Cost increase > threshold?}
        CONTEXT[Evaluate context]
        CLASSIFY[Classify severity]
        DETECT --> THRESHOLD
        THRESHOLD -->|Yes| CONTEXT
        CONTEXT --> CLASSIFY
    end
    subgraph Response["Response Actions"]
        ALERT_LOW[Low: Email notification]
        ALERT_MED[Medium: Slack + Email]
        ALERT_HIGH[High: PagerDuty + Slack]
        TICKET[Create Jira ticket]
        CLASSIFY --> ALERT_LOW
        CLASSIFY --> ALERT_MED
        CLASSIFY --> ALERT_HIGH
        ALERT_HIGH --> TICKET
    end
    style Detection fill:#1a1a2e,stroke:#00d9ff,stroke-width:2px,color:#fff
    style Evaluation fill:#f77f00,stroke:#fff,stroke-width:2px,color:#fff
    style Response fill:#e63946,stroke:#fff,stroke-width:2px,color:#fff

# cost-anomaly/main.tf

resource "aws_ce_anomaly_monitor" "service_monitor" {
  name              = "service-cost-monitor"
  monitor_type      = "DIMENSIONAL"
  monitor_dimension = "SERVICE"

  tags = {
    Team = "finops"
  }
}

resource "aws_ce_anomaly_monitor" "account_monitor" {
  name              = "account-cost-monitor"
  monitor_type      = "DIMENSIONAL"
  monitor_dimension = "LINKED_ACCOUNT"
}

# High impact anomalies
resource "aws_ce_anomaly_subscription" "high_impact" {
  name      = "high-impact-anomalies"
  frequency = "IMMEDIATE"

  monitor_arn_list = [
    aws_ce_anomaly_monitor.service_monitor.arn,
    aws_ce_anomaly_monitor.account_monitor.arn,
  ]

  subscriber {
    type    = "SNS"
    address = aws_sns_topic.cost_alerts.arn
  }

  threshold_expression {
    and {
      dimension {
        key           = "ANOMALY_TOTAL_IMPACT_ABSOLUTE"
        values        = ["500"]
        match_options = ["GREATER_THAN_OR_EQUAL"]
      }
    }
  }

  tags = {
    Severity = "high"
  }
}

# Daily summary of all anomalies
resource "aws_ce_anomaly_subscription" "daily_summary" {
  name      = "daily-anomaly-summary"
  frequency = "DAILY"

  monitor_arn_list = [
    aws_ce_anomaly_monitor.service_monitor.arn,
    aws_ce_anomaly_monitor.account_monitor.arn,
  ]

  subscriber {
    type    = "EMAIL"
    address = "finops-team@company.com"
  }

  threshold_expression {
    and {
      dimension {
        key           = "ANOMALY_TOTAL_IMPACT_ABSOLUTE"
        values        = ["100"]
        match_options = ["GREATER_THAN_OR_EQUAL"]
      }
    }
  }
}

# Lambda to process anomalies and create tickets
resource "aws_lambda_function" "anomaly_processor" {
  filename         = "anomaly_processor.zip"
  function_name    = "cost-anomaly-processor"
  role            = aws_iam_role.anomaly_processor.arn
  handler         = "index.handler"
  runtime         = "python3.11"
  timeout         = 60

  environment {
    variables = {
      SLACK_WEBHOOK   = var.slack_webhook_url
      JIRA_API_URL    = var.jira_api_url
      JIRA_API_TOKEN  = var.jira_api_token
      SEVERITY_THRESHOLD = "500"
    }
  }
}

resource "aws_sns_topic_subscription" "anomaly_to_lambda" {
  topic_arn = aws_sns_topic.cost_alerts.arn
  protocol  = "lambda"
  endpoint  = aws_lambda_function.anomaly_processor.arn
}
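
The processor Lambda's code isn't shown in the Terraform above; a minimal sketch of the handler follows. The payload fields (impact.totalImpact) reflect my reading of the Cost Anomaly Detection SNS message and should be verified against a live alert; the Jira call is omitted:

# index.py (illustrative sketch of the anomaly processor handler)
import json
import os
import urllib.request

SEVERITY_THRESHOLD = float(os.environ.get("SEVERITY_THRESHOLD", "500"))

def handler(event, context):
    """Route Cost Anomaly Detection alerts delivered via SNS."""
    for record in event["Records"]:
        anomaly = json.loads(record["Sns"]["Message"])
        # Field names assumed from the CAD alert payload; verify on a real alert
        impact = float(anomaly.get("impact", {}).get("totalImpact", 0))
        if impact >= SEVERITY_THRESHOLD:
            notify_slack(f":rotating_light: High-impact cost anomaly: ~${impact:,.0f}")
            # High severity would also open a Jira ticket here (omitted)
        else:
            notify_slack(f"Cost anomaly detected: ~${impact:,.0f}")

def notify_slack(text: str) -> None:
    """Post a message to the configured Slack webhook."""
    req = urllib.request.Request(
        os.environ["SLACK_WEBHOOK"],
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)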

Budget Management

# budgets/hierarchical.tf

locals {
  teams = {
    platform = {
      monthly_budget = 15000
      contacts       = ["platform-leads@company.com"]
      services       = ["EC2", "EKS", "RDS"]
    }
    data = {
      monthly_budget = 25000
      contacts       = ["data-leads@company.com"]
      services       = ["EMR", "Glue", "Athena", "S3"]
    }
    ml = {
      monthly_budget = 30000
      contacts       = ["ml-leads@company.com"]
      services       = ["SageMaker", "Bedrock", "EC2"]
    }
  }
}

# Team-level budgets
resource "aws_budgets_budget" "team_budgets" {
  for_each = local.teams

  name         = "${each.key}-monthly-budget"
  budget_type  = "COST"
  limit_amount = each.value.monthly_budget
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  cost_filter {
    name   = "TagKeyValue"
    values = ["user:Team$${each.key}"]
  }

  notification {
    comparison_operator = "GREATER_THAN"
    threshold           = 80
    threshold_type      = "PERCENTAGE"
    notification_type   = "ACTUAL"

    subscriber_email_addresses = each.value.contacts
  }

  notification {
    comparison_operator = "GREATER_THAN"
    threshold           = 100
    threshold_type      = "PERCENTAGE"
    notification_type   = "FORECASTED"

    subscriber_email_addresses = concat(
      each.value.contacts,
      ["cfo@company.com"]
    )
  }

  notification {
    comparison_operator = "GREATER_THAN"
    threshold           = 120
    threshold_type      = "PERCENTAGE"
    notification_type   = "ACTUAL"

    subscriber_email_addresses = concat(
      each.value.contacts,
      ["cfo@company.com", "cto@company.com"]
    )

    subscriber_sns_topic_arns = [aws_sns_topic.budget_breach.arn]
  }
}

# Organization-level budget
resource "aws_budgets_budget" "organizational" {
  name         = "organizational-monthly-budget"
  budget_type  = "COST"
  limit_amount = "150000"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  notification {
    comparison_operator = "GREATER_THAN"
    threshold           = 90
    threshold_type      = "PERCENTAGE"
    notification_type   = "FORECASTED"

    subscriber_email_addresses = [
      "cfo@company.com",
      "cto@company.com"
    ]
  }
}

# Service-specific budgets for high-cost services
resource "aws_budgets_budget" "ec2_compute" {
  name         = "ec2-compute-budget"
  budget_type  = "COST"
  limit_amount = "50000"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  cost_filter {
    name = "Service"
    values = ["Amazon Elastic Compute Cloud - Compute"]
  }

  notification {
    comparison_operator = "GREATER_THAN"
    threshold           = 85
    threshold_type      = "PERCENTAGE"
    notification_type   = "ACTUAL"

    subscriber_sns_topic_arns = [aws_sns_topic.cost_alerts.arn]
  }
}
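
Budget limits shouldn't be guesses. Cost Explorer's forecast API gives a usage-based baseline to size them against, and it's also how I tracked forecast accuracy over time. A minimal sketch (the forecast window is illustrative):

# scripts/forecast_check.py (sketch; the forecast window is illustrative)
from datetime import date, timedelta

import boto3

ce = boto3.client("ce")

def next_month_forecast() -> float:
    """Forecast the next ~30 days of unblended spend with an 80% prediction interval."""
    start = date.today() + timedelta(days=1)  # forecasts must start in the future
    end = start + timedelta(days=30)
    resp = ce.get_cost_forecast(
        TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
        Metric="UNBLENDED_COST",
        Granularity="MONTHLY",
        PredictionIntervalLevel=80,
    )
    return float(resp["Total"]["Amount"])

print(f"Forecast, next ~30 days: ${next_month_forecast():,.0f}")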

Cost Allocation Tags

# tagging/cost-allocation.tf

# Activate cost allocation tags
resource "aws_ce_cost_allocation_tag" "team" {
  tag_key = "Team"
  status  = "Active"
}

resource "aws_ce_cost_allocation_tag" "environment" {
  tag_key = "Environment"
  status  = "Active"
}

resource "aws_ce_cost_allocation_tag" "project" {
  tag_key = "Project"
  status  = "Active"
}

resource "aws_ce_cost_allocation_tag" "cost_center" {
  tag_key = "CostCenter"
  status  = "Active"
}

# Tag policy for Organizations
resource "aws_organizations_policy" "tagging_policy" {
  name        = "RequiredTagsPolicy"
  description = "Enforce required cost allocation tags"
  type        = "TAG_POLICY"

  content = jsonencode({
    tags = {
      Team = {
        tag_key = {
          "@@assign" = "Team"
        }
        enforced_for = {
          "@@assign" = [
            "ec2:instance",
            "ec2:volume",
            "rds:db",
            "s3:bucket",
            "dynamodb:table",
            "lambda:function"
          ]
        }
      }
      Environment = {
        tag_key = {
          "@@assign" = "Environment"
        }
        tag_value = {
          "@@assign" = ["production", "staging", "development", "sandbox"]
        }
        enforced_for = {
          "@@assign" = [
            "ec2:*",
            "rds:*",
            "s3:*"
          ]
        }
      }
      CostCenter = {
        tag_key = {
          "@@assign" = "CostCenter"
        }
        enforced_for = {
          "@@assign" = ["*"]
        }
      }
    }
  })
}

resource "aws_organizations_policy_attachment" "tagging_workloads" {
  policy_id = aws_organizations_policy.tagging_policy.id
  target_id = aws_organizations_organizational_unit.workloads.id
}
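
Tag policies prevent new drift, but existing resources still need auditing. A sketch using the Resource Groups Tagging API to flag resources missing the Team tag (one caveat: resources that have never been tagged may not appear in this API, so it complements rather than replaces AWS Config rules):

# scripts/tag_audit.py (sketch: flag resources missing the Team tag)
import boto3

tagging = boto3.client("resourcegroupstaggingapi")

def untagged_resources(required_key: str = "Team") -> list[str]:
    """Return ARNs of resources missing a required tag key."""
    missing = []
    for page in tagging.get_paginator("get_resources").paginate():
        for resource in page["ResourceTagMappingList"]:
            keys = {tag["Key"] for tag in resource.get("Tags", [])}
            if required_key not in keys:
                missing.append(resource["ResourceARN"])
    return missing

for arn in untagged_resources():
    print(f"Missing Team tag: {arn}")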

Cost Optimization Automation

# lambda/cost-optimization-recommendations.py
import json
import os
from datetime import datetime, timedelta

import boto3

ce_client = boto3.client('ce')
ec2_client = boto3.client('ec2')
rds_client = boto3.client('rds')

def lambda_handler(event, context):
    """Generate cost optimization recommendations"""

    recommendations = []

    # Find idle EC2 instances
    recommendations.extend(find_idle_ec2_instances())

    # Find unattached EBS volumes
    recommendations.extend(find_unattached_volumes())

    # Find old snapshots
    recommendations.extend(find_old_snapshots())

    # Find underutilized RDS instances
    recommendations.extend(find_underutilized_rds())

    # Calculate total potential savings
    total_savings = sum(r['monthly_savings'] for r in recommendations)

    # Send report
    send_recommendations_report(recommendations, total_savings)

    return {
        'statusCode': 200,
        'body': json.dumps({
            'recommendations_count': len(recommendations),
            'potential_monthly_savings': total_savings
        })
    }

def find_idle_ec2_instances():
    """Find EC2 instances with low CPU utilization"""
    cloudwatch = boto3.client('cloudwatch')
    recommendations = []

    instances = ec2_client.describe_instances(
        Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
    )

    for reservation in instances['Reservations']:
        for instance in reservation['Instances']:
            instance_id = instance['InstanceId']

            # Get CPU utilization for last 7 days
            metrics = cloudwatch.get_metric_statistics(
                Namespace='AWS/EC2',
                MetricName='CPUUtilization',
                Dimensions=[{'Name': 'InstanceId', 'Value': instance_id}],
                StartTime=datetime.now() - timedelta(days=7),
                EndTime=datetime.now(),
                Period=3600,
                Statistics=['Average']
            )

            if metrics['Datapoints']:
                avg_cpu = sum(d['Average'] for d in metrics['Datapoints']) / len(metrics['Datapoints'])

                if avg_cpu < 5:  # Less than 5% average CPU
                    # Calculate cost
                    instance_type = instance['InstanceType']
                    monthly_cost = get_instance_cost(instance_type)

                    recommendations.append({
                        'type': 'idle_ec2',
                        'resource_id': instance_id,
                        'instance_type': instance_type,
                        'avg_cpu': round(avg_cpu, 2),
                        'monthly_savings': monthly_cost,
                        'recommendation': 'Stop or terminate idle instance',
                        'priority': 'high'
                    })

    return recommendations

def find_unattached_volumes():
    """Find EBS volumes not attached to any instance"""
    recommendations = []

    volumes = ec2_client.describe_volumes(
        Filters=[{'Name': 'status', 'Values': ['available']}]
    )

    for volume in volumes['Volumes']:
        volume_id = volume['VolumeId']
        size_gb = volume['Size']
        volume_type = volume['VolumeType']

        # Calculate monthly cost (rough estimate)
        cost_per_gb = {'gp3': 0.08, 'gp2': 0.10, 'io1': 0.125, 'io2': 0.125}
        monthly_cost = size_gb * cost_per_gb.get(volume_type, 0.10)

        recommendations.append({
            'type': 'unattached_volume',
            'resource_id': volume_id,
            'size_gb': size_gb,
            'volume_type': volume_type,
            'monthly_savings': monthly_cost,
            'recommendation': 'Delete unused volume or create snapshot',
            'priority': 'medium'
        })

    return recommendations

def send_recommendations_report(recommendations, total_savings):
    """Send recommendations via email and Slack"""
    sns = boto3.client('sns')

    message = f"""
Cost Optimization Recommendations

Total Potential Monthly Savings: ${total_savings:,.2f}

Recommendations: {len(recommendations)}
- High Priority: {len([r for r in recommendations if r['priority'] == 'high'])}
- Medium Priority: {len([r for r in recommendations if r['priority'] == 'medium'])}
- Low Priority: {len([r for r in recommendations if r['priority'] == 'low'])}

View detailed report: https://quicksight.aws.amazon.com/cost-optimization
"""

    sns.publish(
        TopicArn=os.environ['SNS_TOPIC_ARN'],
        Subject='Weekly Cost Optimization Recommendations',
        Message=message
    )
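
A few helpers are referenced above without being shown (get_instance_cost, plus the snapshot and RDS scans). Pricing is the fiddly one: a production version would query the AWS Price List API, but a static table keeps the sketch self-contained. A hypothetical get_instance_cost under that assumption:

# Hypothetical get_instance_cost; real code would use the Price List API
HOURLY_ON_DEMAND_USD = {  # illustrative us-east-1 Linux on-demand prices
    "m5.large": 0.096,
    "m5.xlarge": 0.192,
    "c5.2xlarge": 0.34,
}

def get_instance_cost(instance_type):
    """Rough monthly on-demand cost for an instance type (730 hours/month)."""
    hourly = HOURLY_ON_DEMAND_USD.get(instance_type, 0.10)  # fallback guess
    return round(hourly * 730, 2)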

Chargeback Dashboard

-- athena/queries/team-chargeback.sql

-- Monthly cost per team
CREATE OR REPLACE VIEW team_monthly_costs AS
SELECT
    bill_payer_account_id,
    line_item_usage_account_id as account_id,
    resource_tags_user_team as team,
    DATE_TRUNC('month', line_item_usage_start_date) as month,
    line_item_product_code as service,
    SUM(line_item_unblended_cost) as total_cost,
    SUM(CASE WHEN line_item_line_item_type = 'Usage' THEN line_item_unblended_cost ELSE 0 END) as usage_cost,
    SUM(CASE WHEN line_item_line_item_type = 'SavingsPlanCoveredUsage' THEN line_item_unblended_cost ELSE 0 END) as savings_plan_cost
FROM
    cur_database.cost_and_usage_report
WHERE
    line_item_line_item_type IN ('Usage', 'SavingsPlanCoveredUsage', 'DiscountedUsage')
    AND resource_tags_user_team IS NOT NULL
GROUP BY
    1, 2, 3, 4, 5;

-- Top 10 cost drivers per team
CREATE OR REPLACE VIEW team_top_costs AS
WITH ranked_costs AS (
    SELECT
        team,
        month,
        service,
        total_cost,
        ROW_NUMBER() OVER (PARTITION BY team, month ORDER BY total_cost DESC) as rank
    FROM team_monthly_costs
)
SELECT *
FROM ranked_costs
WHERE rank <= 10;

-- Month-over-month cost change
CREATE OR REPLACE VIEW team_cost_trends AS
SELECT
    curr.team,
    curr.month as current_month,
    curr.total_cost as current_cost,
    prev.total_cost as previous_cost,
    curr.total_cost - prev.total_cost as cost_change,
    ROUND(((curr.total_cost - prev.total_cost) / NULLIF(prev.total_cost, 0)) * 100, 2) as percent_change
FROM team_monthly_costs curr
LEFT JOIN team_monthly_costs prev
    ON curr.team = prev.team
    AND curr.month = DATE_ADD('month', 1, prev.month)
WHERE curr.month = DATE_TRUNC('month', CURRENT_DATE);
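
The views above feed the QuickSight chargeback dashboard, but for quick ad-hoc numbers Cost Explorer can group spend directly by the Team tag without touching Athena. A minimal sketch (the billing period is illustrative):

# scripts/team_costs.py (sketch; the billing period is illustrative)
import boto3

ce = boto3.client("ce")

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2025-06-01", "End": "2025-07-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "Team"}],
)

# Each group key looks like "Team$platform"; an empty value means untagged
for period in response["ResultsByTime"]:
    for group in period["Groups"]:
        team = group["Keys"][0].split("$", 1)[-1] or "untagged"
        cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
        print(f"{team}: ${cost:,.2f}")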

Cost Governance Policies

flowchart TD
    subgraph Preventive["Preventive Controls"]
        SCP[Service Control Policies]
        QUOTA[Service Quotas]
        BUDGET_ACTION[Budget Actions]
    end
    subgraph Detective["Detective Controls"]
        ANOMALY[Anomaly Detection]
        TAGGING[Tag Compliance]
        UNUSED[Unused Resource Detection]
    end
    subgraph Corrective["Corrective Actions"]
        AUTO_STOP[Auto-stop resources]
        ALERT[Alert owners]
        TICKET[Create remediation ticket]
    end
    Preventive --> Detective
    Detective --> Corrective
    style Preventive fill:#2a9d8f,stroke:#fff,stroke-width:2px,color:#fff
    style Detective fill:#f77f00,stroke:#fff,stroke-width:2px,color:#fff
    style Corrective fill:#e63946,stroke:#fff,stroke-width:2px,color:#fff
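
On the preventive side, the single highest-leverage control was a Service Control Policy blocking instance families nobody had a business case for. A sketch of creating one with boto3, run from the management account (the allowed family list is illustrative):

# scripts/create_cost_scp.py (sketch; the allowed instance families are illustrative)
import json
import boto3

org = boto3.client("organizations")  # must run in the management account

# Deny launching instance types outside an approved allow-list
scp = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyUnapprovedInstanceTypes",
        "Effect": "Deny",
        "Action": "ec2:RunInstances",
        "Resource": "arn:aws:ec2:*:*:instance/*",
        "Condition": {
            "StringNotLike": {"ec2:InstanceType": ["t3.*", "m5.*", "c5.*"]}
        },
    }],
}

policy = org.create_policy(
    Name="cost-guardrail-instance-types",
    Description="Block instance families without an approved business case",
    Type="SERVICE_CONTROL_POLICY",
    Content=json.dumps(scp),
)
print(policy["Policy"]["PolicySummary"]["Id"])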

Results: 30% Cost Reduction

flowchart LR
    subgraph Before["Before FinOps (Monthly)"]
        B_COMPUTE["Compute: $80K<br/>Over-provisioned"]
        B_STORAGE["Storage: $35K<br/>Unused volumes"]
        B_DATA["Data Transfer: $15K<br/>Unoptimized"]
        B_OTHER["Other: $20K"]
        B_TOTAL["Total: $150K/month"]
    end
    subgraph After["After FinOps (Monthly)"]
        A_COMPUTE["Compute: $52K<br/>Right-sized + Spot"]
        A_STORAGE["Storage: $22K<br/>Cleaned up"]
        A_DATA["Data Transfer: $10K<br/>Optimized"]
        A_OTHER["Other: $21K"]
        A_TOTAL["Total: $105K/month"]
    end
    Before ==> After
    subgraph Savings["Annual Savings"]
        COMPUTE_SAVE["Compute: $336K"]
        STORAGE_SAVE["Storage: $156K"]
        DATA_SAVE["Data: $60K"]
        TOTAL_SAVE["Total: $540K/year<br/>30% reduction"]
    end
    After ==> Savings
    style Before fill:#e63946,stroke:#fff,stroke-width:2px,color:#fff
    style After fill:#2a9d8f,stroke:#fff,stroke-width:2px,color:#fff
    style Savings fill:#ffbe0b,stroke:#fff,stroke-width:2px,color:#000

Best Practices

Practice                | Implementation         | Impact
------------------------|------------------------|--------------------
Tag everything          | Enforce tag policies   | 100% cost visibility
Right-size resources    | Weekly recommendations | 20-30% savings
Use Savings Plans       | Automated purchase     | 40-70% discount
Delete unused resources | Automated cleanup      | 10-15% savings
Monitor anomalies       | ML-based detection     | Catch issues early
Implement chargeback    | Per-team dashboards    | Accountability
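
The "Use Savings Plans" row deserves a concrete starting point: Cost Explorer computes purchase recommendations from your own usage history. A sketch of pulling one with boto3 (the term and payment options shown are one sensible choice, not the only one):

# scripts/sp_recommendation.py (sketch; options shown are one sensible choice)
import boto3

ce = boto3.client("ce")

resp = ce.get_savings_plans_purchase_recommendation(
    SavingsPlansType="COMPUTE_SP",
    TermInYears="ONE_YEAR",
    PaymentOption="NO_UPFRONT",
    LookbackPeriodInDays="THIRTY_DAYS",
)

summary = resp["SavingsPlansPurchaseRecommendation"]["SavingsPlansPurchaseRecommendationSummary"]
print("Estimated monthly savings:", summary["EstimatedMonthlySavingsAmount"])
print("Hourly commitment to purchase:", summary["HourlyCommitmentToPurchase"])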

Troubleshooting

"CUR data not appearing in Athena"

# Check CUR delivery
aws cur describe-report-definitions

# Verify S3 bucket
aws s3 ls s3://company-cost-usage-reports/cur/

# Check Glue crawler
aws glue get-crawler --name cur-crawler

"Budget notifications not working"

  • Verify SNS topic subscriptions confirmed
  • Check budget threshold configuration
  • Ensure cost allocation tags are active

"QuickSight dashboard errors"

  • Refresh SPICE datasets
  • Check Athena query permissions
  • Verify data source connections

Conclusion

Building a FinOps framework transforms cloud cost management from reactive firefighting to proactive optimization. The combination of:

  • Cost & Usage Reports for detailed cost data
  • Cost Intelligence Dashboard for executive visibility
  • Anomaly Detection for early issue identification
  • Budgets & Alerts for proactive governance
  • Chargeback mechanisms for team accountability

delivered a 30% cost reduction ($540K annually) while creating a culture of cost awareness. The key is making cost data visible, actionable, and tied to team ownership.

Tags

#aws #finops #cost-optimization #quicksight #cost-intelligence #budgets
