deployment
PassServerless deployment with zero-downtime, multi-environment strategies, and infrastructure validation. Use when deploying Lambda functions, managing environments, or troubleshooting deployment failures.
(0)
82
10
11
Install Skill
Skills are third-party code from public GitHub repositories. SkillHub scans for known malicious patterns but cannot guarantee safety. Review the source code before installing.
Install globally (user-level):
npx skillhub install majiayu000/claude-skill-registry/deploymentInstall in current project:
npx skillhub install majiayu000/claude-skill-registry/deployment --projectSuggested path: ~/.claude/skills/deployment/
AI Review
Instruction Quality75
Description Precision65
Usefulness63
Technical Soundness75
Scored 69 for a well-structured deployment skill with decision tree, real AWS commands, and multi-layer validation. Strong technical execution with zero-downtime patterns. Limited to AWS Lambda ecosystem. Would benefit from cross-cloud examples.
SKILL.md Content
---
name: deployment
description: Serverless deployment with zero-downtime, multi-environment strategies, and infrastructure validation. Use when deploying Lambda functions, managing environments, or troubleshooting deployment failures.
---
# Deployment Skill
**Tech Stack**: AWS Lambda, Docker, Terraform, GitHub Actions, Doppler (secrets)
**Source**: Extracted from CLAUDE.md deployment principles and production deployment patterns.
---
## When to Use This Skill
Use the deployment skill when:
- ✓ Deploying Lambda functions to AWS
- ✓ Managing multi-environment deployments (dev/staging/production)
- ✓ Implementing zero-downtime deployments
- ✓ Troubleshooting deployment failures
- ✓ Validating infrastructure configuration
- ✓ Setting up CI/CD pipelines
- ✓ Pre-deployment validation (see [lambda-deployment checklist](../../checklists/lambda-deployment.md))
**DO NOT use this skill for:**
- ✗ Local development setup (use project README instead)
- ✗ Running tests (use testing-workflow skill)
- ✗ Code refactoring (use refactor skill)
**Quick Links**:
- **Pre-deployment checklist**: [lambda-deployment.md](../../checklists/lambda-deployment.md) - Systematic verification before deploying Lambda
- **Error investigation**: [error-investigation skill](../error-investigation/) - AWS-specific troubleshooting
- **Testing workflow**: [testing-workflow skill](../testing-workflow/) - Docker-based testing for deployment fidelity
---
## Quick Deployment Decision Tree
```
What are you deploying?
├─ Lambda function update?
│ ├─ Code change only? → Update function code, wait for update
│ ├─ Config change (env vars, memory)? → Update configuration, wait
│ ├─ Zero-downtime required? → Use versioning + alias pattern
│ └─ Rollback needed? → Point alias to previous version
│
├─ New environment?
│ ├─ Branch-based (dev/staging/prod)? → Follow multi-env guide
│ ├─ Secrets setup? → Configure Doppler + GitHub secrets
│ └─ Infrastructure? → Terraform apply
│
├─ Deployment failed?
│ ├─ Check CloudWatch logs → Filter ERROR level
│ ├─ Validate secrets → Run validation script
│ ├─ Check resource state → AWS CLI describe commands
│ └─ Verify permissions → IAM policy validation
│
└─ CI/CD pipeline setup?
├─ Define environments → dev/staging/prod
├─ Configure artifact promotion → Immutable images
├─ Add validation gates → Pre-deploy checks
└─ Setup monitoring → CloudWatch + smoke tests
```
---
## Core Deployment Patterns
### Pattern 1: Zero-Downtime Lambda Deployment
**Problem:** Updating Lambda function causes brief unavailability during deployment.
**Solution:** Version + Alias pattern
```
$LATEST (mutable, testing)
↓ publish version
Version N (immutable snapshot)
↓ update alias
live (production pointer) → Version N
```
**Benefits:**
- Zero downtime (alias swap is atomic)
- Instant rollback (point alias to previous version)
- Test before promotion ($LATEST → Version)
See [ZERO_DOWNTIME.md](ZERO_DOWNTIME.md) for detailed patterns.
### Pattern 2: Multi-Environment Strategy
**Branch-Based Deployment:**
```
dev branch → dev environment (~8 min)
↓ PR
main branch → staging environment (~10 min)
↓ Tag v*.*.*
production environment (~12 min)
```
**Artifact Promotion:**
- Build once in dev
- Same immutable Docker image promoted to staging/prod
- What you test is what you deploy
See [MULTI_ENV.md](MULTI_ENV.md) for environment separation patterns.
> **Note**: Deployment verification applies **Progressive Evidence Strengthening** (CLAUDE.md Principle #2). We verify from weak evidence (exit codes) to strong evidence (actual traffic metrics).
### Pattern 3: Deployment Validation
**Multi-Layer Verification (Deployment Application):**
1. **Status Code** (weakest signal)
```bash
aws lambda invoke --function-name worker --payload '{}' /tmp/response.json
# Exit code 0 only means "invocation succeeded", not "function worked"
```
2. **Response Payload** (stronger signal)
```bash
if grep -q "errorMessage" /tmp/response.json; then
echo "❌ Lambda returned error"
exit 1
fi
```
3. **CloudWatch Logs** (strongest signal)
```bash
ERROR_COUNT=$(aws logs filter-log-events \
--log-group-name /aws/lambda/worker \
--filter-pattern "ERROR" \
--query 'length(events)' --output text)
if [ "$ERROR_COUNT" -gt 0 ]; then
echo "❌ Found errors in logs"
exit 1
fi
```
**Principle:** AWS services returning 200 OK doesn't guarantee error-free execution. Always validate logs.
See [MONITORING.md](MONITORING.md) for comprehensive validation patterns.
### Pattern 4: Secret Management by Consumer
**Doppler (Runtime Secrets)**
- Who needs it: Lambda functions during execution
- Examples: `AURORA_HOST`, `OPENROUTER_API_KEY`
- Access: Doppler → Terraform → Lambda env vars
**GitHub Secrets (Deployment Secrets)**
- Who needs it: CI/CD pipeline during deployment
- Examples: `CLOUDFRONT_DISTRIBUTION_ID`, `AWS_ACCESS_KEY_ID`
- Access: `${{ secrets.SECRET_NAME }}` in workflows
**The Deciding Question:** "Does the Lambda function running in production need this value?"
- YES → Doppler (runtime secret)
- NO → GitHub Secrets (deployment secret)
See [MULTI_ENV.md#secret-management](MULTI_ENV.md#secret-management) for detailed patterns.
---
## Pre-Deployment Validation
**Principle:** Validate configuration BEFORE deployment, not during.
### Validation Script
```bash
# Run before every deployment
scripts/validate_deployment_ready.sh
# Checks:
# 1. Doppler configuration exists
# 2. Required environment variables set
# 3. AWS resources exist (S3 buckets, DynamoDB tables)
# 4. Lambda function dependencies available
```
**Why This Matters:**
- Distinguishes config failures from logic failures
- Fails fast (30 seconds vs 8 minute deploy)
- Prevents partial deployments
### Infrastructure-Deployment Contract Validation
**Pattern:** Query AWS infrastructure, validate secrets match reality.
```yaml
jobs:
validate-deployment-config:
runs-on: ubuntu-latest
steps:
- name: Validate CloudFront Distributions
run: |
# Query actual infrastructure
ACTUAL=$(aws cloudfront list-distributions \
--query 'DistributionList.Items[?Comment==`app-dev`].Id' \
--output text)
# Compare to GitHub secret
if [ "$ACTUAL" != "${{ secrets.CLOUDFRONT_DISTRIBUTION_ID }}" ]; then
echo "❌ Secret mismatch detected"
exit 1
fi
build:
needs: validate-deployment-config # Won't run if validation fails
```
**Benefits:**
- Catches configuration drift automatically
- No manual checklist needed
- Single source of truth (AWS infrastructure is reality)
See [MONITORING.md#infrastructure-validation](MONITORING.md#infrastructure-validation).
---
## Common Deployment Scenarios
### Scenario 1: Deploy Code Change to Dev
```bash
# 1. Commit to dev branch
git add .
git commit -m "feat: Add new feature"
git push origin dev
# 2. GitHub Actions automatically:
# - Builds Docker image
# - Pushes to ECR
# - Updates Lambda function code
# - Waits for function update (no sleep!)
# - Runs smoke tests
# - Validates CloudWatch logs
# 3. Manual verification (optional)
just test-dev-api # Test deployed function
```
**Time:** ~8 minutes
### Scenario 2: Promote Dev → Staging
```bash
# 1. Create PR from dev → main
gh pr create --base main --head dev --title "Release: v1.2.0"
# 2. Review and merge
gh pr merge --squash
# 3. GitHub Actions automatically:
# - Uses SAME Docker image from dev (artifact promotion)
# - Updates staging Lambda with promoted image
# - Runs integration tests
# - Validates staging environment
```
**Time:** ~10 minutes (faster because no rebuild)
### Scenario 3: Deploy to Production
```bash
# 1. Tag release on main branch
git tag v1.2.0
git push origin v1.2.0
# 2. GitHub Actions automatically:
# - Uses SAME Docker image from staging
# - Publishes new Lambda version (immutable)
# - Updates 'live' alias to new version (zero-downtime)
# - Runs smoke tests
# - Validates production logs
```
**Time:** ~12 minutes
### Scenario 4: Rollback Production
```bash
# Find previous version
aws lambda list-versions-by-function \
--function-name worker \
--query 'Versions[-2].Version' # Previous version
# Update alias to previous version (instant rollback)
aws lambda update-alias \
--function-name worker \
--name live \
--function-version 42 # Previous working version
# Verify rollback
aws lambda get-alias --function-name worker --name live
```
**Time:** < 30 seconds (instant)
---
## Deployment Principles
**From CLAUDE.md global instructions:**
> "Deployment Philosophy: Serverless AWS Lambda with immutable container images, zero-downtime promotion via versioning."
### Core Principles
1. **Immutability**: Build once, promote same artifact through all environments
2. **Zero-Downtime**: Version + Alias pattern for atomic updates
3. **Fail Fast**: Validate before deployment, not during
4. **Multi-Layer Verification**: Status code + Payload + Logs
5. **Artifact Promotion**: Dev image → Staging image → Prod image (same hash)
6. **Secret Separation**: Runtime secrets (Doppler) vs Deployment secrets (GitHub)
### When NOT to Deploy
- ❌ Tests failing (run `just test-deploy` first)
- ❌ Secrets not configured (run validation script)
- ❌ Infrastructure not created (run `terraform apply` first)
- ❌ No PR review for staging/prod (require approval)
### AWS CLI Waiter Pattern
**DO:**
```bash
aws lambda update-function-code --function-name worker --image-uri $IMAGE
aws lambda wait function-updated --function-name worker # Blocks until ready
```
**DON'T:**
```bash
aws lambda update-function-code --function-name worker --image-uri $IMAGE
sleep 30 # ❌ Arbitrary delay, might be too short or too long
```
**Why Waiters:**
- Precise timing (returns immediately when ready)
- Handles variability (cold start vs warm container)
- Prevents race conditions
---
## File Organization
```
.claude/skills/deployment/
├── SKILL.md # This file (entry point)
├── ZERO_DOWNTIME.md # Lambda versioning patterns
├── MULTI_ENV.md # Environment strategy
├── MONITORING.md # Validation and monitoring
└── scripts/
└── validate_deployment_ready.sh # Pre-deployment validation
```
---
## Next Steps
- **For zero-downtime patterns**: See [ZERO_DOWNTIME.md](ZERO_DOWNTIME.md)
- **For multi-environment setup**: See [MULTI_ENV.md](MULTI_ENV.md)
- **For deployment monitoring**: See [MONITORING.md](MONITORING.md)
- **For complete runbook**: See `docs/deployment/TELEGRAM_DEPLOYMENT_RUNBOOK.md`
---
## References
- [AWS Lambda Versioning and Aliases](https://docs.aws.amazon.com/lambda/latest/dg/configuration-aliases.html)
- [AWS CLI Waiters](https://docs.aws.amazon.com/cli/latest/userguide/cli-usage-wait.html)
- [Doppler Documentation](https://docs.doppler.com/)
- [GitHub Actions: AWS Credentials](https://github.com/aws-actions/configure-aws-credentials)
- [Terraform AWS Provider](https://registry.terraform.io/providers/hashicorp/aws/latest/docs)