CI/CD Fundamentals
Continuous Integration, Delivery, and Deployment are distinct but related practices that together automate the path from developer commit to running software:
- Continuous Integration (CI): Every commit triggers an automated build and test run. Fast feedback on broken code before it reaches main.
- Continuous Delivery (CD): The build artefact is automatically deployed to a staging environment and is always in a releasable state. Deployment to production is a one-click operation.
- Continuous Deployment: Every passing commit automatically deploys to production with no human approval. Requires high test coverage and robust monitoring.
The four DORA metrics measure CI/CD effectiveness: Deployment Frequency, Lead Time for Changes, Change Failure Rate, and Time to Restore Service. Elite performers deploy multiple times per day with lead times of less than one hour.
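To make the four metrics concrete, the sketch below computes them from a list of deployment records. The Deployment record and the dora_metrics function are hypothetical names chosen for illustration; a real pipeline would pull these timestamps from its own deployment and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                   # when the change was committed
    deployed_at: datetime                    # when it reached production
    failed: bool = False                     # did it cause a production failure?
    restored_at: Optional[datetime] = None   # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Compute the four DORA metrics over a reporting period."""
    if not deployments:
        raise ValueError("need at least one deployment")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        # None when nothing failed (or nothing has been restored yet)
        "mean_time_to_restore": (
            sum(restore_times, timedelta()) / len(restore_times)
            if restore_times else None
        ),
    }
```

Using the median for lead time keeps one unusually slow change from dominating the figure; a team could just as reasonably report a percentile instead.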
GitHub Actions Setup
GitHub Actions uses YAML workflow files in .github/workflows/. A workflow is triggered by events (push, pull_request, schedule) and runs one or more jobs on GitHub-hosted or self-hosted runners.
name: CI
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
# Cancel in-progress runs on new commits to the same PR
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
env:
  NODE_VERSION: '20'
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
jobs:
  test:
    name: Test
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16-alpine
        env:
          POSTGRES_DB: testdb
          POSTGRES_PASSWORD: testpass
        ports: ['5432:5432']
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: npm
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test -- --coverage
        env:
          DATABASE_URL: postgresql://postgres:testpass@localhost:5432/testdb
      - name: Upload coverage
        uses: codecov/codecov-action@v4
        with:
          token: ${{ secrets.CODECOV_TOKEN }}
Automated Testing Stage
Structure your test suite in three layers (unit, integration, and end-to-end), each with a distinct scope and speed. The matrix below runs the three layers as parallel jobs; to keep PR feedback fast, the expensive integration and E2E layers can be restricted to the main branch.
  test-matrix:
    name: Test (${{ matrix.type }})
    runs-on: ubuntu-latest
    strategy:
      matrix:
        type: [unit, integration, e2e]
      fail-fast: false  # run all types even if one fails
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20', cache: npm }
      - run: npm ci
      - name: Run unit tests
        if: matrix.type == 'unit'
        run: npm run test:unit -- --coverage --coverageThreshold='{"global":{"lines":80}}'
      - name: Run integration tests
        if: matrix.type == 'integration'
        run: npm run test:integration
        env:
          DATABASE_URL: ${{ secrets.TEST_DATABASE_URL }}
      - name: Run E2E tests
        if: matrix.type == 'e2e'
        run: |
          npx playwright install --with-deps chromium
          npm run test:e2e
      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: test-results-${{ matrix.type }}
          path: test-results/
- Keep unit tests under 100ms each — use mocks for database and network calls.
- Enforce a minimum line coverage threshold in CI to prevent regressions.
- Use --bail in unit tests to stop on first failure for faster feedback in development.
- Parallelise E2E tests across shards: --shard=1/4 with Playwright.
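The idea behind sharding can be sketched in a few lines: every runner computes the same deterministic split of the test files and executes only its own slice. The shard_files function below is a hypothetical illustration of that idea, not Playwright's actual assignment algorithm, which may distribute tests differently.

```python
def shard_files(test_files: list[str], shard_index: int, total_shards: int) -> list[str]:
    """Return the slice of test files for one shard (1-based index).

    Files are sorted first so every runner computes the same assignment,
    then dealt out round-robin across the shards.
    """
    if not 1 <= shard_index <= total_shards:
        raise ValueError("shard_index must be between 1 and total_shards")
    ordered = sorted(test_files)
    return ordered[shard_index - 1::total_shards]
```

With ten files and four shards, each runner receives two or three files and no file is run twice, so wall-clock time approaches a quarter of the serial run when test durations are similar.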
Code Quality Checks
Quality gates run in parallel with tests. They enforce code style, catch bugs statically, and flag security vulnerabilities — all before a merge.
name: Code Quality
on: [push, pull_request]
jobs:
  lint:
    name: Lint & Format
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20', cache: npm }
      - run: npm ci
      - name: ESLint
        run: npm run lint -- --max-warnings=0
      - name: Prettier check
        run: npm run format:check
  security:
    name: Security Scan
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # lets CodeQL upload its results
    steps:
      - uses: actions/checkout@v4
      - name: Dependency audit
        run: npm audit --audit-level=high
      # "languages" is an input of the init step, which must run before analyze
      - name: Initialise CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: javascript
      - name: SAST with CodeQL
        uses: github/codeql-action/analyze@v3
  sonarqube:
    name: SonarQube Analysis
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }
      # In real use, pin this to a release tag or commit SHA rather than @master
      - uses: SonarSource/sonarcloud-github-action@master
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
Build and Package
Build Docker images only after all tests and quality checks pass. Use GitHub Container Registry (GHCR) for storage and tag images with both the Git SHA (immutable) and branch name (floating).
  build-image:
    name: Build & Push Docker Image
    runs-on: ubuntu-latest
    # Only build if all checks pass. "needs" can only reference jobs defined in
    # the same workflow file, so this assumes lint and security live alongside it.
    needs: [test-matrix, lint, security]
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/${{ github.repository }}
          tags: |
            type=sha,format=long
            type=ref,event=branch
            type=semver,pattern={{version}}
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
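The tagging scheme is easier to reason about when spelled out. The sketch below approximates the two tags that the metadata step yields for a branch push: an immutable tag carrying the full commit SHA (the action prefixes it with sha- by default) and a floating tag named after the branch. The image_tags function is a hypothetical helper, and its character replacement only approximates the action's own sanitisation rules.

```python
import re


def image_tags(repository: str, sha: str, branch: str) -> list[str]:
    """Build the immutable SHA tag and the floating branch tag for an image."""
    # Registry image names must be lowercase.
    image = f"ghcr.io/{repository.lower()}"
    # Replace characters that are not valid in a tag, such as "/" in "feature/x".
    branch_tag = re.sub(r"[^A-Za-z0-9_.-]", "-", branch)
    return [f"{image}:sha-{sha}", f"{image}:{branch_tag}"]
```

Deployments should reference the SHA tag, because the branch tag moves every time the branch receives a new commit.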
Multi-Environment Deployment
Promote artefacts through environments: develop branch → staging, main branch with manual approval → production. Use GitHub Environments with protection rules for approvals and secrets scoping.
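name: Deploy
on:
  workflow_run:
    workflows: [CI]
    types: [completed]
    branches: [main, develop]
jobs:
  deploy-staging:
    name: Deploy to Staging
    # In a workflow_run event, github.ref and github.sha point at the default branch,
    # so the triggering branch and commit are read from the event payload instead.
    if: github.event.workflow_run.event == 'push' && github.event.workflow_run.head_branch == 'develop' && github.event.workflow_run.conclusion == 'success'
    runs-on: ubuntu-latest
    environment:
      name: staging
      url: https://staging.myapp.com
    steps:
      - name: Deploy to Staging
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.STAGING_HOST }}
          username: deploy
          key: ${{ secrets.STAGING_SSH_KEY }}
          script: |
            # Matches the sha-<commit> tag produced by type=sha,format=long
            export IMAGE_TAG=sha-${{ github.event.workflow_run.head_sha }}
            cd /opt/app
            docker compose pull app
            docker compose up -d app
            docker system prune -f
  deploy-production:
    name: Deploy to Production
    if: github.event.workflow_run.event == 'push' && github.event.workflow_run.head_branch == 'main' && github.event.workflow_run.conclusion == 'success'
    runs-on: ubuntu-latest
    environment:
      name: production
      url: https://myapp.com
      # GitHub Environment protection rules require manual approval
    steps:
      - name: Deploy to Production
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.PROD_HOST }}
          username: deploy
          key: ${{ secrets.PROD_SSH_KEY }}
          script: |
            cd /opt/app
            # Assumption: the last good tag is recorded in a file named .deployed-tag
            PREVIOUS_TAG=$(cat .deployed-tag 2>/dev/null || true)
            export IMAGE_TAG=sha-${{ github.event.workflow_run.head_sha }}
            docker compose pull app
            docker compose up -d --no-deps app
            sleep 30
            # Smoke test. docker compose has no rollback command, so on failure
            # the previous tag is redeployed and the job is marked as failed.
            if ! curl -f https://myapp.com/health; then
              IMAGE_TAG="$PREVIOUS_TAG" docker compose up -d --no-deps app
              exit 1
            fi
            echo "$IMAGE_TAG" > .deployed-tag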
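A fixed sleep before the smoke test either waits too long or not long enough. One alternative is to poll the health endpoint until it answers or a retry budget runs out. The sketch below shows that approach; http_ok and wait_until_healthy are hypothetical helper names, and the attempt count and delay are illustrative defaults rather than recommendations.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection errors, timeouts, and 4xx/5xx responses all count as unhealthy.
        return False


def wait_until_healthy(check: Callable[[], bool], attempts: int = 10,
                       delay_seconds: float = 3.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `check` until it passes or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if check():
            return True
        if attempt < attempts:
            sleep(delay_seconds)  # no need to wait after the final attempt
    return False
```

A deploy script could call wait_until_healthy(lambda: http_ok("https://myapp.com/health")) and trigger its rollback path when the result is False.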
- Store secrets in GitHub Environment secrets, not repository secrets, so they're scoped to specific environments.
- Rotate secrets regularly and use gh secret set in your rotation script to automate updates.
- Never print secrets in logs. GitHub Actions auto-masks known secrets, but be careful with encoded variants.
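A rotation script built on gh secret set might look roughly like the sketch below. It assumes the GitHub CLI is installed and authenticated; the function names and the example secret, environment, and repository values are hypothetical. A real rotation would also need to update the credential at its source (for example the database or cloud provider), which is out of scope here.

```python
import secrets
import subprocess


def rotation_command(name: str, environment: str, repo: str) -> list[str]:
    """Build the gh CLI invocation that updates one environment secret."""
    return ["gh", "secret", "set", name, "--env", environment, "--repo", repo]


def rotate_secret(name: str, environment: str, repo: str) -> None:
    """Generate a new random value and store it as an environment secret."""
    new_value = secrets.token_urlsafe(32)
    # The value is passed on stdin so it never appears in the process list.
    subprocess.run(rotation_command(name, environment, repo),
                   input=new_value, text=True, check=True)
```

For example, rotate_secret("API_KEY", "production", "myorg/app") would replace the production-scoped API_KEY secret without the value ever being echoed to the terminal or logs.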
Rollback Strategies
Even with strong automated testing, bad deployments happen. Design rollback into your pipeline from day one — not as an afterthought.
Blue/Green Deployment: Run two identical production environments (blue = current, green = new). Switch traffic from blue to green atomically. Rollback is instant — switch back.
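The switching logic can be modelled with a toy router that tracks which environment is live. BlueGreenRouter is a hypothetical class for illustration only; in practice the switch is a load balancer or DNS change rather than a Python object.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BlueGreenRouter:
    """Toy model of a load balancer that fronts two environments."""
    live: str = "blue"
    versions: dict[str, Optional[str]] = field(
        default_factory=lambda: {"blue": "v1", "green": None}
    )

    @property
    def idle(self) -> str:
        """The environment that is not receiving traffic."""
        return "green" if self.live == "blue" else "blue"

    def deploy(self, version: str) -> None:
        """Install the new version on the idle environment; traffic is unaffected."""
        self.versions[self.idle] = version

    def switch(self) -> None:
        """Point all traffic at the idle environment in a single step."""
        if self.versions[self.idle] is None:
            raise RuntimeError("idle environment has nothing deployed")
        self.live = self.idle

    def rollback(self) -> None:
        """Rolling back is the same operation as switching: flip to the other side."""
        self.switch()
```

Because the old environment keeps running the previous version untouched, rollback needs no rebuild or redeploy, which is what makes it effectively instant.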
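  deploy-bluegreen:
    runs-on: ubuntu-latest
    # Assumes AWS credentials are already configured for this job
    steps:
      - name: Deploy new task definition (green)
        # The memory value is illustrative; the image tag matches type=sha,format=long
        run: |
          aws ecs register-task-definition \
            --family my-app \
            --container-definitions "[{
              \"name\": \"app\",
              \"image\": \"ghcr.io/myorg/app:sha-${{ github.sha }}\",
              \"memory\": 512,
              \"portMappings\": [{\"containerPort\": 8080}]
            }]"
      # The circuit breaker applies to ECS rolling deployments. A service that uses the
      # CodeDeploy controller is shifted with "aws deploy create-deployment" instead.
      - name: Update service (rolling update with automatic rollback)
        run: |
          aws ecs update-service \
            --cluster production \
            --service my-app \
            --task-definition my-app \
            --deployment-configuration \
              "deploymentCircuitBreaker={enable=true,rollback=true}"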
Canary Releases: Route 5% of traffic to the new version while 95% goes to stable. Monitor error rates and latency. Gradually increase traffic if metrics stay healthy; auto-rollback if they don't.
upstream stable { server app-stable:8080; }
upstream canary { server app-canary:8080; }
split_clients "${remote_addr}${http_user_agent}" $backend {
    5%    canary;
    *     stable;
}
server {
    location / {
        proxy_pass http://$backend;
    }
}
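The "gradually increase or roll back" rule can be expressed as a small decision function that a monitoring job evaluates on each check. The sketch below is one possible policy; the function name, the step sequence, and the one-percentage-point tolerance are assumptions for illustration, not values taken from any particular tool.

```python
def next_canary_weight(current_weight: int, canary_error_rate: float,
                       stable_error_rate: float, tolerance: float = 0.01,
                       steps: tuple[int, ...] = (5, 25, 50, 100)) -> int:
    """Return the next traffic percentage for the canary.

    0 means roll back; 100 means the canary has fully replaced stable.
    """
    if canary_error_rate > stable_error_rate + tolerance:
        return 0  # canary is measurably worse: send all traffic back to stable
    for step in steps:
        if step > current_weight:
            return step  # healthy: advance to the next larger step
    return 100  # already at full traffic
```

Comparing the canary against the stable error rate, rather than against a fixed threshold, avoids rolling back because of an incident that affects both versions equally. A fuller policy would also compare latency.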
Monitoring Pipeline Performance
Track pipeline performance to identify bottlenecks, flaky tests, and DORA metric trends over time. GitHub Actions provides built-in job duration data; export it to your metrics platform.
      - name: Notify Slack on failure
        if: failure()
        uses: slackapi/slack-github-action@v1
        with:
          channel-id: ${{ secrets.SLACK_CHANNEL_ID }}
          payload: |
            {
              "text": "❌ Pipeline failed on `${{ github.ref_name }}`",
              "attachments": [{
                "color": "danger",
                "fields": [
                  {"title": "Repository", "value": "${{ github.repository }}", "short": true},
                  {"title": "Commit", "value": "${{ github.sha }}", "short": true},
                  {"title": "Run URL", "value": "${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"}
                ]
              }]
            }
        env:
          SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
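Job duration data can be turned into a bottleneck report with very little code. The sketch below assumes a list of job objects shaped like those returned by the GitHub REST API for a workflow run, each with name, started_at, and completed_at fields; the function names are hypothetical, and fetching the data and shipping it to a metrics platform are left out.

```python
from datetime import datetime


def _parse(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-01-01T10:00:00Z."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def job_durations(jobs: list[dict]) -> dict[str, float]:
    """Map each completed job's name to its duration in seconds."""
    durations = {}
    for job in jobs:
        if not job.get("started_at") or not job.get("completed_at"):
            continue  # job is queued or still running
        elapsed = _parse(job["completed_at"]) - _parse(job["started_at"])
        durations[job["name"]] = elapsed.total_seconds()
    return durations


def slowest_job(jobs: list[dict]) -> tuple[str, float]:
    """Return the (name, seconds) of the longest-running completed job."""
    durations = job_durations(jobs)
    name = max(durations, key=durations.get)
    return name, durations[name]
```

Tracking the slowest job per run over time shows where caching, sharding, or splitting a job would shorten the pipeline most.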
Elite-performer benchmarks for the four DORA metrics:
- Deployment Frequency: Multiple times per day
- Lead Time for Changes: Less than one hour
- Change Failure Rate: 0–15%
- Time to Restore Service: Less than one hour