Storage Migration

Migrating APK storage from GitHub Releases to Cloudflare R2 with zero impact on deployed devices.

Last updated: April 27, 2026

Why Migrate

GitHub Releases is the initial APK storage backend. At scale (10,000+ schools), it becomes unsuitable:

  • Rate limits: 5,000 API requests/hour per token. Concurrent fleet boots can exhaust this.
  • Acceptable Use Policy: GitHub discourages using Releases as a high-volume CDN.
  • No regional edge: Downloads from Kenya hit US-based GitHub servers.
  • No analytics: No visibility into download traffic or failures.

Cloudflare R2 solves all of these: it charges no egress fees, supports custom domains, has CDN edge presence in Africa, and is designed for serving binary files at fleet scale.

Architecture (Why Devices Need No Changes)

The migration is invisible because devices never know which storage backend hosts the APK. They only know one stable hostname and the URL contract it serves:

Stable Contract
text
cdn.familypocket.io/update/latest          -> version metadata
cdn.familypocket.io/update/v{version}/apk  -> 302 redirect to current backend
cdn.familypocket.io/provision/apk?token=.. -> 302 redirect to current backend

Behind these URLs, a Vercel function decides where to redirect based on one environment variable:

Abstraction Layer
text
Deployed Kiosk
    |
    v
Vercel Function (reads KIOSK_STORAGE_BACKEND env var)
    |
    +-- "github" --> GitHub Releases (initial)
    +-- "r2"     --> Cloudflare R2   (target)

Migration = flip the env var. Devices in the field don't know anything changed.
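The decision point can be sketched as follows. This is a minimal illustration, not the deployed function: the names resolveBackend, resolveApkUrl, and UrlResolver are hypothetical, and defaulting to "github" when the variable is unset is an assumption.

```typescript
type StorageBackend = 'github' | 'r2';

// Hypothetical adapter shape: given a version, return a download URL.
type UrlResolver = (version: string) => Promise<string>;

// Reads KIOSK_STORAGE_BACKEND from an env-like object.
// Assumption: an unset variable means the initial backend ("github").
function resolveBackend(env: Record<string, string | undefined>): StorageBackend {
  const raw = (env.KIOSK_STORAGE_BACKEND ?? 'github').trim().toLowerCase();
  if (raw === 'github' || raw === 'r2') return raw;
  // Fail loudly on typos rather than silently picking a backend.
  throw new Error(`Unsupported KIOSK_STORAGE_BACKEND: ${raw}`);
}

// Picks the adapter for the configured backend and asks it for the URL
// that the 302 redirect should point to.
async function resolveApkUrl(
  version: string,
  env: Record<string, string | undefined>,
  adapters: Record<StorageBackend, UrlResolver>,
): Promise<string> {
  return adapters[resolveBackend(env)](version);
}
```

In a real handler, adapters would be wired to getGitHubReleaseAssetUrl() and getR2SignedUrl(), and env would be process.env.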

Anti-Patterns (Migration Breaks If Any Exist)

  • A kiosk app version that hardcodes https://github.com/... directly
  • An NFC provisioning tag pointing to GitHub instead of cdn.familypocket.io
  • A backend response returning apk_url pointing outside cdn.familypocket.io

When to Migrate

Trigger thresholds (any one is sufficient):

  • Active fleet exceeds 1,000 devices
  • Monthly GitHub bandwidth exceeds 50 GB
  • GitHub API rate limit hit during a release rollout

Do NOT migrate during: an active phased rollout, a force-update event, or the first 30 days of a new kiosk version.

Ideal window: existing version at 100% rollout, no new release planned for 2 weeks.
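The thresholds and blackout conditions above can be written down as two small checks. This is a sketch only: the FleetMetrics and ReleaseState shapes are hypothetical, and where those numbers come from (telemetry, GitHub traffic reports) is left open.

```typescript
// Hypothetical inputs; the field names are illustrative, not an existing schema.
interface FleetMetrics {
  activeDevices: number;
  monthlyGithubBandwidthGB: number;
  rateLimitHitDuringRollout: boolean;
}

interface ReleaseState {
  phasedRolloutActive: boolean;
  forceUpdateActive: boolean;
  daysSinceNewestVersionReleased: number;
}

// Any one trigger is sufficient.
function migrationTriggered(m: FleetMetrics): boolean {
  return (
    m.activeDevices > 1000 ||
    m.monthlyGithubBandwidthGB > 50 ||
    m.rateLimitHitDuringRollout
  );
}

// Any one blocker means "not now", even if a trigger has fired.
function migrationBlocked(s: ReleaseState): boolean {
  return (
    s.phasedRolloutActive ||
    s.forceUpdateActive ||
    s.daysSinceNewestVersionReleased < 30
  );
}
```

A migration is due when migrationTriggered(...) is true and migrationBlocked(...) is false.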

Pre-Migration Audit

1. Kiosk App Source Code

Search for hardcoded URLs
bash
grep -rn "github.com" app/src/main/
grep -rn "githubusercontent" app/src/main/
grep -rn "r2.dev" app/src/main/

Expected: zero hits. Only cdn.familypocket.io should appear.

2. Backend Response Shape

Verify apk_url points to CDN
bash
curl -H "X-Device-Id: audit-test" \
     -H "X-Current-Version: 2.4.0" \
     https://cdn.familypocket.io/update/latest | jq .apk_url

Must start with https://cdn.familypocket.io/.
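The same check can be automated, for example in a backend test. The sketch below parses the URL and compares scheme and hostname; pointsAtCdn is a hypothetical helper name.

```typescript
const CDN_HOST = 'cdn.familypocket.io';

// True only for well-formed https URLs whose host is exactly the CDN host.
function pointsAtCdn(apkUrl: string): boolean {
  try {
    const parsed = new URL(apkUrl);
    return parsed.protocol === 'https:' && parsed.hostname === CDN_HOST;
  } catch {
    // new URL() throws on strings that are not absolute URLs.
    return false;
  }
}
```

Comparing the parsed hostname instead of a string prefix also rejects lookalike hosts such as cdn.familypocket.io.example.com.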

3. NFC Provisioning Tags

Sample a provisioning tag. The download URL must point to cdn.familypocket.io/provision/apk, not GitHub.

4. Vercel Function

  • KIOSK_STORAGE_BACKEND env var exists, set to github
  • Code uses the env var to choose between adapters
  • Both getGitHubReleaseAssetUrl() and getR2SignedUrl() exist

Cloudflare R2 Setup

Create the Bucket

  1. Cloudflare dashboard → R2 → Create bucket
  2. Name: familypocket-kiosk-apks
  3. Location hint: Eastern Europe (EEUR), the closest available option to Kenya

Generate API Token

  1. R2 → Manage R2 API Tokens → Create API Token
  2. Permission: Object Read & Write on familypocket-kiosk-apks only
  3. TTL: No expiry (rotated annually)

Vercel Environment Variables

| Variable             | Value                      |
|----------------------|----------------------------|
| R2_ACCOUNT_ID        | Your Cloudflare account ID |
| R2_ACCESS_KEY_ID     | From API token creation    |
| R2_SECRET_ACCESS_KEY | From API token creation    |
| R2_BUCKET_NAME       | familypocket-kiosk-apks    |

Add to all Vercel environments but do not flip KIOSK_STORAGE_BACKEND yet.
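Because a missing variable only surfaces when the R2 adapter is first used, a startup or health-check guard can help. A minimal sketch, with missingR2Vars as a hypothetical helper:

```typescript
// The four variables from the table above.
const REQUIRED_R2_VARS = [
  'R2_ACCOUNT_ID',
  'R2_ACCESS_KEY_ID',
  'R2_SECRET_ACCESS_KEY',
  'R2_BUCKET_NAME',
] as const;

// Returns the names of required variables that are unset or blank.
function missingR2Vars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_R2_VARS.filter((name) => !env[name]?.trim());
}
```

Calling missingR2Vars(process.env) before the flip should return an empty array in every Vercel environment.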

R2 Domain Strategy

Two options for serving R2 content:

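| Option                    | How                                                    | Trade-off                                              |
|---------------------------|--------------------------------------------------------|--------------------------------------------------------|
| A: Direct subdomain       | Point cdn.familypocket.io directly to R2 bucket        | Simpler, but loses the Vercel abstraction layer        |
| B: Behind Vercel (chosen) | Vercel generates signed R2 URLs, 302 redirects devices | Keeps the single decision point for future migrations  |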

We chose Option B because it keeps the Vercel function as the single routing layer. Migration changes nothing about DNS. Devices are redirected to signed URLs on <account-id>.r2.cloudflarestorage.com instead of GitHub signed URLs.

R2 Adapter Code

The R2 adapter generates short-lived signed URLs. R2 is S3-compatible, so we use the AWS SDK:

Install
bash
npm install @aws-sdk/client-s3 @aws-sdk/s3-request-presigner

lib/storage/r2.ts
typescript
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';

// R2 speaks the S3 API; 'auto' is the region value R2 expects.
const r2Client = new S3Client({
  region: 'auto',
  endpoint: `https://${process.env.R2_ACCOUNT_ID}.r2.cloudflarestorage.com`,
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
});

// Returns a presigned GET URL for kiosk-<version>.apk.
export async function getR2SignedUrl(version: string): Promise<string> {
  const command = new GetObjectCommand({
    Bucket: process.env.R2_BUCKET_NAME!,
    Key: `kiosk-${version}.apk`,
  });
  // expiresIn is in seconds: 300 s = 5 min
  return getSignedUrl(r2Client, command, { expiresIn: 300 });
}

Why signed URLs? They expire in 5 minutes. Even if a URL leaks, it's useless shortly after. Public buckets allow anyone to download forever.

Migration Execution (7 Steps)

Total time: ~2 hours of focused work, plus a 24-hour observation window.

Step 1: Copy APKs from GitHub to R2

For every active version (target_version and previous_stable_version):

scripts/migrate-apks-to-r2.sh
bash
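#!/usr/bin/env bash
# Requires GITHUB_PAT and R2_ACCOUNT_ID, plus R2 credentials for the aws CLI
# (for example AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY, as in the CI step).
set -euo pipefail

VERSIONS=("2.4.1" "2.5.0")
GITHUB_REPO="familypocket/familypocket-kiosk"

for VERSION in "${VERSIONS[@]}"; do
  echo "Migrating v$VERSION..."

  # Look up the release asset's API URL on GitHub
  ASSET_URL=$(curl -fsS -H "Authorization: Bearer $GITHUB_PAT" \
    "https://api.github.com/repos/$GITHUB_REPO/releases/tags/v$VERSION" \
    | jq -r '.assets[] | select(.name=="familypocket-kiosk.apk") | .url')

  if [ -z "$ASSET_URL" ]; then
    echo "No familypocket-kiosk.apk asset found for v$VERSION" >&2
    exit 1
  fi

  # Download the binary (the Accept header makes the API return the file itself)
  curl -fsSL -H "Authorization: Bearer $GITHUB_PAT" \
    -H "Accept: application/octet-stream" \
    "$ASSET_URL" -o "/tmp/kiosk-$VERSION.apk"

  # Print the checksum; compare it by hand against the DB before relying on the upload
  ACTUAL=$(sha256sum "/tmp/kiosk-$VERSION.apk" | awk '{print $1}')
  echo "SHA-256: $ACTUAL (verify against kiosk_rollout_config.checksums)"

  # Upload to R2
  aws s3 cp "/tmp/kiosk-$VERSION.apk" \
    "s3://familypocket-kiosk-apks/kiosk-$VERSION.apk" \
    --endpoint-url "https://$R2_ACCOUNT_ID.r2.cloudflarestorage.com"

  rm "/tmp/kiosk-$VERSION.apk"
  echo "Done: v$VERSION"
done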

Step 2: Verify R2 Reads Work

Test the R2 adapter without flipping the switch: download each APK via a signed URL and confirm that its SHA-256 and file size match the copy on GitHub.
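The comparison itself is small enough to script. The sketch below assumes the APK bytes have already been downloaded through a signed URL and that the expected SHA-256 and size are known (for instance from kiosk_rollout_config and the GitHub asset); verifyArtifact and ExpectedArtifact are hypothetical names.

```typescript
import { createHash } from 'node:crypto';

// Hex-encoded SHA-256 of a byte buffer.
function sha256Hex(data: Uint8Array): string {
  return createHash('sha256').update(data).digest('hex');
}

// Hypothetical record of what the download should look like.
interface ExpectedArtifact {
  sha256: string;
  sizeBytes: number;
}

// Returns a list of problems; an empty list means the download matches.
function verifyArtifact(data: Uint8Array, expected: ExpectedArtifact): string[] {
  const problems: string[] = [];
  if (data.byteLength !== expected.sizeBytes) {
    problems.push(`size ${data.byteLength} != expected ${expected.sizeBytes}`);
  }
  const actual = sha256Hex(data);
  if (actual !== expected.sha256.toLowerCase()) {
    problems.push(`sha256 ${actual} != expected ${expected.sha256}`);
  }
  return problems;
}
```

Returning every mismatch, rather than stopping at the first, makes a truncated upload (wrong size and wrong hash) easy to tell apart from a wrong file of the right size.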

Step 3: Canary on a Single Device

Set R2_CANARY_DEVICES=<device-serial> in Vercel. The function routes that device to R2 while all others stay on GitHub. Trigger an update, confirm via telemetry. Wait 24 hours.
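A canary override of this kind might look like the sketch below. Two assumptions are made that the text does not confirm: R2_CANARY_DEVICES holds a comma-separated list, and the device identifier the function sees (for example from the X-Device-Id header) is the same serial. The function names are hypothetical.

```typescript
// Parses "SN-001, SN-002" into a set, ignoring blanks and whitespace.
function parseCanaryDevices(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? '')
      .split(',')
      .map((s) => s.trim())
      .filter((s) => s.length > 0),
  );
}

// Canary devices always go to R2; everyone else follows the global switch.
function backendForDevice(
  deviceId: string | undefined,
  env: Record<string, string | undefined>,
): 'github' | 'r2' {
  const canaries = parseCanaryDevices(env.R2_CANARY_DEVICES);
  if (deviceId !== undefined && canaries.has(deviceId)) return 'r2';
  return env.KIOSK_STORAGE_BACKEND === 'r2' ? 'r2' : 'github';
}
```

Once KIOSK_STORAGE_BACKEND is r2 the override has no effect, which is why it can be removed at decommission.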

Step 4: Flip the Switch

Vercel Dashboard
text
KIOSK_STORAGE_BACKEND=r2  (was: github)

Redeploy. Verify immediately:

curl -I "https://cdn.familypocket.io/update/v2.5.0/apk"
# Should 302 to *.r2.cloudflarestorage.com, not GitHub
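For a repeatable version of this check (in a smoke test, say), the redirect target can be inspected programmatically. The sketch assumes Node 18+, where fetch with redirect: 'manual' returns the 302 response itself; isR2Redirect and checkRedirect are hypothetical names.

```typescript
// True when the response is a 302 whose Location is on the R2 S3 endpoint.
function isR2Redirect(status: number, location: string | null): boolean {
  if (status !== 302 || location === null) return false;
  try {
    return new URL(location).hostname.endsWith('.r2.cloudflarestorage.com');
  } catch {
    // Relative or malformed Location values are treated as failures.
    return false;
  }
}

// Issues a HEAD request without following the redirect, like `curl -I`.
async function checkRedirect(url: string): Promise<boolean> {
  const res = await fetch(url, { method: 'HEAD', redirect: 'manual' });
  return isR2Redirect(res.status, res.headers.get('location'));
}
```

Usage: await checkRedirect('https://cdn.familypocket.io/update/v2.5.0/apk') should resolve to true after the flip.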

Step 5: 24-Hour Observation

  • update.completed events still flowing
  • No spike in update.failed events
  • R2 dashboard shows download traffic
  • GitHub traffic drops to zero
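What counts as a "spike" is not defined above. One possible definition, offered only as a sketch, compares the post-flip failure rate with a pre-flip baseline. The 2x factor, the 1% floor, and the 20-event minimum are illustrative assumptions, not values from this runbook, and the UpdateCounts shape is hypothetical.

```typescript
// Counts of update.completed and update.failed events over some window.
interface UpdateCounts {
  completed: number;
  failed: number;
}

function failureRate(c: UpdateCounts): number {
  const total = c.completed + c.failed;
  return total === 0 ? 0 : c.failed / total;
}

// Flags a spike when the current failure rate exceeds both a multiple of
// the baseline rate and an absolute floor. All three knobs are assumptions.
function isFailureSpike(
  baseline: UpdateCounts,
  current: UpdateCounts,
  factor = 2,
  floor = 0.01,
  minEvents = 20,
): boolean {
  // Too few events to say anything meaningful.
  if (current.completed + current.failed < minEvents) return false;
  return failureRate(current) > Math.max(failureRate(baseline) * factor, floor);
}
```

The baseline would be the same length of window from before the flip, so that daily usage patterns roughly cancel out.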

Step 6: Decommission (After 30 Days)

  • Remove canary override code
  • Remove R2_CANARY_DEVICES env var
  • Optionally remove getGitHubReleaseAssetUrl() (or keep as fallback)
  • Revoke GITHUB_PAT if GitHub is no longer needed

Step 7: Update CI Pipeline

Add an R2 upload step to .github/workflows/release.yml so every future release lands in R2 automatically:

.github/workflows/release.yml (add step)
yaml
- name: Upload APK to R2
  env:
    AWS_ACCESS_KEY_ID: ${{ secrets.R2_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.R2_SECRET_ACCESS_KEY }}
    R2_ACCOUNT_ID: ${{ secrets.R2_ACCOUNT_ID }}
  run: |
    VERSION=${GITHUB_REF_NAME#v}
    aws s3 cp familypocket-kiosk.apk \
      "s3://familypocket-kiosk-apks/kiosk-$VERSION.apk" \
      --endpoint-url "https://$R2_ACCOUNT_ID.r2.cloudflarestorage.com"

Rollback Procedures

During Migration (Immediate)

Time to recover: <60 seconds.

  1. Vercel dashboard → set KIOSK_STORAGE_BACKEND=github
  2. Trigger redeploy
  3. Devices resume from GitHub. No device-side action needed.

This is the entire reason for the abstraction layer.

After Decommission (~30 Minutes)

  1. Restore GitHub adapter code from git history
  2. Re-add GITHUB_PAT env var
  3. Flip KIOSK_STORAGE_BACKEND=github
  4. Redeploy

Cost Comparison

Assuming 50 MB average APK, one update per device per month:

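| Backend         | 10K Devices | 100K Devices | Notes                                |
|-----------------|-------------|--------------|--------------------------------------|
| GitHub Releases | $0          | $0           | Free but ToS gray area, rate limited |
| Cloudflare R2   | ~$0.01/mo   | ~$0.05/mo    | Zero egress, scales sublinearly      |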
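For a sense of the transfer volume behind these figures, the stated assumptions (50 MB APK, one update per device per month) can be turned into a quick calculation. The sketch covers volume only, not pricing, and uses decimal units (1 GB = 1,000 MB); monthlyTransferGB is a hypothetical helper.

```typescript
// Monthly download volume in GB for a fleet, using 1 GB = 1,000 MB.
function monthlyTransferGB(
  devices: number,
  apkSizeMB: number,
  updatesPerDevicePerMonth = 1,
): number {
  return (devices * apkSizeMB * updatesPerDevicePerMonth) / 1000;
}

console.log(monthlyTransferGB(1_000, 50));   // 50   (GB)
console.log(monthlyTransferGB(10_000, 50));  // 500  (GB)
console.log(monthlyTransferGB(100_000, 50)); // 5000 (GB, i.e. 5 TB)
```

At 1,000 devices this works out to 50 GB per month, which lines up with the two migration triggers (1,000 devices, 50 GB of GitHub bandwidth).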

Disaster Scenarios

R2 Outage

Devices retry on next boot. Existing kiosks continue functioning normally; they just don't receive updates until R2 is back. As an extra safeguard, the Vercel function can fall back to GitHub if R2 fails (keep the adapter code and GITHUB_PAT).
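One detail matters when building such a fallback: generating a presigned URL is a local computation and does not contact R2, so getR2SignedUrl() succeeding says nothing about whether R2 is up. The sketch below therefore assumes a separate health probe (hypothetical, for example a small authenticated read against the bucket); resolveWithFallback and the probe are not part of the existing code.

```typescript
type UrlResolver = (version: string) => Promise<string>;

// Uses the primary backend only when the probe reports it healthy;
// otherwise (or if the primary adapter throws) uses the fallback.
async function resolveWithFallback(
  version: string,
  primary: UrlResolver,
  fallback: UrlResolver,
  isPrimaryHealthy: () => Promise<boolean>,
): Promise<string> {
  let healthy = false;
  try {
    healthy = await isPrimaryHealthy();
  } catch {
    // A failing probe is treated the same as an unhealthy backend.
    healthy = false;
  }
  if (healthy) {
    try {
      return await primary(version);
    } catch {
      // Fall through to the fallback adapter.
    }
  }
  return fallback(version);
}
```

Here primary would be getR2SignedUrl and fallback getGitHubReleaseAssetUrl. Probing on every request adds latency, so a real function would likely cache the probe result for a short time.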

R2 Credentials Leaked

  1. Rotate the R2 API token immediately in Cloudflare
  2. Update Vercel env vars
  3. Redeploy

Signed URLs already issued expire in 5 minutes. A leaked token still gives an attacker no path for pushing a malicious APK: devices verify both the SHA-256 checksum and the Android signing certificate.

Wrong APK Uploaded to R2

Devices download the APK, compute SHA-256, and compare against the expected checksum from the backend. Mismatch = download deleted, update.failed telemetry logged. No malicious APK can be installed.

The checksum comes from the trusted auth service, not from the APK file itself. A compromised storage backend cannot push a malicious APK.

Migration Day Runbook

Pre-Migration (Week Before)

  • Audit checklist complete
  • R2 bucket created, credentials in Vercel
  • R2 adapter code deployed (env var still github)
  • Canary device on R2 for 24+ hours
  • On-call engineer scheduled

Migration Day (T-0 = Env Var Flip)

| Time       | Action                                           |
|------------|--------------------------------------------------|
| T-30 min   | Confirm fleet healthy, no in-progress force update |
| T-15 min   | Backup kiosk_rollout_config table                |
| T-15 min   | Confirm both active versions are in R2           |
| T-0        | Flip KIOSK_STORAGE_BACKEND=r2, redeploy          |
| T+5 min    | curl -I .../update/v{X}/apk confirms 302 to R2   |
| T+15 min   | Telemetry shows update events flowing            |
| T+1 hour   | GitHub traffic dropping                          |
| T+24 hours | No update.failed spike, declare success          |

Post-Migration (Next Week)

  • CI updated to upload to R2 alongside GitHub
  • Decommission scheduled for T+30 days
  • R2 cost visible in Cloudflare dashboard

Future Migrations

The same abstraction supports any future storage backend (Backblaze B2, Bunny CDN, self-hosted MinIO). The pattern:

  1. Implement an adapter alongside getR2SignedUrl()
  2. Add a new value to KIOSK_STORAGE_BACKEND
  3. Copy APKs to the new backend
  4. Canary, observe, flip, observe, decommission

Devices in the field (potentially hundreds of thousands by then) will continue functioning through any future migration without anyone touching them. This is the payoff of doing the abstraction work at the start.
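Steps 1 and 2 of the pattern can be made hard to get wrong with a typed adapter registry. In the sketch below, "b2" and its adapter are purely hypothetical placeholders for a future backend; the point is that adding a value to the list without supplying an adapter becomes a compile error.

```typescript
type UrlResolver = (version: string) => Promise<string>;

// Adding a backend means adding one entry here...
const BACKENDS = ['github', 'r2', 'b2'] as const;
type Backend = (typeof BACKENDS)[number];

// ...and Record<Backend, ...> then forces a matching adapter to exist.
type AdapterRegistry = Record<Backend, UrlResolver>;

// Validates the KIOSK_STORAGE_BACKEND value against the known list.
function parseBackend(raw: string | undefined): Backend {
  const match = BACKENDS.find((b) => b === raw);
  if (match === undefined) {
    throw new Error(`Unknown storage backend: ${String(raw)}`);
  }
  return match;
}

// Looks up the adapter for the configured backend.
function adapterFor(raw: string | undefined, registry: AdapterRegistry): UrlResolver {
  return registry[parseBackend(raw)];
}
```

The registry would map github to getGitHubReleaseAssetUrl, r2 to getR2SignedUrl, and the new value to its own adapter.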