Resource pool capacity and concurrency model
Treat each remote Mac in the build resource pool as a single IO domain: one fast SSD usually backs workspaces, global caches, and sometimes Docker layers. CPU can look idle while the disk is saturated—so cap concurrency from observed pull latency, not core count.
Baseline model: separate network-bound fetches (Git clone/fetch, registry downloads) from metadata-heavy work (npm install with huge dependency trees, CocoaPods resolver, Homebrew formula updates). Network-bound jobs can overlap modestly; metadata-heavy jobs should stay sparse on the same volume.
| Scenario | Max concurrent Git network ops / host | Max heavy npm / CocoaPods installs | Notes |
|---|---|---|---|
| Single shared runner (one workspace disk) | 2–3 | 1 | Default safe start; raise only if iowait stays low and p95 pull time is flat. |
| Pool host (NVMe, dedicated cache volume) | 4 | 2 | Split /Volumes/cache from job workspaces to reduce fragmentation. |
| Burst / release day | Same as steady | Same or lower | Prefer queueing jobs over raising concurrent installs; spikes correlate with cache corruption and timeouts. |
Steps (greenfield pool): (1) Set orchestrator max parallel jobs per host to the “single shared runner” row. (2) Run a 30-minute soak with representative pipelines. (3) If disk latency p95 and pull failure rate are stable, increase Git concurrency by one step and re-measure.
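The ramp decision in step 3 can be sketched as a small gate. The function and metric names here are illustrative, not a real API; the thresholds (p95 within 20% of baseline, failure rate under 0.5%) are borrowed from the acceptance metrics later in this article.

```shell
#!/bin/sh
# Hypothetical ramp check fed by the 30-minute soak results.
# Raise Git concurrency by one step only if p95 pull latency stayed
# within 20% of baseline AND the pull failure rate stayed under 0.5%
# (expressed here as basis points: 0.5% == 50 bp).
ramp_ok() {
  baseline_p95_ms=$1   # p95 pull time before the change
  soak_p95_ms=$2       # p95 pull time during the soak
  fail_bp=$3           # failure rate in basis points
  [ "$soak_p95_ms" -le $(( baseline_p95_ms * 120 / 100 )) ] && [ "$fail_bp" -lt 50 ]
}

if ramp_ok 9000 9800 12; then
  echo "raise git concurrency by 1 and re-measure"
else
  echo "hold at current concurrency"
fi
```

Integer milliseconds and basis points keep the check in POSIX shell arithmetic, which has no floating point.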
Git / npm pull queues and locks
Concurrency limits alone do not fix write contention: multiple jobs updating the same Homebrew prefix, global npm cache, or CocoaPods cache can serialize internally and surface as random stalls or lock errors.
Recommended patterns:
- Host mutex for mutating package managers: acquire a file lock (e.g. `/var/run/macpull-brew.lock`) around `brew upgrade` or formula taps in shared images; CI jobs that only read bottles can skip the lock.
- Per-job npm cache: set `npm_config_cache=$WORKSPACE/.npm-cache` for writes; promote to a shared read-only cache via a scheduled pre-pull job.
- Git: prefer per-job `GIT_OBJECT_DIRECTORY` isolation only when you understand pack reuse; otherwise rely on shallow clones and shared reference repos on disk.
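A minimal sketch of the host mutex pattern. macOS ships no `flock(1)`, so this uses `mkdir`, which is atomic on local filesystems; the lock path is moved under `TMPDIR` so the sketch runs without root, whereas a real deployment would use a fixed path like the `/var/run/macpull-brew.lock` example above.

```shell
#!/bin/sh
# Host-wide mutex via mkdir (atomic create-or-fail on local disks).
LOCKDIR="${TMPDIR:-/tmp}/macpull-brew.lock"

acquire() {
  tries=0
  until mkdir "$LOCKDIR" 2>/dev/null; do
    tries=$((tries + 1))
    # Give up after a bounded wait so the job fails instead of hanging;
    # tune this against the wait-timeout table below.
    [ "$tries" -ge 10 ] && return 1
    sleep 1
  done
}

release() { rmdir "$LOCKDIR"; }

if acquire; then
  echo "lock held: safe to run mutating brew steps"
  # brew upgrade  # mutating step goes here
  release
else
  echo "lock wait timed out: fail the job" >&2
fi
```

Read-only jobs (installing from bottles already on disk) skip `acquire` entirely, which keeps the queue short.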
| Mechanism | Typical queue depth | Wait timeout (fail job) |
|---|---|---|
| Homebrew global mutex | 1 holder | 15–30 min (weak net) |
| Shared npm publish cache (write) | 1–2 | 20–45 min |
| Git fetch to local mirror | 4–8 | 10–20 min connect + transfer |
Disk and cache partition thresholds
APFS tolerates low free space poorly for large sequential writes (pack files, Docker layers). Use tiered watermarks on the workspace + cache volume, not just the root filesystem.
| Used % (df) | Automation action | Operator action |
|---|---|---|
| ≤ 80% | Normal scheduling | None |
| 80–85% | Alert; reduce concurrent pulls by 1; trigger LRU cache eviction | Review largest dirs (DerivedData, Docker, old workspaces) |
| 85–90% | Pause new clones / large installs; finish in-flight jobs only | Evict or move caches; expand volume or add node |
| > 90% | Hard stop new jobs; drain queue | Emergency cleanup; verify no snapshot exhaustion |
Keep ≥ 15–25 GB absolute free on the primary data volume as a secondary guard (whichever triggers first wins). Align cache layout with our cache strategy for Git and npm on remote Mac CI so eviction policies are predictable.
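The watermark tiers can be wired into a small classifier. Thresholds match the table, and the free-space floor implements the "whichever triggers first wins" rule; the `df -g` field positions in the comment are an assumption about macOS `df` output and should be verified on your hosts.

```shell
#!/bin/sh
# Classify a data volume against the tiered watermarks above.
classify() {
  used_pct=$1   # integer percent used, e.g. 83
  free_gb=$2    # integer gigabytes free
  if [ "$used_pct" -gt 90 ] || [ "$free_gb" -lt 15 ]; then
    echo "hard-stop"          # stop new jobs, drain queue
  elif [ "$used_pct" -gt 85 ]; then
    echo "pause-new-clones"   # finish in-flight jobs only
  elif [ "$used_pct" -gt 80 ]; then
    echo "alert-and-evict"    # reduce pulls by 1, trigger LRU eviction
  else
    echo "normal"
  fi
}

# Example inputs; on macOS you might derive them with something like
#   df -g /Volumes/cache | awk 'NR==2 {print $5+0, $4}'
# (field positions are an assumption -- check your df output).
classify 83 120   # → alert-and-evict
```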
Weak-network timeouts, retries, and resume
Pool-wide instability often comes from retry storms: after a registry blip, every job times out and retries at once. Cap retry aggressiveness and prefer exponential backoff with jitter at the orchestrator or wrapper-script layer.
| Layer | Parameter | Starter value |
|---|---|---|
| Git (HTTP) | `http.lowSpeedLimit` / `http.lowSpeedTime` | 1000 B/s · 60–120 s |
| Git (wrapper) | process kill timeout | 45–90 min for full clone; 15–25 min for fetch |
| npm | `fetch-timeout` / `fetch-retries` | 300000 ms / 5 |
| curl / generic | `--connect-timeout` | 30–60 s |
| Orchestrator retry | backoff cap | 30 s → 60 s → 120 s (+ jitter); max 3–5 attempts |
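The orchestrator row can be sketched as a backoff helper: 30 s doubling to a 120 s cap, plus a small random jitter so jobs desynchronize. `RANDOM` is a bash/zsh feature; in strict POSIX sh it expands empty (jitter 0), so substitute a real RNG there.

```shell
#!/bin/sh
# Jittered exponential backoff: 30 -> 60 -> 120 (capped), + up to 9 s jitter.
backoff_secs() {
  attempt=$1
  base=$(( 30 * (1 << (attempt - 1)) ))
  [ "$base" -gt 120 ] && base=120      # cap at 120 s
  jitter=$(( RANDOM % 10 ))            # bash/zsh; 0 in plain POSIX sh
  echo $(( base + jitter ))
}

for attempt in 1 2 3 4 5; do          # max 3-5 attempts per the table
  echo "attempt $attempt: would sleep $(backoff_secs "$attempt")s before retry"
  # run_pull && break                 # hypothetical pull wrapper; retry on failure
done
```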
Pair timeouts with resumable workflows: partial clone (--filter=blob:none) where allowed, npm cache reuse, and Docker pull with layer cache. For Git/Homebrew/npm-specific knobs, our pull stability FAQ lists copy-paste settings.
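Putting the Git rows together, a weak-network clone might look like the following. The repository URL is a placeholder; the `-c` values come straight from the table above, and `--filter=blob:none` requires a server that supports partial clone.

```shell
# Hypothetical weak-network clone: give up if throughput stays below
# 1000 B/s for 90 s, and skip blob download until checkout needs them.
git clone \
  -c http.lowSpeedLimit=1000 \
  -c http.lowSpeedTime=90 \
  --filter=blob:none \
  https://git.example.com/app.git
```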
Acceptance and monitoring metrics
Promote threshold changes only when metrics hold for a full business week:
- Pull failure rate < 0.5% of jobs (network + disk errors).
- p95 time in “dependency fetch” stage within 20% of baseline after concurrency changes.
- Disk utilization spends < 5% of minutes above 85% used during peak.
- iowait (or macOS disk latency proxy) does not trend upward week over week.
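The first three acceptance metrics are point-in-time comparable and can gate promotion mechanically (the iowait trend still needs a human look at the weekly graph). Function and parameter names are illustrative; inputs are weekly aggregates from your metrics store.

```shell
#!/bin/sh
# Promotion gate: all three numeric acceptance metrics must hold.
gate() {
  fail_bp=$1         # pull failure rate in basis points (0.5% == 50)
  p95_delta_pct=$2   # p95 fetch-stage change vs baseline, percent
  hot_disk_pct=$3    # % of peak minutes spent above 85% disk used
  [ "$fail_bp" -lt 50 ] && [ "$p95_delta_pct" -le 20 ] && [ "$hot_disk_pct" -lt 5 ]
}

if gate 30 12 2; then
  echo "promote threshold change"
else
  echo "revert and re-measure"
fi
```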
Export alerts to the same channel as hypervisor or host restarts so on-call can correlate disk IO spikes with job scheduling changes.
Common questions (FAQ)
Should we use one huge SSD or split volumes? Splitting workspace and cache simplifies eviction and reduces the chance that a single rogue job fills the disk that also holds the OS. On a single volume, enforce stricter watermarks.
Why do jobs fail together after a short outage? Thundering herd on retry. Add jittered backoff and temporarily lower concurrent pull caps until error rates normalize.
Is NFS acceptable for Git workspaces? Only with care—latency kills Git and package managers. Prefer local NVMe for workspaces; use NFS for read-mostly artifacts if needed.
Where can I get human support for capacity planning? See the MacPull help center (no login required to browse); for dedicated nodes and regions, open pricing or purchase.
Summary
A healthy remote Mac build resource pool is governed by disk-first concurrency, explicit queues and locks around shared package stores, tiered disk watermarks, and conservative timeout retry behavior on weak networks. Start with the tables in this article, measure pull and IO metrics for a week, then tune one variable at a time.
When you need predictable Apple Silicon capacity—SSH/VNC, stable egress, and room to isolate caches—MacPull remote Mac plans let you scale the pool without owning hardware. Browse help, compare plans and pricing, or go straight to purchase; you can also continue with pull acceleration and the full blog—all without logging in.
Dedicated remote Mac for your build pool
Mac Mini class nodes with SSH/VNC—tune concurrency and disk layout on hardware you control. View pricing, purchase, or read more guides—no login required.