When a content update bricks 4,000 endpoints overnight, when ransomware encrypts the corporate fleet, when a failed mass deployment leaves the workforce locked out — you need surge engineering capacity at the door within 24 hours. That is what we deliver.
The July 2024 CrowdStrike Falcon content update incident was the watershed moment for endpoint recovery as a discipline. A single bad sensor configuration update bricked an estimated 8.5 million Windows endpoints worldwide. Recovery required physical access to each device: a boot into Safe Mode or WinRE and removal of a specific channel file. Multiplied across enterprise fleets, the operational burden dwarfed the technical fix. Airlines grounded flights, hospitals reverted to paper, banks lost ATM availability. The lesson: mass-endpoint recovery capacity is a capability you do not build during the incident.
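For reference, the published fix was tiny once an engineer had hands on the machine; a sketch of the Safe Mode step (under WinRE the same deletion runs from a command prompt, and the system volume may carry a different drive letter):

```powershell
# Published remediation for the July 2024 Falcon incident, run from Safe Mode:
# delete the faulty channel file, then reboot normally.
Get-ChildItem 'C:\Windows\System32\drivers\CrowdStrike' -Filter 'C-00000291*.sys' |
    Remove-Item -Force
```

A one-liner per machine: the cost was never the fix itself, it was putting hands on millions of machines.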
We engage on two scenario classes: the accidental mass outage (a bad EDR content update, a failed patch deployment, a broken golden image) and the ransomware event. Both run through the same five-phase engagement:
Phase 1, remote triage within 4 hours: scope of impact, fleet inventory, available backups, golden-image freshness, EDR/MDM status, BitLocker recovery-key escrow status. Output is the recovery work-package definition.
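A minimal sketch of the escrow-status check, assuming keys are escrowed to Active Directory and the RSAT ActiveDirectory module is installed; the hostname is illustrative:

```powershell
Import-Module ActiveDirectory

# Example hostname; in practice this runs across the whole affected inventory.
$computer = Get-ADComputer -Identity 'WS-0421'

# BitLocker recovery keys escrowed to AD live as child objects of the computer.
Get-ADObject -SearchBase $computer.DistinguishedName `
             -Filter 'objectClass -eq "msFVE-RecoveryInformation"' `
             -Properties 'msFVE-RecoveryPassword', whenCreated |
    Sort-Object whenCreated -Descending |
    Select-Object whenCreated, 'msFVE-RecoveryPassword'
```

If this comes back empty for a meaningful share of the fleet, the recovery plan changes shape before anyone gets on a plane.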
Phase 2, pre-staging: USB recovery media, PXE infrastructure where feasible, automation scripts (PowerShell, batch, Ansible) tailored to your image, BitLocker key-extraction tooling, golden-image refresh.
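As an illustration of the golden-image refresh step, a sketch using the built-in DISM cmdlets; the image path, mount directory and update package are placeholders, not a prescribed layout:

```powershell
# Hypothetical golden-image refresh: mount the WIM, inject the current
# cumulative update, commit the change back into the image.
$wim   = 'D:\images\corp-win11-golden.wim'   # placeholder image store
$mount = 'D:\mount'                          # placeholder mount directory

Mount-WindowsImage -ImagePath $wim -Index 1 -Path $mount
Add-WindowsPackage  -Path $mount -PackagePath 'D:\updates\latest-cu.msu'
Dismount-WindowsImage -Path $mount -Save
```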
Phase 3, on-site execution: recovery teams deployed across affected sites, parallel execution against the pre-staged tooling, throughput targets of 200-400 endpoints per day per team for homogeneous fleets. Daily progress dashboard to executive sponsors.
Phase 4, verification: post-restore checks of OS integrity, EDR re-enrolment, MDM re-enrolment, domain-trust validation, application allow-list state and user-data restoration. Sign-off per endpoint is logged to the recovery dashboard.
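A minimal sketch of what the per-endpoint verification pass can look like, assuming a CrowdStrike sensor (CSFalconService) and BitLocker on the system volume; substitute the estate's actual EDR service name:

```powershell
# Hypothetical per-endpoint verification pass, run elevated after restore.
sfc /verifyonly | Out-Null                      # OS file-integrity scan

$checks = [ordered]@{
    OsIntegrity = ($LASTEXITCODE -eq 0)         # SFC exit 0 = no violations
    DomainTrust = Test-ComputerSecureChannel    # machine-account secure channel
    EdrRunning  = (Get-Service CSFalconService -ErrorAction SilentlyContinue).Status -eq 'Running'
    BitLocker   = (Get-BitLockerVolume -MountPoint 'C:').ProtectionStatus -eq 'On'
}
$checks | ConvertTo-Json                        # logged against the endpoint on the dashboard
```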
Phase 5, post-incident: root-cause analysis, recovery-time analysis, recommendations for resilience uplift (image-management modernisation, EDR rollout-ring strategy, BitLocker key-escrow hardening, retainer right-sizing), and a board-ready summary.
Endpoint recovery after ransomware is fundamentally different from outage recovery, even when the visible technical work looks similar: every endpoint is potential evidence, so artefacts must be preserved before anything is wiped, or the insurance claim, the regulator notification and the criminal investigation all suffer for it.
Recovery throughput is bounded by physical access, fleet homogeneity, BitLocker key availability and parallel-team capacity. Realistic targets for a cooperative environment: 200-400 endpoints per day per team on a homogeneous fleet, materially slower for multi-image, multi-OS estates; for a 1,000-endpoint outage that typically means business-critical functions back within 48-72 hours and full restoration within 5-7 days.
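To make the arithmetic concrete, a back-of-envelope sketch using the mid-point of that throughput band; the fleet size and team count are example figures, not commitments:

```powershell
# Back-of-envelope recovery-window estimate under the stated assumptions:
# homogeneous fleet, pre-staged media, cooperative site access.
$fleet      = 2000   # endpoints down (example figure)
$teams      = 3      # parallel on-site recovery teams (example figure)
$perTeamDay = 300    # mid-point of the 200-400/day/team target

$days = [math]::Ceiling($fleet / ($teams * $perTeamDay))
"Estimated on-site execution window: $days day(s), excluding triage and pre-staging"
```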
Emergency response without a retainer is best-effort against our roster availability. A retainer is the only way to guarantee response-time SLAs and pre-engagement environment knowledge. Retainers run in three tiers, from SME through to enterprise; pricing is outlined below.
When does an outage justify external recovery capacity rather than our own IT team?
When the volume exceeds your team's recovery throughput per day. A typical internal IT team can manually rebuild 20-40 endpoints per day; at that pace a 2,000-endpoint outage is a multi-week business-down event. Our recovery teams ship pre-staged boot media, automation scripts and additional engineers to compress the recovery window from weeks to days. We also handle the forensic chain-of-custody requirements that internal IT typically does not, which matters if there is a ransomware angle and a possible insurance or regulator notification downstream.
Do you handle accidental outages as well as ransomware?
Both, with different workflows. An accidental outage (an EDR content update breaking endpoints, a failed mass patch deployment, a broken golden image) is a pure restoration exercise. A ransomware event is a recovery exercise wrapped in a forensic and possibly insurance/legal exercise: every endpoint touched without preserving artefacts may be undermining the insurance claim, the regulator notification and the criminal investigation. Our ransomware engagements run alongside our <Link href='/services/incident-response' className='text-red-700 underline'>incident response</Link> and <Link href='/services/digital-forensics' className='text-red-700 underline'>digital forensics</Link> teams to keep the chain intact.
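As a flavour of what "preserve artefacts first" means at the endpoint, a minimal sketch that captures basic triage data and hashes it before any rebuild; the evidence path is illustrative, and real engagements use full forensic tooling under our digital forensics team's control:

```powershell
# Illustrative evidence-preservation step before a reimage in a ransomware case:
# capture basic triage data to external media, then hash everything collected
# so the manifest can support chain-of-custody later.
$dest = "E:\evidence\$env:COMPUTERNAME"   # illustrative external target
New-Item -ItemType Directory -Path $dest -Force | Out-Null

Get-Process | Export-Csv "$dest\processes.csv" -NoTypeInformation
Get-WinEvent -LogName Security -MaxEvents 5000 |
    Export-Csv "$dest\security-events.csv" -NoTypeInformation

# Hash the collected files last, so the manifest covers everything above.
$files = Get-ChildItem $dest -File
$files | Get-FileHash -Algorithm SHA256 |
    Export-Csv "$dest\manifest-sha256.csv" -NoTypeInformation
```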
How fast can you restore a fleet?
Depends on the outage profile. For a homogeneous fleet (same image, same EDR, same management plane) we typically achieve 200-400 endpoint restorations per day per recovery team once on-site and pre-staged. Multi-image, multi-OS environments are slower. For a 1,000-endpoint outage with a cooperative environment and pre-existing backups, we typically deliver business-critical functions back within 48-72 hours and full restoration within 5-7 days.
Do we need a retainer to engage you?
Both options exist. Emergency engagements without a retainer are accepted when our roster has capacity, but you will be at the back of the queue if we are already deployed elsewhere. A retainer guarantees response-time SLAs (typically 4-hour remote engagement and 24-hour on-site mobilisation in the Klang Valley), pre-negotiated rates, and a discovery exercise run during retainer onboarding so we already understand your environment when the incident hits.
What does it cost?
Emergency response engagements are typically billed at a daily rate with a five-day minimum, plus equipment and travel at cost. Daily rates per recovery engineer sit in the RM 3,500-6,000 band depending on seniority and on-site requirement. Retainers start from RM 60,000/year for SME tiers (covering 50 hours of proactive work plus the emergency-response SLA) and scale to enterprise contracts in the RM 250,000-500,000/year range. We will scope a fixed-price quote within 24 hours of engagement.
Active incident: call our emergency line. Planning ahead: scope a retainer in a 30-minute discovery call.
Get a Scope