
When a critical alarm goes unseen at 3 a.m., the shock waves hit safety, production, and compliance all at once. These aren’t abstract risks for oil and gas or heavy industry. They’re the sort of failures that turn into night-long shutdowns and week-long investigations.
For 24/7 operations, adopting WIN-911 isn’t about convenience; it’s about keeping the plant monitored when no one is at the console.
Understanding the Remote Alarm Problem
Older models assumed a staffed control room, which left blind spots during shift changes, maintenance, storms, and evacuations. Modern plants are distributed: upstream pads and midstream stations spread across counties, specialists on the road, and experts on call.
Meanwhile, SCADA can spew hundreds of alarms a day, burying the urgent inside the routine. Miss a high-pressure alarm and you can get a safety event. Miss a pump trip and you can be staring at $50,000 per hour in lost production, plus extended maintenance windows, emergency callouts, and repair costs.
Remote alarm management fixes two things simultaneously:
- It routes the right event to the right person fast.
- It confirms that person actually received it.
This way, the plant can act within defined time windows instead of relying on hope and luck.
WIN-911 in a Nutshell
WIN-911 is purpose-built for industrial notification. It sits next to your HMI/SCADA/DCS stack, connects through standard interfaces, and pushes alarms out over multiple channels: two-way voice (text-to-speech), SMS, email, and a mobile app.
That mix provides redundancy when a site has spotty coverage or noise that drowns out a ring. The system also supports hands-free voice acknowledgment so techs can confirm receipt without dropping tools. Useful when they’re in PPE or mid-task.
On the data side, WIN-911 polls alarms from your control system via industrial connectors such as OPC Data Access (OPC DA), the long-standing standard for moving live tags between devices, SCADA, and HMIs.
In short: an OPC server exposes items; client software (like an HMI or a notifier) reads, writes, and subscribes to change events.
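The server and client roles can be pictured with a toy in-memory model. This is only a sketch of the item and subscription pattern: it is not the OPC DA API and not WIN-911 code, and the names `TagServer`, `subscribe`, and `write` plus the tag name are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy model of the item/subscription pattern; not a real OPC API.
Callback = Callable[[str, float], None]


class TagServer:
    """Exposes named items and notifies subscribers when a value changes."""

    def __init__(self) -> None:
        self._values: Dict[str, float] = {}
        self._subscribers: Dict[str, List[Callback]] = {}

    def read(self, item: str) -> float:
        return self._values[item]

    def subscribe(self, item: str, callback: Callback) -> None:
        self._subscribers.setdefault(item, []).append(callback)

    def write(self, item: str, value: float) -> None:
        changed = self._values.get(item) != value
        self._values[item] = value
        if changed:  # only genuine changes generate events
            for callback in self._subscribers.get(item, []):
                callback(item, value)


server = TagServer()
events: List[Tuple[str, float]] = []
server.subscribe("PT-101.HighPressure", lambda item, value: events.append((item, value)))

server.write("PT-101.HighPressure", 1.0)  # alarm goes active: one event
server.write("PT-101.HighPressure", 1.0)  # no change: no event
server.write("PT-101.HighPressure", 0.0)  # alarm clears: second event
```

A notifier plays the subscriber role here: it reacts to change events rather than re-reading every tag on a timer.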
Because the platform is built for volume, it can filter, prioritize, and throttle floods so urgent events break through while nuisance noise gets parked.
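One way to picture flood throttling is a rule that always passes critical alarms and caps everything else per time window. The sketch below assumes that rule; the `Alarm` fields, the priority labels, and the `throttle` function are illustrative and do not describe WIN-911's internal logic.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Alarm:
    tag: str
    priority: str  # "critical", "production", or "info" (assumed labels)
    timestamp_s: float


def throttle(alarms: List[Alarm], window_s: float, max_noncritical: int) -> Tuple[List[Alarm], List[Alarm]]:
    """Split alarms into (sent, parked).

    Critical alarms are always sent. Other alarms are sent only while fewer
    than `max_noncritical` non-critical alarms have been sent in the
    preceding `window_s` seconds.
    """
    sent: List[Alarm] = []
    parked: List[Alarm] = []
    recent: List[float] = []  # send times of non-critical alarms in the window
    for alarm in sorted(alarms, key=lambda a: a.timestamp_s):
        if alarm.priority == "critical":
            sent.append(alarm)
            continue
        recent = [t for t in recent if alarm.timestamp_s - t < window_s]
        if len(recent) < max_noncritical:
            recent.append(alarm.timestamp_s)
            sent.append(alarm)
        else:
            parked.append(alarm)
    return sent, parked


# Five informational alarms in five seconds, with one critical alarm mixed in.
burst = [Alarm(f"FT-{i}", "info", float(i)) for i in range(5)]
burst.append(Alarm("PT-101", "critical", 2.5))
sent, parked = throttle(burst, window_s=60.0, max_noncritical=2)
```

With a cap of two, the first two informational alarms and the critical alarm go out, and the remaining three are parked for later review.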
Audit trails and reports show who was notified, how they were notified, and when they acknowledged, which is powerful evidence for investigations and continuous improvement.
A Practical Framework for Callout Strategy
Alarm Prioritization and Classes
Start by separating critical safety from production and maintenance/informational alarms.
Fire and gas, ESD trips, and equipment failures with immediate hazard get “all-hands” treatment; throughput and quality events still matter but allow measured response; condition-based alerts inform planning.
This tiering protects focus and prevents fatigue.
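The three tiers can be written down as a simple lookup. This is a sketch only: the category labels and the `classify` helper are hypothetical, and defaulting unknown categories to the safety class is an assumption a site may well decide differently.

```python
from enum import Enum


class AlarmClass(Enum):
    SAFETY = 1        # fire and gas, ESD trips, immediate hazard
    PRODUCTION = 2    # throughput and quality events
    MAINTENANCE = 3   # condition-based and informational alerts


# Hypothetical mapping from an alarm's category label to its class.
CATEGORY_TO_CLASS = {
    "fire_gas": AlarmClass.SAFETY,
    "esd_trip": AlarmClass.SAFETY,
    "throughput": AlarmClass.PRODUCTION,
    "quality": AlarmClass.PRODUCTION,
    "condition": AlarmClass.MAINTENANCE,
}


def classify(category: str) -> AlarmClass:
    """Look up the class; unknown categories fall back to SAFETY (assumed policy)."""
    return CATEGORY_TO_CLASS.get(category, AlarmClass.SAFETY)
```

Treating unmapped categories as safety alarms keeps a new or mislabeled point from being silently ignored, at the cost of some extra noise until it is rationalized.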
People and Escalation
Define the human chain of custody.
- Primary contacts are the operators and technicians with direct responsibility and authority.
- Secondary contacts are supervisors or specialists who can authorize bigger moves.
- Backup contacts cover nights, holidays, and call-ins.
Route by skill and location so the person who can fix it first sees it first.
Response Windows and SLAs
Put numbers on it: how long before a critical alarm must be acknowledged (for example, 5 to 15 minutes), how long before a production alarm must be acknowledged (for example, 15 to 30 minutes), and what happens if nobody responds.
These SLAs anchor escalation timing and let you measure performance over time.
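Once the windows have numbers, checking them is mechanical. The sketch below assumes the upper end of each example range as a placeholder; `ACK_SLA_MIN` and `sla_status` are illustrative names, and times are plain minutes on a shared clock.

```python
from typing import Optional

# Acknowledgment windows in minutes. These use the upper end of the example
# ranges in the text and are placeholders for a site's own numbers.
ACK_SLA_MIN = {"critical": 15, "production": 30}


def sla_status(alarm_class: str, raised_min: float, acked_min: Optional[float], now_min: float) -> str:
    """Return "met", "breached", or "pending" for one alarm."""
    limit = ACK_SLA_MIN[alarm_class]
    if acked_min is not None:
        return "met" if acked_min - raised_min <= limit else "breached"
    return "breached" if now_min - raised_min > limit else "pending"
```

For example, `sla_status("critical", 0, None, 20)` reports a breach because 20 minutes have passed without an acknowledgment, which is the point where escalation would fire.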
WIN-911 Configuration: Getting the Basics Right

Tactics and Sources
Create tactics (WIN-911’s rulesets) that define which alarms to watch, how to format messages, and who gets what.
Connect to alarm sources (SCADA/PLC/DCS) over OPC DA or the platform’s supported interfaces; group people by real responsibility, not org chart alone.
Prefer mobile numbers for first reach; use voice for noisy areas, SMS for weak-signal zones, and email for non-urgent, detail-heavy notices.
Test every contact method up front and on a cadence; numbers drift, inboxes change, and cell towers do, too.
Advanced Escalation
Build multi-stage sequences.
- Stage 1 goes to the equipment owner. No acknowledgment within the SLA triggers Stage 2.
- Stage 2 goes to the supervisor and controls engineer. A further miss escalates to Stage 3.
- Stage 3 goes to the manager and on-call vendor.
Use conditional routing so safety-critical alarms fan out immediately, while planned-maintenance alarms route to reduced lists. Integrate shifts, holidays, and outages so coverage follows the calendar automatically.
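The staged sequence and the immediate fan-out for safety-critical alarms can be sketched as a small function. The contact names, the single per-stage window, and `contacts_to_notify` are assumptions made for illustration, not WIN-911 configuration.

```python
from typing import List

# Hypothetical three-stage chain; contact names are illustrative only.
STAGES = [
    ["equipment_owner"],
    ["supervisor", "controls_engineer"],
    ["manager", "on_call_vendor"],
]


def contacts_to_notify(minutes_unacked: float, stage_sla_min: float, fan_out: bool = False) -> List[str]:
    """Return everyone who should have been notified so far.

    Each stage gets `stage_sla_min` minutes to acknowledge before the next
    stage is added. With `fan_out=True` (for example, a safety-critical
    alarm) all stages are notified immediately.
    """
    if fan_out:
        stages_reached = len(STAGES)
    else:
        stages_reached = min(len(STAGES), int(minutes_unacked // stage_sla_min) + 1)
    return [person for stage in STAGES[:stages_reached] for person in stage]
```

With a 10-minute window, an alarm left unacknowledged for 12 minutes has reached the equipment owner, the supervisor, and the controls engineer, while a fan-out alarm reaches all five contacts at minute zero.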
Redundancy and Failover
Eliminate single points of failure. Use redundant servers (or instances), UPS, secondary WAN paths (cellular or satellite for remote pads), and dual notification channels per person. Document a manual backup (who calls whom) for black-sky events.
For cybersecurity and resilience, align your remote-access and notification architecture with ICS guidance from DHS/CISA (e.g., multifactor auth, segmentation, and least privilege for remote pathways).
Implementation That Sticks
Start with a Pilot: pick a narrow but meaningful slice, like high-risk safety systems or a troublesome compressor station. Keep the pilot team small and engaged (operators + one controls engineer + one supervisor).
Define what “better” means: faster acknowledgments, fewer missed alarms, cleaner handoffs. Prove it, then scale.
System Testing: run end-to-end drills:
SCADA alarm → WIN-911 delivery → human acknowledgment (voice, SMS, app) → SCADA reflects the acknowledgment.
Test at shift change, at 2 a.m., and during planned outages. Load-test for upsets where hundreds of alarms arrive in minutes; you want to see filtering keep the console usable and the callout stack moving.
Training and Adoption: train for both the tool (installing the app, voice ack, handling repeats) and the process (who owns what, when to escalate).
Provide short “first-five-minutes” cards and one-page flowcharts so new hires and contractors can follow the playbook under stress. Keep a FAQ handy for IT (ports, firewall rules, device enrollment).
Staying Reliable Over Time

Health Monitoring
Watch server and gateway health, queue depths, and delivery success rates. Set meta-alerts for notification failures (e.g., a carrier outage or email server timeouts). Review monthly trends and the root causes of any downtime.
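A meta-alert on delivery success can start as a per-channel ratio compared against a threshold. In this sketch the log format, the `failing_channels` helper, and the 0.95 default are all assumptions; a real deployment would read these figures from its own delivery logs.

```python
from typing import Dict, List, Tuple


def failing_channels(attempts: List[Tuple[str, bool]], min_success_rate: float = 0.95) -> Dict[str, float]:
    """Return {channel: success_rate} for channels below the threshold.

    `attempts` is a list of (channel, delivered) pairs. The 0.95 default is
    an arbitrary placeholder, not a recommended value.
    """
    totals: Dict[str, int] = {}
    delivered: Dict[str, int] = {}
    for channel, ok in attempts:
        totals[channel] = totals.get(channel, 0) + 1
        delivered[channel] = delivered.get(channel, 0) + (1 if ok else 0)
    rates = {ch: delivered[ch] / totals[ch] for ch in totals}
    return {ch: rate for ch, rate in rates.items() if rate < min_success_rate}


# Ten SMS attempts with two failures, ten voice attempts with none.
log = [("sms", True)] * 8 + [("sms", False)] * 2 + [("voice", True)] * 10
degraded = failing_channels(log)
```

Here only the SMS channel is flagged, which is the kind of signal that points to a carrier problem rather than a missed alarm.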
People Data
Audit contact data on a schedule—people change roles, numbers, and devices. Build automated checks (e.g., a weekly test ping) to catch stale entries before the 3 a.m. alarm finds a dead phone.
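The scheduled check can be as simple as comparing each contact's last successful test against a cutoff. The sketch assumes the last-verified dates are stored somewhere accessible; `stale_contacts` and the contact names are hypothetical, and the 7-day default mirrors the weekly cadence mentioned above.

```python
from datetime import date, timedelta
from typing import Dict, List


def stale_contacts(last_verified: Dict[str, date], today: date, max_age_days: int = 7) -> List[str]:
    """Return contacts whose last successful test is older than `max_age_days`."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, verified in last_verified.items() if verified < cutoff)


roster = {
    "ops_a": date(2024, 1, 14),
    "tech_b": date(2024, 1, 1),
    "sup_c": date(2024, 1, 8),
}
overdue = stale_contacts(roster, today=date(2024, 1, 15))
```

Only `tech_b` is returned: the last test for that contact is two weeks old, while the other two fall inside the 7-day window.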
Alarm Hygiene
If operators are drowning, fix the upstream problem. EEMUA 191’s widely used benchmarks target manageable rates (e.g., an alarm flood is defined as more than 10 new alarms in 10 minutes per operator) and encourage rationalization so real problems aren’t lost in noise.
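The flood benchmark translates directly into a sliding-window count. This sketch assumes a single operator's alarm stream with timestamps in minutes; `flood_times` is an illustrative name, and the defaults follow the figure quoted above.

```python
from collections import deque
from typing import Deque, Iterable, List


def flood_times(alarm_times_min: Iterable[float], window_min: float = 10.0, threshold: int = 10) -> List[float]:
    """Return the timestamps at which the trailing window holds more than `threshold` alarms."""
    window: Deque[float] = deque()
    flagged: List[float] = []
    for t in sorted(alarm_times_min):
        window.append(t)
        # Drop alarms that have aged out of the trailing window.
        while window and t - window[0] >= window_min:
            window.popleft()
        if len(window) > threshold:
            flagged.append(t)
    return flagged


steady = [i * 2.0 for i in range(20)]         # one alarm every 2 minutes
burst = [100.0 + i * 0.5 for i in range(12)]  # twelve alarms in under 6 minutes
flagged = flood_times(steady + burst)
```

The steady stream never trips the check, while the burst is flagged from its eleventh alarm onward.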
Use your reports to find bad actors, chattering points, and mis-prioritized events; fix setpoints or logic before they become cultural background noise.
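Finding bad actors usually starts with a frequency count over the alarm log. The sketch below assumes the log can be exported as a list of tag names; `bad_actors` and the tags are made up for illustration.

```python
from collections import Counter
from typing import List, Tuple


def bad_actors(alarm_tags: List[str], top_n: int = 3) -> List[Tuple[str, int]]:
    """Return the `top_n` most frequent alarm tags with their counts."""
    return Counter(alarm_tags).most_common(top_n)


log = ["LT-7"] * 40 + ["PT-101"] * 3 + ["FT-2"] * 12
worst = bad_actors(log, top_n=2)
```

A tag that dominates the count, like `LT-7` here, is the first candidate for a setpoint, deadband, or logic review.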
Security and Remote Access
Remote notification lives in the same ecosystem as remote access. Apply standard ICS security practices: network segmentation, strong authentication, least privilege, monitored gateways, and maintenance of the mobile endpoints that receive alarms.
CISA’s practical playbook on remote access for ICS is a useful checklist for policy and design reviews.
Scheduling That Mirrors Reality
Tie WIN-911 schedules to actual rosters, rotations, holidays, and known maintenance windows so coverage is automatic.
If a weekend crew is shorter, widen the Stage-1 distribution list; if a unit is down for PM, route its nuisance alarms to a parking group so they don’t page people unnecessarily.
Keep an “operations override” for storms and turnarounds so supervisors can temporarily broaden notifications with a single switch.
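The scheduling rules in this section can be sketched as one routing function. Everything named here is hypothetical (rosters, the parking group, `stage1_recipients`), and the sketch assumes it is applied only to alarms that are safe to park, never to safety-critical ones.

```python
from datetime import date
from typing import List, Set

# Hypothetical rosters and groups for illustration.
WEEKDAY_STAGE1 = ["operator_day", "tech_day"]
WEEKEND_STAGE1 = ["operator_weekend", "supervisor_on_call"]  # widened for a shorter crew
OVERRIDE_EXTRA = ["area_manager"]                            # added by the operations override
PARKING_GROUP = ["parked_alarm_log"]


def stage1_recipients(day: date, unit: str, units_in_pm: Set[str], override: bool = False) -> List[str]:
    """Pick Stage-1 recipients for a non-safety alarm from `unit` on `day`."""
    if unit in units_in_pm:
        return list(PARKING_GROUP)  # unit is down for PM: park instead of paging
    recipients = list(WEEKEND_STAGE1 if day.weekday() >= 5 else WEEKDAY_STAGE1)
    if override:
        recipients += OVERRIDE_EXTRA
    return recipients
```

Calling it with a Saturday date returns the widened weekend list, and adding a unit to `units_in_pm` sends that unit's alarms to the parking group instead of to people.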
End-to-End Validation (Don’t Skip This)
Before go-live, run table-top drills and live tests for each alarm class. Trigger sample alarms from SCADA and verify:
- The payload is formatted clearly.
- The right people get it on the channels you expect.
- Acknowledgments close the loop in the control system.
- Escalations trigger on time.
Repeat the test during shift change and again at off-hours. Then perform a short flood test to make sure filtering rules prevent pile-ups while priority events still break through.
Reporting and Audits
Schedule monthly reports that show SLA performance, alarm volumes by class, top escalated events, and common failure points (bad numbers, full mailboxes, out-of-coverage cells).
Use the audit trail when you review incidents: who was notified, by what path, and when did they acknowledge? Close the loop with a standing “alarm quality” meeting so maintenance and controls can correct chattering points and setpoint errors instead of normalizing the noise.
Mobile Use in the Field
Coach techs on when to acknowledge immediately (clear, actionable events) and when to hold the ack until they’ve verified local conditions. The goal isn’t to hit the button fast—it’s to confirm the right action is underway.
Where coverage is weak, pair SMS with voice and let techs queue acknowledgments in the mobile app until connectivity returns. If your policy allows BYOD, apply MDM/MAM controls and require screen lock, encryption, and the ability to remote-wipe.
Measuring What Matters
Response analytics make the value visible:
- Average time-to-acknowledge before and after the rollout.
- Percent of alarms cleared within SLA.
- Escalation depth (how often Stage 2 or 3 gets invoked).
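These three measures are straightforward to compute from the audit trail. The sketch assumes each record carries the minutes to acknowledge and the highest stage reached; the sample data and `response_kpis` are invented for illustration.

```python
from statistics import mean
from typing import Dict, List

# Each record: minutes to acknowledge and the highest stage reached (1 to 3).
# Sample values are invented for illustration.
records = [
    {"ack_min": 4.0, "stage": 1},
    {"ack_min": 12.0, "stage": 1},
    {"ack_min": 22.0, "stage": 2},
    {"ack_min": 40.0, "stage": 3},
]


def response_kpis(records: List[Dict[str, float]], sla_min: float) -> Dict[str, float]:
    """Summarize average time-to-acknowledge, SLA compliance, and escalation rate."""
    within = sum(1 for r in records if r["ack_min"] <= sla_min)
    escalated = sum(1 for r in records if r["stage"] >= 2)
    return {
        "avg_ack_min": mean(r["ack_min"] for r in records),
        "pct_within_sla": 100.0 * within / len(records),
        "pct_escalated": 100.0 * escalated / len(records),
    }


kpis = response_kpis(records, sla_min=15.0)
```

Running the same function on data from before and after the rollout gives the comparison the first bullet asks for.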
Volume analytics point to upstream tuning: nuisance alarms, chattering points, and off-hours surges.
User feedback keeps things human: what messages were confusing, which alarms should be grouped, which voice prompts saved time.
ROI comes from avoided incidents, shorter upsets, and cleaner handoffs. Faster acknowledgment shrinks equipment damage and production loss; cleaner callouts reduce overtime and drive fewer emergency call-ins. Often the program pays for itself in months, not years.
Where Standards Fit
Alarm management isn’t a blank sheet. ISA-18.2 defines the alarm-management life cycle: philosophy, identification, rationalization, detailed design, implementation, operation, maintenance, monitoring and assessment, management of change, and audit.
That life cycle gives your rulesets, KPIs, and reviews a shared structure and vocabulary. If you’re aligning corporate policy, start here and tailor for your sites.
For data movement, understand your plumbing. OPC DA is the classic, widely deployed mechanism many plants still rely on for HMI/SCADA connectivity; newer systems often add OPC UA for secure, modern connectivity.
If you know how these clients and servers browse items, subscribe to changes, and handle quality/timestamps, you’ll troubleshoot integrations faster.
Putting It All Together
WIN-911 doesn’t fix culture by itself; it gives your culture a reliable nervous system.
Build a callout plan that mirrors how your people actually work, test it like you test your safety systems, and keep tuning both the alarms and the roster.
Start with a pilot, publish simple SLAs, and review the numbers every month.
When the 3 a.m. event hits, and it will, you’ll have a system that reaches the right person, gets a confirmed acknowledgment, and buys back the minutes that matter most.

Dan Eaves, PE, CSE
Dan has been a registered Professional Engineer (PE) since 2016 and holds a Certified SCADA Engineer (CSE) credential. He joined PLC Construction & Engineering (PLC) in 2015 and has led the development and management of PLC’s Engineering Services Division. With over 15 years of hands-on experience in automation and control systems — including a decade focused on upstream and mid-stream oil & gas operations — Dan brings deep technical expertise and a results-driven mindset to every project.
PLC Construction & Engineering (PLC) is a nationally recognized EPC company and contractor providing comprehensive, end-to-end project solutions. The company’s core services include Project Engineering & Design, SCADA, Automation & Control, Commissioning, Relief Systems and Flare Studies, Field Services, Construction, and Fabrication. PLC’s integrated approach allows clients to move seamlessly from concept to completion with in-house experts managing every phase of the process. By combining engineering precision, field expertise, and construction excellence, PLC delivers efficient, high-quality results that meet the complex demands of modern industrial and energy projects.
