Verinext
Duluth, GA
Job Details
Full-time
Full Job Description
Join Verinext, a technology company that's not just keeping up with the future, but actively shaping it. At Verinext, we firmly believe that work should be as enjoyable as it is rewarding. As a Monitoring & Automation Engineer, you'll be stepping into an environment that thrives on innovation and fun. Our team-oriented culture isn't just a buzzword; it's a cornerstone of our success. We're incredibly proud to have been recognized as a "Best Place to Work" by the Philadelphia Business Journal for 10 consecutive years.
We are seeking a mid-level Monitoring and Automation Engineer to strengthen our visibility into infrastructure platforms spanning servers, virtualization, networking, storage, and disaster recovery systems. The ideal candidate will have hands-on experience with LogicMonitor, BigPanda, and platform-level monitoring across Zerto, Commvault, VMware, networking, servers, and storage.
This is a growth-focused role designed for a technically capable professional ready to build, optimize, and automate monitoring across hybrid infrastructure environments.
Requirements
Key Responsibilities
· Act as the technical lead for monitoring configuration across systems including:
o Servers (Windows/Linux), Virtualization (VMware/Hyper-V), Networking (firewalls, switches, routers), Storage (SAN/NAS), and Backup/DR platforms (Commvault, Zerto).
· Maintain and enhance LogicMonitor alert configurations, thresholds, escalation paths, and dashboards to reduce false positives and improve signal clarity.
· Refine BigPanda correlation logic and enrich alerts from multiple infrastructure sources to support root cause identification and smarter incident response.
· Develop and maintain automation scripts and integrations (Python, PowerShell, Bash) to support event enrichment, ticket enrichment, and workflow automation.
· Interface with Systems, Networking, and Storage teams to ensure complete and relevant monitoring across all critical assets.
· Build and manage dashboards, uptime checks, synthetic monitors, and availability reports that support real-time operational awareness.
· Contribute to incident review cycles with feedback loops to improve monitoring scope and reduce operational overhead.
· Create and maintain clear, structured runbooks and playbooks for alert triage, escalation, and routine issue remediation.
· Support roadmap efforts for self-healing automation, telemetry standards, and event-driven workflows.
Qualifications
Required:
· 3–5 years of experience in an Infrastructure Monitoring, Systems Engineering, NOC, or Operations role.
· Demonstrated hands-on experience with:
o LogicMonitor: platform tuning, custom data sources, escalation routing.
o BigPanda: correlation setup, tag enrichment, and alert flow optimization.
o Monitoring platforms for servers, virtualization, network hardware, storage systems, and backup/DR solutions.
· Practical scripting skills in one or more of the following: Python, PowerShell, Bash.
· Familiarity with REST API integrations, especially for Zerto, Commvault, VMware, and networking equipment.
· Understanding of infrastructure monitoring concepts: availability, performance, alert noise, escalations.
Preferred:
· Exposure to multi-tenant managed services or enterprise-scale environments.
· Experience with event remediation tooling or automation triggers (e.g., StackStorm, Rundeck, Ansible).
· BigPanda or LogicMonitor certification(s) a plus.
· Comfortable collaborating with cross-functional teams (Networking, Systems, NOC, Service Desk).