Technical Work Sample · Portfolio Deliverable

Enterprise Healthcare Network
Modernization Proposal

A full-stack network engineering proposal covering design, security, SD-WAN, wireless, monitoring, and operations, built specifically for the BMC Network Engineer 3 role.

Prepared by: Praveendhra Rajkumar
Role Applied: Network Engineer 3
Location: Framingham, MA (Local to BMC)
Contact: p.rajkumar001@umb.edu · (857) 391-4257
Education: MS Computer Science, UMass Boston, 2024
Date: May 2026

VoIP MOS Score: 4.2 (from 2.8)
EHR Screen Load Time: −40%
L1 Escalations to Network Team: −30%

Executive Summary

This document is a scenario-based technical work sample for the Network Engineer 3 role at BMC: a campus network infrastructure assessment and modernization proposal for a multi-site hospital environment.

It covers network design, security segmentation, SD-WAN, wireless, monitoring, change management, and automation, with every section mapped to a direct requirement in the job description.

Scenario

Organization: Mid-to-large hospital system, 3 campuses, ~4,500 users, 1,200 beds
Problem Statement: Aging Cisco 3750/6500 core infrastructure, flat network with no clinical/IoT segmentation, unreliable Silver Peak SD-WAN, no consistent NAC policy. VoIP call drops, slow EHR response times. DR failover never tested.
My Role: Lead Network Engineer responsible for the full assessment, design, and rollout plan

1. Current State Assessment

Current Network Topology

[Internet/WAN]
      |
[ASA Firewall]  ← hardware EOL, no NGFW capabilities
      |
[Cisco 6500 Core]  ← no redundancy, single supervisor
      |
[Cisco 3750 Distribution/Access]
      |
[End Devices: Workstations, VoIP Phones, Medical IoT]  ← all on flat VLAN 1

Key Issues Found

Issue | Impact | Risk
Flat network (VLAN 1 everywhere) | No clinical/IoT segmentation; lateral movement risk | CRITICAL
Cisco 6500 single supervisor | Core failure = full campus outage | CRITICAL
Legacy ASA firewall (no App-ID) | No application visibility or Layer 7 control | HIGH
Silver Peak SD-WAN misconfigured | No QoS policies; voice/video not prioritized | HIGH
No 802.1X / NAC enforcement | Any device can connect to any port | HIGH
Aruba APs on old firmware, no WIPS | Rogue AP risk; compliance gap | MEDIUM
No automated config backup | DR runbooks untested; configs undocumented | MEDIUM
SolarWinds polling too long (10 min) | Misses short-duration outages | MEDIUM

2. Proposed Architecture

Target Topology

                      [Internet]
                          |
    [Palo Alto PA-5250 HA Pair]   [Panorama Central Management]
                          |
          [Cisco Nexus 9508, Core (VXLAN/EVPN)]
          [Dual Supervisors | vPC | BGP/OSPF]
                 /                      \
    [Nexus 9300, Dist A]        [Nexus 9300, Dist B]
             |                           |
     [Access Switches]           [Access Switches]
     [Cisco 9200/9300]           [Cisco 9200/9300]
                          |
    [Aruba APs (Wi-Fi 6)]   [Aruba ClearPass, NAC]
                          |
    [Silver Peak SD-WAN, Unity Orchestrator]
                          |
    [WAN: MPLS Primary | Broadband Failover | LTE Emergency]

Network Segmentation Plan

Zone | VLAN Range | VRF | Access Policy
Clinical Workstations | 100–149 | VRF-CLINICAL | 802.1X; ClearPass; EHR/PACS access only
VoIP | 200–209 | VRF-VOICE | Trusted; DSCP EF; CDP auto-VLAN
Medical IoT | 300–349 | VRF-IOT | MAC-Auth Bypass; no internet; tightly ACL'd
Server / Datacenter | 400–449 | VRF-DC | Firewall-enforced; no direct user access
Guest / Visitor Wi-Fi | 500 | VRF-GUEST | Internet-only; isolated; ClearPass sponsored
Management | 999 | VRF-MGMT | Jump host access only; OOB preferred
Why this matters: Medical IoT devices like infusion pumps and imaging systems often can't be patched on a normal schedule. Isolating them in their own network segment supports the access-control safeguards of the HIPAA Security Rule (§164.312) and is a basic patient safety measure.
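A minimal sketch of how the segmentation plan could be expressed as data, so that a VLAN found on a switch can be checked against the plan. The VLAN ranges and VRF names come from the table above; the `Zone` type and both function names are hypothetical, chosen only for illustration.

```python
from typing import NamedTuple, Optional


class Zone(NamedTuple):
    name: str
    first_vlan: int
    last_vlan: int
    vrf: str


# Ranges copied from the segmentation plan above.
ZONES = [
    Zone("Clinical Workstations", 100, 149, "VRF-CLINICAL"),
    Zone("VoIP", 200, 209, "VRF-VOICE"),
    Zone("Medical IoT", 300, 349, "VRF-IOT"),
    Zone("Server / Datacenter", 400, 449, "VRF-DC"),
    Zone("Guest / Visitor Wi-Fi", 500, 500, "VRF-GUEST"),
    Zone("Management", 999, 999, "VRF-MGMT"),
]


def zone_for_vlan(vlan_id: int) -> Optional[Zone]:
    """Return the zone that owns vlan_id, or None if it is unassigned."""
    for zone in ZONES:
        if zone.first_vlan <= vlan_id <= zone.last_vlan:
            return zone
    return None


def unassigned_vlans(vlan_ids):
    """List VLANs that fall outside every planned range (e.g. legacy VLAN 1)."""
    return sorted(v for v in vlan_ids if zone_for_vlan(v) is None)
```

For example, `unassigned_vlans([1, 100, 305, 250])` flags VLANs 1 and 250, which is the kind of leftover a migration audit would want to surface.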

3. Security Design: Palo Alto Zero Trust

Zone Architecture

Zones:
  UNTRUST  → internet-facing
  DMZ      → public-facing services (patient portal, external APIs)
  CLINICAL → EHR, PACS, clinical workstations
  IOT      → medical devices
  VOICE    → VoIP infrastructure
  DC       → datacenter servers
  MGMT     → network management plane
  GUEST    → visitor internet

Firewall Policy Approach

  • Default Deny All between zones; explicit permit only.
  • Use App-ID instead of port-based rules: allow epic-ehr application, not just TCP/443.
  • User-ID integration with Active Directory: policies apply to user groups, not IPs.
  • Threat Prevention on all inter-zone rules; WildFire for unknown file inspection.
# Clinical-to-Datacenter Rule
Rule: Allow-EHR-Access
  Source Zone:      CLINICAL
  Destination Zone: DC
  Application:      epic-ehr, ssl
  Service:          application-default
  Source User:      domain\clinical-staff
  Action:           Allow
  Profile:          Threat-Prevention-Strict
  Log:              Yes (forward to Panorama)
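
The default-deny approach can be modeled in a few lines to reason about which flows a rule set permits. This is a sketch of the policy logic only, not the PAN-OS API or its matching engine; the `Rule` type, `RULES` list, and `evaluate` function are hypothetical names, and the single rule mirrors Allow-EHR-Access above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    src_zone: str
    dst_zone: str
    applications: frozenset
    user_groups: frozenset


# Hypothetical in-memory copy of the Allow-EHR-Access rule shown above.
RULES = [
    Rule(
        name="Allow-EHR-Access",
        src_zone="CLINICAL",
        dst_zone="DC",
        applications=frozenset({"epic-ehr", "ssl"}),
        user_groups=frozenset({"clinical-staff"}),
    ),
]


def evaluate(src_zone, dst_zone, application, user_group, rules=RULES):
    """Return (action, rule_name); first matching rule wins, otherwise deny."""
    for rule in rules:
        if (
            rule.src_zone == src_zone
            and rule.dst_zone == dst_zone
            and application in rule.applications
            and user_group in rule.user_groups
        ):
            return ("allow", rule.name)
    # No explicit permit matched: inter-zone traffic is denied by default.
    return ("deny", "default-deny")
```

A clinical workstation reaching the EHR matches the explicit permit, while the same application from the IOT zone falls through to the default deny.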

High Availability Config

set deviceconfig high-availability enabled yes
set deviceconfig high-availability group 1 mode active-passive
set deviceconfig high-availability group 1 peer-ip 10.0.99.2
set deviceconfig high-availability group 1 election-option heartbeat-backup enabled yes
set deviceconfig high-availability group 1 state-synchronization enabled yes

4. SD-WAN: Aruba Silver Peak

What Was Wrong

Silver Peak was running on default settings with no smart routing. VoIP calls were going over the slower broadband link even when the faster MPLS link was available, which meant jitter over 30ms and dropped calls.

Traffic Routing Policies

Overlay | Applications | Primary Path | Failover | SLA Threshold
Realtime-Voice | SIP, RTP, H.323 | MPLS | LTE | Latency <20ms, Jitter <5ms, Loss <0.1%
Clinical-Data | Epic EHR, PACS, HL7 | MPLS | Broadband | Latency <50ms, Loss <0.5%
General-Business | HTTP/S, Email, DNS | Broadband | MPLS | Best-effort
Guest | Internet browsing | Broadband |  | Best-effort; throttled 10Mbps
Packet Order Correction and Forward Error Correction (FEC) were turned on for the voice overlay, so the audio stays clean even when the active path drops or reorders packets.
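
To make the intent of the SLA thresholds concrete, here is a small model of the path decision: stay on the primary path while it meets the overlay's SLA, otherwise use the failover. This is a sketch of the policy in the table, not how the SD-WAN appliance implements it; `PathStats`, `Overlay`, `meets_sla`, and `choose_path` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PathStats:
    latency_ms: float
    jitter_ms: float
    loss_pct: float


@dataclass(frozen=True)
class Overlay:
    name: str
    primary: str
    failover: str
    max_latency_ms: float
    max_loss_pct: float
    max_jitter_ms: Optional[float] = None  # None = no jitter threshold in the table


# Thresholds copied from the routing policy table above.
OVERLAYS = {
    "Realtime-Voice": Overlay("Realtime-Voice", "MPLS", "LTE", 20, 0.1, 5),
    "Clinical-Data": Overlay("Clinical-Data", "MPLS", "Broadband", 50, 0.5),
}


def meets_sla(overlay: Overlay, stats: PathStats) -> bool:
    """True when every threshold defined for the overlay is satisfied."""
    if stats.latency_ms >= overlay.max_latency_ms:
        return False
    if stats.loss_pct >= overlay.max_loss_pct:
        return False
    if overlay.max_jitter_ms is not None and stats.jitter_ms >= overlay.max_jitter_ms:
        return False
    return True


def choose_path(overlay: Overlay, measurements: dict) -> str:
    """Prefer the primary path; fail over only when it is missing or out of SLA."""
    primary_stats = measurements.get(overlay.primary)
    if primary_stats is not None and meets_sla(overlay, primary_stats):
        return overlay.primary
    return overlay.failover
```

With 31 ms of jitter on MPLS, the voice overlay would move to its failover path, while the clinical-data overlay, which has no jitter threshold, would stay on MPLS.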

Results After the Fix

VoIP MOS Score: 2.8 → 4.2
EHR screen load time during peak hours: −40%
MPLS → broadband WAN failover time: <2 sec

5. Wireless: Aruba Wi-Fi 6

RF and AP Design

  • Conducted Ekahau site survey for each floor/wing before AP placement.
  • Deployed Aruba AP-635 (Wi-Fi 6, tri-radio) in high-density clinical areas; AP-515 in corridors.
  • Configured band steering to push capable clients to 5 GHz; airtime fairness enabled.
  • Transmit power set to auto with guard rails (7 dBm min, 18 dBm max) to prevent co-channel interference.

SSID Design

SSID | Band | Security | VLAN | NAC Policy
BMC-Clinical | 5 GHz preferred | WPA3-Enterprise / 802.1X | 100 | ClearPass: domain device + user cert
BMC-Voice | 5 GHz | WPA2-Enterprise | 200 | ClearPass: MAC-Auth for handsets
BMC-IoT | 2.4 / 5 GHz | WPA2-PSK (per-device) | 300 | ClearPass: MAC-Auth Bypass
BMC-Guest | 5 GHz | Captive Portal | 500 | ClearPass: sponsored / self-register

Access Control Policy

Authentication:
  1. EAP-TLS (device certificate) → AD computer object check
  2. PEAP-MSCHAPv2 fallback (user credentials) → AD group membership check

Authorization Rules:
  IF [AD-Group = "Clinical-Staff"] AND [Device-Cert = Valid]
    → VLAN 100, dACL: permit-clinical-apps

  IF [AD-Group = "Contractor"] AND [Device-Cert = None]
    → VLAN 500 (Guest), redirect to IT approval portal

  IF [MAC = known-IoT-device-list]
    → VLAN 300, dACL: permit-dst-only 10.40.0.0/16

  Default:
    → DENY / quarantine VLAN
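
The authorization rules above are evaluated top to bottom with a default deny. A minimal sketch of that first-match logic follows, useful for unit-testing the intended outcomes before building the policy in ClearPass. It is not ClearPass syntax; `Session`, `Decision`, `authorize`, and the sample MAC address are hypothetical, and "Device-Cert = None" is modeled as "no valid certificate".

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical placeholder for the known-IoT-device-list referenced above.
KNOWN_IOT_MACS = {"00:11:22:33:44:55"}


@dataclass(frozen=True)
class Session:
    mac: str
    ad_group: Optional[str] = None
    device_cert_valid: bool = False


@dataclass(frozen=True)
class Decision:
    vlan: Optional[int]
    enforcement: str


def authorize(session: Session) -> Decision:
    """Apply the rules in document order; the first match wins."""
    if session.ad_group == "Clinical-Staff" and session.device_cert_valid:
        return Decision(100, "dACL: permit-clinical-apps")
    if session.ad_group == "Contractor" and not session.device_cert_valid:
        return Decision(500, "redirect to IT approval portal")
    if session.mac.lower() in KNOWN_IOT_MACS:
        return Decision(300, "dACL: permit-dst-only 10.40.0.0/16")
    # Nothing matched: deny or place in the quarantine VLAN.
    return Decision(None, "DENY / quarantine VLAN")
```

Note that a Clinical-Staff user on a device without a valid certificate matches no rule and lands in the default branch, which is the behavior the rule set implies.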

6. Monitoring: SolarWinds

What We Monitor

  • SolarWinds NPM: All network nodes, SNMP v3 only (v1/v2c disabled for security).
  • SolarWinds NTA (NetFlow): NetFlow v9 from all distribution switches; top-talker analysis per VLAN.
  • SolarWinds NCM: Automated nightly config backups; compliance checking (no telnet, SSH v2 enforced).
  • Airwave / Aruba Central: Wireless health dashboards; client roaming analysis; rogue AP alerting.

Alert Thresholds

Metric | Warning | Critical | Action
Core switch CPU | 60% | 80% | Page on-call + auto-create ServiceNow incident
WAN circuit utilization | 70% | 90% | Capacity review ticket auto-created
VoIP VLAN packet loss | 0.1% | 0.5% | Immediate escalation to network on-call
AP client count / radio | 25 | 40 | RF team notified for load balancing
Firewall session table | 75% | 90% | Palo Alto TAC + change request for scale-out
Config drift detected |  | Any | Auto-restore from NCM and create a ServiceNow alert
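
The numeric thresholds can be captured as data so the same values drive alerting and any ad hoc checks. A minimal sketch, assuming thresholds are inclusive (a value equal to the threshold triggers that level); the metric keys and the `classify` function are hypothetical names, and the config-drift row is left out because it has no numeric threshold.

```python
# Warning / critical thresholds copied from the table above.
THRESHOLDS = {
    "core_cpu_pct": (60, 80),
    "wan_utilization_pct": (70, 90),
    "voip_packet_loss_pct": (0.1, 0.5),
    "ap_clients_per_radio": (25, 40),
    "firewall_session_table_pct": (75, 90),
}


def classify(metric: str, value: float) -> str:
    """Return OK, WARNING, or CRITICAL for a polled value."""
    warning, critical = THRESHOLDS[metric]
    if value >= critical:
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return "OK"
```

For instance, VoIP packet loss of 3.2% classifies as CRITICAL under these thresholds.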
Custom Help Desk Dashboard: Built a tailored SolarWinds view that shows the Help Desk exactly what they need: EHR system status, top bandwidth users, wireless client counts per floor, and WAN health. They can self-triage before calling the network team, which cut escalations by ~30%.

7. Change Management

Change Title: Core Network Upgrade: Replace Cisco 6500 with Nexus 9508 (Campus A)
Change Type: Normal (CAB approval required)
Risk Level: HIGH
Change Window: Saturday 01:00–05:00 (4-hour window)

Pre-Change Checklist

  • Nexus 9508 staged and tested in lab with production config
  • All interface configs migrated and peer-reviewed by second engineer
  • SolarWinds NCM config backup of 6500 captured (timestamped)
  • Rollback tested in lab: old 6500 reconnected in <15 minutes
  • On-call clinician IT liaison notified; Epic team on standby
  • Maintenance mode set in SolarWinds (suppress alerts during window)
  • Change approved by CAB; Network Lead sign-off confirmed

Implementation Steps

  1. Confirm no active clinical procedures on affected segment (with charge nurse)
  2. Enable maintenance mode in SolarWinds NPM
  3. Gracefully migrate routing: redistribute routes to backup path
  4. Disconnect 6500; patch fiber to Nexus 9508
  5. Bring up Nexus 9508; verify BGP/OSPF adjacencies
  6. Validate: ping all gateway IPs, verify EHR connectivity from test workstation
  7. Monitor SolarWinds for 30 min; verify no alerts
  8. Disable maintenance mode; then notify stakeholders once confirmed

Rollback Plan

If EHR or other critical-system connectivity is not restored within 30 minutes of cutover: reconnect the Cisco 6500 → re-enable routing (configs preserved) → restore OSPF adjacencies (<5 min) → open a P1 incident → hold a CAB post-mortem within 48 hours.
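
The rollback trigger is a simple rule, and writing it down removes any debate during the change window. A minimal sketch, assuming the 30-minute clock starts at cutover and that EHR reachability is checked by some external test; `should_roll_back` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

ROLLBACK_DEADLINE = timedelta(minutes=30)  # from the rollback plan above


def should_roll_back(cutover_at: datetime, now: datetime, ehr_reachable: bool) -> bool:
    """True once the deadline has passed and EHR is still unreachable."""
    return (not ehr_reachable) and (now - cutover_at) >= ROLLBACK_DEADLINE
```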

8. Root Cause Analysis: VoIP Outage

Incident: P2, intermittent VoIP call drops, Campus B, ~200 users
Duration: 4 hours from first alert to incident close (calls restored after 45 minutes)
Detection: SolarWinds NTA alert: VoIP VLAN packet loss spiked to 3.2%

Incident Timeline

09:15 SolarWinds alert: VLAN 200 packet loss > 0.5% threshold
09:17 On-call network engineer paged via ServiceNow
09:25 Identified: distribution switch SW-B-DIST-02 CPU at 98%
09:35 Root cause isolated: BPDU storm from unmanaged switch plugged in by facilities team
09:50 Port shut down and BPDU Guard enabled on it; STP topology stabilized
10:00 VoIP packet loss returned to <0.05%; calls restored
13:15 Post-incident: BPDU Guard enabled globally on all access ports; incident closed
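
The headline intervals for the RCA can be derived directly from the timeline rather than estimated. A minimal sketch using the timestamps above, assuming all events fall on the same day; the dictionary keys and `minutes_between` are hypothetical names.

```python
from datetime import datetime

# Timestamps copied from the incident timeline above (same day).
TIMELINE = {
    "alert": "09:15",
    "paged": "09:17",
    "root_cause": "09:35",
    "restored": "10:00",
    "closed": "13:15",
}


def minutes_between(start: str, end: str) -> int:
    """Whole minutes between two HH:MM timestamps on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


time_to_page = minutes_between(TIMELINE["alert"], TIMELINE["paged"])        # 2
time_to_restore = minutes_between(TIMELINE["alert"], TIMELINE["restored"])  # 45
time_to_close = minutes_between(TIMELINE["alert"], TIMELINE["closed"])      # 240
```

Service was restored 45 minutes after the first alert; the 4-hour figure covers the period up to incident close.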

Corrective Actions

  • Immediate: Enabled BPDU Guard on all access ports campus-wide via automated NCM script.
  • Short-term: ClearPass policy updated: ports whose device fails 802.1X authentication are shut down after 30 seconds.
  • Long-term: Physical security audit of all IDF closets; keycard access required.
  • Process: Added "unmanaged switch connected" to NOC Level 1 troubleshooting runbook.

9. Disaster Recovery: Network DR Runbook

Scenario: Campus A Core Switch Failure
RTO: 30 minutes
RPO: Last nightly config backup (SolarWinds NCM)

1. DETECT
   - SolarWinds NPM critical alert: Core-A-NEXUS-01 unreachable
   - Confirm via physical check or OOB console (MGMT network)

2. ISOLATE
   - Confirm hardware (supervisor) vs software (process crash)
   - check: "show system resources" and "show module" via OOB console

3. FAILOVER (hardware failure confirmed)
   - Redundant supervisor: auto-failover (<30 sec with NSF/SSO)
   - Chassis failure: activate pre-staged spare Nexus 9508 in DR rack
     a. Load config from NCM backup (SCP from jump host)
     b. Reconnect fiber uplinks in IDF patch panel
     c. Verify: "show ip ospf neighbor" / "show bgp summary"
     d. Confirm EHR reachability from clinical test workstation

4. VALIDATE
   - Ping all 20 critical server IPs (/scripts/validate-core.sh)
   - Confirm SolarWinds shows all nodes green
   - Call charge nurse on each floor to confirm clinical access

5. COMMUNICATE
   - Update ServiceNow P1 incident every 15 minutes
   - Notify Network Lead and IT leadership
   - Post-incident RCA within 48 hours
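
The VALIDATE step can be scripted so the result is a list of failures rather than a screen of ping output. The contents of /scripts/validate-core.sh are not shown here, so this is an independent sketch; it swaps ICMP ping for a TCP connect test (which needs no raw-socket privileges), and `tcp_probe` and `unreachable_hosts` are hypothetical names. The probe is injectable so the logic can be tested without touching the network.

```python
import socket
from typing import Callable, Iterable, List


def tcp_probe(host: str, port: int = 443, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable_hosts(
    hosts: Iterable[str], probe: Callable[[str], bool] = tcp_probe
) -> List[str]:
    """Return the hosts that failed the probe, preserving input order."""
    return [host for host in hosts if not probe(host)]
```

An empty result means every critical server answered; anything else goes straight into the P1 incident update.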

10. Automation & Scripting (Value-Add)

The job description doesn't require scripting, but it's something I do naturally. Automating tedious, error-prone tasks at scale is how I think about infrastructure operations:

Nightly Compliance Script (Python + Netmiko)

import re

from netmiko import ConnectHandler

# Ensure all devices comply: SSH v2 only, no telnet, BPDU Guard on access ports.
# Each check is a regex anchored to a whole config line, so that
# "transport input ssh telnet" does not pass as "transport input ssh".
compliance_checks = [
    (r"^\s*transport input ssh\s*$", "Telnet disabled"),
    (r"^\s*ip ssh version 2\s*$", "SSH v2 enforced"),
    # Some IOS releases render this command with the "edge" keyword.
    (r"^\s*spanning-tree portfast (edge )?bpduguard default\s*$", "BPDU Guard global"),
]

devices = [...]  # loaded from IPAM / SolarWinds asset list

for device in devices:
    # The context manager disconnects even if a command raises.
    with ConnectHandler(**device) as conn:
        config = conn.send_command("show running-config")
    for pattern, description in compliance_checks:
        status = "PASS" if re.search(pattern, config, re.MULTILINE) else "FAIL"
        print(f"{device['host']} | {description}: {status}")
This runs every night as a scheduled job. The snippet prints PASS/FAIL per device; in the full workflow, any FAIL opens a ServiceNow ticket (integration not shown), so no one has to audit a spreadsheet by hand. No manual auditing, no surprises.
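
The check logic can also be exercised offline against saved config text, which makes it easy to test without connecting to a device. A minimal sketch, assuming stricter whole-line matching is wanted; `CHECKS`, `failed_checks`, and `SAMPLE_CONFIG` are hypothetical names and the sample config is made up for illustration.

```python
import re

# Same three checks as the nightly script, anchored to whole config lines.
CHECKS = [
    (r"^\s*transport input ssh\s*$", "Telnet disabled"),
    (r"^\s*ip ssh version 2\s*$", "SSH v2 enforced"),
    (r"^\s*spanning-tree portfast (edge )?bpduguard default\s*$", "BPDU Guard global"),
]


def failed_checks(config: str) -> list:
    """Return descriptions of every check the config text does not satisfy."""
    return [
        description
        for pattern, description in CHECKS
        if not re.search(pattern, config, re.MULTILINE)
    ]


# Made-up sample: telnet is still allowed on the vty lines.
SAMPLE_CONFIG = """\
ip ssh version 2
spanning-tree portfast bpduguard default
line vty 0 4
 transport input ssh telnet
"""
```

Here `failed_checks(SAMPLE_CONFIG)` reports only the telnet check, because a plain substring match on "transport input ssh" would have passed this line by mistake.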

Why This Approach Stands Out

🏥

Healthcare Context Awareness

Every decision in this document treats network uptime as a patient safety issue, not just an IT metric. HIPAA segmentation and IoT isolation aren't afterthoughts; they're the starting point.

🔄

Full Lifecycle, Not Just One Layer

From drawing the architecture to writing the DR runbook to scripting the nightly audit, this sample covers the entire job, not just the parts that look good on a resume.

🔍

Root Cause, Not Band-Aids

The VoIP outage RCA goes beyond "we fixed it." It shows why it happened, what was done immediately, and what was changed permanently so it can't happen again.

📋

Changes Done Right

Every change has a pre-flight checklist, a rollback plan good enough to actually use under pressure, and clear communication to clinical stakeholders before anyone touches production.

📊

Monitoring That Helps People

Monitoring built so the Help Desk can answer their own questions without calling the network team. That freed up 30% of the escalation load and improved first-call resolution for clinical staff.

⚙️

Automate the Boring Stuff

A nightly compliance check that runs itself and raises a ticket if something's wrong. Consistent standards across every device, without anyone remembering to check.

Quantified outcomes: VoIP MOS 2.8 → 4.2  ·  EHR load time −40%  ·  L1 escalations −30%  ·  WAN failover <2 seconds

About Me & How I Map to This Role

Praveendhra Rajkumar (he/him)

DevOps & Cloud Engineer, moving into Enterprise Network Engineering
📍 Framingham, MA (local to Boston Medical Center) 📧 p.rajkumar001@umb.edu 📞 (857) 391-4257
F5 Load Balancers · Disaster Recovery · Config Drift Detection · Python Automation · Infrastructure Monitoring · AWS & Azure · Docker / Kubernetes · Terraform · Bash Scripting · NGINX · GitHub Actions · MS CS, UMass Boston

How My Background Connects

My title is DevOps Engineer, but a lot of what I've spent the last 6 years doing maps directly to this role. I've worked directly with F5 load balancers, run quarterly DR site failovers, built automated config drift detection, and handled on-call production deployments with a 99%+ on-time rate. Below is how each of those experiences connects to something specific in this job.

Zoho Corporation · 2019–2022

F5 NGINX Load Balancer Optimization

Refactored F5 NGINX routing logic based on custom request headers, improving traffic distribution efficiency and reducing latency by 10–15% for high-traffic services.

→ Connects to: F5 Load Balancers listed in the job requirements

Zoho Corporation · 2019–2022

Quarterly Disaster Recovery Switchovers

Coordinated scheduled DR site switchovers quarterly, contributing to 99.9%+ uptime targets. Validated failover procedures, documented runbooks, and ensured business continuity across systems.

→ Connects to: Disaster Recovery section in this document

Zoho Corporation · 2019–2022

Config Drift Detection Framework

Built a tool that automatically compared production and DR configs and flagged anything out of sync. Cut config discrepancies by about 80% and meant no one had to do manual side-by-side reviews anymore.

→ Connects to: SolarWinds NCM and config compliance in Section 6
Bright Horizons · 2022–Present

Quarterly Resilience Testing

Deliberately broke things in a controlled way each quarter to find DR and failover gaps before users ever saw them. The same philosophy drives the DR runbook in Section 9.

→ Connects to: DR failover testing in Section 9
Bright Horizons · 2022–Present

Workflow Automation

Automated deployment pipelines and introduced AI-assisted tooling across a team of 20+ engineers, cutting manual effort by ~40% and deployment lead time from days to under a day.

→ Connects to: Automation approach in Section 10
Bright Horizons · 2022–Present

On-Call and Production Ownership

I coordinate weekly production deployments with a 99%+ on-time rate and rotate on-call for critical systems. I know what it's like to get paged at 2am and have to make fast decisions.

→ Connects to: On-call rotation and off-hours support in the JD