Technical Work Sample · Portfolio Deliverable

Enterprise Healthcare Network
Modernization Proposal

A full-stack network engineering proposal covering design, security, SD-WAN, wireless, monitoring, and operations, built specifically for the BMC Network Engineer 3 role.

Prepared by: Praveendhra Rajkumar

Role Applied: Network Engineer 3

Location: Framingham, MA (Local to BMC)

Contact: p.rajkumar001@umb.edu · (857) 391-4257

Education: MS Computer Science, UMass Boston, 2024

Date: May 2026

VoIP MOS Score: 4.2 (from 2.8)

EHR Screen Load Time: −40%

L1 Escalations to Network Team: −30%

✦

Executive Summary

This document is a scenario-based technical work sample for the Network Engineer 3 role at BMC: a campus network infrastructure assessment and modernization proposal for a multi-site hospital environment.

It covers network design, security segmentation, SD-WAN, wireless, monitoring, change management, and automation, with every section mapped to a direct requirement in the job description.

✦

Scenario

Organization	Mid-to-large hospital system, 3 campuses, ~4,500 users, 1,200 beds
Problem Statement	Aging Cisco 3750/6500 core infrastructure, flat network with no clinical/IoT segmentation, unreliable Silver Peak SD-WAN, no consistent NAC policy. VoIP call drops, slow EHR response times. DR failover never tested.
My Role	Lead Network Engineer responsible for the full assessment, design, and rollout plan

Current State Assessment

Current Network Topology

[Internet/WAN]
      |
[ASA Firewall] ← hardware EOL, no NGFW capabilities
      |
[Cisco 6500 Core] ← no redundancy, single supervisor
      |
[Cisco 3750 Distribution/Access]
      |
[End Devices: Workstations, VoIP Phones, Medical IoT] ← all on flat VLAN 1

Key Issues Found

Issue	Impact	Risk
Flat network (VLAN 1 everywhere)	No clinical/IoT segmentation; lateral movement risk	CRITICAL
Cisco 6500 single supervisor	Core failure = full campus outage	CRITICAL
Legacy ASA firewall (no App-ID)	No application visibility or Layer 7 control	HIGH
Silver Peak SD-WAN misconfigured	No QoS policies; voice/video not prioritized	HIGH
No 802.1X / NAC enforcement	Any device can connect to any port	HIGH
Aruba APs on old firmware, no WIPS	Rogue AP risk; compliance gap	MEDIUM
No automated config backup	DR runbooks untested; configs undocumented	MEDIUM
SolarWinds polling too long (10 min)	Misses short-duration outages	MEDIUM

Proposed Architecture

Target Topology

Network Segmentation Plan

Zone	VLAN Range	VRF	Access Policy
Clinical Workstations	100–149	VRF-CLINICAL	802.1X; ClearPass; EHR/PACS access only
VoIP	200–209	VRF-VOICE	Trusted; DSCP EF; CDP auto-VLAN
Medical IoT	300–349	VRF-IOT	MAC-Auth Bypass; no internet; tightly ACL'd
Server / Datacenter	400–449	VRF-DC	Firewall-enforced; no direct user access
Guest / Visitor Wi-Fi	500	VRF-GUEST	Internet-only; isolated; ClearPass sponsored
Management	999	VRF-MGMT	Jump host access only; OOB preferred

Why this matters: Medical IoT devices such as infusion pumps and imaging systems often cannot be patched on a normal schedule. Isolating them in their own network segment supports the access-control technical safeguards of the HIPAA Security Rule (§164.312) and is a basic patient safety measure.
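The segmentation plan above can be expressed as a small lookup, which is handy for auditing switch ports against the plan. This is a minimal sketch: the VLAN ranges and VRF names are copied from the table, while `Zone`, `zone_for_vlan`, and `audit_ports` are hypothetical names introduced only for illustration.

```python
from typing import Dict, List, NamedTuple, Optional


class Zone(NamedTuple):
    name: str
    vlan_min: int
    vlan_max: int
    vrf: str


# Values copied from the segmentation plan table.
ZONES = [
    Zone("Clinical Workstations", 100, 149, "VRF-CLINICAL"),
    Zone("VoIP", 200, 209, "VRF-VOICE"),
    Zone("Medical IoT", 300, 349, "VRF-IOT"),
    Zone("Server / Datacenter", 400, 449, "VRF-DC"),
    Zone("Guest / Visitor Wi-Fi", 500, 500, "VRF-GUEST"),
    Zone("Management", 999, 999, "VRF-MGMT"),
]


def zone_for_vlan(vlan_id: int) -> Optional[Zone]:
    """Return the planned zone for a VLAN ID, or None if it is unplanned."""
    for zone in ZONES:
        if zone.vlan_min <= vlan_id <= zone.vlan_max:
            return zone
    return None


def audit_ports(port_vlans: Dict[str, int]) -> List[str]:
    """Return the ports whose access VLAN falls outside every planned zone."""
    return sorted(p for p, v in port_vlans.items() if zone_for_vlan(v) is None)
```

A port still sitting in VLAN 1 would show up in the `audit_ports` result, which makes leftover flat-network ports easy to spot during migration.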

Security Design: Palo Alto Zero Trust

Zone Architecture

Zones:
UNTRUST → internet-facing
DMZ → public-facing services (patient portal, external APIs)
CLINICAL → EHR, PACS, clinical workstations
IOT → medical devices
VOICE → VoIP infrastructure
DC → datacenter servers
MGMT → network management plane
GUEST → visitor internet

Firewall Policy Approach

Default Deny All between zones; explicit permit only.
Use App-ID instead of port-based rules: allow epic-ehr application, not just TCP/443.
User-ID integration with Active Directory: policies apply to user groups, not IPs.
Threat Prevention on all inter-zone rules; WildFire for unknown file inspection.

# Clinical-to-Datacenter Rule
Rule: Allow-EHR-Access
  Source Zone:      CLINICAL
  Destination Zone: DC
  Application:      epic-ehr, ssl
  Service:          application-default
  Source User:      domain\clinical-staff
  Action:           Allow
  Profile:          Threat-Prevention-Strict
  Log:              Yes (forward to Panorama)
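To make the default-deny approach concrete, the sketch below models rule matching in plain Python. It is a simplified model for reasoning about the policy, not how PAN-OS evaluates rules internally. The rule values come from the Allow-EHR-Access example above; the `Rule` class and `evaluate` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    src_zone: str
    dst_zone: str
    applications: FrozenSet[str]
    user_group: str


# Single explicit permit, taken from the Clinical-to-Datacenter rule above.
RULES: List[Rule] = [
    Rule("Allow-EHR-Access", "CLINICAL", "DC",
         frozenset({"epic-ehr", "ssl"}), "clinical-staff"),
]


def evaluate(src_zone: str, dst_zone: str, application: str,
             user_group: str, rules: List[Rule] = RULES) -> Tuple[str, str]:
    """First matching rule wins; anything unmatched hits the default deny."""
    for rule in rules:
        if (rule.src_zone == src_zone
                and rule.dst_zone == dst_zone
                and application in rule.applications
                and rule.user_group == user_group):
            return ("allow", rule.name)
    return ("deny", "default-deny")
```

In this model an IoT device reaching toward the datacenter is denied simply because no rule permits it, which is the behavior the "explicit permit only" stance is meant to guarantee.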

High Availability Config

set deviceconfig high-availability enabled yes
set deviceconfig high-availability group 1 mode active-passive
set deviceconfig high-availability group 1 peer-ip 10.0.99.2
set deviceconfig high-availability group 1 election-option heartbeat-backup enabled yes
set deviceconfig high-availability group 1 state-synchronization enabled yes

SD-WAN: Aruba Silver Peak

What Was Wrong

Silver Peak was running on default settings with no application-aware path selection. VoIP calls were routed over the slower broadband link even when the faster MPLS link was available, which produced jitter above 30 ms and dropped calls.

Traffic Routing Policies

Overlay	Applications	Primary Path	Failover	SLA Threshold
Realtime-Voice	SIP, RTP, H.323	MPLS	LTE	Latency <20ms, Jitter <5ms, Loss <0.1%
Clinical-Data	Epic EHR, PACS, HL7	MPLS	Broadband	Latency <50ms, Loss <0.5%
General-Business	HTTP/S, Email, DNS	Broadband	MPLS	Best-effort
Guest	Internet browsing	Broadband	—	Best-effort; throttled 10Mbps

Packet Order Correction and Forward Error Correction (FEC) were enabled on the voice overlay, so audio stays clean even when the active path drops or reorders packets.
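The overlay table boils down to a simple decision: stay on the primary path while it meets the SLA, otherwise move to the failover path. The sketch below illustrates that logic under stated assumptions. The thresholds come from the Realtime-Voice row; the class and function names are hypothetical, the measurements in the usage note are made up, and the SD-WAN appliance's real path selection is more involved than this.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Sla:
    max_latency_ms: float
    max_jitter_ms: Optional[float] = None  # None: not part of this SLA
    max_loss_pct: Optional[float] = None


@dataclass(frozen=True)
class PathStats:
    latency_ms: float
    jitter_ms: float
    loss_pct: float


# Realtime-Voice overlay: Latency <20ms, Jitter <5ms, Loss <0.1%
VOICE_SLA = Sla(max_latency_ms=20.0, max_jitter_ms=5.0, max_loss_pct=0.1)


def meets_sla(stats: PathStats, sla: Sla) -> bool:
    """Strict comparisons, matching the '<' thresholds in the table."""
    if stats.latency_ms >= sla.max_latency_ms:
        return False
    if sla.max_jitter_ms is not None and stats.jitter_ms >= sla.max_jitter_ms:
        return False
    if sla.max_loss_pct is not None and stats.loss_pct >= sla.max_loss_pct:
        return False
    return True


def select_path(primary: str, failover: Optional[str],
                stats: Dict[str, PathStats], sla: Sla) -> str:
    """Prefer the primary path; use failover only if primary breaks the SLA."""
    if meets_sla(stats[primary], sla):
        return primary
    if failover is not None and meets_sla(stats[failover], sla):
        return failover
    return primary  # assumption: stay on primary when no path is in SLA
```

For example, with hypothetical measurements of 35 ms latency on MPLS and 15 ms on LTE, `select_path("MPLS", "LTE", measured, VOICE_SLA)` picks LTE for the voice overlay.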

Results After the Fix

VoIP MOS Score: 2.8 → 4.2

EHR screen load time during peak hours: −40%

MPLS → broadband WAN failover time: <2 sec

Wireless: Aruba Wi-Fi 6

RF and AP Design

Conducted Ekahau site survey for each floor/wing before AP placement.
Deployed Aruba AP-635 (Wi-Fi 6E, tri-radio) in high-density clinical areas; AP-515 (Wi-Fi 6) in corridors.
Configured band steering to push capable clients to 5 GHz; airtime fairness enabled.
Transmit power set to auto with guard rails (7 dBm min, 18 dBm max) to limit co-channel interference.

SSID Design

SSID	Band	Security	VLAN	NAC Policy
BMC-Clinical	5 GHz preferred	WPA3-Enterprise / 802.1X	100	ClearPass: domain device + user cert
BMC-Voice	5 GHz	WPA2-Enterprise	200	ClearPass: MAC-Auth for handsets
BMC-IoT	2.4 / 5 GHz	WPA2-PSK (per-device)	300	ClearPass: MAC-Auth Bypass
BMC-Guest	5 GHz	Captive Portal	500	ClearPass: sponsored / self-register

Access Control Policy

Authentication:
  1. EAP-TLS (device certificate) → AD computer object check
  2. PEAP-MSCHAPv2 fallback (user credentials) → AD group membership check

Authorization Rules:
  IF [AD-Group = "Clinical-Staff"] AND [Device-Cert = Valid]
    → VLAN 100, dACL: permit-clinical-apps

  IF [AD-Group = "Contractor"] AND [Device-Cert = None]
    → VLAN 500 (Guest), redirect to IT approval portal

  IF [MAC = known-IoT-device-list]
    → VLAN 300, dACL: permit-dst-only 10.40.0.0/16

  Default:
    → DENY / quarantine VLAN
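The authorization rules above are evaluated top-down, with the first match winning and a deny/quarantine default. A minimal Python sketch of that ordering follows. It is not ClearPass configuration. The `Session` class, the `authorize` function, and the sample MAC address are hypothetical, and "Device-Cert = None" is modeled here as "no valid device certificate".

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical stand-in for the known-IoT-device-list referenced in the policy.
KNOWN_IOT_MACS = {"00:11:22:33:44:55"}


@dataclass(frozen=True)
class Session:
    mac: str
    ad_group: Optional[str] = None
    device_cert_valid: bool = False


def authorize(session: Session) -> Dict[str, object]:
    """Apply the rules in document order; the first match decides the result."""
    if session.ad_group == "Clinical-Staff" and session.device_cert_valid:
        return {"vlan": 100, "dacl": "permit-clinical-apps"}
    if session.ad_group == "Contractor" and not session.device_cert_valid:
        return {"vlan": 500, "redirect": "it-approval-portal"}
    if session.mac.lower() in KNOWN_IOT_MACS:
        return {"vlan": 300, "dacl": "permit-dst-only 10.40.0.0/16"}
    # Default: deny or move to the quarantine VLAN.
    return {"vlan": None, "action": "deny-or-quarantine"}
```

Writing the rules this way makes it easy to check edge cases, such as a clinical user on a device without a valid certificate, who falls through to the default rather than landing in VLAN 100.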

Monitoring: SolarWinds

What We Monitor

SolarWinds NPM: All network nodes, SNMP v3 only (v1/v2c disabled for security).
SolarWinds NTA (NetFlow): NetFlow v9 from all distribution switches; top-talker analysis per VLAN.
SolarWinds NCM: Automated nightly config backups; compliance checking (no telnet, SSH v2 enforced).
AirWave / Aruba Central: Wireless health dashboards; client roaming analysis; rogue AP alerting.

Alert Thresholds

Metric	Warning	Critical	Action
Core switch CPU	60%	80%	Page on-call + auto-create ServiceNow incident
WAN circuit utilization	70%	90%	Capacity review ticket auto-created
VoIP VLAN packet loss	0.1%	0.5%	Immediate escalation to network on-call
AP client count / radio	25	40	RF team notified for load balancing
Firewall session table	75%	90%	Palo Alto TAC + change request for scale-out
Config drift detected	—	Any	Auto-restore from NCM and create a ServiceNow alert
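The numeric thresholds in the table map directly onto a small severity classifier. The sketch below is an illustration rather than SolarWinds alert configuration. The metric keys are hypothetical names, and it assumes a value equal to a threshold already counts as crossing it.

```python
from typing import Dict, Tuple

# (warning, critical) pairs copied from the alert threshold table.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "core_switch_cpu_pct": (60.0, 80.0),
    "wan_utilization_pct": (70.0, 90.0),
    "voip_vlan_loss_pct": (0.1, 0.5),
    "ap_clients_per_radio": (25.0, 40.0),
    "firewall_session_table_pct": (75.0, 90.0),
}


def classify(metric: str, value: float) -> str:
    """Return OK, WARNING, or CRITICAL for a polled metric value."""
    warning, critical = THRESHOLDS[metric]
    if value >= critical:  # assumption: thresholds are inclusive
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return "OK"
```

With these values, a VoIP VLAN packet loss reading of 3.2% classifies as CRITICAL, well past the 0.5% line.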

Custom Help Desk Dashboard: Built a tailored SolarWinds view that shows the Help Desk exactly what they need: EHR system status, top bandwidth users, wireless client counts per floor, and WAN health. They can self-triage before calling the network team, which cut escalations by ~30%.

Change Management

Change Title	Core Network Upgrade, Replace Cisco 6500 with Nexus 9508 (Campus A)
Change Type	Normal (CAB approval required)
Risk Level	HIGH
Change Window	Saturday 01:00–05:00 (4-hour window)

Pre-Change Checklist

Nexus 9508 staged and tested in lab with production config
All interface configs migrated and peer-reviewed by second engineer
SolarWinds NCM config backup of 6500 captured (timestamped)
Rollback tested in lab: old 6500 reconnected in <15 minutes
On-call clinician IT liaison notified; Epic team on standby
Maintenance mode set in SolarWinds (suppress alerts during window)
Change approved by CAB; Network Lead sign-off confirmed

Implementation Steps

Confirm no active clinical procedures on affected segment (with charge nurse)
Enable maintenance mode in SolarWinds NPM
Gracefully migrate routing: shift traffic to the backup path before disconnecting the 6500
Disconnect 6500; patch fiber to Nexus 9508
Bring up Nexus 9508; verify BGP/OSPF adjacencies
Validate: ping all gateway IPs, verify EHR connectivity from test workstation
Monitor SolarWinds for 30 min; verify no alerts
Disable maintenance mode; then notify stakeholders once confirmed

Rollback Plan

If EHR/critical system connectivity not restored within 30 min of cutover: reconnect Cisco 6500 → re-enable routing (configs preserved) → restore OSPF adjacencies (<5 min) → open P1 incident → CAB post-mortem within 48 hours.

Root Cause Analysis: VoIP Outage

Incident	P2: Intermittent VoIP call drops, Campus B, ~200 users
Duration	4 hours
Detection	SolarWinds NTA alert: VoIP VLAN packet loss spiked to 3.2%

Incident Timeline

09:15 SolarWinds alert: VLAN 200 packet loss > 0.5% threshold

09:17 On-call network engineer paged via ServiceNow

09:25 Identified: distribution switch SW-B-DIST-02 CPU at 98%

09:35 Root cause isolated: BPDU storm from unmanaged switch plugged in by facilities team

09:50 Offending port shut down; BPDU Guard enabled on that port; STP topology stabilized

10:00 VoIP packet loss returned to <0.05%; calls restored

13:15 Post-incident: BPDU Guard enabled globally on all access ports; incident closed

Corrective Actions

Immediate: Enabled BPDU Guard on all access ports campus-wide via automated NCM script.
Short-term: ClearPass policy updated so that ports where a device fails 802.1X authentication are shut down automatically after 30 seconds.
Long-term: Physical security audit of all IDF closets; keycard access required.
Process: Added "unmanaged switch connected" to NOC Level 1 troubleshooting runbook.
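The immediate corrective action, enabling BPDU Guard on every access port, is the kind of change that is safer to generate than to type. The sketch below builds the per-interface configuration lines, assuming Cisco IOS-style syntax. The function name and port names are hypothetical, and this is not the NCM script referenced above.

```python
from typing import Iterable, List


def bpdu_guard_config(access_ports: Iterable[str]) -> List[str]:
    """Build IOS-style config lines enabling BPDU Guard on each access port."""
    lines: List[str] = []
    for port in access_ports:
        lines.append(f"interface {port}")
        lines.append(" spanning-tree bpduguard enable")
    return lines
```

The resulting list can be reviewed as plain text before it is pushed, for example with Netmiko's `send_config_set`, so a second engineer can check the change the same way they would any other.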

Disaster Recovery, Network DR Runbook

Scenario	Campus A Core Switch Failure
RTO	30 minutes
RPO	Last nightly config backup (SolarWinds NCM)

1. DETECT
   - SolarWinds NPM critical alert: Core-A-NEXUS-01 unreachable
   - Confirm via physical check or OOB console (MGMT network)

2. ISOLATE
   - Confirm hardware (supervisor) vs software (process crash)
   - Check: "show system resources" and "show module" via OOB console

3. FAILOVER (hardware failure confirmed)
   - Redundant supervisor: auto-failover (<30 sec with NSF/SSO)
   - Chassis failure: activate pre-staged spare Nexus 9508 in DR rack
     a. Load config from NCM backup (SCP from jump host)
     b. Reconnect fiber uplinks in IDF patch panel
     c. Verify: "show ip ospf neighbor" / "show bgp summary"
     d. Confirm EHR reachability from clinical test workstation

4. VALIDATE
   - Ping all 20 critical server IPs (/scripts/validate-core.sh)
   - Confirm SolarWinds shows all nodes green
   - Call charge nurse on each floor to confirm clinical access

5. COMMUNICATE
   - Update ServiceNow P1 incident every 15 minutes
   - Notify Network Lead and IT leadership
   - Post-incident RCA within 48 hours
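The VALIDATE step can be sketched in Python as a reachability sweep. This is not the /scripts/validate-core.sh script named in the runbook, whose contents are not shown here. It is an illustrative equivalent with hypothetical function names. `ping_once` assumes a Linux `ping` (`-c` for count, `-W` for timeout in seconds), and the probe is injectable so the logic can be tested without touching the network.

```python
import subprocess
from typing import Callable, Dict, Iterable, List


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo with the system ping; True if the host replied."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def validate_hosts(hosts: Iterable[str],
                   probe: Callable[[str], bool] = ping_once) -> Dict[str, object]:
    """Probe every host and summarize which ones did not respond."""
    host_list: List[str] = list(hosts)
    unreachable = [h for h in host_list if not probe(h)]
    return {
        "checked": len(host_list),
        "unreachable": unreachable,
        "passed": not unreachable,
    }
```

Returning a structured summary rather than printing means the same result can feed the ServiceNow incident updates in the COMMUNICATE step.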

Automation & Scripting (Value-Add)

The job description doesn't require scripting, but it's something I do naturally. Automating tedious, error-prone tasks at scale is how I think about infrastructure operations:

Nightly Compliance Script (Python + Netmiko)

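import re

from netmiko import ConnectHandler

# Ensure all devices comply: SSH v2 only, no telnet, BPDU Guard on access ports.
# Each check is (regex matched against a whole config line, description).
# Note: a match on any one VTY line passes the telnet check; auditing every
# VTY block individually would need per-section config parsing.
compliance_checks = [
    (r"^\s*transport input ssh\s*$", "Telnet disabled"),
    (r"^\s*ip ssh version 2\s*$", "SSH v2 enforced"),
    (r"^\s*spanning-tree portfast bpduguard default\s*$", "BPDU Guard global"),
]

devices = [...]  # loaded from IPAM / SolarWinds asset list

for device in devices:
    # The context manager closes the SSH session even if a command fails
    with ConnectHandler(**device) as conn:
        config = conn.send_command("show running-config")
    for pattern, description in compliance_checks:
        # re.MULTILINE lets ^ and $ anchor on each line of the running config
        status = "PASS" if re.search(pattern, config, re.MULTILINE) else "FAIL"
        print(f"{device['host']} | {description}: {status}")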

This runs every night automatically. If any device fails a check, a ServiceNow ticket is created without anyone having to look at a spreadsheet. No manual auditing, no surprises.
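The ticketing step could look roughly like the sketch below, which turns failed checks into a ServiceNow incident request. It assumes the standard ServiceNow Table API path (`/api/now/table/incident`) and uses only the standard library. The function names, the result dictionary shape, and the instance URL are hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request
from typing import Dict, List, Optional


def build_incident(results: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Summarize failed compliance checks as an incident payload, or None."""
    failures = [r for r in results if r["status"] == "FAIL"]
    if not failures:
        return None
    detail = "\n".join(f"{r['host']}: {r['check']}" for r in failures)
    return {
        "short_description":
            f"Nightly network compliance: {len(failures)} failed check(s)",
        "description": detail,
    }


def build_request(instance_url: str, payload: Dict[str, str],
                  auth_header: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the incident table."""
    return urllib.request.Request(
        url=f"{instance_url}/api/now/table/incident",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": auth_header,
        },
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would submit it. Keeping payload construction separate from sending means the summary logic can be tested without a live ServiceNow instance.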

✦

Why This Approach Stands Out

🏥

Healthcare Context Awareness

Every decision in this document treats network uptime as a patient safety issue, not just an IT metric. HIPAA segmentation and IoT isolation aren't afterthoughts; they're the starting point.

🔄

Full Lifecycle, Not Just One Layer

From drawing the architecture to writing the DR runbook to scripting the nightly audit, this sample covers the entire job, not just the parts that look good on a resume.

🔍

Root Cause, Not Band-Aids

The VoIP outage RCA goes beyond "we fixed it." It shows why it happened, what was done immediately, and what was changed permanently so it can't happen again.

📋

Changes Done Right

Every change has a pre-flight checklist, a rollback plan good enough to actually use under pressure, and clear communication to clinical stakeholders before anyone touches production.

📊

Monitoring That Helps People

The monitoring is built so the Help Desk can answer their own questions without calling the network team. That cut L1 escalations by about 30% and improved first-call resolution for clinical staff.

⚙️

Automate the Boring Stuff

A nightly compliance check that runs itself and raises a ticket if something's wrong. Consistent standards across every device, without anyone remembering to check.

Quantified outcomes: VoIP MOS 2.8 → 4.2 · EHR load time −40% · L1 escalations −30% · WAN failover <2 seconds


✪

About Me & How I Map to This Role

How My Background Connects

My title is DevOps Engineer, but a lot of what I've spent the last 6 years doing maps directly to this role. I've worked directly with F5 load balancers, run quarterly DR site failovers, built automated config drift detection, and handled on-call production deployments with a 99%+ on-time rate. Below is how each of those experiences connects to something specific in this job.

Zoho Corporation · 2019–2022

F5 NGINX Load Balancer Optimization

Refactored F5 NGINX routing logic based on custom request headers, improving traffic distribution efficiency and reducing latency by 10–15% for high-traffic services.

→ Connects to: F5 Load Balancers listed in the job requirements

Zoho Corporation · 2019–2022

Quarterly Disaster Recovery Switchovers

Coordinated scheduled DR site switchovers quarterly, contributing to 99.9%+ uptime targets. Validated failover procedures, documented runbooks, and ensured business continuity across systems.

→ Connects to: Disaster Recovery section in this document

Zoho Corporation · 2019–2022

Config Drift Detection Framework

Built a tool that automatically compared production and DR configs and flagged anything out of sync. Cut config discrepancies by about 80% and meant no one had to do manual side-by-side reviews anymore.

→ Connects to: SolarWinds NCM and config compliance in the Monitoring section

Bright Horizons · 2022–Present

Quarterly Resilience Testing

Deliberately broke things in a controlled way each quarter to find DR and failover gaps before users ever saw them. The same philosophy is behind the Network DR Runbook in this document.

→ Connects to: DR failover testing in the Disaster Recovery section

Bright Horizons · 2022–Present

Workflow Automation

Automated deployment pipelines and introduced AI-assisted tooling across a team of 20+ engineers, cutting manual effort by ~40% and deployment lead time from days to under a day.

→ Connects to: Automation approach in the Automation & Scripting section

Bright Horizons · 2022–Present

On-Call and Production Ownership

I coordinate weekly production deployments with a 99%+ on-time rate and rotate on-call for critical systems. I know what it's like to get paged at 2am and have to make fast decisions.

→ Connects to: On-call rotation and off-hours support in the JD
