Updating the Testbed – Workstation Infrastructure
In early September, we published a piece detailing our SMB / SOHO NAS (Small & Medium Business / Small Office Home Office Network Attached Storage) testbed. Subsequently, we received extensive feedback from readers regarding the testbed as well as the proposed testing methodology. The universal consensus was that the VM (virtual machine) density could be increased (given the dual Xeon processors and the minimal load on the CPU from the workload). Readers also helpfully pointed out traces that could potentially replicate NAS usage in typical home scenarios.
We set out to implement the suggestions within the constraints of the existing workstation. As a reminder, our NAS testbed is based on the Asus Z9PED8-WS motherboard with dual Xeon E5-2630 CPUs. The CPUs have 6 cores each, and the motherboard has 14 SATA ports. Two of the SATA ports are connected to 128 GB SSDs holding the host OS (Windows Server 2008 R2 Enterprise) and Hyper-V snapshot / miscellaneous data for the VMs. The other 12 SATA ports are connected to 12 x 64 GB OCZ Vertex 4 SSDs. The motherboard has 7 PCIe x16 slots, three of which were occupied by Intel ESA I-340 T4 quad-port GbE NICs. On this workstation, we were able to run 12 Windows 7 VMs, with each VM getting a dedicated physical disk and a dedicated physical GbE port.
We had carefully planned the infrastructure necessary for the testbed to ensure that the system wouldn’t suffer bottlenecks while generating traffic for the NAS under test. However, we hadn’t explicitly planned for increasing the VM density beyond 1 VM per CPU core / 1 physical disk per VM / 1 physical GbE port per VM. After profiling the Iometer workload, we discovered that there would be no issues with increasing the VM density.
Balancing the Network Requirements:
Given the purpose of the testbed, we could not give up the requirement that a physical GbE port be tied to each VM. If the VM density were to increase, it would become imperative to add more GbE ports to the testbed. Fortunately, Intel had supplied us with twelve ESA I-340 T4 units right from the initial stages.
After finishing up our initial build, we had 4 PCIe x16 slots left unfilled. One of the PCIe slots had to be used for storage expansion (more on this in the next sub-section), which left us with three slots to be filled with the ESA I-340 units. This increased the number of physical GbE ports in the machine from 14 (3 x ESA I-340 + 2 x Native Z9PED8-WS) to 26 (6 x ESA I-340 + 2 x Native Z9PED8-WS). One of the native GbE ports was used by the host OS and the other was left unconnected in our initial build. In the update process, we decided to take advantage of the spare native GbE port and use it for one of the new VMs. Thus, with the new configuration, we could go up to 25 VMs.
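The port arithmetic above can be checked with a short sketch. The port counts come from the text; the function and constant names are illustrative only and not part of any tooling we used.

```python
# Port budget for the testbed, using the counts given in the article.
# All names here are hypothetical and chosen for illustration.
PORTS_PER_QUAD_NIC = 4   # each ESA I-340 T4 is a quad-port GbE NIC
NATIVE_PORTS = 2         # GbE ports on the Z9PED8-WS motherboard


def total_gbe_ports(quad_nics: int) -> int:
    """Physical GbE ports in the machine for a given number of quad-port NICs."""
    return quad_nics * PORTS_PER_QUAD_NIC + NATIVE_PORTS


def vm_ports(quad_nics: int, reserved_for_host: int = 1, left_unconnected: int = 0) -> int:
    """Ports left over for VMs, one dedicated physical port per VM."""
    return total_gbe_ports(quad_nics) - reserved_for_host - left_unconnected


if __name__ == "__main__":
    # Initial build: 3 NICs, one native port for the host, one left unconnected.
    print("initial:", total_gbe_ports(3), "ports,", vm_ports(3, left_unconnected=1), "VMs")
    # Updated build: 6 NICs, the spare native port now serves a VM.
    print("updated:", total_gbe_ports(6), "ports,", vm_ports(6), "VMs")
```

With one port per VM, the number of ports not claimed by the host is the ceiling on VM count, which is why the three extra NICs plus the spare native port take the limit from 12 to 25.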
The Storage Challenge:
Our primary challenge lay in the fact that we could no longer afford to dedicate a physical disk to each VM. To compound matters further, all the SATA ports were occupied. In hindsight, dedicating an entire 64 GB SSD to each VM was not a very good decision. While it was possible to store two VHD files corresponding to the boot disks of two VMs on each of the SSDs, we didn’t go that route for two reasons. First, the occupied disk space for each VM was quite close to half the SSD capacity when the OS and the associated benchmarking programs were added up. Second, we didn’t want to go through the whole process of setting up the existing VMs all over again. It would have been better to connect 128 GB or 256 GB SSDs to each SATA port and not dedicate a physical disk to each VM in the first place. For the update, we needed more storage, but there were no SATA ports available. The solution was to use some sort of PCIe-based HBA / storage device.
In order to achieve a fast turnaround, we decided to image one of the twelve 64 GB SSDs into a VHD file and use copies of it for the new VMs. This meant that each VHD file was close to 64 GB in size. While the expansion of the networking capabilities made the testbed capable of hosting up to 25 VMs, we had to add at least 13 x 64 GB (832 GB) of storage through the PCIe slot to make it possible. One solution would have been to add a SATA-to-PCIe bridge adapter and add a SATA drive to the system. However, all the accessible SATA slots in the chassis had already been taken up, so all the new storage had to fit within the space over the PCIe slot. We could have gone in for a high capacity PCIe SSD, but cost considerations meant that we had to drop the idea. A solution presented itself in the form of the OCZ RevoDrive Hybrid.
The OCZ RevoDrive Hybrid consists of a 1 TB 2.5″ HDD and 100 GB of NAND in a two-layer single slot PCIe 2.0 x4 card. In its intended usage scenario, the RevoDrive Hybrid is meant to be used as a boot drive, with the driver and DataPlex software making sure that the hard disk and the NAND flash present themselves to the OS as a single drive. In our usage scenario, the unit was meant to act as a secondary storage device in the host OS. None of the documentation from OCZ or the various reviews covered use of the unit as a secondary storage device; if anything, they discouraged its use in that manner.
We took the risk and installed the RevoDrive Hybrid in the sole remaining PCIe slot. A driver was available only for Windows 7, but it installed without issue on Windows Server 2008 R2. Under Disk Management, the unit appeared as two drives, a 93 GB SSD and a 931 GB HDD. We made 13 VHD copies of one of the 64 GB physical disks on the HDD and set up 13 new VMs with configurations similar to the existing VMs. A physical GbE port was dedicated to each VM and another network port was created to connect to the internal network of the workstation (with the host OS acting as the DHCP server). The only difference compared to the existing VMs was that a VHD file had to be attached as the internal disk of the VM instead of a 64 GB physical disk.
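As a rough sanity check on the storage side, the sketch below totals the space needed for the 13 VHD copies and lays out which VMs sit on dedicated SSDs and which on VHD files. The figures come from the text; the names are hypothetical. It treats 64 GB per VHD as an upper bound and compares it directly against the 931 GB reported by Disk Management, which ignores the decimal versus binary unit difference (a simplification that errs on the conservative side).

```python
from dataclasses import dataclass

# Figures from the article; all identifiers below are illustrative assumptions.
SSD_VMS = 12            # existing VMs, each on a dedicated 64 GB SSD
VHD_VMS = 13            # new VMs, each booting from a VHD copy on the PCIe HDD
VHD_SIZE_GB = 64        # upper bound: each VHD was "close to 64 GB"
HDD_REPORTED_GB = 931   # HDD capacity as shown under Disk Management


@dataclass(frozen=True)
class VmDisk:
    vm: int        # 1-based VM number
    backing: str   # "physical-ssd" or "vhd-on-hdd"


def vhd_space_needed_gb(count: int = VHD_VMS, size_gb: int = VHD_SIZE_GB) -> int:
    """Worst-case space required for all VHD copies."""
    return count * size_gb


def plan_layout() -> list:
    """VMs 1-12 keep their physical SSDs; VMs 13-25 use VHD files on the HDD."""
    ssd = [VmDisk(n, "physical-ssd") for n in range(1, SSD_VMS + 1)]
    vhd = [VmDisk(n, "vhd-on-hdd") for n in range(SSD_VMS + 1, SSD_VMS + VHD_VMS + 1)]
    return ssd + vhd


if __name__ == "__main__":
    needed = vhd_space_needed_gb()
    print(f"VHD space needed: {needed} GB, fits on HDD: {needed <= HDD_REPORTED_GB}")
    for disk in plan_layout():
        print(f"VM {disk.vm:2d}: {disk.backing}")
```

The headroom is modest, roughly 100 GB at most, so the HDD holds the 13 boot images with little room for anything else.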
The update to the internals of the workstation went without a hitch and we were able to get 25 VMs running without any major issues. As expected, booting up and shutting down VMs 13 through 25 (VHD files on the PCIe HDD) simultaneously took more time than we would have liked. However, once the VMs had booted up, access over SSH and running programs that didn’t stress the disk subsystem of the VM were smooth. Our benchmarking script (running the Dynamo traffic generator on each VM with the Iometer instance in the host OS) was also trivial to update and executed without any hiccups.
Moving forward, we will be stressing NAS systems by accessing them simultaneously from up to 25 VMs, which is representative of an average-sized SMB setup. For any benchmarks which might stress the internal disk subsystem, we will continue with the 12 VMs on the dedicated SSDs.