PVS write cache on the local disks - a real world experience


When implementing a Provisioning Services infrastructure, the decision about where to place the write cache is one of the most important ones and therefore one of the most discussed. As I have already written two blogs on the subject (you can find them here and here), I will not bore you with more theory. Instead, this time I would like to share some practical real-world data from an environment where the PVS write cache of 6 virtual XenApp servers is written to the local disks of a single XenServer. This configuration is discussed quite often because it is a very cost-effective solution (Alex Danilychev wrote a good blog about it), but it also carries the risk of the local disks becoming a bottleneck. So in order to understand whether this setup works for your environment, you must test it and perform an in-depth analysis. See CTX130632 for some advice.

This configuration has been working well for one of my customers for about 3 years now. Initially, the customer chose it for cost reasons, as a configuration with shared storage would have required upgrading the SAN controllers and adding a large number of disks to the existing shared storage infrastructure.

The environment

- All XenApp servers are virtualized using XenServer 5.6 SP2

- HP 2U rack mounted servers

  • 4 x 140GB 15k SAS drives
  • Smart Array P410i RAID controller
  • 512MB battery-backed write cache
  • RAID 5

- Running XenApp 6

- 30 concurrent users per XenApp server during office hours

- More than 1,500 concurrent users in total

- Ratio of 6 XenApp servers per XenServer host

Gathering performance data

We collected performance data using the standard Linux iostat command. To filter out unnecessary data right away, we piped the output through awk. This gave us the following command:

" iostat -XK 15 5760 | awk 'BEGIN {print" Time peripheral rrqm / s wrqm / sr / sw / s RKB / s wkb / s avgrq-sz avgqu-sz await svctm% util "} / c0d0 / {print strftime ("% H:% M:% S "), $ 0} '> iostat.csv "

In detail, this command means:

- iostat -xk 15 5760 - Requests extended iostat statistics (in kilobytes) at an interval of 15 seconds, 5760 times (= 24 hours)

- /c0d0/ - Filters the output for the c0d0 block device

- > iostat.csv - Writes the iostat output to the file iostat.csv
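
If you need the capture to keep running after you close your session on the XenServer console, a small wrapper script is the easiest way to do it. The following is a minimal sketch of my own (the script name is arbitrary and not part of the customer's setup), assuming it is run in the XenServer control domain and that the local Smart Array logical drive still shows up as c0d0:

    #!/bin/sh
    # capture_iostat.sh - collect 24 hours of extended iostat data for the
    # local logical drive (c0d0) and prefix every sample with a timestamp.
    iostat -xk 15 5760 | awk '
        BEGIN { print "Time Device rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util" }
        /c0d0/ { print strftime("%H:%M:%S"), $0 }
    ' > iostat.csv

Started with nohup sh capture_iostat.sh &, it keeps running after the SSH session is closed and leaves iostat.csv behind for analysis.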

After running this command, we were left with a csv file of roughly 1MB that could be imported into Excel. From there it was easy to create the graphs below (please note that the capture ran overnight, so the middle of each graph is midnight):
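
If you prefer to get the headline numbers without Excel, the same csv can be summarized directly on the XenServer console. This is a rough sketch of my own, based on the column layout produced by the command above ($5 = r/s, $6 = w/s, last column = %util); the positions may differ with other sysstat versions, so check the header line first:

    # Average and peak figures straight from iostat.csv (NR > 1 skips the header).
    awk 'NR > 1 {
             reads += $5; writes += $6; n++
             if ($6 > peak_w)     peak_w    = $6
             if ($NF > peak_util) peak_util = $NF
         }
         END {
             if (n == 0) exit
             printf "avg r/s: %.1f   avg w/s: %.1f   read:write = 1:%.1f\n",
                    reads / n, writes / n, (reads > 0 ? writes / reads : 0)
             printf "peak w/s: %.1f   peak %%util: %.1f\n", peak_w, peak_util
         }' iostat.csv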

The performance graphs

The first graph shows the IOPS caused by the PVS write cache activity of the virtual XenApp servers. As we can see, most of the IO activity consists of writes, which is the same as in virtual desktop environments. (Please click on the graphs to enlarge.)

The second graph shows the read/write ratio in even more impressive detail.

The third graph shows the throughput in kbytes/s, i.e. the actual amount of data read or written per second.

Finally, we need to put the data from the graphs into relation. Here is the utilization table of the local disk subsystem:

As we can see, the disk subsystem is far from saturation, even though it has to handle more than 800 I/Os per second during the logon peak. This is only possible because the RAID controller is equipped with a battery-backed write cache, which allows it to acknowledge write operations to XenServer almost immediately; the actual write to disk is performed whenever utilization permits. Although I have no data from a setup without such a cache, I would expect the utilization to be considerably higher.
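
By the way, it is worth verifying that the array accelerator is actually enabled and its battery healthy before relying on this behaviour. A minimal sketch, assuming the HP hpacucli utility is installed in the XenServer control domain and the P410i is the embedded controller in slot 0 (adjust the slot number for your hardware):

    # Check cache and battery status of the Smart Array controller.
    hpacucli ctrl slot=0 show                  # look for the cache and battery status lines
    hpacucli ctrl slot=0 show config detail    # per-logical-drive caching settings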

Maintenance approach

I'm sure some of you are now wondering how the customer maintains the hardware/hypervisor and how they handle faults. The reason behind this question (for those who could not follow) is that in such a configuration the virtual XenApp servers are tied to an individual XenServer, because the virtual disks attached to the virtual machines cannot be moved. Thus XenMotion is not possible, and if the XenServer has to be shut down, all resident VMs need to go down as well. The customer has solved this problem with a very pragmatic approach: they simply bought two more XenServers than required to carry 100% of the load. So in case of maintenance work, they just disable logons on the respective XenApp servers. Once the last user has logged off, they are able to do whatever needs to be done without harming any user. Of course, this is not as flexible as using XenMotion, but given the cost savings mentioned earlier, it was ok for them to have this level of "flexibility".
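
For the XenServer side of such a maintenance window, the drain can be scripted with the standard xe CLI. This is a rough sketch of my own, not the customer's actual procedure, assuming logons on the affected XenApp servers have already been disabled and the last user has logged off; the host UUID is a placeholder you have to fill in:

    # Drain and disable one XenServer host once all users are gone.
    HOST_UUID=<host-uuid>                          # placeholder - fill in your host UUID
    xe host-disable uuid=$HOST_UUID                # keep new VMs from starting here
    for VM in $(xe vm-list resident-on=$HOST_UUID is-control-domain=false --minimal | tr ',' ' '); do
        xe vm-shutdown uuid=$VM                    # clean shutdown of each resident XenApp VM
    done
    # ... perform the hardware/hypervisor maintenance, then bring the host back:
    xe host-enable uuid=$HOST_UUID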

Customer happiness

In addition to showing the purely technical metrics, it is very important to answer whether the customer is happy with this setup. Basically, the answer I got was: "Yes, we are happy with this setup and we would recommend this solution to other customers as well. But of course it depends on the requirements of the individual environment."

So I think this shows that placing the PVS write cache on local disks is a valid configuration for XenApp environments, one that works not only in theory. I would like to prove the same for a XenDesktop environment, but I have not yet been able to get my hands on real-world data that I am also allowed to publish. So if you can help me out here, please leave a comment below or email me directly.
