INTRODUCTION
As a DELL System/Solution Consultant, I designed the hardware infrastructure for a Czech commercial ISP that wants to provide IPTV and VoD. The ISP chose a software IPTV/VoD/DRM solution based on Linux. Together with the software provider, we chose several PE 2970 servers, which are AMD (x86_64) based. The streamer server needs a cost-effective but sufficiently fast disk subsystem. Unfortunately, 2.5" drives are available only at 10k rpm, so the streamer has eight 2.5" SAS 73GB 10k rpm disks in RAID 5, where the RAID is provided by the internal DELL PowerEdge RAID controller PERC 5/i (a re-branded LSI MegaRAID SAS). The software provider has load tests that can determine whether a system is good enough for the requested solution, and a reference hardware configuration with reference results. The reference hardware is a SuperMicro server with eight 3.5" SCSI 36GB 10k rpm disks connected to an Areca ARC-1260 RAID controller. The disk subsystem is tested by a real load test and also by the synthetic disk benchmark utility "iozone". We can try to tune the disk subsystem and check our progress with the synthetic benchmark.
RESULTS & TUNING
Here are the results from the reference hardware (SuperMicro):
1 THREAD
iozone -s 32g -r 1m -i 0 -i 1 -t 1 -b /tmp/test.xls
Initial write    299951.25 kB/s
Rewrite        254075.39 kB/s
Read        386271.91 kB/s
Re-read        388288.41 kB/s
2 THREADS
iozone -s 16g -r 1m -i 0 -i 1 -t 2 -b /tmp/test.xls
Initial write    327977.77 kB/s
Rewrite        321502.41 kB/s
Read        312530.73 kB/s
Re-read        315091.88 kB/s
10 THREADS
iozone -s 4g -r 1m -i 0 -i 1 -t 10 -b /tmp/test.xls
Initial write    293753.92 kB/s
Rewrite        281009.19 kB/s
Read        262086.25 kB/s
Re-read        260864.43 kB/s
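The three runs differ only in thread count and per-thread file size. The sketch below just prints the command lines, with the meaning of each flag in the comments. The helper name iozone_cmd is hypothetical and used only for illustration; it is not part of iozone.

```shell
#!/bin/sh
# Print the iozone command line for a given thread count and per-thread size.
#   -s  file size (per thread when -t is used)
#   -r  record size
#   -i 0  write/rewrite test
#   -i 1  read/re-read test
#   -t  number of threads (throughput mode)
#   -b  Excel-compatible output file
iozone_cmd() {
    threads=$1
    size=$2
    echo "iozone -s $size -r 1m -i 0 -i 1 -t $threads -b /tmp/test.xls"
}

# The three configurations used in this article.
iozone_cmd 1 32g
iozone_cmd 2 16g
iozone_cmd 10 4g
```

Because -s applies per thread, the runs touch 32 GB, 32 GB and 40 GB of data in total, so the 10-thread run is not working on exactly the same amount of data as the other two.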
The software provider tested the DELL PE 2970 with these results:
1 THREAD
iozone -s 32g -r 1m -i 0 -i 1 -t 1 -b /tmp/test.xls
Initial write    339329.00 kB/s
Rewrite        325063.56 kB/s
Read        337726.91 kB/s
Re-read        320971.62 kB/s
2 THREADS
iozone -s 16g -r 1m -i 0 -i 1 -t 2 -b /tmp/test.xls
Initial write    356046.38 kB/s
Rewrite        359441.64 kB/s
Read        193787.55 kB/s
Re-read        194154.85 kB/s
10 THREADS
iozone -s 4g -r 1m -i 0 -i 1 -t 10 -b /tmp/test.xls
Initial write    296494.20 kB/s
Rewrite        281730.82 kB/s
Read        147723.06 kB/s
Re-read        148728.67 kB/s
We can see a significant difference between the reference hardware and the DELL hardware. DELL is better in write throughput but worse in read throughput. The default parameters of DELL servers are preconfigured for database servers, which usually have different requirements than streaming applications. Streaming applications need read performance rather than write performance. DELL received a request from the software provider to optimize the infrastructure for better read performance.
Tuning a disk subsystem is not an easy task and depends on many factors. In this particular environment we have Debian Linux 4.0 (kernel 2.6.x), the XFS filesystem, RAID 5 and a PERC 5/i. Debian is not a certified and supported operating system, so the customer cannot use standard DELL tech support, but DELL can sometimes help its customers with particular complex enterprise solutions.
To increase read performance, especially for sequential reads, it is very important to set up the read-ahead cache. The PERC 5/i can operate in three modes: adaptive read-ahead, read-ahead and no-read-ahead. By default the PERC is in adaptive mode, which means it uses an internal algorithm to automatically recognize when to use read-ahead. In this particular solution we can explicitly set "read ahead" mode in the RAID management. Another very important point is to set read-ahead in the Linux block device layer.
Linux kernel 2.6
Set the read-ahead value to 8192 sectors using the blockdev command, for example:
blockdev --setra 8192 /dev/sda
This example sets a 4 MB read-ahead (8192 sectors of 512 bytes each), which is aligned with the default XFS parameters. Use xfs_info to see the current XFS parameters.
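Since blockdev --setra takes its value in 512-byte sectors, it can help to convert between sectors and KiB before choosing a value. The following is a minimal sketch; the function names ra_sectors_to_kib and ra_kib_to_sectors are hypothetical helpers, and /dev/sda in the comments is only an example device.

```shell
#!/bin/sh
# Convert a read-ahead value in 512-byte sectors to KiB.
ra_sectors_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

# Convert a desired read-ahead size in KiB to 512-byte sectors.
ra_kib_to_sectors() {
    echo $(( $1 * 1024 / 512 ))
}

ra_sectors_to_kib 8192    # the value used in this article: 4096 KiB = 4 MB
ra_kib_to_sectors 4096    # and back again: 8192 sectors

# On a real system, as root (the device name is an example):
#   blockdev --getra /dev/sda        # show the current read-ahead in sectors
#   blockdev --setra 8192 /dev/sda   # set it to 8192 sectors
```

A value set with blockdev --setra does not survive a reboot, so it has to be reapplied at boot time, for example from a startup script.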
iozone test results on the DELL PE 2970 with the tuned block device layer:
2 THREADS
iozone -s 16g -r 1m -i 0 -i 1 -t 2 -b /tmp/test.xls
Initial write    290692.31 kB/s
Rewrite        359531.20 kB/s
Read        503044.62 kB/s
Re-read        496045.61 kB/s
10 THREADS
iozone -s 4g -r 1m -i 0 -i 1 -t 10 -b /tmp/test.xls
Initial write    297497.92 kB/s
Rewrite        279933.16 kB/s
Read        473969.33 kB/s
Re-read        481384.16 kB/s
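To put a number on the improvement, the read throughput after tuning can be divided by the value measured before tuning. This is a small sketch using awk for the floating-point division; the function name speedup is a hypothetical helper, and the figures are copied from the iozone results in this article.

```shell
#!/bin/sh
# Ratio of two throughput figures, rounded to two decimal places.
speedup() {
    awk -v after="$1" -v before="$2" 'BEGIN { printf "%.2f\n", after / before }'
}

# Read throughput in kB/s on the DELL PE 2970, tuned vs. untuned.
echo "2 threads:  $(speedup 503044.62 193787.55)x"
echo "10 threads: $(speedup 473969.33 147723.06)x"
```

With these inputs the read throughput improves by a factor of about 2.6 with 2 threads and about 3.2 with 10 threads.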
CONCLUSION
It is possible to successfully tune a disk subsystem by setting the read-ahead parameters. The read throughput of the DELL PE 2970 with PERC 5/i and eight 10k rpm disks increased approximately 3 times (about 2.6 times with 2 threads and 3.2 times with 10 threads), so we achieved great results in the synthetic benchmark (iozone) and surpassed the reference hardware results. We can also see that disk performance drops with more concurrent threads. Local RAID controllers are designed as disk storage for one server where I/O stress is not so high. Anyone looking for storage without I/O stress issues should look at SAN disk arrays, which are designed for environments with many servers, processes and threads.
 
1 comment:
Great post, seems to help keep my filesystem from choking (PE1950 with an MD1000 as a NAS server) during heavy reading. I still need to find out the optimal value for setra when using reiserfs...