We recommend that a DPSS storage cluster consist of 4 servers, with 3-4 disks each. However, your configuration depends on your network connections. The most typical NGI environment these days will probably be a single 1000BT uplink to a router. Typical 1000BT host throughput these days (without jumbo frames) is 300-350 Mbps, so 3-4 hosts are needed to saturate a 1000BT uplink. Also, at least 3 disks per host are needed to stream 350 Mbps from disk. In other words, you want enough disks on each SCSI host adapter to fill the bus, and enough SCSI host adapters to fill the network adapter. Then you want enough servers to satisfy your total bandwidth requirements. If you only have a 100BT or OC3 uplink, then a single DPSS server is plenty; if you have an OC48 uplink, you might want more servers. Scale your total number of servers and disks per server to the network you are connected to.
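The sizing rule above comes down to two divisions, which can be sketched in Python. This is only a rough illustration, not part of the DPSS software: the function names are made up for this example, and the 10 MB/sec per-disk rate is the sample SCSI disk figure quoted elsewhere in this document. Real disks and hosts will vary, so measure your own hardware.

```python
import math

MBITS_PER_MBYTE = 8  # 1 MB/sec = 8 Mbps


def servers_needed(uplink_mbps, host_mbps):
    """Fewest servers whose combined network throughput fills the uplink."""
    return max(1, math.ceil(uplink_mbps / host_mbps))


def disks_per_server(host_mbps, disk_mb_per_sec):
    """Fewest disks needed to keep one host's network interface busy."""
    host_mb_per_sec = host_mbps / MBITS_PER_MBYTE
    return max(1, math.ceil(host_mb_per_sec / disk_mb_per_sec))


if __name__ == "__main__":
    # 1000BT uplink, hosts that manage 300-350 Mbps each
    print(servers_needed(1000, 300))   # 4
    print(servers_needed(1000, 350))   # 3
    # OC3 uplink (about 155 Mbps): one server is plenty
    print(servers_needed(155, 300))    # 1
    # Assumed 10 MB/sec per disk, host limited to 300 Mbps
    print(disks_per_server(300, 10))   # 4
```

With these assumed inputs the sketch lands on the same 3-4 servers recommended above. The disk count is only as good as the per-disk rate you feed in.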
SCSI is recommended over IDE because IDE does not allow simultaneous transfers from multiple devices, and SCSI disks are faster and more reliable. However, with Linux servers there is a way to use IDE disks: get a storage controller board from 3ware, which puts multiple IDE master devices on 1 board. IDE disks tend to be a bit slower (about 6 MB/sec vs 10 MB/sec), so this will slow you down some, but 4 disks should still give about 32 MB/sec per server.
Sample Costs:

Configuration with SCSI disks:
sample server host = Pentium 500+ with 100 BT and/or 1000BT: about $2K
sample disk = 50 GB (43 GB formatted) Ultra-wide Seagate (10 MB/s): $815
| Throughput | Capacity | Configuration | Total Cost | Bottleneck |
|---|---|---|---|---|
| 37 MB/sec (300 Mb/s) | 172 GB | 1 server, 4 disks | $3.6 K | host interface |
| 122 MB/sec (980 Mb/s) | 516 GB | 4 servers, 12 disks | $18 K | GE network |
| 122 MB/sec (980 Mb/s) | 1 TB | 4 servers, 24 disks | $28 K | GE network |

Configuration with IDE disks:
sample server host = Pentium 500+ with 100 BT and/or 1000BT: about $2K
sample disk = 45 GB (40 GB formatted) IDE disk from Western Digital (6 MB/s): $260
| Throughput | Capacity | Configuration | Total Cost | Bottleneck |
|---|---|---|---|---|
| 32 MB/sec (250 Mb/s) | 160 GB | 1 server, 4 disks | $3.1 K | disk |
| 72 MB/sec (576 Mb/s) | 480 GB | 4 servers, 12 disks | $11.1 K | disk |
| 122 MB/sec (980 Mb/s) | 960 GB | 4 servers, 24 disks | $14.2 K | GE network |
These costs assume you already have a high-speed network in place. If you must purchase switches/routers, the cost will be MUCH higher.

It's best to also have a separate host for the DPSS master, but you can also run the master on one of the server hosts. Put at least 128 MB RAM in each server, and 256 MB if the server host is also running the DPSS master.
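The cost and capacity columns in the sample tables can be approximated with simple arithmetic: servers times server price plus disks times disk price, and disks times formatted size. The sketch below is an illustration only, with made-up function names. It ignores network gear and a separate master host, and its `copies` parameter is an assumption that models replicating data for fault tolerance by dividing usable capacity.

```python
def cluster_cost(servers, disks, server_cost, disk_cost):
    """Hardware cost in dollars for servers plus disks (network gear excluded)."""
    return servers * server_cost + disks * disk_cost


def usable_capacity_gb(disks, formatted_gb_per_disk, copies=1):
    """Usable storage in GB; copies=2 means every block is stored twice."""
    return disks * formatted_gb_per_disk / copies


if __name__ == "__main__":
    # Sample IDE configuration: 4 servers at $2000, 24 disks at $260, 40 GB formatted
    print(cluster_cost(4, 24, 2000, 260))        # 14240
    print(usable_capacity_gb(24, 40))            # 960.0
    # Same hardware with every block replicated once
    print(usable_capacity_gb(24, 40, copies=2))  # 480.0
```

Put the other way around, keeping 960 GB usable with two copies of the data means buying 48 disks rather than 24.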
The DPSS uses raw disk partitions. It's O.K. to have non-DPSS partitions on the same disk, but the DPSS needs an empty partition for itself. A RAID disk should also work, but we've never tried it.
Other things to consider:
- Remember that if you wish to replicate data for fault tolerance, you'll need twice as many disks.
- Sun vs Pentium?
This is a hard one. A low-end Sun server is probably about twice the cost of a no-name PC clone. I personally prefer Suns because they are much easier to install and configure, and Solaris is much more mature than Linux. Multi-threading in particular seems much better on Solaris. Of course Solaris x86 is also an option, but be sure your network hardware is supported.

For sample pricing info, see: One of many sources for cheap PCs or disks