
Apache log rotation

One of the easiest ways to save disk space (and our costs, and ultimately yours) is to rotate your Apache logs. This is very simple to do with Apache on our OpenSolaris VPS servers.

Just yesterday, we had one customer with gigabytes of log data:

root@wolf:/home# du -sh */logs/*
56K administracionalta/logs/access_log
898K administracionalta/logs/error_log
0K ascensoresfamtech/logs/access_log
47K ascensoresfamtech/logs/error_log
0K gusonline/logs/access_log
0K gusonline/logs/error_log
898K pintoporcelana/logs/access_log
11K pintoporcelana/logs/access_log.1270252800
3.3M pintoporcelana/logs/error_log
5.0M radiosentidos/logs/access_log
115K radiosentidos/logs/access_log.1270252800
14G radiosentidos/logs/error_log
0K sanceferinoweb/logs/access_log
1.4M sanceferinoweb/logs/error_log
0K teinvito/logs/access_log
69K teinvito/logs/error_log
0K teinvitotv/logs/access_log
0K teinvitotv/logs/error_log
root@wolf:/home#

We changed the CustomLog definition in the httpd.conf file to look like:

CustomLog "|/usr/bin/rotatelogs /var/apache2/2.2/logs/access_log 86400" common

The 86400 is the rotation interval in seconds, so a new log file is started once a day. Restart Apache, and that should do it! That was simple, right? Now, next time use something better than Apache if possible. :)
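Note that the 14G file in the listing above was an error_log, and Apache's ErrorLog directive accepts the same piped "|program" form as CustomLog. Here is a small sketch for spotting log directives that are still written directly to a file. The function name unrotated_logs is made up for illustration, the config path in the comment is only an example, and it assumes a grep that understands -E (on OpenSolaris that may mean /usr/xpg4/bin/grep).

```shell
# Print active CustomLog/ErrorLog directives in an Apache config file
# that are NOT piped through rotatelogs. Commented-out lines are ignored
# because the pattern only allows whitespace before the directive name.
unrotated_logs() {
    grep -E '^[[:space:]]*(CustomLog|ErrorLog)[[:space:]]' "$1" | grep -v 'rotatelogs'
}

# Example call (hypothetical path, adjust to wherever your httpd.conf lives):
# unrotated_logs /etc/apache2/2.2/httpd.conf
```

Any line it prints is a candidate for the same rotatelogs treatment as the CustomLog line above.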

Tomcat and OpenSolaris... OutOfMemoryError

This doesn't apply to just Tomcat, but we'll use Tomcat as an example. If you get Java heap size errors along with those nasty OutOfMemoryError exceptions, here's what you need to do.

root@vps1:~# svccfg -s svc:/network/http:tomcat6 setenv JAVA_OPTS -Xmx256m
root@vps1:~# svcadm refresh svc:/network/http:tomcat6
root@vps1:~# svcadm restart svc:/network/http:tomcat6

That should do it. You can use the good old pargs [pid] command to make sure the above JAVA_OPTS took effect.
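If you'd rather not eyeball the pargs output, something along these lines can pull the heap setting out of it. This is only a sketch: heap_limit is a made-up helper name, 1234 is a placeholder pid, and it assumes the usual pargs layout of one "argv[N]: value" per line.

```shell
# Read a process argument listing on stdin and print the value of the
# last -Xmx option found (for example "256m"). Prints nothing if there
# is no -Xmx argument at all.
heap_limit() {
    tr -s ' \t' '\n' | sed -n 's/^-Xmx//p' | tail -n 1
}

# Example call (1234 is a placeholder pid):
# pargs 1234 | heap_limit
```

An empty result would suggest the service never picked up the new JAVA_OPTS.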

Tuning Databases for ZFS

Small post here. Check out this document we found for tuning databases running on ZFS/OpenSolaris VPS.

OpenSolaris VPS benchmark comparison

We ran unixbench 5.1.2 on one of our servers. We wanted to see how it compared to some of the competition. The Linode and EC2 benchmarks were done back in December 2009, on the latest hardware that was provided. Our OpenSolaris benchmarks were done on a system we bought back in March 2009.

We'll start off with the Linode 360 plan (4 CPU / 4 parallel):

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   29911700.2   2563.1
Double-Precision Whetstone                       55.0       7852.9   1427.8
Execl Throughput                                 43.0       5470.0   1272.1
File Copy 1024 bufsize 2000 maxblocks          3960.0     315110.5    795.7
File Copy 256 bufsize 500 maxblocks            1655.0      82099.9    496.1
File Copy 4096 bufsize 8000 maxblocks          5800.0     866155.2   1493.4
Pipe Throughput                               12440.0    2053207.3   1650.5
Pipe-based Context Switching                   4000.0     237263.9    593.2
Process Creation                                126.0      10784.4    855.9
Shell Scripts (1 concurrent)                     42.4       9259.1   2183.7
Shell Scripts (8 concurrent)                      6.0       1539.9   2566.5
System Call Overhead                          15000.0    1768915.5   1179.3
                                                                   ========
System Benchmarks Index Score                                        1254.5

Now, Amazon's EC2 medium (2 CPU / 4 parallel):

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   24194215.0   2073.2
Double-Precision Whetstone                       55.0       8422.0   1531.3
Execl Throughput                                 43.0       2379.7    553.4
File Copy 1024 bufsize 2000 maxblocks          3960.0     142163.5    359.0
File Copy 256 bufsize 500 maxblocks            1655.0      36551.4    220.9
File Copy 4096 bufsize 8000 maxblocks          5800.0     421398.5    726.5
Pipe Throughput                               12440.0     239183.7    192.3
Pipe-based Context Switching                   4000.0      82291.4    205.7
Process Creation                                126.0       2974.6    236.1
Shell Scripts (1 concurrent)                     42.4       5357.8   1263.6
Shell Scripts (8 concurrent)                      6.0        737.7   1229.5
System Call Overhead                          15000.0    1467331.5    978.2
                                                                   ========
System Benchmarks Index Score                                         579.6

Finally, here is one of our OpenSolaris VPS servers, running at about 20-30% utilization. The first run is with 2 CPU / 2 parallel:

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   54106806.1   4636.4
Double-Precision Whetstone                       55.0       7594.6   1380.8
Execl Throughput                                 43.0       1849.9    430.2
File Copy 1024 bufsize 2000 maxblocks          3960.0     126718.2    320.0
File Copy 256 bufsize 500 maxblocks            1655.0      32468.1    196.2
File Copy 4096 bufsize 8000 maxblocks          5800.0     378133.6    652.0
Pipe Throughput                               12440.0    2062538.7   1658.0
Pipe-based Context Switching                   4000.0     262259.5    655.6
Process Creation                                126.0       2888.7    229.3
Shell Scripts (1 concurrent)                     42.4       4155.4    980.0
Shell Scripts (8 concurrent)                      6.0        728.6   1214.4
System Call Overhead                          15000.0     930953.7    620.6
                                                                   ========
System Benchmarks Index Score                                         724.0

We close out with a 4 CPU / 4 parallel test on the same server:

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0  101277172.7   8678.4
Double-Precision Whetstone                       55.0      15197.7   2763.2
Execl Throughput                                 43.0       2764.1    642.8
File Copy 1024 bufsize 2000 maxblocks          3960.0     152318.4    384.6
File Copy 256 bufsize 500 maxblocks            1655.0      39096.9    236.2
File Copy 4096 bufsize 8000 maxblocks          5800.0     526719.9    908.1
Pipe Throughput                               12440.0    4177112.1   3357.8
Pipe-based Context Switching                   4000.0     650505.1   1626.3
Process Creation                                126.0       4924.3    390.8
Shell Scripts (1 concurrent)                     42.4       5666.8   1336.5
Shell Scripts (8 concurrent)                      6.0        806.4   1344.0
System Call Overhead                          15000.0     728384.9    485.6
                                                                   ========
System Benchmarks Index Score                                        1074.5

Clearly, we were not the leaders in these benchmarks, but there is something interesting to be said about the results. Our OpenSolaris VPS servers use ZFS, and the File Copy tests lag far behind for one big reason: ZFS on our OpenSolaris VPS favors reliability over performance. Let us explain.

Disks have an on-board cache to help with speed. Most other operating systems and file systems will "tell" the application that a write is complete as soon as the data reaches the disk cache (a temporary location, before it is written to the slower disk spindles). If the power fails at this point, the data is lost. Your business-critical database thinks it wrote data to disk, but in reality it only wrote to the cache. This leads to corruption and bad data.

ZFS doesn't do this. ZFS ensures that all data is properly flushed to disk on each commit, taking the safer, more reliable approach to disk management. You can expect your database to be in good order if the system crashes or if we ever experience a catastrophic power outage.

If we were to put aside the File Copy tests, we think we would have a pretty good chance of beating out the competition! In the future we'll run these same tests on a ZFS system with flush on commit turned off. But don't count on that being disabled on a production system. We prefer reliability over performance. Adding SSD to the mix should also improve things greatly, but you'll have to wait a little longer to experience OpenSolaris VPS with SSD flash disks.
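The "put aside the File Copy tests" idea can be tried against the tables above. As far as we can tell, unixbench's overall score is the geometric mean of the INDEX column (the published 1254.5 for Linode is consistent with that), so here is a rough sketch that recomputes a score while skipping rows that match a pattern. The function name index_score and the file name entic-4cpu.txt are made up for illustration, and it assumes a POSIX awk (nawk on OpenSolaris).

```shell
# Read unixbench result rows on stdin and print the geometric mean of the
# last column (INDEX), rounded to one decimal place.
# $1 is an optional extended regex; rows matching it are left out.
index_score() {
    awk -v skip="${1:-}" '
        /Index (Score|Values)/   { next }   # header and total lines
        skip != "" && $0 ~ skip  { next }   # rows the caller wants excluded
        $NF + 0 > 0 { sum += log($NF); n++ } # ignores "========" and blanks
        END { if (n) printf "%.1f\n", exp(sum / n) }
    '
}

# Example calls, with one of the tables above saved to a file:
# index_score < entic-4cpu.txt               # all twelve tests
# index_score 'File Copy' < entic-4cpu.txt   # without the File Copy tests
```

Running both forms over each of the four tables gives a like-for-like comparison with the three File Copy rows removed.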

Peer1 == vendor lock-in

Our main network provider is Internap, on the 10th floor of CoreSite's Market Post Tower in San Jose. We recently started gathering quotes for a second network provider and came across Peer1. The negotiations fell through for one main reason:

If we added Peer1 to our mix, we would have to cross connect from the 10th floor to Peer1's 16th floor. We weren't interested in dropping Internap and going with Peer1 only, so a cross connect was a must. There was a small catch (or rather a big catch): it would cost $2662 to put in a cross connect to our existing network provider in the same building. Why was this so expensive?

Market Post Tower is the most connected building, allowing providers to peer with other providers with ease. Peer1 not letting us do this without paying a lot of money was a big drawback, so we dug a little deeper.

It turns out Peer1 has an office space rented from CoreSite which it converted to data center space. Peer1 has an isolated infrastructure from the rest of the building. Peer1 does not "peer" with CoreSite's Any2 network, so a cross connect would involve some expensive conduit installs. This is where the $2662 quote comes in.

There was no way for us to move our servers to Peer1 and still be connected to our current provider. And if we had gone with Peer1 and later decided to stop using it, that would have involved some expensive cross connect charges.

Neither option sounded appealing, so in the end we dropped the Peer1 idea. We've decided to continue to use our premium "pure" Internap network inside Entic.net. We also didn't want to go from the "most connected building" status to a "least connected" floor status. :)
