How to force-drop a PostgreSQL database by killing off connection processes

Ever need to drop a PostgreSQL database, but it would not let you because there are open connections to it (from a webapp or whatever)?

Quite annoying.  On a production server where other databases are in use, restarting PostgreSQL is a last resort, because it causes downtime for your site (even if brief).

I finally took the time to scratch around and find the answer.

As a superuser, to list all of the open connections to a given database:

select * from pg_stat_activity where datname='YourDatabase';

As a superuser, to drop all of the open connections to a given database:

select pg_terminate_backend(procpid) from pg_stat_activity where datname='YourDatabase';

For PostgreSQL 9.2 and later, change `procpid` to `pid`:

select pg_terminate_backend(pid) from pg_stat_activity where datname='YourDatabase';
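
Putting it together from the shell, a minimal sketch (the database name and superuser account are placeholders, and this assumes the 9.2+ column name):

#!/bin/bash
# Terminate every backend connected to the target database, then drop it.
DB=YourDatabase

psql -U postgres -d postgres -c \
  "select pg_terminate_backend(pid) from pg_stat_activity where datname='$DB';"

dropdb -U postgres "$DB"

Connecting to the postgres database, rather than the one being dropped, keeps your own session out of the list of backends being terminated.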

Here are some references to the functions:

http://www.postgresql.org/docs/current/static/functions-admin.html#FUNCTIONS-ADMIN-SIGNAL-TABLE

http://www.postgresql.org/docs/8.4/static/monitoring-stats.html#MONITORING-STATS-VIEWS-TABLE

Ubuntu Post-Install tips…

I received this from a friend, and thought I would post it here in case anyone would find it useful.

After installing Ubuntu, this is basically what I do:

Go to:
System -> Administration -> Software Sources -> Other Software, and enable the partner repository.
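
If you prefer the command line, the partner repository can also be enabled with something like this (the archive URL and component are my assumption of the standard partner entry, so double-check them against your release):

sudo add-apt-repository "deb http://archive.canonical.com/ubuntu $(lsb_release -sc) partner"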

After that, run the following in a terminal:

sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get install ubuntu-restricted-extras
sudo /usr/share/doc/libdvdread4/install-css.sh

PostgreSQL Dump and Restore Notes

The pg_dump and pg_restore commands provide excellent flexibility in storing a compressed dump file, and selectively restoring any part of it.

I’ve found that dropping and re-creating the target database is the cleanest way to restore a dumpfile — no stray relations left to cause trouble.

Unless you own all of the objects being restored, you may need to be a superuser in order to have a successful restore.

The custom dump format is quite useful.  Unlike the plain sequence of SQL statements you may be used to from mysqldump (and pg_dump's default output), the --format=custom option creates a compressed archive file that can be selectively read with pg_restore.  That flexibility comes in handy when you *just* need the schema from one table, or *just* the data from another.

Dump:
pg_dump --format=custom -U jason_super MyDatabase > MyDatabase.pgdump

Restore:
pg_restore --exit-on-error --clean --dbname=MyDatabase MyDatabase.pgdump

Get all of the SQL:
pg_restore TMTManage_2.pgdump | more

Get some of the SQL:
pg_restore --schema=ACRM --table=Admin TMTManage_2.pgdump | more
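
And since dropping and re-creating the target database is the cleanest route, a full restore might look something like this (a sketch; the owner and superuser names are carried over from the dump example above):

# Drop the old database, re-create it empty, and restore the custom-format dump
dropdb -U jason_super MyDatabase
createdb -U jason_super --owner=jason_super MyDatabase
pg_restore --exit-on-error --dbname=MyDatabase -U jason_super MyDatabase.pgdump

With a freshly created database there is nothing for --clean to drop, so it can be left off.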

nginx restart error

I was playing around with nginx on CentOS 5 (the EPEL package).

Most of the time, when I ran:

service nginx restart

I would get this message in the /var/log/nginx/error.log file:

panic: MUTEX_LOCK (22) [op.c:352].

After some hunting around, it appears to be a known bug in nginx (perhaps in nginx's embedded Perl?)… Anyway, a simple workaround is to do this:

service nginx stop
service nginx start

Or, simply edit /etc/init.d/nginx, and add the sleep 1 line:

restart() {
    configtest || return $?
    stop
    sleep 1
    start
}

Nice workaround!

Example Automated MySQL Backup Script

Rather than using --all-databases, which prevents you from selectively restoring any single database, consider the following:

Ideally, you should have a daily backup, with some history. It should be bulletproof (--force), it should be logged (>> ...log), it should be compressed (| gzip), it should keep separate copies of each database, and it should automatically pick up any databases that are added.

A shell script like this does the job:

#!/bin/bash

Host=server.domain.com
BDir=/home/backup/backup/mysql

Dump="/usr/bin/mysqldump --skip-extended-insert --force"
MySQL=/usr/bin/mysql

# Day-of-week abbreviation (Mon, Tue, ...) gives a rolling 7-day rotation
Today=$(date "+%a")

# Get a list of all databases (-N skips the "Database" header row)
Databases=$(echo "SHOW DATABASES" | $MySQL -h "$Host" -N)

for db in $Databases; do
        date=$(date)
        file="$BDir/$Host-$db-$Today.sql.gz"
        echo "Backing up '$db' from '$Host' on '$date' to: "
        echo "   $file"
        $Dump -h "$Host" "$db" | gzip > "$file"
done

This assumes that you have a file ~/.my.cnf (chmod 600) that contains:

[client]
user = "BACKUP"
password = "SOMEPASS8342783492"

Make sure that whatever user you are using for BACKUP has this grant statement:

GRANT
  SELECT, SHOW VIEW ON *.*
  TO BACKUP@localhost
  IDENTIFIED BY 'SOMEPASS8342783492';

So simply add this to a nightly cron job, and you have a daily backup that rotates on a 7-day cycle.

0 3 * * *   backup-mysql >> backup-mysql.log 2>> backup-mysql.log

The backup directory then contains:

-rw-r--r-- 1 backup backup 2217482184 Sep  3 13:35 base.appcove.net-VOS4_0-20090903.sql.gz
-rw-rw-r-- 1 backup backup 2505876287 Dec 25 00:48 base.appcove.net-VOS4_0-Fri.sql.gz
-rw-r--r-- 1 backup backup 2500384029 Dec 21 00:48 base.appcove.net-VOS4_0-Mon.sql.gz
-rw-r--r-- 1 backup backup 2506849331 Dec 26 00:48 base.appcove.net-VOS4_0-Sat.sql.gz
-rw-r--r-- 1 backup backup 2499859469 Dec 20 00:48 base.appcove.net-VOS4_0-Sun.sql.gz
-rw-rw-r-- 1 backup backup 2505046147 Dec 24 00:48 base.appcove.net-VOS4_0-Thu.sql.gz
-rw-rw-r-- 1 backup backup 2502277743 Dec 22 00:48 base.appcove.net-VOS4_0-Tue.sql.gz
-rw-r--r-- 1 backup backup 2504169910 Dec 23 00:48 base.appcove.net-VOS4_0-Wed.sql.gz
-rw-r--r-- 1 backup backup   76983829 Dec 25 00:49 base.appcove.net-VOS4_Mail_0-Fri.sql.gz
-rw-r--r-- 1 backup backup   76983829 Dec 21 00:49 base.appcove.net-VOS4_Mail_0-Mon.sql.gz
-rw-r--r-- 1 backup backup   76983829 Dec 26 00:49 base.appcove.net-VOS4_Mail_0-Sat.sql.gz
-rw-r--r-- 1 backup backup   76983829 Dec 20 00:48 base.appcove.net-VOS4_Mail_0-Sun.sql.gz
-rw-rw-r-- 1 backup backup   76983829 Dec 24 00:49 base.appcove.net-VOS4_Mail_0-Thu.sql.gz
-rw-rw-r-- 1 backup backup   76983829 Dec 22 00:49 base.appcove.net-VOS4_Mail_0-Tue.sql.gz
-rw-r--r-- 1 backup backup   76983829 Dec 23 00:49 base.appcove.net-VOS4_Mail_0-Wed.sql.gz
-rw-r--r-- 1 backup backup  304803726 Dec 25 00:49 base.appcove.net-WeSell_0-Fri.sql.gz
-rw-r--r-- 1 backup backup  303480087 Dec 21 00:49 base.appcove.net-WeSell_0-Mon.sql.gz
-rw-r--r-- 1 backup backup  304710121 Dec 26 00:49 base.appcove.net-WeSell_0-Sat.sql.gz
-rw-r--r-- 1 backup backup  303791294 Dec 20 00:49 base.appcove.net-WeSell_0-Sun.sql.gz
-rw-rw-r-- 1 backup backup  305315415 Dec 24 00:49 base.appcove.net-WeSell_0-Thu.sql.gz
-rw-rw-r-- 1 backup backup  302516217 Dec 22 00:49 base.appcove.net-WeSell_0-Tue.sql.gz
-rw-r--r-- 1 backup backup  303314217 Dec 23 00:49 base.appcove.net-WeSell_0-Wed.sql.gz
-rw-r--r-- 1 backup backup     135301 Dec 25 00:30 dc40.appcove.net-mysql-Fri.sql.gz
-rw-r--r-- 1 backup backup     135301 Dec 21 00:30 dc40.appcove.net-mysql-Mon.sql.gz
-rw-r--r-- 1 backup backup     135301 Dec 26 00:30 dc40.appcove.net-mysql-Sat.sql.gz
-rw-r--r-- 1 backup backup     135301 Dec 20 00:30 dc40.appcove.net-mysql-Sun.sql.gz
-rw-rw-r-- 1 backup backup     135301 Dec 24 00:30 dc40.appcove.net-mysql-Thu.sql.gz
-rw-rw-r-- 1 backup backup     135301 Dec 22 00:30 dc40.appcove.net-mysql-Tue.sql.gz
-rw-r--r-- 1 backup backup     135301 Dec 23 00:30 dc40.appcove.net-mysql-Wed.sql.gz
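
To selectively restore a single database from one of these files (a sketch; the host, file, and database names are just taken from the listing above as an illustration):

# Re-create the database if needed, then load one compressed dump into it
echo "CREATE DATABASE IF NOT EXISTS WeSell_0" | mysql -h server.domain.com
gunzip < base.appcove.net-WeSell_0-Fri.sql.gz | mysql -h server.domain.com WeSell_0

Of course, the account used for the restore needs more than the read-only privileges granted to BACKUP.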

Freaky Bug

Ever have one of those bugs that customers complain about, but you just cannot reproduce it? Here is a good one…

Customers were complaining about being logged out when clicking a download link.

This particular setup is a Cisco CSS 11501 series load balancer with 2 Dell PowerEdge web servers sitting behind it.  Each web server is running Apache, as well as an application server (Python) which handles authentication and processing for THAT server.

For weeks, I could not reproduce this bug.  So tonight when I finally got bit by it (at home), I was clueless for a while.  The code is so simple.  A simple key lookup in a simple dictionary, yet it just was not making sense.

Here is the story:

A while ago, we were having problems with Internet Explorer downloading content over SSL.  This turns out to be a common problem with IE, so to fix it, I caused the downloads to not use SSL, which is more efficient anyway.

We use a cisco hardware load balancer which balances incoming requests to different backend servers.  It has a feature called STICKY SOURCE IP, which means that any connections routed from the same IP to the same site will be delivered to the same backend server.  This is nice, because you are always visiting the same server.

So as it turns out, by turning the download SSL off, the load balancer was using another “site” definition to handle the DOWNLOAD request.  STICKY SOURCE IP was out the window, and the request was being passed back to a “random” webserver.

About 50% of the time, users (like me tonight) were tossed to the other server, which knew nothing about the user login. That is why it was complaining about the “WB4_App::$DSEG and/or WB4_App::$AuthToken must be set in order to contact the applications server.” error message, which is not one that should normally be shown.

To make matters worse, our IP address at work was apparently always using the same server, so I could not reproduce the problem.  I’m lucky that it happened to me at home, or I would still be banging my head against the desk…

Interesting Thoughts on Cloud Server Performance

Apache load testing on a Cloud Server – Jason – 7/31/2009

I recently created a cloud server for a WordPress blog, and configured it to the point that the blog was working OK.  Then I decided to check the performance of the server, as it was a small 256 MB + 10 GB machine.
Using apachebench (ab), I ran some load tests on the blog home page.  The server choked to death. It was swapping so badly that Rackspace Cloud sent me this email:

This is an automatic notification to let you know that your Cloud Server, city.appcove.com, is showing a considerable amount of consistent swapping activity. Quite often this is an indicator that your application or database are not as efficient as they could be. It also may indicate that you need to upgrade your Cloud Server for more RAM.

That’s strange…
I found that the response rate was:

4 requests per second, 10 concurrent connections

When the concurrency was raised to 50, the server died.  It took 10 minutes for it to calm down enough that I could LOG IN and KILL apache.
So upon further investigation, I found that the default httpd.conf prefork configuration was WAY TOO LARGE. We’re only working with 256 MB of RAM here, so if each Apache process takes up any amount of memory at all, we have a low limit:

<IfModule prefork.c>
StartServers       8
MinSpareServers    5
MaxSpareServers   20
ServerLimit      256
MaxClients       256
MaxRequestsPerChild  4000
</IfModule>
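
To put rough numbers on that (the per-process figure is an assumption, not a measurement): a prefork child running mod_php commonly sits somewhere in the 25-40 MB range, so allowing 256 of them implies roughly 6-10 GB of RAM on a machine that has 256 MB. No wonder it swapped itself to death.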

Only after drastically reducing the configuration to the following did we get reasonable performance:

<IfModule prefork.c>
StartServers       4
MinSpareServers    2
MaxSpareServers   4
ServerLimit      4
MaxClients       4
MaxRequestsPerChild  4000
</IfModule>

As it turns out, the performance went up considerably:

16 requests per second, 50 concurrent connections
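
For reference, the load tests were along these lines with apachebench (the request count and URL are placeholders, not the exact command I ran):

# 1000 requests, 50 concurrent, against the blog home page
ab -n 1000 -c 50 http://city.appcove.com/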

Still, I thought that it could get better.  So I looked into installing some PHP opcode caching software.

http://www.php.net/manual/en/intro.apc.php

The Alternative PHP Cache (APC) is a free and open opcode cache for PHP. Its goal is to provide a free, open, and robust framework for caching and optimizing PHP intermediate code.

As it turns out, it was easy to install.

# yum install php-pecl-apc

And after restarting apache:

47 requests per second, 50 concurrent connections

Even during this load test, the site was still responsive from a web browser.
Not bad for a cheap little Cloud Server, eh?