Author: sjan

Analysis

Botnet on port 23130?

Yesterday evening my roommate’s machine was botted. I got a text message from Pingdom that my site was down, and a bit of digging showed that his machine had somewhere in the range of 80 to 100 open outbound connections at all times.

I notified him and he immediately went to TrendMicro House Call to clean it up. He said it found “a few things,” but he didn’t note what they were, nor did he try to isolate them so I could attempt to decompile and inspect them. Ah well, such is the world, and he had work he needed to be able to finish with his machine.

The odd thing was, once his machine was cleaned and no longer in contact, I began to get a flood of TCP SYN packets and UDP packets to the server on port 23130. The size of the UDP packets (between 75 and 196 bytes) leads me to believe they were some sort of botnet commands, while the TCP SYN packets were bots trying to reconnect to their lost buddy.

This definitely did not have the marks of a DDoS of any sort. Once the bot on the Windows machine was stopped (and I once again had outbound bandwidth) the packets were hitting the server in a fairly steady fashion, but not in any kind of flooding behavior: each host tried no more than 5 times to connect via TCP, and no host sent 2 identical UDP packets in a row. The packets were hitting the server because they were addressed to a specific IP, and without the established connections in the router’s NAT table, any new connection attempt to that IP landed on the server.

Unfortunately the server in question is not beefy enough to run tcpdump, even for a few minutes, and altering my network enough to get my laptop in where it could sniff the packets was out of the question.

While I didn’t have tcpdumps or even extensive firewall logs, I did have the abbreviated logging that takes place in messages. (I also had dmesg logs to look at, but I never realized until last night that dmesg logs are not timestamped. I wonder if that is a configuration error on my part. Right now I am too exhausted to figure that one out.) So, I had the log entries in /var/log/messages, and there is plenty of good information there. Here is what I saw, covering Sep 18 19:16:59 through Sep 19 03:06:49. (Note that the packets are still coming in, but now at a rate of around 2 attempts per hour.)

There were a total of 178,335 TCP SYN packets to port 23130, along with 33,894 UDP packets to the same port. These requests came from 1,994 unique IP addresses. Below are some interesting statistics.
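Lacking tcpdump, tallies like these can be pulled straight out of the firewall log lines in /var/log/messages. Here is a minimal Python sketch; the iptables-style line format (SRC=, PROTO=, DPT= fields) is an assumption on my part, not copied from my actual logs:

```python
import re
from collections import Counter

# Assumed iptables-style log fields; adjust the pattern to your own logger.
LINE_RE = re.compile(r"SRC=(?P<src>\S+) .*?PROTO=(?P<proto>TCP|UDP) .*?DPT=(?P<dpt>\d+)")

def tally(lines, port=23130):
    """Return (per-protocol packet counts, set of unique source IPs) for one port."""
    protos = Counter()
    sources = set()
    for line in lines:
        m = LINE_RE.search(line)
        if m and int(m.group("dpt")) == port:
            protos[m.group("proto")] += 1
            sources.add(m.group("src"))
    return protos, sources
```

Run over the messages file, `protos` gives the TCP/UDP split and `len(sources)` the unique-host count reported above.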

Top ISPs by number of unique hosts
ISP Country Hosts
Comcast Cable Communications United States 129
Abovenet Communications, Inc United States 119
Road Runner HoldCo LLC United States 92
AT&T United States 77
Shaw Communications Inc. Canada 51
Verizon Internet Services Inc. United States 41
Cox Communications Inc. Canada 34
Rogers Cable Communications Inc. Canada 26
Bell Canada Canada 19
Charter Communications United States 19
All countries by number of unique hosts
Country Hosts
United States 643
Canada 152
United Kingdom 135
India 117
China 67
Philippines 65
Australia 62
Malaysia 39
Japan 33
Russian Federation 32
Mauritius 30
Netherlands 30
Portugal 27
Uruguay 25
Pakistan 22
United Arab Emirates 22
Spain 21
Greece 21
Romania 19
Thailand 19
Poland 19
Saudi Arabia 18
Germany 18
France 18
Bulgaria 16
Norway 16
Singapore 16
Korea, Republic of 15
Taiwan, Province of China 13
Brazil 13
Viet Nam 12
Italy 11
Turkey 11
Mexico 10
Sweden 9
Croatia 9
Finland 9
Israel 9
Ukraine 8
Hong Kong 7
Ireland 7
Argentina 7
Switzerland 6
Denmark 6
Estonia 6
Cyprus 6
Czech Republic 5
Kazakhstan 5
Chile 5
Qatar 5
Belgium 5
Sri Lanka 5
Latvia 4
Iran, Islamic Republic of 4
Indonesia 3
New Zealand 3
Slovakia 3
Dominican Republic 3
Serbia 3
El Salvador 3
Slovenia 3
Unknown 2
Kuwait 2
Trinidad and Tobago 2
Brunei Darussalam 2
Costa Rica 2
Bangladesh 2
Venezuela, Bolivarian Republic of 2
Hungary 2
Moldova, Republic of 2
Barbados 2
Puerto Rico 2
Aruba 1
Malta 1
Ecuador 1
Bahamas 1
Austria 1
Peru 1
Montenegro 1
Angola 1
Guatemala 1
Paraguay 1
Antigua and Barbuda 1
Lithuania 1
South Africa 1
Palestinian Territory, Occupied 1
Aland Islands 1
Macao 1
Jamaica 1
Honduras 1
Oman 1
Iceland 1
Guam 1
Bahrain 1
Albania 1
Nepal 1
Luxembourg 1
Iraq 1
Afghanistan 1

Edit: Mostly I am curious about the botnet in question. If anyone comes across a bot that is communicating on port 23130 please let me know what you find out about it.

Best Practices

Data Scrubbing: Is there a right way?

An article yesterday from Ars Technica got me wondering. In my former position we often “scrubbed” databases for sample data from which to work. And certainly one can see the value in working with data with personally identifiable information removed for the purposes of business or health-care informatics, service level determinations, quality of service surveys, and so on. Yet, according to a study at Carnegie Mellon University:

87% (216 million of 248 million) of the population in the United States had reported characteristics that likely made them unique based only on {5-digit ZIP, gender, date of birth}.

This seems to be the balance point: 3 pieces of non-anonymized data are enough to identify the majority of the population. (Think, “Three Strikes! You’re out!”) So what do we do when we need solid, anonymous data from which to work?

Taking the example of the health records from the article linked above, I would think that the following steps would be enough to scrub the data to the point where “reidentification” is no longer possible. Since these are medical records we can safely (I feel) assume that randomizing the gender is a non-starter. (“Wow, according to this data 14% of all men went to the Emergency Room with pregnancy-related complications!”)

And since this data is taken from an area where the zip codes are known, we are already at two strikes. So why did they not randomize the dates of birth? It would be difficult to do in the case of infants beyond a few days or weeks, since many of their health issues are related to their age in months. But for anyone over the age of 8 it should be simple enough to randomize the month and day of birth, and allow a set of ranges for randomizing the year of birth. If we assume a 20% range up or down we gain a latitude of possible years of birth which increases the older the patient actually is. Another possibility is to give everyone the same month and day of birth, differing only in the year (Jan 1, xxxx).
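The scheme above can be sketched in a few lines of Python. This is a toy illustration; the function name, the minimum jitter of one year, and capping the day at 28 (so every month stays valid) are my own choices:

```python
import random
from datetime import date

def scrub_dob(dob, today=None, rng=random):
    """Randomize a date of birth as sketched above: month and day are drawn
    freely, and the birth year is jittered within roughly 20% of the
    patient's age. Patients under 8 are left untouched, since age in
    months may be clinically relevant for infants."""
    today = today or date.today()
    age = today.year - dob.year
    if age < 8:
        return dob
    jitter = max(1, age // 5)        # ~20% of age, at least 1 year
    year = dob.year + rng.randint(-jitter, jitter)
    month = rng.randint(1, 12)
    day = rng.randint(1, 28)         # 28 keeps every month valid
    return date(year, month, day)
```

Note how the range of plausible birth years grows with the patient’s actual age, exactly the widening latitude described above.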

This of course means that any reporting done on age is meaningless, but it also means that the data can more safely be widely distributed. In cases where exact age and gender are required for study it would be better to merge data from many different areas, covering as many cities, counties, states and regions as possible. In this case we would still need to weigh the risks, as all three pieces of data would still be available, although at a much higher level of trial and error. In the case mentioned by Ars Technica the study covered seven zip codes. Perhaps spreading the information over a few hundred would make it much less worth the effort to sort through them all to try to identify individuals, and even then one would expect multiple possible hits.

The need for real data for statistical analysis and study is not going to go away. When you are considering scrubbing data to release a “sanitized” version, it would be good to keep the mantra “Three Strikes! You’re out!” in mind. When it comes to data for testing software operation, however, I still think the better method is complete randomization: totally bogus data that has the look and feel of “real” data. (Which is no doubt why all the bogus users in my test dbs live in different cities, in different states, and at addresses ranging from 123 to 999 on Any Street!)
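For the testing case, a toy generator along those lines might look like this. Every name and value below is invented for illustration, nothing here comes from any real dataset:

```python
import random

# All values below are invented, purely for generating test fixtures.
FIRST = ["Alice", "Bob", "Carol", "Dave"]
LAST = ["Smith", "Jones", "Nguyen", "Garcia"]
CITIES = [("Springfield", "IL"), ("Riverton", "WY"), ("Fairview", "TN")]

def fake_user(rng=random):
    """Generate one totally bogus user record with the look and feel of real data."""
    city, state = rng.choice(CITIES)
    return {
        "name": f"{rng.choice(FIRST)} {rng.choice(LAST)}",
        "address": f"{rng.randint(123, 999)} Any Street",
        "city": city,
        "state": state,
    }
```

Since nothing is derived from real records, there is nothing to reidentify, which is the whole point for test databases.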

Database

Massive Fail!

While looking for the source of database backup errors I found it, the hard way. I have been running my database on a separate hard disk from everything else, and have been getting occasional errors from the cron script that does a nightly dump. I was under the impression that this was due to latency causing the script to time out. Not the case.

Looking at the dump it seemed like everything was being written ok; at least it looked that way yesterday. (I have not had much time for site maintenance, so this has been in the “put it off until later” pile.) While looking at the script this afternoon and trying another run, it timed out again (or so I thought). I figured I could put it off until this weekend, until I went to look at the site and got the big “Unable to contact the database” error. I went to the server, fired up mysql on the command line, and discovered that there were NO DATABASES! I tried to get a file listing of the /var/lib/mysql directory and got nothing. Nada. Since I can’t seem to get anything off of that disk I did a quick modification of the fstab (removing the line mounting that drive), rebuilt the dbs from the last (failed) backup, and here I am, missing two months’ worth of data.

Can I cry now?

In case you are wondering, the Margin vs Markup page is still available (as I made it a regular page as well as a post, it was my most popular ever).

EDIT: Ah the joys of using decade-old equipment.

Development

Error Handling and the PHP @ Operator

I have been trying to debug a plugin for WordPress (Shorten2Ping – I will keep plugging this because I think it is so nifty!) and I was running into a problem where the plugin would silently fail with nothing in the logs, no error printing to screen, just dead silence.

I turned on display_errors in php.ini for a while to see if anything would show up. Still nothing. So I started to look through the file again. I knew it was getting as far as creating the short url in bit.ly before it died, but nothing was getting entered into the database. So I started through the make_bitly_url() function, and what jumped out and slapped me in the face? $json = @json_decode($response,true); That little, innocuous-looking @ was gulping the error message from a fatal error! (Namely, “Call to undefined function json_decode()”.) It turns out that I had PHP compiled with --disable-json, which is the default for Gentoo unless you have json in your USE flags.

According to the PHP docs for the Error Control Operator @:

Currently the “@” error-control operator prefix will even disable error reporting for critical errors that will terminate script execution. Among other things, this means that if you use “@” to suppress errors from a certain function and either it isn’t available or has been mistyped, the script will die right there with no indication as to why.

So, if you really must suppress error messages, do so, but do so with care. In the case where a suppressed error may be fatal (as in this case) be sure to add documentation to that effect. As in, “If this dies a silent death it may very well be that you do not have function xyz() enabled.”

And, note to self, when debugging PHP, the first thing to do is look for and remove the error control operator.

Security

Interesting log activity

While trying to debug the Shorten2Ping plugin (a really nifty thing, if I could get it working) I went digging through my Apache error logs looking for any PHP errors. (Well, okay, I didn’t actually dig, I just did a tail on the file.) What I saw was interesting, even though it did not help the debugging at all. In fact it kind of derailed the whole process. What I saw was an obvious attempt to find Horde on my server (which I did run temporarily a few years ago). My first guess was that there was a new exploit out for Horde. I did some digging around and found that, yes, indeedy, there is. I found the details of the exploit at securityvulns.com (which is a mirror of, or mirrored by, www.security.nnov.ru, which is where the first relevant Google link took me). Oddly enough I have not seen this show up on any other security sites yet, even though I see that the report on securityvulns.com is from March.

Anyhow, in case you are curious, here are the relevant lines from the log. (IPs have not been changed to protect the guilty.)

[Sat Jun 06 01:46:53 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/README
[Sat Jun 06 01:46:53 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/horde
[Sat Jun 06 01:46:54 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/horde2
[Sat Jun 06 01:46:55 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/horde3
[Sat Jun 06 01:46:56 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/horde-3.0.5
[Sat Jun 06 01:46:57 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/horde-3.0.6
[Sat Jun 06 01:46:58 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/horde-3.0.7
[Sat Jun 06 01:46:58 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/horde-3.0.8
[Sat Jun 06 01:46:59 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/horde-3.0.9
[Sat Jun 06 01:47:00 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/mail
[Sat Jun 06 01:47:01 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/email
[Sat Jun 06 01:47:02 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/webmail
[Sat Jun 06 01:47:03 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/newmail
[Sat Jun 06 01:47:03 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/mails
[Sat Jun 06 01:47:04 2009] [error] [client 81.210.76.194] File does not exist: /var/www/localhost/htdocs/evardsson.com/mailz
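Scans like this are easy to surface after the fact by grouping the “File does not exist” entries by client IP and flagging the noisy ones. A rough Python sketch (the threshold is an arbitrary choice of mine):

```python
import re
from collections import Counter

# Matches Apache 2.x error_log lines like the ones above.
ERR_RE = re.compile(r"\[client (?P<ip>[\d.]+)\] File does not exist: (?P<path>\S+)")

def probe_counts(lines, threshold=10):
    """Count missing-file probes per client IP; return IPs at or above threshold."""
    hits = Counter()
    for line in lines:
        m = ERR_RE.search(line)
        if m:
            hits[m.group("ip")] += 1
    return {ip: n for ip, n in hits.items() if n >= threshold}
```

A legitimate visitor might trip one or two of these (a missing favicon, say); fifteen distinct Horde paths in eleven seconds from one IP is unmistakably a scanner.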
Best Practices

Keeping Gentoo Fresh

I had a conversation with a friend about Linux distros earlier today, and I was asked why I choose to run Gentoo on my web server. He told me that Gentoo was too hard to maintain on a server, and that when it came time to upgrade something (like Apache or PHP) due to security patches it took too long, and too often failed. I was confused by this, so I asked for clarification. What he described was the pain of updating anything on a “stale” Gentoo machine.

Unlike so many of the other popular distros, Gentoo does not, by default, use pre-compiled packages. So unlike doing rpm -i or apt-get install, doing emerge on Gentoo requires that the package you are installing, and any missing dependencies, be pulled in as source code and compiled. When you think about adding packages like, say, Lynx, the process takes only a few minutes on a moderately decent machine. (Mine is a PII 966 and Lynx took about 4 minutes start to finish.) When you talk about upgrading something like Apache, however, the length of time it takes depends not only on the speed of the machine, but on how many of its dependencies are out of date. In fact, if you fail to update regularly you can run into an issue where not only are most of your files out of date, but your system profile is out of date and you need to do some serious wrenching to get the whole thing working again. In the times that this has happened to me (twice), I was able to get the system up-to-date once, and just gave up and reinstalled a newer version the second time. (These were both rarely used VMs, and not production boxes.) However, updating the profile on a “fresh” Gentoo is (in my experience) a painless procedure of rm /etc/make.profile && ln -sf /usr/portage/profiles/profile_name /etc/make.profile && emerge -uND world (uND : update newuse deep: update, take into account new USE settings from the profile and make.conf, and include deep dependencies).

So how do I avoid the “stale” Gentoo syndrome? I take a three-step approach.

  1. A daily cron job runs emerge -puvD world (puvD : pretend update verbose deep : just tells you what would be emerged, in an update, verbosely, and include deep dependencies) and emails me the output. This enables me to see each morning which packages have updates available.
  2. Every day that I have the time for it I log into the machine and run emerge -uD world and follow it up with etc-update (if needed) and revdep-rebuild if any libraries were included in the updates. (I save building new kernels for Sundays, and that doesn’t happen all that often, but I do like to always run the latest.)
  3. I check the messages from emerge to see if there are any special configuration changes that need to happen post-install that cannot be handled by etc-update. For instance, changing configurations in /etc/conf.d/packagename, new profiles or anything of that sort.

Ok, so I like to keep my system on the latest and keep a shiny new everything on it. How does that compare with something like, say, Debian? In Debian (and Debian-based distros) you can update packages to a certain point, after which the package for that version of Debian is no longer supported or updated. So you need to upgrade your version, and your kernel, which you do with apt-get dist-upgrade. Seems easy enough. And how does Gentoo handle version upgrades? It doesn’t need to. If you keep your system up-to-date in the way I described, your system will match whatever the latest Gentoo release has. In fact, I built my web server using Gentoo 2006.0 and have been keeping it up-to-date since then. (Gentoo seems to have stopped doing the biannual releases, btw – they are now releasing updated minimal install CDs nearly weekly for each architecture.)

git

My thoughts on Git

I have been working with Git for a while now, and I have a few thoughts about it. I am used to the SVN model, where multiple developers check out of a central repository, make changes locally and then commit them back to the repo (and repair failed merges). While SVN is not perfect, at least I am comfortable with it and know what to expect from it. While some of my issues with Git may relate solely to my relative unfamiliarity with it, at least it seems like I am not the only one with issues.

In reading through the man pages and all the documentation I could find online (including the Git – SVN Crash Course) I was totally unprepared for what actually happens. According to the Crash Course, git commit -a does the same thing as svn commit, but that is not quite the case. After a commit the changes still need to be moved to the repo. So as I looked for the best way to do that I found git push. It would seem that firing off a git push should mean that I want the files I have modified and committed to merge into the repo I checked them out of. But that isn’t quite what happens if the repo you are pushing to has a checked-out working directory (i.e., it is not a bare repository). It seems that the only way to properly get the files back into the repo is to issue a git pull from inside the repo, aimed at your working branch. Hans Fugal said it best in his post from last year: git-push is less than worthless.

What this means is that I am unable to easily move changes from my laptop to the server. Instead, in order to reduce the number of steps I need to take I check out the repo I am working on (git clone) in my home directory on the server and then use sshfs to mount that directory locally. I do my work there, and then ssh in to the working directory, fire off git commit -a and then cd to the repo to pull from the working directory. Of course this seriously hampers my ability to work offline (as in I can’t). While I am sure there are plenty of good things about Git I am not yet really seeing them. With the exception of git stash I have yet to be wowed by anything Git has to offer. In the interest of fairness, however, I have not been incredibly wowed by anything SVN has to offer either, I just find it easier to use.

PHP

SPDO moving to SourceForge (I hope)

While school has kept me busy during my non-work hours I have not had much time to tend to other projects like SPDO. Seeing that my “Abandoned Project Takeover” has been approved on SourceForge, however, has encouraged me to set aside some time to do some house cleaning on the code to get 1.0b ready to bump to 1.0 stable. According to the information they have given me it takes 2 to 3 weeks to fully process an “APT”, so I do have a little time (although not a lot). This is not the first time I have tried to do this; last time it never made it to the approval stage and languished in the queue until it timed out. So, now that it has been approved and has been in the queue for 4 business days, I am hopeful.

In a lot of ways, moving the project to SourceForge is a lot like making a physical change of residence. It will require updating links, setting up the pages there including wiki, bug tracking, etc, and moving the actual code base over. (Good thing it’s small!)

As always, any suggestions, bugs, or feature requests are welcome, and (for now) should be posted at the page set aside for just that.

Gentoo

Gentoo: Spamassassin crashed! What?

Life has been rather hectic lately, and a few things have fallen by the wayside in the interim, including simple maintenance tasks like, oh, mowing the lawn, and cleaning out the folder of spam for Bayesian analysis, and other fun items. So when my inbox started to be flooded with what seemed like ten times the normal amount of spam I get (very little of which Mail.app leaves in the Inbox) I decided to take a quick look at the server and see what was going on.

I ran a quick /etc/init.d/spamd status and got the response *status: crashed – Uh oh. A quick check of running processes showed that spamd (the SpamAssassin daemon) was indeed not running. I did a quick restart and thought nothing of it. Spam traffic slowed right back down. For about 10 hours. Suddenly I was getting hammered again. Checked the server and once again *status: crashed – “Ok”, I thought. I restarted it yet again and then immediately checked the status. Again it showed as crashed. Checking the running processes with ps -a | grep spamd showed something different, however. It showed the SpamAssassin daemon happily chugging along.

A bit of searching brought me to the relevant bug report for Gentoo. It seems that there is something in the way SpamAssassin starts the daemon that causes the pidfile to not be written, which leaves OpenRC with the notion that it has crashed. The fix is there in the bug report, though. By changing the spamd init file ( /etc/init.d/spamd ) to force a pidfile, SpamAssassin starts and OpenRC is happy as well. All by adding the --pidfile line you see below:

        start-stop-daemon --start --quiet \
                --nicelevel ${SPAMD_NICELEVEL:-0} \
                --pidfile ${PIDFILE} \
                --exec /usr/sbin/spamd -- -d -r ${PIDFILE} "" \
                        ${SPAMD_OPTS}

Now if I could just find a one-line fix for an overgrown lawn. And dandelions …

[Edit]:

The actual crashing was due to the overgrown spam folder for Bayesian analysis. It seems that running the sa-learn utility against a 6 GB directory causes all sorts of Bad Things™ to happen, as it parses every file (again) to determine whether it can “learn” from it. I cleaned out the directory and have had no problems since then.

Home

New Beginnings

Just a short post to update all my regular readers out there (yes, both of you): I start my new job next Wednesday and I started school full-time today. Yes, today. I have started in the BS – IT Security Emphasis program at Western Governors University. I don’t have a projected finish date yet, but their competency-based model means that I can finish my degree much faster than with a standard program. (Word from one of the admissions counselors is that IT professionals going into the IT program tend to finish their Bachelor’s in around 2 years.)

Although I shouldn’t have to remind anyone I will: everything I say (well, type) here is my opinion and does not reflect the views or opinions of my employer or the school which I attend. With that said I have only one small grievance about my first day at WGU: every link in the student portal has a target of _blank or uses a js call to open a new window. Grrrr. My opinion: bad design idea.