Sun, 05 Oct 2008

qemu IP address patch

I sometimes use the qemu virtualization system, or its cousin kvm, for creating virtual computers to test software in. Conveniently, qemu makes networking those really easy.

Unfortunately, the IP addresses it assigns for virtualization happen to be in the same subnet as my desktop at work (at CC, 10.0.2.x). I had some fear of changing a piece of software as presumably complex as qemu.

I forged ahead and came up with a patch that I posted to the qemu-devel mailing list. I'm just writing this post in case someone wonders, "How can I change the IP address of the user net layer used by qemu to avoid a conflict?"

The answer is as easy as replacing the string "10.0.2" with "10.0.3" globally across the qemu codebase and recompiling. If that mailing list post ever goes away, I have a local copy of the patch.
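For anyone who would rather script that search-and-replace than do it by hand, here is a minimal Python sketch. It is not the patch from the mailing list; the function name replace_in_tree and its parameters are made up for illustration. It assumes a naive byte-for-byte substitution is acceptable, so it would also rewrite a string such as "10.0.20", and it makes no attempt to skip version-control directories.

```python
import os

def replace_in_tree(root, old="10.0.2", new="10.0.3"):
    """Rewrite every regular file under root that contains `old`; return the changed paths."""
    old_bytes = old.encode()
    new_bytes = new.encode()
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Leave symlinks and special files alone.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as f:
                data = f.read()
            if old_bytes not in data:
                continue
            with open(path, "wb") as f:
                f.write(data.replace(old_bytes, new_bytes))
            changed.append(path)
    return sorted(changed)
```

Run it against a scratch copy of the source tree first and review the list of changed files before recompiling.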

(This work was sponsored by CC, but pending an okay from CC, you should be free to use it under the terms of the WTFPL.)

[/sysop] permanent link

Toasted flash drive

I just got an email. (For background, Matt B. is my flatmate's name.)

From: Travis M.
To: Asheesh Laroia
Subject: Matt B. left the oven on!!
I am here in the park with Matt. He left the oven on, with a Flash Drive in there
no joke!

As it happens, this email was real, not malarkey.

[/debian] permanent link

Sat, 04 Oct 2008

What are your most expensive websites to run? Patching Apache to find out

When running a busy web server, one may want to know how much server time is spent preparing each request. That would be especially useful if broken down per web site you host. Server processing time reflects things like how long MySQL queries took or how loaded the disks are; in general, it is a measure of how difficult it was to answer a request. It may also be interesting to compare the server time spent processing a request today to the same request's time in the past, as an indication of how system changes (upgraded disks, more complex filesystem) have affected your ability to process web requests.

Apache's mod_log_config lets you log how long a request takes from start to end, which includes the amount of time taken to send the actual data. That can be imagined as server_processing_time + time_to_send_data_to_client. I wasn't interested in seeing how slow or fast clients' net connections were.

In a project I named vhost_effort, I wrote a patch to Apache to be able to log just that server time spent from the start of the request to when the request is ready to be sent. That work was done at Creative Commons, and the software results are available under the Apache 2.0 license. vhost_effort.py is a hack that generates a pie graph for how much server time is spent on each vhost (among other sorts of visualizable statistics). I began thinking of using a visualizer for disk usage to make the pie graph interactive, but by the time I was nearly done working that out we had already gathered all the data we needed.
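To give a feel for the per-vhost arithmetic behind such a pie graph, here is a small Python sketch. It is not vhost_effort.py. The input format (one "vhost microseconds" pair per line) and the name effort_by_vhost are assumptions made purely for illustration; the real patch's log format may well differ.

```python
from collections import defaultdict

def effort_by_vhost(lines):
    """Sum server-side microseconds per vhost; return each vhost's share of the total."""
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        vhost, micros = fields
        totals[vhost] += int(micros)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    # Each value is a fraction between 0 and 1, i.e. one slice of the pie.
    return {vhost: total / grand_total for vhost, total in totals.items()}
```

Feeding it the lines of a log file gives a dictionary of fractions that any plotting tool could turn into slices.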

My projects page has a link to the code in the Creative Commons Subversion repository. I also wrote about this at labs.creativecommons.org a year ago.

[/sysop] permanent link

Sun, 28 Sep 2008

Colorizing standard error: Adventures in LD_PRELOAD

Kristian again asked an interesting question on the SF-LUG mailing list. This time, it was: "How can one get stderr and stdout to appear in different colors?" He was asking on behalf of someone, in turn on behalf of a Java programmer.

I thought about this and discussed it with Jesse Zbikowski, who I happened to be sitting next to at the Tenderloin Computer Help Day that Christian Einfeldt invited the list to (which turned out to be a lot more interesting and orderly than I had imagined!).

Jesse and I talked and we thought of named pipes, which Jesse got to work on and produced a nice Perl tool for. I thought about LD_PRELOAD and got off to a few false starts, and finally came up with a tool I called stderred (tarball of v1.2). It includes a demo program in Java and a README.

LD_PRELOAD

LD_PRELOAD wrappers are a way to change the way a program executes by replacing library functions, like write() or gettimeofday(), with your own homebrew versions. You can think of the dynamic linker as allowing you to stack your own things "above" the C library, but "below" the actual program that runs. So in looking for a symbol (a function name, typically), the program searches down until it finds it, and uses that.

"stderred" is a C program and a Makefile that you can demonstrate works properly; it includes a sample Java program and a README. Because it intercepts the Java JRE's calls to write() to write out messages to stdout, stderr, or whatever, and only modifies the ones to stderr, it should be safe to use everywhere. Plus there are no race conditions; it runs right in the context of the program, so it also avoids the performance penalty of context switches.

This LD_PRELOAD wrapper is interesting, I think, because (thanks to Eric Northup for the idea) it calls the real system write() function by yanking it out of libc using dlopen()+dlsym(). I was also (you can see this in the first few revisions) trying a #define hack to get access to libc definitions without the real symbols; however, this failed at link time. I don't see how it could work.
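The look-up-the-real-function-then-forward idea can be illustrated from Python with ctypes, which uses dlopen() and dlsym() under the hood. This is only a sketch of the idea, not stderred (which is C and is loaded via LD_PRELOAD). The names colorizing_write, RED and RESET are hypothetical, and it assumes a POSIX system where ctypes.CDLL(None) exposes libc's symbols.

```python
import ctypes

# CDLL(None) is dlopen(NULL): a handle on the symbols already loaded in this process.
libc = ctypes.CDLL(None)
# Attribute access performs the dlsym() lookup for "write".
real_write = libc.write
real_write.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_size_t]
real_write.restype = ctypes.c_ssize_t

RED = b"\x1b[31m"    # ANSI escape: switch to red
RESET = b"\x1b[0m"   # ANSI escape: back to normal

def colorizing_write(fd, data):
    """Forward data to the real write(), wrapping it in red only when fd is 2 (stderr)."""
    if fd == 2:
        data = RED + data + RESET
    # Returns the number of bytes the real write() accepted; short writes are not retried.
    return real_write(fd, data, len(data))
```

Calling colorizing_write(2, b"oops\n") should print a red line on a terminal, while writes to any other descriptor pass through untouched.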

The problem with named pipes: Buffering can change the order of outputted lines

Jesse pointed out to me that the named pipe approach has a serious buffering issue related to timing: if the process writes to stderr and stdout in quick succession, the lines could appear colorized in the wrong order. Jesse showed me some variations of his script that changed which wrong order it generated, but we couldn't quite figure out how to make it always right. This seems like a race condition to me.

That's because when the named pipe in question is read from, the Perl script doesn't know *how much* to read. So in this case:

       one line to stderr
       one line to stdout
       one line to stderr

After Jesse explained this to me a few times, I understood it would get printed as either:

       one line to stdout
       one line to stderr
       one line to stderr

or the same with stderr's lines on top. Note that the interweaving is gone; this is because the information of how *much* was printed each time is thrown away by the OS. Because the read()s are happening out-of-process in both the ZSH and Perl ways to do this, I don't see how they could get around this issue. An implementation based on select() or epoll() would have the same issues, I believe.
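The loss of interleaving is easy to reproduce. The Python sketch below uses two anonymous pipes from os.pipe() as stand-ins for the named pipes, and it reads only after all three writes have happened, which is the worst case for an out-of-process reader. The function name drain_after_writes is made up for illustration; this is not Jesse's Perl tool.

```python
import os

def drain_after_writes():
    """Write err/out/err to two pipes, then read each pipe as an external colorizer would."""
    out_r, out_w = os.pipe()
    err_r, err_w = os.pipe()
    # The program's three writes, in their true order.
    os.write(err_w, b"one line to stderr\n")
    os.write(out_w, b"one line to stdout\n")
    os.write(err_w, b"one line to stderr\n")
    os.close(out_w)
    os.close(err_w)
    # The reader gets two byte streams, with no record of how the writes interleaved.
    out_data = os.read(out_r, 4096)
    err_data = os.read(err_r, 4096)
    os.close(out_r)
    os.close(err_r)
    return out_data, err_data
```

Both stderr lines come back glued together in a single read, so the reader has no way to know that a stdout line belongs between them.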

Why my solution doesn't work for "ls"

stderred is as simple as it is because it only overrides write(). The JRE only seems to use write(), not any of the helper functions like straight-up printf(), or error(), or fprintf(), that also write to file descriptors. Unfortunately, if you try to stderred-ify "ls", none of stderr appears red! That's because ls uses fprintf_unlocked() and error(), which themselves *inside libc* call write().

If you think of ls as standing on top of a library stack that looks like this:

       ls
       [stderred]
       [libc]

and you know that symbol resolution only looks "down," it's clear that the functions *inside libc* don't go back *up* to stderred to find my hacked write(). So they use the libc write(), which doesn't colorize.

Therefore, I started down the long road of modifying "all the important" functions to colorize if the output was going to stderr. Trying to colorize "ls" is where I started, so I wrote quite a few of those before actually checking what Java used. "ls" nearly gets colorized properly; you can look through the with_error branch for the latest work down that path. But I stopped once I figured out Java seems okay with just write(), and for cleanliness's sake I left that out of the released version (currently 1.1). Patches welcome!

zsh, python, and further reading

According to the Gentoo-Wiki, zsh users have an easy way to enable colorizing stderr. I know little about zsh but something about UNIX, and it seems to me that when they fork to run the new program, they close() fd #2 (stderr) and open it as a pipe to this program. I don't see how they solve the races brought up by the Perl thing; it seems to me they'd have the same race.

This is the same path that Jesse and I started down in the beginning; we read http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-3.html and noticed it didn't discuss setting stderr to a pipe, and then we talked about named pipes....

The Pythonic way to do this would have been to "simply" globally override what "sys.stderr" is. I don't know if such a thing is possible in Java.
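A minimal sketch of what that override could look like follows; RedStream and install_red_stderr are hypothetical names. It only catches writes that go through Python's sys.stderr object. Anything that writes to file descriptor 2 directly (C extensions, child processes) bypasses it, much as libc's internal write() calls bypass an interposed write().

```python
import sys

class RedStream:
    """File-like wrapper that surrounds every write with ANSI red escape codes."""

    def __init__(self, stream):
        self._stream = stream

    def write(self, text):
        return self._stream.write("\x1b[31m" + text + "\x1b[0m")

    def flush(self):
        self._stream.flush()

    def __getattr__(self, name):
        # Delegate everything else (fileno, isatty, ...) to the wrapped stream.
        return getattr(self._stream, name)

def install_red_stderr():
    """Replace sys.stderr for the whole interpreter; return the original stream."""
    original = sys.stderr
    sys.stderr = RedStream(original)
    return original
```

After calling install_red_stderr(), tracebacks and print(..., file=sys.stderr) output should show up in red on a terminal that understands ANSI colors.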

You can read a quick tutorial on LD_PRELOAD in the IBM DeveloperWorks article, "Override the GNU C Library -- Painlessly." You can read a lot more about dynamic linking in the exhaustive "How To Write Shared Libraries" by Ulrich Drepper.

[/software] permanent link

Fri, 26 Sep 2008

Load average

sh-3.1 $ uptime
12:10:16 up 20 days, 18:54,  4 users,  load average: 680.29, 656.27, 636.17

Huh.

[/debian] permanent link

Fri, 22 Aug 2008

dd, dd_rescue, and ddrescue

The short answer: "Use GNU ddrescue. GNU stands for Quality."

dd is a classic UNIX utility to read from and write to files (often devices). Typically, one uses it to copy a hard disk to a file, or to image a hard drive by copying a backup onto it.

One hits a problem when the hard disk has errors. In this case, dd abruptly stops working in the middle, reporting an "Input/output error." But when the hard disk has errors, usually what you want is to get an image of all the blocks on the hard disk that are readable - not just the first few before the first error!

(Note for the pedantic: Yes, I know about dd conv=notrunc,noerror. They're so easy to misuse (mostly by forgetting one of those two options) that they're worth avoiding.)

Two tools are available for this particular purpose. Confusingly, one is called ddrescue, and the other is called dd_rescue.

Around 2001, Kurt Garloff wrote dd_rescue. It does what dd does if you pass it some options, but it comes with instructions on how to use it to recover data from drives, like by running it multiple times or backwards. A wrapper script called dd_rhelp automates that process.

When you're running dd_rescue on an obscure OS like Mac OS X 10.3 because you dropped your laptop in Uganda and the Linux partition grew bad blocks and you still want your data, you will find that dd_rhelp is written as a complicated shell script that relies on GNU versions of core system utilities. OS X provides non-GNU versions, and you will waste hours fiddling with compiling those utilities just so you can run some dumb shell script.

In the summer of 2004, the same summer I dropped my laptop, Antonio Diaz Diaz wrote "ddrescue," a stand-alone C++ tool that does the same things as dd_rhelp, but more sanely and therefore more efficiently. It became an official GNU project. GNU ddrescue, like dd_rhelp, can keep a log file to let itself gracefully pick up after interruptions.
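The core idea these tools share, carrying on past unreadable blocks and remembering where they were, can be sketched in a few lines of Python. This is emphatically not how GNU ddrescue is implemented (it is far smarter about retrying and splitting bad areas). The function name copy_skipping_errors and its parameters are invented for illustration, and the caller is assumed to know the source size already.

```python
import os

def copy_skipping_errors(src_fd, dst_fd, size, block_size=512):
    """Copy size bytes block by block; zero-fill unreadable blocks and return their offsets."""
    bad_offsets = []
    for offset in range(0, size, block_size):
        want = min(block_size, size - offset)
        try:
            data = os.pread(src_fd, want, offset)
        except OSError:
            # e.g. "Input/output error": note the spot and keep going.
            data = b""
            bad_offsets.append(offset)
        # Pad short or failed reads so every later block stays at its original offset.
        data = data.ljust(want, b"\0")
        os.pwrite(dst_fd, data, offset)
    return bad_offsets
```

The returned list plays the role of a very crude log file: a later pass could retry just those offsets instead of rereading the whole disk.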

When your hard disk fails, you should turn to your backups. But if you need a tool like these, just remember: "GNU ddrescue."

$ sudo apt-get install gddrescue

[/sysop] permanent link

Wed, 20 Aug 2008

Lamers

Kragen Sitaker and his wife Beatrice were very gracious in hosting me and my brother for a week in Buenos Aires.

I was looking for something on Kragen's website and found a ten-year-old discussion of how to find security problems in software. In it, he writes:

Body text last updated 1998-07-22. Recently has become the most popular page of mine, presumably because a bunch of lamers want to learn how to break into things. [...]
I wouldn't be surprised if calling 100-200 people a day `lamers' results in electronic attacks on me or my machine (kragen.dnaco.net.) All I can say is that people who do this would thereby demonstrate their lamosity.

Lamers, you say? Nelson took this picture of me a few years back. Look at the thumbs-up from the driver!

(Photo available for re-use under Creative Commons Attribution-ShareAlike 2.0.)

Note: Mako addressed this topic earlier this year, and then again more recently.

[/ba-2008] permanent link

Sun, 17 Aug 2008

Fake Out in Buenos Aires

"Falso," he said.

I accepted the 100 peso (US$30) note back. The only places we had gotten 100 peso notes were ATMs.

I found a different one with a good watermark and handed it to him. (This happened a bit over a week ago.)

[/ba-2008] permanent link

Fri, 15 Aug 2008

Hello Planet Debian

I have a face on Planet Debian!

(Thanks to John Wright for setting it up for me!)

[/debian] permanent link

Mon, 04 Aug 2008

Francisco

Francisco is the name of the very energetic hostel attendant at America del Sud El Calafate.

After offering me a key (literally) for the wireless, he told me the password.

"What are you doing there?," he asked me. "It's email," I answered.

"Email? And how can you see? I can't see any letters." (The fonts are pretty small on my laptop.) "What program is that?"

"Pine," I said. "It's called Alpine."

He paused for a moment, and reported, "You look like a hacker with that." He patted me on the shoulder and wandered off.

[/ba-2008] permanent link