Fri, 05 Mar 2010
What are you avoiding working on?
(Cross-posted from the OpenHatch blog.)
I remember working on my first big group programming project back in college. The project involved some web scraping work; I enthusiastically took charge of that part. But a few weeks in, my code just wasn’t working. I felt frustrated and helpless — I’m supposed to be the scraping expert, so why couldn’t I fix the problems? I retreated into hiding and didn’t want to think about the group of people or the project.
A week later, I feebly ran svn update to see how the project had progressed. “Oh!” I exclaimed to myself. “Someone made the data importer work!” I felt a rush of relief. (While writing this paragraph, I sighed again remembering it.)
When I talked to my teammate George, he mentioned off-handedly that he had fixed it. He didn’t feel any of the anguish I felt or assign me the blame I thought I deserved. I guess if I had just asked for help earlier, I could have skipped the feelings of inadequacy entirely and George would have just fixed the bug.
Half a decade later, I feel the same dread about the lack of Maildir support in Alpine. The bug is three years old! Ugh.
This time, I’m going to ask for help. So I listed the issue on the Alpine project page on OpenHatch. To put it there, here’s what I did:
- Go to the OpenHatch projects list and enter your project’s name. That will take you to your project page.
- Answer the last question, “What is a bug or issue that you’ve been putting off, neglecting, or just plain avoiding?”
- Hit submit! We’ll help you sign in with your Google account or other OpenID.
For those of you who work on Free Software projects, what are the issues that drain you the same way?
No bug tracker I’ve seen has a field that says, “I’m avoiding working on this, and that sucks.” To say that, list the issue on your project’s OpenHatch page.
So join in! What are the issues you don’t want to think about? Once you share them, maybe a fellow developer or a new contributor will come by and help you out.
Head to OpenHatch and let the world know.
P.S. Do you have any ideas about how we can make these project pages more useful? Let us know!
P.P.S. Dear Joey Hess and everyone else, sorry that Alpine still doesn’t have Maildir support.
[/debian] permanent link and comments
Thu, 11 Feb 2010
Google and University email
Should universities switch away from hosting their own email and join the Google bandwagon?
Adi Kamdar wrote to the Students for Free Culture discussion mailing list linking us to a Yale Daily News article discussing an "almost-definite" switch for Yale to Google Apps for Education.
Adi asked for thoughts, so here are mine.
Privacy
One interesting thing about the Gmail option is that, when deployed for all students, students have no choice but to let Google read their official email.
Some students might take part in activism that they want to shield from Google, a corporation with its own interests, or the attackers that attempt to break into Google's systems (see the recent attacks from crafty pro-Chinese-government hacktivists). Before a switch to Gmail, each student could choose if Google was the kind of company they wanted to share their email with. After a switch to Google Apps, the choice is made for them.
But this highlights a different issue: Before a switch to Google, students have no choice but to let university administrators read their official email. That's not necessarily optimal, either. Others have pointed out in this thread that this option comes with legal advantages with regard to privacy law. At least that's some consolation.
It's still true that students can encrypt their email and make it difficult for any eavesdropper to figure out the bodies of their emails. But that's no use for hiding the identities of the people with whom they communicate -- no email crypto I know hides email addresses.
Software freedom and open standards
This is juxtaposed against another difficult situation: Matt Senate wrote about the sucky SquirrelMail system that Berkeley uses (used?) for webmail. The fact that SquirrelMail is Free Software is small consolation for Matt. From his perspective, because it's a hosted web application, he has no more freedom than he would have with Gmail.
At least Berkeley's IMAP server followed standards! That's more than we can say for the Gmail IMAP server, which is famous for basically supporting just enough IMAP for Microsoft Outlook to work.
But standards compliance is small consolation. If the university email server "properly" supports IMAP, but isn't fast or doesn't provide the new extensions that make threading or search speedy, it's not much relief to know that you can use any client you want to slowly read your email.
Internet history
Truth be told, University-hosted Internet services are based on what you might call the original Internet perspective. The Internet began as a network of networks. A University was a network island unto itself, running its own email, news, and Ethernet services. When inter-network connection was available, you could email people at other institutions. When it wasn't, well -- the Internet is a useful tool, but it's down right now. "Don't worry," the admins might tell you, "you can still read your email with PINE."
It used to be that inter-network connectivity was icing on top of the "real" network a person used.
Today, inter-network connectivity is the whole point of a network connection. How embarrassing for each individual network! To test if our connections are working, we skip right over the local content and point our web browsers at a search engine.
Users today aren't satisfied to read their email from imap.institution.edu and read USENET news -- they thirst for real-time access to resources available beyond the university. Students weren't interested in the J-Stream service I helped set up at Johns Hopkins; instead they mostly posted and watched videos at YouTube. They don't really care if the student-oriented wiki is based on campus or instead halfway across the globe (say, in Japan).
The original Internet was based on autonomous networks and opt-in routing. But eventually, all the networks opted in. Users drive everything, and when they don't get what they want, they vote with their feet. Companies like Facebook and Google stand ready and armed to provide shockingly-efficient services to millions of users who choose them. You could say that with today's network, the autonomy shifted from the network to the users.
The nice thing about University-run services is that students can organize and ask for changes, as Fred pointed out. And for people like me, there's something nice about knowing the person who runs your email system.
But if your busy university staff doesn't have time to investigate an email server with fast full-text indexing, you might wish for change. Having the university tear down its internal services is a progression toward seeing its network as simply transit.
Imagine the loss of pride. It used to be that the university personally ran a system for helping users get what they wanted. As it becomes simply transit, the staff are just greasing the cogs of a larger, invisible machine that's easy for users to take for granted.
Some netizens like me hold email as sacred, a beautiful institution based on standards and a decision to interoperate. When your university switches to Gmail, I'll be sadder, but maybe what you'll get is professors who can spend more time with students and less time configuring desktop software.
Trade offs
Every university has a choice: Pay hundreds of thousands to millions of dollars a year for dedicated staff to run an in-house email system, or let Google do it. Think for a moment of what good could come from those dollars when put to use in other ways for students.
Your school could start a switch to all-organic food. It could start paying more of its employees a living wage. Imagine the travel funding for student activities that can come from hundreds of thousands a year well-spent. It could run a massive used textbook clearinghouse to help students avoid pouring their dollars into the textbook industry.
And now cry with me. What I've asked you to do is to consider sacrificing institutional autonomy for cold, hard cash. That's to say nothing of the ecological benefits or the productivity increases possible from having Google's paid experts run this part of the computing system.
Conclusion
Is an official Google email system much different than the reality most students I know live, which is configuring their student email address to forward to gmail.com?
For those of us who would be sadder with one more push toward centralizing email with Google -- for those who see it as the behemoth whose size threatens the decentralization that used to be the core of the Internet -- I ask you to think positive. "See the profit from your loss."
I have no conclusions for you, just niggling questions.
[/debian] permanent link and comments
Sun, 07 Feb 2010
Cosmetic carbon copy: The opposite of blind carbon copy
I propose a new feature for email software: Cosmetic carbon copy. CCC works very similarly to today's carbon copy feature: If you put an address in the CCC: list, when the recipients open the message, they can see the address. The one difference is that with "cosmetic" carbon copy, the CCC:d address never receives the message.
An example might illustrate the situation. Let's say someone (Alice) wants to invite Bob to a movie but doesn't want Charlie to come. She might send this email:
When Alice sends the email, Bob receives a copy. Bob thinks that Charlie received a copy, since Charlie appears in the CC list. But Alice's mail software never sent Charlie a copy. So Alice can relax, knowing that Bob won't forward a copy to Charlie, since Bob thinks he already received it.
Alice and Bob will meet up without Charlie, and Alice will quietly sigh in relief.
How can this work?
Cosmetic carbon copy works on the same principle as blind carbon copy: the contents of the message are fundamentally independent from the recipients.
Savvy email users have used the "blind carbon copy" feature for years. When you place an address in the BCC list, that address will receive a copy of the email even though other recipients won't see the address listed. This allows for a form of privacy between the sender of the email and that hidden recipient.
This works because, like postal mail, email messages are delivered using an "envelope" that contains the actual recipients. When you compose a message in Alpine, Gmail, or Thunderbird, you are writing the contents of the envelope. When you hit send, the mail software looks for email addresses that should receive a copy. It creates one envelope per address, stuffs the letter inside, and sends the whole thing on its way through the Internet using a protocol called SMTP. When the recipient (Bob)'s email system receives it, the letter is pulled out of the envelope and placed in an inbox.
This means the envelope is entirely invisible to Bob, the recipient. Most mail software puts information about the envelope in email headers, but since each email recipient gets a separate envelope, Bob sees information only about his own. In the case of cosmetic carbon copy, Alice's letter claims that Charlie was on the CC: list, and Bob can never learn that Charlie's copy was never sent.
So to implement cosmetic carbon copy, Alice's mail program must understand this rule:
Because of the last bullet point, this "carbon copying" is purely cosmetic rather than functional.
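The split between the letter and its envelopes can be sketched in a few lines of Python. This is only an illustration, not a specification: the function names, the example.org addresses, and the mail server name are all made up, and it reuses the ordinary Cc: header to carry the cosmetic address. The message headers list Charlie, but the list of envelope recipients leaves him out.

```python
import smtplib
from email.message import EmailMessage
from email.utils import getaddresses


def build_message(sender, to, cosmetic_cc, subject, body):
    """Write the letter. The cosmetic addresses go in the visible Cc: header."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(to)
    msg["Cc"] = ", ".join(cosmetic_cc)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def envelope_recipients(msg, cosmetic):
    """Return the addresses that should actually get an envelope.

    Everyone named in To: and Cc: is a candidate, except the
    cosmetically copied addresses.
    """
    headers = [str(h) for h in msg.get_all("To", []) + msg.get_all("Cc", [])]
    skip = {addr.lower() for addr in cosmetic}
    return [addr for _name, addr in getaddresses(headers)
            if addr.lower() not in skip]


def send(msg, recipients, host="smtp.example.org"):
    """Hand the letter to an SMTP server (hypothetical host; not called here)."""
    with smtplib.SMTP(host) as server:
        # to_addrs sets the envelope; the headers inside msg are untouched.
        server.send_message(msg, to_addrs=recipients)


if __name__ == "__main__":
    letter = build_message("alice@example.org", ["bob@example.org"],
                           ["charlie@example.org"], "Movie?", "See you there.")
    print(letter["Cc"])
    print(envelope_recipients(letter, ["charlie@example.org"]))
```

Bob's copy still shows Charlie in Cc:, while the envelope list contains only Bob.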
A possible privacy problem
You might wonder, what happens when Bob hits reply-all to say "Yes, I'm coming"? Then (oh no!) Charlie might see the invitation.
Not a problem: Most of my friends use Gmail, and Gmail users almost always hit Reply instead of Reply-All. Gmail used to have an experimental option called "Reply All by Default", but thankfully they removed that option.
A future version of this specification might suggest mangling the last "." in the CCC:d email addresses with the ONE DOT LEADER character, which looks like "." but isn't one. That way, even if Bob clicks "Reply all," Charlie's email address is subtly incorrect and will bounce.
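A minimal sketch of that mangling, assuming the ONE DOT LEADER is Unicode code point U+2024; the function name is invented for illustration:

```python
import unicodedata

ONE_DOT_LEADER = "\u2024"  # looks like "." but is a different character


def mangle_last_dot(address):
    """Replace the last "." in an address with ONE DOT LEADER."""
    head, dot, tail = address.rpartition(".")
    if not dot:
        # No "." at all: nothing to mangle.
        return address
    return head + ONE_DOT_LEADER + tail


if __name__ == "__main__":
    mangled = mangle_last_dot("charlie@example.org")
    print(mangled)
    print(unicodedata.name(mangled[-4]))
```

Whether a given mail client would display, accept, or bounce such an address is not something this sketch checks.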
Future work
The story I've told here is marginally simplified so it's understandable to a less-technical audience. Those who want a more technical version can request I submit an RFC. Perhaps in two months?
(File under "games to play with email.")
[/debian] permanent link and comments
Sat, 02 Jan 2010
So are we
Asheesh: "I got that Orangina poster on the French exchange in 2001. Some of my things are from a while ago, at this point."
Rebekah: "So are we."
[/people] permanent link and comments
Fri, 01 Jan 2010
Detecting stale versions of WordPress
I run a personal server that hosts web space for a few friends. Probably the most popular thing to do with the space is to install WordPress and run a personal blog.
A few days ago, I discovered some attackers were abusing one of the sites. Once we upgraded the site to the latest version of WordPress, the attack went away. So I wrote a tool that, every night, emails me a report of the locations of old versions of WordPress. A sample email:
Eek! WordPress 2.5 is old!
How it works
Each time it runs, it looks at wordpress.org to see what the current version is. The code to do that is written in Python and uses lxml.html. It prints the current version in the report, and it uses it when analyzing WordPress installs.
To analyze WordPress installs, it executes locate readme.html, looking for WordPress's tell-tale documentation file. For every such readme.html, if it matches a simple regular expression suggesting it's a WordPress readme file, it performs the following analysis:
If it found any installs of old WordPress, it prints a report like the one blockquoted above to stdout.
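A rough sketch of that flow in Python follows. It is not the tool's actual code: the regular expression, the function names, and the assumption that a WordPress readme.html contains the text "Version X.Y" are guesses made for illustration, and the current version is passed in as a parameter rather than fetched from wordpress.org with lxml.html.

```python
import re
import subprocess

# Assumption: a WordPress readme.html mentions "WordPress" and "Version X.Y".
VERSION_RE = re.compile(r"Version\s+(\d+(?:\.\d+)+)")


def parse_version(text):
    """Turn "2.9.0" into (2, 9) so versions compare numerically."""
    parts = [int(p) for p in text.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()  # treat 2.9 and 2.9.0 as the same version
    return tuple(parts)


def extract_version(readme_html):
    """Return the version string from a WordPress readme, or None."""
    if "WordPress" not in readme_html:
        return None
    match = VERSION_RE.search(readme_html)
    return match.group(1) if match else None


def find_readmes():
    """Ask locate for every readme.html on the system (needs a locate database)."""
    result = subprocess.run(["locate", "readme.html"],
                            capture_output=True, text=True, check=False)
    return [p for p in result.stdout.splitlines() if p.endswith("/readme.html")]


def stale_installs(readmes, current_version):
    """Given (path, html) pairs, list the installs older than current_version."""
    current = parse_version(current_version)
    stale = []
    for path, html in readmes:
        version = extract_version(html)
        if version is not None and parse_version(version) < current:
            stale.append((path, version))
    return stale


if __name__ == "__main__":
    sample = [("/home/friend/blog/readme.html",
               "<h1>WordPress<br /> Version 2.5</h1>")]
    for path, version in stale_installs(sample, "2.9.1"):
        print(f"{path}: WordPress {version} is old")
```

Keeping the file search (find_readmes) separate from the comparison (stale_installs) means the comparison can be tried on sample text without a locate database.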
How to use it
To get emails, I run it with cron. You can add a stanza like this to your crontab (edit it with crontab -e):
Ta-da, nightly reports.
To get a copy
Do a git clone:
or browse its gitweb.
Feedback
I'm quite interested to hear what others do to avoid old web apps being attacked. If there's another bit of software that monitors web apps for needing upgrades, I'd love to hear about it! Obviously if you have feedback on this tidbit I wrote, let me know.
(To me, apt-get doesn't seem to be the answer. Web apps (especially PHP ones) don't usually seem to support keeping the code in one place with multiple different configuration files. And users get excited about the latest and greatest and don't want to wait for me to upgrade, and I can't blame them.)
If some of you don't like the Python dependency or anything else, I do welcome patches!
[/debian] permanent link and comments
Tue, 22 Dec 2009
PyCon 2010: "Scrape the Web," and a poster session
"Scrape the Web," my PyCon tutorial on web scraping, is back this year! Plus I'll be leading a conversation on how to get involved with Free Software from my poster at the poster session.
This year's Python conference takes place February 19-21 in Atlanta, Georgia, USA.
Poster session
This year is the first year PyCon is holding a poster session. My poster is on open source and Free Software for the Python community, focusing on how you can get involved.
It's a plenary session. This means, for 90 minutes, there will be a dozen of us presenters standing in front of our posters hoping PyCon attendees will talk to us. Everyone at PyCon will be milling about, since there will be no talks during the poster session. So stop by!
Web scraping tutorial
I had lots of fun last year talking to a packed room about programming the web. The World-Wide Web is the world's most widely-used distributed computing system; if you're only using it from a web browser, you're missing out. It's a tutorial, which is a paid three-hour course (with refreshments) in a classroom setting. Based on what last year's attendees said afterward at lunch, it seemed the attendees enjoyed themselves too!
From Python, there's a host of choices for pulling information from the web, and a few choices for pushing data back (usually through forms). Here are some topics we'll cover:
I think the most exciting part is the discussion of getting around anti-scraping countermeasures. This is where the rubber hits the road. We'll:
Last year's version is online as a video. If you missed it last year, register for PyCon and sign up for my tutorial, "Scrape the Web." You're likely to learn a lot, and I'm always happy to answer questions during and afterward.
Brian Gershon, one of last year's attendees, explained best:
[/preso] permanent link and comments
Sun, 20 Dec 2009
Anti-depressants and personality shift
Lisa pointed me to a Science News article discussing Paxil, a medicine prescribed for depression. The important bit:
The article explains that, even after "accounting for the extent to which each treatment diminished standard measures of depression," taking Paxil makes you less neurotic and less introverted.
Recently we learned that the placebo effect is getting stronger. This research makes me wonder: if we helped these people adjust these personality traits via, e.g., cognitive therapy, and then gave them a placebo, would they have the same high success in defeating depression as the Paxil takers?
Please understand that I do believe the lived experience of depressed people is terrible. I don't mean to diminish their suffering. I'm wondering here about ways we can help people be happier.
[/drugs] permanent link and comments
Fri, 18 Dec 2009
Diversity in Free Software: South Asians as an example
As someone born in India, I sometimes look around and wonder, Where are the Indians (and other South Asians) in Free Software?
(I don't mean to exclude South Asians from other countries, so I will lump us together. I believe that we are more similar than we are different, although I know more about India than about the rest of South Asia.)
There is no shortage of Indians performing information technology jobs in the United States. The same is true in academia; the Computing Research Association uses National Science Foundation data to show about 15% of computer science bachelor's degrees are awarded to "Asians or Pacific Islanders." These are not precise numbers targeted at South Asians in particular, but they confirm a general feeling that plenty of technologists in the United States are from that part of the world.
South Asia is quite a populous region, coming in at over one billion people. It, too, has plenty of technology workers. So much FLOSS conversation happens in English, and India is well-suited to handle this; English is an "official language". Indian academia reports that there are 350 million English users and about 90 million English speakers.
So let's visually compare the Debian developers map for South Asia (over one billion people) and that of New Zealand, a country of four million.
India:
New Zealand:
These two countries have about the same number of Debian developers (at least, who have marked their location in the Debian LDAP database). About four.
South Asians comprise about one sixth of the world's population. There are about one thousand Debian developers; we represent at best 1% of that. These numbers are comparable to the under-representation of women in Free Software, especially when you compare the figure to South Asians' over-representation in the rest of information technology.
That makes me sad.
Take a look at the Debian developer map again. You'll see that Debian is certainly not an Americans-only project, or even an English-speakers-only project. South America has a respectable dotting of developers, and Western and Central Europe are packed.
I have strong feelings about Free Software. It emerges from an ethos of personal empowerment, and with open source it has become a dominant force in computing. Yet there are plenty of sharp people -- at least women and South Asians -- who, somehow, become culturally excluded from participating.
Why care about diversity?
Consider the diversity of contributors we already have. Some contribute to Free Software because of particular business needs, such as what caused Avi Kivity to write KVM, the new leader in Linux-based virtualization. Everaldo's art background gave us the "Crystal" icon set that set the standard for sharp-looking icons on the Free Desktop for years. Josh Coalson knew about compressing sound, and his Free Lossless Audio Codec is now the standard in high quality audio.
We already have a great deal of diversity. We should be celebrating!
Back in 2001, FLAC's users were celebrating. In that year, I decided to ditch proprietary operating systems because I felt I could achieve all my computing needs in the Free world. A happy user of FLAC myself, I lurked on the mailing list as I watched grateful people thank Josh for the great software he wrote.
Different contributions will excite different sorts of users. The more different people we have improving FLOSS, the more happy users we can make. Happy users of FLOSS are Free users. Happy users can become contributors, putting forth code, documentation, translations, and word-of-mouth marketing.
The first reason to improve diversity in FLOSS is to better suit our users' needs. The more diversity we have in our contributors, the more chance we have of tickling our users in the ways that please them the most. I wish to see an end to software that restricts users' freedom, so I want to see us build the tools that users want.
One thing that pleases me is when I see other people contributing who seem similar to me. When I went to Debconf, I was thrilled to be surrounded by people who cared about software freedom and technical excellence. I had even more fun being social, chatting about rainforests, mutual friends, websites, and music. I might have had the most fun playing the card game Mao.
A second reason, then, to improve diversity in FLOSS is to increase contributor retention by increasing joy. Mao was an example of a cultural bond I happened to share with a handful of Debianites. The more diversity we have, the more frequent these sorts of coincidences will be.
The final, most obvious, reason to reach out to groups of people who do not typically contribute is that we can increase our numbers. That by itself is so valuable. Ubuntu sees 100 new bugs per week, even after the bug squad's efforts. If we can do a better job of recruiting new contributors, the raw numbers give us more strength in creating and maintaining world-class software as well as letting the world know about it.
Changing the balance
I believe that there are plenty of South Asians quite capable of contributing to FLOSS. I believe the same of women. I believe the same of men.
Back to the topic at hand. Why do the South Asians vanish when we look at Free Software, not tech in general?
There are plenty of reasons I can dream up, based on my experience with Indians.
It's tough for FLOSS advocates to work directly on these distant issues. But I think we can focus on some problems we can help solve. Crucially, awareness of Free Software spreads best through social circles. I learned about Linux from a friend at a summer camp. I'll repeat that:
So if you want to spread that awareness, try to be a bridge.
If you meet someone from an unusual background for open source who needs support or mentorship, try to help. That is an investment in the diversity and growth of Free Software. Those people can now unlock more "open source minorities."
What success looks like
Google Summer of Code helps some new contributors get started and provides that mentorship. Rachel McCreary was invited to the SciPy conference after a successful summer. Her father left a comment explaining how her sisters participated in FLOSS via Google's Highly Open Participation (GHOP) Contest:
Soon, these stories will be commonplace. Until then, we have work to do.
(I'm still researching these topics. If you can help me find any sort of data to help me learn more about diversity in FLOSS, even if it seems like I wouldn't like it, leave a comment.)
[/debian] permanent link and comments
Tue, 15 Dec 2009
Two questions
A few weeks ago, I was listening to an R.E.M. album. All I knew about the song I was hearing was that R.E.M. recorded it, and I liked hearing it.
Raffi's ears perked up. He asked, "Is this R.E.M. covering the Velvet Underground?"
I asked, "Is that true? I didn't know that."
Now Raffi knows what I knew, and I know what he knew.
[/communication] permanent link and comments
Thu, 10 Dec 2009
OpenHatch tracking bite-size bugs
Cross-posted to asheesh.org from the OpenHatch blog. (OpenHatch is my current project.)
If this sort of thing is interesting to you, take a look at OpenHatch and subscribe to our blog or @openhatchery on Identi.ca or Twitter.
[/debian] permanent link and comments