Making Peace with Autism

ludo, Monday 09 May 2005

Ned Batchelder is a (Python) programmer who has to deal daily with the autism that afflicts his oldest son:

For most people, the degree of examination is a matter of choice, a reflection of your interest in introspection and self-awareness. Most people can adjust their level of self-examination to balance the effort with the reward. With an autistic child, there is little room for laying back and letting things be. "Go with the flow" doesn't usually apply.

His wife Susan Senator is a writer and activist, who has just published the book Making Peace with Autism: One Family's Story of Struggle, Discovery, and Unexpected Gifts. Ned is asking his friends, readers, and fellow programmers to help raise his wife's Google ranking:

Could I ask a favor? Susan's name makes Google searches difficult. Searching for "Susan Senator" tends to find Senators named Susan, and "Sue Senator" is worse: there are lots of news stories about people suing their senators. Here's the favor: make a link to Susan Senator to help Google find its way.

It's an easy favor to grant, and maybe reading Ned and Susan's experiences will make you stop for a while, and give some thought to many important things we all take for granted in our daily lives.

Python never had a chance against PHP?

ludo, Wednesday 30 March 2005

As much as I usually like John Lim's excellent PHP Everywhere, I don't particularly agree with his comments on Ian Bicking's Why Web Programming Matters Most. Being a longtime PHP programmer (since back when it was called PHP/FI) and PEAR contributor, and having heavily used (and loved) Python for all my projects in the past couple of years, both Ian's

resolving Python's problems with web programming is the most important thing we can do to market Python
and John's
what made PHP successful is not what PHP is lacking but the features that PHP has that are superior to Python
ring true to my experience.

What I don't particularly agree with is John's list of things PHP "does better" than Python:

An unusual referrer

ludo, Thursday 30 September 2004

My brother noticed a strange referrer for this page in his logs today (split over a few lines for convenience):

http://adtools.corp.google.com/cgi-bin/annotatedquery/annotatedquery.py?q=SMSMAIL
&host=www.google.com&hl=it&gl=IT&ip=&customer_id=&decodeallads=1&ie=UTF-8
&oe=utf8&btnG=Go Annotate&sa=D&deb=a&safe=active&btnG=Go Annotate

Maybe it's pretty common, and it's just that, living at the periphery of the Empire, we're not interesting enough for Google and never see this kind of referrer in our logs. But I'm curious anyway, and I like the fact that it's obviously a Python script.

The few bits of information that can be gathered from the URL are:

  • it's a Python script :) whose name suggests its purpose is to perform a query and let you add or view annotations for all or some of the results
  • it's a placement ad tool for corporate clients (adtools.corp.google.com)
  • it has a customer_id field, empty in this particular case
  • it has a button labeled "Go Annotate", so it probably lets you add annotations, not (or not only) view them
  • it has a decodeallads field, so probably the query results are used to check for ad placement
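
Incidentally, pulling those fields out of the referrer takes only a few lines of standard-library Python; a quick sketch, with the URL above re-joined into a single string:

import cgi, urlparse

referrer = ("http://adtools.corp.google.com/cgi-bin/annotatedquery/annotatedquery.py"
            "?q=SMSMAIL&host=www.google.com&hl=it&gl=IT&ip=&customer_id="
            "&decodeallads=1&ie=UTF-8&oe=utf8&btnG=Go Annotate&sa=D&deb=a"
            "&safe=active&btnG=Go Annotate")

query = urlparse.urlsplit(referrer)[3]                # the query string part
for field, values in cgi.parse_qs(query).items():     # empty fields are dropped
    print "%-15s %s" % (field, values)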

My guess is that it's a tool to aid in fine-tuning ad placement for Google's paying customers, where they can perform a query on a set of keywords, see on which of the result pages their ad would be placed, and annotate the placement for Google's staff.

Does anybody know better?

HTTP Status Codes (linked to RFC2616)

ludo, Wednesday 15 September 2004

As I cannot sleep since I got woken up by a loud burglar alarm nearby, I might as well post something new here. Recently I had the need to display hits on a web site grouped by HTTP status code, so to turn a quick few lines of code into something a bit more interesting, I decided to display each status code in the summary cross-linked to the relevant part of the HTTP 1.1 specification.

The spec looks like a docbook-generated HTML set of pages (just a guess), with sections and subsections each having a unique URL built following a common rule, so once you have a dict mapping status codes to section numbers, it's very easy to build the links:

HTTP_STATUS_CODES = {
    100: ('Continue', '10.1.1'),
    101: ('Switching Protocols', '10.1.2'),
    200: ('OK', '10.2.1'),
    201: ('Created', '10.2.2'),
    202: ('Accepted', '10.2.3'),
    203: ('Non-Authoritative Information', '10.2.4'),
    204: ('No Content', '10.2.5'),
    205: ('Reset Content', '10.2.6'),
    206: ('Partial Content', '10.2.7'),
    300: ('Multiple Choices', '10.3.1'),
    301: ('Moved Permanently', '10.3.2'),
    302: ('Found', '10.3.3'),
    303: ('See Other', '10.3.4'),
    304: ('Not Modified', '10.3.5'),
    305: ('Use Proxy', '10.3.6'),
    306: ('(Unused)', '10.3.7'),
    307: ('Temporary Redirect', '10.3.8'),
    400: ('Bad Request', '10.4.1'),
    401: ('Unauthorized', '10.4.2'),
    402: ('Payment Required', '10.4.3'),
    403: ('Forbidden', '10.4.4'),
    404: ('Not Found', '10.4.5'),
    405: ('Method Not Allowed', '10.4.6'),
    406: ('Not Acceptable', '10.4.7'),
    407: ('Proxy Authentication Required', '10.4.8'),
    408: ('Request Timeout', '10.4.9'),
    409: ('Conflict', '10.4.10'),
    410: ('Gone', '10.4.11'),
    411: ('Length Required', '10.4.12'),
    412: ('Precondition Failed', '10.4.13'),
    413: ('Request Entity Too Large', '10.4.14'),
    414: ('Request-URI Too Long', '10.4.15'),
    415: ('Unsupported Media Type', '10.4.16'),
    416: ('Requested Range Not Satisfiable', '10.4.17'),
    417: ('Expectation Failed', '10.4.18'),
    500: ('Internal Server Error', '10.5.1'),
    501: ('Not Implemented', '10.5.2'),
    502: ('Bad Gateway', '10.5.3'),
    503: ('Service Unavailable', '10.5.4'),
    504: ('Gateway Timeout', '10.5.5'),
    505: ('HTTP Version Not Supported', '10.5.6')}

def getHTTPStatusUrl(status_code):
    if status_code not in HTTP_STATUS_CODES:
        return None
    description, section = HTTP_STATUS_CODES[status_code]
    return '''<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec%s"
        >%s - %s</a>''' % (section, status_code, description)
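
To give an idea of how it fits in the summary page, here's a quick sketch (the status_counts dict is invented for the example; the real numbers come from the database):

# status_counts maps status code -> number of hits; invented data for the example
status_counts = {200: 14532, 301: 87, 404: 312, 500: 4}

rows = []
codes = status_counts.keys()
codes.sort()
for code in codes:
    link = getHTTPStatusUrl(code) or str(code)
    rows.append('<tr><td>%s</td><td>%d</td></tr>' % (link, status_counts[code]))
print '<table>%s</table>' % ''.join(rows)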

To get and parse the server log files, and store them in MySQL, I'm using a small app a colleague and I wrote at work in Python (what else?), which started as a niche project and may become a company standard. The app uses components for both fetching the logs (currently http:// and file://) and parsing them (our apps' custom logs, and Apache combined), uses HTTP block transfers to get only new log records, and a simple algorithm to detect log rotations and corruptions. Log records may be manipulated before handing them off to the db, so as to insert meaningful values (eg to flag critical errors, etc.).

The app has a simple command line interface, uses no threading as that would have complicated development too much (we use the shell to go through the registered logs and run one instance for each log), and has no scheduler, as cron does the job very well. At work it is parsing the logs of a critical app on a desktop pc, handling something like 2 million records in a few hours, and it does it so well we are looking into moving it to a bigger machine (maybe an IBM Regatta) and DB2 (our company standard), to parse most of our applications' logs and hand off summaries to our net management app. It still needs some work, but we would like to release its code soon. If you're interested in trying it, drop me a note.
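
The "HTTP block transfers" bit boils down to a Range request; a minimal sketch of the idea, assuming the server honours Range headers (the names are illustrative, not the actual tool's API):

import urllib2

def fetch_new_records(url, offset):
    """return log data appended after offset bytes, or '' if there is nothing new"""
    req = urllib2.Request(url)
    req.add_header('Range', 'bytes=%d-' % offset)
    try:
        return urllib2.urlopen(req).read()
    except urllib2.HTTPError, e:
        if e.code == 416:   # requested range not satisfiable: no new data,
            return ''       # or the log has been rotated/truncated
        raise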

Russ discovers Perl (and bashes Python docs)

ludo, Tuesday 07 September 2004

I'm on the train to Switzerland for a job interview (no I'm not unemployed, I have a good well-paid -- by Italian standards -- pretty comfortable job with a huge solid company, but I'd like to go work abroad again), reading stuff I dumped this morning at home before leaving on my newly resurrected iPAQ*. Apart from Danah Boyd's Social Software, Sci-Fi and Mental Illness which was mentioned on Joel's latest entry, the only other interesting thing to read was Russ's sudden discovery of Perl.

I was a bit surprised to read that a Java developer with some exposure to Python (mainly due to his obsession with Series 60 phones, which sooner or later will get Python as their default scripting language) can suddenly discover Perl and even like it. I was more surprised when I read that a good part of Russ's sudden love for Perl is due to "Python's horrible documentation. [...] Python's docs were just half-ass and bewildering to me". Horrible? How's that?

I've used my share of languages these past 10 years (and Perl was the first I loved), and I've never found anything as easy to learn and use as Python. The docs are a bit concise but well written, and they cover the base language features and all the standard library modules with a well-organized layout. And if the standard docs are not enough, you can always search comp.lang.python, read PEPs on specific topics, read AMK's "What's new in Python x.x" for the current and older versions of Python, keep the Python Quick Reference open, or buy the Python Cookbook (which may give you the right to ask Alex about some obscure Python feature for the 1000th time on clp or iclp and get an extra 20 lines in his replies).

As for examples, you don't see many of them in the docs because Python has an interactive interpreter and very good built-in introspection capabilities. So usually when you have to deal with a new module you skim the docs, then fire up the interpreter, import the module, and start poking around to see how to use it. And while you're at the interpreter console, remember that dir(something) and help(something) are your friends. Maybe the Perl and Perl module docs are so good (are they, really? I can't remember) because you would be utterly lost in the noise without them, as Russ seems to notice when dealing with special variables, something that still gives me nightmares from time to time.
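
For instance, faced with a module I've never used (say, mailbox), I'd do something like this at the console (output elided, and the mailbox path is obviously made up):

>>> import mailbox
>>> dir(mailbox)                  # lists Maildir, UnixMailbox, PortableUnixMailbox, ...
>>> help(mailbox.UnixMailbox)
>>> mbox = mailbox.UnixMailbox(open('/var/mail/ludo'))
>>> msg = mbox.next()             # an rfc822.Message instance, or None at the end
>>> msg['subject']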

So I don't really get Russ's sudden love for Perl, nor am I overexcited to learn that he got invited to Foo Camp, as I'm usually not much into the "look how important/well known/an alpha geek I am" stuff (probably due to my not being important, or well known, and definitely not an alpha anything, apart from my dogs).

Rereading this entry after having posted it, and before I forget about Perl for the next (hopefully many) years, I have to say that if, despite everything, you're really into Perl, you should have a look at Damian Conway's excellent Object Oriented Perl, one of the best programming books I have read.

BTW, I live 50 km away but I always forget how green Switzerland is. Oh, and I should remember to book a seat in a non-smoking car next time, so as to lower my chances of sitting next to an old lady smoking Gauloises, exploding in bursts of loud coughing every other minute, and traveling with super-extra-heavy suitcases she cannot move half an inch, let alone lift over the seats.

update This morning I found a reply by Russ in my inbox. I'm not an expert in the nuances of written English, but he did not sound happy with my post. I guess the adrenaline rush of going to a long interview in three different languages in a foreign country (not to mention being a bit envious of working conditions overseas) made me come across harsher than I intended. Or maybe I just like pissing people off when I have nothing better to do and they trumpet false opinions to the whole world (something I... uhm... never did). Anyway, it turns out Russ could not find the Python Library Reference, to learn how to use the mailbox module to parse a Unix mailbox. I guess sometimes it just pays to look twice, especially if not looking involves Perl.

As usual on my site, comments are disabled. I'm too lazy to add them to my homegrown blogging thing, and I have no time to deal with comment spam. If you feel you have something worthwhile to say, reply to Russ's post on his blog, email me, or just get busy on something else for a few minutes and then try to remember what you wanted to say; maybe this is not as interesting a topic as you thought.

update October 18, 2004 With all the excitement on podcasts these past weeks, which involved starting a new site and doing my own podcasts for Qix.it, I forgot to mention an email from Pete Prodoehl about the comments I make on US versus European work conditions and salaries in the footnote below. Pete writes:

Ludo, I certainly don't have the disposable income that many US bloggers seem to have. I know what you mean though. I constantly seem to come across posts where people say they handed down their old 20gig iPod because they got a new 40gig iPod.
Honestly, it all sounds crazy to me... Then again, I live in the US Midwest, maybe things are different here as well. ;)
Thanks Pete, it's nice to learn that not all Americans live in Eden (I should know that myself, having lived there for a year, but it was a long time ago and things might have changed).


* I don't have the money to buy a new PDA (how do you people in the US manage to buy so many gadgets and crap anyway? don't you have to pay rent/bills/taxes/etc.? aren't you hit by the recession, or is that something only we Italians have to live with?), so I resurrected my iPAQ 3850, which had been lying around thinking itself a brick for ages: got a new battery off eBay, wiped out that sorry excuse for an operating system which is PocketPC, installed OPIE/Opie Reader/JpluckX, and soon had a working PDA again. I had forgotten how much I missed having one; a smartphone is not the same thing. Now if only I had the money to buy a Zaurus SL-C860... back



Line Endings in Mail Messages

ludo, Thursday 01 July 2004

This entry expands on the subject of line endings in email (and news) messages, which I introduced in my previous entry on SMIME. In my (brief) experience working with mail and news messages, there are three different contexts involving line endings:

  • transmission over the wire by SMTP (or NNTP)
  • MIME canonicalization
  • local handling of messages (storage, mail applications, etc.)

Transmission by SMTP

RFC2822 specifies CR+LF line endings for on-the-wire transmission by SMTP.

MIME canonicalization

RFC2049 specifies that CR+LFs are used in the canonicalization of MIME body parts BEFORE applying transfer encoding. It further states that

The output of the encoders may have to pass through one or more additional steps prior to being transmitted as a message. As such, the output of the encoder may not be conformant with the formats specified by RFC 822. In particular, once again it may be appropriate for the converter's output to be expressed using local newline conventions rather than using the standard RFC 822 CRLF delimiters.

In this context then, the appropriate linefeeds for an email message depend on the steps to perform after MIME canonicalization, and the local line endings convention.

Local handling

It is apparent from the previous excerpt from RFC2049 that local handling of mail messages has to follow the local operating system's line endings convention. The most typical examples of this context are storing messages (eg in mailbox files or Maildirs), and piping/passing messages to mail handling programs like qmail-inject or OpenSSL's smime command. Googling on this context reveals the complexity often surrounding the question of proper line endings in email and news messages, and produces a few interesting links such as Life With Qmail: G.11. Carriage Return/Linefeed (CRLF) line breaks don't work, PHP bug #15841, a thread on the ietf-822 list, and a thread on the Usenet Article Standard Update mailing list.
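
The conversions themselves are trivial; a minimal sketch (mine, not from any of the linked threads), assuming the message is handled as a plain string:

def to_local(msg):
    """normalize to bare LF line endings for local handling on Unix"""
    return msg.replace('\r\n', '\n')

def to_wire(msg):
    """normalize to CR+LF line endings before handing the bytes to SMTP"""
    return to_local(msg).replace('\n', '\r\n')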

An illustrious victim of this line-endings complexity, which I had to struggle with these past few days, is OpenSSL's smime command, which has the nasty habit of outputting only the pkcs7 payload with CR+LF line endings when clear-signing a message. The Python email package also seems to have a few problems with line endings, cf. bug #975330 submitted by Anders Hammarquist, who gave the very interesting Real-world email handling in python presentation at EuroPython 2004.

SMIME sucks

ludo, Thursday 24 June 2004

Well, not exactly... SMIME is a very nice spec, it's widely supported, and there are SMIME libraries for most development environments. The problem is that the RFC822 SMTP-vs-local line-endings nastiness combines with MIME canonicalization and library/mail UA idiosyncrasies to make SMIME messages very fragile. In the rest of this entry, I briefly describe a couple of SMIME pitfalls I spent quite a few hours debugging recently.

If you want to experiment with SMIME signing, you can download the M2Crypto-based SMIME signing class which is the companion to the verifying class of my previous entry on SMIME.

line endings

Using OpenSSL (via M2Crypto) to cleartext sign a SMIME message, you get back a valid multipart/signed message that has one problem: the cleartext part of the message has CR+LF line endings, while the rest of the message (pkcs7 signature, SMIME headers) has LF line endings (cf. this thread on mailing.openssl.dev). OpenSSL performs MIME canonicalization (ie it converts line endings to CR+LF -- SMIME_crlf_copy() in crypto/pkcs7/pk7_smime.c/PKCS7_sign()) on the message before signing it, as per the SMIME spec. The problem is that a message with a mix of CR+LF and bare LF is almost never what you need: if you send the message by SMTP directly after signing it, it should have CR+LF line endings as per RFC822; if you hand it off to a program like qmail-inject, it should respect local conventions, ie on Unix have bare LF line endings.

This problem becomes apparent when you open a signed message in Outlook or Outlook Express, where the message appears tampered with and cannot be verified. If you sign with the "binary" flag, you get no CR+LF line endings, but the resulting message cannot be verified unless you perform the MIME canonicalization yourself before signing, which gets you the same output as above. A sample of the resulting message, with line endings prefixed to the actual lines (and long lines snipped):

'\n'    MIME-Version: 1.0
'\n'    Content-Type: multipart/signed; protocol="application/x-pkcs7-sig
'\n'
'\n'    This is an S/MIME signed message
'\n'
'\n'    ------526F05E052FA5F1DF695C4ABA3E3EF81
'\r\n'  prova prova prova
'\r\n'  prova 123
'\r\n'
'\r\n'  prova
'\n'
'\n'    ------526F05E052FA5F1DF695C4ABA3E3EF81
'\n'    Content-Type: application/x-pkcs7-signature; name="smime.p7s"
'\n'    Content-Transfer-Encoding: base64
'\n'    Content-Disposition: attachment; filename="smime.p7s"
'\n'
'\n'    MIIGDAYJKoZIhvcNAQcCoIIF/TCCBfkCAQExCzAJBgUrDgMCGgUAMAsGCSqGSIb3
'\n'    DQEHAaCCA9gwggPUMIIDPaADAgECAgECMA0GCSqGSIb3DQEBBAUAMIGbMQswCQYD

outlook interoperability

If you sign SMIME messages with OpenSSL on Unix, you may discover that your messages are valid in Mozilla, but appear tampered with in Outlook, Outlook Express, and programs using MS libraries to validate them. A search on Google Groups turns up quite a few threads on this topic, none of which unfortunately offers any practical help, apart from generic suggestions regarding line-ending conversions, which do not work. After quite a few hours spent debugging this problem, I could only come up with a practical solution with no theoretical explanation: append a single linefeed (a bare LF) to the end of the payload before signing it. Mozilla and OpenSSL keep verifying the resulting SMIME messages, and Outlook stops complaining that the message has been tampered with. I still have to verify the implications of this workaround when you sign a message that includes a SMIME message as a MIME message/rfc822 attachment, since I've noticed that signing such a message often breaks the attachment validity.
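
For what it's worth, this is roughly how the workaround looks with M2Crypto; just a sketch, assuming an SMIME.SMIME instance with key and certificate already loaded, not the exact code of my signing class:

from M2Crypto import BIO, SMIME

def sign_for_outlook(smime, payload):
    # the workaround: append a single bare LF to the payload before clear-signing
    payload = payload + '\n'
    p7 = smime.sign(BIO.MemoryBuffer(payload))
    out = BIO.MemoryBuffer()
    # sign() consumes its buffer, so pass a fresh one to write out the cleartext part
    smime.write(out, p7, BIO.MemoryBuffer(payload))
    return out.read()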

Verifying SMIME email with M2Crypto

ludo, Monday 31 May 2004

In my spare time, I'm working on a project where I have to sign and verify SMIME mail using M2Crypto, which works quite well but lacks a bit in documentation, especially on the SMIME functions. The Programming S/MIME in Python with M2Crypto howto is enough to point you in the right direction, and the source has a few SMIME examples. What is missing is a recipe to verify signed SMIME messages when you don't have the signer's certificate, which is what usually happens when you have to verify Internet email.

Openssl's smime command is able to do that, so there should be a way to accomplish the same thing from Python using M2Crypto. After a bit of fiddling around and looking at openssl's source, I found a way that seems to work (update: content check done against the output of SMIME.smime_load_pkcs7_bio instead of using email.message_from_string, return a list of certificates on successful verification, show a content diff if verification fails):

#!/usr/bin/python
"""
Simple class to verify SMIME signed email messages without having to know
the signer's certificate. The signer's certificate(s) is extracted from the
signed message, and returned on successful verification. A unified diff of
the cleartext content against the one resulting from verification is
returned as exception value if the content has been tampered with.

Use at your own risk, send comments and fixes.
May 30, 2004 Ludovico Magnocavallo <[email protected]>
"""

import os, base64
from M2Crypto import BIO, SMIME, m2, X509
from difflib import unified_diff


class VerifierError(Exception):
    pass


class Verifier(object):
    """accepts an email payload and verifies it with SMIME"""

    def __init__(self, certstore):
        """
        certstore - path to the file used to store CA certificates,
                    eg /etc/apache/ssl.crt/ca-bundle.crt

        >>> v = Verifier('/etc/dummy.crt')
        >>> v.verify('pippo')
        Traceback (most recent call last):
            ...
        VerifierError: cannot access /etc/dummy.crt
        """
        self._certstore = certstore
        self._smime = None

    def _setup(self):
        """sets up the SMIME.SMIME instance and loads the CA certificate store"""
        smime = SMIME.SMIME()
        st = X509.X509_Store()
        if not os.access(self._certstore, os.R_OK):
            raise VerifierError, "cannot access %s" % self._certstore
        st.load_info(self._certstore)
        smime.set_x509_store(st)
        self._smime = smime

    def verify(self, text):
        """
        verifies a signed SMIME email, returns a list of the certificates
        used to sign the message on success

        text - string containing the SMIME signed message

        >>> v = Verifier('/etc/apache/ssl.crt/ca-bundle.crt')
        >>> v.verify('pippo')
        Traceback (most recent call last):
            ...
        VerifierError: cannot extract payloads from message
        >>> certs = v.verify(test_email)
        >>> isinstance(certs, list) and len(certs) > 0
        True
        """
        if self._smime is None:
            self._setup()
        buf = BIO.MemoryBuffer(text)
        try:
            p7, data_bio = SMIME.smime_load_pkcs7_bio(buf)
        except SystemError:         # uncaught exception in M2Crypto
            raise VerifierError, "cannot extract payloads from message"
        if data_bio is not None:
            data = data_bio.read()
            data_bio = BIO.MemoryBuffer(data)
        sk3 = p7.get0_signers(X509.X509_Stack())
        if len(sk3) == 0:
            raise VerifierError, "no certificates found in message"
        signer_certs = []
        for cert in sk3:
            # re-wrap each signer certificate in PEM format
            signer_certs.append(
                "-----BEGIN CERTIFICATE-----\n%s-----END CERTIFICATE-----"
                % base64.encodestring(cert.as_der()))
        self._smime.set_x509_stack(sk3)
        try:
            if data_bio is not None:
                v = self._smime.verify(p7, data_bio)
            else:
                v = self._smime.verify(p7)
        except SMIME.SMIME_Error, e:
            raise VerifierError, "message verification failed: %s" % e
        if data_bio is not None and data != v:
            raise VerifierError, \
                "message verification failed: payload vs SMIME.verify output diff\n%s" \
                % '\n'.join(list(unified_diff(data.split('\n'), v.split('\n'), n=1)))
        return signer_certs


test_email = """put your test SMIME signed email here"""


def _test():
    import doctest
    return doctest.testmod()


if __name__ == "__main__":
    _test()
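
A quick usage sketch (the paths are made up for the example) to show how the class gets called:

v = Verifier('/etc/apache/ssl.crt/ca-bundle.crt')
try:
    certs = v.verify(open('signed_message.eml').read())
    print "message verified, %d signer certificate(s) extracted" % len(certs)
except VerifierError, e:
    print "verification failed:", e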

 
