The Python Library is a continual source of amazement: I just discovered the very useful unicodedata module, which pairs nicely with the u'\N{LETTER NAME}' escape sequence.
The \N{} escape sequence works like this:
>>> u'\N{LATIN SMALL LETTER M WITH DOT BELOW}'
u'\u1e43'
The unicodedata module, among other things, lets you look up the Unicode character associated with a name, so you can build mapping tables using character names instead of raw code points:
>>> import unicodedata
>>> unicodedata.lookup('LATIN SMALL LETTER M WITH DOT BELOW')
u'\u1e43'
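As a rough sketch of the mapping-table idea, the snippet below builds a transliteration table keyed by character name rather than by code point. The table contents, the build_table() and to_ascii() helpers, and the ASCII replacements are all hypothetical choices made for illustration, and the code assumes Python 3, where plain strings are already Unicode.

```python
import unicodedata

# Hypothetical mapping: Unicode character name -> ASCII replacement.
NAME_TO_ASCII = {
    'LATIN SMALL LETTER M WITH DOT BELOW': 'm',
    'LATIN SMALL LETTER S WITH DOT BELOW': 's',
    'LATIN SMALL LETTER A WITH MACRON': 'a',
}

def build_table(name_map):
    """Return a dict keyed by code point, suitable for str.translate()."""
    table = {}
    for name, replacement in name_map.items():
        # lookup() raises KeyError if the name is unknown,
        # so a typo in the table fails loudly at build time.
        char = unicodedata.lookup(name)
        table[ord(char)] = replacement
    return table

TABLE = build_table(NAME_TO_ASCII)

def to_ascii(text):
    """Replace every character listed in TABLE; leave the rest alone."""
    return text.translate(TABLE)
```

Keeping the names in the table makes it readable without a code chart at hand, and a misspelled name is caught as soon as the table is built.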
The reverse of lookup() is name():
>>> unicodedata.name(unicodedata.lookup('LATIN SMALL LETTER M WITH DOT BELOW'))
'LATIN SMALL LETTER M WITH DOT BELOW'
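Neither function succeeds for every input: lookup() raises KeyError for a name it does not recognise, and name() raises ValueError for a character that has no name (control characters, for instance) unless a default is passed as its second argument. The sketch below wraps both cases; safe_lookup() and describe() are hypothetical helper names, and the code assumes Python 3.

```python
import unicodedata

def safe_lookup(name, default=None):
    """Return the character for a Unicode name, or default if unknown."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return default

def describe(text):
    """Return the Unicode name of each character in text.

    Characters without a name (e.g. control characters) are
    reported as '<unnamed>' instead of raising ValueError.
    """
    return [unicodedata.name(ch, '<unnamed>') for ch in text]
```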
If you want to check Unicode names, a very useful site is the Letter Database at the Institute of the Estonian Language. An example is the search for LATIN SMALL LETTER S WITH DOT BELOW, which yields this page.