2012-07-01

Blog moved to Wordpress

I have moved all my posts to Wordpress.com. The new home is http://sunh11373.wordpress.com/

2012-04-25

Python multiprocessing configuration fix on CentOS

The OS I am running is 2.6.27-chistyakov.1 #1 SMP Tue Dec 29 10:26:29 PST 2009 x86_64 x86_64 x86_64 GNU/Linux. The Python multiprocessing module reported the following error:

File "/usr/lib/python2.6/multiprocessing/synchronize.py", line 49, in __init__
sl = self._semlock = _multiprocessing.SemLock(kind, value, maxvalue)
OSError: [Errno 38] Function not implemented


It turned out that shared memory (SHM) has to be enabled on the box. To do that, open the file /etc/fstab and add a new line:

none /dev/shm tmpfs rw,nosuid,nodev,noexec 0 0

Restart the box and the SHM should be enabled now.
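
After the restart, a quick check along these lines can confirm that the semaphore call from the traceback now succeeds. This is only a minimal sketch written for Python 3; the helper name semaphores_available is made up for illustration and is not part of the multiprocessing module.

```python
import multiprocessing


def semaphores_available():
    """Return True if a multiprocessing lock can be created on this host."""
    try:
        # Lock() builds a SemLock underneath, the call that failed in the traceback
        multiprocessing.Lock()
    except (OSError, ImportError):
        # OSError: the call is not implemented; ImportError: no usable sem_open
        return False
    return True


if __name__ == "__main__":
    print("semaphores available:", semaphores_available())
```

If the function returns False after the fstab change, it is worth checking whether /dev/shm is actually mounted.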

2011-10-18

How to Write a Spelling Corrector - The Haskell Version

Peter Norvig has an essay on How to Write a Spelling Corrector. There are implementations in many languages. However, the link to the Haskell version is now broken, so I used this opportunity to practice my limited Haskell skills.


File: SpellChecker/Core.hs




File: Main.lhs


2008-03-20

Export emails from Outlook

"""
Exports email addresses from MS Outlook, collected from recent emails
"""

from win32com.client import Dispatch


class Exporter(object):

    def __init__(self):
        # Keep the address list per instance rather than shared on the class
        self.emails = []
        self.app = Dispatch("Outlook.Application")
        if self.app is None:
            raise Exception("Unable to create an Outlook application object")

    def export_emails(self, foldername, outfile="email.csv", maxcount=1000):
        ns = self.app.GetNamespace("MAPI")
        folders = ns.Folders.Item(1).Folders  # Usually the user's folders
        folder = folders.Item(foldername)
        cnt = min(len(folder.Items), maxcount)
        for i in range(1, cnt + 1):  # Outlook collections are 1-based
            item = folder.Items.Item(i)
            print "Processing item %d" % i
            self.export_from_field(item, "To")
            self.export_from_field(item, "CC")

        self.emails.sort()
        emailfile = open(outfile, 'w')
        try:
            for email in self.emails:
                emailfile.write(email + "\n")
        finally:
            emailfile.close()

    def export_from_field(self, item, fieldname, maxcount=10):
        if not hasattr(item, fieldname):
            return
        ms = getattr(item, fieldname).split(';')
        if len(ms) > maxcount:
            return  # skip items with more than maxcount recipients
        for m in ms:
            parts = m.split(',')  # expects names in "Last, First" form
            if len(parts) != 2:
                continue
            email = "%s_%s@intuit.com" % (parts[1].strip(), parts[0].strip())
            if email not in self.emails:
                self.emails.append(email)

    def __del__(self):
        self.app = None
        self.emails = None


def main():
    export = Exporter()
    export.export_emails("InBox", "emails.csv", 1000)


if __name__ == "__main__":
    main()
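
The name-to-address step can be tried without Outlook. The following is a minimal sketch written for Python 3, assuming the recipient field holds display names in "Last, First" form separated by semicolons, as the script above expects. The function name recipients_to_emails and the example.com domain are made up for illustration.

```python
def recipients_to_emails(field, domain="example.com", maxcount=10):
    """Turn 'Last, First; Last, First' into sorted First_Last@domain addresses."""
    names = field.split(';')
    if len(names) > maxcount:
        return []  # too many recipients, skip the whole field
    emails = set()
    for name in names:
        parts = name.split(',')
        if len(parts) != 2:
            continue  # not in "Last, First" form
        emails.add("%s_%s@%s" % (parts[1].strip(), parts[0].strip(), domain))
    return sorted(emails)


if __name__ == "__main__":
    for address in recipients_to_emails("Doe, John; Smith, Jane; team-list"):
        print(address)
```

A set is used here instead of the list membership test so that duplicates are dropped without scanning the list for every name.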

2008-02-29

A few lessons learned on prototype.js and RoR

A few lessons learned from my recent work on prototype.js and Ruby on Rails

1) "class" is a reserved word for IE
"class" is a reserved word in JavaScript, and IE rejects it as an unquoted property name. If you use "class" in your JavaScript such as this (example using prototype.js):
var highlight_span = new Element('span', { margin: '0', class: 'highlight' });

It works in Firefox but gives you an error in IE. The workaround is:
var highlight_span = new Element('span', { margin: '0' }); highlight_span.addClassName("highlight");


2) IE requires 'tbody' in dynamically generated DOM
If you generate a table dynamically in JavaScript, for example with "new Element(..)", make sure the 'tbody' element is created explicitly. Otherwise, the table won't show up in IE. Firefox does not seem to care if 'tbody' is missing. A statically coded table without 'tbody' works fine in IE.

3) IE requires you to explicitly register event handlers
The following code works in Firefox, but not in IE:
var a = new Element('a', {onclick: handler}); 

You have to use "observe" to explicitly register the handler function:
var a = new Element('a').observe('click', handler); 


4) You cannot set a cookie value to an integer in RoR
This code won't work:
cookies[:worker] = @worker.id 

Instead, you need to say:
cookies[:worker] = @worker.id.to_s 

2008-02-08

Web page smoke test script

#!/usr/bin/env python

"""
This script checks URLs against a pattern. The configuration file has the following format:

protocol_name|url|additional_args

Example:
simpleGET|http://www.google.com/|Google
simpleGET|http://www.aol.com/|WhatWhat

Example output:
http://www.google.com/ -> True
http://www.aol.com/ -> False

"""

import getopt, sys, urllib2

verbose = False
FETCHER = 'fetcher'
MATCHER = 'matcher'

def say(s):
    if verbose:
        print s

class UrlChecker:

    def __init__(self, urls, protocols):
        self.urls = urls
        self.protocols = protocols

    def check(self):
        for rurl in self.urls:
            self._check_url(rurl)

    def _check_url(self, rurl):
        # Each entry has the form protocol_name|url|additional_args
        protocol, url, args = rurl.split('|')
        say("Check URL %s with protocol %s. Args = %s" % (url, protocol, args))
        rslt = self.protocols[protocol][FETCHER](url, args)
        self.protocols[protocol][MATCHER](rslt, url, args)

def load_urls(urlfile):
    urls = []
    f = open(urlfile, 'r')
    try:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            say("Load one raw url %s" % line)
            urls.append(line)
    finally:
        f.close()
    return urls

def add_simpleGET(protocols):

    def simple_get(url, ignore):
        r = urllib2.urlopen(url)
        try:
            return r.read()
        finally:
            r.close()

    def str_match(body, url, pattern):
        print "%s -> %s" % (url, pattern in body)

    protocols['simpleGET'] = {FETCHER: simple_get, MATCHER: str_match}

def init_protocols():
    protocols = {}
    add_simpleGET(protocols)
    return protocols

if __name__ == '__main__':

    def usage():
        print "Usage:"
        print "python check_url.py -f <url_file> -v"

    try:
        opts, args = getopt.getopt(sys.argv[1:], "f:v")
    except getopt.GetoptError:
        usage()
        sys.exit(2)

    urls = None
    protocols = init_protocols()
    for opt, arg in opts:
        if opt == "-v":
            verbose = True
        elif opt == "-f":
            urls = load_urls(arg)
    if urls is None:
        usage()
        sys.exit(1)

    checker = UrlChecker(urls, protocols)
    checker.check()
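
The fetcher/matcher dispatch can be exercised offline by swapping the HTTP fetcher for a stub. The following is a minimal sketch written for Python 3, not the script itself: parse_rule, check_rule, fake_get, and the canned PAGES are all hypothetical names and data introduced for illustration. Unlike the script, it splits on at most two separators, so a pattern may itself contain a "|".

```python
def parse_rule(line):
    """Split one 'protocol_name|url|additional_args' line into its parts."""
    protocol, url, args = line.strip().split('|', 2)
    return protocol, url, args


def check_rule(line, fetchers):
    """Fetch the URL with the named protocol; report whether the pattern is in the body."""
    protocol, url, pattern = parse_rule(line)
    body = fetchers[protocol](url)
    return url, pattern in body


# Hypothetical canned pages standing in for real HTTP responses
PAGES = {
    "http://www.example.com/": "<title>Example Domain</title>",
    "http://www.example.org/": "<title>Something else</title>",
}


def fake_get(url):
    return PAGES[url]


if __name__ == "__main__":
    rules = [
        "simpleGET|http://www.example.com/|Example",
        "simpleGET|http://www.example.org/|Example",
    ]
    for rule in rules:
        url, ok = check_rule(rule, {"simpleGET": fake_get})
        print("%s -> %s" % (url, ok))
```

Keeping the fetcher behind a dictionary lookup is what makes it possible to add other protocols later without touching the checking loop.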

2008-02-05

Email forwarding script

#!/usr/bin/env python

"""
This script forwards email from a POP3 mailbox when the subject starts with a given prefix
"""

import sys, poplib, getopt, getpass, email, smtplib

from_email = '???'
to_email = '???'
pop3_server_url = '???'
pop3_server_port = 110
pop3_user = None  # falls back to the current login name
pop3_pass = None  # prompted for when not given with -c
delete_after = True
subject_filter = '????'

def match_filter(subject, sfilter):
    return subject.startswith(sfilter)

def forward_mail(femail, temail, mail):

    # Keep only the headers that describe the body; compare case-insensitively
    keep = ('content-type', 'content-transfer-encoding', 'subject', 'mime-version')
    for (header, value) in mail.items():
        if header.lower() not in keep:
            del mail[header]

    mail["From"] = femail
    mail["To"] = temail
    for (header, value) in mail.items():
        print "%s: %s" % (header, value)

    server = smtplib.SMTP('localhost')
    server.set_debuglevel(0)
    server.sendmail(femail, temail, mail.as_string())
    server.quit()

def process_mail(pop3_mail):

    mail = email.message_from_string(pop3_mail)
    subject = mail.get("Subject", "")  # a message may have no Subject header
    if match_filter(subject, subject_filter):
        print "Forward email: %s" % (subject)
        forward_mail(from_email, to_email, mail)
        return True
    else:
        print "Skip email: %s" % (subject)
        return False

if __name__ == '__main__':

    def usage():

        print "Usage:"
        print "python forward_mail.py -s <pop3_server> -p <pop3_port> -u <user> -c <credential> -f <from_email> -t <to_email>"

    try:
        opts, args = getopt.getopt(sys.argv[1:], "s:p:u:c:f:t:")
    except getopt.GetoptError:
        usage()
        sys.exit(2)

    for opt, arg in opts:
        if opt == "-s":
            pop3_server_url = arg
        elif opt == "-p":
            pop3_server_port = int(arg)
        elif opt == "-u":
            pop3_user = arg
        elif opt == "-c":
            pop3_pass = arg
        elif opt == "-f":
            from_email = arg
        elif opt == "-t":
            to_email = arg

    if pop3_user is None:
        pop3_user = getpass.getuser()
    if pop3_pass is None:
        pop3_pass = getpass.getpass()

    pop3 = poplib.POP3(pop3_server_url, pop3_server_port)
    pop3.user(pop3_user)
    pop3.pass_(pop3_pass)
    msg_count = len(pop3.list()[1])
    print "Total messages in Inbox: %d" % (msg_count)

    for i in range(msg_count):
        resp, text, octets = pop3.retr(i+1)
        if process_mail('\n'.join(text)) and delete_after:
            print "Forwarded email is deleted"
            pop3.dele(i+1)
    pop3.quit()
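
The header rewriting is the part most worth testing in isolation, since it needs neither a POP3 nor an SMTP server. Here is a minimal sketch written for Python 3 using the standard email package. The function name strip_headers, the KEEP tuple, and the sample addresses are made up for illustration, and the case-insensitive comparison is an assumption about what the filter is meant to do.

```python
import email

# Headers that describe the body and should survive forwarding
KEEP = ('content-type', 'content-transfer-encoding', 'subject', 'mime-version')


def strip_headers(raw, sender, recipient):
    """Parse a raw message, drop all headers outside KEEP, and set new From/To."""
    mail = email.message_from_string(raw)
    for header in list(mail.keys()):
        if header.lower() not in KEEP:
            del mail[header]  # removes every occurrence; no error if already gone
    mail['From'] = sender
    mail['To'] = recipient
    return mail


if __name__ == "__main__":
    raw = ("Received: from somewhere\n"
           "From: a@example.com\n"
           "To: b@example.com\n"
           "Subject: [fwd] hello\n"
           "MIME-Version: 1.0\n"
           "\n"
           "body text\n")
    print(strip_headers(raw, "me@example.com", "you@example.com").as_string())
```

Removing the original routing headers before re-sending keeps the forwarded copy from carrying the first delivery's Received and To lines.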

2008-02-02

Search text pattern in log files

#!/usr/bin/env python

"""
This script searches log files for a text pattern
"""

import os.path, re, getopt, sys, fnmatch

verbose = False

def say(s):
    if verbose:
        print s

def searchPattern(args, dirname, filenames):

    file_pattern, text_pattern, count = args
    say("Processing directory %s now" % (dirname))
    say("File pattern: %s" % (file_pattern))
    say("Text pattern: %s" % (text_pattern))
    say("Count only: %s" % (count))

    # Match counts are kept and printed per directory
    counts = {}
    def incMatchCount(matched):
        counts[matched] = counts.get(matched, 0) + 1

    compiled_pattern = re.compile(text_pattern)
    for filename in fnmatch.filter(filenames, file_pattern):
        say("Processing file %s now" % (filename))
        log = open(os.path.join(dirname, filename), 'r')
        try:
            for line in log:
                m = compiled_pattern.match(line)
                if m:
                    if count:
                        # Counting groups matches by the first capture group
                        incMatchCount(m.group(1))
                    else:
                        print ';;'.join(m.groups())
        finally:
            log.close()
    if count:
        for matched in counts.keys():
            print " -> ".join([matched, str(counts[matched])])

if __name__ == '__main__':

    def usage():
        print "Usage:"
        print "python find_pattern.py -d <root_dir> -f <file_pattern> -p <search_pattern> -c -v"

    try:
        opts, args = getopt.getopt(sys.argv[1:], "d:f:p:cv")
    except getopt.GetoptError:
        usage()
        sys.exit(2)

    count = False
    root_dir = '.'
    file_pattern = '*.log'
    text_pattern = '(.+)'
    for opt, arg in opts:
        if opt == "-v":
            verbose = True
        elif opt == "-c":
            count = True
        elif opt == "-d":
            root_dir = arg
        elif opt == "-f":
            file_pattern = arg
        elif opt == "-p":
            text_pattern = arg

    os.path.walk(root_dir, searchPattern, (file_pattern, text_pattern, count))
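
The counting mode (-c) groups matching lines by the first capture group of the pattern. The idea can be sketched on an in-memory list of lines. This is a minimal sketch written for Python 3, with collections.Counter swapped in for the hand-rolled dictionary; the function name count_matches and the sample log lines are made up for illustration.

```python
import re
from collections import Counter


def count_matches(lines, text_pattern):
    """Count lines by the first capture group of text_pattern (matched at line start)."""
    compiled = re.compile(text_pattern)
    counts = Counter()
    for line in lines:
        m = compiled.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = ["ERROR disk full", "INFO started", "ERROR network down"]
    for level, n in sorted(count_matches(sample, r"(\w+) ").items()):
        print(" -> ".join([level, str(n)]))
```

As in the script, the pattern needs at least one capture group for counting to work, because the group's text is used as the key.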

2008-01-28

SlimFIX gets new life

I just found out that SlimFIX, a project I developed a few years back, has been adopted by an open source project called ActiveQuant. I am glad it finally gets a new life.

2007-04-11

ssl mutual authentication in perl

Here is a Perl script that invokes a web service using SSL mutual authentication.

use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common;

# No space is allowed between << and an unquoted here-doc terminator
my $request = <<REQ;
# message here
REQ

# client certificate support (environment variables read by Crypt::SSLeay)
$ENV{HTTPS_CERT_FILE} = 'test.crt';
$ENV{HTTPS_KEY_FILE} = 'test.key';

# CA cert peer verification
$ENV{HTTPS_CA_FILE} = 'ca.crt';

my $ua = LWP::UserAgent->new;
my $res = $ua->request(POST 'https://test.com',
    SOAPAction => 'http://test.com/operationA',
    Content_Type => 'application/xml',
    Content => $request);

print $res->code."\n";
print $res->content."\n";

print "Work is done. \n\n";
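
For comparison, the same mutual-authentication call can be sketched with the Python 3 standard library. This is only a sketch under assumptions: the helper names make_context, build_request, and call_service are made up for illustration, and the certificate file names and URLs are the placeholders from the Perl script, not real endpoints.

```python
import ssl
import urllib.request


def make_context(ca_file, cert_file, key_file):
    """Build a TLS context that verifies the server and presents a client certificate."""
    context = ssl.create_default_context(cafile=ca_file)           # CA cert peer verification
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)  # client certificate
    return context


def build_request(url, soap_action, body):
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"SOAPAction": soap_action, "Content-Type": "application/xml"},
        method="POST",
    )


def call_service(url, soap_action, body, context):
    """Send the request over the mutually authenticated connection."""
    request = build_request(url, soap_action, body)
    with urllib.request.urlopen(request, context=context) as res:
        return res.status, res.read().decode("utf-8")


if __name__ == "__main__":
    # Placeholder file names and URLs taken from the Perl script above
    ctx = make_context("ca.crt", "test.crt", "test.key")
    status, content = call_service(
        "https://test.com", "http://test.com/operationA", "<!-- message here -->", ctx)
    print(status)
    print(content)
```

Here the certificate paths are passed explicitly to the context instead of through environment variables, which keeps the configuration local to the call.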