libxml FTW

By Shawn Medero on 2008-04-02T21:42:23Z

Ian Bicking throws nine Python HTML parsing tools at python.org and benchmarks them. lxml (based on the C library libxml) comes out on top, at least in terms of speed and memory use. The lesson here is that libxml is extremely fast and powerful, and if your scripting language of choice provides access to it, you should use it.
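In Python, "use it if you have access to it" might look something like the sketch below. It is not taken from the benchmark: the `extract_links` helper, the `_LinkCollector` class, and the sample markup are made up for illustration. The idea is to prefer lxml when it is installed and fall back to the standard library's `html.parser` when it is not.

```python
from html.parser import HTMLParser

try:
    # Third-party binding to libxml; may not be installed everywhere.
    from lxml import html as lxml_html
except ImportError:
    lxml_html = None


class _LinkCollector(HTMLParser):
    """Pure-Python fallback that records the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def extract_links(markup):
    """Return every <a href> value in document order (hypothetical helper)."""
    if lxml_html is not None:
        # Fast path: let libxml parse the markup and run the XPath query.
        doc = lxml_html.fromstring(markup)
        return [str(href) for href in doc.xpath("//a/@href")]
    # Slow path: standard-library parser, no C extension required.
    collector = _LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.links


# Made-up sample markup, not the real python.org page.
SAMPLE = (
    "<html><body><p>"
    '<a href="/about/">About</a> <a href="/download/">Download</a>'
    "</p></body></html>"
)

print(extract_links(SAMPLE))
```

Both paths should return the same list for simple, well-formed markup like this. On messy real-world pages the two parsers can recover from errors differently, so treat the fallback as a convenience rather than a drop-in equivalent.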