Lxml Install Procedure

Many complain that installing lxml is a PITA. Well, it does not have to be, and I will tell you why: the install procedure at the codespeak website is incomplete. They don’t tell you that you will be doing a build, and a build needs development headers that are not installed by default. As a consequence, if you just follow their procedure you will probably have an 80% failure rate. So what ARE you supposed to do?

The procedure outlined here is for Ubuntu/Debian or close derivatives.

1) Fundamentals

Are the following installed?

  • libxml2
  • libxslt
  • python

From the command line, do the following:

python

If it comes up with ‘>>>’ you are ready to go. Press Ctrl-D to get back to the Linux prompt.

Do an ‘aptitude search libxml2’ and an ‘aptitude search libxslt’. If a package is installed, its line starts with an ‘i’; a ‘p’ means it is not installed. If they are not installed, then:

aptitude install libxml2
aptitude install libxslt

(Note: the install must be run as root, or with sudo.)
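If you would rather check from Python whether the two runtime libraries are visible, here is a minimal sketch. It assumes Python 3 and uses only the standard library's ctypes.util.find_library. The helper name find_runtime_libs is made up for this example. It only looks for the shared libraries, so it tells you nothing about the -dev headers needed in step 2.

```python
from ctypes.util import find_library


def find_runtime_libs(names=("xml2", "xslt")):
    """Map each library short name to what the loader finds, or None."""
    # find_library("xml2") searches for libxml2 roughly the way the system
    # linker would and returns a file name, or None if nothing is found.
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, found in find_runtime_libs().items():
        status = found if found else "NOT FOUND (install the package)"
        print("lib%s: %s" % (name, status))
```

Any line that says NOT FOUND means the matching aptitude install is still needed.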

2) Provide the development headers for the build.

Perform an ‘aptitude search libxml2’ like you did before, and do the same for the libxslt package. Take note of each result. You should also see a libxml2-dev and a libxslt-dev entry. The names might be slightly different depending on which OS version you have installed. If they are not installed, then:

aptitude install [libxml2-dev]
aptitude install [libxslt-dev]

or the appropriate name for your OS. On Debian and Ubuntu, for instance, the XSLT headers are usually packaged as libxslt1-dev.

Next, install the Python dev headers as well. An ‘aptitude search python’ should display a python-dev entry. Perform the install:

aptitude install python-dev
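To confirm that the headers really landed, you can ask the interpreter where it expects Python.h to live. This is a rough sketch, assuming Python 3 and the standard sysconfig module; the function names are my own. Keep in mind that for Python 3 the package is normally called python3-dev rather than python-dev.

```python
import os
import sysconfig


def python_header_path():
    """Return the expected location of Python.h for the running interpreter."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


def have_python_headers():
    """True if Python.h exists where this interpreter expects it."""
    return os.path.isfile(python_header_path())


if __name__ == "__main__":
    print("Looking for:", python_header_path())
    print("Headers present:", have_python_headers())
```

If this prints False, the lxml build will most likely stop early, complaining that it cannot find Python.h.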

Almost at the last step. You need to install the Python setuptools package, which provides easy_install:

aptitude install python-setuptools

3) Now install lxml

easy_install lxml

The build will take a while. On my Core Duo 2.6 GHz box it takes about 5 minutes. If all is successful, you should see no errors.
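A quick way to double-check the result is to import lxml and print the versions it was built against. The sketch below assumes Python 3. The function name lxml_versions is just for illustration, while LXML_VERSION, LIBXML_VERSION and LIBXSLT_VERSION are version tuples exposed by lxml.etree.

```python
def lxml_versions():
    """Return lxml/libxml2/libxslt version tuples, or None if lxml is missing."""
    try:
        from lxml import etree
    except ImportError:
        # Either lxml was never installed or the compiled extension is broken.
        return None
    return {
        "lxml": etree.LXML_VERSION,
        "libxml2": etree.LIBXML_VERSION,
        "libxslt": etree.LIBXSLT_VERSION,
    }


if __name__ == "__main__":
    versions = lxml_versions()
    if versions is None:
        print("lxml is not importable; the build probably failed")
    else:
        for name, version in versions.items():
            print(name, ".".join(str(part) for part in version))
```

If the import fails, scroll back through the easy_install output; a missing header from step 2 is the usual suspect.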

Bonus item:

If you have good jQuery skills, you might want to install PyQuery. This is a Python tool that operates using the selectors defined by jQuery. It comes in quite handy for wrapping and reducing HTML code:

easy_install pyquery
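To give a feel for it, here is a small sketch of a jQuery-style selection with PyQuery. It assumes Python 3 and that pyquery installed cleanly. The HTML snippet, the "intro" class and the helper name are made up for the example.

```python
def first_intro_text(html):
    """Return the text of the first <p class="intro">, or None without PyQuery."""
    try:
        from pyquery import PyQuery
    except ImportError:
        return None
    doc = PyQuery(html)
    # Same selector syntax you would write in jQuery: $("p.intro")
    return doc("p.intro").eq(0).text()


if __name__ == "__main__":
    sample = "<div><p class='intro'>Hello</p><p>Other</p></div>"
    print(first_intro_text(sample))
```

The selector string is the part that carries over from jQuery; the rest is ordinary Python.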