the great source-highlight vs. pygments face-off
Installation
Source-highlight is written in C++ and needs the Boost regex library; together they take about 3.5 MB. Pygments is written in Python and has no dependencies beyond that; it takes about 2 MB installed. No problem on either side.
In Debian, the packages are source-highlight and python-pygments. Note that the Pygments command-line tool is called pygmentize.
Source-highlight is licensed under the GPL, Pygments under a BSD license.
Getting Started
pygmentize file.c
writes a colored version of file.c to the standard output. Nice.
source-highlight file.c
writes a colored version of file.c to file.c.html. As I had written before, the correct invocation for this purpose is
source-highlight -fesc -oSTDOUT file.c
That makes pygmentize slightly easier to use, I suppose.
Supported Languages
Source-highlight supports 30 languages, Pygments supports 136.
Source-highlight can produce output for DocBook, ANSI console, (X)HTML, Javadoc, LaTeX, Texinfo. Pygments can produce output for HTML, LaTeX, RTF, SVG, and several image formats.
Source-highlight supports styles, but only ships a few. Pygments supports styles and filters, and ships over a dozen styles.
So more options with Pygments here.
Also note that Pygments is a Python library that can be used, say, in web applications for syntax highlighting. This is used in Review Board, for example. Source-highlight is just a command-line tool, but it could of course also be invoked by other packages. Horde uses this, for instance.
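To make the library use concrete, here is a minimal sketch of calling Pygments from Python to get an HTML fragment for a C snippet. The function name highlight_c_as_html is made up for illustration, and the fallback to plain escaped text when Pygments is not installed is my own assumption, not something the tools above do.

```python
import html


def highlight_c_as_html(source: str) -> str:
    """Return an HTML fragment for a C snippet (hypothetical helper)."""
    try:
        from pygments import highlight
        from pygments.formatters import HtmlFormatter
        from pygments.lexers import CLexer
    except ImportError:
        # Assumption for this sketch: degrade to escaped plain text
        # when Pygments is not available.
        return "<pre>" + html.escape(source) + "</pre>"
    # highlight() runs the lexer over the source and renders the
    # resulting tokens with the chosen formatter.
    return highlight(source, CLexer(), HtmlFormatter())


fragment = highlight_c_as_html("int main(void) { return 0; }\n")
print(fragment)
```

A web application would embed the returned fragment in a page and ship a matching stylesheet; with Pygments, HtmlFormatter().get_style_defs() can generate one.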
Speed
To process all the C files in the PostgreSQL source tree (709271 lines), writing the ANSI console colored version of each file.c to file.c.txt:
- source-highlight: 25 seconds
- pygmentize: 5 minutes
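A measurement like this could be reproduced with a small harness along the following lines. This is only a sketch: the post does not say how the timing was done, so the one-process-per-file loop, the helper names, and the exact command lines are assumptions. A no-op stand-in command is used so the sketch runs even without either tool installed.

```python
import subprocess
import sys
import time
from typing import Callable, Iterable, List


def time_tool(build_cmd: Callable[[str], List[str]], paths: Iterable[str]) -> float:
    """Run one command per file and return the total wall-clock seconds."""
    start = time.perf_counter()
    for path in paths:
        # check=False: keep going even if one file fails to highlight.
        subprocess.run(build_cmd(path), check=False)
    return time.perf_counter() - start


def source_highlight_cmd(path: str) -> List[str]:
    # Assumed invocation: ANSI escape output written to <path>.txt.
    return ["source-highlight", "-f", "esc", "-i", path, "-o", path + ".txt"]


def pygmentize_cmd(path: str) -> List[str]:
    # Assumed invocation: terminal formatter written to <path>.txt.
    return ["pygmentize", "-f", "terminal", "-o", path + ".txt", path]


def noop_cmd(path: str) -> List[str]:
    # Stand-in that only starts a Python interpreter and exits.
    return [sys.executable, "-c", "pass"]


elapsed = time_tool(noop_cmd, ["a.c", "b.c"])
print(f"{elapsed:.2f} s")
```

Swapping noop_cmd for source_highlight_cmd or pygmentize_cmd and passing a real list of .c files gives the comparison. Note that starting a fresh interpreter for every file is part of what this measures for pygmentize.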
Robustness
pygmentize gave me a few errors of this sort during the processing of the PostgreSQL source tree:
*** Error while highlighting:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xb7' in position 83: ordinal not in range(128)
(file "/usr/lib/python2.5/site-packages/Pygments-1.0-py2.5.egg/pygments/formatters/terminal.py", line 93, in format)
*** Error while highlighting:
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 196-198: invalid data
(file "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode)
That probably shouldn't happen.
source-highlight gave no spurious errors in my limited testing.
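Both pygmentize messages are encoding errors rather than highlighting errors as such. The sketch below reproduces the same two exception types in isolation, using modern Python 3 rather than the Python 2.5 shown in the traceback. The sample strings are made up, and the guess that the offending files contain non-UTF-8 bytes is an assumption.

```python
# Encoding side: U+00B7 (middle dot) has no ASCII representation.
text = "a \u00b7 b"
try:
    text.encode("ascii")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True

# Two ways around it: substitute a placeholder, or use an encoding
# that can represent the character.
replaced = text.encode("ascii", errors="replace")
as_utf8 = text.encode("utf-8")

# Decoding side: Latin-1 bytes are not necessarily valid UTF-8.
raw = b"caf\xe9"
try:
    raw.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Decoding with the encoding the file actually uses succeeds.
as_latin1 = raw.decode("latin-1")

print(encode_failed, replaced, decode_failed, as_latin1)
```

A tool that has to cope with arbitrary source files would need either an explicit input and output encoding or a lenient error handler for cases like these.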
hey!
i did a similar comparison a while back when writing http://patch-tracker.debian.net. since i have a strong aversion to writing anything using system() in a webapp, and i was writing in python anyway, pygments was a clear winner for that use case.
In my test to produce source to be input into a LaTeX document, source-highlight produced output that just works... pygments did not, and I received several errors when I ran pdflatex.
I know it is not relevant for your post, but if someone is looking for highlighting into LaTeX, pygments is not recommended.
Thanks =)
Another option to check out is highlight, also in debian. Upstream ships swig based api's for perl and python, although so far I only install the perl ones in the debian package. The killer feature for me is a passthrough option to allow lines of markup in your source. I use it to add overlays for beamer.