2009-09-05

How to parse optional arguments without braces or brackets in TeX

This blog post shows the two winning solutions for the TeX argument parsing problem proposed by Kees van der Laan at the EuroTeX 2009 conference on 2009-08-31. My conclusion is that TeX macro programming, especially parsing undelimited input, is tricky and can become ugly, since there are no powerful string inspection and matching operations built into TeX.

The basic problem

The basic problem is to write the argument parsing part of a TeX macro called \jpg, which can be used to include external (JPEG) images, optionally overriding the image width and/or height, like below:
\jpg file.jpg  % at original size
\jpg width5cm file.jpg % scale proportionally (keeping aspect ratio)
\jpg height6cm file.jpg % ditto
\jpg width5cm height6cm file.jpg % resize ignoring aspect ratio
\jpg height6cm width5cm file.jpg % ditto
It is OK to assume that no file name starts with width, height or depth. The space after the dimension (e.g. 5cm) is mandatory.

Solution for the basic problem

This is my winning solution (download source) which solves the basic problem:
%
% jpgcmd_simple.tex: a simple braceless image inclusion TeX macro
% by pts@fazekas.hu at Mon Aug 31 13:42:49 CEST 2009
%
% This is one of the winning solutions for the problem proposed by Kees
% van der Laan on the conference EuroTeX 2009 (2009-08-31).

\def\jpgfilename#1 {%
\egroup % end of the \setbox0\vbox
%\showthe\dp0
\ifdim\dp0=0pt \else \errmessage{depth specified for jpg}\fi%
%\pdfximage
% \ifdim\wd0=0pt \else width\wd0\fi
% \ifdim\ht0=0pt \else height\ht0\fi
% {#1}%
%\pdfrefximage\pdflastximage
\message{JPG file=#1 width=\the\wd0 \space height=\the\ht0;}%
\endgroup}
\def\jpg{%
\begingroup
\setbox0\vtop\bgroup
\hsize0pt \parindent0pt
\everypar{\jpgfilename}%
\hrule height0pt }

ABC\jpg height3cm smiley.jpg
DEF\jpg width3cm smiley.jpg
GHI\jpg width4cm height3cm smiley.jpg
JKL\jpg width3cm smiley.jpg

MNO\jpg smiley.jpg % test for \par in the line below

PQR\jpg depth5mm smiley.jpg

\end
The basic solution above creates a vbox (using \vtop) containing an \hrule of the specified size, and then measures the size of the vbox. Thus the parsing of the optional width and height arguments is delegated to the TeX \hrule primitive. A \vtop is used instead of a \vbox so that a depth can be detected: a \vtop takes its height from its first item (the rule) and puts everything below that into its depth.
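The measuring trick can be tried in isolation. The following is a minimal plain TeX sketch, not part of the original solution: it builds the \vtop directly with fixed keywords instead of reading them from the input, and prints the resulting box dimensions to the terminal.

```tex
% Sketch only: \hrule scans the keywords itself. A later keyword overrides
% an earlier one, so the leading height0pt acts as a default value.
\setbox0\vtop{\hsize0pt \hrule height0pt width4cm height3cm }
\message{width=\the\wd0 \space height=\the\ht0 \space depth=\the\dp0;}
% With a depth keyword, \dp0 becomes nonzero; this is the condition that
% \jpgfilename reports as an error.
\setbox0\vtop{\hsize0pt \hrule height0pt depth5mm }
\message{depth=\the\dp0;}
\end
```

The first message should report the width and height of the rule with a zero depth, and the second a nonzero depth.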

A more versatile solution

This is another winning solution of mine (download source) which solves the basic problem, but it allows for more versatile optional arguments:
%
% jpgcmd_versatile.tex: a versatile braceless image inclusion TeX macro
% by pts@fazekas.hu at Thu Sep 3 11:01:07 CEST 2009
%
% This is one of the winning solutions for the problem proposed by Kees
% van der Laan on the conference EuroTeX 2009 (2009-08-31).

% If #2 starts with #1, then do #3{#z}, otherwise do #4. We get #z by removing
% #1 from the beginning of #2.

\def\ifprefix#1#2#3#4{%
\ifprefixloop#1\hbox\vbox!#2\hbox\vbox!{#3}{#4}%
}

\def\firstoftwo#1#2{#1}
\def\secondoftwo#1#2{#2}
\def\ifxgroup#1#2{%
\ifx#1#2\expandafter\firstoftwo\else\expandafter\secondoftwo\fi}
\def\striphbox#1\hbox\vbox!#2#3{#2{#1}}
\def\ifprefixloop#1#2\vbox!#3#4\vbox!{%
\ifxgroup#1\hbox{\striphbox#3#4\vbox!}{%
\ifxgroup#1#3{\ifprefixloop#2\vbox!#4\vbox!}\secondoftwo}}

% Tests.
\def\paren#1{(#1)}
\message{\ifprefix{}{barden}{1\paren}{0}!}

\message{\ifprefix{bar}{barden}{1\paren}{0}!}
\message{\ifprefix{bar}{bad}{1\paren}{0}!}

\message{\ifprefix{bar}{ba}{1\paren}{0}!}

\newdimen\jpgwidth
\newdimen\jpgheight
\newdimen\jpgscale

\def\jpg{%
\jpgwidth0pt
\jpgheight0pt
\jpgscale0pt
\jpgparse}

% #2 can start with or without =.
\def\jpgsetdimen#1#2{#1#2 \jpgparse}

\def\skipuntilhbox#1\hbox{}

% #2 can start with or without =.
% #2 can end with `pt' or not.
\def\jpgsetfloat#1#2{%
\afterassignment\skipuntilhbox
#1#2pt\space\space\hbox\jpgparse}

\def\jpgdeptherror#1{%
\errmessage{depth specified for jpg}\jpgparse}

\def\jpgparse#1 {%
\ifprefix{height}{#1}{\jpgsetdimen\jpgheight}{%
\ifprefix{width}{#1}{\jpgsetdimen\jpgwidth}{%
\ifprefix{depth}{#1}{\jpgdeptherror}{%
\ifprefix{scale}{#1}{\jpgsetfloat\jpgscale}{%
\jpgshow{#1}}}}}}

\def\jpgshow#1{%
\message{JPG file=#1 width=\the\jpgwidth\space height=\the\jpgheight\space
scale=\the\jpgscale;}%
}

ABC\jpg height3cm smiley.jpg
DEF\jpg width3cm smiley.jpg
GHI\jpg width4cm height=3cm smiley.jpg
JKL\jpg width3cm scale=4 smiley.jpg

% test for \par in the line above

MNO\jpg width3cm scale=4pt smiley.jpg
PQR\jpg smiley.jpg
STU\jpg depth5cm smiley.jpg

\end
This solution accepts the optional argument scale with or without a unit after its value (the default unit is pt), and it also accepts an equals sign (=) after the optional argument keywords. The implementation is a lot longer now, since it has to parse the optional arguments manually. It uses some well-known TeX macro programming tricks to scan a string character by character. The \ifxgroup macro is worth mentioning: it is an \ifx, but it accepts the code for the then and else branches as brace-delimited arguments. Using braces here is a fundamental trick for implementing nested ifs and recursion, because otherwise the macros in the then and else branches would receive \else and \fi (respectively) instead of the next token.
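To see the effect of \ifxgroup on its own, here is a small plain TeX sketch. The three helper definitions are copied from the solution above; \yes and \no are hypothetical one-argument macros introduced only for this illustration.

```tex
% Copied from the solution above.
\def\firstoftwo#1#2{#1}
\def\secondoftwo#1#2{#2}
\def\ifxgroup#1#2{%
  \ifx#1#2\expandafter\firstoftwo\else\expandafter\secondoftwo\fi}
% \yes and \no are hypothetical macros, for illustration only.
\def\yes#1{yes(#1)}
\def\no#1{no(#1)}
% Naive form (not run): in \ifx aa\yes\else\no\fi{next} the macro \yes
% would grab \else as its argument instead of {next}.
% With \ifxgroup the conditional is finished before the branch is expanded,
% so the chosen macro sees {next} as its argument.
\message{\ifxgroup aa{\yes}{\no}{next}}% expands to yes(next)
\message{\ifxgroup ab{\yes}{\no}{next}}% expands to no(next)
\end
```

The \expandafter tokens are what remove \else or \fi before \firstoftwo or \secondoftwo picks the branch, which is why the selected branch can safely consume the tokens that follow.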
