This is very rough, because I haven't distributed it to anyone before. It's not quite finished. There are some changes I want to make to the internals, and the TeX and LaTeX output drivers aren't done. (The LaTeX one is barely begun.) If I left anything out, let me know. ---------------------------------------------------------------- The conversion program is called 'mod'. To use it: mod -o output-format [ -c chapter-number ] input-file 'm2t' is the same as 'mod -o text' 'm2T' is the same as 'mod -o TeX' 'm2h' is the same as 'mod -o html' 'm2g' is the same as 'mod -o generic' 'm2c' is the same as 'mod -o crappy' 'crappy' is a trivial conversion to plain text. If you want to modify one of the drivers, you should probably have a look at Mod::Crappy first. 'generic' is the base class from which the others inherit. It contains the MOD parsing code, escape sequence translation, and so on. One day I will write API documentation for it. If the input file is foo.mod, the output file is foo.tex, or foo.html, foo.txt, or foo.crap, as appropriate. The -c option is used for numbering sections. (So that the sections in chapter 4 get numbered 4.1, 4.2, etc.) If you omit it, mod tries to guess the chapter number from the input filename. ---------------------------------------------------------------- MOD is Pod-like. Input is structured as paragraphs, separated by blank lines. If a paragraph is indented, it's program source code; if not it's prose. Source code has one special feature: If the leftmost character in a source code line is a '*', the '*' is removed. The driver gets an array of source code lines and an array indicating which lines were starred. Starred lines are 'special'. In my book, this means that they are lines that have changed since the last time the reader saw this same code. The HTML driver sets the special lines in bold purple font. The text driver sets special lines with a '*' in the left margin. The TeX driver will set the special lines with a change par in the margin. If a paragraph begins with an = sign, it's a command. MOD defines a syntax and a recommended command set, but commands are implemented by the various drivers, each in its own way. Commands are case-insensitive, but you should use all lowercase so that I can reserve the capitalized names for nonstandard drivers. Commands that my drivers understand at present: =startlisting listingname =endlisting listingname =listing listingname =inline listingname All program listings between these two marks are considered to be part of the named program listing. The text driver will dump the code to a file named Programs/listingname and inline it. The HTML driver will inline the code, with a link to Programs/listingname. Prose paragraphs between =startlisting and =endlisting are typeset normally, but are not included in the listing. So you can do: The following function takes an argument and returns the next largest prime number: =startlisting next_prime sub next_prime { my $n = shift; $n++; We need to increment C<$n> immediately, because the function is guaranteed to return a prime I than C<$n>. while (1) { return $n if is_prime($n); We saw F on page R. $n++; } } =endlisting next_prime The code, and only the code, will be inserted into Programs/next_prime. Note that =startlisting and =endlisting commands need not nest correctly. It is permissible to have listings overlap. If you have two listings with the same name, the later one will be appended to the earlier. =listing is synonymous with =startlisting. =inline listingname inserts the code for a previously defined listing: Recall the function F, which we saw back in Chapter R: =inline next_prime This could be made more flexible by... =startlisting...=endlisting may also make an entry in the index, and may also define a crossreference for the R<...> sequence. =chapter chaptertitle =section sectiontitle =subsection subsectiontitle =subsubsection subsubsectiontitle Start a new chapter, section, etc. Typesets the title appropriately and adjusts the sections numbers. The text file driver makes an appropriate entry in the .toc file. Drivers may set up crossreference information for the R<...> sequence, described below. =startpicture picturename =endpicture picturename The contents are ascii-art. The text driver inserts this art verbatim. The HTML driver looks for an image file named 'picturename' and inlines that instead. The TeX driver will probably look for picturename.ps or something of the sort. =note TEXT Ignored. This is a comment. =comment =endcomment Intervening text is discarded. This may go away in the next version because =note was good enough. =bulletedlist =endbulletedlist =numberedlist =endnumberedlist =item Start or close a list of the specified type. =item typesets a list item. Lists must be nested correctly; failure to do so will yield a warning. =indented =endindented Prose text that is indented, as for example a block quotation. The existing drivers accept =maxim and =endmaxim as synonyms for =indented. This will probably go away. =stop End of the file; everything following is ignored. ---------------- Escape codes in prose: In a prose paragraph, and in certain command texts, certain escape sequences are recognized. They are all podlike: X where 'X' is some single letter. For example: B<...> Typeset text in boldface. (The text driver uses *asterisks*) I<...> Typeset text in italics. (The text driver uses _underscores_) C<...> Typeset text as code. (The text driver uses 'quotes') Some replacements are carried out recursively. For example, B baz> generates "*foo _bar_ baz*". But no recursion is done inside of C<...>. I don't have a spec for this yet, but I do think it is likely to do what you expect. To put a literal < or > mark inside an escape sequence, backslash it. To put a literal \ inside an escape sequence, backslash it: To call a method, use C<$object-\>method(arguments...)> The backslashes are mandatory. The rest of the escape sequences are: F<...> This is a function name. It is probably typeset like code. Some drivers may choose to append notation like '()' to the function name. Some drivers may automatically make a special index entry for the function. This may get moved out of the standard drivers into my own private drivers. N<...> Typeset a footnote. The text driver just does [[Footnote: ...]]. The HTML driver generates a hotlinked footnote that appears at the bottom of the page. The teX driver will use \footnote{...}. V<...> Typeset a mathematical variable name. This usually means italics or the equivalent. For example: The argument to F is the smallest prime number V

which is larger than the specified argument, V. M<...> An extended math formula. The contents should be in TeX format. The TeX driver will insert this verbatim. The text driver maintains a database showing an approximate plain text representation of the formula. Formulas that are listed in the database have their equivalent translations inserted into the output. Formulas that are not listed in the database are appended to the 'badmath' file, a warning is printed, and the TeX version of the formula is inserted directly. After the run is over, you can run the program 'defmath', which reads the 'badmath' file and prompts you for additions to the database. The HTML driver presently does the same thing that the text driver does, but this might change. It might have a database if inlinable gifs, or use TeX to generate the gifs if one is missing. R generates a crossreference. This isn't fully implemented yet. The intent is that you'll be able to write R and the driver will insert the section number of the section titled 'Carrots'; similarly C or C or whatever. At present, there is no way to define a cross-reference. T<...> Generate a reprensentation of an HTML tag. The text driver turns T into "". The HTML driver turns it into "<font>". X<...> Generate an index entry. An index entry has four properties: The text that is actually typeset inline, the text that is typeset in the index, the alphabetizing key for the index entry, and the style that is used for the page number in the index. Support isn't complete yet. However, the following do work: X typesets 'fish' inline and inserts it into the index. For example: It is not known whether X obey the X. X is the same, but nothing is typeset inline. (The 'i' is for 'invisible.) For example: Some might say that our scaly friends of the deep are ignorant of statistics.X It is not known whether X obey the X. X X is the same, but the intent is that this is the index entry for the *definition* of 'fish'. 'fish' is typeset inline in italics, and there may be a special annotation in the index (an italicized page number, for example) to indicate that this reference is special. Example: These water-dwelling vertebrates are known as X. X X Material pertaining to 'fish' starts/ends here. '(' and ')' imply 'i'. The index entry will come out with a range of page numbers, as "fish, 273--284" X> like X, but alphabetize under 'foo'. X like X, but the page number in the index should be in bold face. X like X, but the page number in the index should be in italics. X inserts the note "see also ichthyes" in the index under 'fish' The intent is that flags can be combined, so that you might write Xid> to indicate that the current page is the point at which 'fish' is defined, but the word 'fish' is not actually inserted at that point on the page, and the entry for 'fish' should be alphabetized under 'ichthyes'. In general, if there is a | in the sequence, everything after the last | is taken to be a flag option. Any other |s separate subentries. For example, X inserts a subentry in the index under 'fruits': fruits apples, 357 If you want to omit the 'i' here, the word 'fruits' will be inserted inline. You have to say X. X won't work because mod will try to interpret the 'apples' as a set of flags. All of the drivers generate the inline text correctly, but none actually builds an index. I'm planning to write another driver, Mod::Index, which extracts the index terms and builds an index data file, which the other drivers can then read and transform into a formatted index. I think that's all, except for the API documentation.