Subclassing formatters

In the previous section we customized the output format by altering the format dictionary. Sometimes more advanced customizations are needed. This can be done by subclassing Existing formatters .

The base class PwebFormatter has a method preformat_chunk() that processes all chunks before they are processed by default formatters.

Suppose I have this document (view the source in browser) using markdown markup and I want convert the doc chunks to HTML and output code chunks using Pweave default HTML formatter.

I can do this by subclassing PwebHTMLFormatter. MDtoHTML class below converts the content of all documentation chunks to HTML using python Markdown package. (chunk['type'] for code chunks is “code”). The class also stores the chunks for us to see what they contain, but that’s not needed for formatting.

from pweave import *
import markdown

class MDtoHTML(PwebHTMLFormatter):

    chunks = [] #Let's keep a copy of chunks

    def preformat_chunk(self, chunk):
        MDtoHTML.chunks.append(chunk.copy()) #Store the chunks
        if chunk['type'] == "doc":
            chunk['content'] = markdown.markdown(chunk['content'])

The specified subclass can then be used as formatter with Pweb class.

doc = Pweb('ma.mdw')
doc.setformat(Formatter = MDtoHTML)
Processing chunk 1 named None from line 22
Processing chunk 2 named None from line 31
Processing chunk 3 named None from line 42
Pweaved ma.mdw to ma.html

And here is the weaved document.

A closer look at the chunks

Remember that we kept a copy of the chunks in the previous example? As you can see below the chunk is a dictionary that contains code, results and all of the chunk options. You can manipulate all of these options as we did to content in previous example to control how the chunk is formatted in output.


You can your own options (key = value) to chunks and they will also appear in the chunk dictionary.

Let’s see what the first code chunk contains:

import pprint
{'caption': False,
 'codeend': '',
 'codestart': '',
 'complete': True,
 u'content': u'\nfrom pylab import *\nimport scipy.signal as
signal\n#A function to plot frequency and phase response\ndef
mfreqz(b,a=1):\n    w,h = signal.freqz(b,a)\n    h = abs(h)\n
return(w/max(w), h)\n',
 'doctype': 'html',
 'echo': True,
 'engine': 'python',
 'evaluate': True,
 'extension': 'html',
 'f_env': None,
 'f_pos': 'htpb',
 'f_size': (8, 6),
 'f_spines': True,
 'fig': True,
 'figfmt': '.png',
 'figure': [],
 'include': True,
 'name': None,
 u'number': 1,
 'option_string': u'',
 'outputend': '',
 'outputstart': '',
 'result': '\n\n',
 'results': 'verbatim',
 'savedformats': ['.png'],
 u'start_line': 22,
 'term': False,
 'termend': '',
 'termstart': '',
 u'type': u'code',
 'width': '600',
 'wrap': True}


Pweb class also uses separate classes to parse and execute the document, but subclassing these is not currently documented and is hopefully not needed.