Subclassing formatters

In the previous section we customized the output format by altering the format dictionary. Sometimes more advanced customizations are needed. This can be done by subclassing Existing formatters .

The base class PwebFormatter has a method preformat_chunk() that processes all chunks before they are processed by default formatters.

Suppose I have this document (view the source in browser) using markdown markup and I want convert the doc chunks to HTML and output code chunks using Pweave default HTML formatter.

I can do this by subclassing PwebHTMLFormatter. MDtoHTML class below converts the content of all documentation chunks to HTML using python Markdown package. (chunk['type'] for code chunks is “code”). The class also stores the chunks for us to see what they contain, but that’s not needed for formatting.

from pweave import *
import markdown

class MDtoHTML(PwebHTMLFormatter):

    chunks = [] #Let's keep a copy of chunks

    def preformat_chunk(self, chunk):
        MDtoHTML.chunks.append(chunk.copy()) #Store the chunks
        if chunk['type'] == "doc":
            chunk['content'] = markdown.markdown(chunk['content'])
        return(chunk)

The specified subclass can then be used as formatter with Pweb class.

doc = Pweb('ma.mdw')
doc.setformat(Formatter = MDtoHTML)
doc.weave()
---------------------------------------------------------------------------FileNotFoundError
Traceback (most recent call last)~/anaconda3/lib/python3.6/site-
packages/pweave/readers.py in read_file_or_url(source)
     15     try:
---> 16         codefile = io.open(source, 'r', encoding='utf-8')
     17         contents = codefile.read()
FileNotFoundError: [Errno 2] No such file or directory: 'ma.mdw'
During handling of the above exception, another exception occurred:
ValueError                                Traceback (most recent call
last)<ipython-input-1-cfc1e282b38f> in <module>()
----> 1 doc = Pweb('ma.mdw')
      2 doc.setformat(Formatter = MDtoHTML)
      3 doc.weave()
~/anaconda3/lib/python3.6/site-packages/pweave/pweb.py in
__init__(self, source, doctype, informat, kernel, output, figdir,
mimetype)
     68
     69         self.setformat(doctype)
---> 70         self.read(reader = informat)
     71
     72     def _setwd(self):
~/anaconda3/lib/python3.6/site-packages/pweave/pweb.py in read(self,
string, basename, reader)
    106
    107         if string is None:
--> 108             self.reader = Reader(file=self.source)
    109         else:
    110             self.reader = self.Reader(string=string)
~/anaconda3/lib/python3.6/site-packages/pweave/readers.py in
__init__(self, file, string)
     36         # Get input from string or
     37         if file is not None:
---> 38             self.rawtext = read_file_or_url(self.source)
     39         else:
     40             self.rawtext = string
~/anaconda3/lib/python3.6/site-packages/pweave/readers.py in
read_file_or_url(source)
     18         codefile.close()
     19     except IOError:
---> 20         r = request.urlopen(source)
     21         contents = r.read().decode("utf-8")
     22         r.close()
~/anaconda3/lib/python3.6/urllib/request.py in urlopen(url, data,
timeout, cafile, capath, cadefault, context)
    221     else:
    222         opener = _opener
--> 223     return opener.open(url, data, timeout)
    224
    225 def install_opener(opener):
~/anaconda3/lib/python3.6/urllib/request.py in open(self, fullurl,
data, timeout)
    509         # accept a URL or a Request object
    510         if isinstance(fullurl, str):
--> 511             req = Request(fullurl, data)
    512         else:
    513             req = fullurl
~/anaconda3/lib/python3.6/urllib/request.py in __init__(self, url,
data, headers, origin_req_host, unverifiable, method)
    327                  origin_req_host=None, unverifiable=False,
    328                  method=None):
--> 329         self.full_url = url
    330         self.headers = {}
    331         self.unredirected_hdrs = {}
~/anaconda3/lib/python3.6/urllib/request.py in full_url(self, url)
    353         self._full_url = unwrap(url)
    354         self._full_url, self.fragment =
splittag(self._full_url)
--> 355         self._parse()
    356
    357     @full_url.deleter
~/anaconda3/lib/python3.6/urllib/request.py in _parse(self)
    382         self.type, rest = splittype(self._full_url)
    383         if self.type is None:
--> 384             raise ValueError("unknown url type: %r" %
self.full_url)
    385         self.host, self.selector = splithost(rest)
    386         if self.host:
ValueError: unknown url type: 'ma.mdw'

And here is the weaved document.

A closer look at the chunks

Remember that we kept a copy of the chunks in the previous example? As you can see below the chunk is a dictionary that contains code, results and all of the chunk options. You can manipulate all of these options as we did to content in previous example to control how the chunk is formatted in output.

Note

You can your own options (key = value) to chunks and they will also appear in the chunk dictionary.

Let’s see what the first code chunk contains:

import pprint
pprint.pprint(MDtoHTML.chunks[1])
---------------------------------------------------------------------------IndexError
Traceback (most recent call last)<ipython-input-1-ae88b73f6ea0> in
<module>()
      1 import pprint
----> 2 pprint.pprint(MDtoHTML.chunks[1])
IndexError: list index out of range

Note

Pweb class also uses separate classes to parse and execute the document, but subclassing these is not currently documented and is hopefully not needed.