Python HTMLParser and super()
Posted on Thu 16 February 2012 in Posts
So I have a class that inherits from HTMLParser, and I want to call the superclass constructor (the __init__ of HTMLParser). I would think I should do:
from HTMLParser import HTMLParser

class MyParser(HTMLParser):
    def __init__(self):
        super(MyParser, self).__init__()
But this causes a problem:
myparser = MyParser()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __init__
TypeError: must be type, not classobj
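What's with that? The super(class, instance).__init__ idiom is supposedly the proper way of calling a parent class constructor, and it is, if the class is a "new-style" Python class (one which inherits from object, or from a class which inherits from object). super() requires its first argument to be a type, and an old-style class is a classobj rather than a type, hence the TypeError.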
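If you want to check up front whether super() will accept a class, a quick test is whether the class is an instance of type. Here is a minimal sketch of that check; the helper name is_new_style and the Plain class are my own inventions for illustration, not anything from the standard library, and the try/except import is only there so the snippet also runs on Python 3.

```python
try:
    from HTMLParser import HTMLParser    # Python 2 module name
except ImportError:
    from html.parser import HTMLParser   # Python 3 module name


def is_new_style(cls):
    """Hypothetical helper: True if cls is a new-style class.

    New-style classes are instances of type. Old-style Python 2 classes
    are "classobj" instances instead, which is what super() rejects.
    """
    return isinstance(cls, type)


class Plain(object):
    """Illustrative new-style class (explicitly inherits from object)."""
    pass


print(is_new_style(Plain))       # True on Python 2 and Python 3
print(is_new_style(HTMLParser))  # False on Python 2.7, True on Python 3
```

On Python 3 every class is new-style, so the check only tells you anything interesting on Python 2.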
And therein is the problem: HTMLParser inherits from markupbase.ParserBase, and markupbase.ParserBase is defined as:
class ParserBase:
    """Parser base class which provides some common support methods used
    by the SGML/HTML and XHTML parsers."""
That is, as an old-style class. One definitely wonders why, as late as Python 2.7, the classes that form part of the standard library aren't all new-style classes, especially when a class is intended to be inherited from (like HTMLParser). Anywho, to fix:
class MyParser(HTMLParser):
    def __init__(self):
        # Old style way of doing super()
        HTMLParser.__init__(self)
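To see the fix doing actual work, here is a slightly fuller sketch. LinkCollector and NewStyleParser are hypothetical names I made up for the example. The second class shows a different workaround from the one above: listing object as an extra base makes the subclass new-style, so super() is accepted even on Python 2. The try/except import is an assumption on my part so the same code runs on Python 3, where the module moved to html.parser.

```python
try:
    from HTMLParser import HTMLParser    # Python 2 module name
except ImportError:
    from html.parser import HTMLParser   # Python 3 module name


class LinkCollector(HTMLParser):
    """Hypothetical subclass: collects href values from <a> tags."""

    def __init__(self):
        # Explicit base-class call: works whether the base class is
        # old-style or new-style.
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


class NewStyleParser(HTMLParser, object):
    """Alternative workaround: adding object as a base makes this
    subclass new-style, so super() no longer raises TypeError."""

    def __init__(self):
        super(NewStyleParser, self).__init__()


collector = LinkCollector()
collector.feed('<p><a href="http://example.com/">link</a></p>')
print(collector.links)  # ['http://example.com/']

parser = NewStyleParser()  # constructs without the TypeError
```

The explicit HTMLParser.__init__(self) call is the less surprising of the two, since it doesn't change the class hierarchy. The object trick is handy if you want super() elsewhere in the subclass too.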
(Note: this post originally appeared on my blogspot blog at: http://codependentcodr.blogspot.ca/2012/02/python-htmlparser-and-super.html)