
Module _htmldecode

HTML Decoder.


Author: André Malo

Functions
unicode
decode(value, encoding='latin-1', errors='strict', entities=None)
Decode HTML encoded text
Variables
  __package__ = 'tdi'
Function Details

decode(value, encoding='latin-1', errors='strict', entities=None)

Decode HTML encoded text
Parameters:
  • value (basestring) - HTML content to decode
  • encoding (str) - Encoding used to decode value to unicode before it is processed further. If value is already a unicode instance, the encoding is ignored. If omitted, 'latin-1' is applied (because it cannot fail and maps bytes 1:1 to Unicode code points).
  • errors (str) - Error handling, passed to .decode() and also applied to entities. If an entity name or character code point cannot be found or cannot be parsed, the error handler has the following semantics:

    strict (or any value other than the tokens below)
        A ValueError is raised.
    ignore
        The original entity is passed through unchanged.
    replace
        The entity is replaced by the replacement character (U+FFFD).
  • entities (dict) - Entity name mapping (unicode(name) -> unicode(value)). If omitted or None, the HTML5 entity list is applied.

Returns: unicode
The decoded content
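
The parameter semantics above can be illustrated with a minimal sketch. This is not the tdi implementation: decode_sketch, its regular expression, and its use of the Python 3 standard library's html.entities.html5 table are assumptions made for illustration only, and the real module targets Python 2 (basestring/unicode). It only mirrors the documented behaviour of encoding, errors, and entities.

```python
import re
from html.entities import html5  # Python 3 stdlib table; keys look like 'amp;'

# Hypothetical stand-in for illustration; NOT tdi's actual code.
# Matches &name;, &#123; and &#x1F; style references.
_ENTITY = re.compile(r'&(#[xX]?[0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);')


def decode_sketch(value, encoding='latin-1', errors='strict', entities=None):
    """Decode HTML-encoded text following the documented parameter semantics."""
    # Bytes are decoded first; text input ignores the encoding argument.
    if isinstance(value, bytes):
        value = value.decode(encoding, errors)
    if entities is None:
        # Assumption: strip the trailing ';' so names map as name -> value.
        entities = {k[:-1]: v for k, v in html5.items() if k.endswith(';')}

    def resolve(match):
        ref = match.group(1)
        try:
            if ref.startswith('#'):
                if ref[1:2] in ('x', 'X'):
                    return chr(int(ref[2:], 16))
                return chr(int(ref[1:], 10))
            return entities[ref]
        except (KeyError, ValueError, OverflowError):
            if errors == 'ignore':
                return match.group(0)      # pass the original entity through
            if errors == 'replace':
                return '\ufffd'            # replacement character
            raise ValueError('Unresolvable entity: %r' % match.group(0))

    return _ENTITY.sub(resolve, value)


print(decode_sketch(b'caf\xe9 &amp; &#x2603;'))       # café & ☃
print(decode_sketch('&bogus;', errors='ignore'))      # &bogus;
print(decode_sketch('&foo;', entities={'foo': 'bar'}))  # bar
```

With the default errors='strict', the same '&bogus;' input raises ValueError in this sketch, matching the table of error handlers above.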

Variables Details

__package__

Value:
'tdi'