decode(value, encoding='latin-1', errors='strict', entities=None)
Decode HTML-encoded text.
- Parameters:
value (basestring) - HTML content to decode
encoding (str) - Encoding used to decode value to unicode before it is processed
further. If value is already a unicode instance, the encoding is
ignored. If omitted, 'latin-1' is applied (because it cannot fail
and maps bytes 1:1 to unicode codepoints).
errors (str) - Error handling, passed to .decode() and also applied to entities.
If an entity name or character codepoint cannot be found or
parsed, the error handler has the following semantics:
- strict (or anything different from the other tokens below)
- A ValueError is raised.
- ignore
- The original entity is passed through unchanged.
- replace
- The entity is replaced by the replacement character
(U+FFFD).
entities (dict) - Entity name mapping (unicode(name) -> unicode(value)). If
omitted or None, the HTML5 entity list is applied.
- Returns: unicode
- The decoded content
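
The error handler semantics described above can be illustrated with a minimal sketch. This is not the library's implementation: `decode_sketch` is a hypothetical name, the code assumes Python 3 (where `str` plays the role of `unicode` and `bytes` stands in for encoded input), and it assumes the standard library's `html.entities.html5` table is an acceptable stand-in for "the HTML5 entity list".

```python
import re
from html.entities import html5

# Matches named (&amp;), decimal (&#38;) and hexadecimal (&#x26;) references.
_ENTITY = re.compile(r'&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z][a-zA-Z0-9]*);')


def decode_sketch(value, encoding='latin-1', errors='strict', entities=None):
    """Hypothetical re-creation of the documented behaviour, for illustration."""
    # The encoding only matters for byte input; text is used as-is.
    if isinstance(value, bytes):
        value = value.decode(encoding, errors)
    if entities is None:
        # Assumption: html5 keys look like 'amp;', so drop the semicolon.
        entities = {k[:-1]: v for k, v in html5.items() if k.endswith(';')}

    def resolve(match):
        ref = match.group(1)
        try:
            if ref.startswith('#'):
                digits = ref[1:]
                base = 16 if digits[0] in 'xX' else 10
                return chr(int(digits.lstrip('xX'), base))
            return entities[ref]
        except (KeyError, ValueError, OverflowError):
            if errors == 'ignore':
                return match.group(0)   # pass the original entity through
            if errors == 'replace':
                return '\ufffd'         # U+FFFD REPLACEMENT CHARACTER
            # 'strict' or any other token
            raise ValueError('Unresolved entity: %r' % match.group(0))

    return _ENTITY.sub(resolve, value)
```

With this sketch, `decode_sketch('&bogus;', errors='ignore')` leaves the text untouched, `errors='replace'` yields a single U+FFFD character, and the default `'strict'` raises a ValueError.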