# Current file: /usr/local/lib/python3.8/html/__pycache__/parser.cpython-38.pyc
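"""A parser for HTML and XHTML."""

import re
import warnings
import _markupbase

from html import unescape


__all__ = ['HTMLParser']

# Regular expressions used for parsing

interesting_normal = re.compile('[&<]')
incomplete = re.compile('&[a-zA-Z#]')

entityref = re.compile('&([a-zA-Z][-.a-zA-Z0-9]*)[^a-zA-Z0-9]')
charref = re.compile('&#(?:[0-9]+|[xX][0-9a-fA-F]+)[^0-9a-fA-F]')

starttagopen = re.compile('<[a-zA-Z]')
piclose = re.compile('>')
commentclose = re.compile(r'--\s*>')
tagfind_tolerant = re.compile(r'([a-zA-Z][^\t\n\r\f />\x00]*)(?:\s|/(?!>))*')
attrfind_tolerant = re.compile(
    r'((?<=[\'"\s/])[^\s/>][^\s/=>]*)(\s*=+\s*'
    r'(\'[^\']*\'|"[^"]*"|(?![\'"])[^>\s]*))?(?:\s|/(?!>))*')
locatestarttagend_tolerant = re.compile(r"""
  <[a-zA-Z][^\t\n\r\f />\x00]*       # tag name
  (?:[\s/]*                          # optional whitespace before attribute name
    (?:(?<=['"\s/])[^\s/>][^\s/=>]*  # attribute name
      (?:\s*=+\s*                    # value indicator
        (?:'[^']*'                   # LITA-enclosed value
          |"[^"]*"                   # LIT-enclosed value
          |(?!['"])[^>\s]*           # bare value
         )
        \s*                          # possibly followed by a space
       )?(?:\s|/(?!>))*
     )*
   )?
  \s*                                # trailing whitespace
""", re.VERBOSE)
endendtag = re.compile('>')
endtagfind = re.compile(r'</\s*([a-zA-Z][-.a-zA-Z0-9:_]*)\s*>')


class HTMLParser(_markupbase.ParserBase):
    """Find tags and other markup and call handler functions.

    Usage:
        p = HTMLParser()
        p.feed(data)
        ...
        p.close()

    Start tags are handled by calling self.handle_starttag() or
    self.handle_startendtag(); end tags by self.handle_endtag().  The
    data between tags is passed from the parser to the derived class
    by calling self.handle_data() with the data as argument (the data
    may be split up in arbitrary chunks).  If convert_charrefs is
    True the character references are converted automatically to the
    corresponding Unicode character (and self.handle_data() is no
    longer split in chunks), otherwise they are passed by calling
    self.handle_entityref() or self.handle_charref() with the string
    containing respectively the named or numeric reference as the
    argument.
    """

    CDATA_CONTENT_ELEMENTS = ("script", "style")

    def __init__(self, *, convert_charrefs=True):
        """Initialize and reset this instance.

        If convert_charrefs is True (the default), all character references
        are automatically converted to the corresponding Unicode characters.
        """
        self.convert_charrefs = convert_charrefs
        self.reset()

    def reset(self):
        """Reset this instance.  Loses all unprocessed data."""
        self.rawdata = ''
        self.lasttag = '???'
        self.interesting = interesting_normal
        self.cdata_elem = None
        _markupbase.ParserBase.reset(self)

    def feed(self, data):
        r"""Feed data to the parser.

        Call this as often as you want, with as little or as much text
        as you want (may include '\n').
        """
        self.rawdata = self.rawdata + data
        self.goahead(0)

    def close(self):
        """Handle any buffered data."""
        self.goahead(1)

    __starttag_text = None

    def get_starttag_text(self):
        """Return full source of start tag: '<...>'."""
        return self.__starttag_text

    def set_cdata_mode(self, elem):
        self.cdata_elem = elem.lower()
        self.interesting = re.compile(r'</\s*%s\s*>' % self.cdata_elem, re.I)

    def clear_cdata_mode(self):
        self.interesting = interesting_normal
        self.cdata_elem = None

    def goahead(self, end):
        rawdata = self.rawdata
        i = 0
        n = len(rawdata)
        while i < n:
            if self.convert_charrefs and not self.cdata_elem:
                j = rawdata.find('<', i)
                if j < 0:
                    amppos = rawdata.rfind('&', max(i, n-34))
                    if (amppos >= 0 and
                        not re.compile(r'[\s;]').search(rawdata, amppos)):
                        break  # wait till we get all the text
                    j = n
            else:
                match = self.interesting.search(rawdata, i)  # < or &
                if match:
                    j = match.start()
                else:
                    if self.cdata_elem:
                        break
                    j = n
            if i < j:
                if self.convert_charrefs and not self.cdata_elem:
                    self.handle_data(unescape(rawdata[i:j]))
                else:
                    self.handle_data(rawdata[i:j])
            i = self.updatepos(i, j)
            if i == n: break
            startswith = rawdata.startswith
            if startswith('<', i):
                if starttagopen.match(rawdata, i):  # < + letter
                    k = self.parse_starttag(i)
                elif startswith("</", i):
                    k = self.parse_endtag(i)
                elif startswith("<!--", i):
                    k = self.parse_comment(i)
                elif startswith("<?", i):
                    k = self.parse_pi(i)
                elif startswith("<!", i):
                    k = self.parse_html_declaration(i)
                elif (i + 1) < n:
                    self.handle_data("<")
                    k = i + 1
                else:
                    break
                if k < 0:
                    if not end:
                        break
                    k = rawdata.find('>', i + 1)
                    if k < 0:
                        k = rawdata.find('<', i + 1)
                        if k < 0:
                            k = i + 1
                    else:
                        k += 1
                    if self.convert_charrefs and not self.cdata_elem:
                        self.handle_data(unescape(rawdata[i:k]))
                    else:
                        self.handle_data(rawdata[i:k])
                i = self.updatepos(i, k)
            elif startswith("&#", i):
                match = charref.match(rawdata, i)
                if match:
                    name = match.group()[2:-1]
                    self.handle_charref(name)
                    k = match.end()
                    if not startswith(';', k-1):
                        k = k - 1
                    i = self.updatepos(i, k)
                    continue
                else:
                    if ";" in rawdata[i:]:  # bail by consuming &#
                        self.handle_data(rawdata[i:i+2])
                        i = self.updatepos(i, i+2)
                    break
            elif startswith('&', i):
                match = entityref.match(rawdata, i)
                if match:
                    name = match.group(1)
                    self.handle_entityref(name)
                    k = match.end()
                    if not startswith(';', k-1):
                        k = k - 1
                    i = self.updatepos(i, k)
                    continue
                match = incomplete.match(rawdata, i)
                if match:
                    # match.group() will contain at least 2 chars
                    if end and match.group() == rawdata[i:]:
                        k = match.end()
                        if k <= i:
                            k = n
                        i = self.updatepos(i, i + 1)
                    # incomplete
                    break
                elif (i + 1) < n:
                    # not the end of the buffer, and can't be confused
                    # with some other construct
                    self.handle_data("&")
                    i = self.updatepos(i, i + 1)
                    continue
                else:
                    break
            else:
                assert 0, "interesting.search() lied"
        # end while
        if end and i < n and not self.cdata_elem:
            if self.convert_charrefs and not self.cdata_elem:
                self.handle_data(unescape(rawdata[i:n]))
            else:
                self.handle_data(rawdata[i:n])
            i = self.updatepos(i, n)
        self.rawdata = rawdata[i:]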
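The usage pattern described in the class docstring (create a parser, call feed() as often as needed, then close()) can be sketched with a small subclass. This is a minimal illustration, not part of the module itself: the class name LinkCollector, its tags/links/text attributes, and the helper collect() are hypothetical names introduced here. Only HTMLParser, feed(), close(), handle_starttag() and handle_data() come from the standard library.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical subclass: records start tags, href values and text."""

    def __init__(self):
        # convert_charrefs=True is the default; shown explicitly for clarity.
        super().__init__(convert_charrefs=True)
        self.tags = []
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # tag arrives lower-cased; attrs is a list of (name, value) pairs.
        self.tags.append(tag)
        for name, value in attrs:
            if tag == "a" and name == "href":
                self.links.append(value)

    def handle_data(self, data):
        # Character references are already converted when convert_charrefs
        # is True, so "&amp;" is delivered here as "&".
        if data.strip():
            self.text.append(data)


def collect(chunks):
    """Feed an iterable of string chunks, then flush buffered data."""
    parser = LinkCollector()
    for chunk in chunks:
        parser.feed(chunk)   # an incomplete tag stays buffered in rawdata
    parser.close()           # handle whatever is still buffered
    return parser


if __name__ == "__main__":
    result = collect(['<p>Tom &amp; <a hre', 'f="/x">Jerry</a></p>'])
    print(result.tags, result.links, "".join(result.text))
```

The input is deliberately split in the middle of a start tag: feed() leaves the unfinished `<a hre` in the buffer and the tag is only reported once the second chunk completes it.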