Search In Html Page Using Regex Patterns With Python

March 20, 2024

I'm trying to find a string inside an HTML page with known patterns. For example, in the following HTML code:

Solution 1:

    re.findall(r'<HR>\s*<font size="\+1">(.*?)</font><BR>', html, re.DOTALL)

findall returns a list of everything captured by the parenthesized group in the regular expression. I used re.DOTALL so that the dot also matches newlines, and \s* because I was not sure whether there would be any whitespace after <HR>.

Solution 2:

This works, but may not be very robust:

    import re
    r = re.compile(r'<HR>\s?<font size="\+1">(.+?)</font>\s?<BR>', re.IGNORECASE)
    r.findall(html)

You will be better off using a proper HTML parser. BeautifulSoup is excellent and easy to use. Look it up.

Solution 3:

    re.findall(r'<HR>\n<font size="\+1">([^<]*)<\/font><BR>', html, re.MULTILINE)

Here [^<]* captures text up to the next tag, so the match cannot run past </font>. Note that re.MULTILINE only changes how ^ and $ behave, so it has no effect on this pattern.
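To compare the regex approach with the parser approach recommended above, here is a minimal sketch. The sample markup is an assumption inferred from the patterns in the answers, because the original HTML was not preserved. The parser half uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra installs. The names SAMPLE_HTML, titles_with_regex, FontAfterHrParser and titles_with_parser are hypothetical.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample: the original page's HTML was not preserved, so this
# markup is only inferred from the patterns in the answers.
SAMPLE_HTML = """<HR>
<font size="+1">First title</font><BR>
<p>Some body text</p>
<HR>
<font size="+1">Second
title</font><BR>
"""

# Solution 1's pattern: \s* tolerates any whitespace after <HR>, and
# re.DOTALL lets (.*?) span the line break inside the second title.
PATTERN = re.compile(r'<HR>\s*<font size="\+1">(.*?)</font><BR>', re.DOTALL)


def titles_with_regex(html):
    """Return the text captured between the font tags that follow an <HR>."""
    return PATTERN.findall(html)


class FontAfterHrParser(HTMLParser):
    """Collect the text of each <font size="+1"> that directly follows an <hr>."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._after_hr = False  # True right after an <hr> start tag
        self._buffer = None     # list of text chunks while inside the font tag

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag == "hr":
            self._after_hr = True
        elif tag == "font" and self._after_hr and dict(attrs).get("size") == "+1":
            self._buffer = []
            self._after_hr = False
        else:
            self._after_hr = False

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)
        elif data.strip():
            # Non-whitespace text between <hr> and <font> breaks the adjacency.
            self._after_hr = False

    def handle_endtag(self, tag):
        if tag == "font" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None


def titles_with_parser(html):
    """Parse the document and return the collected titles."""
    parser = FontAfterHrParser()
    parser.feed(html)
    parser.close()
    return parser.titles


if __name__ == "__main__":
    print(titles_with_regex(SAMPLE_HTML))
    print(titles_with_parser(SAMPLE_HTML))
```

On this sample both functions return the same two titles. The parser version also tolerates differences in tag case, whereas the regex needs re.IGNORECASE for that and still depends on the exact attribute spelling and spacing.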