Beautifulsoup4 - Identifying Info By Strong Tag Value Only Works For Some Values Of The Tag
I am working with the following 'block' of HTML:
Solution 1:
The reason you're getting text for 'Accepts Business From:' and not for 'Classes of business' is that the try-except is in the wrong place. In the second iteration of the for li in li_stuff: loop, li becomes <li>U.S.A</li>. That element has no <strong> tag, so li.strong is None and calling li.strong.text on it throws an AttributeError. With your current try-except, the error is caught outside the for loop and is passed, which ends the loop. So the loop never reaches its third iteration, in which it should be getting the text for 'Classes of business'. To continue the loop even after the error is caught, put the try-except inside it:
for li in li_stuff:
    try:
        if li.strong.text.strip() == 'Accepts Business From:':
            business_final = li.find('li').text.strip()
            print('Accepts Business From:', business_final)
        if li.strong.text.strip() == 'Classes of business':
            business_final = li.find('li').text.strip()
            print('Classes of business:', business_final)
    except AttributeError:
        pass  # or you can use 'continue' too.
Output:
Accepts Business From: U.S.A
Classes of business: Engineering
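To see the difference the placement makes, here is a minimal, self-contained sketch. The original HTML is not shown, so the markup below is an assumption reconstructed from the behaviour described above; the helper names labels_try_outside and labels_try_inside are hypothetical and exist only for this illustration.

```python
from bs4 import BeautifulSoup

# Assumed markup: the real page's HTML is not available, so this only
# mimics the structure implied by the answer (nested <li> items under
# an outer <li> that starts with a <strong> label).
HTML = """
<ul>
  <li><strong>Accepts Business From:</strong>
    <ul><li>U.S.A</li></ul>
  </li>
  <li><strong>Classes of business</strong>
    <ul>
      <li>Engineering</li>
      <li>Terrorism</li>
    </ul>
  </li>
</ul>
"""


def labels_try_outside(li_stuff):
    """try-except wraps the whole loop: the first error ends the loop."""
    found = []
    try:
        for li in li_stuff:
            found.append(li.strong.text.strip())
    except AttributeError:
        pass
    return found


def labels_try_inside(li_stuff):
    """try-except wraps one iteration: the loop moves on after an error."""
    found = []
    for li in li_stuff:
        try:
            found.append(li.strong.text.strip())
        except AttributeError:
            continue
    return found


soup = BeautifulSoup(HTML, "html.parser")
li_stuff = soup.find_all("li")  # outer and nested <li> tags, in document order

outside = labels_try_outside(li_stuff)
inside = labels_try_inside(li_stuff)
print(outside)  # ['Accepts Business From:']
print(inside)   # ['Accepts Business From:', 'Classes of business']
```

With the try-except around the loop, the nested <li>U.S.A</li> raises on the second iteration and nothing after it is visited. With it inside the loop, only that one element is skipped.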
But since 'Classes of business' has several values, li.find('li') only returns the first one. You can change that branch to use find_all and get them all:
if li.strong.text.strip() == 'Classes of business':
    business_final = ', '.join([x.text.strip() for x in li.find_all('li')])
    print('Classes of business:', business_final)
Output:
Accepts Business From: U.S.A
Classes of business: Engineering, NM General Liability (US direct), Property D&F (US binder), Terrorism
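As an alternative to catching the exception, you could check for the missing <strong> tag explicitly and collect every labelled entry in one pass. This is only a sketch: the markup is again an assumption (the values are taken from the output above), and extract_fields is a hypothetical helper name, not something from the question.

```python
from bs4 import BeautifulSoup

# Assumed markup, modelled on the output shown in the answer.
HTML = """
<ul>
  <li><strong>Accepts Business From:</strong>
    <ul><li>U.S.A</li></ul>
  </li>
  <li><strong>Classes of business</strong>
    <ul>
      <li>Engineering</li>
      <li>NM General Liability (US direct)</li>
      <li>Property D&amp;F (US binder)</li>
      <li>Terrorism</li>
    </ul>
  </li>
</ul>
"""


def extract_fields(html):
    """Map each <strong> label to the text of the <li> items nested under it."""
    soup = BeautifulSoup(html, "html.parser")
    fields = {}
    for li in soup.find_all("li"):
        label = li.strong  # None when this <li> contains no <strong> tag
        if label is None:
            continue  # a nested value such as <li>U.S.A</li>, not a labelled entry
        key = label.get_text(strip=True).rstrip(":")
        fields[key] = [item.get_text(strip=True) for item in li.find_all("li")]
    return fields


fields = extract_fields(HTML)
for key, values in fields.items():
    print(f"{key}: {', '.join(values)}")
```

The explicit None check keeps the loop free of exception handling, and the dictionary avoids one if branch per label, so a new label on the page needs no code change.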