
Multi-threaded Python Web Crawler Got Stuck

I'm writing a Python web crawler and I want to make it multi-threaded. I have finished the basic part; below is what it does: a thread gets a URL from the queue; the thread ex…

Solution 1:

Your crawl function has an infinite while loop with no possible exit path. The condition True always evaluates to True, so the loop keeps running and, as you say, the program ends up "not exiting properly".

Modify the crawl function's while loop to use a real exit condition. For instance, exit the loop once the number of links saved to the CSV file exceeds a certain minimum.

i.e.,

def crawl():
    # Stop once enough links have been collected instead of looping forever.
    while len(exist) <= min_links:
        ...
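
To show how that condition might fit into the threaded setup, here is a minimal runnable sketch. Since your full code isn't shown, the names exist, min_links, and the queue/lock setup below are assumptions, not your actual variables:

import threading
import queue

url_queue = queue.Queue()          # URLs waiting to be crawled (assumed structure)
exist = set()                      # links already saved (name borrowed from the snippet above)
exist_lock = threading.Lock()      # protect shared state across worker threads
min_links = 100                    # exit condition: stop after this many links

url_queue.put("https://example.com")   # placeholder seed URL

def crawl():
    # Exit once enough links have been collected instead of looping forever.
    while len(exist) <= min_links:
        try:
            url = url_queue.get(timeout=5)   # give up if the queue stays empty
        except queue.Empty:
            break
        # ... fetch the page, parse new links, append rows to the CSV here ...
        with exist_lock:
            exist.add(url)
        url_queue.task_done()

threads = [threading.Thread(target=crawl) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

The timeout on queue.get also gives each worker a second way out, so threads don't block forever waiting on an empty queue after the crawl winds down.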
