Python Pool Multiprocessing With Functions

Okay I've been playing with some code partly to get a better understanding of python, and partly to scrape some data from the web. Part of what I want to learn about if using Pytho

Solution 1:

Multithreading and multiprocessing are quite different when it comes to how your variables and functions can be accessed. Separate processes (multiprocessing) have separate memory spaces, so each process works on its own copy of the module's functions and variables. A change made in one process is invisible to the others, so global variables are not shared the way they are between threads. Sharing data between processes has to be done explicitly, for example via pipes or queues that pass the data for you. Both the main process and the child process can have access to the same queue, though, so in a way you could think of that as a type of global variable.
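As a minimal sketch of the queue-based sharing described above: the parent sends numbers to a child process through one `multiprocessing.Queue` and reads the squares back through another. The names `square_worker` and `run_demo`, and the use of `None` as an end-of-work sentinel, are illustrative assumptions and not part of the original question.

```python
import multiprocessing as mp


def square_worker(tasks, results):
    """Read numbers from `tasks` until the None sentinel; put squares on `results`."""
    while True:
        item = tasks.get()
        if item is None:  # hypothetical sentinel meaning "no more work"
            break
        results.put(item * item)


def run_demo(numbers):
    """Send `numbers` to a child process and collect the squared values."""
    tasks = mp.Queue()
    results = mp.Queue()
    child = mp.Process(target=square_worker, args=(tasks, results))
    child.start()

    for n in numbers:
        tasks.put(n)
    tasks.put(None)  # tell the worker to stop

    # Read all results before join() so the child never blocks on unsent data.
    squares = [results.get() for _ in numbers]
    child.join()
    return squares


if __name__ == "__main__":
    print(run_demo([1, 2, 3]))
```

Both queues are passed to the child as arguments, which is what gives the two processes a common channel even though they share no memory.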

With multithreading you can definitely access global variables, and it can be a good way to program if your program is simple. For example, a child thread may read the value of a variable set by the main thread and use it as a flag in the child thread's function. You need to be aware of which operations are thread-safe, however: as you say, complex operations by multiple threads on the same object can result in conflicts. In that case you need to use a thread lock or some other safe method. Many operations are naturally atomic and therefore thread-safe, though, for instance reading a single variable. There's a good list of thread-safe operations and thread syncing on this page.
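To illustrate the locking mentioned above, here is a small sketch in which several threads increment one global counter. `counter += 1` is a read-modify-write and not atomic, so it is guarded with a `threading.Lock`. The names `counter`, `counter_lock`, `add_many`, and `run_threads` are made up for this example.

```python
import threading

counter = 0
counter_lock = threading.Lock()


def add_many(n):
    """Increment the shared counter n times, holding the lock for each update."""
    global counter
    for _ in range(n):
        with counter_lock:  # only one thread at a time may update counter
            counter += 1


def run_threads(num_threads=4, n=10000):
    """Start num_threads threads, wait for them, and return the final count."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(run_threads())
```

With the lock in place the result is always `num_threads * n`; without it, updates from different threads can overwrite each other.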

Generally with multiprocessing and multithreading you have some time-consuming function that you pass to the thread or the process, and each thread or process runs its own call of that function. The example below shows a valid use case for a thread atomically reading a global variable that the main thread changes. The separate process, however, never sees the change.

import multiprocessing as mp
import threading
import time

work_flag = True


def worker_func():
    global work_flag
    while True:
        if work_flag:
            # do stuff
            time.sleep(1)
            print(mp.current_process().name, 'working, work_flag =', work_flag)
        else:
            time.sleep(0.1)


def main():
    global work_flag

    # processes can't access the same "instance" of work_flag!
    process = mp.Process(target=worker_func)
    process.daemon = True
    process.start()

    # threads can safely read global work_flag
    thread = threading.Thread(target=worker_func)
    thread.daemon = True
    thread.start()

    while True:
        time.sleep(3)
        # changing this flag will stop the thread, but not the process
        work_flag = False


if __name__ == '__main__':
    main()
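If the child process also needs to react to a flag, one option (a sketch, not the only approach) is to replace the plain global with a `multiprocessing.Event` passed to the worker as an argument. The `stop_event` and `results` parameters and the `ticks` counter are assumptions added for illustration; they do not appear in the example above.

```python
import multiprocessing as mp
import time


def worker_func(stop_event, results):
    """Loop until stop_event is set, then report how many iterations ran."""
    ticks = 0
    while not stop_event.is_set():
        # do stuff
        ticks += 1
        time.sleep(0.05)
    results.put(ticks)


def main():
    stop_event = mp.Event()  # visible to both parent and child
    results = mp.Queue()
    process = mp.Process(target=worker_func, args=(stop_event, results))
    process.start()

    time.sleep(0.5)
    stop_event.set()         # unlike work_flag, this reaches the child process

    ticks = results.get()    # read the result before joining
    process.join()
    print('worker stopped after', ticks, 'iterations')


if __name__ == '__main__':
    main()
```

The same `worker_func` would also work in a thread, since it only relies on `is_set()` and `put()`.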