Why Is My Python 3 Implementation Much Faster Than The One I Wrote In C++?
Solution 1:
The main problem is that you're reseeding a random number generator for each random number in your C++ code. Additionally you're not compiling with optimizations enabled (-O3
).
I moved the initialization of the random number generator outside the randomFloat
function (equally, you could use static
variables inside the function):
random_device rd;
default_random_engine generator(rd()); // rd() provides a random seeduniform_real_distribution<float> distribution(0,1);
floatrandomFloat() {
float x = distribution(generator);
return x;
}
and compiled with -O3
and now C++ is considerably faster than Python
Another possibility could be that python and C++ code use a different random number generator. Python random
module (C code here) uses a MT19937 Mersenne Twister random number generator that is a fast PRNG optimized specifically for numerical problems such as Monte Carlo; the algorithm of default_random_engine
in C++ is implementation-defined. As pointed out by Melak47, you can force the use of MT19937 PRNG in C++ with:
mt19937 generator(rd());
or
mt19937_64 generator(rd());
P.S., Python outperforming C++ is not unheard of; the C++ algorithms value genericity whereas the Python algorithms are often quite optimized for some use cases. See for example this question on substring matching.
Solution 2:
The main cost is your randomFloat() c++ method.
building a random_device, default_random_engine and uniform_real_distribution every iteration is incredibly wasteful.
By making these static I was able to increase the speed of the c++ implementation by over a factor of 100. But you'd be better served injecting them, or wrapping this in a class and making them instance members.
#include<iostream>// std library#include<random>// random number generator#include<ctime>// calculating runtime#include<cmath>// absolute value functionusingnamespace std;
constdouble g_PI {3.141592653589793238463};
voidsimulate(unsignedlong value);
floatrandomFloat();
boolunit_circle(float x, float y);
intmain(){
// repitition valueslong values[5] = {1000, 10000, 100000, 1000000, 10000000};//, 100000000, 1000000000, 10000000000};// runs the simulation with the different repetition valuesfor (auto value : values)
simulate(value);
cout << "\nPress return to exit";
cin.get();
return0;
}
/**
* The actual simulation
*/voidsimulate(unsignedlong value){
// start time for calculating runtimeconstclock_t startTime = clock();
// area's variablesunsignedlong area_of_circle = 0;
unsignedlong area_of_square = 0;
// print the repitiion value
cout << "\nProcessing calculations with a repetition value of " << value <<
" times." << endl;
for (unsignedlong i = 0; i != value; i++)
{
// gets random values from 0 to 1 for (x) and (y)float x = randomFloat();
float y = randomFloat();
// checks if (x, y) are in a unit circle, if so increment circle areaif (unit_circle(x, y))
area_of_circle++;
area_of_square++;
}
// pi = area of circle * 4 / area of squaredouble calculatedPi = static_cast<double>(area_of_circle * 4) / area_of_square;
float endTime = static_cast<float>(clock() - startTime) / CLOCKS_PER_SEC;
// prints the value of calculated pi
cout << "\tCalculated Value of Pi: " << calculatedPi <<
" (" << endTime << " seconds, " << endTime/60 << " minutes)" << endl;
// difference between the calc value and pi
cout << "Estimated Num of Pi is off by " << abs(calculatedPi - g_PI) << '\n';
}
/**
* returns a random number from 0 to 1
*/floatrandomFloat(){
static random_device rd;
static default_random_engine generator(rd()); // rd() provides a random seedstatic uniform_real_distribution<float> distribution(0,1);
float x = distribution(generator);
return x;
}
/**
* checks if the two input parameters are inside a unit circle
*/boolunit_circle(float x, float y){
if ((x*x + y*y) <= 1)
returntrue;
elsereturnfalse;
}
Original implmentation Log
Processing calculations with a repetition value of 1000 times.
Calculated Value of Pi: 3.08 (0.019227 seconds, 0.00032045 minutes)
Estimated Num of Pi is off by0.0615927
Processing calculations with a repetition value of 10000 times.
Calculated Value of Pi: 3.124 (0.162044 seconds, 0.00270073 minutes)
Estimated Num of Pi is off by0.0175927
Processing calculations with a repetition value of 100000 times.
Calculated Value of Pi: 3.14568 (1.72181 seconds, 0.0286968 minutes)
Estimated Num of Pi is off by0.00408735//Couldn't be bothered to wait :P
Using static random generator
Processing calculations with a repetition value of1000 times.
Calculated Value of Pi: 3.136 (0.000144 seconds, 2.4e-06 minutes)
Estimated Num of Pi isoffby0.00559265
Processing calculations with a repetition value of10000 times.
Calculated Value of Pi: 3.1824 (0.000596 seconds, 9.93333e-06 minutes)
Estimated Num of Pi isoffby0.0408073
Processing calculations with a repetition value of100000 times.
Calculated Value of Pi: 3.14044 (0.005889 seconds, 9.815e-05 minutes)
Estimated Num of Pi isoffby0.00115265
Processing calculations with a repetition value of1000000 times.
Calculated Value of Pi: 3.14278 (0.058896 seconds, 0.0009816 minutes)
Estimated Num of Pi isoffby0.00118335
Processing calculations with a repetition value of10000000 times.
Calculated Value of Pi: 3.14165 (0.589034 seconds, 0.00981723 minutes)
Estimated Num of Pi isoffby6.09464e-05
Solution 3:
Not meant as an answer to your question why python is faster, just to show that python can get event faster and neater for this problem.
To possibilities to speed things up in python:
Use numpy vectorization:
import numpy as np
defpi(N):
x, y = np.random.uniform(-1, 1, size=(2, N))
in_circle = np.count_nonzero(x**2 + y**2 <= 1)
return4 * in_circle / N
And / or numba just in time compilation:
from numba import jit
import random
@jitdefpi(N):
in_circle = 0for i inrange(N):
x = 2 * random.random() - 1
y = 2 * random.random() - 1if x**2 + y**2 <= 1:
in_circle += 1return4 * in_circle / N
Post a Comment for "Why Is My Python 3 Implementation Much Faster Than The One I Wrote In C++?"