{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Extending to multiple GPUs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A simple way to extend this example to multiple GPUs is to use a single host process that manages all of the GPUs. If we have *M* GPUs and *N* sample points to evaluate, we can distribute *N/M* sample points to each GPU and, in principle, calculate the result up to *M* times more quickly.\n", "\n", "Look at [exercises/monte_carlo_mgpu_cuda.cpp](exercises/monte_carlo_mgpu_cuda.cpp) for an example of this. We've given you a few simple tasks to do in the code -- look for locations denoted by `FIXME`. Note that in this example we give each GPU a different seed for the random number generator so that each GPU does independent work. As a result, the answer will differ slightly from the single-GPU version.\n", "\n", "If you get stuck, you can consult [the solution](solutions/monte_carlo_mgpu_cuda.cpp)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "!nvcc -x cu -arch=sm_70 -rdc=true -o monte_carlo_mgpu_cuda solutions/monte_carlo_mgpu_cuda.cpp" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using 4 GPUs\n", "Estimated value of pi = 3.14072\n", "Error = 0.000277734\n", "CPU times: user 18.4 ms, sys: 11.1 ms, total: 29.5 ms\n", "Wall time: 2.04 s\n" ] } ], "source": [ "%%time\n", "!./monte_carlo_mgpu_cuda" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" } }, "nbformat": 4, "nbformat_minor": 4 }