.. _parallel_computation:

*********************
Parallel Computations
*********************

Installation
************

The Muscat C++ backend is based on the Kokkos ecosystem [#githubkokkos]_ to support (thread-scalable) node-level parallelism (i.e. CPU/GPU shared memory).

Usage
*****

.. note::
    Muscat supports Kokkos only in algorithms specifically designed for it. They are all listed in the ``Muscat::KK`` namespace.

Host & Device
--------------

In the context of parallel computing (particularly in environments like GPU programming), the terms "*host*" and "*device*" refer to different types of **processing units**:

* ``Host``: This typically refers to the *CPU* (Central Processing Unit) and the main system memory (RAM) where the main application runs. The host is responsible for managing the overall application flow, including setting up computations and transferring data to and from the device.

* ``Device``: This usually refers to the *GPU* (Graphics Processing Unit) or another accelerator used for high-performance parallel computations. The device is optimized for executing large numbers of parallel operations.

.. warning::
    Using a device such as a GPU is not always the best choice: data transfer overhead can negate the speedup for smaller tasks, and some algorithms do not parallelize well, which makes the CPU more efficient.

To maximize efficiency, it is crucial to fully utilize both the host and the device. This involves distributing workloads wisely and ensuring that each platform is used for its strengths.

Python Usage
------------

.. code-block:: python

    from Muscat.Helpers.Kokkos.KokkosHelper import *

    # [...]

    with UseDevice():  # equivalent to UseDevice(False)
        # This part of the code will run on the Device (if available)
        ...

    with UseDevice(True):
        # This part of the code will run on the Device or fail if none is available
        ...

    with UseHost():
        # This part of the code will run on the Host
        ...

You can ensure that a Device is available by using:

.. code-block:: python

    from Muscat.Helpers.Kokkos.KokkosHelper import *

    ensureDeviceAvailable(True)  # True to raise an exception if not available

.. warning::
    If no device is present and ``ensureDeviceAvailable`` is not used to raise an exception, the program runs on the available CPUs.

.. rubric:: Footnotes

.. [#githubkokkos] https://github.com/kokkos/kokkos

GPU Offloading
--------------

To make the most of your CPU and GPU, it is useful to run both at the same time. That is what ``Future`` objects are for.

Some Python functions have an adapted version that returns a ``Future`` object. Such an object runs in the background and lets you process other data in the meantime. For example:

.. code-block:: python

    import numpy as np

    from Muscat.Helpers.Kokkos.KokkosHelper import *
    from Muscat.LinAlg.Kokkos.Utils import *

    # [...]

    matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  # some data

    ensureDeviceAvailable()  # If there is no device available, this code is not optimal

    with UseDevice():
        future = uniqueRowsFuture(matrix)  # non-blocking call

    with UseHost():
        # This part of the code will run on the Host during the computation
        # [...]
        print(future.available())  # prints whether the data is available, non-blocking

    result = future.get()  # retrieve the data afterward, blocking
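
The overlap pattern described above can be tried out without Muscat or a GPU. The sketch below uses Python's standard ``concurrent.futures`` module purely as an analogy: ``unique_rows`` is a hypothetical stand-in for an offloaded computation, and ``Future.done()`` / ``Future.result()`` belong to the standard library, not to Muscat's ``Future`` object, whose interface may differ.

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor


    def unique_rows(rows):
        """Hypothetical stand-in for an offloaded computation."""
        return sorted(set(tuple(row) for row in rows))


    rows = [[1, 2, 3], [4, 5, 6], [1, 2, 3]]

    with ThreadPoolExecutor(max_workers=1) as executor:
        # Submit the task; the call returns immediately with a future
        future = executor.submit(unique_rows, rows)

        # Other work performed while the task runs in the background
        column_sums = [sum(column) for column in zip(*rows)]

        # Non-blocking status check (True or False depending on timing)
        is_ready = future.done()

        # Blocks until the task has finished, then returns its value
        result = future.result()

The general idea is the same as in the Muscat example: start the long-running task first, do independent work next, and call the blocking retrieval only when the result is actually needed.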