Multiprocessing to Improve TI Process Performance

When processing large data sets, splitting a process into smaller parts and running those parts simultaneously can greatly reduce total run time. The improvement is limited by the processing power and available threads of the Planning Analytics server. The technique is especially useful for models with very large data volumes or complex calculations.

For instance, assume a process that calculates customer profitability for 100 customers takes an hour to run. Splitting that process so it runs once per customer, with the runs executing in parallel, could reduce the total run time to only a few minutes. For the walkthrough that follows, assume the process uses a cube view as its data source.

The first step is to reduce the scope of the process from all customers to one customer at a time. To do this, add a customer parameter that defines which customer the process runs for, and limit the customer subset in the source view to only that customer. At this point, run the process for a single customer to confirm it works correctly.
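
The scoping step might look like the following Prolog sketch for the customer profitability process. The cube name, dimension name, object name, and the pCustomer parameter are hypothetical placeholders; the parameter itself would be declared on the process's Parameters tab. The sketch assumes the source view can be rebuilt on every run as a temporary object, which keeps parallel runs from sharing views or subsets. In a real model the other dimensions of the view would usually be restricted as well.

```
# Prolog tab of the customer profitability process (sketch).
# Hypothetical names: cube 'Customer Profitability', dimension 'Customer',
# string parameter pCustomer declared on the Parameters tab.

sCube = 'Customer Profitability';
sDim  = 'Customer';
sName = 'zTI Customer Profitability Source';

# Stop early if the supplied customer is not a valid element.
IF( DIMIX( sDim, pCustomer ) = 0 );
  ProcessQuit;
ENDIF;

# Temporary subset (third argument = 1) holding only the parameterized customer.
SubsetCreate( sDim, sName, 1 );
SubsetElementInsert( sDim, sName, pCustomer, 1 );

# Temporary view whose Customer dimension is restricted to that subset.
ViewCreate( sCube, sName, 1 );
ViewSubsetAssign( sCube, sName, sDim, sName );

# Point the process at the restricted view.
DataSourceType = 'VIEW';
DatasourceNameForServer = sCube;
DatasourceCubeview = sName;
```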

Next, create a master process that runs the customer profitability process for each customer. The simplest way to do this is to create a process that uses the Customer dimension as its source, limited to the customers the process should run for. Then, in the Data tab, use the RunProcess function to run the customer profitability process, passing the customer variable into the function. Running the master process now starts the customer profitability process once for each customer, and those runs execute simultaneously in the background.
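
As a minimal sketch, the Data tab of the master process could contain a single RunProcess call. The process name 'Calc Customer Profitability', the parameter name pCustomer, and the variable name vCustomer are hypothetical and would need to match the actual model.

```
# Data tab of the master process (sketch).
# Data source: a subset of the Customer dimension limited to the customers to run.
# vCustomer is the hypothetical variable that holds the current customer element.

# Start one background run of the profitability process for this customer.
# RunProcess returns immediately; the master does not wait for the run to finish.
RunProcess( 'Calc Customer Profitability', 'pCustomer', vCustomer );
```

By contrast, calling the same process with ExecuteProcess would run it in-line, so the customers would be processed one after another rather than in parallel.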

Use multiprocessing with care, because a poorly designed setup can cause problems. For instance, a master process that is not set up correctly can inadvertently start more processes than the Planning Analytics server can handle at once. The limit depends on the specifications of the server, and finding the best configuration may take some trial and error. Also, when using RunProcess, the administrator must know what data each process reads and what data it writes, because a process started with RunProcess cannot see uncommitted data changes made by any other running process.
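
One crude way to avoid starting every subprocess at once is to pause the master process after each batch of launches. The sketch below is an illustration under stated assumptions, not a recommended configuration: the batch size and pause length are arbitrary values to be tuned by trial and error, the process, parameter, and variable names are hypothetical, and the pause is purely time-based, so it does not check whether the previous batch has actually finished.

```
# Prolog tab of the master process (sketch).
nStarted   = 0;
# Assumed values; tune for the server in question.
nBatchSize = 10;
nPauseMs   = 30000;

# Data tab of the master process (sketch).
RunProcess( 'Calc Customer Profitability', 'pCustomer', vCustomer );
nStarted = nStarted + 1;

# After every nBatchSize launches, pause before starting more subprocesses.
IF( MOD( nStarted, nBatchSize ) = 0 );
  Sleep( nPauseMs );
ENDIF;
```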

When the master process is run, it completes as soon as it has started all of the individual subprocesses; it does not wait for them to finish. This can be a drawback when recording process timing statistics or when scheduling additional processes that must run only after the subprocesses complete. There are ways to work around these issues, with one solution described here