StarPU Handbook - StarPU Extensions
Loading...
Searching...
No Matches
18. Hierarchical DAGS

The STF model has the intrinsic limitation of supporting static task graphs only, which leads to potential submission overhead and to a static task graph which is not necessarily adapted for execution on heterogeneous systems.

To address these problems, we have extended the STF model to enable tasks subgraphs at runtime. We refer to these tasks as hierarchical tasks. This approach allows for a more dynamic task graph. This allows to dynamically adapt the granularity to meet the optimal size of the targeted computing resource.

Hierarchical tasks are tasks that can transform themselves into a new task-graph dynamically at runtime. Programmers submit a coarse version of the DAG, called the bubbles graph, which represents the general shape of the application tasks graph. The execution of this bubble graph will generate and submit the computing tasks of the application. It is up to application programmers to decide how to build the bubble graph (i.e. how to structure the computation tasks graph to create some groups of tasks). Dependencies between bubbles are automatically deduced from dependencies between their computing tasks.

18.1 An Example

In order to understand the hierarchical tasks model, an example of "bubblification" is showed here. We start from a simple example, multiplying the elements of a vector.

18.1.1 Initial Version

A computation is done several times on a vector split in smaller vectors. For each step and each sub-vector, a task is generated to perform the computation.

void func_cpu(void *descr[], void *_args)
{
(void) _args;
int x;
int nx = STARPU_VECTOR_GET_NX(descr[0]);
TYPE *v = (TYPE *)STARPU_VECTOR_GET_PTR(descr[0]);
for(x=0 ; x<nx ; x++)
v[x] += 1;
}
struct starpu_codelet vector_cl =
{
.cpu_funcs = {func_cpu},
.nbuffers = 1,
.modes = {STARPU_RW}
};
int vector_no_bubble()
{
TYPE *vector;
/* ... */
starpu_vector_data_register(&vhandle, 0, (uintptr_t)vector, X, sizeof(vector[0]));
starpu_data_map_filters(vhandle, 1, &f);
for(loop=0 ; loop<NITER; loop++)
for (x = 0; x < SLICES; x++)
{
starpu_task_insert(&vector_cl,
0);
}
/* ... */
}
starpu_cpu_func_t cpu_funcs[STARPU_MAXIMPLEMENTATIONS]
Definition starpu_task.h:414
#define STARPU_MAIN_RAM
Definition starpu_task.h:144
Definition starpu_task.h:338
void starpu_vector_data_register(starpu_data_handle_t *handle, int home_node, uintptr_t ptr, uint32_t nx, size_t elemsize)
#define STARPU_VECTOR_GET_NX(interface)
Definition starpu_data_interfaces.h:2100
#define STARPU_VECTOR_GET_PTR(interface)
Definition starpu_data_interfaces.h:2084
void starpu_data_unregister(starpu_data_handle_t handle)
struct _starpu_data_state * starpu_data_handle_t
Definition starpu_data.h:45
@ STARPU_RW
Definition starpu_data.h:60
void starpu_data_map_filters(starpu_data_handle_t root_data, unsigned nfilters,...)
starpu_data_handle_t starpu_data_get_sub_data(starpu_data_handle_t root_data, unsigned depth,...)
void starpu_data_unpartition(starpu_data_handle_t root_data, unsigned gathering_node)
int starpu_task_insert(struct starpu_codelet *cl,...)

18.1.2 Bubble Version

The bubble version of the code replaces the inner loop that realizes the tasks insertion by a call to a bubble creation. At its execution, the bubble will insert the computing tasks. The bubble graph is built accordingly to the dependencies of the subdata.

void no_func(void *buffers[], void *arg)
{
assert(0);
return;
}
int is_bubble(struct starpu_task *t, void *arg)
{
(void)arg;
(void)t;
return 1;
}
void bubble_gen_dag(struct starpu_task *t, void *arg)
{
int i;
for(i=0 ; i<SLICES ; i++)
{
starpu_task_insert(&vector_cl,
STARPU_RW, subdata[i],
0);
STARPU_CHECK_RETURN_VALUE(ret, "starpu_task_insert");
}
}
struct starpu_codelet bubble_codelet =
{
.cpu_funcs = {no_func},
.bubble_func = is_bubble,
.bubble_gen_dag_func = bubble_gen_dag,
.nbuffers = 1
};
int vector_bubble()
{
TYPE *vector;
starpu_data_handle_t sub_handles[SLICES];
/* ... */
starpu_vector_data_register(&vhandle, 0, (uintptr_t)vector, X, sizeof(vector[0]));
starpu_data_partition_plan(vhandle, &f, sub_handles);
for(loop=0 ; loop<NITER; loop++)
{
starpu_task_insert(&bubble_codelet,
STARPU_RW, vhandle,
STARPU_NAME, "B1",
0);
}
starpu_data_partition_clean(vhandle, SLICES, sub_handles);
/* ... */
}
#define STARPU_BUBBLE_GEN_DAG_FUNC_ARG
Definition starpu_task_util.h:394
Definition starpu_task.h:683
void starpu_data_partition_plan(starpu_data_handle_t initial_handle, struct starpu_data_filter *f, starpu_data_handle_t *children)
void starpu_data_partition_clean(starpu_data_handle_t root_data, unsigned nparts, starpu_data_handle_t *children)
#define STARPU_NAME
Definition starpu_task_util.h:199
#define STARPU_CHECK_RETURN_VALUE(err, message,...)
Definition starpu_util.h:416

The full example is available in the file bubble/tests/vector/vector.c.

To define a hierarchical task, one needs to define the fields starpu_codelet::bubble_func and starpu_codelet::bubble_gen_dag_func.

The field starpu_codelet::bubble_func is a pointer function which will be executed by StarPU to decide at runtime if the task must be transformed into a bubble. If the function returns a non-zero value, the function starpu_codelet::bubble_gen_dag_func will be executed to create the new graph of tasks.

The pointer functions can also be defined when calling starpu_task_insert() by using the arguments STARPU_BUBBLE_FUNC and STARPU_BUBBLE_GEN_DAG_FUNC. Both these functions can be passed parameters through the arguments STARPU_BUBBLE_FUNC_ARG and STARPU_BUBBLE_GEN_DAG_FUNC_ARG

When executed, the function starpu_codelet::bubble_func will be given as parameter the task being checked, and the value specified with STARPU_BUBBLE_FUNC_ARG.

When executed, the function starpu_codelet::bubble_gen_dag_func will be given as parameter the task being turned into a hierarchical task and the value specified with STARPU_BUBBLE_GEN_DAG_FUNC_ARG.

An example involving these functions is in bubble/tests/basic/brec.c. And more examples are available in bubble/tests/basic/*.c.