Compilation of Single Program Multiple Data kernels with independent subgroup forward progress on CPUs