Menu

#31 Advanced loop transformations

closed
nobody
5
2012-09-21
2001-10-12
Dave Grove
No

Stripmining (in particular to mitigate yieldpoint overhead).

Peeling/Tiling.

loop versioning

Discussion

  • Dave Grove

    Dave Grove - 2006-11-17

    Logged In: YES
    user_id=1215435
    Originator: YES

    Loop versioning has since been done (OPT_LoopVersioning), but has not been enabled by default due to lingering bugs it exposed in SSA (see https://sourceforge.net/tracker/index.php?func=detail&aid=1488798&group_id=128805&atid=712768).

    Stripmining is a simpler transformation, so may be less vulnerable to lingering SSA bugs.

    The main place where we think stripmining might be useful is in moderately small loop bodies to help
    reduce the cost of backedge yieldpoints. (To support m-n threading, all JikesRVM compilers inject
    conditional yields on loop backedges). So a basic loop like:

    static int x = 0;
    for (int i=0; i<100000; i++) {
    x++;
    }

    gets compiled into:

    for (int i=0; i<100000; i++) {
    x++;
    BACKEDGE_YIELDPOINT;
    }

    The BACKEDGE_YIELDPOINT expands to a load/compare/branch sequence. The branch is very infrequnetly taken, so likely to be correctly predicted. So the cost is 2-3 instructions per loop iteration.

    Loop unrolling can transform the loop into:

    for (int i=0; i<25000; i++) {
    x++;
    x++;
    x++;
    x++;
    BACKEDGE_YIELDPOINT;
    }

    With stripmining, one would get something like:
    for (int i=0; i<100; i++) {
    for (int j=0; j<100; j++) {
    x++;
    }
    BACKEDGE_YIELDPOINT;
    }

    The inner loop has a small number of iterations and a small body, so the inner yieldpoint can be dropped.

    Overall, it's not clear how much of a benefit this would actually be. In some tight loops it would make a difference and might be a better choice than unrolling (since it results in less code space expansion). But often loop unrolling is better than stripmining.

    I'd suggest taking a look at Ian's LoopVersioning code and then talking some to Ian. He probably has a sense of what loop optimizations are most likely to be profitable ontop of the loop versioning that he's already implemented.

     
  • Ian Rogers

    Ian Rogers - 2006-11-17

    Logged In: YES
    user_id=308843
    Originator: NO

    Hi, not sure what prompted the comment but anyway, we have a number of loop optimizations here. Jisheng Zhao is responsible for most of these. Some of the more recent ones are loop blocking, interchange, reversal, the usual gamut of things to improve parallelization. I want to add these phases to the repository, but I want to fix loop versioning first (leave SSA is a problem). The eventual goal would be to release the parallelization code although our current results show its performance isn't good for a dual CPU SMP x86 system. I believe there'd be significant research interest in a parallelizing open source JVM. We also have a number of adaptive optimizations that tune the parallelization and extensions to interprocedural analysis, these should all make it into the repository. I want to improve their behavior and leave SSA is barring this.

    Ian

     
  • Dave Grove

    Dave Grove - 2008-07-06

    Logged In: YES
    user_id=1215435
    Originator: YES

    Moved to JIRA: http://jira.codehaus.org/browse/RVM-567

     

Log in to post a comment.