Remote-Scope Promotion: Clarified, Rectified, and Verified (SPLASH 2015 - OOPSLA)

Fri 23 - Fri 30 October 2015, Pittsburgh, Pennsylvania, United States

Who

John Wickerson, Mark Batty, Bradford M. Beckmann, Alastair F. Donaldson

Track

SPLASH 2015 OOPSLA

When

Fri 30 Oct 2015 13:30 - 13:52 at Grand Station 1. Session: 11. Programming Language Design. Chair(s): Gary T. Leavens

Abstract

Modern accelerator programming frameworks, such as OpenCL, organise threads into work-groups. Remote-scope promotion (RSP) is a language extension recently proposed by AMD researchers that is designed to enable applications, for the first time, both to optimise for the common case of intra-work-group communication (using memory scopes to provide consistency only within a work-group) and to allow occasional inter-work-group communication (as required, for instance, to support the popular load-balancing idiom of work stealing).

We present the first formal, axiomatic memory model of OpenCL extended with RSP. We have extended the Herd memory model simulator with support for OpenCL kernels that exploit RSP, and used it to discover bugs in several litmus tests and a work-stealing queue that have been used previously in the study of RSP. We have also formalised the proposed GPU implementation of RSP. The formalisation process allowed us to identify bugs in the description of RSP that could result in well-synchronised programs experiencing memory inconsistencies. We present and prove sound a new implementation of RSP that incorporates bug fixes and requires less non-standard hardware than the original implementation.

This work, a collaboration between academia and industry, clearly demonstrates how, when designing hardware support for a new concurrent language feature, the early application of formal tools and techniques can help to prevent errors, such as those we have found, from making it into silicon.

DOI

https://doi.org/10.1145/2814270.2814283

John Wickerson

Imperial College London

United Kingdom

Mark Batty

University of Cambridge

Bradford M. Beckmann

Advanced Micro Devices, Inc

Alastair F. Donaldson