1 : 
jhr 
233 
%!TEX root = paper.tex

2 : 


%

3 : 



4 : 


\section{Introduction}

5 : 



6 : 


The analysis of structure in threedimensional images is increasingly

7 : 


valuable for biomedical research and computational science.

8 : 


\emph{Computed Tomography} (CT) and \emph{Magnetic Resonance Imaging}

9 : 


(MRI) scanners continue

10 : 


to improve in speed and resolution, driving clinical research in

11 : 


imagebased screening and diagnosis, and recent microscopy techniques

12 : 


are producing detailed snapshots of cellular processes for biomedical

13 : 


research.

14 : 


Regardless of the image modality or application, the values

15 : 


stored at individual samples in the threedimensional image are

16 : 


measurements of some essential physical or biological quantity, and

17 : 


the spatial structure of those values indirectly represents relevant

18 : 


features of the scanned object or system.

19 : 


Image analysis plays the

20 : 


crucial role of detecting and extracting the salient image structure

21 : 


in order to build geometric and quantitative models of anatomy or

22 : 


organization.

23 : 



24 : 


%Volumetric imagery

25 : 


%also arises in nondestructive testing of manufacturing, siesmology

26 : 


%and geophysics, and homeland security.

27 : 



28 : 


Ideally, image analysis research creates efficient and powerful tools

29 : 


that increase the scientific utility and impact of image acquisitions.

30 : 


In practice, a combination of factors limits the development and

31 : 


application of novel imageanalysis methods.

32 : 


The computational burden

33 : 


of processing images is increasing as devices produce images of higher

34 : 


resolution (\eg{}, typical CT scans have gone from $128^3$ to roughly $512^3$

35 : 


resolutions~\cite{GLK:Reiser2008}).

36 : 
jhr 
307 
With the latest scanning technologies, it is also more common for the values measured at

37 : 
jhr 
233 
each sample to be multidimensional rather than a single scalar, which

38 : 


further complicates implementing mathematically correct methods.

39 : 


% diffusionweighted MRI now commonly measures at least 30 value per

40 : 


% sample, and the simplest tensor models estimated from these

41 : 


% measurements have at least six degrees of freedom per point.

42 : 


While imageanalysis algorithms are computationally expensive, there

43 : 


is hope in being able to exploit modern parallel hardware to greatly

44 : 


reduce their running time.

45 : 


Unfortunately, the scientists involved in

46 : 


creating the latest imaging experiments and acquisitions are usually

47 : 


not in a position to implement correspondingly sophisticated analysis

48 : 


algorithms that run on the latest parallel hardware, which reinforces the

49 : 


gap between the class of methods advanced by computer science

50 : 
jhr 
307 
research and those that gain currency in scientific research

51 : 
jhr 
233 
applications.

52 : 



53 : 


We propose a new way to express and implement imageanalysis algorithms that

54 : 


strives for both efficiency and flexibility.

55 : 


Our proposed approach is based on a \emph{domainspecific language} (DSL)

56 : 


that encapsulates the mathematical ingredients of working in

57 : 
jhr 
307 
a smooth image domain (reconstructed from discrete samples) and

58 : 
jhr 
233 
offers a powerful abstraction with which analysis algorithms can be

59 : 


decomposed into a coordinated set of actors that capture the parallelism

60 : 


inherent in these algorithms.

61 : 



62 : 
jhr 
307 
One advantage of DSLs is that they provide a highlevel programming model

63 : 
jhr 
233 
that allows domain experts to program using the natural notation and concepts of the domain.

64 : 


For example, parser generators, such as Yacc~\cite{yacc}, allow language syntax

65 : 
jhr 
307 
to be specified using the notation of contextfree grammars, and form compilers,

66 : 
jhr 
233 
such as FFC~\cite{ffc}, take highlevel descriptions of a finite element variational form

67 : 


and generate lowlevel C code for efficient evaluation of the element tensor

68 : 


and associated quantities.

69 : 
jhr 
307 
A second advantage is that DSLs allow for \emph{domainspecific} optimizations,

70 : 


which can result in highlyefficient implementations.

71 : 


Thirdly, we can view a DSL as a program generator, which have proven to be

72 : 


a very effective way to get portable highperformance code.

73 : 
jhr 
233 
For example, the leading FFT library (FFTW~\cite{fftw}) uses a program generator

74 : 


to generate versions of the FFT algorithm that are specialized for many different

75 : 


hardware targets.

76 : 
jhr 
307 
Thus, DSLs can provide both a very highlevel programming model with very highperformance

77 : 


executables on a variety of target platforms.

78 : 
jhr 
233 

79 : 


As others have argued~\cite{hanrahan:dslgpu}, the use of DSLs for parallelism

80 : 


is particularly advantageous.

81 : 


Getting the maximum performance for a parallel program on a particular parallel platform

82 : 


usually requires hand tuning the program for the given platform.

83 : 


But such a tuned program is not portable and will have to be retuned, or even rewritten,

84 : 


to take full advantage of any new platform.

85 : 


For example, clusters, multicore SMP systems, and GPUs (Graphics Processing Units)

86 : 


are parallel systems that are found in many labs, and that have very different computational

87 : 


and memorysystem characteristics.

88 : 


As the FFTW library demonstrates, domainspecific program generation is one solution to

89 : 


the portable performance problem.

90 : 


DSLs provide a highlevel interface to such program generators.

91 : 


Our proposed research follows this model of fixing the application domain (to image

92 : 


analysis in our case) and combining domainspecific optimizations with targetspecific

93 : 
jhr 
307 
program generation to get a highlevel programming language with high performance.

94 : 
jhr 
233 

95 : 


There are three equally important aspects of the proposed research.

96 : 


\begin{enumerate}

97 : 


\item

98 : 
jhr 
307 
To design a highlevel, mathematical programming model that allows existing

99 : 


visualization and analysis methods to be expressed in simple and readily

100 : 


understood programs.

101 : 
jhr 
233 
\item

102 : 


To test and evolve this programming model by exploring a range of

103 : 


both established and novel imageanalysis algorithms.

104 : 


\item

105 : 


And to develop techniques for implementing the programming

106 : 


model so that a given program can be compiled to highperformance

107 : 


implementations on a variety of parallelhardware platforms without

108 : 


handtuning the code for each platform.

109 : 


\end{enumerate}%

110 : 



111 : 


This proposal is a collaboration between PI Reppy, who has extensive experience

112 : 


in both parallel and sequential language design and implementation, and PI Kindlmann, who

113 : 


has extensive domain expertise in the area of image analysis.

114 : 


The proposed research will make contributions to both the area of parallellanguage

115 : 
jhr 
307 
design and implementation and to the area of image analysis and visualization.

116 : 
jhr 
233 

117 : 


The proposal is organized as follows.

118 : 


In the next section, we give an overview of the imageanalysis domain

119 : 


and the numeric techniques that have proven useful for extracting features

120 : 


from image data.

121 : 


This survey motivates the major aspects of our proposed DSL design, which we

122 : 


discuss in \secref{sec:design}.

123 : 


We then discuss the challenges of implementing this language on parallel hardware,

124 : 


with a particular focus on GPUs, in \secref{sec:impl}.

125 : 


Our research plan is described in \secref{sec:plan}, followed by a discussion of

126 : 


the broader impact of the proposed research.

127 : 


Finally, we describe the PI backgrounds and existing NSF support.

128 : 


Related work is discussed throughout the proposal.
