Home My Page Projects Code Snippets Project Openings diderot

SCM Repository

[diderot] Annotation of /trunk/doc/whitepaper/intro.tex
 [diderot] / trunk / doc / whitepaper / intro.tex

Annotation of /trunk/doc/whitepaper/intro.tex

 1 : jhr 233 %!TEX root = paper.tex 2 : % 3 : 4 : \section{Introduction} 5 : 6 : The analysis of structure in three-dimensional images is increasingly 7 : valuable for biomedical research and computational science. 8 : \emph{Computed Tomography} (CT) and \emph{Magnetic Resonance Imaging} 9 : (MRI) scanners continue 10 : to improve in speed and resolution, driving clinical research in 11 : image-based screening and diagnosis, and recent microscopy techniques 12 : are producing detailed snapshots of cellular processes for biomedical 13 : research. 14 : Regardless of the image modality or application, the values 15 : stored at individual samples in the three-dimensional image are 16 : measurements of some essential physical or biological quantity, and 17 : the spatial structure of those values indirectly represents relevant 18 : features of the scanned object or system. 19 : Image analysis plays the 20 : crucial role of detecting and extracting the salient image structure 21 : in order to build geometric and quantitative models of anatomy or 22 : organization. 23 : 24 : %Volumetric imagery 25 : %also arises in non-destructive testing of manufacturing, siesmology 26 : %and geophysics, and homeland security. 27 : 28 : Ideally, image analysis research creates efficient and powerful tools 29 : that increase the scientific utility and impact of image acquisitions. 30 : In practice, a combination of factors limits the development and 31 : application of novel image-analysis methods. 32 : The computational burden 33 : of processing images is increasing as devices produce images of higher 34 : resolution (\eg{}, typical CT scans have gone from $128^3$ to roughly $512^3$ 35 : resolutions~\cite{GLK:Reiser2008}). 36 : jhr 307 With the latest scanning technologies, it is also more common for the values measured at 37 : jhr 233 each sample to be multi-dimensional rather than a single scalar, which 38 : further complicates implementing mathematically correct methods. 39 : % diffusion-weighted MRI now commonly measures at least 30 value per 40 : % sample, and the simplest tensor models estimated from these 41 : % measurements have at least six degrees of freedom per point. 42 : While image-analysis algorithms are computationally expensive, there 43 : is hope in being able to exploit modern parallel hardware to greatly 44 : reduce their running time. 45 : Unfortunately, the scientists involved in 46 : creating the latest imaging experiments and acquisitions are usually 47 : not in a position to implement correspondingly sophisticated analysis 48 : algorithms that run on the latest parallel hardware, which reinforces the 49 : gap between the class of methods advanced by computer science 50 : jhr 307 research and those that gain currency in scientific research 51 : jhr 233 applications. 52 : 53 : We propose a new way to express and implement image-analysis algorithms that 54 : strives for both efficiency and flexibility. 55 : Our proposed approach is based on a \emph{domain-specific language} (DSL) 56 : that encapsulates the mathematical ingredients of working in 57 : jhr 307 a smooth image domain (reconstructed from discrete samples) and 58 : jhr 233 offers a powerful abstraction with which analysis algorithms can be 59 : decomposed into a coordinated set of actors that capture the parallelism 60 : inherent in these algorithms. 61 : 62 : jhr 307 One advantage of DSLs is that they provide a high-level programming model 63 : jhr 233 that allows domain experts to program using the natural notation and concepts of the domain. 64 : For example, parser generators, such as Yacc~\cite{yacc}, allow language syntax 65 : jhr 307 to be specified using the notation of context-free grammars, and form compilers, 66 : jhr 233 such as FFC~\cite{ffc}, take high-level descriptions of a finite element variational form 67 : and generate low-level C code for efficient evaluation of the element tensor 68 : and associated quantities. 69 : jhr 307 A second advantage is that DSLs allow for \emph{domain-specific} optimizations, 70 : which can result in highly-efficient implementations. 71 : Thirdly, we can view a DSL as a program generator, which have proven to be 72 : a very effective way to get portable high-performance code. 73 : jhr 233 For example, the leading FFT library (FFTW~\cite{fftw}) uses a program generator 74 : to generate versions of the FFT algorithm that are specialized for many different 75 : hardware targets. 76 : jhr 307 Thus, DSLs can provide both a very high-level programming model with very high-performance 77 : executables on a variety of target platforms. 78 : jhr 233 79 : As others have argued~\cite{hanrahan:dsl-gpu}, the use of DSLs for parallelism 80 : is particularly advantageous. 81 : Getting the maximum performance for a parallel program on a particular parallel platform 82 : usually requires hand tuning the program for the given platform. 83 : But such a tuned program is not portable and will have to be retuned, or even rewritten, 84 : to take full advantage of any new platform. 85 : For example, clusters, multicore SMP systems, and GPUs (Graphics Processing Units) 86 : are parallel systems that are found in many labs, and that have very different computational 87 : and memory-system characteristics. 88 : As the FFTW library demonstrates, domain-specific program generation is one solution to 89 : the portable performance problem. 90 : DSLs provide a high-level interface to such program generators. 91 : Our proposed research follows this model of fixing the application domain (to image 92 : analysis in our case) and combining domain-specific optimizations with target-specific 93 : jhr 307 program generation to get a high-level programming language with high performance. 94 : jhr 233 95 : There are three equally important aspects of the proposed research. 96 : \begin{enumerate} 97 : \item 98 : jhr 307 To design a high-level, mathematical programming model that allows existing 99 : visualization and analysis methods to be expressed in simple and readily 100 : understood programs. 101 : jhr 233 \item 102 : To test and evolve this programming model by exploring a range of 103 : both established and novel image-analysis algorithms. 104 : \item 105 : And to develop techniques for implementing the programming 106 : model so that a given program can be compiled to high-performance 107 : implementations on a variety of parallel-hardware platforms without 108 : hand-tuning the code for each platform. 109 : \end{enumerate}% 110 : 111 : This proposal is a collaboration between PI Reppy, who has extensive experience 112 : in both parallel and sequential language design and implementation, and PI Kindlmann, who 113 : has extensive domain expertise in the area of image analysis. 114 : The proposed research will make contributions to both the area of parallel-language 115 : jhr 307 design and implementation and to the area of image analysis and visualization. 116 : jhr 233 117 : The proposal is organized as follows. 118 : In the next section, we give an overview of the image-analysis domain 119 : and the numeric techniques that have proven useful for extracting features 120 : from image data. 121 : This survey motivates the major aspects of our proposed DSL design, which we 122 : discuss in \secref{sec:design}. 123 : We then discuss the challenges of implementing this language on parallel hardware, 124 : with a particular focus on GPUs, in \secref{sec:impl}. 125 : Our research plan is described in \secref{sec:plan}, followed by a discussion of 126 : the broader impact of the proposed research. 127 : Finally, we describe the PI backgrounds and existing NSF support. 128 : Related work is discussed throughout the proposal.