
HPC Techniques for Building Web Image Caches

posted: September 25, 2012 | author: Chris Morabito

One challenge we faced from the outset with SmartView® Connect was publishing large datasets to the web quickly. On a single machine, even with a high-end processor, creating a web cache from an average county of 6-inch imagery can take days. We needed to cut this down to a matter of minutes. Our solution was to use Windows® HPC Server to build a cluster of workstation machines that process imagery in parallel.

Building an image pyramid (as we conceptualize it) can be broken into two steps:

  1. Image reprojection, resampling and tile clipping
  2. Pyramid overview construction

In our algorithm, each of these steps can be described, in parallel computing terminology, as “embarrassingly parallel” and is therefore easy to distribute using the Windows HPC “Parametric Sweep” technique.
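To make the idea concrete: a parametric sweep runs the same command many times, handing each run a different index value, and no run needs to talk to any other. The sketch below shows one way such an index could be mapped to a tile. It is an illustration only, not the code used in production; the function names, the 1-based index, and the 4 x 3 grid are all assumptions made for the example.

```python
def tile_for_index(index, tiles_across):
    """Map a 1-based sweep index to a zero-based (row, col) tile position."""
    if index < 1:
        raise ValueError("sweep index must be >= 1")
    return divmod(index - 1, tiles_across)


def sweep_range(tiles_across, tiles_down):
    """Return the inclusive (start, end) index range that covers every tile."""
    return 1, tiles_across * tiles_down


# Hypothetical 4 x 3 tile grid: 12 independent tasks, one per sweep index.
start, end = sweep_range(tiles_across=4, tiles_down=3)
assignments = [tile_for_index(i, 4) for i in range(start, end + 1)]
```

Because every index resolves to exactly one tile, the scheduler can hand the indices to any node in any order, which is what makes the work embarrassingly parallel.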

We developed an application, based on the Geospatial Data Abstraction Library (GDAL) C++ API, that is designed to run in parallel as a Parametric Sweep task. This application segments the input data for parallel processing, and then employs GDAL to perform the necessary image transformations. The result: a process that would take days to run on a single machine can be completed in less than 20 minutes with our cluster.
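The segmentation step can be pictured with a short sketch. The real application is written in C++ against GDAL and its details are not described here, so everything below is an assumption for illustration: the 256-pixel tile size, the factor-of-two reduction between pyramid levels, and the function names are all hypothetical.

```python
def tile_windows(width, height, tile_size=256):
    """Yield (x_off, y_off, x_size, y_size) pixel windows covering a raster.

    Windows on the right and bottom edges are clipped to the raster extent.
    """
    for y_off in range(0, height, tile_size):
        for x_off in range(0, width, tile_size):
            yield (x_off, y_off,
                   min(tile_size, width - x_off),
                   min(tile_size, height - y_off))


def overview_sizes(width, height, tile_size=256):
    """List the (width, height) of each overview level, halving each time
    until the whole image fits inside a single tile (assumed stopping rule)."""
    sizes = []
    while width > tile_size or height > tile_size:
        width, height = (width + 1) // 2, (height + 1) // 2
        sizes.append((width, height))
    return sizes


# Hypothetical 600 x 300 pixel raster.
windows = list(tile_windows(600, 300))
levels = overview_sizes(600, 300)
```

Each window is a self-contained unit of work for step 1, and each overview level can be tiled the same way for step 2, so both lists translate directly into sweep tasks.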

Some other technologies we’re using to squeeze every ounce of performance out of the cluster:

The increase in speed also brings better quality, a real win-win. With so much horsepower dedicated to this task, we can afford to use the best interpolation algorithms in our resampling, where other applications might skimp to save time or resources. At Woolpert, we put a premium on image quality, and we’re proud to say that we haven’t compromised that in our web services.
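The trade-off between cost and quality is easy to see in a toy example. GDAL offers several resampling kernels, including nearest neighbour, bilinear, cubic, cubic spline and Lanczos; which ones are used in production is not stated here. The sketch below compares only the two simplest, on a made-up 2 x 2 image, and the function names are hypothetical.

```python
def sample_nearest(img, x, y):
    """Nearest neighbour: return the single closest pixel (one lookup)."""
    h, w = len(img), len(img[0])
    col = min(max(int(x + 0.5), 0), w - 1)
    row = min(max(int(y + 0.5), 0), h - 1)
    return img[row][col]


def sample_bilinear(img, x, y):
    """Bilinear: blend the four surrounding pixels, weighted by distance."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


# Made-up 2 x 2 grayscale image with a diagonal gradient.
img = [[0, 100],
       [100, 200]]

nearest_value = sample_nearest(img, 0.25, 0.25)
bilinear_value = sample_bilinear(img, 0.25, 0.25)
```

Nearest neighbour snaps to the corner pixel and loses the gradient, while bilinear reads four pixels and returns a blended value. Higher-order kernels read even more neighbours per output pixel, which is why they are usually the first thing dropped when compute time is tight.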

Image caching is just one example of how woolpert_labs is working to use the latest high-performance computing techniques to work faster and smarter. With powerful hardware and innovative software, we’re building solutions that enable us to produce data more rapidly without sacrificing quality.

 

SmartView is a registered trademark of Woolpert, Inc. Windows is a registered trademark of Microsoft Corporation. Intel is a registered trademark and Core is a trademark of Intel Corporation. NVIDIA is a registered trademark and CUDA is a trademark of NVIDIA Corporation. All trademarks are used only for proper description and do not indicate sponsorship or endorsement.