Showing posts with label WPS. Show all posts

Monday, 28 November 2011

Serving Meteo data with GeoServer, GeoBatch and GeoNetwork: the LaMMA use case


Dear All,
in this post I'd like to talk about the work we have done for the LaMMA consortium.

The Problem
The purpose of this project is to build a complete Spatial Data Infrastructure (SDI) providing spatio-temporal raster data processing, publishing, and interactive visualisation facilities. This platform is a candidate to replace the current one, which was also built on Open Source software but was rather static and exposed no OGC services.

The data that will be ingested into the system is generated by an existing processing infrastructure which produces a set of different MetOc models. Our goal is to manage the geophysical parameters (or variables) produced by the following models:
  • ARW ECM
    • 3 Km resolution
    • 9 Km resolution
  • GFS
    • 50 Km resolution
The ingestion starts every day at noon and at midnight, hence there are 2 run times a day for each model at a given resolution, and the produced data contains different forecast times:
  • ARW ECM (3 days with an interval of 1 h)
  • GFS (8 days with an interval of 6 h)
The data is produced in GRIB format (version 1).
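The run-time and forecast-time structure above can be sketched in a few lines of Python. This is an illustrative enumeration only; whether the models include the analysis instant (t+0) as a forecast step is an assumption here, not something stated by the project:

```python
from datetime import datetime, timedelta

def forecast_times(run_time, horizon_days, step_hours):
    """Enumerate the forecast instants produced by a single model run,
    assuming the analysis instant (t+0) is included."""
    steps = int(horizon_days * 24 / step_hours)
    return [run_time + timedelta(hours=i * step_hours) for i in range(steps + 1)]

run = datetime(2011, 11, 28, 12, 0)   # the noon run of the day
arw = forecast_times(run, 3, 1)       # ARW ECM: 3 days with a 1 h interval
gfs = forecast_times(run, 8, 6)       # GFS: 8 days with a 6 h interval
```

Each run therefore yields dozens of forecast instants, and each instant becomes one 2D slice to ingest and publish.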

Our Solution
Leveraging the OpenSDI suite, and specifically the components described below, as well as some other well-known Open Source projects (Apache Tomcat, Apache HTTP Server, PostgreSQL), we provided an extensible, standards-based platform to automatically ingest and publish data.

The infrastructure we have put together is depicted in the deployment diagram below.
Deployment diagram
This infrastructure has been designed from the beginning to scale to a large number of external users: it is based on a GeoServer master/slave setup where multiple slaves can be added for higher throughput. Caching will be tackled in a later phase.

As you can see, we provided three access levels for different types of users:
  • Admins can locally access the entire infrastructure and add GeoServer instances to the cluster to improve performance
  • Power users can remotely add files for ingestion and administer GeoBatch via Basic Authentication
  • Users can browse the ingested data through one of the GeoServer slave machines behind an Apache httpd proxy server; the load of these accesses is distributed across all available slaves
As mentioned above, the main building blocks are as follows:
  • GeoServer, for providing WMS, WCS and WFS services with support for the TIME and ELEVATION dimensions
  • GeoNetwork, for publishing metadata for all data, with specific customizations for managing the TIME dimension in the datasets
  • GeoBatch, for performing near-real-time preprocessing and ingestion of data and related metadata with minimal human intervention

Using GeoBatch for ingestion and data preprocessing
In the LaMMA project the GeoBatch framework is used to preprocess and ingest the incoming GRIB files, as well as to handle data removal based on a sliding temporal window (currently set to 7 days), since it was a design decision to keep only the last 7 days of forecasts available for live serving.

Below you can find a diagram depicting one of the automatic ingestion flows we created for the LaMMA project using the GeoBatch framework.

GeoBatch ingestion flow example

The various building blocks comprising this flow are explained here below:
  1. NetCDF2GeotiffAction reads the incoming GRIB file and produces a proper set of GeoTiffs, performing on-the-fly tiling, pyramiding and unit conversions. Each GeoTiff represents a 2D slice out of one of the original 4D cubes contained in the source GRIB file
  2. ImageMosaicAction uses the GeoServer Manager library to create the ImageMosaic store and layer in the GeoServer master. The created ImageMosaic contains the proper configuration to parse the Time and Elevation dimensions' values from the GeoTiffs in order to create 4D layers in GeoServer
  3. XstreamAction takes an XML file and deserializes it into a Java object, which is passed to the next action
  4. FreeMarkerAction produces a proper XML metadata file for publishing in GeoNetwork, using a pre-cooked template and the passed data model
  5. GeoNetworkAction publishes the metadata on the target GeoNetwork
  6. ReloadAction forces a reload on all the GeoServer slaves in order to pick up the changes made by the master instance
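Under the hood, registering an ImageMosaic with GeoServer boils down to a single REST call against the master. The sketch below builds that call in Python as an illustration of what step 2 automates; the host, workspace, store name and mosaic directory are hypothetical, and the actual flow uses the GeoServer Manager Java library rather than raw HTTP:

```python
def imagemosaic_request(base_url, workspace, store, mosaic_dir):
    """Build the GeoServer REST request that registers a directory of
    GeoTIFFs as an ImageMosaic coverage store (illustrative sketch)."""
    url = (f"{base_url}/rest/workspaces/{workspace}"
           f"/coveragestores/{store}/external.imagemosaic")
    return {
        "method": "PUT",
        "url": url,
        "headers": {"Content-Type": "text/plain"},
        # the request body points GeoServer at the mosaic directory on disk
        "body": f"file://{mosaic_dir}",
    }

req = imagemosaic_request("http://master:8080/geoserver",
                          "lamma", "arw3_temperature",
                          "/data/mosaics/arw3_temperature")
```

Once the store exists on the master, the slaves only need a catalog reload (step 6) to start serving the new layer.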
This type of flow (with a slightly different setup) is used to convert and publish the 3 different incoming models.

The other type of flow is the remove flow, which is composed of the following building blocks:
  1. ScriptingAction executes a remove.groovy script which will:
    • calculate the oldest time to retain
    • select the older files to be removed
    • search for and remove the matching metadata from GeoNetwork
    • remove the collected layers and stores from the GeoServer master catalog
    • permanently delete the successfully removed files
  2. ReloadAction forces a reload on all the GeoServer slaves.
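The first step of the remove.groovy script, computing the retention cutoff and selecting the expired runs, can be sketched as follows. This is a Python illustration of the sliding-window logic, not the actual Groovy script:

```python
from datetime import datetime, timedelta

def select_expired(run_times, now, window_days=7):
    """Return the model run times falling outside the retention window,
    i.e. the runs whose files, layers, stores and metadata should be removed."""
    cutoff = now - timedelta(days=window_days)
    return sorted(t for t in run_times if t < cutoff)

now = datetime(2011, 11, 28, 12, 0)
# the last 20 half-daily runs (two run times per day)
runs = [now - timedelta(hours=12 * i) for i in range(20)]
expired = select_expired(runs, now)
```

With two runs a day, a 7-day window keeps 14 runs per model and expires everything older on each execution of the flow.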

Using GeoNetwork for metadata management
We have customized the metadata indexing (thanks Lucene!) in GeoNetwork in order to be able to index meteorological model executions in terms of both their run times and their forecast times.
Generally speaking, the data we are dealing with is driven by a meteorological model which daily produces a certain number of geophysical parameters whose temporal validity spans a certain number of time instants (forecast times) in the future. In GeoNetwork we currently create a new metadata object for each geophysical parameter (e.g. Temperature) of a new model run; this metadata object contains multiple links to WMS requests, one for each forecast time, leveraging the TIME dimension in GeoServer (see picture below). Moreover, the forecast times themselves are indexed so that advanced searches can be performed on them.



If you have questions about the work described in this post, or if you want to know more about how our services could help your organization reach its goals, do not hesitate to contact us.


The GeoSolutions team,

Sunday, 23 January 2011

Developers Corner: have your SLD transform raster data into vectors on the fly

Hi all,
in this post we'd like to share our most recent endeavor in dynamic data rendering within the GeoServer and GeoTools open source projects.

The problem

Suppose you have a set of scientific raster data sets, maybe they represent some sort of concentration, elevation, or maybe they represent wind, currents, or some other vector phenomena via two bands (one for magnitude, one for direction).

Now, you have lots of them and people want to display them in various ways. Rasters with color scales are nice, but often you need to render the data in other ways, such as contour lines, polygons collecting all the pixels within certain ranges, or vector fields (think wind barbs).
Those are all raster-to-vector conversion processes that a WPS can take care of. However, suppose you also have a ton of raster data, and that the raster classification parameters need to be dynamic, with a user providing, for example, the contour levels to extract.

Now you're facing a somewhat hard problem, in theory you would have to:
  • call the WPS with the given data
  • store the results somewhere
  • register that new layer as a published WMS layer, along with the proper style
  • update the viewer to add that layer
  • purge that temporary layer once the user is done or wants a different set of parameters to be applied in the transformations
To put the icing on the cake, suppose your datasets are massive, so doing the WPS extraction at full resolution can take quite some time... This does not really sound like a situation one would want their server and client infrastructure to deal with.

The solution

Instead of doing all of the above work, wouldn't it be nice to just specify the transformation needed in the style sheet? That's exactly the road we decided to follow.
We've created an SLD extension that allows piping a process (yes, a WPS one) into the SLD so that the rendered data can be dynamically updated. It looks like the following:
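The original style-sheet snippet did not survive extraction; what follows is a minimal sketch of such a rendering transformation, assuming the gs:Contour process, a raster input parameter named data, and illustrative contour levels (the concrete process and levels in the project's style may differ):

```xml
<FeatureTypeStyle>
  <!-- the transformation is invoked before symbolization -->
  <Transformation>
    <ogc:Function name="gs:Contour">
      <ogc:Function name="parameter">
        <ogc:Literal>data</ogc:Literal>
      </ogc:Function>
      <ogc:Function name="parameter">
        <ogc:Literal>levels</ogc:Literal>
        <ogc:Literal>1100</ogc:Literal>
        <ogc:Literal>1200</ogc:Literal>
        <ogc:Literal>1300</ogc:Literal>
      </ogc:Function>
    </ogc:Function>
  </Transformation>
  <Rule>
    <!-- the process output is a vector layer, styled as usual -->
    <LineSymbolizer>
      <Stroke/>
    </LineSymbolizer>
  </Rule>
</FeatureTypeStyle>
```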




The above would call the contouring process on the fly and then render its result: no need to create and manage a new vector layer, since the data is generated on the fly only when needed.
Here is how the result looks (using a style sheet just a bit more complex than the above one):



By chaining transformations we can also extract the value of single pixels and show it as a label, as in the following example:



Alternatively you may want to extract the polygons containing all the cells in a certain data range, like in the following transformation:


The result, coloring each range in a different way, is:



Finally, we may want to extract a set of wind arrows starting from a raster holding the horizontal and vertical components of a vector (u and v):



We're linking to the full SLD of this last one because it's quite a testament to SLD flexibility: the magnitude and direction of the arrows are computed on the fly using filter functions (functions that are part of GeoTools/GeoServer; you may not find them in just any implementation).
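To give a flavor of the filter-function idea, an arrow symbolizer's size and rotation can be driven by expressions over the two bands. The band names u and v and the exact set of functions used are assumptions here for illustration, not the project's actual SLD:

```xml
<!-- arrow size ~ wind magnitude, computed on the fly as sqrt(u*u + v*v) -->
<Size>
  <ogc:Function name="sqrt">
    <ogc:Add>
      <ogc:Mul>
        <ogc:PropertyName>u</ogc:PropertyName>
        <ogc:PropertyName>u</ogc:PropertyName>
      </ogc:Mul>
      <ogc:Mul>
        <ogc:PropertyName>v</ogc:PropertyName>
        <ogc:PropertyName>v</ogc:PropertyName>
      </ogc:Mul>
    </ogc:Add>
  </ogc:Function>
</Size>
<!-- arrow rotation ~ wind direction, derived from the two components -->
<Rotation>
  <ogc:Function name="toDegrees">
    <ogc:Function name="atan2">
      <ogc:PropertyName>u</ogc:PropertyName>
      <ogc:PropertyName>v</ogc:PropertyName>
    </ogc:Function>
  </ogc:Function>
</Rotation>
```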

One important bit here is that the raster-to-vector conversions happen at the visualization resolution: this means the transformation can work against large datasets without heavy slowdowns, because it runs only on the area you're looking at, and at the resolution that would have been used to draw the raster.
This makes it possible to get fast, on-the-fly operations that do not excessively slow down rendering.

This is yet another example of how processing capabilities can be integrated into GeoServer, and it's by no means the last. Also, there is still plenty that can be done to improve this kind of transformations, as well as new transformations to support mapping tools such as heatmaps. Interested? Let us know!


The GeoSolutions team

Wednesday, 24 November 2010

Fun Stuff: Computing circular buffers in geographic coordinates

Finding all the objects within a certain distance from a point is surely a common GIS problem. The problem is normally solved using OGC "dwithin" filters or by computing a buffer and then finding all the intersecting objects.

Very often both approaches fail miserably when the coordinate system is a geographic one, as common libraries such as JTS and GEOS are not able to handle its non-planar nature. As far as "dwithin" is concerned, recent Oracle and PostGIS versions can handle the problem properly, but what to do when they cannot be used?

We had to solve this problem when computing data distribution statistics over raster cells within a certain distance from a given point, aiming for an accurate calculation regardless of how large the distance was.

To do that we created a new GeoServer WPS process, "gs:PointBuffers", that can create a set of buffers given a point, a target SRS and a set of distances in meters.
In case the SRS denotes a geographic spatial reference system, the GeoTools GeodeticCalculator is used to sample the set of points at the given distance, looping over a closed set of azimuths to cover the entire shape.
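The azimuth-sampling idea can be sketched in pure Python using the spherical destination-point formula. Note this is a spherical approximation for illustration only; the actual process uses the GeoTools GeodeticCalculator, which works on the ellipsoid:

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)

def destination(lon, lat, azimuth_deg, distance_m):
    """Great-circle destination point from (lon, lat) along an azimuth."""
    d = distance_m / EARTH_RADIUS_M
    az = math.radians(azimuth_deg)
    lat1 = math.radians(lat)
    lat2 = math.asin(math.sin(lat1) * math.cos(d) +
                     math.cos(lat1) * math.sin(d) * math.cos(az))
    lon2 = math.radians(lon) + math.atan2(
        math.sin(az) * math.sin(d) * math.cos(lat1),
        math.cos(d) - math.sin(lat1) * math.sin(lat2))
    return math.degrees(lon2), math.degrees(lat2)

def point_buffer(lon, lat, distance_m, samples=72):
    """Sample the buffer boundary over a closed set of azimuths."""
    return [destination(lon, lat, i * 360.0 / samples, distance_m)
            for i in range(samples)]

ring = point_buffer(11.0, 45.0, 100_000)  # 100 km buffer in northern Italy
```

At 45° of latitude the eastward point ends up more degrees of longitude away than the northward point is degrees of latitude away, which is exactly the elliptical shape shown below.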

Interested in seeing the results? I certainly was.
Let's start with a set of small buffers at a medium latitude: 10, 30, 50 and 100 km buffers around a point located in northern Italy. Here is the result:

As you can see, drawing the result in plain WGS84 (plate carrée for the connoisseurs) we get elliptical shapes. This should not come as a surprise if you consider that at 45° one degree of latitude spans 111 km, whilst a degree of longitude spans only 78 km (see the "Degree length" table on Wikipedia).

What if we pump up the distance significantly? Let's try with 100, 500, 1000, 2000 and 3000km instead. Here is the result:

See the funny shape we get? This is the effect of one degree of longitude shrinking in size as we move north.
It is also a good indicator of how deformed the now-common WGS84 maps, often published on the web, really are.
If you want to see the same data in a common projection, let's have a look at the same map in EPSG:3857 (aka the Google projection):
Somewhat better, even if the Mercator tendency to inflate areas at high latitudes is quite evident.

Well, this is it. The gs:PointBuffers process is soon going to land in GeoServer for your testing pleasure.
We'd very much like to tackle the same problem against lines and polygons as well. Interested? Let us know!

The GeoSolutions team

Monday, 15 November 2010

GeoSolutions helps GeoServer WPS go mainstream

GeoSolutions is funding Andrea Aime's time to bring the GeoServer Web Processing Service (WPS) module up to the status of an officially supported extension. A proposal has been submitted and is being voted on by the Project Steering Committee; you can track the progress here.

For those who have no idea what WPS means, I can cite part of the description I found on Wikipedia, which is very informative:
"The (WPS) is designed to standardize the way that GIS calculations are made available to the Internet. WPS can describe any calculation (i.e. process) including all of its inputs and outputs, and trigger its execution as a Web Service. WPS supports simultaneous exposure of processes via HTTP GET, HTTP POST, and SOAP, thus allowing the client to choose the most appropriate interface mechanism. The specific processes served up by a WPS implementation are defined by the owner of that implementation. Although WPS was designed to work with spatially referenced data, it can be used with any kind of data.
WPS makes it possible to publish, find, and bind to processes in a standardized and thus interoperable fashion. Theoretically it is transport/platform neutral (like SOAP), but in practice it has only been specified for HTTP.
WPS defines three operations:
  1. GetCapabilities returns service-level metadata
  2. DescribeProcess returns a description of a process including its inputs and outputs
  3. Execute returns the output(s) of a process"
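The three operations quoted above are commonly exercised via simple key-value-pair HTTP requests. The sketch below builds such URLs in Python; the endpoint path and process identifier are illustrative assumptions, not a documented deployment:

```python
from urllib.parse import urlencode

def wps_request(base_url, operation, **extra):
    """Build a WPS 1.0.0 key-value-pair request URL for one of the
    three core operations (GetCapabilities, DescribeProcess, Execute)."""
    params = {"service": "WPS", "version": "1.0.0", "request": operation}
    params.update(extra)
    return f"{base_url}?{urlencode(params)}"

caps = wps_request("http://localhost:8080/geoserver/ows", "GetCapabilities")
desc = wps_request("http://localhost:8080/geoserver/ows", "DescribeProcess",
                   identifier="gs:PointBuffers")
```

Execute requests, which carry actual process inputs, are usually sent as XML via HTTP POST instead, since inputs can be full geometries or coverages.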
GeoSolutions' plans for the WPS module in GeoServer extend beyond simply making it an official extension; in the short term we intend to publish a set of processes we have developed for FAO, which will add support for sophisticated statistics on both raster and vector data, as well as other more specific processes (raster crop by polygon, to name one).
In the longer term our goals include the following items:
  • Exposing GDAL utilities (e.g. gdalinfo) as WPS processes
  • Exposing Octave functions as WPS processes
  • Exposing IDL routines as WPS processes
  • Raster Algebra support via JAI-Tools
  • Improving support for grid processing
  • Improving support for clustered processing for superior scalability
Stay tuned for more information and, if you want to know more, you can always contact us directly!
The GeoSolutions team.