Ticket #292 (assigned Task: null)

Opened 5 years ago

Last modified 4 years ago

Create Generic OPeNDAP Fetch Tool

Reported by: bbest Owned by: jjr8
Priority: Medium Milestone: Unscheduled
Component: Tools - Conversion Version:
Keywords: Cc:

Description

OPeNDAP server configuration could be handled with individual Python config files, similar to the repositories that yum reads from yum.repos.d. These config files could specify the time axis (epoch, interval) and the coordinate system (based on COARDS, cdf compliance). The attributes of the server can also be queried, either on the fly at the validation step or with another tool, like the HDF SDS header tool. We could then publish a list of these server configs on an MGET server and fetch the latest list in the ArcGIS 9.3 validation step. Highlighting the multitude of available remote data sources on the main wiki page of the website should make MGET very attractive.
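
To make the config-file idea concrete, here is a minimal sketch that parses a yum-style INI file with Python's standard configparser module. Everything in it is an assumption for illustration: the section name, the keys (url, convention, time_epoch, time_interval_days), and the epoch and interval values are hypothetical and do not describe MGET or the real time axis of the example dataset.

```python
import configparser
from datetime import datetime, timedelta

# Hypothetical config for one OPeNDAP server. The keys and the time values
# are placeholders chosen for illustration, not real dataset metadata.
EXAMPLE_CONFIG = """
[coads_climatology]
url = http://test.opendap.org/dap/data/nc/coads_climatology.nc
convention = COARDS
time_epoch = 1970-01-01
time_interval_days = 30
"""


def load_server_configs(text):
    """Parse yum-style config text into a dict of per-server settings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    servers = {}
    for name in parser.sections():
        section = parser[name]
        servers[name] = {
            "url": section["url"],
            "convention": section.get("convention", "COARDS"),
            "epoch": datetime.strptime(section["time_epoch"], "%Y-%m-%d"),
            "interval": timedelta(days=section.getfloat("time_interval_days")),
        }
    return servers


def time_for_index(server, index):
    """Convert a time-axis index to a datetime: epoch + index * interval."""
    return server["epoch"] + index * server["interval"]


servers = load_server_configs(EXAMPLE_CONFIG)
coads = servers["coads_climatology"]
third_step = time_for_index(coads, 2)
```

One file per server would keep each definition independent, so a list published on an MGET server could be refreshed by adding or replacing individual files.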

Attachments

get_dap.py Download (7.4 KB) - added by bbest 5 years ago.
latest OPeNDAP Python tool for Raincoast project by Ben
get_opendap.m Download (10.4 KB) - added by bbest 5 years ago.
latest OPeNDAP Matlab tool for Tortugas project by Ben
get_dap.2.py Download (8.9 KB) - added by bbest 4 years ago.
more recent script, from Census Map & Vis workshop

Change History

Changed 5 years ago by bbest

latest OPeNDAP Python tool for Raincoast project by Ben

Changed 5 years ago by bbest

latest OPeNDAP Matlab tool for Tortugas project by Ben

Changed 5 years ago by jjr8

  • status changed from new to assigned
  • component changed from Unknown to Tools - Conversion
  • milestone set to 0.7

Tentatively scheduling this for MGET 0.7. Will re-examine when we define the priorities and target date for 0.7.

Changed 4 years ago by bbest

more recent script, from Census Map & Vis workshop

Changed 4 years ago by bbest

Greetings, all.

Yesterday I finally released a new major version of Pydap! More than two years after 2.2 was released, I'm glad to announce the release of Pydap 3.0 beta 1. You can find the new version together with updated documentation on the official website, http://pydap.org/. For people who would still like to use Pydap 2.2, the old website is still up at http://pydap.org/2.x/.

Why a new major version?
====================

The way Pydap 2.2 works with sequences is non-intuitive, sub-optimal and sometimes quirky. As soon as I started writing a Dapper-compatible server and accessing Dapper servers with Pydap, I realized that its data model could be rewritten in a better way, simplifying the code. That was the major reason.

I also wasn't happy with the server. There's no way to change the HTML templates for the help response and data request form, for example. I rewrote the server and the responses with the aim of building an Opendap server (or more than one!) *on top* of Pydap, so the resulting code is quite flexible: templates are not tied to a specific templating engine, for example, and can even be loaded from a database instead of disk.

These changes required me to rewrite everything, breaking responses, handlers and the client API.

Why 3.0? Wasn't this supposed to be 2.3?
================================

Pydap 2.3 was almost ready two years ago; I was just finishing the documentation when I came back from the Opendap Developer's Meeting in Boulder full of new and fancy ideas. I scratched what I had and started a new version.

This other new version (which should've been 2.4, since 2.3 was now a dead branch) was almost done a couple of months ago. I was just finishing the documentation when I realized how I could make sequences more intuitive and natural, mimicking record arrays from Numpy. I started a parallel branch to test the idea, and rewrote most of the code again to work with the new data model. I also quickly wrote the documentation this time!

So this new version should be 2.5, but since I changed the name of the package (from dap to Pydap) and the module (from dap to pydap), and since this was conceptually very far from 2.2, I decided to change the version to 3.0.

What has changed?
===============

The most noticeable changes are in the client API, especially when accessing grids and sequences (see http://pydap.org/client.html). Both can now be subset using child variables; for example:

    from pydap.client import open_url

    dataset = open_url('http://test.opendap.org/dap/data/nc/coads_climatology.nc')
    sst = dataset.SST
    data = sst[0,
               (-10 < sst.COADSY) & (sst.COADSY < 10),
               (sst.COADSX > 320) & (sst.COADSX < 328)]

    dataset = open_url('http://dapper.pmel.noaa.gov/dapper/argo/argo_all.cdp')
    seq = dataset.location[
        (dataset.location.LATITUDE > -2) &
        (dataset.location.LATITUDE < 2) &
        (dataset.location.LONGITUDE > 320) &
        (dataset.location.LONGITUDE < 330)]

    for record in seq[:5]:
        print record.JULD.data

The client also supports calling server-side functions. Here's an example with the geogrid() function from Hyrax:

    dataset = open_url('http://test.opendap.org/dap/data/nc/coads_climatology.nc')
    new_dataset = dataset.functions.geogrid(dataset.SST, 10, 20, -10, 60)

There's no mechanism for auto-discovery of functions, nor a list of predetermined functions. You can pass any function name and Pydap will try to call it on the server. It even works with nested function calls, thanks to some Python magic (lazy evaluation rules!).
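
The "any function name" behaviour can be illustrated with a small attribute proxy. This is only a sketch of the general technique (deferring calls through __getattr__ and rendering them later); it is not Pydap's actual implementation, the class names FunctionCall and FunctionProxy are made up, and "outer" is a hypothetical function name used to show nesting.

```python
class FunctionCall:
    """A deferred call that renders to a function-call string on demand."""

    def __init__(self, name, args):
        self.name = name
        self.args = args

    def __str__(self):
        # Nested FunctionCall arguments render recursively through str().
        return "%s(%s)" % (self.name, ",".join(str(a) for a in self.args))


class FunctionProxy:
    """Turns any public attribute name into a builder of FunctionCall objects."""

    def __getattr__(self, name):
        # Leave dunder/private lookups alone so copy, pickle, etc. behave.
        if name.startswith("_"):
            raise AttributeError(name)

        def build(*args):
            return FunctionCall(name, args)

        return build


functions = FunctionProxy()

# Nothing is evaluated or sent here; the calls only build an expression tree.
# "outer" is a hypothetical name; no list of valid names is consulted.
expr = functions.outer(functions.geogrid("SST", 10, 20, -10, 60), 2)
rendered = str(expr)
```

Because rendering happens only when the string is needed, the inner call is still an object when it is passed to the outer one, which is what makes nesting work.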

On the server, not much of what was changed is apparent. It's now possible to customize the help response and the data request form HTML; both are now templates, together with the template for the directory listing. Another change is that handlers are now full WSGI applications, so it's very easy to integrate Pydap with other Python frameworks like Django, for example.
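
As a rough sketch of why WSGI handlers are easy to integrate, the code below mounts one WSGI callable under a URL prefix next to another application, using only the WSGI calling convention. The data_handler function is a stand-in, not a real Pydap handler, and the mount helper, the not_found fallback, and the "/opendap" prefix are hypothetical names chosen for illustration.

```python
def data_handler(environ, start_response):
    """Stand-in for a data handler; any WSGI callable could take its place."""
    body = ("dataset path: %s" % environ.get("PATH_INFO", "")).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def not_found(environ, start_response):
    """Fallback application for paths outside the mounted prefix."""
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def mount(prefix, app, fallback):
    """Route requests under `prefix` to `app`, everything else to `fallback`."""
    def router(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path == prefix or path.startswith(prefix + "/"):
            # Move the prefix from PATH_INFO to SCRIPT_NAME for the child app.
            child = dict(environ)
            child["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
            child["PATH_INFO"] = path[len(prefix):]
            return app(child, start_response)
        return fallback(environ, start_response)
    return router


# Hypothetical layout: data requests live under /opendap.
application = mount("/opendap", data_handler, not_found)
```

In a real deployment the fallback would be the host framework's own WSGI application rather than a 404 stub.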

What lies ahead?
=============

Currently, only the NetCDF handler has been ported to the 3.0 tree. Some handlers are very easy to port (Matlab), some will take a bit more work (SQL), and some need to be redone (HDF5). Actually, the NetCDF handler works with PyNIO, so it can also handle Grib and HDF 4 files. If you need a specific handler, send an email to the mailing list and I'll bump its priority.

For my next step I want to rewrite the WMS response, using matplotlib for better plots, and improve the KML response -- drawing colorbars, for example, and plotting sequence data as markers which you can click to get information about the data. After that I want to write an NcML parser in Python. This will bring aggregation capabilities to the server.

I also plan to write a new server, based on Pydap. This new server should be a full web application, handling authentication/authorization for specific datasets, mounting datasets on arbitrary URLs, and with a georeferenced search interface. I also want to make this easier to install, perhaps using a VMWare appliance, or an executable (yes, you can make executables from Python code) with everything included.

Meanwhile, I should maintain the 2.x build for at least a year, backporting fixes and small features like I usually do.

Enjoy Pydap! As I like to say, even though the phrase is not mine, a lot of effort went into making this effortless.

Thanks, --Rob

Changed 4 years ago by jjr8

  • milestone changed from 0.7 to Unscheduled

Postponing to Unscheduled milestone. We will revisit this when we next revisit the goals of upcoming releases.
