Marine Geospatial Ecology Tools
Getting Started with Marine Geospatial Ecology Tools
| MGET version: | 0.8a2 |
| Document version: | $Id: GettingStarted.xsl 404 2009-06-08 17:48:16Z jjr8 $ |
| Document status: | Complete |
| Maintainer email: | jason.roberts@duke.edu |
| MGET home page: | http://code.env.duke.edu/projects/mget |
Marine Geospatial Ecology Tools (MGET) is an open source programming library designed for coastal and marine researchers and GIS analysts who work with spatially-explicit ecological and oceanographic data in scientific or managerial workflows. The initial MGET releases focus on tools useful in habitat modeling, including tools for processing and sampling remotely-sensed oceanographic data and for creating and evaluating statistical models. Also included are many powerful general-purpose data processing tools, such as batch-processing versions of popular ArcGIS Spatial Analyst tools. Subsequent MGET releases will include more advanced tools presently under development, such as tools for hydrodynamic modeling, connectivity modeling, and spatially-explicit fishery modeling.
MGET was designed with an emphasis on maximizing reliability and usability. We are all too familiar with the abundance of undocumented, confusing, and difficult-to-install software that works great on the original developer's machine but not your own. We try to minimize these headaches by providing the same features you expect from the highest quality software, including an installer, documentation, extensive error-handling code, and adequate testing before we release new versions. We are professional software engineers who take pride in our work. If you find a bug or see something that could be improved, please report it to Jason Roberts (jason.roberts@duke.edu). Thanks for your assistance!
You may notice that the Reference Documentation and the MGET source code use the word GeoEco rather than the phrase Marine Geospatial Ecology Tools. The two are synonymous.
If you have done some computer programming, you may be familiar with the need to invent words for use in source code. These words are typically short and allow programmers to quickly distinguish one library of functions from another. We selected GeoEco as the code-word for Marine Geospatial Ecology Tools because it is catchy, it shows up infrequently in Internet search engines relative to other choices such as MGET, and it does not suggest "marine". This last point may seem contrary to our mission, but we want to offer our tools to terrestrial users, even if we currently market the tools mainly to marine users. Many of the existing tools are equally applicable above and below the waterline.
Minimum requirements:
Additional requirements for full functionality:
Much of MGET is written to be operating-system independent. We will enable installation on other operating systems in a future release.
Although you must be logged on as an administrator to install MGET, you do not need to be an administrator to run the tools once they are installed.
Exit all instances of ArcCatalog, ArcMap, ArcGlobe, etc.
You can skip this step if you do not have ArcGIS and do not want to install it. If you do have ArcGIS we recommend you take this opportunity to install the latest service pack. This is not required, but may improve the reliability and performance of the MGET tools that require ArcGIS.
There is no harm in applying the same ArcGIS service pack more than once.
MGET is implemented in the Python programming language as a package called GeoEco. Before you install it, you must install the proper versions of Python and the pywin32 package. But first, there are a few things you need to know about Python.
Python version numbers
Each release of the Python language interpreter is assigned a version number of the form X.Y, where X is the major number and Y is the minor number. The major number is incremented when there is a fundamental change to the language. The minor number is reset to zero whenever the major number changes, and is incremented when new features are added to the major release.
Occasionally, the Python development team will fix a bunch of bugs and issue a "bug fix" release. These releases have the form X.Y.Z where Z is the bug fix release number, starting with 1. For example, 2.5.1 is the first bug fix to Python 2.5.
Multiple releases of Python can be installed on a Windows computer at the same time. For example, you can have Python 2.1 and 2.5 installed simultaneously. But whenever you install a bug fix release, it overwrites the installation having the same X.Y number. For example, if you have Python 2.5 installed and you install 2.5.1, it will overwrite your 2.5 installation and now you will only have 2.5.1 on your machine. If you then install 2.5.2, you will only have 2.5.2 on your machine.
Unless you override the installation options in the Python setup program, Python releases are installed to C:\PythonXY. For example, if you install Python 2.1 and 2.5, they will be installed to C:\Python21 and C:\Python25. When you install bug fix releases, they overwrite the files in these directories with updated versions. We recommend you use the default installation directory C:\PythonXY unless you are an expert user.
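As a small illustration of the numbering and directory rules above, the following sketch parses a version string and derives the default installation directory. The function names are hypothetical and exist only for this example; they are not part of Python or GeoEco.

```python
def parse_python_version(version):
    """Split 'X.Y' or 'X.Y.Z' into (major, minor, bugfix).

    A release with no bug fix number (e.g. '2.5') is treated as bugfix 0.
    """
    parts = [int(p) for p in version.split(".")]
    if len(parts) == 2:
        parts.append(0)
    if len(parts) != 3:
        raise ValueError("expected X.Y or X.Y.Z, got %r" % version)
    return tuple(parts)


def default_install_dir(version):
    """Return the default Windows installation directory for a release."""
    major, minor, _bugfix = parse_python_version(version)
    return "C:\\Python%d%d" % (major, minor)


def overwrites(installed, new):
    """True when installing `new` replaces `installed` (same X.Y number)."""
    return parse_python_version(installed)[:2] == parse_python_version(new)[:2]
```

For example, `default_install_dir("2.5.2")` gives `C:\Python25`, and `overwrites("2.5", "2.5.1")` is true because both releases share the 2.5 directory, whereas 2.1 and 2.5 can coexist.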
Windows file association determines which Python version is used to execute scripts
When you have multiple versions of Python installed, Windows must decide which version to use when you double-click a Python (.py) script to run it from Windows Explorer. Windows maintains a "file association" database that associates programs to file types based on their file extensions. In general, whenever you install new software, the setup program modifies the file associations to tell Windows to use the new program to open the types of files it works with. Whenever you install a version of Python, it reconfigures the file associations so that version of Python will be used to execute scripts. For example, if you have Python 2.1 and then install 2.4, from that point forward 2.4 will be used to run Python scripts. If you then installed 2.3, it would be used even though 2.4 is installed.
You can change the file associations manually; see Microsoft Knowledge Base article 307859. To change from Python 2.3 to 2.4, for example, you would change the associations for .py, .pyc and .pyo files from C:\Python23\python.exe to C:\Python24\python.exe and .pyw from C:\Python23\pythonw.exe to C:\Python24\pythonw.exe. If you are uncomfortable doing this, you can always reinstall Python 2.4 to restore the file associations.
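If you want to check the association without opening any dialog boxes, one option is to inspect the output of the Windows `ftype` command. The sketch below assumes that `ftype Python.File` prints a line of the form `Python.File="C:\Python24\python.exe" "%1" %*`; both that output format and the helper names are assumptions made for this example, so verify them on your own machine.

```python
import re

# Matches NAME="path\to\something.exe" ... or the same without quotes.
_FTYPE_LINE = re.compile(r'^[^=]+="?([^"]+?\.exe)"?(\s|$)', re.IGNORECASE)


def interpreter_from_ftype(ftype_line):
    """Return the interpreter path from one line of ftype output, or None."""
    match = _FTYPE_LINE.match(ftype_line.strip())
    if match is None:
        return None
    return match.group(1)


def python_dir_name(interpreter_path):
    """Return the directory holding the interpreter, e.g. 'Python24'."""
    parts = interpreter_path.replace("/", "\\").split("\\")
    if len(parts) < 2:
        return None
    return parts[-2]
```

Given the example line above, `interpreter_from_ftype` returns `C:\Python24\python.exe` and `python_dir_name` returns `Python24`, which tells you that Python 2.4 will run scripts started through the file association.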
The main things to remember are:
ArcGIS 9.2 and prior versions rely on Windows file associations
ArcGIS 9.2 and prior versions use the Windows Command Processor (cmd.exe) to execute Python scripts. cmd.exe uses Windows file associations to determine which program should be used to execute .py files. Thus, unless you have manually modified your file associations, these versions of ArcGIS will execute Python scripts using the version of Python that was most recently installed.
ArcGIS 9.3 relies on Windows file associations only when the Run Python script in process option is not checked
ArcGIS 9.3 introduced a new option for script tools called Run Python script in process. This option is specified on a tool-by-tool basis. This option improves performance dramatically and ESRI recommends that script developers use it whenever possible. Most MGET tools use this option, although a few do not (for reasons that we will not discuss here).
If this option is enabled, ArcGIS 9.3 always invokes Python 2.5, which comes with ArcGIS 9.3. It does not matter if a different version of Python was installed more recently.
If this option is disabled, ArcGIS 9.3 relies on file associations just like 9.2 and earlier versions. This behavior can lead to unexpected results when the most recently installed version of Python is not 2.5: the "in process" tools will run with 2.5 but the others will run with the most recent version of Python.
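The rules described above for ArcGIS 9.3 and earlier can be summarized as a small decision function. This is only a restatement of the text in code; the function name and argument conventions are hypothetical, and versions later than 9.3 are deliberately rejected because this guide does not describe them.

```python
def python_used_by_arcgis(arcgis_version, associated_python, in_process=False):
    """Return the Python X.Y release that will run a script tool.

    arcgis_version    -- tuple such as (9, 2) or (9, 3)
    associated_python -- X.Y string configured in Windows file associations
    in_process        -- state of "Run Python script in process" (9.3 only)
    """
    if arcgis_version > (9, 3):
        raise ValueError("behavior after ArcGIS 9.3 is not covered here")
    if arcgis_version == (9, 3) and in_process:
        # In-process tools always run under the Python 2.5 shipped with 9.3.
        return "2.5"
    # Everything else goes through cmd.exe and the file associations.
    return associated_python
```

With ArcGIS 9.3 and file associations pointing at Python 2.4, an in-process tool runs under 2.5 while an out-of-process tool runs under 2.4, which is the mismatch the paragraph above warns about.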
Which version of Python should I use if I have ArcGIS?
ArcGIS 9.1 users:
ArcGIS 9.1 shipped with Python 2.1. The MGET Python package, GeoEco, requires Python 2.4 or later. If you do not have a preference, we recommend the latest version of Python. As of this writing, it was 2.5.2. We have used ArcGIS 9.1 with Python 2.5 for several years without problems.
Important: Do not uninstall your existing version of Python 2.1! Even though ArcGIS relies on Windows file associations to select the version of Python for script execution, it explicitly checks for the presence of Python 2.1. It also checks for the "Python 2.1 combined Win32 extensions" (the win32all package). If you accidentally uninstall these, you can download them from here: Python 2.1.3, win32all. After reinstalling them, you must restore your file associations to your later version of Python by reinstalling it or manually editing the associations.
ArcGIS 9.2 users:
According to ESRI Technical Article 31912, ArcGIS 9.2 is "hard wired" to work with Python 2.4.1, and using any other version is not supported. Based on the strong wording of ESRI's article, we recommend you stick with Python 2.4.1 unless you are an expert programmer.
In our experience, it is safe to install a more recent bug fix release of Python 2.4. We have used Python 2.4.4 with ArcGIS 9.2 for years without problems. You are probably safe to upgrade to 2.4.4 even if you are not an expert programmer. The Python rules for bug fix releases are fairly strict and it is unlikely that a newer bug fix release would break ArcGIS geoprocessing.
We have also found that 9.2 will work with Python 2.5 but only if the Python script instantiates
the geoprocessor object using the old technique, invoking win32com.client.Dispatch.
The new technique, invoking arcgisscripting.create(), will not work. This
is because the arcgisscripting module is implemented as a Python "extension DLL"
and therefore can only work with a specific version of Python. If you examine the
DLLs required by C:\Program Files\ArcGIS\Bin\arcgisscripting.dll, you will see
python24.dll. If you try to import arcgisscripting from Python 2.5, it will raise
ImportError: No module named arcgisscripting.
GeoEco is written to try arcgisscripting.create() first. If that
fails, it tries win32com.client.Dispatch. Under this logic, GeoEco
can successfully instantiate the geoprocessor running under Python 2.5 or any later
GeoEco-supported version so long as the pywin32 Python package is installed.
We regularly run GeoEco with Python 2.5 and ArcGIS 9.2 without problems.
If you are an expert programmer you can do the same. But beware: ArcGIS 9.2
includes some geoprocessing tools implemented as Python scripts. These scripts
all use the import arcgisscripting approach and will therefore
fail under Python 2.5.
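The try-then-fall-back logic described above can be sketched roughly as follows. This is not GeoEco's actual source code. The ProgID `esriGeoprocessing.GpDispatch.1` is the one commonly used in ArcGIS 9.x scripts and is an assumption here, as are the function name and the `importer` parameter, which exists only so the logic can be exercised on a machine without ArcGIS.

```python
# Assumed ProgID of the ArcGIS 9.x geoprocessor COM object.
GP_PROGID = "esriGeoprocessing.GpDispatch.1"


def create_geoprocessor(importer=__import__):
    """Instantiate the geoprocessor, preferring arcgisscripting."""
    try:
        arcgisscripting = importer("arcgisscripting")
    except ImportError:
        # Expected when the extension DLL was built for a different Python.
        arcgisscripting = None
    if arcgisscripting is not None:
        return arcgisscripting.create()
    # Fall back to COM Automation, which requires the pywin32 package.
    # __import__ returns the top-level package with 'client' attached.
    win32com = importer("win32com.client")
    return win32com.client.Dispatch(GP_PROGID)
```

If neither arcgisscripting nor pywin32 is available, the second import raises ImportError, which is why pywin32 must be installed for the fallback to succeed.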
ArcGIS 9.3 users:
ArcGIS 9.3 installs Python 2.5.1. Just as ArcGIS 9.2 is "hard wired" to work with Python 2.4.1, ArcGIS 9.3 is hard wired to 2.5.1. But 9.3 is even more tightly integrated with Python 2.5 than 9.2 was with Python 2.4 (see the discussion above about how 9.3 uses file associations to invoke the Python interpreter only for scripts that are not "in process" tools). Because of this, we strongly recommend you stick with Python 2.5 if you have ArcGIS 9.3.
In our limited experience, we have found that Python 2.5.2 seems to work fine with ArcGIS 9.3. You are probably safe to upgrade to it even if you are not an expert programmer.
What if I don't have ArcGIS?
We recommend the latest version of Python that is supported by GeoEco. At the time of this writing, it was Python 2.5.2.
Python installation procedure
import sys
print 'Python ' + sys.version
print 'Press Enter to close this window...'
sys.stdin.readline()
As mentioned above, MGET is also known as the GeoEco Python package. The GeoEco setup program requires that the pywin32 Python package be installed for your version of Python. Pywin32 is also known as Python Extensions for Windows.
Even though ArcGIS 9.2 and later do not require pywin32, GeoEco still requires it for functionality not related to ArcGIS. You still need pywin32 to use GeoEco even if you have ArcGIS 9.2 or later.
Important: We recommend pywin32 build 212. Later builds (e.g. 213, 214) have a compatibility problem with ArcGIS (see Ticket #386).
Python packages
Python packages are libraries of functions written in Python. When you install a package, you must choose the version of Python that you want it applied to. The setup program then installs the package into the directory for the Python release you chose.
Many Python packages will only work for specific releases of Python. Pywin32 is one of these. For these packages, it is important that you download the package version that applies to your version of Python. The package download web page will usually list multiple versions of the package, with file names that only differ by Python version. If you want to install it into your Python 2.4 installation, you must download pywin32-xxx.win32-py2.4.exe. To install it into your Python 2.5 installation, you must download pywin32-xxx.win32-py2.5.exe.
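Before running an installer, you can check its file name against the Python version you intend to use. The sketch below assumes the `NAME-BUILD.win32-pyX.Y.exe` naming pattern shown above; the helper names are hypothetical.

```python
import re
import sys

# NAME-BUILD.win32-pyX.Y.exe, e.g. pywin32-212.win32-py2.5.exe
_INSTALLER = re.compile(
    r"^(?P<name>.+)-(?P<build>[^-]+)\.win32-py(?P<py>\d+\.\d+)\.exe$"
)


def installer_python_version(filename):
    """Return the X.Y Python version an installer targets, or None."""
    match = _INSTALLER.match(filename)
    if match is None:
        return None
    return match.group("py")


def matches_running_python(filename, version_info=None):
    """True if the installer targets the given (or running) interpreter."""
    if version_info is None:
        version_info = sys.version_info
    running = "%d.%d" % (version_info[0], version_info[1])
    return installer_python_version(filename) == running
```

For instance, `installer_python_version("pywin32-212.win32-py2.5.exe")` returns `"2.5"`, so that file is the wrong download for a Python 2.4 installation.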
A common mistake is to install a package for one version of Python but not realize that your Windows file associations are configured to run another version. For example, you might have both Python 2.4 and 2.5 installed and then install pywin32 for Python 2.4, not remembering that Python 2.5 was the last version you installed and therefore the version configured in the Windows file association database. When you then run a Python script that requires pywin32, it will execute under Python 2.5 but fail to find pywin32 because you installed pywin32 for 2.4, not 2.5. It will seem as though your installation of pywin32 had no effect.
Most Python packages appear in the Add/Remove Programs window
When you install a Python package, the installer usually adds it to the Add/Remove Programs window in the Control Panel (this window is called Programs and Features in Windows Vista). In this window, the package will typically be named "Python X.Y NAME-VERSION" where X and Y are the Python version the package was installed for, NAME is the name of the package, and VERSION is the package's version. Packages often use the same versioning scheme as Python, but some use different schemes.
The pywin32 package
This package exposes the functionality of the Windows operating system to Python scripts. GeoEco uses it for certain installation tasks and to expose its Python classes using Microsoft COM Automation, so that virtually any scripting language can instantiate GeoEco classes and call their functions and properties.
Future releases of GeoEco for other operating systems will not require pywin32.
pywin32 installation procedure
If you installed a previous version of GeoEco, uninstall it now:
Download and install the GeoEco Python package from http://code.env.duke.edu/projects/mget/. Be sure to install the package that is specific to your version of Python.
Many MGET tools will be functional after you install the GeoEco Python package. But some tools rely on other software applications, libraries, and Python packages. To ease the burden of getting started, GeoEco requires almost none of these to be on your machine when you install it. All GeoEco tools check their software dependencies when you invoke them, and report error messages and brief installation instructions if required software is not installed.
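To give a feel for what such a run-time dependency check might look like, here is a minimal sketch. It is not GeoEco's implementation; the function name, the error type, and the message wording are all invented for illustration.

```python
import sys


def check_dependency(module_name, install_hint):
    """Raise a descriptive error if a required module cannot be imported."""
    try:
        __import__(module_name)
    except ImportError:
        raise RuntimeError(
            "This tool requires the Python module %r, which is not installed "
            "for Python %d.%d. %s"
            % (module_name, sys.version_info[0], sys.version_info[1],
               install_hint)
        )
```

A tool would call something like `check_dependency("numpy", "Please install numpy and try again.")` as its first step, so that a missing package produces a readable message instead of a bare ImportError deep inside the tool.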
If you prefer to get started right away, you can rely on these error messages to prompt you for additional software installations when they are needed. If you prefer not to see these error messages, you can preemptively install all of the additional software now.
If you are a Python programmer, this section may be important to you. If not, you can probably skip this section.
MGET utilizes a number of Python modules developed by others. (For a complete list,
click here.)
To simplify the installation procedure, the GeoEco setup program installs private copies
of several of these in the AggregatedModules subdirectory of the GeoEco
installation directory (typically C:\PythonXY\Lib\site-packages\GeoEco,
where XY are the Python version numbers, such as 24 or 25). When you run a GeoEco function
that requires one of these, the function first tries to import the module from the normal
Python locations (e.g. C:\PythonXY\Lib\site-packages). If the module is already
available in a standard location, GeoEco will use it. If not, GeoEco then appends the
AggregatedModules directory to Python's sys.path variable and
performs the import again. This loads GeoEco's private copy.
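The import-then-fall-back behavior described above can be sketched in a few lines. This is an illustration of the technique rather than GeoEco's actual code; the function name and the `fallback_dir` parameter are hypothetical.

```python
import sys


def import_with_fallback(module_name, fallback_dir):
    """Import a module from the normal locations, else from fallback_dir."""
    try:
        # First attempt: the standard search path (e.g. site-packages).
        return __import__(module_name)
    except ImportError:
        # Appending (rather than prepending) keeps standard locations first,
        # so a copy the user installs later still takes precedence.
        if fallback_dir not in sys.path:
            sys.path.append(fallback_dir)
        return __import__(module_name)
```

Because the private directory is added at the end of `sys.path`, a copy of the module that you install yourself in a standard location will always be found first, which is what lets you control the version that is loaded.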
This mechanism allows you to avoid installing these modules yourself, simplifying the steps needed to get GeoEco working, but allows you to control which version GeoEco loads by installing your own copies of these modules. This can be beneficial, but it can also be dangerous. If you install a version of a module that is not compatible with GeoEco, you may receive strange errors when you run GeoEco functions. We recommend you install a version that is no older than the Minimum Recommended Version listed in the table below. Later versions will probably work (and earlier versions may too, but are less likely).
| Module Included In GeoEco | Version Included In GeoEco | Minimum Recommended Version |
|---|---|---|
| dap | 2.2.6.7 | 2.2.6.7 |
| httplib2 | 0.4.0 | 0.4.0 |
| lxml | 2.1.4 | 2.1.4 |
| numpy | 1.2.1 | 1.0.3 |
The version of numpy included with MGET is compiled with SSE2 support. If your computer's processor does not support SSE2 (click the SSE2 link to find out) then you should install your own copy of numpy. Most Intel processors released since 2001 and AMD processors released since 2003 support SSE2.
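If you install your own copies of these modules, you can compare their versions against the Minimum Recommended Version column of the table above. The sketch below hard-codes that column and assumes purely numeric, dot-separated version strings; the names are hypothetical.

```python
# Minimum Recommended Version column from the table in this section.
MINIMUM_RECOMMENDED = {
    "dap": "2.2.6.7",
    "httplib2": "0.4.0",
    "lxml": "2.1.4",
    "numpy": "1.0.3",
}


def version_tuple(version):
    """Convert '1.2.1' to (1, 2, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(module_name, installed_version):
    """True if installed_version is at least the recommended minimum."""
    minimum = MINIMUM_RECOMMENDED[module_name]
    return version_tuple(installed_version) >= version_tuple(minimum)
```

For example, `meets_minimum("numpy", "1.2.1")` is true, while `meets_minimum("numpy", "1.0.2")` is false. Comparing tuples rather than raw strings avoids mistakes such as treating "1.10" as older than "1.9".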
To uninstall it:
The best way to learn MGET is to try it out. Here are some examples.
Please email Jason Roberts (jason.roberts@duke.edu) with any questions or feedback.
If you would like to observe the details of the processing performed by MGET tools, you can enable verbose logging.
The Reference Documentation is generated in several formats that are tailored to specific user communities:
Marine Geospatial Ecology Tools is built atop a lot of other software, much of it free. We would particularly like to thank these developers for making their excellent work freely reusable. Without your work, MGET would never have gotten off the ground. Cheers to all of you!
Development of Marine Geospatial Ecology Tools is funded by:



Except where otherwise noted, this document and the Marine Geospatial Ecology Tools software are Copyright © 2008 by Jason J. Roberts.
The terms "MGET" and "GeoEco" are synonymous with, and occasionally used instead of, "Marine Geospatial Ecology Tools".
MGET is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. MGET is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License (available in the file LICENSE.txt) for more details.
MGET makes use of several other programs graciously provided for free by other developers. MGET "aggregates" these under the terms of the GNU GPL. These programs require that their original license be reproduced when they are redistributed. Please see the file LICENSE.txt for the copyright notices and licensing details for these programs. Many thanks to these developers for their contributions.