Converting MODIS HDFs with sinusoidal projections to ArcGIS raster format
Some providers of MODIS data distribute HDFs containing Level 3 grids in sinusoidal projections. These providers include the USGS/NASA Land Processes Distributed Active Archive Center (LP DAAC), which provides a variety of terrestrial data such as land surface temperature (LST) and vegetation indices, and the National Snow and Ice Data Center (NSIDC), which provides snow cover data and related products. These and many other products can be obtained through NASA's Warehouse Inventory Search Tool (WIST).
Converting these data to ArcGIS-compatible formats can be tricky. LP DAAC offers the MODIS Reprojection Tool, which can convert them to GeoTIFFs, and also provides a list of other tools. At the time of this writing, MGET was not included in that list yet it can be use to convert the data. Below, we show how to use MGET's HDF conversion tools to convert land surface temperature (LST) data from LP DAAC's MODIS/Terra Land Surface Temperature & Emissivity Daily L3 Global 1km dataset (MOD11A1) to ArcInfo Binary Grid format.
Understanding the MODIS sinusoidal projection
The crux of this problem is understanding the MODIS sinusoidal projection and representing it properly in ArcGIS. According to various websites, MODIS data providers used an "Integerized Sinusoidal Grid" projection up through Version 3 (V3) of MODIS data. Starting with Version 4, data providers began using a "Sinusoidal Grid" projection. I do not understand the difference between these two projections. I do not know which organization defines the versions and projections. I suspect it is some group at NASA Goddard Space Flight Center (GSFC). My understanding is, at the time of this writing, MODIS data is at Version 5 (V5) and using a Sinusoidal Grid projection. This example only applies to data in that projection. If you have V3 or earlier data, this example presumably does not apply to you. (I did read somewhere that the differences between the two are negligible, but I can't find that web page.)
It is hard to find information on this projection. The best I have found is an NSIDC web page that describes their MODIS/Terra Snow Cover Daily L3 Global 500m Grid, Version 5 product. That page provides the following description:
Gridded resolution is 500 m.
In the sinusoidal projection, areas on the data grids are proportional to the same areas on the Earth, and distances are correct along all parallels and the central meridian. Shapes are increasingly distorted away from the central meridian and near the poles. Finally, the data are neither conformal, perspective, nor equidistant (USGS 2000).
Meridians are represented by sinusoidal curves except for the central meridian, and parallels are represented by straight lines. The central meridian and parallels are straight lines of true scale (Pearson 1990). Specific parameters are listed in Table 2:
Table 2. Sinusoidal Projection Parameters
Earth radius 6371007.181000 meters Projection origin 0° latitude, 0° longitude Orientation 0° longitude, oriented vertically at top Upper left corner point (m) -20015109.354(x), 10007554.677(y) Lower right corner point (m) 20015109.354(x), -10007554.677(y) True scale (m) 463.31271653(x), 463.31271653(y)
The MOD10A1 daily product is gridded in equal area tiles. Each tile consists of a 1200 km by 1200 km data array, which corresponds to 2400 by 2400 cells at 500 m resolution. The following image shows tile locations for MOD10A1 V005 data in a sinusoidal projection.
This projection is based on a sphere rather than an ellipsoid. We can assume this because only one radius is provided, 6371007.181000 m. This value appears to be the authalic radius for WGS 84; if you compute the authalic radius using the a and b parameters for WGS 84, you get this value. So, in essence, this MODIS sinusoidal projection uses a sphere that has the same area as the WGS 84 ellipsoid.
NSIDC describes these data as a 36 by 18 array of 1200 km by 1200 km tiles, where each tile is 2400 by 2400 cells, and the cells are 500 m by 500 m. Each HDF file contains one tile. But the distance 1200 km and 500 km are really just rough approximations. To geolocate a given tile properly, we need to know precise values for the tile's cell size and the coordinates of its lower-left corner. We can do some math on the values of NSIDC's Table 2:
Tile width or height = earth width / 36 = (20015109.354 + 20015109.354) / 36 = 1111950.5196666666 m
Cell size = tile width / cells = 1111950.5196666666 / 2400 = 463.31271652777775 m
X coordinate of tile lower-left corner = -20015109.354 + horizontal tile number * tile width
Y coordinate of tile lower-left corner = -10007554.677 + (17 - vertical tile number) * tile height
The cell size corresponds to the True Scale listed in NSIDC's Table 2.
Converting the MODIS/Terra land surface temperature data with MGET
In this example, we want to convert land surface temperature (LST) data from LP DAAC's MODIS/Terra Land Surface Temperature & Emissivity Daily L3 Global 1km dataset (MOD11A1) to ArcInfo Binary Grid format. At the time of this writing, these data were available from NASA's WIST and also LP DAAC's Data Pool. I use five files in this example:
- British Isles: MOD11A1.A2006034.h17v03.005.2008074151903.hdf
- West Africa: MOD11A1.A2008006.h16v07.005.2008007232041.hdf
- New England: MOD11A1.A2008004.h12v04.005.2008006021326.hdf
- Patagonia: MOD11A1.A2008018.h13v13.005.2008019174742.hdf
- Australia: MOD11A1.A2008004.h28v12.005.2008006022646.hdf
This example resembles the main MGET HDF conversion example. If you have not converted HDFs with MGET before, you may want to review that example before proceeding. Below, I will assume you are familiar with it and not provide as much detail.
Opening the tool and populating the SDS name
First, we open MGET's Convert SDS in HDF to ArcGIS Raster tool, choose the HDF to convert and, specify an output raster. In this example, we will use the British Isles HDF and create a raster called nw_europe. Next, we specify the name of the Scientific Data Set (SDS) contained in the HDF that represents the grid that we want to extract. In this case, we will choose LST_Night_1km, which is the nighttime land surface temperature.
The SDS name is available in the HDF file header (see the main HDF example for how to extract and interpret this) but it is also available on LP DAAC's web page for this dataset, under the Layers tab, along with other useful information:
Calculating the X and Y coordinates of the lower left corner
We can calculate the X and Y coordinates of the lower left corner of the tile using the formulas I listed above. To do this, weneed to know the horizontal and vertical number of the tile. This information is encoded in the file name: MOD11A1.A2006034.h17v03.005.2008074151903.hdf. This tile is horizontal number 17, vertical number 3. So:
X coordinate of lower-left corner = -20015109.354 + horizontal tile number * tile width = -20015109.354 + 17 * 1111950.5196666666 = -1111950.5196666643 m
Y coordinate of lower-left corner = -10007554.677 + (17 - vertical tile number) * tile height = -10007554.677 + (17 - 3) * 1111950.5196666666 = 5559752.5983333346 m
If you search through the HDF file header carefully, you can discover the horizontal and vertical tile numbers in it as well.
Calculating the cell size
LP DAAC's web page for this dataset (and the HDF file header) states that the dimensions are 1200 columns by 1200 rows. From this, we calculate the cell size:
Cell size = tile width / cells = 1111950.5196666666 / 1200 = 926.6254330555555 m
Note that, as of this writing, this page lists the size as 0.928 km, the LP DAAC page lists it as 0.93 km, and the product documentation available from LP DAAC ists it as 0.927 km. For best results, it is essential that you calculate it yourself and use as accurate a value as possible! A difference of a meter or two per cell may not seem like much, but for a 1200x1200 grid, that adds up to one or two cells of cumulative error. If you use one of these rounded-off values, your grid may appear to have a geolocation error.
The tool so far
Now we can populate all of the mandatory fields of the tool, along with the NODATA Value, which we looked up in the Layers tab above.
Defining the coordinate system
Now, under Post-conversion processing, we need to fill in Define coordinate system with a custom sinusoidal projection based on the MODIS 6371007.181 m sphere:
After filling this in as shown, click Finish, Finish, OK to get back to the Convert SDS in HDF to ArcGIS Raster tool.
Setting the map algebra expression
Finally, we need to set Execute map algebra expression to an expression that will scale the integers from the HDF file to the actual floating point values. In the Layers tab above, we saw that the scale factor was 0.2 (this is also present in the HDF header), so we can use this expression:
We also check the Build pyramids box so the tool will save us from doing this later.
The output in ArcMap looks like this. Note that NoData values are set to gray.
Getting other data into this projection
Now that we have converted a land surface temperature image to ArcInfo Binary Grid format, we want to make a map that includes this image and other data, such as the GSHHS shoreline. Because the MODIS projection does not rely on the WGS 84 ellipsoid, or some other well-known ellipsoid, it can be difficult to get other data into this projection. There are some hints about how to do it in an ESRI forum thread. From that thread and some experimentation, I know of two methods.
In this example, we will make a map using the GSHHS full resolution shoreline, which is in the WGS 1984 geographic coordinate system. This is the standard coordinate system that many global products use. From the ArcGIS Spatial Reference Properties dialog, it is accessed by clicking Select, Geographic Coordinate Systems, World, WGS 1984. We obtained the GSHHS shoreline by downloading the GSHHS 2.0 shapefiles available on the GSHHS website.
Method 1: Project to the MODIS projection using custom geographic transformation
This method uses an ArcGIS geoprocessing tool called Create Custom Geographic Transformation, which exists in ArcGIS 9.2 and later. I have only tested with 9.3.1. If you have 9.1, you can probably use the second method below.
First, open the Create Custom Geographic Transformation and fill in the four inputs as follows:
- For Geographic Transformation Name, type a descriptive name such as "WGS 1984 Geographic to MODIS V5 Sinusoidal".
- For Input Geographic Coordinate System, click the button next to the text box, then click Select, Geographic Coordinate Systems, World, WGS 1984 and close the Spatial Reference Properties dialog by clicking OK.
- For Output Geographic Coordinate System, click the button next to the text box, then click Import and import the coordinate system from the raster we created above.
- For Custom Geographic Transformation, select LONGITUDE_ROTATION. It may be the last option available, and you may have to scroll down to see it.
The filled-in tool should look like this:
Now we want to project the WGS 84 geographic data to the MODIS sinusoidal projection. In this case we use the gshhs_f.shp that we created previously. Open the Project tool and populate it as follows:
- For Input Dataset or Feature Class, select the WGS 84 geographic data.
- For Output Dataset or Feature Class, specify a suitable output.
- For Output Coordinate System, click the button next to the text box, then click Import and import the coordinate system from the raster we created above. This will cause a green dot to appear next to the Geographic Transformation parameter, indicating that it is no longer optional.
- For Geographic Transformation, select the custom transformation that we just created.
The filled-in tool should look something like this:
Now we can load the projected shoreline into ArcMap and see that it is aligned fairly closely with the MODIS land surface temperature image. Because the shoreline layer is higher than the MODIS layer, we had to set the Fill Color of the polygons to No Color so the polygons are hollow, allowing the MODIS layer to show through.
I do not fully understand why this method works. I have never seen ESRI documentation on the various transformation methods. Some comments in the ESRI forum thread indicate that the goal of this exercise was to create a "null transformation" and rely on ArcGIS's capability of automatically translating coordinates between ellipsoids and spheres. Basically, we wanted ArcGIS to perform that translation but not do anything else, like a datum shift. This makes sense because both the shoreline and the MODIS data presumably use the same WGS 1984 datum. Maybe if they used a different datum, this would not work. But I am not an expert in datums and projections, so I am hesitant to conclude anything here.
I do know that the forum thread's suggestion of using GEOCENTRIC_TRANSLATION and leaving the parameters set to zero, rather than using LONGITUDE_ROTATION (which has no parameters), does not seem to work. When I tried this, the shoreline was not aligned with the MODIS data, as show below. The black is the shoreline produced with LONGITUDE_ROTATION, the red is it produced with GEOCENTRIC_TRANSLATION.
If anyone provides a more detailed explanation on the forum thread (or just emails me, email@example.com), I will add it here.
Method 2: Just load into ArcMap and ignore the Geographic Coordinate Systems Warning
A second way of achieving the "null translation" discussed above is to start ArcMap, first load the MODIS image, and then load the WGS 84 shoreline, ignoring the Geographic Coordinate Systems Warning that pops up:
Just click Close and the shoreline layer will come out properly aligned. This will presumably work on ArcGIS 9.1 and earlier, but I haven't tested it.
A downside of this approach is that the translation from the WGS 84 ellipsoid to the MODIS sphere happens on the fly, as you pan and zoom the map. This makes ArcMap pretty slow. For faster performance, use Method 1 above. As suggested in the ESRI forum thread, you may also be able to right-click on the shoreline layer and export it using the data frame's coordinate system. Because you loaded the MODIS image first, the data frame uses its coordinate system.
Checking the geolocation of the data
To see how well this land surface temperature data is aligned with the GSHHS full resolution shoreline, I converted the QC_Day variable from the HDF from each of the five images. According to the LP DAAC page, the value 3 (binary 11) indicates that the "LST [was] not produced primarily due to reasons other than cloud". I coded the image so that the value 3 was blue and all other values were green. This effectively shows the land mask used for this dataset.
As you can see, the land mask lines roughly with the GSHHS shoreline, but not exactly. In these images, I zoomed in on recognizable features to see how well they came out. The level of zoom is different in each image.
In some cases, it appears that land extends beyond the shoreline (e.g. in the Africa and Patagonia image). In others, it appears that the image is shifted relative to the shoreline by a cell or two (e.g. New England). I do not know the source of this error. It could be the LST data is not georectified properly. It could be the GSHHS data is not georectified properly (although I doubt that, after working with the GSHHS data for a few years). It could be that the projection procedure used by ESRI is different than that used by the MODIS data provider.
In any case, if you need results accurate to the 1 km resolution of this data, I encourage you to check the georectification of your images using a high resolution dataset that you trust and consult the original data provider about any apparent problems. If you find anything out about this specific dataset, please contact me so I can update this description.
Batch converting HDFs
Once you have determined the parameters needed to convert one tile, you can easily convert a batch of tiles using these values. MGET includes two batch-conversion tools.
If you just want to convert a bunch of HDFs stored in a directory, use Find HDFs and Convert SDS to ArcGIS Rasters. This tool allows you to search a directory tree and select files using a wildcard. The hardest part of using this tool is specifying the Output raster Python expression parameter. For this parameter, you must provide a Python expression that can calculate the name of an output raster given the input HDF name. The default expression:
os.path.join(outputWorkspace, os.path.dirname(inputFile[len(directoryToSearch)+1:]), os.path.basename(inputFile).split(u'.')[:13])
will not work. This expression extracts the characters of the HDF name up to the first dot, with a maximum of 13 characters so the name will be a valid ArcInfo Binary Grid name. If you used this expression with the MODIS data from this example, the default expression would produce the name MOD11A1 for every HDF. The tool would report an error saying that multiple input files produced the same raster name. The second component of the HDF name, which contains the date of the image, may be more unique, resulting in names like A2006034. To get this component, you can modify the default expression to this:
os.path.join(outputWorkspace, os.path.dirname(inputFile[len(directoryToSearch)+1:]), os.path.basename(inputFile).split(u'.')[:13])
This second expression differs from the first by just one character: the split(u'.') was changed to split(u'.').
For more control over output raster names, you could use the second MGET batch tool, Convert SDS in HDFs Listed in Table To ArcGIS Rasters. In this tool, you must provide a table with two fields: one for the input HDF and one for the output raster. The tool goes through the table and performs one conversion for each row. You can populate the table however you like (e.g. construct a .DBF table with Microsoft Excel, construct a .CSV file with a text editor, etc.).
Finally, if you have ArcGIS 9.2 or later, you can use ArcGIS's own batch processing capability. For more information about this, please see the ArcGIS documentation.