Reading Hdf Files into R and Converting Them to Geotiff Rasters

Reading hdf files into R and converting them to geoTIFF rasters

Ok, so my MODIS hdf files were hdf4 rather than hdf5 format. It was surprisingly difficult to discover this, MODIS don't mention it on their website but there are a few hints in various blogs and stack exchange posts. In the end I had to download HDFView to find out for sure.

R doesn't do hdf4 files and pretty much all the packages (like rgdal) only support hdf5 files. There are a few posts about downloading drivers and compiling rgdal from source but it all seemed rather complicated and the posts were for MAC or Unix and I'm using Windows.

Basically gdal_translate from the gdalUtils package is the saving grace for anyone who wants to use hdf4 files in R. It converts hdf4 files into geoTIFFs without reading them into R. This means that you can't manipulate them at all e.g. by cropping them, so its worth getting the smallest tiles you can (for MODIS data through something like Reverb) to minimise computing time.

Here's and example of the code:

library(gdalUtils)

# Provides detailed data on hdf4 files but takes ages

gdalinfo("MOD17A3H.A2000001.h21v09.006.2015141183401.hdf")

# Tells me what subdatasets are within my hdf4 MODIS files and makes them into a list

sds <- get_subdatasets("MOD17A3H.A2000001.h21v09.006.2015141183401.hdf")
sds

[1] "HDF4_EOS:EOS_GRID:MOD17A3H.A2000001.h21v09.006.2015141183401.hdf:MOD_Grid_MOD17A3H:Npp_500m"
[2] "HDF4_EOS:EOS_GRID:MOD17A3H.A2000001.h21v09.006.2015141183401.hdf:MOD_Grid_MOD17A3H:Npp_QC_500m"

# I'm only interested in the first subdataset and I can use gdal_translate to convert it to a .tif

gdal_translate(sds[1], dst_dataset = "NPP2000.tif")

# Load and plot the new .tif

rast <- raster("NPP2000.tif")
plot(rast)

# If you have lots of files then you can make a loop to do all this for you

files <- dir(pattern = ".hdf")
files

[1] "MOD17A3H.A2000001.h21v09.006.2015141183401.hdf" "MOD17A3H.A2001001.h21v09.006.2015148124025.hdf"
[3] "MOD17A3H.A2002001.h21v09.006.2015153182349.hdf" "MOD17A3H.A2003001.h21v09.006.2015166203852.hdf"
[5] "MOD17A3H.A2004001.h21v09.006.2015099031743.hdf" "MOD17A3H.A2005001.h21v09.006.2015113012334.hdf"
[7] "MOD17A3H.A2006001.h21v09.006.2015125163852.hdf" "MOD17A3H.A2007001.h21v09.006.2015169164508.hdf"
[9] "MOD17A3H.A2008001.h21v09.006.2015186104744.hdf" "MOD17A3H.A2009001.h21v09.006.2015198113503.hdf"
[11] "MOD17A3H.A2010001.h21v09.006.2015216071137.hdf" "MOD17A3H.A2011001.h21v09.006.2015230092603.hdf"
[13] "MOD17A3H.A2012001.h21v09.006.2015254070417.hdf" "MOD17A3H.A2013001.h21v09.006.2015272075433.hdf"
[15] "MOD17A3H.A2014001.h21v09.006.2015295062210.hdf"

filename <- substr(files,11,14)
filename <- paste0("NPP", filename, ".tif")
filename

[1] "NPP2000.tif" "NPP2001.tif" "NPP2002.tif" "NPP2003.tif" "NPP2004.tif" "NPP2005.tif" "NPP2006.tif" "NPP2007.tif" "NPP2008.tif"
[10] "NPP2009.tif" "NPP2010.tif" "NPP2011.tif" "NPP2012.tif" "NPP2013.tif" "NPP2014.tif"

i <- 1

for (i in 1:15){
sds <- get_subdatasets(files[i])
gdal_translate(sds[1], dst_dataset = filename[i])
}

Now you can read your .tif files into R using, for example, raster from the raster package and work as normal. I've checked the resulting files against a few I converted manually using QGIS and they match so I'm confident the code is doing what I think it is. Thanks to Loïc Dutrieux and this for the help!

Converting HDF to georeferenced file (geotiff, shapefile)

You've not set the extent of your raster, so its assumed to be 1:ncols, 1:nrows, and that's not right for a lat-long data set...

The gdalinfo implies its meant to be a full sphere, so if I do:

 extent(rast)=c(xmn=-180, xmx=180, ymn=-90, ymx=90)
plot(rast)
writeRaster(rast, "output.tif")

I see a raster with full global lat-long extent and when I load the raster into QGIS it overlays nicely with an OpenStreetMap.

There doesn't seem to be enough metadata in the file to do a precise projection (what's the Earth radius and eccentricity?) so don't try and do anything small-scale with this data...

Here's how it looks:

Sample Image

Also you've jumped through a few unnecessary hoops to read this. You can read the HDF directly and set its extent and projection:

> r = raster("./cbpm.2017001.hdf")

What do we get:

> r
class : RasterLayer
dimensions : 1080, 2160, 2332800 (nrow, ncol, ncell)
resolution : 1, 1 (x, y)
extent : 0, 2160, 0, 1080 (xmin, xmax, ymin, ymax)
coord. ref. : NA
data source : /home/rowlings/Downloads/HDF/cbpm.2017001.hdf
names : cbpm.2017001

Set extent:

> extent(r)=c(xmn=-180, xmx=180, ymn=-90, ymx=90)

And projection:

> projection(r)="+init=epsg:4326"

And land values to NA:

> r[r==-9999]=NA

Write it, plot it:

> writeRaster(r,"r.tif")
> plot(r)

Reading and working with FireModis HDF files in R

Reading the documentation in the link provided, you're trying to create a 5-dimensional dataset.

And my best guess is, that you're using normal GDAL (as opposed to this GEE?) which seems to be the cause for your error (according to the link):

The RASTERXDIM, ..., RASTER4DIM options allow you to access 5-dimensional dataset and they are available only in GEE. If you use regular GDAL, you cannot access the dataset correctly.

But if you just want to read and process the HDF files, this works just fine:

library(MODIS)

## Note: I'm using the MODIS package to download the HDF file (not necessary if you have it on disk.

# the hdf variable will be the path to the file
hdf <- getHdf(HdfName = 'MCD64A1.A2000306.h12v11.006.2017012010432.hdf',forceDownload=T)

# print the subdatasets

gdalUtils::get_subdatasets(hdf)

# [1] "HDF4_EOS:EOS_GRID:/tmp/Rtmp9QhqAG/MODIS_ARC/MODIS/MCD64A1.006/2000.11.01/MCD64A1.A2000306.h12v11.006.2017012010432.hdf:MOD_Grid_Monthly_500m_DB_BA:Burn Date"
# [2] "HDF4_EOS:EOS_GRID:/tmp/Rtmp9QhqAG/MODIS_ARC/MODIS/MCD64A1.006/2000.11.01/MCD64A1.A2000306.h12v11.006.2017012010432.hdf:MOD_Grid_Monthly_500m_DB_BA:Burn Date Uncertainty"
# [3] "HDF4_EOS:EOS_GRID:/tmp/Rtmp9QhqAG/MODIS_ARC/MODIS/MCD64A1.006/2000.11.01/MCD64A1.A2000306.h12v11.006.2017012010432.hdf:MOD_Grid_Monthly_500m_DB_BA:QA"
# [4] "HDF4_EOS:EOS_GRID:/tmp/Rtmp9QhqAG/MODIS_ARC/MODIS/MCD64A1.006/2000.11.01/MCD64A1.A2000306.h12v11.006.2017012010432.hdf:MOD_Grid_Monthly_500m_DB_BA:First Day"
# [5] "HDF4_EOS:EOS_GRID:/tmp/Rtmp9QhqAG/MODIS_ARC/MODIS/MCD64A1.006/2000.11.01/MCD64A1.A2000306.h12v11.006.2017012010432.hdf:MOD_Grid_Monthly_500m_DB_BA:Last Day"

sds <- gdalUtils::get_subdatasets(hdf)

r <- raster(sds[1])

# check raster output
r

# class : RasterLayer
# dimensions : 2400, 2400, 5760000 (nrow, ncol, ncell)
# resolution : 463.3127, 463.3127 (x, y)
# extent : -6671703, -5559753, -3335852, -2223901 (xmin, xmax, ymin, ymax)
# coord. ref. : +proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs
# data source : HDF4_EOS:EOS_GRID:/tmp/Rtmp9QhqAG/MODIS_ARC/MODIS/MCD64A1.006/2000.11.01/MCD64A1.A2000306.h12v11.006.2017012010432.hdf:MOD_Grid_Monthly_500m_DB_BA:Burn Date
# names : MCD64A1.A2000306.h12v11.006.2017012010432.hdf.MOD_Grid_Monthly_500m_DB_BA.Burn_Date
# values : -32768, 32767 (min, max)

Convert HDF4 to raster; with longitude/latitude grids are available in another HDF file

I used one of the geotiff files that they also make available to find the extent and crs.

library(raster)
raster('asi-AMSR2-s6250-20180922-v5.tif')
#class : RasterLayer
#dimensions : 1328, 1264, 1678592 (nrow, ncol, ncell)
#resolution : 6250, 6250 (x, y)
#extent : -3950000, 3950000, -3950000, 4350000 (xmin, xmax, ymin, ymax)
#coord. ref. : +proj=stere +lat_0=-90 +lat_ts=-70 +lon_0=0 +k=1 +x_0=0 +y_0=0 +a=6378273 +b=6356889.449 +units=m +no_defs
#data source : asi-AMSR2-s6250-20180922-v5.tif
#names : asi.AMSR2.s6250.20180922.v5
#values : 0, 255 (min, max)

Now I know I can do

library(raster)
CurrTemp <- tempfile()
download.file(url = "https://seaice.uni-bremen.de/data/amsre/asi_daygrid_swath/s6250/2003/feb/Antarctic/asi-s6250-20030214-v5.hdf", destfile = CurrTemp, mode = "wb", quiet = T)
r <- raster(CurrTemp)

extent(r) <- c(-3950000, 3950000, -3950000, 4350000)
crs(r) <- "+proj=stere +lat_0=-90 +lat_ts=-70 +lon_0=0 +k=1 +x_0=0 +y_0=0 +a=6378273 +b=6356889.449 +units=m +no_defs "
# writeRaster(r, 'my_asi-s6250-20030214-v5.tif')

The "other hdf" file has longitude / latitude values for the cells, but that is not what you are after as the data do not have a lon/lat coordinate reference system.



Related Topics



Leave a reply



Submit