[Developers] CarpetIOHDF5 indexes

Ian Hinder ian.hinder at aei.mpg.de
Sat May 15 16:58:24 CDT 2010

(Sent to Cactus list because Carpet list is down)

I have recently been working on improving the performance of the
visitCarpetHDF5 plugin.  I noticed that when opening a large file it
spends a lot of time reading dataset attributes.  Since these are
stored in amongst the raw data in the HDF5 files, this can be very
inefficient: block reads transfer more data than is necessary.
The attached patch to CarpetIOHDF5 adds support for writing an "index"
HDF5 file at the same time as the data file, conditional on a
parameter "CarpetIOHDF5::output_index".  The index file is the same as
the data file except that it contains null datasets in place of the
real data, and hence is very small.  The attributes can be read from
this index file instead of the data file, greatly increasing
performance.  The datasets have size 1 in the index file, so an
additional attribute (h5space) is added to each dataset to record the
dimensions of the corresponding dataset in the data file.  For a data
file phi.file_0.h5, the index file will be named phi.file_0.idx.h5,
and you will get one index file for each data file.
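
As a rough illustration of how a reader could make use of this scheme,
here is a small Python sketch.  It is not the code from the patch or
from visitCarpetHDF5.  The function names are hypothetical, and
treating h5space as a plain sequence of integers is an assumption made
for the example.  The file name mapping follows the phi.file_0.h5 to
phi.file_0.idx.h5 convention described above.

```python
import os
from typing import Mapping, Sequence, Tuple


def index_file_name(data_file: str) -> str:
    """Map a data file name to its index file name.

    Example: 'phi.file_0.h5' becomes 'phi.file_0.idx.h5'.
    """
    stem, ext = os.path.splitext(data_file)
    if ext != ".h5":
        raise ValueError(f"expected a .h5 file, got {data_file!r}")
    return stem + ".idx" + ext


def metadata_source(data_file: str) -> str:
    """Choose the file to read attributes from.

    Use the index file if it exists on disk, otherwise fall back to
    the data file itself.
    """
    idx = index_file_name(data_file)
    return idx if os.path.exists(idx) else data_file


def true_shape(attrs: Mapping[str, Sequence[int]],
               stored_shape: Sequence[int]) -> Tuple[int, ...]:
    """Return the dimensions of the real dataset.

    In an index file the stored dataset has size 1, so the real
    dimensions are taken from the 'h5space' attribute (assumed here to
    be a sequence of integers).  Without that attribute the stored
    shape is taken to be the real one.
    """
    if "h5space" in attrs:
        return tuple(int(n) for n in attrs["h5space"])
    return tuple(int(n) for n in stored_shape)
```

A reader would call metadata_source() once per data file, read all
attributes from the file it returns, and only open the data file when
the actual grid data is needed.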

I have also written support for these index files in
visitCarpetHDF5.  For a test case, it reduced the time taken to
initially open the file from 160 seconds to 20 seconds.

Ian Hinder
ian.hinder at aei.mpg.de

