Detect empty HDF5 data
In some cases, an HDF5 master file is moved but the underlying data files are not moved with it. Consequently, reading data through the moved master file silently returns all zeros. (The analogous situation on a filesystem is moving a symbolic link without moving the file it points to.)
It would be good to have a mechanism to detect such cases.
A simple but inelegant way would be:
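from os import path

# Assumption: HDF5File is h5py.File; the snippet relies on h5py's
# Dataset.is_virtual and Dataset.virtual_sources().
from h5py import File as HDF5File


def check_virtual_sources_exist(fname, data_path):
    """Return True if every virtual source behind data_path exists on disk."""
    with HDF5File(fname, "r") as f:
        if data_path not in f:
            print("No dataset %s in file %s" % (data_path, fname))
            return False
        dptr = f[data_path]
        # A regular (non-virtual) dataset stores its data in this file.
        if not dptr.is_virtual:
            return True
        for vsource in dptr.virtual_sources():
            # Source file names are resolved relative to the master file.
            vsource_fname = path.join(path.dirname(dptr.file.filename), vsource.file_name)
            if not path.isfile(vsource_fname):
                print("No such file: %s" % vsource_fname)
                return False
            # A source may itself be virtual, so check it recursively.
            elif not check_virtual_sources_exist(vsource_fname, vsource.dset_name):
                print("Error with virtual source %s" % vsource_fname)
                return False
        return True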
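
The path-resolution step could also be separated from the HDF5 access, so that it reports every missing file instead of stopping at the first one. The following is only a sketch: `find_missing_sources` is a hypothetical helper name, not part of any library, and it assumes the caller has already collected the source file names (for example the `file_name` fields returned by `virtual_sources()`). It also assumes that a source name of "." refers to the master file itself and can be skipped.

```python
import os
from typing import Iterable, List


def find_missing_sources(master_fname: str, source_names: Iterable[str]) -> List[str]:
    """Return the resolved paths of source files that do not exist.

    Hypothetical helper: source_names are file names as stored in the
    master file, resolved here relative to the master file's directory.
    """
    base_dir = os.path.dirname(os.path.abspath(master_fname))
    missing: List[str] = []
    for name in source_names:
        if name == ".":
            # Assumption: "." denotes the master file itself.
            continue
        # os.path.join leaves an absolute source name unchanged.
        resolved = os.path.normpath(os.path.join(base_dir, name))
        if not os.path.isfile(resolved) and resolved not in missing:
            missing.append(resolved)
    return missing
```

A caller could then treat a non-empty result as the signal that the master file was moved without its data files, and list the missing paths in the error message.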