Python H5py - Why Do I Get A Broadcast Error?
Solution 1:
As @hpaulj noted, h5py returns the data in an HDF5 dataset as a NumPy array. This can be confusing at first: h5py uses Python's dictionary syntax to reference HDF5 objects (groups and datasets), but they are not dictionaries! Your attempt is on the right track. You need to modify it to work with the data from hdf['metaData'] as a NumPy array instead of a list.
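To make the distinction concrete, here is a minimal sketch. It assumes a hypothetical file demo.h5 written to a temporary directory, holding a small two-column byte-string dataset; it is a stand-in, not the asker's data.h5.

```python
import os
import tempfile

import h5py as h5
import numpy as np

# Hypothetical stand-in file (not the asker's data.h5), created in a temp directory.
path = os.path.join(tempfile.mkdtemp(), 'demo.h5')
with h5.File(path, 'w') as hdf:
    hdf.create_dataset('metaData', data=np.array([[b'ADC_ALT_TC', b'ft'],
                                                  [b'ADC_CAS_TC', b'kts']]))

with h5.File(path, 'r') as hdf:
    dset = hdf['metaData']       # dictionary-style lookup gives an h5py Dataset
    meta = hdf['metaData'][:]    # slicing reads the values into a NumPy array
    is_dataset = isinstance(dset, h5.Dataset)

print(is_dataset, type(meta).__name__, meta.shape, meta.dtype)
print(meta[:, 0])    # first column, selected with NumPy indexing
print(meta[:, -1])   # last column
```

Once the values are in a NumPy array, column selection such as meta[:, 0] works the usual NumPy way, which is what the answer relies on.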
The example below should work. Since I don't have the starting file (data.h5), I created a file that replicates the values in the image. The code to create that file is at the end.
Note 1: This example would be simpler if it didn't use '°' for degree. That adds an extra step when working with string data. That's why I used np.char.decode() to print dset['Variable name'] and dset['Unit']. More about that below.
Note 2: Be careful with hard-coded string sizes. ('Variable name', 'S10') is from the answer to an earlier question (that had shorter strings). The code below gets the size from the hdf['metaData'] dataset dtype, then uses it when defining the dtype for the new dataset in telemetry.h5.
import h5py as h5
import numpy as np

with h5.File('data.h5', 'r') as hdf:
    # read the dataset as a NumPy array metaData; it is not a dictionary
    metaData = hdf['metaData'][:]
    print('dtype is:', metaData.dtype)

# write new h5 file
with h5.File('telemetry.h5', 'w') as var:
    # reuse the string dtype of metaData instead of a hard-coded size
    dt = np.dtype([('n°', int), ('Variable name', metaData.dtype), ('Unit', metaData.dtype)])
    dset = var.create_dataset('data', dtype=dt, shape=(len(metaData),))
    dset['n°'] = np.arange(metaData.shape[0])
    dset['Variable name'] = metaData[:, 0]  # use first column of metaData
    dset['Unit'] = metaData[:, -1]          # use last column of metaData

    # decode the byte strings back to Unicode for printing
    for row in dset:
        print(row[0], np.char.decode(row[1]), np.char.decode(row[2]))
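Note 2 above warns about hard-coded string sizes. The following NumPy-only sketch shows what can go wrong, using two variable names from the sample data; the array names (names, fixed, sized) are made up for illustration. A field declared as 'S10' silently truncates longer values, while a field that reuses the source dtype keeps them whole.

```python
import numpy as np

# Two variable names from the sample data, encoded to byte strings.
names = np.char.encode(np.array(['ADC_ALT_TC', 'ADC_SpeedWarning_TC']))
print(names.dtype)   # itemsize matches the longest string (19 bytes)

# Hard-coded size: anything longer than 10 bytes is cut off on assignment.
fixed = np.zeros(len(names), dtype=[('Variable name', 'S10')])
fixed['Variable name'] = names

# Size taken from the source array: nothing is lost.
sized = np.zeros(len(names), dtype=[('Variable name', names.dtype)])
sized['Variable name'] = names

print(fixed['Variable name'][1])   # b'ADC_SpeedW'
print(sized['Variable name'][1])   # b'ADC_SpeedWarning_TC'
```

NumPy does not raise an error on the truncated assignment, so the problem is easy to miss until the data is read back.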
Here is the code I wrote to mimic data.h5. When you print the raw data (as byte strings), you see b'\xc2\xb0' where you expect '°' for degree units. That character adds some complications with h5py. It is fine in Python and NumPy. However, h5py (and HDF5) don't support NumPy's Unicode dtype; you need to use NumPy byte-string arrays ('S' dtype). That's why I used np.char.encode() to encode array arr to arr2 before saving with h5py. It is also why np.char.decode() is necessary when printing dset['Variable name'] and dset['Unit'] above: you have to decode the encoded byte strings back to Unicode.
import h5py as h5
import numpy as np

arr = np.array(
    [['ADC_ALT_TC', 'ADC_ALT_TC', 'ADC.ALT:TC [ft]', 'ft'],
     ['ADC_AOA_TC', 'ADC_AOA_TC', 'ADC.AOA:TC [°]', '°'],
     ['ADC_AOS_TC', 'ADC_AOS_TC', 'ADC.AOS:TC [°]', '°'],
     ['ADC_CAS_TC', 'ADC_CAS_TC', 'ADC.CAS:TC [kts]', 'kts'],
     ['ADC_OAT_TC', 'ADC_OAT_TC', 'ADC.OAT:TC [°]', '°'],
     ['ADC_SpeedWarning_TC', 'ADC_SpeedWarning_TC', 'ADC_SpeedWarning:TC', ''],
     ['ADC_Stall_TC', 'ADC_Stall_TC', 'ADC_Stall:TC', ''],
     ['ADC_TAS_TC', 'ADC_TAS_TC', 'ADC.TAS:TC [kts]', 'kts'],
     ['AHRS1_accX_TC', 'AHRS1_accX_TC', 'AHRS1.accX:TC [m/s^2]', 'm/s^2'],
     ['AHRS1_accY_TC', 'AHRS1_accY_TC', 'AHRS1.accX:TC [m/s^2]', 'm/s^2'],
     ['AHRS1_accZ_TC', 'AHRS1_accZ_TC', 'AHRS1.accX:TC [m/s^2]', 'm/s^2']])
print(arr)

# encode the Unicode strings to byte strings ('S' dtype) before saving
arr2 = np.char.encode(arr)
print(arr2)

with h5.File('data.h5', 'w') as hdf:
    hdf.create_dataset('metaData', data=arr2)
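The encode/decode round trip described above can be checked with NumPy alone. This is a small sketch with a made-up three-element units array; it relies on np.char.encode() and np.char.decode() using UTF-8 by default.

```python
import numpy as np

units = np.array(['ft', '°', 'kts'])   # NumPy Unicode ('U') dtype
encoded = np.char.encode(units)         # byte strings ('S' dtype), UTF-8 by default
decoded = np.char.decode(encoded)       # back to Unicode

print(units.dtype.kind, encoded.dtype.kind, decoded.dtype.kind)
print(encoded[1])   # b'\xc2\xb0', the two UTF-8 bytes for the degree sign
print(decoded[1])
```

The degree sign takes two bytes in UTF-8, which is why the raw dataset shows b'\xc2\xb0' until it is decoded.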