python - Read binary file which has different datatypes
I am attempting to read a binary file produced in Fortran into Python. It contains integers, reals, and logicals. At the moment I read the first few numbers correctly with:
x = np.fromfile(filein, dtype=np.int32, count=-1)
firstint = x[1]
...
(np is numpy). The next item is a logical, later on there are ints again, and after that reals. How can I do it?
Typically, when you're reading in values such as this, they're in a regular pattern (e.g. an array of C-like structs).
Another common case is a short header of various values followed by a bunch of homogeneously typed data.
Let's deal with the first case first.
Reading in regular patterns of data types
For example, you might have something like:
float, float, int, int, bool, float, float, int, int, bool, ...
If that's the case, you can define a dtype to match the pattern of types. In the case above, it might look like:
dtype=[('a', float), ('b', float), ('c', int), ('d', int), ('e', bool)]
(Note: there are many different ways to define a dtype. For example, you could also write that as np.dtype('f8,f8,i8,i8,?'). See the documentation for numpy.dtype for more information.)
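To make the two spellings concrete, here is a small sketch comparing them. The explicit 'f8' and 'i8' codes in the named version are my choice for the comparison; the plain float and int used above map to sizes that can depend on the platform and NumPy version.

```python
import numpy as np

# Named fields with explicit sizes (8-byte floats, 8-byte ints, 1-byte bool).
named = np.dtype([('a', 'f8'), ('b', 'f8'), ('c', 'i8'), ('d', 'i8'), ('e', '?')])

# Compact string form; NumPy assigns the field names automatically.
compact = np.dtype('f8,f8,i8,i8,?')

print(named.names)                        # the names chosen by hand
print(compact.names)                      # automatic names: f0, f1, ...
print(named.itemsize, compact.itemsize)   # bytes occupied by one record
```

Both describe the same byte layout. Only the field names differ, so the compact form is mostly a convenience when you don't care what the fields are called.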
When you read your array in, it will be a structured array with named fields. You can later split it into individual arrays if you'd prefer (e.g. series1 = data['a'] with the dtype defined above).
The main advantage of this is that reading in your data from disk will be very fast. numpy will simply read everything into memory, and then interpret the memory buffer according to the pattern you specified.
The drawback is that structured arrays behave a bit differently than regular arrays. If you're not used to them, they'll probably seem confusing at first. The key part to remember is that each item in the array is one of the patterns that you specified. For example, for what I showed above, data[0] might be (4.3, -1.2298, 200, 456, False).
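As a rough, self-contained sketch of the structured-array approach, the snippet below writes a few sample records to a temporary file and reads them back. The file name, the sample values, and the explicit little-endian field sizes are assumptions for illustration. A real Fortran file may use different sizes (a default LOGICAL is often stored in 4 bytes, for instance), so the dtype has to match whatever the writer actually produced.

```python
import os
import tempfile

import numpy as np

# Explicit sizes and byte order, so the layout does not depend on the platform.
record = np.dtype([('a', '<f8'), ('b', '<f8'), ('c', '<i4'), ('d', '<i4'), ('e', '?')])

# Build three sample records and write them out as raw bytes.
sample = np.array(
    [(4.3, -1.2298, 200, 456, False),
     (1.0, 2.0, 3, 4, True),
     (0.5, 0.25, -7, 8, True)],
    dtype=record,
)
path = os.path.join(tempfile.mkdtemp(), 'records.bin')
sample.tofile(path)

# Read the file back; each element of the result is one complete record.
data = np.fromfile(path, dtype=record)

series_a = data['a']   # ordinary float64 array holding every 'a' value
first = data[0]        # a single record with all five fields
print(first)
print(series_a)
```

Indexing by field name gives an ordinary array, which is usually the easiest way back to familiar territory once the file has been read.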
Reading in the header
Another common case is to have a header whose format you know, followed by a long series of regular data. You can still use np.fromfile for this, but you'll need to parse the header separately.
First, read in the header. You can do this in several different ways (e.g. have a look at the struct module in addition to np.fromfile, though either will probably work well for your purposes).
After that, when you pass the file object to fromfile, the file's internal position (i.e. the position controlled by f.seek) will be at the end of the header and the start of the data. If all of the rest of the file is a homogeneously typed array, a single call to np.fromfile(f, dtype) is all you need.
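Because each call picks up where the previous one stopped, the same idea extends to a file that switches types several times, as in the question. The sketch below invents a layout (2 ints, 1 logical stored as a 4-byte integer, 3 ints, then 4 reals) purely for illustration. It also leaves out the record-length markers that a Fortran unformatted sequential file usually contains, so treat it as a pattern to adapt, not a drop-in reader.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), 'mixed.bin')

# Write a hypothetical layout: 2 ints, 1 four-byte logical, 3 ints, 4 reals.
with open(path, 'wb') as f:
    np.array([10, 20], dtype='<i4').tofile(f)
    np.array([1], dtype='<i4').tofile(f)            # logical stored as a 4-byte integer
    np.array([7, 8, 9], dtype='<i4').tofile(f)
    np.array([0.5, 1.5, 2.5, 3.5], dtype='<f8').tofile(f)

# Read it back piece by piece; each call starts where the previous one stopped.
with open(path, 'rb') as f:
    first_ints = np.fromfile(f, dtype='<i4', count=2)
    flag = bool(np.fromfile(f, dtype='<i4', count=1)[0])
    more_ints = np.fromfile(f, dtype='<i4', count=3)
    reals = np.fromfile(f, dtype='<f8')             # default count=-1: all that is left

print(first_ints, flag, more_ints, reals)
```

The count argument is what keeps each read from running into the next section, so it needs to come from the known file format or from values read earlier in the file.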
As a quick example, you might have something like the following:
import numpy as np

# Let's say we have a file with a 512 byte header, the
# first 16 bytes of which are the width and height
# stored as big-endian 64-bit integers. The rest of the
# file is the "main" data array, stored as little-endian 32-bit floats.

# Open in binary mode so the bytes are read exactly as stored.
with open('data.dat', 'rb') as f:
    width, height = np.fromfile(f, dtype='>i8', count=2)
    # Seek to the end of the header and ignore the rest of it
    f.seek(512)
    data = np.fromfile(f, dtype='<f4')

# Presumably we'd want to reshape the data into a 2D array:
data = data.reshape((height, width))
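For comparison, the struct module mentioned earlier can parse the same kind of header. This is only a sketch under the same assumed layout (512-byte header, big-endian 64-bit width and height, little-endian 32-bit float data). It first creates a small sample file so that it can run on its own, and the width and height values are made up.

```python
import os
import struct
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), 'data.dat')
width, height = 4, 3

# Create a sample file: a 512-byte header, then height*width float32 values.
header = struct.pack('>qq', width, height).ljust(512, b'\x00')
with open(path, 'wb') as f:
    f.write(header)
    np.arange(width * height, dtype='<f4').tofile(f)

with open(path, 'rb') as f:
    # '>qq' means two big-endian signed 64-bit integers (16 bytes in total).
    w, h = struct.unpack('>qq', f.read(16))
    f.seek(512)   # skip the unused remainder of the header
    data = np.fromfile(f, dtype='<f4').reshape((h, w))

print(w, h)
print(data)
```

struct returns plain Python integers, which can be handy when the header mixes many small fields of different types. np.fromfile is usually the better fit for the bulk data.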