Python - Read binary file which has different datatypes

- September 15, 2015

I'm attempting to read a binary file produced in Fortran with Python. It has integers, reals, and logicals. At the moment I read the first few numbers correctly with:

    x = np.fromfile(filein, dtype=np.int32, count=-1)
    firstint = x[1]
    ...

(np is numpy.) The next item is a logical, later on there are ints again, and after that reals. How can I do it?

Typically, when you're reading in values such as this, they're in a regular pattern (e.g. an array of C-like structs).

Another common case is a short header of various values followed by a bunch of homogeneously typed data.

Let's deal with the first case first.

Reading in regular patterns of data types

For example, you might have something like:

float, float, int, int, bool, float, float, int, int, bool, ...

If that's the case, you can define a dtype to match the pattern of types. In the case above, it might look like:

dtype=[('a', float), ('b', float), ('c', int), ('d', int), ('e', bool)]

(Note: There are many different ways to define the dtype. For example, you could also write it as np.dtype('f8,f8,i8,i8,?'). See the documentation for numpy.dtype for more information.)

When you read your array in, it will be a structured array with named fields. You can later split it up into individual arrays if you'd prefer. (e.g. series1 = data['a'] with the dtype defined above)

The main advantage of this is that reading in your data from disk will be very fast. NumPy will simply read everything into memory, and then interpret the memory buffer according to the pattern you specified.

The drawback is that structured arrays behave a bit differently than regular arrays. If you're not used to them, they'll probably seem confusing at first. The key part to remember is that each item in the array is one of the patterns that you specified. For example, for what I showed above, data[0] might be (4.3, -1.2298, 200, 456, False).
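As a rough sketch of the note above, the same record layout can be spelled in either form. The sizes and byte order here are assumptions chosen for illustration: plain float and int map to platform-dependent sizes, so it is usually safer to spell them out for a binary file. Also note that many Fortran compilers store a default LOGICAL in 4 bytes rather than 1, so '?' may not match a Fortran-written file. Check how your file was actually written.

```python
import numpy as np

# List-of-tuples form with explicit sizes and little-endian byte order.
# The field names 'a'..'e' are illustrative, not anything the file dictates.
record = np.dtype([('a', '<f8'), ('b', '<f8'),
                   ('c', '<i4'), ('d', '<i4'), ('e', '?')])

# Comma-separated string form: same layout (native byte order),
# but NumPy names the fields automatically (f0, f1, ...).
same_layout = np.dtype('f8,f8,i4,i4,?')

print(record.itemsize)       # number of bytes in one record
print(record.names)
print(same_layout.names)
```

The itemsize is worth checking against what you know about the file: if the file size is not a whole multiple of it, the dtype probably does not match the data.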
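To make the structured-array behaviour concrete, here is a minimal round-trip sketch. It writes a few made-up records to a temporary file and reads them back with np.fromfile. The file name, field names, field sizes, and values are all assumptions for illustration only.

```python
import os
import tempfile

import numpy as np

# Assumed record layout: two doubles, two 32-bit ints, one 1-byte bool.
record = np.dtype([('a', '<f8'), ('b', '<f8'),
                   ('c', '<i4'), ('d', '<i4'), ('e', '?')])

# Build three sample records and dump them to disk as raw bytes.
sample = np.array([(4.3, -1.2298, 200, 456, False),
                   (1.0, 2.0, 3, 4, True),
                   (0.5, 0.25, -7, 8, True)], dtype=record)
path = os.path.join(tempfile.mkdtemp(), 'records.bin')
sample.tofile(path)

# Read the whole file back as an array of records.
data = np.fromfile(path, dtype=record)

first = data[0]         # one complete record (all five fields)
series_a = data['a']    # one field across all records, a plain float array
total = data['c'] + data['d']   # fields behave like ordinary arrays

print(first)
print(series_a)
print(total)
```

Indexing with an integer gives you a whole record, while indexing with a field name gives you an ordinary array of that field, which is usually what you want for further computation.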

Reading in the header

Another common case is that you have a header that you know the format of, followed by a long series of regular data. You can still use np.fromfile for this, but you'll need to parse the header separately.

First, read in the header. You can do this in several different ways (e.g. have a look at the struct module in addition to np.fromfile, though either will probably work well for your purposes).

After that, when you pass the file object to fromfile, the file's internal position (i.e. the position controlled by f.seek) will be at the end of the header and the start of the data. If all of the rest of the file is a homogeneously-typed array, a single call to np.fromfile(f, dtype) is all you need.

As a quick example, you might have something like the following:

    import numpy as np

    # Let's say we have a file with a 512 byte header, the
    # first 16 bytes of which are the width and height
    # stored as big-endian 64-bit integers. The rest of the
    # "main" data array is stored as little-endian 32-bit floats.

    with open('data.dat', 'rb') as f:
        width, height = np.fromfile(f, dtype='>i8', count=2)
        # Seek to the end of the header and ignore the rest of it
        f.seek(512)
        data = np.fromfile(f, dtype='<f4')

    # Presumably we'd want to reshape the data into a 2D array:
    data = data.reshape((height, width))
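The struct module mentioned earlier can parse the same kind of header. The sketch below first writes a small sample file in the layout described above (512-byte header, width and height as big-endian 64-bit integers, then little-endian 32-bit floats) so that it can run on its own. The file path, the HEADER_SIZE name, and the sample values are assumptions for illustration.

```python
import os
import struct
import tempfile

import numpy as np

HEADER_SIZE = 512  # assumed header length in bytes

# Write a sample file: header padded with zeros, then the float data.
width, height = 4, 3
payload = np.arange(width * height, dtype='<f4')
path = os.path.join(tempfile.mkdtemp(), 'data.dat')
with open(path, 'wb') as f:
    header = struct.pack('>qq', width, height).ljust(HEADER_SIZE, b'\x00')
    f.write(header + payload.tobytes())

# Read it back: parse the header with struct, the body with NumPy.
with open(path, 'rb') as f:
    header = f.read(HEADER_SIZE)
    # '>qq' means two big-endian signed 64-bit integers.
    w, h = struct.unpack_from('>qq', header, 0)
    # The file position is now at the start of the data.
    data = np.fromfile(f, dtype='<f4').reshape((h, w))

print(w, h)
print(data)
```

One small advantage of struct here is that the header values come back as plain Python integers, and a header mixing several types can be unpacked with a single format string.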
