Python Function To Read Variable Length Blocks Of Data From File While Open
Solution 1:
You can use the max_rows argument of numpy.genfromtxt:
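import numpy as np

with open("timesteps.dat", "rb") as f:
    while True:
        line = f.readline()
        # Skip blank lines (readline() returns b"" only at end of file,
        # so the loop cannot spin forever on trailing blank lines)
        while line and not line.strip():
            line = f.readline()
        if not line:
            # End of file
            break
        # `line` is the "TIMESTEP PARTICLES" label; the values are on the next line
        line2_fields = f.readline().split()
        timestep = float(line2_fields[0])
        particles = int(line2_fields[1])
        # names=True consumes the column header line, then max_rows data rows
        data = np.genfromtxt(f, names=True, dtype=None, max_rows=particles)
        print("Timestep:", timestep)
        print("Particles:", particles)
        print("Data:")
        print(data)
        print()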
Here's a sample file:
TIMESTEP PARTICLES
0.00500103 4
ID GROUP VOLUME MASS PX PY PZ VX VY VZ
651 0 5.23599e-07 0.000397935 -0.084626 -0.0347849 0.00188164 0 0 -1.04903
430 0 5.23599e-07 0.000397935 -0.0837742 -0.0442293 0.0121046 0 0 -1.04903
384 0 5.23599e-07 0.000397935 -0.0749234 -0.0395652 0.0143401 0 0 -1.04903
971 0 5.23599e-07 0.000397935 -0.0954931 -0.0159607 0.0100155 0 0 -1.04903
TIMESTEP PARTICLES
0.00500103 5
ID GROUP VOLUME MASS PX PY PZ VX VY VZ
971 0 5.23599e-07 0.000397935 -0.0954931 -0.0159607 0.0100155 0 0 -1.04903
652 0 5.23599e-07 0.000397935 -0.084626 -0.0347849 0.00188164 0 0 -1.04903
431 0 5.23599e-07 0.000397935 -0.0837742 -0.0442293 0.0121046 0 0 -1.04903
385 0 5.23599e-07 0.000397935 -0.0749234 -0.0395652 0.0143401 0 0 -1.04903
972 0 5.23599e-07 0.000397935 -0.0954931 -0.0159607 0.0100155 0 0 -1.04903
TIMESTEP PARTICLES
0.00500103 3
ID GROUP VOLUME MASS PX PY PZ VX VY VZ
222 0 5.23599e-07 0.000397935 -0.0837742 -0.0442293 0.0121046 0 0 -1.04903
333 0 5.23599e-07 0.000397935 -0.0749234 -0.0395652 0.0143401 0 0 -1.04903
444 0 5.23599e-07 0.000397935 -0.0954931 -0.0159607 0.0100155 0 0 -1.04903
And here is the output:
Timestep: 0.00500103
Particles: 4
Data:
[ (651, 0, 5.23599e-07, 0.000397935, -0.084626, -0.0347849, 0.00188164, 0, 0, -1.04903)
(430, 0, 5.23599e-07, 0.000397935, -0.0837742, -0.0442293, 0.0121046, 0, 0, -1.04903)
(384, 0, 5.23599e-07, 0.000397935, -0.0749234, -0.0395652, 0.0143401, 0, 0, -1.04903)
(971, 0, 5.23599e-07, 0.000397935, -0.0954931, -0.0159607, 0.0100155, 0, 0, -1.04903)]
Timestep: 0.00500103
Particles: 5
Data:
[ (971, 0, 5.23599e-07, 0.000397935, -0.0954931, -0.0159607, 0.0100155, 0, 0, -1.04903)
(652, 0, 5.23599e-07, 0.000397935, -0.084626, -0.0347849, 0.00188164, 0, 0, -1.04903)
(431, 0, 5.23599e-07, 0.000397935, -0.0837742, -0.0442293, 0.0121046, 0, 0, -1.04903)
(385, 0, 5.23599e-07, 0.000397935, -0.0749234, -0.0395652, 0.0143401, 0, 0, -1.04903)
(972, 0, 5.23599e-07, 0.000397935, -0.0954931, -0.0159607, 0.0100155, 0, 0, -1.04903)]
Timestep: 0.00500103
Particles: 3
Data:
[ (222, 0, 5.23599e-07, 0.000397935, -0.0837742, -0.0442293, 0.0121046, 0, 0, -1.04903)
(333, 0, 5.23599e-07, 0.000397935, -0.0749234, -0.0395652, 0.0143401, 0, 0, -1.04903)
(444, 0, 5.23599e-07, 0.000397935, -0.0954931, -0.0159607, 0.0100155, 0, 0, -1.04903)]
Solution 2:
The with statement does not loop; it only makes sure the file is properly closed afterwards.
To loop, you'll need to add a while just after the with statement (see the code below). Before you can do that, the readBlock(f) function has to check for end of file (EOF). Replace line = f.readline().strip() with this code:
line = f.readline()
if not line:
    # EOF: returning None's.
    return None, None, None
# We do the strip after the check.
# Otherwise a blank line "\n" might be interpreted as EOF.
line = line.strip()
Then add the while loop inside the with block and check whether readBlock returns None, which indicates EOF, so that we can break out of the while loop:
import time

with open('file1') as file_handle:
    while True:
        # time.clock() was removed in Python 3.8; perf_counter() replaces it
        startWallTime = time.perf_counter()
        Timestep, numParticles, particleData = readBlock(file_handle)
        if Timestep is None:
            # readBlock signalled EOF
            break
        print(Timestep)
        ## Do processing stuff here
        print("Timestep Processed")
        endWallTime = time.perf_counter()
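The readBlock function itself is not shown in this answer, so here is a minimal sketch of what an EOF-aware version could look like. The name read_block_sketch, the tiny two-column sample and the choice to return rows as lists of strings are all assumptions made for illustration, not the asker's actual code. It skips blank lines between blocks and returns three None values at EOF, as described above.

```python
import io


def read_block_sketch(f):
    """Hypothetical stand-in for readBlock: returns (timestep, count, rows),
    or (None, None, None) once the end of the file is reached."""
    line = f.readline()
    # Skip blank lines; readline() returns "" only at EOF, so check that first.
    while line and not line.strip():
        line = f.readline()
    if not line:
        return None, None, None
    # `line` is the "TIMESTEP PARTICLES" label; the values are on the next line.
    timestep_text, count_text = f.readline().split()
    timestep, num_particles = float(timestep_text), int(count_text)
    f.readline()  # column header line, ignored in this sketch
    rows = [f.readline().split() for _ in range(num_particles)]
    return timestep, num_particles, rows


# Assumed sample data: two blocks separated by a blank line.
sample = io.StringIO(
    "TIMESTEP PARTICLES\n"
    "0.00500103 2\n"
    "ID GROUP\n"
    "651 0\n"
    "430 0\n"
    "\n"
    "TIMESTEP PARTICLES\n"
    "0.00500103 1\n"
    "ID GROUP\n"
    "222 0\n"
)

blocks = []
while True:
    timestep, num_particles, rows = read_block_sketch(sample)
    if timestep is None:
        break
    blocks.append((timestep, num_particles, rows))
```

Checking for EOF before calling strip() is the important part: a blank line is "\n" (truthy), while EOF is the empty string.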
Solution 3:
Here's a quick-n-dirty test (it worked on the 2nd try!):
import numpy as np

with open('stack41091659.txt', 'rb') as f:
    while f.readline():   # read the 'TIMESTEP PARTICLES' line
        time, n = f.readline().strip().split()
        n = int(n)
        print(time, n)
        ablock = [f.readline()]   # block header line
        for i in range(n):
            ablock.append(f.readline())
        print(len(ablock))
        data = np.genfromtxt(ablock, dtype=None, names=True)
        print(data.shape, data.dtype)
test run:
1458:~/mypy$ python3 stack41091659.py
b'0.00500103' 4
5
(4,) [('ID', '<i4'), ('GROUP', '<i4'), ('VOLUME', '<f8'), ('MASS', '<f8'), ('PX', '<f8'), ('PY', '<f8'), ('PZ', '<f8'), ('VX', '<i4'), ('VY', '<i4'), ('VZ', '<f8')]
b'0.00500103' 3
4
(3,) [('ID', '<i4'), ('GROUP', '<i4'), ('VOLUME', '<f8'), ('MASS', '<f8'), ('PX', '<f8'), ('PY', '<f8'), ('PZ', '<f8'), ('VX', '<i4'), ('VY', '<i4'), ('VZ', '<f8')]
b'0.00500103' 2
3
(2,) [('ID', '<i4'), ('GROUP', '<i4'), ('VOLUME', '<f8'), ('MASS', '<f8'), ('PX', '<f8'), ('PY', '<f8'), ('PZ', '<f8'), ('VX', '<i4'), ('VY', '<i4'), ('VZ', '<f8')]
b'0.00500103' 4
5
(4,) [('ID', '<i4'), ('GROUP', '<i4'), ('VOLUME', '<f8'), ('MASS', '<f8'), ('PX', '<f8'), ('PY', '<f8'), ('PZ', '<f8'), ('VX', '<i4'), ('VY', '<i4'), ('VZ', '<f8')]
Sample file:
TIMESTEP PARTICLES
0.00500103 4
ID GROUP VOLUME MASS PX PY PZ VX VY VZ
651 0 5.23599e-07 0.000397935 -0.084626 -0.0347849 0.00188164 0 0 -1.04903
430 0 5.23599e-07 0.000397935 -0.0837742 -0.0442293 0.0121046 0 0 -1.04903
384 0 5.23599e-07 0.000397935 -0.0749234 -0.0395652 0.0143401 0 0 -1.04903
971 0 5.23599e-07 0.000397935 -0.0954931 -0.0159607 0.0100155 0 0 -1.04903
TIMESTEP PARTICLES
0.00500103 3
ID GROUP VOLUME MASS PX PY PZ VX VY VZ
651 0 5.23599e-07 0.000397935 -0.084626 -0.0347849 0.00188164 0 0 -1.04903
430 0 5.23599e-07 0.000397935 -0.0837742 -0.0442293 0.0121046 0 0 -1.04903
384 0 5.23599e-07 0.000397935 -0.0749234 -0.0395652 0.0143401 0 0 -1.04903
TIMESTEP PARTICLES
0.00500103 2
ID GROUP VOLUME MASS PX PY PZ VX VY VZ
384 0 5.23599e-07 0.000397935 -0.0749234 -0.0395652 0.0143401 0 0 -1.04903
971 0 5.23599e-07 0.000397935 -0.0954931 -0.0159607 0.0100155 0 0 -1.04903
TIMESTEP PARTICLES
0.00500103 4
ID GROUP VOLUME MASS PX PY PZ VX VY VZ
651 0 5.23599e-07 0.000397935 -0.084626 -0.0347849 0.00188164 0 0 -1.04903
430 0 5.23599e-07 0.000397935 -0.0837742 -0.0442293 0.0121046 0 0 -1.04903
384 0 5.23599e-07 0.000397935 -0.0749234 -0.0395652 0.0143401 0 0 -1.04903
971 0 5.23599e-07 0.000397935 -0.0954931 -0.0159607 0.0100155 0 0 -1.04903
I'm using the fact that genfromtxt is happy with anything that feeds it a block of lines. Here I collect the next block in a list, and pass it to genfromtxt.
And using the max_rows parameter of genfromtxt, I can tell it to read the next n rows directly:
with open('stack41091659.txt', 'rb') as f:
    while f.readline():   # read the 'TIMESTEP PARTICLES' line
        time, n = f.readline().strip().split()
        n = int(n)
        print(time, n)
        # names=True consumes the header line, then max_rows=n data rows
        data = np.genfromtxt(f, dtype=None, names=True, max_rows=n)
        print(data.shape, len(data.dtype.names))
I'm not taking into account that optional blank line. It could probably be squeezed in at the start of the block read, i.e. read lines until I get one with a valid float int pair of strings.
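One way that "read until a valid float int pair" idea might be sketched is below. The helper name next_counts_line is hypothetical, and the sketch assumes the data rows of the previous block have already been consumed (for example by genfromtxt with max_rows), so the only lines it can meet are blank lines, the TIMESTEP PARTICLES label and the values line.

```python
import io


def next_counts_line(f):
    """Hypothetical helper: return (timestep, n) from the next line that
    parses as a 'float int' pair, or None at end of file."""
    while True:
        line = f.readline()
        if not line:
            return None            # b"" (or "") means end of file
        fields = line.split()
        if len(fields) != 2:
            continue               # blank line or anything that is not a pair
        try:
            # float() and int() accept bytes as well as str
            return float(fields[0]), int(fields[1])
        except ValueError:
            continue               # e.g. the 'TIMESTEP PARTICLES' label line


# Assumed demo input: leading blank lines, the label line, then the values.
demo = io.BytesIO(b"\n\nTIMESTEP PARTICLES\n0.00500103 4\nID GROUP\n")
header = next_counts_line(demo)    # skips blanks and the label line
rest = demo.readline()             # the column header line is left untouched
at_eof = next_counts_line(demo)    # nothing left, so None
```

With a helper like this, the loop condition becomes "keep going until it returns None", and the label line no longer needs its own readline() call.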