Python Iterating Through Two Files By Line At The Same Time
I am trying to compare columns in two files to see if the values match, and if there is a match I want to merge/concatenate the data for that row together. My issue is that when re
Solution 1:
Use the built-in zip function.
with open(file1) as f1, open(file2) as f2:
    for line1, line2 in zip(f1, f2):
        motif1 = line1.split()[0]
        motif2 = line2.split()[0]
        ...
Note that zip behaves differently in Python 2 and Python 3. In Python 2, zip builds the whole list of pairs in memory, so itertools.izip is more efficient there. In Python 3, zip is already lazy and returns an iterator.
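To fill in the comparison step, here is a minimal sketch. It assumes whitespace-separated columns, that the key is the first column, and that matching rows sit on the same line number in both files. The function name merge_matching_rows and the sample data are made up for illustration.

```python
import os
import tempfile


def merge_matching_rows(path1, path2):
    """Pair lines by position; merge rows whose first columns are equal."""
    merged = []
    with open(path1) as f1, open(path2) as f2:
        for line1, line2 in zip(f1, f2):
            cols1 = line1.split()
            cols2 = line2.split()
            # Skip pairs where either line is blank.
            if not cols1 or not cols2:
                continue
            if cols1[0] == cols2[0]:
                # Keep the key once, then the remaining columns of both rows.
                merged.append(" ".join(cols1 + cols2[1:]))
    return merged


# Demo with two small temporary files (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "a.txt")
    path_b = os.path.join(tmp, "b.txt")
    with open(path_a, "w") as f:
        f.write("ACGT 1 2\nTTGA 3 4\nGGCC 5 6\n")
    with open(path_b, "w") as f:
        f.write("ACGT x y\nCCCC z w\nGGCC u v\n")
    result = merge_matching_rows(path_a, path_b)

print(result)  # ['ACGT 1 2 x y', 'GGCC 5 6 u v']
```

If the matching rows are not on the same line in both files, pairing by position will not find them, and a lookup keyed on the first column would probably fit better.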
Solution 2:
I'm assuming you're using Python 3. Here's a nice abstraction, iterlines. It hides the complexity of opening, reading, pairing, and closing n files. Note the use of zip_longest, which prevents the ends of longer files from being silently discarded.
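def iterlines(*paths, fillvalue=None, **open_kwargs):
    files = []
    try:
        for path in paths:
            files.append(open(path, **open_kwargs))
        for lines in zip_longest(*files, fillvalue=fillvalue):
            yield lines
    finally:
        for file_ in files:
            # suppress() with no arguments suppresses nothing; name the
            # exception so one failed close() does not skip the other files.
            with suppress(Exception):
                file_.close()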
Usage
for line_a, line_b in iterlines('a.txt', 'b.txt'):
    print(line_a, line_b)
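To see what zip_longest changes compared with plain zip, here is a small sketch. It uses io.StringIO objects as stand-ins for two files of different lengths; the contents are made up for illustration.

```python
import io
from itertools import zip_longest

# In-memory stand-ins for two files of different lengths (hypothetical data).
short = io.StringIO("a1\na2\n")
long_ = io.StringIO("b1\nb2\nb3\n")

# zip stops as soon as the shorter input runs out.
zip_pairs = list(zip(short, long_))

# Rewind both "files" and pair them again with zip_longest.
short.seek(0)
long_.seek(0)
longest_pairs = list(zip_longest(short, long_, fillvalue=None))

print(len(zip_pairs))      # 2
print(longest_pairs[-1])   # (None, 'b3\n')
```

With zip, the line 'b3' never shows up. With zip_longest, it is paired with the fill value, so the loop body has to be ready to handle None (or whatever fillvalue you pass).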
Complete code
from contextlib import suppress
from itertools import zip_longest

def iterlines(*paths, fillvalue=None, **open_kwargs):
    files = []
    try:
        for path in paths:
            files.append(open(path, **open_kwargs))
        for lines in zip_longest(*files, fillvalue=fillvalue):
            yield lines
    finally:
        for file_ in files:
            # suppress() with no arguments suppresses nothing; name the
            # exception so one failed close() does not skip the other files.
            with suppress(Exception):
                file_.close()

for lines in iterlines('a.txt', 'b.txt', 'd.txt'):
    print(lines)
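As a possible alternative (not part of the answer above), the same idea can be sketched with contextlib.ExitStack, which takes over the bookkeeping of closing every opened file. The name iterlines_exitstack and the demo files are made up for illustration. Unlike the version above, this one does not swallow errors raised while closing.

```python
import os
import tempfile
from contextlib import ExitStack
from itertools import zip_longest


def iterlines_exitstack(*paths, fillvalue=None, **open_kwargs):
    """Yield tuples of lines from several files, closing all files on exit."""
    with ExitStack() as stack:
        # enter_context registers each file so the stack closes it later,
        # including when a later open() call fails.
        files = [stack.enter_context(open(path, **open_kwargs)) for path in paths]
        yield from zip_longest(*files, fillvalue=fillvalue)


# Demo with two small temporary files of different lengths (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "a.txt")
    path_b = os.path.join(tmp, "b.txt")
    with open(path_a, "w") as f:
        f.write("1\n2\n")
    with open(path_b, "w") as f:
        f.write("x\n")
    rows = list(iterlines_exitstack(path_a, path_b, fillvalue=""))

print(rows)  # [('1\n', 'x\n'), ('2\n', '')]
```

Passing fillvalue="" instead of None means the loop body can still call string methods such as split() on every item.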