Python Iterating Through Two Files By Line At The Same Time
I am trying to compare columns in two files to see if the values match, and if there is a match I want to merge/concatenate the data for that row together. My issue is that when re
Solution 1:
Use the zip builtin function.
with open(file1) as f1, open(file2) as f2:
    for line1, line2 in zip(f1, f2):
        motif1 = line1.split()[0]
        motif2 = line2.split()[0]
        ...
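Note that zip behaves differently in Python 2 and Python 3. In Python 2, zip builds the full list of pairs in memory before the loop starts, so itertools.izip, which yields pairs lazily, is more efficient there. In Python 3, zip is already lazy. In both versions, iteration stops as soon as the shorter file runs out of lines.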
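As a rough sketch of how the loop above could be completed for the compare-and-merge case in the question: the helper names merge_matching_rows and merge_files are hypothetical, and the code assumes whitespace-separated columns with the key in the first column of each file. Adjust the split and the merge rule to the actual file layout.

```python
def merge_matching_rows(lines1, lines2):
    """Yield a merged row for each line pair whose first column matches.

    Hypothetical helper: assumes whitespace-separated columns and that
    the value to compare is the first column in both inputs.
    """
    for line1, line2 in zip(lines1, lines2):
        cols1 = line1.split()
        cols2 = line2.split()
        # Skip pairs where either line is blank.
        if not cols1 or not cols2:
            continue
        if cols1[0] == cols2[0]:
            # Keep the shared key once, then append the rest of both rows.
            yield cols1 + cols2[1:]


def merge_files(path1, path2):
    """Apply merge_matching_rows to two files, reading them in lockstep."""
    with open(path1) as f1, open(path2) as f2:
        # yield from keeps both files open while rows are being consumed.
        yield from merge_matching_rows(f1, f2)


# In-memory demo, so the sketch runs without any files on disk.
rows = list(merge_matching_rows(["m1 1 2\n", "m2 3 4\n"],
                                ["m1 9\n", "mX 8\n"]))
print(rows)  # [['m1', '1', '2', '9']]
```

Because the pairing is positional, this only finds matches that sit on the same line number in both files. If the matching rows can appear at different positions, a dictionary keyed on the first column would be a better fit than zip.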
Solution 2:
I'm assuming you're using Python 3. Here's a nice abstraction, iterlines. It hides the complexity of opening, reading, pairing, and closing n files. Note the use of zip_longest: it prevents the ends of longer files from being silently discarded. Once a shorter file is exhausted, its slot in each tuple is filled with fillvalue (None by default).
def iterlines(*paths, fillvalue=None, **open_kwargs):
    files = []
    try:
        for path in paths:
            files.append(open(path, **open_kwargs))
        for lines in zip_longest(*files, fillvalue=fillvalue):
            yield lines
    finally:
        for file_ in files:
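            # suppress() with no arguments suppresses nothing; name the
            # exception so one failed close() does not skip the other files.
            with suppress(Exception):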
                file_.close()
Usage
for line_a, line_b in iterlines('a.txt', 'b.txt'):
    print(line_a, line_b)
Complete code
from contextlib import suppress
from itertools import zip_longest
def iterlines(*paths, fillvalue=None, **open_kwargs):
    files = []
    try:
        for path in paths:
            files.append(open(path, **open_kwargs))
        for lines in zip_longest(*files, fillvalue=fillvalue):
            yield lines
    finally:
        for file_ in files:
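            # suppress() with no arguments suppresses nothing; name the
            # exception so one failed close() does not skip the other files.
            with suppress(Exception):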
                file_.close()
for lines in iterlines('a.txt', 'b.txt', 'd.txt'):
    print(lines)
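To make the difference between zip and zip_longest concrete, here is a small in-memory comparison. The lists a and b are made-up stand-ins for two files of different lengths, and the empty-string fillvalue is just one possible choice.

```python
from itertools import zip_longest

# Stand-ins for the lines of two files with different lengths.
a = ["a1\n", "a2\n", "a3\n"]
b = ["b1\n"]

# zip stops at the shorter input, so a2 and a3 are dropped silently.
short_pairs = list(zip(a, b))

# zip_longest keeps going and pads the exhausted input with fillvalue.
long_pairs = list(zip_longest(a, b, fillvalue=""))

print(short_pairs)  # [('a1\n', 'b1\n')]
print(long_pairs)   # [('a1\n', 'b1\n'), ('a2\n', ''), ('a3\n', '')]
```

With the default fillvalue of None, code that calls line.split() on every item has to check for None first. Passing fillvalue='' avoids that check, at the cost of making a missing line look the same as a line with no content.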