performance - Python - Exclude contents of one file from another / removing duplicate lines amongst two files


First off, I'm using Python 2.7.9. I'm trying to find an efficient way to compare the lines of one text file (file A) with the lines of another text file (file B) and write the lines that are unique to file A to a new file (file A\B).

Actually, I've already written a short script for this, but it is beyond slow. I need a script that can handle files of 70 MB (each, A and B), which is unthinkable with this 'bad' boy:

    import string
    naked = string.strip
    kiss = ''.join

    def main():
        list1 = raw_input("Enter the name of the .txt-file to clean!\n")
        list2 = raw_input("Enter the name of the .txt-file to exclude!\n")
        action(list1, list2)
        raw_input("Done!\nPress [Enter] to exit!")

    def action(list1, list2):
        f = open(kiss([list1, '.txt']), "r")
        g = open(kiss([list2, '.txt']), "r")
        h = open(kiss([list1, '_without_', list2, '.txt']), "w")
        h_w = h.write
        reset = g.seek
        found = False
        for i in f:
            # Rescans all of file B for every single line of file A.
            found = [True for j in g if naked(i) == naked(j)]
            if not found:
                h_w(kiss([naked(i), '\n']))
            else:
                found = False
            reset(0)  # rewind file B for the next line of file A
        f.close()
        g.close()
        h.close()

    main()

Yeah... does anyone have an idea how to do this more efficiently?! Thanks in advance!

    def read_file(filename):
        with open(filename) as src:
            return [line.strip() for line in src.readlines()]


    def main():
        list1 = raw_input("Enter the name of the .txt-file to clean!\n")
        list2 = raw_input("Enter the name of the .txt-file to exclude!\n")
        file1 = read_file(list1)
        file2 = read_file(list2)
        file3 = open('new_file.txt', 'w')

        for line in file1:
            if line not in file2:
                file3.write(str(line) + '\n')  # writes to the new file

        file3.close()
        print 'completed'

    main()

I am not sure this is the fastest way to do the trick. You can also use the "diff" or "comm" Linux commands to get the required output.
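
One likely reason the list-based version is still slow on 70 MB files is that `line not in file2` scans the whole list for every line of file A. Holding the lines of file B in a set makes each lookup constant time on average. Below is a minimal sketch of that idea, not a definitive implementation: the function name `exclude_lines` and its three path parameters are hypothetical, and it assumes that the stripped lines of file B fit in memory.

```python
def exclude_lines(path_a, path_b, path_out):
    """Write the lines of path_a that do not appear in path_b to path_out.

    Lines are compared after stripping surrounding whitespace, as in the
    scripts above. Returns the number of lines written.
    (Hypothetical helper; names are illustrative only.)
    """
    # Load file B into a set: membership tests are O(1) on average,
    # instead of rescanning the whole file or list for every line of A.
    with open(path_b) as src_b:
        excluded = set(line.strip() for line in src_b)

    written = 0
    # Stream file A line by line so that only B has to be held in memory.
    with open(path_a) as src_a:
        with open(path_out, 'w') as dst:
            for line in src_a:
                stripped = line.strip()
                if stripped not in excluded:
                    dst.write(stripped + '\n')
                    written += 1
    return written
```

A call such as `exclude_lines('a.txt', 'b.txt', 'a_without_b.txt')` keeps the original order of file A and keeps duplicates within A. The sketch avoids `print` and `raw_input`, so it should run unchanged on Python 2.7 and Python 3.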

