Python: how to slice a csv file with respect to a column other than the first?

July 15, 2012

I have a csv file with a number of columns and 500000 rows. I need to slice the file with respect to the second column, which contains the year, while keeping all the other columns:

col1   col2   col3   col4   col5   col6   col7
xxx    1986   xxx    xxx    xxx    xxx    xxx
xxx    1992   xxx    xxx    xxx    xxx    xxx
xxx    1998   xxx    xxx    xxx    xxx    xxx
...    ...    ...    ...    ...    ...    ...
xxx    2015   xxx    xxx    xxx    xxx    xxx
xxx    1984   xxx    xxx    xxx    xxx    xxx

My question: how can I produce a csv file out of this that contains only the rows whose value in the second column is >= 1992?

Desired output:

col1   col2   col3   col4   col5   col6   col7
xxx    1992   xxx    xxx    xxx    xxx    xxx
xxx    1998   xxx    xxx    xxx    xxx    xxx
xxx    2015   xxx    xxx    xxx    xxx    xxx

My attempt is this, but I got stuck at the point where I should insert an if linked to the second column, and I don't know how to do that:

from __future__ import division
import numpy
from numpy import *
import csv
from collections import *
import os
import glob

directorypath = raw_input('working directory: ')  # folder where the csv files are located
for i, file in enumerate(os.listdir(directorypath)):  # loops over the folder with the csv files
    if file.endswith(".csv"):  # checks that it is a csv file
        filename = os.path.basename(file)  # takes the file name without the directory
        filelabel = file  # takes the filename
        strpath = os.path.join(directorypath, file)  # builds the complete path to the csv file
        x = numpy.genfromtxt(strpath, delimiter=',')[:, 7]  # I got stuck here

You can iterate over the rows of the csv and check whether the value in col2 is >= the year you are interested in. If it is, add the row to a new list, then pass the new list to a csv writer. You can call the function in a loop to create new csvs for all the files ending with the .csv extension.

You have to pass in working_directory and year. working_directory is the folder containing the csvs you want to process.

import csv
import os

def make_csv(in_file, out_file, year):
    # Python 2: the csv module expects files opened in binary mode
    with open(in_file, 'rb') as csv_in_file:
        csv_row_list = []
        first_row = True
        csv_reader = csv.reader(csv_in_file)
        for row in csv_reader:
            if first_row:
                # always keep the header row
                csv_row_list.append(row)
                first_row = False
            else:
                # keep the row only if the year in the second column is >= year
                if int(row[1]) >= year:
                    csv_row_list.append(row)

    with open(out_file, 'wb') as csv_out_file:
        csv_writer = csv.writer(csv_out_file)
        csv_writer.writerows(csv_row_list)

# working_directory and year must be defined before this loop runs
for root, directories, files in os.walk(working_directory):
    for f in files:
        if f.endswith('.csv'):
            in_file = os.path.join(root, f)
            out_file = os.path.join(root, os.path.splitext(f)[0] + '_new' + os.path.splitext(f)[1])
            make_csv(in_file, out_file, year)
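
The question uses raw_input, so the code above is Python 2, where csv files are opened in binary mode. On Python 3 the csv module expects text-mode files opened with newline="" instead. Here is a rough sketch of the same idea under that assumption, also assuming comma-separated files with a single header row. The names filter_csv_by_year, filter_directory, year_col and suffix are made up for illustration and are not from the answer. It writes each matching row as soon as it is read instead of collecting rows in a list, which may be gentler on memory with 500000 rows.

```python
import csv
import os


def filter_csv_by_year(in_path, out_path, min_year, year_col=1):
    """Copy the header and every row whose year column is >= min_year.

    Rows are written as they are read, so the whole file is never
    held in memory. Returns the number of data rows written.
    """
    kept = 0
    # Python 3: the csv module expects text-mode files opened with newline="".
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:  # empty input file: nothing to copy
            return 0
        writer.writerow(header)
        for row in reader:
            try:
                year = int(row[year_col])
            except (IndexError, ValueError):
                continue  # assumption: blank or malformed rows are skipped
            if year >= min_year:
                writer.writerow(row)
                kept += 1
    return kept


def filter_directory(working_directory, min_year, suffix="_new"):
    """Apply filter_csv_by_year to every .csv file under working_directory."""
    written = []
    for root, _dirs, files in os.walk(working_directory):
        for name in files:
            base, ext = os.path.splitext(name)
            # Skip outputs of an earlier run so they are not filtered again.
            if ext.lower() != ".csv" or base.endswith(suffix):
                continue
            out_path = os.path.join(root, base + suffix + ext)
            filter_csv_by_year(os.path.join(root, name), out_path, min_year)
            written.append(out_path)
    return written
```

Calling filter_directory(working_directory, 1992) would write a data_new.csv next to each data.csv and return the list of output paths. Skipping malformed rows is a choice made for this sketch; the original answer would raise an error on them instead.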
