java - Cache file in memory and read in parallel
I have a program (a simple log parser) that is slow because in some cases it has to do a full scan of the input file. I am thinking of pre-caching the entire file (~100 MB) in memory and reading it from multiple threads.
With the current configuration I use a BufferedReader for the "main read", and a RandomAccessFile to seek to a specific offset and read what I need.
I've tried it this way:
    ...
    Reader reader = null;
    if (cache) {
        // caching the file in memory
        br = new BufferedReader(new FileReader(file));
        buffer = new StringBuilder();
        for (String line = br.readLine(); line != null; line = br.readLine()) {
            buffer.append(line).append(CR);
        }
        br.close();
        reader = new StringReader(buffer.toString());
    } else {
        reader = new FileReader(file);
    }
    br = new BufferedReader(reader);
    for (String line = br.readLine(); line != null; line = br.readLine()) {
        offset += line.length() + 1; // the +1 is for the line separator
        matcher = Constants.PT_BEGIN_COMPOSITION.matcher(line);
        if (matcher.matches()) {
            lineCount++;
            record = new Record();
            record.setCompositionCode(matcher.group(1));
            matcher = Constants.PT_PREFIX.matcher(line);
            if (matcher.matches()) {
                record.setBeginComposition(Constants.SDF_DATE.parse(matcher.group(1)));
                record.setProcessId(matcher.group(2));
                if (cache) {
                    executor.submit(new PubblicationParser(buffer, offset, record));
                } else {
                    executor.submit(new PubblicationParser(file, offset, record));
                }
                records.add(record);
            } else {
                br.close();
                throw new ParseException(line, 0);
            }
        }
    }
In PubblicationParser there is an init() method that chooses which custom reader to use, a RandomAccessFileReader or a StringBuilderReader:
    if (file != null) {
        this.logReader = new RandomAccessFileReader(file, offset);
    } else if (sb != null) {
        this.logReader = new StringBuilderReader(sb, (int) offset);
    }
And the two custom readers:
    public class StringBuilderReader implements LogReader {

        public static final String CR = System.getProperty("line.separator");

        private final StringBuilder sb;
        private int offset;

        public StringBuilderReader(StringBuilder sb, int offset) {
            super();
            this.sb = sb;
            this.offset = offset;
        }

        @Override
        public String readLine() throws IOException {
            if (offset >= sb.length()) {
                return null;
            }
            int indexOf = sb.indexOf(CR, offset);
            if (indexOf < 0) {
                indexOf = sb.length();
            }
            String substring = sb.substring(offset, indexOf);
            offset = indexOf + CR.length();
            return substring;
        }

        @Override
        public void close() throws IOException {
            // nothing to close
        }
    }

    public class RandomAccessFileReader implements LogReader {

        private static final String FILEMODE_R = "r";

        private final RandomAccessFile raf;

        public RandomAccessFileReader(File file, long offset) throws IOException {
            this.raf = new RandomAccessFile(file, FILEMODE_R);
            this.raf.seek(offset);
        }

        @Override
        public void close() throws IOException {
            raf.close();
        }

        @Override
        public String readLine() throws IOException {
            return raf.readLine();
        }
    }
The problem is that the "cache way" is slow, and I can't understand why!
You should first make sure that it really is I/O that is making your application slow, and not something else (e.g. inefficient logic in the parser). For that, use a Java profiler (JProfiler, for example).
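If a full profiler feels like overkill as a first step, a rough timing check around the parse already tells you which path is slower. A minimal sketch, assuming a hypothetical entry point parseLog(file, cache) standing in for whatever call runs your parse:

    long start = System.nanoTime();
    parseLog(file, cache); // placeholder for your own parse entry point
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;
    System.out.println("cache=" + cache + " -> " + elapsedMs + " ms");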
If it really is I/O, you might be better off using a ready-made mechanism for loading the file into memory, which is essentially what you are trying to implement yourself.
Have a look at MappedByteBuffer and ByteBuffer.
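A minimal sketch of the memory-mapped approach, not your code but an illustration: the file path, offset and chunk size are placeholders for your own values. Each worker gets its own duplicate() of the mapped buffer; the duplicates share the underlying file contents but have independent position/limit, so several parsers can read from different offsets concurrently.

    import java.io.File;
    import java.io.RandomAccessFile;
    import java.nio.ByteBuffer;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class MappedLogExample {
        public static void main(String[] args) throws Exception {
            File file = new File("application.log"); // placeholder path
            long offset = 0;                         // offset of the record to read

            try (RandomAccessFile raf = new RandomAccessFile(file, "r");
                 FileChannel channel = raf.getChannel()) {
                // Map the whole file once; the OS pages it in lazily as it is read.
                MappedByteBuffer mapped =
                        channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());

                // Per-worker view with its own cursor over the shared mapping.
                ByteBuffer view = mapped.duplicate();
                view.position((int) offset);

                // Read a chunk starting at the offset and decode it.
                byte[] chunk = new byte[Math.min(4096, view.remaining())];
                view.get(chunk);
                System.out.println(new String(chunk, "UTF-8"));
            }
        }
    }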