C++ - Convert between C++11 clocks
If I have a time_point for an arbitrary clock (say high_resolution_clock::time_point), is there a way to convert it to a time_point for another arbitrary clock (say system_clock::time_point)?
I know there would have to be limits, if this ability existed, because not all clocks are steady, but is there any functionality to help with such conversions in the spec at all?
I was wondering whether the accuracy of the conversion proposed by T.C. and Howard Hinnant could be improved. For reference, here is the base version that I tested.
    template
    <
      typename dsttimepointt,
      typename srctimepointt,
      typename dstclockt = typename dsttimepointt::clock,
      typename srcclockt = typename srctimepointt::clock
    >
    dsttimepointt
    clock_cast_0th(const srctimepointt tp)
    {
      // Sample both clocks back to back and shift tp by their difference.
      const auto src_now = srcclockt::now();
      const auto dst_now = dstclockt::now();
      return dst_now + (tp - src_now);
    }

Using the test

    int
    main()
    {
      using namespace std::chrono;
      const auto now = system_clock::now();
      const auto steady_now = clock_cast<steady_clock::time_point>(now);
      const auto system_now = clock_cast<system_clock::time_point>(steady_now);
      const auto diff = system_now - now;  // round-trip error
      std::cout << duration_cast<nanoseconds>(diff).count() << '\n';
    }

where clock_cast is #defined to, for now, clock_cast_0th, I collected one histogram for an idle system and one under high load. Note that this is a cold-start test. I first tried calling the function in a loop, where it gives much better results. However, I think that this would give a false impression because most real-world programs probably convert a time point only every now and then and will hit the cold case.
The load was generated by running the following tasks in parallel to the test program. (My computer has four CPUs.)
- A matrix multiplication benchmark (single-threaded).
- find /usr/include -execdir grep "$(pwgen 10 1)" '{}' \; -print
- hexdump /dev/urandom | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip| gunzip > /dev/null
- dd if=/dev/urandom of=/tmp/spam bs=10 count=1000
Those commands that would terminate in finite time were run in an infinite loop.
The following histogram (as well as those that follow) shows the errors of 50 000 runs with the worst 1 ‰ removed.
Note that the ordinate has a logarithmic scale.
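The removal of the worst 1 ‰ can be sketched in C++ as follows. This is only an illustration of the idea. The actual trimming is done by the binput.py script shown further down, and the name drop_worst is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): keep the `cutoff`
// fraction of the samples that have the smallest absolute value.
std::vector<double>
drop_worst(std::vector<double> errors, const double cutoff)
{
  assert(cutoff > 0.0 && cutoff <= 1.0);
  // Sort by magnitude so that the worst outliers end up at the back.
  std::sort(errors.begin(), errors.end(),
            [](const double a, const double b) {
              return std::abs(a) < std::abs(b);
            });
  // Number of samples to discard, rounded towards zero.
  const auto dropped
    = static_cast<std::size_t>((1.0 - cutoff) * errors.size());
  errors.resize(errors.size() - dropped);
  return errors;
}
```

With cutoff = 0.999 and 50 000 samples, this would discard roughly the 50 samples of largest magnitude.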
The errors roughly fall into the range between 0.5 µs and 1.0 µs in the idle case and between 0.5 µs and 1.5 µs in the contended case.
The most striking observation is that the error distribution is far from symmetric (there are no negative errors at all), indicating a large systematic component in the error. This makes sense because if we get interrupted between the two calls to now, the error is always in the same direction, and we cannot be interrupted for a “negative amount of time”.
The histogram for the contended case almost looks like a perfect exponential distribution (mind the log-scale!) with a rather sharp cut-off, which seems plausible; the chance that you are interrupted for time t is roughly proportional to e^(−t).
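The one-sided error can be made explicit with a small model. As a simplifying assumption (not stated in the original), let both clocks run at the same rate with a constant true offset Δ, and let δ₁, δ₂ ≥ 0 be the delays between the two now calls in the forward and the backward conversion of the round-trip test:

```latex
\begin{align*}
  \text{forward:}    \quad t_{\mathrm{dst}} &= \mathrm{dst\_now} + (t - \mathrm{src\_now})
                                             = t + \Delta + \delta_1, \\
  \text{backward:}   \quad t'               &= t_{\mathrm{dst}} - \Delta + \delta_2, \\
  \text{round trip:} \quad t' - t           &= \delta_1 + \delta_2 \;\ge\; 0 .
\end{align*}
```

Under this model the measured round-trip error is the sum of two non-negative delays, so it can never be negative.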
I then tried using the following trick

    template
    <
      typename dsttimepointt,
      typename srctimepointt,
      typename dstclockt = typename dsttimepointt::clock,
      typename srcclockt = typename srctimepointt::clock
    >
    dsttimepointt
    clock_cast_1st(const srctimepointt tp)
    {
      const auto src_before = srcclockt::now();
      const auto dst_now = dstclockt::now();
      const auto src_after = srcclockt::now();
      const auto src_diff = src_after - src_before;
      // Estimate the source time at the moment dst_now was taken.
      const auto src_now = src_before + src_diff / 2;
      return dst_now + (tp - src_now);
    }

hoping that interpolating src_now would partially cancel the error introduced by inevitably calling the clocks in sequential order.
In the first version of this answer, I claimed that this didn't help anything. As it turns out, this wasn't true. After Howard Hinnant pointed out that he did observe improvements, I improved my tests and now there is some observable improvement.
It wasn't so much an improvement in terms of the error span; however, the errors are now roughly centered around zero, which means that we now have errors in the range from −0.5 µs to 0.5 µs. The more symmetric distribution indicates that the statistical component of the error became more dominant.
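Why the midpoint centers the error can be seen with two artificial clocks. The following is a minimal sketch under stated assumptions: the clocks src_clock and dst_clock, the shared tick counter and the names cast_sequential and cast_midpoint are all hypothetical and exist only for this illustration. Every call to now costs exactly one tick, and the true offset between the clocks is exactly 1000 ticks.

```cpp
#include <cassert>
#include <chrono>
#include <ratio>

namespace sketch
{
  // Hypothetical shared timeline: each call to now() on either clock
  // reads the current tick and then lets exactly one tick pass.
  static long long ticks = 0;

  struct src_clock
  {
    using rep = long long;
    using period = std::nano;
    using duration = std::chrono::duration<rep, period>;
    using time_point = std::chrono::time_point<src_clock, duration>;
    static time_point now() { return time_point {duration {ticks++}}; }
  };

  // Same timeline, but the epoch is shifted by exactly 1000 ticks.
  struct dst_clock
  {
    using rep = long long;
    using period = std::nano;
    using duration = std::chrono::duration<rep, period>;
    using time_point = std::chrono::time_point<dst_clock, duration>;
    static time_point now() { return time_point {duration {1000 + ticks++}}; }
  };

  // Same idea as clock_cast_0th: sample the clocks one after the other.
  template <typename dsttimepointt, typename srctimepointt>
  dsttimepointt cast_sequential(const srctimepointt tp)
  {
    const auto src_now = srctimepointt::clock::now();
    const auto dst_now = dsttimepointt::clock::now();
    return dst_now + (tp - src_now);
  }

  // Same idea as clock_cast_1st: bracket the destination sample and
  // use the midpoint of the two source samples.
  template <typename dsttimepointt, typename srctimepointt>
  dsttimepointt cast_midpoint(const srctimepointt tp)
  {
    const auto src_before = srctimepointt::clock::now();
    const auto dst_now = dsttimepointt::clock::now();
    const auto src_after = srctimepointt::clock::now();
    const auto src_now = src_before + (src_after - src_before) / 2;
    return dst_now + (tp - src_now);
  }
}
```

In this idealized setting, converting the source time point 42 with cast_sequential yields 1043 (one tick too late), while cast_midpoint yields exactly 1042. On real hardware the delays before and after the middle call are not equal, so the midpoint only cancels the systematic part of the error.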
Next, I tried calling the above code in a loop that would pick the best value for src_diff.

    template
    <
      typename dsttimepointt,
      typename srctimepointt,
      typename dstdurationt = typename dsttimepointt::duration,
      typename srcdurationt = typename srctimepointt::duration,
      typename dstclockt = typename dsttimepointt::clock,
      typename srcclockt = typename srctimepointt::clock
    >
    dsttimepointt
    clock_cast_2nd(const srctimepointt tp,
                   const srcdurationt tolerance = std::chrono::nanoseconds {100},
                   const int limit = 10)
    {
      assert(limit > 0);
      auto itercnt = 0;
      auto src_now = srctimepointt {};
      auto dst_now = dsttimepointt {};
      auto epsilon = detail::max_duration<srcdurationt>();
      do
        {
          const auto src_before = srcclockt::now();
          const auto dst_between = dstclockt::now();
          const auto src_after = srcclockt::now();
          const auto src_diff = src_after - src_before;
          const auto delta = detail::abs_duration(src_diff);
          // Remember the sample with the narrowest bracket seen so far.
          if (delta < epsilon)
            {
              src_now = src_before + src_diff / 2;
              dst_now = dst_between;
              epsilon = delta;
            }
          if (++itercnt >= limit)
            break;
        }
      while (epsilon > tolerance);
    #ifdef global_iteration_counter
      global_iteration_counter = itercnt;
    #endif
      return dst_now + (tp - src_now);
    }

The function takes two additional optional parameters to specify the desired accuracy and the maximum number of iterations, and returns the current-best value when either condition becomes true.
I'm using the following two straightforward helper functions in the above code.

    namespace detail
    {
      // Largest representable value of the given duration type.
      template <typename durationt, typename reprt = typename durationt::rep>
      constexpr durationt
      max_duration() noexcept
      {
        return durationt {std::numeric_limits<reprt>::max()};
      }

      // Absolute value of a duration.
      template <typename durationt>
      constexpr durationt
      abs_duration(const durationt d) noexcept
      {
        return durationt {(d.count() < 0) ? -d.count() : d.count()};
      }
    }

The error distribution is now very symmetric around zero and the magnitude of the error dropped by as much as a factor of 100.
I was curious how often the iteration would run on average, so I added the #ifdef to the code and #defined it to the name of a global static variable that the main function would print out. (Note that we collect two iteration counts per experiment, so this histogram has a sample size of 100 000.)
The histogram for the contended case, on the other hand, seems more uniform. I have no explanation for this and would have expected the opposite.
As it seems, we almost always hit the iteration count limit (but that's okay) and only sometimes return early. The shape of this histogram can of course be influenced by altering the values of tolerance and limit passed to the function.
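The benefit of keeping the sample with the narrowest bracket can be illustrated with a scripted timeline. This is only a sketch under stated assumptions: the clocks, the cost table and the names cast_best_of and reset_timeline are hypothetical, the tolerance parameter is left out, and the source clock is assumed to be monotonic so that no absolute value is needed. The first measurement is "interrupted" (the second call takes nine ticks), the second one is not.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <ratio>

namespace sketch
{
  // Hypothetical scripted timeline: the i-th call to now() on either
  // clock reads the current time and then takes cost[i % 6] ticks.
  static long long current_time = 0;
  static std::size_t call_count = 0;

  inline void reset_timeline() { current_time = 0; call_count = 0; }

  inline long long sample_and_advance()
  {
    static const long long cost[] = {1, 9, 1, 1, 1, 1};
    const long long t = current_time;
    current_time += cost[call_count++ % 6];
    return t;
  }

  struct src_clock
  {
    using duration = std::chrono::duration<long long, std::nano>;
    using time_point = std::chrono::time_point<src_clock, duration>;
    static time_point now() { return time_point {duration {sample_and_advance()}}; }
  };

  // Same timeline with the epoch shifted by exactly 1000 ticks.
  struct dst_clock
  {
    using duration = std::chrono::duration<long long, std::nano>;
    using time_point = std::chrono::time_point<dst_clock, duration>;
    static time_point now() { return time_point {duration {1000 + sample_and_advance()}}; }
  };

  // Take `limit` bracketed samples and keep the one with the narrowest bracket.
  template <typename dsttimepointt, typename srctimepointt>
  dsttimepointt cast_best_of(const srctimepointt tp, const int limit)
  {
    assert(limit > 0);
    using srcclockt = typename srctimepointt::clock;
    using dstclockt = typename dsttimepointt::clock;
    using srcdurationt = typename srctimepointt::duration;
    auto src_now = srctimepointt {};
    auto dst_now = dsttimepointt {};
    auto best = srcdurationt::max();
    for (int i = 0; i < limit; ++i)
      {
        const auto src_before = srcclockt::now();
        const auto dst_between = dstclockt::now();
        const auto src_after = srcclockt::now();
        const auto window = src_after - src_before;
        if (window < best)
          {
            best = window;
            src_now = src_before + window / 2;
            dst_now = dst_between;
          }
      }
    return dst_now + (tp - src_now);
  }
}
```

In this scripted scenario, a single iteration converts the source time point 42 to 1038 (four ticks off), whereas two iterations pick the undisturbed second sample and return exactly 1042.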
Finally, I thought I could be clever and, instead of looking at src_diff, use the round-trip error directly as a quality criterion.

    template
    <
      typename dsttimepointt,
      typename srctimepointt,
      typename dstdurationt = typename dsttimepointt::duration,
      typename srcdurationt = typename srctimepointt::duration,
      typename dstclockt = typename dsttimepointt::clock,
      typename srcclockt = typename srctimepointt::clock
    >
    dsttimepointt
    clock_cast_3rd(const srctimepointt tp,
                   const srcdurationt tolerance = std::chrono::nanoseconds {100},
                   const int limit = 10)
    {
      assert(limit > 0);
      auto itercnt = 0;
      auto current = dsttimepointt {};
      auto epsilon = detail::max_duration<srcdurationt>();
      do
        {
          // Convert there and back again and measure the round-trip error.
          const auto dst = clock_cast_0th<dsttimepointt>(tp);
          const auto src = clock_cast_0th<srctimepointt>(dst);
          const auto delta = detail::abs_duration(src - tp);
          if (delta < epsilon)
            {
              current = dst;
              epsilon = delta;
            }
          if (++itercnt >= limit)
            break;
        }
      while (epsilon > tolerance);
    #ifdef global_iteration_counter
      global_iteration_counter = itercnt;
    #endif
      return current;
    }

It turns out that this was not such a good idea.
We have moved back to a non-symmetric error distribution again and the magnitude of the error has also increased. (While the function also became more expensive!) Actually, the histogram for the idle case just looks weird. Could it be that the spikes correspond to how often we are interrupted? This doesn't actually make sense.
The iteration frequency shows the same trend as before.
In conclusion, I would recommend using the 2nd approach, and I think that the default values for the optional parameters are reasonable, but of course this is something that may vary from machine to machine. Howard Hinnant has commented that a limit of only four iterations worked well for him.
If you implement this for real, you wouldn't want to miss the optimization opportunity to check whether std::is_same<srcclockt, dstclockt>::value is true and, in that case, simply apply std::chrono::time_point_cast without ever calling any now function (and thus without introducing any error).
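One way to express this check in C++14 is tag dispatch on std::is_same. The following is a minimal sketch, not the author's implementation. The names clock_cast_checked and clock_cast_impl are hypothetical, and the fallback for different clocks simply uses the sequential sampling of the base version where any of the refined approaches could be used instead.

```cpp
#include <cassert>
#include <chrono>
#include <type_traits>

namespace sketch
{
  // Same clock: only the duration type may differ, so no clock is sampled
  // and the conversion introduces no measurement error.
  template <typename dsttimepointt, typename srctimepointt>
  dsttimepointt
  clock_cast_impl(const srctimepointt tp, std::true_type)
  {
    using dstdurationt = typename dsttimepointt::duration;
    return std::chrono::time_point_cast<dstdurationt>(tp);
  }

  // Different clocks: fall back to sampling both clocks back to back.
  template <typename dsttimepointt, typename srctimepointt>
  dsttimepointt
  clock_cast_impl(const srctimepointt tp, std::false_type)
  {
    using dstdurationt = typename dsttimepointt::duration;
    const auto src_now = srctimepointt::clock::now();
    const auto dst_now = dsttimepointt::clock::now();
    return std::chrono::time_point_cast<dstdurationt>(dst_now + (tp - src_now));
  }

  // Hypothetical entry point: picks the overload at compile time.
  template <typename dsttimepointt, typename srctimepointt>
  dsttimepointt
  clock_cast_checked(const srctimepointt tp)
  {
    using same = typename std::is_same<typename dsttimepointt::clock,
                                       typename srctimepointt::clock>::type;
    return clock_cast_impl<dsttimepointt>(tp, same {});
  }
}
```

With this dispatch, casting a system_clock::time_point to system_clock::time_point returns the input unchanged, and casting to a coarser time point of the same clock behaves like a plain time_point_cast.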
In case you want to repeat my experiments, I'm providing the full code here. The clock_castXYZ code is already complete. (Just concatenate all examples together into one file, #include the obvious headers and save it as clock_cast.hxx.)
Here is the actual main.cxx that I used.

    #include <iomanip>
    #include <iostream>

    #ifdef global_iteration_counter
    static int global_iteration_counter;
    #endif

    #include "clock_cast.hxx"

    int
    main()
    {
      using namespace std::chrono;
      const auto now = system_clock::now();
      const auto steady_now = clock_cast<steady_clock::time_point>(now);
    #ifdef global_iteration_counter
      std::cerr << std::setw(8) << global_iteration_counter << '\n';
    #endif
      const auto system_now = clock_cast<system_clock::time_point>(steady_now);
    #ifdef global_iteration_counter
      std::cerr << std::setw(8) << global_iteration_counter << '\n';
    #endif
      const auto diff = system_now - now;
      std::cout << std::setw(8) << duration_cast<nanoseconds>(diff).count() << '\n';
    }

The following GNUmakefile builds and runs everything.
    cxx = g++ -std=c++14
    cppflags = -Dglobal_iteration_counter=global_counter
    cxxflags = -Wall -Wextra -Werror -pedantic -O2 -g

    runs = 50000
    cutoff = 0.999

    execfiles = zeroth.exe first.exe second.exe third.exe

    datafiles =                            \
      zeroth.dat                           \
      first.dat                            \
      second.dat second_iterations.dat     \
      third.dat third_iterations.dat

    picturefiles = ${datafiles:.dat=.png}

    all: ${picturefiles}

    zeroth.png: errors.gp zeroth.freq
    	tag='zeroth' title="0th approach ${subtitle}" micros=0 gnuplot $<

    first.png: errors.gp first.freq
    	tag='first' title="1st approach ${subtitle}" micros=0 gnuplot $<

    second.png: errors.gp second.freq
    	tag='second' title="2nd approach ${subtitle}" gnuplot $<

    second_iterations.png: iterations.gp second_iterations.freq
    	tag='second' title="2nd approach ${subtitle}" gnuplot $<

    third.png: errors.gp third.freq
    	tag='third' title="3rd approach ${subtitle}" gnuplot $<

    third_iterations.png: iterations.gp third_iterations.freq
    	tag='third' title="3rd approach ${subtitle}" gnuplot $<

    zeroth.exe: main.cxx clock_cast.hxx
    	${cxx} -o $@ ${cppflags} -Dclock_cast='clock_cast_0th' ${cxxflags} $<

    first.exe: main.cxx clock_cast.hxx
    	${cxx} -o $@ ${cppflags} -Dclock_cast='clock_cast_1st' ${cxxflags} $<

    second.exe: main.cxx clock_cast.hxx
    	${cxx} -o $@ ${cppflags} -Dclock_cast='clock_cast_2nd' ${cxxflags} $<

    third.exe: main.cxx clock_cast.hxx
    	${cxx} -o $@ ${cppflags} -Dclock_cast='clock_cast_3rd' ${cxxflags} $<

    %.freq: binput.py %.dat
    	python $^ ${cutoff} > $@

    ${datafiles}: ${execfiles}
    	${SHELL} -eu run.sh ${runs} $^

    clean:
    	rm -f *.exe *.dat *.freq *.png

    .PHONY: clean

The auxiliary run.sh script is rather simple. As an improvement over an earlier version of this answer, I am now executing the different programs in the inner loop so as to be more fair and maybe also to better get rid of caching effects.
    #! /bin/bash -eu

    n="$1"
    shift

    # Start from empty output files.
    for exe in "$@"
    do
        name="${exe%.exe}"
        rm -f "${name}.dat" "${name}_iterations.dat"
    done

    i=0
    while [ $i -lt $n ]
    do
        for exe in "$@"
        do
            name="${exe%.exe}"
            "./${exe}" 1>>"${name}.dat" 2>>"${name}_iterations.dat"
        done
        i=$(($i + 1))
    done

And I also wrote the binput.py script because I couldn't figure out how to do the histograms in Gnuplot alone.
    #! /usr/bin/python3

    import sys
    import math

    def main():
        cutoff = float(sys.argv[2]) if len(sys.argv) >= 3 else 1.0
        with open(sys.argv[1], 'r') as istr:
            values = sorted(list(map(float, istr)), key=abs)
        # Drop the samples with the largest magnitude.
        if cutoff < 1.0:
            values = values[:int((cutoff - 1.0) * len(values))]
        min_val = min(values)
        max_val = max(values)
        binsize = 1.0
        if max_val - min_val > 50:
            binsize = (max_val - min_val) / 50
        bins = int(1 + math.ceil((max_val - min_val) / binsize))
        histo = [0 for i in range(bins)]
        print("minimum: {:16.6f}".format(min_val), file=sys.stderr)
        print("maximum: {:16.6f}".format(max_val), file=sys.stderr)
        print("binsize: {:16.6f}".format(binsize), file=sys.stderr)
        for x in values:
            idx = int((x - min_val) / binsize)
            histo[idx] += 1
        for (i, n) in enumerate(histo):
            value = min_val + i * binsize
            frequency = n / len(values)
            print('{:16.6e} {:16.6e}'.format(value, frequency))

    if __name__ == '__main__':
        main()

Finally, here are the errors.gp …
    tag = system('echo ${tag-hist}')
    file_hist = sprintf('%s.freq', tag)
    file_plot = sprintf('%s.png', tag)
    micros_eh = 0 + system('echo ${micros-0}')

    set terminal png size 600,450
    set output file_plot

    set title system('echo ${title-errors}')

    if (micros_eh) { set xlabel "error / µs" } else { set xlabel "error / ns" }
    set ylabel "relative frequency"

    set xrange [* : *]
    set yrange [1.0e-5 : 1]

    set log y
    # %T prints the power of ten (lowercase %t would print the mantissa).
    set format y '10^{%T}'
    set format x '%g'

    set style fill solid 0.6

    factor = micros_eh ? 1.0e-3 : 1.0
    plot file_hist using (factor * $1):2 with boxes notitle lc '#cc0000'

… and iterations.gp scripts.
    tag = system('echo ${tag-hist}')
    file_hist = sprintf('%s_iterations.freq', tag)
    file_plot = sprintf('%s_iterations.png', tag)

    set terminal png size 600,450
    set output file_plot

    set title system('echo ${title-iterations}')
    set xlabel "iterations"
    set ylabel "frequency"

    set xrange [0 : *]
    set yrange [1.0e-5 : 1]

    set xtics 1
    set xtics add ('' 0)

    set log y
    # %T prints the power of ten (lowercase %t would print the mantissa).
    set format y '10^{%T}'
    set format x '%g'

    set boxwidth 1.0
    set style fill solid 0.6

    plot file_hist using 1:2 with boxes notitle lc '#3465a4'










