MapReduce - Distribute processing to a cluster of heterogeneous compute nodes, taking relative performance and cost of communication into account?
Given a cluster of heterogeneous compute nodes, how is it possible to distribute processing across them while taking into account both their relative performance and the cost of passing messages between them? (I know optimising this is NP-complete in general.) Which concurrency platforms best support this?
You might rephrase/summarise the question as:
What algorithms make the most efficient use of CPU, memory and communications resources for distributed computation in theory, and which existing (open source) platforms come closest to realising this? This depends on the workload, so understanding the trade-offs is critical.
Some background
I find that some people on S/O want to understand the background so that they can provide a more specific answer, so I've included quite a bit below, but it is not necessary to the essence of the question.
A typical scenario I see is:
We have an application which runs on X nodes, each with Y cores. We start with a homogeneous cluster. Every so often the operations team buys one or more new servers. The new servers are faster and may have more cores. They are integrated into the cluster to make things run faster. Some older servers may be re-purposed, but the new cluster now contains machines with different performance characteristics. The cluster is no longer homogeneous but has more compute power overall. I believe this scenario must be standard in big cloud data centres as well. It's how this kind of change in infrastructure can best be utilised that I'm interested in.
In one application I work with, the work is divided into a number of relatively long tasks. Tasks are allocated to logical processors (we have one per core) as they become available. While there are tasks to perform, cores are not left unoccupied, so for the most part the jobs can be classified as "embarrassingly scalable".
This particular application is C++ with a roll-your-own concurrency platform using ssh and NFS for large tasks. I'm considering the arguments for various alternative approaches. Some parties prefer the various Hadoop map/reduce options. I'm wondering how they shape up versus more C++/machine-oriented approaches such as OpenMP and Cilk++. I'm more interested in the pros and cons than in the answer for that specific case.
The task model itself seems scalable and sensible independent of platform. So, I'm assuming a model where you divide the work into tasks and a (probably distributed) scheduler tries to decide which processor to allocate each task to. I am open to alternatives. There could be task queues for each node, possibly for each processor, and idle processors should be allowed to steal work (e.g. from processors with long queues).
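To make the per-processor queue and work-stealing idea concrete, here is a minimal sketch in Python. It is a toy discrete-event simulation, not how any particular platform does it. The function name simulate_work_stealing, the "steal from the back of the longest queue" policy and the speed figures are all assumptions made up for illustration, and stealing is treated as free (no communication cost).

```python
from collections import deque
import heapq


def simulate_work_stealing(queues, speeds):
    """Simulate per-worker task queues with work stealing.

    queues: one list of task costs (work units) per worker (hypothetical input).
    speeds: work units per second for each worker (assumed known).
    Returns (makespan_in_seconds, tasks_completed_per_worker).
    """
    local = [deque(q) for q in queues]
    done = [0] * len(local)
    # Min-heap of (time the worker becomes idle, worker id); all idle at t=0.
    idle_at = [(0.0, w) for w in range(len(local))]
    heapq.heapify(idle_at)
    makespan = 0.0
    while idle_at:
        now, w = heapq.heappop(idle_at)
        if local[w]:
            cost = local[w].popleft()      # own work: take from the front
        else:
            # Idle worker: pick the longest queue as the victim.
            victim = max(range(len(local)), key=lambda v: len(local[v]))
            if not local[victim]:
                continue                   # no work left anywhere, worker retires
            cost = local[victim].pop()     # steal from the back of that queue
        finish = now + cost / speeds[w]
        done[w] += 1
        makespan = max(makespan, finish)
        heapq.heappush(idle_at, (finish, w))
    return makespan, done


if __name__ == "__main__":
    # Six equal tasks all queued on a slow worker; the second worker is 2x faster.
    print(simulate_work_stealing([[2] * 6, []], [1.0, 2.0]))
```

In this example the faster worker ends up completing twice as many tasks as the slower one without the scheduler ever being told the relative speeds, which is the attraction of stealing on a heterogeneous cluster. What the sketch ignores is exactly the cost of moving a stolen task's data between nodes.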
However, when I look at the various models of high performance and cloud cluster computing, I don't see this discussed much.
Michael Wong classifies parallelism, ignoring Hadoop, into two main camps (starting around 14 min in, https://isocpp.org/blog/2016/01/the-landscape-of-parallelism-michael-wong-meetingcpp-2015): HPC and multi-threaded applications in industry.
The HPC community seems to favour OpenMP on a cluster of identical nodes. This may still be heterogeneous in the sense that each node supports CUDA or has FPGA support, but each node tends to be identical. If that's the case, do they upgrade their data centres in a big bang or what? (E.g. supercomputer 1 = 100 nodes of type X; supercomputer v2.0 is on a different site with 200 nodes of type Y.)
OpenMP only supports a single physical computer by itself. The HPC community gets around this either by using MPI (which I consider too low level) or by creating a virtual machine from all the nodes using a hypervisor like ScaleMP or vNUMA (see, for example, "OpenMP program on different hosts"). (Does anyone know of a good open source hypervisor for doing this?) I believe these are still considered the most powerful computing systems in the world.
I find that surprising, as I don't see what prevents the map/reduce people from creating an even bigger cluster more easily, one that is much less efficient overall but wins on brute force due to the total number of cores utilised.
So which other concurrency platforms support heterogeneous nodes with widely varying characteristics, and how do they deal with the performance mismatch (and, similarly, the distribution of data)?
I'm excluding MPI as an option because, while powerful, it is too low-level. You might as well say "use sockets". A framework building on MPI would be acceptable (does X10 work this way?).
From the user's perspective, the map/reduce approach seems to be: add enough nodes that it doesn't matter, and don't worry about using them at maximum efficiency. Those details are kept under the hood in the implementation of the schedulers and distributed file systems. How and where is the cost of computation and message passing taken into account?
Is there any way in OpenMP (or your favourite concurrency platform) to make effective use of the information that this node is N times as fast as that node, and that the data transfer rate to or from this node is on average X MB/s?
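To illustrate the kind of information I mean, here is a rough sketch in Python of a greedy scheduler that uses exactly those two numbers: a relative speed per node and an average transfer rate per node. It is not the API of OpenMP or of any other platform. The names Node and schedule, the "largest task first, earliest estimated finish" heuristic and all figures are assumptions for illustration, and since the general problem is NP-complete this is only a heuristic.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    speed: float        # relative compute speed (1.0 = baseline node), assumed known
    mb_per_s: float     # average transfer rate to this node, assumed known
    free_at: float = 0.0  # time at which the node finishes its current work


def schedule(tasks, nodes):
    """Greedy placement by earliest estimated finish time.

    tasks: list of (task_id, work_seconds_on_baseline_node, input_mb).
    nodes: list of Node; their free_at fields are updated in place.
    Returns ({task_id: node_name}, estimated_makespan_seconds).
    """
    placement = {}
    # Place the largest tasks first (stable sort keeps input order on ties).
    for task_id, work, input_mb in sorted(tasks, key=lambda t: -t[1]):
        def finish(n):
            # Estimated finish = wait for node + ship input + compute.
            return n.free_at + input_mb / n.mb_per_s + work / n.speed

        best = min(nodes, key=finish)
        best.free_at = finish(best)
        placement[task_id] = best.name
    return placement, max(n.free_at for n in nodes)


if __name__ == "__main__":
    cluster = [Node("fast", speed=2.0, mb_per_s=100.0),
               Node("slow", speed=1.0, mb_per_s=10.0)]
    jobs = [("a", 10.0, 100.0), ("b", 10.0, 100.0), ("c", 4.0, 0.0)]
    print(schedule(jobs, cluster))
```

Here the two data-heavy tasks both land on the fast, well-connected node even though it is busy, because shipping 100 MB to the slow node costs more than waiting, while the small task with no input goes to the slow node. The question is which platforms make this kind of decision for you.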
In YARN you have Dominant Resource Fairness (DRF): http://blog.cloudera.com/blog/2013/12/managing-multiple-resources-in-hadoop-2-with-yarn/ http://static.usenix.org/event/nsdi11/tech/full_papers/ghodsi.pdf This covers memory and cores using Linux control groups, but it does not yet cover disk and network I/O resources.
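For readers who haven't met DRF, here is a minimal sketch in Python of the allocation idea as I understand it from the paper: repeatedly offer the next task to the user with the smallest dominant share, where the dominant share is the largest fraction of any single resource that user holds. This is not YARN's implementation. The function name drf_allocate and the resource names are made up, and as a simplification a user whose next task no longer fits is simply dropped from consideration rather than the loop stopping outright.

```python
def drf_allocate(capacity, demands):
    """Toy Dominant Resource Fairness allocation.

    capacity: {resource: total amount in the cluster}.
    demands:  {user: {resource: amount needed per task}}.
    Returns {user: number_of_tasks_granted}.
    """
    used = {r: 0.0 for r in capacity}
    tasks = {u: 0 for u in demands}
    active = set(demands)

    def dominant_share(u):
        # Largest fraction of any one resource currently held by user u.
        return max(tasks[u] * demands[u][r] / capacity[r] for r in capacity)

    while active:
        # sorted() makes tie-breaking deterministic (alphabetical by user).
        u = min(sorted(active), key=dominant_share)
        d = demands[u]
        if all(used[r] + d[r] <= capacity[r] for r in capacity):
            for r in capacity:
                used[r] += d[r]
            tasks[u] += 1
        else:
            active.discard(u)   # simplification: this user gets nothing more
    return tasks


if __name__ == "__main__":
    # 9 CPUs and 18 GB; user A runs memory-heavy tasks, user B CPU-heavy ones.
    print(drf_allocate({"cpu": 9, "mem_gb": 18},
                       {"A": {"cpu": 1, "mem_gb": 4},
                        "B": {"cpu": 3, "mem_gb": 1}}))
```

With these numbers A ends up with three tasks and B with two, so both hold two thirds of their dominant resource (memory for A, CPU for B). Note that nothing in this scheme knows about node speed or network cost, which is the gap I'm asking about.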
Are there equivalent or better approaches in other concurrency platforms? How do they compare to DRF?
Which concurrency platforms handle this best, and why? Are there any popular ones that are likely to be evolutionary dead ends? OpenMP keeps surprising me by actively thriving. Could Cilk++ be made to scale this way?
Apologies in advance for combining several PhD theses' worth of questions into one. I'm looking for tips on what to look for in further reading, and advice on which platforms to investigate further (from the programmer's perspective). A summary of platforms to investigate and/or links to papers or articles would suffice as a useful answer.