MapReduce: Difference between revisions

m →‎References: replaced: {{Google Inc.}} → {{Google LLC}}
OAbot (talk | contribs)
m Open access bot: doi updated in citation with #oabot.
Line 5:
 
The model is a specialization of the ''split-apply-combine'' strategy for data analysis.<ref>{{Cite journal | doi = 10.18637/jss.v040.i01| title = The split-apply-combine strategy for data analysis| journal = Journal of Statistical Software| volume = 40| pages = 1–29| year = 2011| last1 = Wickham| first1 = Hadley | doi-access = free}}</ref>
It is inspired by the [[map (higher-order function)|map]] and [[reduce (higher-order function)|reduce]] functions commonly used in [[functional programming]],<ref name="map">"Our abstraction is inspired by the map and reduce primitives present in Lisp and many other functional languages." From [http://research.google.com/archive/mapreduce.html "MapReduce: Simplified Data Processing on Large Clusters"], by Jeffrey Dean and Sanjay Ghemawat; from Google Research</ref> although their purpose in the MapReduce framework is not the same as in their original forms.<ref>{{Cite journal | doi = 10.1016/j.scico.2007.07.001| title = Google's Map ''Reduce'' programming model — Revisited| journal = Science of Computer Programming| volume = 70| pages = 1–30| year = 2008| last1 = Lämmel | first1 = R. | doi-access = free}}</ref> The key contributions of the MapReduce framework are not the map and reduce functions themselves (which, for example, resemble the 1995 [[Message Passing Interface]] standard's<ref>[http://www.mcs.anl.gov/research/projects/mpi/mpi-standard/mpi-report-2.0/mpi2-report.htm MPI 2 standard]</ref> ''reduce''<ref>{{cite web|url=http://mpitutorial.com/tutorials/mpi-reduce-and-allreduce/|title=MPI Reduce and Allreduce · MPI Tutorial|website=mpitutorial.com}}</ref> and ''scatter''<ref>{{cite web|url=http://mpitutorial.com/tutorials/performing-parallel-rank-with-mpi/|title=Performing Parallel Rank with MPI · MPI Tutorial|website=mpitutorial.com}}</ref> operations), but the scalability and fault tolerance that an optimized execution engine achieves across a variety of applications.{{Citation needed|reason=This claim needs a reliable source.|date=February 2019}} As such, a [[single-threaded]] implementation of MapReduce is usually not faster than a traditional (non-MapReduce) implementation; any gains are usually seen only with [[multi-threaded]] implementations on multi-processor hardware.<ref name=stackoverflow>{{cite web
| url = https://stackoverflow.com/questions/3947889/mongodb-terrible-mapreduce-performance
| title = MongoDB: Terrible MapReduce Performance
Line 180:
| first = Leonidas
| s2cid = 44629767
| doi-access = free
}}</ref><ref>{{cite arXiv
|last=Lin