H2O on Hadoop
H2O is an open source math and machine learning engine for big data. It brings distribution and parallelism to powerful algorithms while keeping the widely used languages of R and JSON as an API. H2O provides an elegant, Lego-like infrastructure that applies fine-grained parallelism to math over simple distributed arrays. Users can read data stored in HDFS directly as a data source. H2O is a first-class citizen of the Hadoop infrastructure and interacts naturally with the Hadoop JobTracker and TaskTrackers on all major distributions.