Hadoop: How to Set Up a Single Node
This post walks through the steps needed to get a single-node Hadoop installation running, so that you can try out simple operations with Hadoop MapReduce and the Hadoop Distributed File System (HDFS).
Prerequisites
GNU/Linux is supported as a development and production platform; Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes. Win32 is supported as a development platform only: distributed operation has not been well tested on Win32, so it is not supported as a production platform.
ssh and rsync must also be installed. On Debian/Ubuntu:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
Installation
Download a recent stable release from one of the Apache Download Mirrors.
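For example (the mirror URL and release number below are placeholders, not a recommendation; substitute whatever release you actually downloaded):
$ wget http://<your-mirror>/hadoop/common/hadoop-X.Y.Z/hadoop-X.Y.Z.tar.gz
$ tar xzf hadoop-X.Y.Z.tar.gz
$ cd hadoop-X.Y.Z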
Then edit the file conf/hadoop-env.sh and point JAVA_HOME to the root of your Java installation. If you are not sure where Java is installed, check with:
$ echo $JAVA_HOME
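As a concrete sketch, on an Ubuntu machine with a Sun JDK 6 the line in conf/hadoop-env.sh might end up looking like this (the path is an assumption; use whatever your own system reports):
# conf/hadoop-env.sh -- the Java implementation to use
export JAVA_HOME=/usr/lib/jvm/java-6-sun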
Pseudo-Distributed Operation
Hadoop can also be run on a single node in pseudo-distributed mode, where each Hadoop daemon runs in a separate Java process. The full procedure is in the Single Node Setup guide on the Apache Hadoop site.
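As a rough sketch of what that guide asks for (the property names below match the 0.20/1.x releases that keep their configuration in a conf/ directory; newer releases use etc/hadoop/ and fs.defaultFS instead of fs.default.name), the pseudo-distributed setup amounts to three small XML files:

conf/core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

conf/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

conf/mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

After that, format the HDFS namenode once and start the daemons:
$ bin/hadoop namenode -format
$ bin/start-all.sh

In the 1.x line the NameNode web UI is then reachable at http://localhost:50070/ and the JobTracker at http://localhost:50030/.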
One of the steps requires that you can ssh to localhost:
$ ssh localhost
If that fails with the error
ssh: connect to host localhost port 22: Connection refused
then no SSH server is listening; on Debian/Ubuntu, install one with:
$ sudo apt-get install openssh-server
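The setup also expects passwordless ssh to localhost. If ssh localhost prompts for a password, generate a key and authorize it (this is the recipe the Apache guide of that era gives):
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys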