Instructions for Building Your ODROID C2 Compute Cluster

  1. Download the OS image from the hardkernel.com web site.
  2. The image comes compressed as an .xz file, so decompress it. On Windows, 7-Zip can do this. On Linux, run unxz the_file.img.xz (see man unxz for details).
  3. Flash the image onto two microSD cards. For Windows users, Win32 Disk Imager works. Hardkernel provides a link labelled "OS Image flashing tool for Windows" to their version of Win32 Disk Imager on their downloads page.
     
    Linux users can use dd to flash the image onto the microSD cards. Double-check the device name first (dmesg shows it after you insert the card), because dd overwrites whatever device you point it at. The commands below assume the card is /dev/sdb. If it's a brand new card, you should be able to just use the commands
    sudo dd if=/dev/zero of=/dev/sdb bs=1M count=10
    dd if=~/Desktop/ubuntu64-16.04lts-mate-odroid-c2-20160525.img | pv | sudo dd of=/dev/sdb
    Replace ~/Desktop/ubuntu64-16.04lts-mate-odroid-c2-20160525.img with the path to your image file. The second command uses pv to show progress, so pv must be installed. It also prompts for your sudo password but appears to keep going while you enter it, so make sure you actually enter the password. Once the commands finish and you have your command prompt back, type sync. When that's done, unmount/eject the card properly through the GUI or command line and then remove it.
     
    If your card is not brand new (that is, it has been written to before), you may need to do a little extra work. After flashing such a card, you will likely be unable to boot from it and will get an error like this:
    EXT4-fs (mmcblk0p2): bad geometry: block count 3856384 exceeds size of device (1003264 blocks)
    Put the card back in your Linux machine and you'll get a corresponding error message like this:
    EXT4-fs (sdb2): bad geometry: block count 3856384 exceeds size of device (1003264 blocks)
    Run the command sudo e2fsck -fy /dev/sdb2, then sync, and then eject the card properly and remove it. Now you should be able to boot from it.
  4. Once you have flashed the microSD cards, pop them into your systems.
  5. Connect each system to your network with an Ethernet cable, attach a keyboard, mouse, and monitor/TV, and power it on.
  6. Once Linux has booted, log on. The username and password are both odroid.
  7. At some point a screen will pop up prompting you to set up backup settings. I chose Don't Show Again.
  8. The software updater also pops up after a couple of minutes. I chose Install Now. Enter the password odroid when asked. Click Details so you can see what's happening. When the installer asks about replacing the boot.ini file and shows an OK option, press Enter; otherwise it seems to hang there. Reboot the nodes when it says to.
  9. Download and install all the software using the command sudo apt-get install gedit nfs-kernel-server nfs-common rpcbind mpich (source).
  10. Firefox crashes when you launch it, so use the Chromium web browser to grab the program below which comes from Blaise Barney's MPI Tutorial on the LLNL web site and save it as mpitest.c.
  11. As superuser, edit the /etc/hostname file (sudo vi /etc/hostname). For the top node in your cluster, make the text in that file top-master. For the bottom node in your cluster, make the text in that file bottom-slave.
  12. As superuser, edit the /etc/hosts file (sudo vi /etc/hosts). Make the file look like this:
    127.0.0.1         localhost
    192.168.1.100     top-master
    192.168.1.101     bottom-slave
  13. As superuser, edit the /etc/network/interfaces file (sudo vi /etc/network/interfaces). Make it look like this for the top-master node:
    auto eth0
    iface eth0 inet static
    address 192.168.1.100
    netmask 255.255.255.0
    gateway 192.168.1.100
     
    Make it look like this for the bottom-slave node:
    auto eth0
    iface eth0 inet static
    address 192.168.1.101
    netmask 255.255.255.0
    gateway 192.168.1.100

  14. Reboot the top-master system. When it comes back up, reboot the bottom-slave system. You may not have Internet access now, depending on how your network is configured. If your nodes are plugged into a switch that's connected to a home router, you may still have access, depending on how the router handles the computers. Either way, a node you have finished setting up doesn't need Internet access from this point on.
  15. At this point, all that's left is to configure NFS, set up passwordless SSH between your nodes, and test out MPI!
  16. On both nodes, run these two commands:
    sudo mkdir /sharedFiles
    sudo chown odroid:odroid /sharedFiles
  17. On the top-master node, run these two commands:
    echo "/sharedFiles *(rw,sync)" | sudo tee -a /etc/exports
    sudo service nfs-kernel-server restart
  18. On the bottom-slave node, as superuser, edit the /etc/fstab file (sudo vi /etc/fstab) and add this line:
    top-master:/sharedFiles    /sharedFiles    nfs
  19. On both nodes, run these commands:
    ssh-keygen -t rsa       (for this command, I accept the default file location and I use an empty passphrase)
    cd ~/.ssh
    cat id_rsa.pub >> authorized_keys
  20. On the top-master node, run this command:
    ssh-copy-id bottom-slave
  21. On the bottom-slave node, run this command:
    ssh-copy-id top-master
  22. Reboot the master node and once it comes up, reboot the slave node.
  23. From one of the nodes, do the rest of these instructions. Copy the mpitest.c file into /sharedFiles.
  24. cd /sharedFiles and then create a file called machines in that folder which contains these two lines:
    top-master:4
    bottom-slave:4
  25. Compile the mpitest.c file with the command mpicc mpitest.c -o hellompi
  26. Run the compiled program with the command mpirun -n 8 -f ./machines ./hellompi
  27. You should get 8 lines of output with processes 0 to 7, which should be split between the two different nodes. Note that the lines won't be in order!
  28. When you want to write and run your own MPI programs, create the source files in the /sharedFiles folder so that both nodes can see them.
  29. Want to add more nodes? Do everything you did for the bottom-slave node for the new node (but give it a different IP address and name), and make sure you propagate the ssh keys from the other systems to this one and from this one to the other ones, like you did in steps 20 and 21.
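
If you do add nodes, the /etc/hosts file from step 12 and the machines file from step 24 both need a new line per node. Here is a rough sketch of a shell script that generates both files from a single list of node names, so they stay in sync. It is only an illustration, not part of the original setup: the function names gen_hosts and gen_machines, the cluster-config output folder, and the third node name middle-slave are all made up for this example, and it assumes the new nodes simply take the next IP addresses after 192.168.1.100.

```shell
#!/bin/sh
# Sketch: generate hosts and machines files for the cluster from one node list.
# Assumptions (not from the guide): gen_hosts, gen_machines, cluster-config,
# and the node name "middle-slave" are hypothetical, chosen for illustration.

NODES="top-master bottom-slave middle-slave"  # first entry is the master
BASE_IP="192.168.1"   # network prefix used in the guide
FIRST_HOST=100        # top-master is 192.168.1.100 in the guide
CORES=4               # processes per node, as in the guide's machines file

# Print the lines for /etc/hosts: localhost first, then one line per node.
gen_hosts() {
    echo "127.0.0.1         localhost"
    i=$FIRST_HOST
    for node in $NODES; do
        printf '%-17s %s\n' "$BASE_IP.$i" "$node"
        i=$((i + 1))
    done
}

# Print the lines for the MPI machines file in name:count form.
gen_machines() {
    for node in $NODES; do
        echo "$node:$CORES"
    done
}

# Write into a scratch folder instead of touching /etc/hosts directly.
OUT_DIR="${OUT_DIR:-./cluster-config}"
mkdir -p "$OUT_DIR"
gen_hosts > "$OUT_DIR/hosts"
gen_machines > "$OUT_DIR/machines"
echo "Wrote $OUT_DIR/hosts and $OUT_DIR/machines"
```

After checking the output, you would copy the generated hosts file over /etc/hosts on every node as superuser and put the machines file in /sharedFiles. The script deliberately writes to a scratch folder so that a typo in the node list can't break name resolution on a working node. Each new node still needs its own static IP address in /etc/network/interfaces and the ssh key exchange described above.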