Big Data Software

This page contains information about the software used during the course.

Torch7

Torch7 is an interactive development environment for machine learning and computer vision. It is an extension of the Lua language with a multidimensional numerical array library. Lua is a very simple, compact, and efficient interpreter/compiler with a straightforward syntax. It is widely used as a scripting language in the computer game industry. Torch extends Lua with an extensive numerical library and various facilities for machine learning and computer vision. Torch has computational back-ends for multicore/multi-CPU machines (using Intel/AVX and OpenMP), NVIDIA GPUs (using CUDA), and ARM CPUs (using the Neon instruction set). Many research projects at the CILVR Lab are built with Torch. The main developers and maintainers of Torch are Ronan Collobert (IDIAP, Switzerland), Clément Farabet (NYU/CILVR), and Koray Kavukcuoglu (DeepMind Technologies).

Quick Installation of Torch on Ubuntu and Mac OS X

Type the following command in your shell:

    curl -s https://raw.github.com/clementfarabet/torchinstall/master/install | bash

This will download, compile, and install Torch (or update it if it is already installed). It will also download and install all the prerequisite packages (including Homebrew on Mac OS), as well as the commonly used optional Torch packages such as nnx, image, parallel, and a few others. Note that this script only works on Ubuntu and Mac OS.

Manual installation on Linux and Mac OS X

Go to http://www.torch.ch/manual/install/index and follow the instructions. Then go to http://www.torch.ch/manual/packages to see a list of optional packages, and install them with:

    torch-pkg install <package-name>

On Windows...

Torch7 is not completely supported on Windows yet. The simplest option is to download this Ubuntu virtual machine, which includes a pre-compiled, pre-installed version of Torch. The virtual machine requires VirtualBox.

Torch/Lua On-line resources

Torch Tutorial for Machine Learning

Under construction…

Vowpal Wabbit / All Reduce

Vowpal Wabbit (VW) is a library for fast machine learning, by John Langford. VW is the essence of speed in machine learning, able to learn from terafeature datasets with ease. Via parallel learning, it can exceed the throughput of any single machine network interface when doing linear learning, a first amongst learning algorithms. We primarily use the wiki on GitHub. A few useful starting points are:

Hadoop

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than relying on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.
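
The "simple programming models" mentioned above are easiest to picture with MapReduce, the model Hadoop is best known for. The following is a minimal sketch in plain Python of the map, shuffle, and reduce steps for a word count. It runs on a single machine and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions, not part of Hadoop.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, mimicking what a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: combine all counts emitted for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence on an in-memory list of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data needs big clusters", "data moves to computation"]
    print(word_count(sample))
```

In a real cluster, the mapper and reducer would run on many machines at once, and the framework would handle the shuffle step and any machine failures. Here everything runs in one process, only to show the shape of the model.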