• People
  • Courses

Big Data Software

This page contains information about the software used during the course.

Torch7

Torch7 is an interactive development environment for machine learning and computer vision. It is an extension of the Lua language with a multidimensional numerical array library.

Lua is a very simple, compact and efficient interpreter/compiler with a straightforward syntax. It is used widely as a scripting language in the computer game industry. Torch extends Lua with an extensive numerical library and various facilities for machine learning and computer vision.

Torch has computational back-ends for multicore/multi-CPU machines (using Intel/AVX and OpenMP), NVidia GPUs (using CUDA), and ARM CPUs (using the Neon instruction set).

Many research projects at the CILVR Lab are built with Torch.

The main developers and maintainers of Torch are Ronan Collobert (IDIAP, Switzerland), Clément Farabet (NYU/CILVR), and Koray Kavukcuoglu (DeepMind Technologies).

Quick Installation of Torch on Ubuntu and Mac OSX

Type the following command in your shell:

sudo curl -s https://raw.github.com/clementfarabet/torchinstall/master/install-all | bash

and type your sudo password when prompted.

This line will download, compile, and install Torch (or update it if you have installed it otherwise). This will also download and install all the pre-requisite packages (including homebrew on Mac OS).

This script will also install all the commonly-used optional Torch packages.

Note that this script will only work on Ubuntu and Mac OS.

If you don't have sudo privileges, you can type the above without the leading “sudo”, but you need to manually install the required packaged beforehand.

For more details go to https://github.com/clementfarabet/torchinstall.

Manual installation on Linux and Mac OSX

Go to http://www.torch.ch/manual/install/index and follow instructions.

Then go to http://www.torch.ch/manual/packages to see a list of optional packages, and install them with:

torch-pkg install <package-name>
On Windows...

Torch7 is not completely supported on Windows yet. The simplest thing is to download this Ubuntu virtual machine, which includes a pre-compiled, pre-installed version of Torch. This virtual machine requires VirtualBox.

Torch/Lua On-line resources

Torch Tutorial for Machine Learning

Under construction…

2013/02/18 13:54 · Yann LeCun

Vowpal Wabbit / All Reduce

Vowpal Wabbit library for fast machine learning, by John Langford.

VW is the essence of speed in machine learning, able to learn from terafeature datasets with ease. Via parallel learning, it can exceed the throughput of any single machine network interface when doing linear learning, a first amongst learning algorithms.

We primarily use the wiki off github. A few useful starting points are:

Hadoop

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

Hadoop Website

 
 
/srv/www/cilvr/htdocs/data/pages/courses/bigdata/software/start.txt · Last modified: 2013/02/18 13:55 by yann
Recent changes RSS feed Creative Commons License Valid XHTML 1.0 Valid CSS Driven by DokuWiki
Drupal Garland Theme for Dokuwiki