
OverFeat: Object Recognizer, Feature Extractor

Overview

OverFeat is a Convolutional Network-based image features extractor and classifier.

The underlying system is described in the following paper:

Pierre Sermanet, David Eigen, Xiang Zhang, Michael Mathieu, Rob Fergus, Yann LeCun: “OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks”, International Conference on Learning Representations (ICLR 2014), April 2014. (OpenReview.net), (arXiv:1312.6229), (BibTeX).

It is primarily a feature extractor, in the sense that no data augmentation for better classification is provided. Different data augmentation schemes can be implemented to improve classification results; see the OverFeat paper for more details.

OverFeat was trained on the ImageNet dataset and participated in the ImageNet 2013 competition.

This package allows researchers to use OverFeat to recognize images and extract features.

A library with C++ source code is provided for running the OverFeat convolutional network, together with wrappers in various scripting languages (Python, Lua, Matlab coming soon).

OverFeat was trained with the Torch7 package (http://www.torch.ch). This package provides tools to run the network in a standalone fashion; the training code is not part of this package.

For questions or reporting problems, please visit the overfeat group: https://groups.google.com/forum/#!forum/overfeat

Code

Code Package: The code is available for download as OverFeat-v04-2.tgz.

Network Files: OverFeat requires weight files (over 1GB download). The script download_weights.py in the archive downloads them automatically; they can also be downloaded manually here.

Git Repository: A git repository is also available, to make it easy to keep up with revisions and updates. The git version may sometimes contain minor corrections not yet in the tarball distribution. The git repository config files are also included in the archive, so you can update the code by simply typing

git pull

from the overfeat directory.

See file CHANGELOG in the archive for the list of changes.

* 05/12/2014 : version 4.2 released. Fix bug in feature extractor on GPU.
* 03/25/2014 : version 4.1 released. Fix compilation bug for python API, and missing file for cuda C++ API.
* 02/28/2014 : version 4.0 released. No more dependency on torch. Experimental GPU code.
* 02/12/2014 : version 3.6 released. Fixed a bug in ppm image parsing.
* 01/23/2014 : version 3.5 released. Fixed a minor bug with LD_LIBRARY_PATH, and added a git repository.
* 01/17/2014 : version 3.4 released. Added 32 bits binaries for linux and fixed output layer with option -f.
* 01/16/2014 : version 3.3 released. Improved overfeat script, weight files in a separate archive.
* 01/15/2014 : version 3.2 released. Fixed bug in python wrapper.
* 01/14/2014 : version 3.1 released. Fixed minor bug in overfeat script.
* 01/06/2014 : version 3.0 released. Python wrapper, Mac OS binaries, bug fixes.

Installation

Basic setup

Download the archive from the link above.

Extract the files:

tar -xvzf overfeat-vXX.tgz
cd overfeat
python download_weights.py

A git repository is included in the archive. To keep up to date, type (git is required):

git pull

Precompiled binaries are available for Linux (64-bit and 32-bit Ubuntu) in overfeat/bin. They depend only on Python and ImageMagick, which come with most Linux distributions.

Important note: OverFeat compiled from source on your computer will run faster than the pre-compiled binaries.

A simple test of the pre-compiled binaries can be done with

./bin/YOUROS/overfeat -n 3 samples/bee.jpg

where YOUROS is linux_64, linux_32 or macos.

BLAS

Overfeat can run without BLAS, but it would be very slow. On Mac OS, Accelerate is already available, so no further installation is required. On Linux, however, we strongly advise you to install OpenBLAS. On Ubuntu/Debian you should compile it from source (it may take a while, but it is worth it):

sudo apt-get install build-essential gcc g++ gfortran git libgfortran3
git clone https://github.com/xianyi/OpenBLAS.git
cd OpenBLAS
make NO_AFFINITY=1 USE_OPENMP=1
sudo make install

On 32-bit Ubuntu, the libgfortran package sometimes does not create the correct symlink. If you have issues linking against libgfortran, locate where it is installed (for instance /usr/lib/i386-linux-gnu) and create the symlink yourself:

cd <folder_containing_libgfortran.so.3>
sudo ln -sf libgfortran.so.3 libgfortran.so

The precompiled binaries use BLAS. OpenBLAS from the package manager should also work, but will be slower. If you don't want to (or cannot) use BLAS, you must recompile overfeat.

Building from source

The webcam demo requires OpenCV. It can be installed on Ubuntu/Debian using apt-get:

sudo apt-get install python imagemagick libopencv-core2.3 libopencv-highgui2.3

On Mac OS, you must install the corresponding libraries either by hand or with your favorite package manager (MacPorts supports all the required libraries).

If you need to build the system from source, do the following:

# Install compiler
sudo apt-get install git g++ python imagemagick cmake
cd src
# Install torch :
sh install.sh
# Build overfeat
make all

To build webcam sample :

# install dependencies
sudo apt-get install pkg-config libopencv-dev libopencv-highgui-dev
make cam

High level interface

The feature extractor requires a weight file containing the weights of the network. We provide a weight file located in data/default/net_weight. The software should locate it automatically; if it doesn't, the option -d can be used to provide a path manually.

Overfeat can use two sizes of network. By default, it uses the smaller one. For more accuracy, the option -l can be used to use a larger, but slower, network.

Classification

In order to get the top N (by default, N=5) classes for a number of images:

./overfeat [-n <N>] [-d path_to_weights] [-l] path_to_image1 [path_to_image2 [path_to_image3 [... ] ] ]

To use overfeat online (on an image stream), feed its stdin with a sequence of PPM images, terminated by an end-of-file ('\0') character, and use option -p. For instance:

convert image1.jpg image2.jpg -resize 231x231 ppm:- | ./overfeat [-n <N>] -p

Please note that to get the classes from an image, the image size should be 231×231. The image will be cropped if one dimension is larger than 231, and the network will not work if both dimensions are larger. For feature extraction, the image can be any size greater than or equal to 231×231.
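As a sketch of the stream format, the following Python snippet builds a minimal binary PPM (P6) frame of the required 231×231 size. Piping its output into ./overfeat -p (the binary path is an assumption, shown commented out) would classify it as a one-frame stream:

```python
import subprocess

def make_ppm(width=231, height=231, rgb=(128, 128, 128)):
    """Build a solid-color binary PPM (P6) image, sized for overfeat -p."""
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    body = bytes(rgb) * (width * height)  # one RGB triplet per pixel
    return header + body

# Hypothetical usage (path and flags as documented above, binary not bundled here):
# proc = subprocess.run(["./overfeat", "-n", "3", "-p"],
#                       input=make_ppm(), capture_output=True)
# print(proc.stdout.decode())
```

The same bytes are what `convert image.jpg -resize 231x231 ppm:-` would emit for a real photograph.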

Feature extraction

In order to extract features, use the -f option. For instance:

./overfeat -f image1.png image2.jpg

It is compatible with option -p.

The option -L (overrides -f) can be used to return the output of any layer. For instance

./overfeat -L 12 image1.png

returns the output of layer 12. The option -f corresponds to layer 19 for the small network, and 22 for the large one.

The features are written to stdout as a sequence. Each feature starts with three integers separated by spaces: the number of feature maps (n), the number of rows (h), and the number of columns (w), followed by an end-of-line ('\n') character. Then follow n*h*w floating-point numbers (written in ASCII) separated by spaces. The feature map is the slowest-varying dimension (to reach the next feature map, add w*h to your index), followed by the row (to reach the next row, add w).
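A small Python reader for this output format, following the layout described above (the function name is ours, not part of the package):

```python
import io

def read_feature(stream):
    """Parse one feature blob: 'n h w\\n' followed by n*h*w ASCII floats."""
    n, h, w = (int(x) for x in stream.readline().split())
    values = []
    while len(values) < n * h * w:
        values.extend(float(x) for x in stream.readline().split())
    # values[k*h*w + i*w + j] is feature map k, row i, column j
    return n, h, w, values

# Example: a 2-map, 1x2 feature block
n, h, w, vals = read_feature(io.StringIO("2 1 2\n1.0 2.0 3.0 4.0\n"))
```

Because the stream can contain several features in a row (e.g. with -p), the function can simply be called repeatedly on the same stream.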

Webcam Demo

We provide a live classifier based on video from a video input device such as a webcam. It reads images from the webcam, and displays the most likely classes along with the probabilities. It can be run with

./bin/linux_64/webcam [-d <path_to_weights>] [-l] [-w <webcam_idx>]

Batch

We also provide an easy way to process a whole folder :

./bin/linux_64/overfeat_batch [-d <path_to_weights>] [-l] -i <input_dir> -o <output_dir>

It processes each image in the input folder and produces a corresponding file in the output directory, containing the features in the same format as before.

GPU pre-compiled binaries

We provide precompiled binaries to run overfeat on the GPU. Because the GPU code is not released yet, we do not provide its source for now. The GPU release is experimental and currently runs only on 64-bit Linux. It requires an Nvidia GPU with CUDA compute capability >= 2.0 (which covers all recent Nvidia GPUs).

You will need OpenBLAS to run the GPU binaries.

The binaries are located in bin/linux_64/cuda and work the same way as the CPU versions. The static library can also be linked the same way as the CPU version.

Important note: the present release requires CUDA toolkit 5.0. It will be updated to 5.5 (the latest version) in the next release.

Examples

Classify image samples/bee.jpg, getting the 3 most likely classes :

./bin/linux_64/overfeat -n 3 samples/bee.jpg

Extract features from samples/pliers.jpg :

./bin/linux_64/overfeat -f samples/pliers.jpg

Extract the features from all files in samples :

./bin/linux_64/overfeat_batch -i samples -o samples_features

Advanced

The real program is overfeatcmd; overfeat is only a python script calling overfeatcmd. overfeatcmd is not designed to be used by itself, but can be if necessary. It takes four arguments:

./bin/linux_64/overfeatcmd <path_to_weights> <N> <I> <L>

If <N> is positive, it is, as before, the number of top classes to display. If <N> is non-positive, the features are output instead. <L> specifies the layer from which the features are obtained (by default, <L>=16, corresponding to the last layer before the classifier). <I> is the size of the network: 0 for small, 1 for large.

APIs

C++

The library is written in C++. It consists of one static library named liboverfeat.a; the corresponding header is overfeat.hpp. It uses the low-level torch tensor library (TH). Sample code can be found in overfeatcmd.cpp and webcam.cpp.

The library provides several functions in the namespace overfeat (see the API definition in "overfeat.hpp"):

  • init : This function must be called once before using the feature extractor. It reads the weights and must be passed a path to the weight files. It must also be passed the size of the network (net_idx), which should be 0 for the small network or 1 for the large one. Note that the weight file must correspond to the size of the network.
  • free : This function releases the resources and should be called when the feature extractor is no longer used.
  • fprop : This is the main function. It takes a THTensor* and runs the network on it. The output corresponds to the output of the classifier. If the input is 3*H*W, the output is nClasses*h*w, where nClasses is 1000 in the default network, with h = ((H-11)/4+1)/8-6 and w = ((W-11)/4+1)/8-6 for the small network, and h = ((H-7)/2+1)/18-5 and w = ((W-7)/2+1)/18-5 for the large network. Each pixel of the output corresponds to a 231×231 window on the input for the small network, and 221×221 for the large one. Each class gets a score, but the scores are not probabilities (they are not normalized).
  • get_n_layers : Returns the total number of layers of the network.
  • get_output : Once fprop has been computed, this function returns the output of any layer. For instance, in the default network, layer 16 corresponds to the final features before the classifier.
  • soft_max : This function converts the output to probabilities. It only works if h = w = 1 (only one output pixel).
  • get_class_name : This function returns the string corresponding to the i-th class.
  • get_top_classes : Given an nClasses-element vector of scores or probabilities, this function returns the top n classes, along with their scores/probabilities.
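
The output-size formulas for fprop can be checked with a small sketch (integer division assumed, as the sizes are whole numbers):

```python
def fprop_output_size(H, W, large=False):
    """Spatial size (h, w) of the classifier output, per the formulas above."""
    if large:
        h = ((H - 7) // 2 + 1) // 18 - 5
        w = ((W - 7) // 2 + 1) // 18 - 5
    else:
        h = ((H - 11) // 4 + 1) // 8 - 6
        w = ((W - 11) // 4 + 1) // 8 - 6
    return h, w
```

At the minimum input sizes (231×231 small, 221×221 large) the output is a single 1×1 pixel of class scores; larger inputs yield a grid of window scores.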

When compiling code using liboverfeat.a, the code must also be linked against libTH.so, the tensor library. The file libTH.so will have been produced when compiling torch.

Torch7:

We have bindings for torch, in the directory API/torch. The file API/torch/README contains more details.

Python:

The python bindings are in the directory API/python. See API/python/README for more details.

Networks

The network architectures currently distributed with OverFeat are as follows:

  • 'fast' network (table 1 in http://arxiv.org/abs/1312.6229):
    • input 3x231x231
    • stage 1: convo: 11×11 stride 4×4; ReLU; maxpool: 2×2 stride 2×2; output (layer 3): 96x24x24
    • stage 2: convo: 5×5 stride 1×1; ReLU; maxpool: 2×2 stride 2×2; output (layer 6): 256x12x12
    • stage 3: convo: 3×3 stride 1×1 0-padded; ReLU; output (layer 9) 512x12x12
    • stage 4: convo: 3×3 stride 1×1 0-padded; ReLU; output (layer 12) 1024x12x12
    • stage 5: convo: 3×3 stride 1×1 0-padded; ReLU; maxpool: 2×2 stride 2×2; output (layer 16) 1024x6x6
    • stage 6: convo: 6×6 stride 1×1; ReLU; output (layer 18) 3072x1x1
    • stage 7: full; ReLU; output (layer 20) 4096x1x1
    • stage 8: full; output (layer 21) 1000x1x1
    • output stage: softmax; output (layer 22) 1000x1x1
  • 'accurate' network (table 2 in http://arxiv.org/abs/1312.6229):
    • input 3x221x221
    • stage 1: convo: 7×7 stride 2×2; ReLU; maxpool: 3×3 stride 3×3; output (layer 3): 96x36x36
    • stage 2: convo: 7×7 stride 1×1; ReLU; maxpool: 2×2 stride 2×2; output (layer 6): 256x15x15
    • stage 3: convo: 3×3 stride 1×1 0-padded; ReLU; output (layer 9) 512x15x15
    • stage 4: convo: 3×3 stride 1×1 0-padded; ReLU; output (layer 12) 512x15x15
    • stage 5: convo: 3×3 stride 1×1 0-padded; ReLU; output (layer 15) 1024x15x15
    • stage 6: convo: 3×3 stride 1×1 0-padded; ReLU; maxpool: 3×3 stride 3×3; output (layer 19) 1024x5x5
    • stage 7: convo: 5×5 stride 1×1; ReLU; output (layer 21) 4096x1x1
    • stage 8: full; ReLU; output (layer 23) 4096x1x1
    • stage 9: full; output (layer 24) 1000x1x1
    • output stage: softmax; output (layer 25) 1000x1x1
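The spatial sizes in the 'accurate' table above follow from the standard valid-convolution arithmetic, which can be sketched as follows (helper names are ours):

```python
def conv_out(size, kernel, stride=1, pad=0):
    # Output size of a valid convolution or pooling along one spatial dimension.
    return (size + 2 * pad - kernel) // stride + 1

def accurate_spatial_sizes(size=221):
    """Spatial size after each stage of the 'accurate' network (table 2)."""
    sizes = []
    size = conv_out(conv_out(size, 7, 2), 3, 3)     # stage 1: 7x7/2 conv + 3x3/3 pool
    sizes.append(size)
    size = conv_out(conv_out(size, 7, 1), 2, 2)     # stage 2: 7x7 conv + 2x2/2 pool
    sizes.append(size)
    for _ in range(3):                              # stages 3-5: 0-padded 3x3 convs
        size = conv_out(size, 3, 1, 1)
        sizes.append(size)
    size = conv_out(conv_out(size, 3, 1, 1), 3, 3)  # stage 6: padded conv + 3x3/3 pool
    sizes.append(size)
    size = conv_out(size, 5, 1)                     # stage 7: 5x5 conv
    sizes.append(size)
    return sizes
```

Running it on the 221×221 input reproduces the per-stage sizes listed in the table (36, 15, 15, 15, 15, 5, 1).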
 
 
Last modified: 2014/05/12 22:33 by mathieu