Speed and Memory Use

We've written programs using a number of different image processing systems to load a TIFF image, crop 100 pixels off every edge, shrink by 10% with bilinear interpolation, sharpen with a 3x3 convolution and save again. It's a trivial test, but it gives some idea of the speed and memory behaviour of these libraries (and it's also quite fun to compare the code).
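To make the crop and sharpen steps concrete, here is a minimal plain-Python sketch on a tiny greyscale image. It is not taken from any of the benchmarked libraries: the helper names `crop` and `sharpen` are made up for illustration, edge pixels are simply copied (real libraries handle borders in different ways), and the bilinear shrink is left out for brevity. The kernel and its divisor of 8 are the ones used by the implementations further down this page.

```python
# 3x3 sharpening mask used throughout this page. Its coefficients sum to 8,
# so dividing by 8 leaves flat areas of the image unchanged.
KERNEL = [[-1, -1, -1],
          [-1, 16, -1],
          [-1, -1, -1]]
SCALE = 8


def crop(img, border):
    """Remove `border` pixels from every edge of a list-of-rows image."""
    return [row[border:len(row) - border]
            for row in img[border:len(img) - border]]


def sharpen(img):
    """Convolve with KERNEL; edge pixels are copied unchanged (an assumption)."""
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = sum(KERNEL[j][i] * img[y + j - 1][x + i - 1]
                      for j in range(3) for i in range(3))
            out[y][x] = max(0, min(255, acc // SCALE))  # clamp to 8 bits
    return out


# A flat 7x7 image with one slightly brighter pixel in the middle.
image = [[100] * 7 for _ in range(7)]
image[3][3] = 108

small = crop(image, 1)    # 5x5, bright pixel now at (2, 2)
result = sharpen(small)
print(result[2][2])       # 116: the bright pixel is pushed further up
print(result[1][1])       # 99: its neighbours are pulled slightly down
```

The difference between the bright pixel and its neighbours grows from 8 to 17, which is the sharpening effect the benchmark asks each library to compute.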

See also our main Benchmarks page for a more complex benchmark and timings on a variety of machines.

Results

i5-3210M CPU @ 2.50GHz (Dell Vostro laptop)

Software | Run time (secs real) | Memory (peak RSS MB) | Times slower
VIPS C/C++ 7.40 | 0.37 | 26 | 1.0
VIPS Python 7.40 | 0.40 | 33 | 1.1
ruby-vips 7.40 | 0.44 | 34 | 1.2
VIPS command-line 7.40 | 0.67 | 28 | 1.8
VIPS nip2 7.40 | 1.08 | 62 | 2.9
OpenCV 2.4.8 | 1.14 | 197 | 3.1
GraphicsMagick 1.3.18 | 1.31 | 247 | 3.54
NetPBM 10.0 | 1.38 | 73 | 3.9
RMagick 2.13.2 (ImageMagick 6.7.7) | 1.56 | 667 | 4.2
ImageMagick 6.7.7.10 | 1.59 | 470 | 4.2
ExactImage 0.8.9 | 1.54 | 119 | 4.2
FreeImage 3.15.1 (incomplete) | 2.11 | 187 | 5.7
PIL 1.1.7 | 2.48 | 211 | 6.7
ImageScience 1.2.4 (based on FreeImage 3.15.1, incomplete) | 4.14 | 260 | 11.2
GEGL 0.2 | 16.35 | 403 | 44
Octave 3.0.1 | 64 (est.) | 8500 (est.) | 200
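
The "Times slower" column appears to be each run time divided by the fastest run time. That is an assumption, not something the page states, and a few rows in the table do not match the calculation exactly, so treat the sketch below as an approximation of how the column was derived. The function name `times_slower` is made up, and only four rows are included.

```python
# Run times (seconds, real) copied from four rows of the table above.
run_times = {
    "VIPS C/C++": 0.37,
    "VIPS Python": 0.40,
    "OpenCV": 1.14,
    "PIL": 2.48,
}


def times_slower(times):
    """Ratio of each run time to the fastest one, rounded to one decimal."""
    fastest = min(times.values())
    return {name: round(t / fastest, 1) for name, t in times.items()}


ratios = times_slower(run_times)
for name, ratio in ratios.items():
    print(f"{name}: {ratio}")
```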

Notes

The benchmarks plus a simple driver program are in a github repository. See the README for details.

All timings are for a 5,000 by 5,000 pixel 8-bit RGB image in uncompressed tiled TIFF format with 128 by 128 pixel tiles. Each test was run on a quiet system with something like:

time ./vips.sh wtc_tiled_small.tif wtc2.tif

The quickest real time of three runs was recorded. I didn't try to clear the disc cache, so disc speed should not be a factor. The peak memory column was found by sampling RES with "ps" using this script. I used the systems as packaged for Ubuntu, unless otherwise indicated. I last ran these tests on 23 July 2014 and used the then-current stable version of every package.
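
The "quickest real time of three runs" measurement can be sketched as follows. This is not the driver program from the repository; the function name `best_of` is made up, and wall-clock time from `time.perf_counter` stands in for the "real" figure reported by `time`.

```python
import subprocess
import time


def best_of(cmd, runs=3):
    """Run `cmd` several times and return the quickest wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)   # raises if the command fails
        timings.append(time.perf_counter() - start)
    return min(timings)
```

For example, `best_of(["./vips.sh", "wtc_tiled_small.tif", "wtc2.tif"])` would time the command shown above. Taking the minimum rather than the mean is the usual way to reduce noise from other activity on the machine.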

I don't know much about OpenCV and I'm sure it could be made to run this benchmark more quickly. OpenCV is also not (very) threaded, so on a single-core machine the gap to VIPS would be much smaller.

Some systems, like ImageScience and nip2, have relatively long start-up times and this hurts their position in the table.

The VIPS command-line generates a huge amount of disc traffic which makes it unsuitable in certain applications. This is not really considered in this table.

ExactImage will not read tiled tiff, so the benchmark uses a strip tiff for this test.

GEGL is not really designed for batch-style processing; it targets interactive applications, like paint programs. In this test I used a JPEG image for GEGL, since the version in Ubuntu does not support TIFF load and save.

Octave aims to be a very high-level prototyping language and is not primarily targeting speed. I timed a 2,000 by 2,000 pixel monochrome JPEG and extrapolated from that.

FreeImage does not have a sharpening or convolution operation so I skipped that part of the benchmark.

ImageScience is based on FreeImage and therefore does not support sharpening, so I've skipped that part of the test. The resize() method is always bicubic which is a little unfair as the other benchmarks here use bilinear.

Why is VIPS quick

We have a How it works page which goes into some detail, but briefly:

Threaded image input-output (IO) system
Most image processing libraries have threaded operations. Each operation has code, generally using a framework like OpenMP, to run the operation over all the available processors. VIPS instead puts the threading into the image IO system, so all operations are automatically threaded. Additionally, this type of "horizontal" threading makes much better use of processor caches and dramatically reduces the amount of locking you need.
Fast operations
The VIPS primitives are implemented carefully, and some use techniques like run-time code generation. The convolution operator, for example, will examine the matrix and the image and at run-time write a short SSE3 program to implement exactly that convolution on exactly that image.
Demand-driven
VIPS is fully demand-driven, so it only needs to keep a few pixel buffers in memory rather than loading the whole image. This reduces memory use.
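
As a rough illustration of the demand-driven and "horizontal" threading ideas, here is a toy model in Python. It is not how VIPS is implemented: `source`, `crop`, `brighten` and `render_tile` are made-up names, and Python threads show the structure rather than any real speed-up. Each operation is a function that fetches pixels from the one upstream of it only when asked, and each worker pulls one output tile through the whole chain, so no intermediate image is ever stored.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 12, 12   # size of the pretend input image
TILE = 4                 # output tile edge, in pixels


def source(x, y):
    """Stand-in for a file reader: produce the pixel at (x, y) on request."""
    return (x + y * WIDTH) % 256


def crop(upstream, left, top):
    """Crop by shifting coordinates; no pixels are copied."""
    return lambda x, y: upstream(x + left, y + top)


def brighten(upstream, amount):
    """Add a constant to each pixel, clamped to 8 bits."""
    return lambda x, y: min(255, upstream(x, y) + amount)


def render_tile(pipeline, x0, y0, size):
    """Pull one output tile through every operation in the pipeline."""
    return [[pipeline(x, y) for x in range(x0, x0 + size)]
            for y in range(y0, y0 + size)]


# Build the pipeline once; nothing is computed until a tile is requested.
pipeline = brighten(crop(source, 2, 2), 10)
out_w, out_h = WIDTH - 4, HEIGHT - 4   # 8 x 8 after cropping 2 pixels per edge

# One task per output tile: the threading lives here, not in the operations.
origins = [(x, y) for y in range(0, out_h, TILE) for x in range(0, out_w, TILE)]
with ThreadPoolExecutor(max_workers=4) as pool:
    tiles = list(pool.map(lambda o: render_tile(pipeline, o[0], o[1], TILE),
                          origins))

print(len(tiles), "tiles rendered")
```

Note that `crop` and `brighten` contain no threading code at all. Adding another operation to the chain would automatically run in parallel too, which is the point the "Threaded image input-output (IO) system" entry above is making.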

Graphically

[Image: Svt2.png]

This graph was made by running "ps" very quickly and piping the output to a simple script that calculated total RES for all processes associated with a task.
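
A minimal sketch of this kind of sampling is shown below. It is not the script used for the graph: the function names are made up, it assumes a `ps` that accepts `-o rss=` and `-p` (as on Linux and the BSDs), and it only watches the process it starts, whereas the real script totalled every process associated with a task.

```python
import subprocess
import time


def parse_rss_kb(ps_output):
    """Sum the RSS values (KiB), one per line, printed by `ps -o rss=`."""
    return sum(int(field) for field in ps_output.split())


def sample_rss_kb(pid):
    """Ask ps for the current resident set size of one process."""
    result = subprocess.run(["ps", "-o", "rss=", "-p", str(pid)],
                            capture_output=True, text=True)
    return parse_rss_kb(result.stdout)   # 0 if the process has already exited


def peak_rss_kb(cmd, interval=0.01):
    """Run `cmd`, polling ps until it exits, and return the largest RSS seen."""
    proc = subprocess.Popen(cmd)
    peak = 0
    while proc.poll() is None:
        peak = max(peak, sample_rss_kb(proc.pid))
        time.sleep(interval)
    return peak
```

Calling `peak_rss_kb(["./vips.sh", "wtc_tiled_small.tif", "wtc2.tif"])` would return the peak in KiB. Because this is polling, a short-lived spike between two samples can be missed, which is why the sampling has to run very quickly.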

[Image: Memtrace.png]

This is a fancier one generated by vipsprofile showing the memory behaviour of vips on this task. The bottom graph shows total memory, the upper traces show threads calculating useful results (green), threads blocked on synchronisation (red) and memory allocations (white ticks). There's a blog post with some more detail on how this was made.

Implementations

VIPS Python

#!/usr/bin/python

import sys
from vipsCC import *

im = VImage.VImage (sys.argv[1])
im = im.extract_area (100, 100, im.Xsize () - 200, im.Ysize () - 200)
im = im.affinei_all ("bilinear", 0.9, 0, 0, 0.9, 0, 0)
mask = VMask.VIMask (3, 3, 8, 0, 
		  [-1, -1, -1, 
		   -1,  16, -1, 
		   -1, -1, -1])
im = im.conv (mask)
im.write (sys.argv[2])

ruby-vips

#!/usr/bin/ruby

require 'rubygems'
require 'vips'
include VIPS

im = Image.new(ARGV[0])

im = im.extract_area(100, 100, im.x_size - 200, im.y_size - 200)
im = im.affinei(:bilinear, 0.9, 0, 0, 0.9, 0, 0)
mask = [
    [-1, -1,  -1],
    [-1,  16, -1],
    [-1, -1,  -1]
]
m = Mask.new mask, 8, 0 
im = im.conv(m)

im.write(ARGV[1])

VIPS nip2

#!/home/john/vips/bin/nip2 -s

main
  = error "usage: infile -o outfile", argc != 2
  = (sharpen @ shrink @ crop) (Image_file argv?1)
{
  crop x = extract_area 100 100 (x.width - 200) (x.height - 200) x;
  shrink = resize Interpolate_bilinear 0.9 0.9;
  sharpen = conv (Matrix_con 8 0 [[-1, -1, -1], [-1, 16, -1], [-1, -1, -1]]);
}

VIPS command-line

#!/bin/bash

width=$(vipsheader -f Xsize $1)
height=$(vipsheader -f Ysize $1)

width=$((width - 200))
height=$((height - 200))

set -x

vips im_extract_area $1 t1.v 100 100 $width $height
vips im_affinei_all t1.v t2.v bilinear 0.9 0 0 0.9 0 0
cat > mask.con <<EOF
3 3 8 0
-1 -1 -1
-1 16 -1
-1 -1 -1
EOF
vips im_conv t2.v $2 mask.con
rm t1.v t2.v mask.con

VIPS C++

#include <vips/vips>

int main (int argc, char **argv)
{
        vips::VImage in (argv[1], "rs");
        vips::VIMask mask (3, 3, 8, 0,
                -1, -1, -1, -1, 16,-1, -1, -1, -1);

        in.
                extract_area (100, 100, in.Xsize () - 200, in.Ysize () - 200).
                affine (0.9, 0, 0, 0.9, 0, 0,
                        0, 0, 
                        (in.Xsize () - 200) * 0.9, 
                        (in.Ysize () - 200) * 0.9).
                conv (mask).
                write (argv[2]);

        return 0;
}

VIPS C

#include <stdio.h>

#include <vips/vips.h>

int 
main( int argc, char **argv )
{
	GOptionContext *context;
	VipsImage *global;
	VipsImage **t;

	GError *error = NULL;

	if( vips_init( argv[0] ) )
		return( -1 );

	context = g_option_context_new( " - VIPS benchmark program" );
	g_option_context_add_group( context, vips_get_option_group() );
	if( !g_option_context_parse( context, &argc, &argv, &error ) ) {
		if( error ) {
			fprintf( stderr, "%s\n", error->message );
			g_error_free( error );
		}

		vips_error_exit( NULL );
	}
	g_option_context_free( context );

	global = vips_image_new();
	t = (VipsImage **) vips_object_local_array( VIPS_OBJECT( global ), 5 );

	if( !(t[0] = vips_image_new_from_file( argv[1], 
		NULL )) )
		vips_error_exit( "unable to read %s", argv[1] );

	t[1] = vips_image_new_matrixv( 3, 3, 
		-1.0, -1.0, -1.0, 
		-1.0, 16.0, -1.0, 
		-1.0, -1.0, -1.0 );
	vips_image_set_double( t[1], "scale", 8 ); 

	if( vips_extract_area( t[0], &t[2], 
                100, 100, t[0]->Xsize - 200, t[0]->Ysize - 200, NULL ) ||
		vips_affine( t[2], &t[3], 0.9, 0, 0, 0.9, NULL ) ||
		vips_conv( t[3], &t[4], t[1], NULL ) ||
		vips_image_write_to_file( t[4], argv[2], NULL ) )
		return( -1 );

	g_object_unref( global );

	return( 0 );
}

PIL

#!/usr/bin/python 

import Image, sys
import ImageFilter 

im = Image.open (sys.argv[1])
im = im.crop ((100, 100, im.size[0] - 100, im.size[1] - 100))
im = im.resize ((int (im.size[0] * 0.9), int (im.size[1] * 0.9)),
        Image.BILINEAR) 
filter = ImageFilter.Kernel ((3, 3),
              (-1, -1, -1,
               -1, 16, -1,
               -1, -1, -1))
im = im.filter (filter)
im.save (sys.argv[2])

Octave

#!/usr/bin/octave -qf

pkg load image

im = imread(argv(){1});
im = im(101:end-100, 101:end-100);        % Crop
im = imresize(im, 0.9, 'linear');         % Shrink    
myFilter = [-1 -1 -1
        -1 16 -1
        -1 -1 -1]; 
im = conv2(double(im), myFilter);         % Sharpen
im = max(0, im ./ (max(max(im)) / 255));  % Renormalize
imwrite(argv(){2}, uint8(im));           % Write back again

ImageMagick

#!/bin/bash

# we crop on load, it's a bit quicker and saves some memory
# we can't crop 100 pixels with the crop-on-load syntax, so we have to
# find the width and height ourselves
width=$(vipsheader -f Xsize $1)
height=$(vipsheader -f Ysize $1)

width=$((width - 200))
height=$((height - 200))

set -x

convert "$1[${width}x${height}+100+100]" \
        -filter triangle -resize 90x90% \
        -convolve "-1, -1, -1, -1, 16, -1, -1, -1, -1" \
        $2

GraphicsMagick

#!/bin/bash

set -x

# GraphicsMagick does not have crop-on-load so we use -shave instead
gm convert $1 \
        -shave 100x100 \
        -filter triangle -resize 90x90% \
        -convolve "-1, -1, -1, -1, 16, -1, -1, -1, -1" \
        $2

ExactImage

#!/bin/bash

width=$(vipsheader -f Xsize $1)
height=$(vipsheader -f Ysize $1)

width=$((width - 200))
height=$((height - 200))

# set -x

econvert -i $1 \
	--crop "100,100,$width,$height" \
	--bilinear-scale 0.9 \
	--convolve "-1, -1, -1, -1, 16, -1, -1, -1, -1" \
	-o $2

FreeImage

/* Compile with:

   gcc freeimage.c -lfreeimage

 */

#include <FreeImage.h>

int
main (int argc, char **argv)
{       
  FIBITMAP *t1;
  FIBITMAP *t2;
  int width;
  int height;

  FreeImage_Initialise (FALSE);

  t1 = FreeImage_Load (FIF_TIFF, argv[1], TIFF_DEFAULT);

  width = FreeImage_GetWidth (t1); 
  height = FreeImage_GetHeight (t1); 

  t2 = FreeImage_Copy (t1, 100, 100, width - 100, height - 100); 
  FreeImage_Unload (t1); 

  t1 = FreeImage_Rescale (t2, (width - 200) * 0.9, (height - 200) * 0.9,
                          FILTER_BILINEAR);
  FreeImage_Unload (t2); 

  /* FreeImage does not have a sharpen operation, so we skip that.
   */

  FreeImage_Save (FIF_TIFF, t1, argv[2], TIFF_DEFAULT);
  FreeImage_Unload (t1); 

  FreeImage_DeInitialise ();

  return 0;
}      

NetPBM

#!/bin/bash

cat > mask <<EOF
P2
3 3
32
14 14 14 
14 48 14
14 14 14
EOF

tifftopnm $1 | \
  pnmcut -left 100 -right -100 -top 100 -bottom -100 | \
  pnmscale 0.9 | \
  pnmconvol mask | \
  pnmtotiff -truecolor -color > $2

ImageScience

#!/usr/bin/ruby

require 'rubygems'
require 'image_science'

ImageScience.with_image(ARGV[0]) do |img|
    img.with_crop(100, 100, img.width() - 100, img.height() - 100) do |crop|
        crop.resize(crop.width() * 0.9, crop.height() * 0.9) do |small|
            small.save(ARGV[1])
        end
    end
end

RMagick

#!/usr/bin/ruby

require 'rubygems'
require 'RMagick'
include Magick

im = ImageList.new(ARGV[0])

im = im.shave(100, 100)
im = im.resize(im.columns * 0.9, im.rows * 0.9, filter = TriangleFilter)
kernel = [-1, -1, -1, -1, 16, -1, -1, -1, -1]
im = im.convolve(3, kernel)
                   
im.write(ARGV[1])

OpenCV

/*
   g++ -g -Wall opencv.cc `pkg-config opencv --cflags --libs`
 */     

#include <cv.h> 
#include <highgui.h>
                  
using namespace cv;
                   
int  
main (int argc, char **argv)
{
  Ptr < IplImage > t1;

  if (!(t1 = cvLoadImage (argv[1])))
    return 1;   
  Mat img (t1);

  Mat crop (img, Rect (100, 100, img.cols - 200, img.rows - 200));

  Mat shrunk;
  resize (crop, shrunk, Size (0, 0), 0.9, 0.9);

  float m[3][3] = { {-1, -1, -1}, {-1, 16, -1}, {-1, -1, -1} };
  Mat kernel = Mat (3, 3, CV_32F, m) / 8.0; 

  Mat sharp;
  filter2D (shrunk, sharp, -1, kernel, Point (-1, -1), 0, BORDER_REPLICATE);

  CvMat cvimg = sharp;
  cvSaveImage (argv[2], &cvimg);
        
  return 0;
}     

GEGL

/* compile with
 
   gcc -g -Wall gegl.c `pkg-config gegl-0.2 --cflags --libs`

 */

#include <stdio.h>
#include <stdlib.h>

#include <gegl.h>

static void 
null_log_handler (const gchar *log_domain, 
		  GLogLevelFlags log_level, 
		  const gchar *message, 
		  gpointer user_data)
{
}

int
main (int argc, char **argv)
{
  GeglNode *gegl, *load, *crop, *scale, *sharp, *save;

  gegl_init (&argc, &argv);

  if (argc != 3) 
    {           
      fprintf (stderr, "usage: %s file-in file-out\n", argv[0]);
      exit (1);
    }

  g_log_set_handler ("GEGL-load.c", 
    G_LOG_LEVEL_WARNING | G_LOG_FLAG_FATAL | G_LOG_FLAG_RECURSION, 
    null_log_handler, NULL);
  g_log_set_handler ("GEGL-gegl-tile-handler-cache.c", 
    G_LOG_LEVEL_WARNING | G_LOG_FLAG_FATAL | G_LOG_FLAG_RECURSION, 
    null_log_handler, NULL);

  gegl = gegl_node_new ();
        
  load = gegl_node_new_child (gegl,
                              "operation", "gegl:load",
                              "path", argv[1], 
                              NULL);

  crop = gegl_node_new_child (gegl, 
                              "operation", "gegl:crop",
                              "x", 100.0,
                              "y", 100.0,
                              "width", 4800.0, 
                              "height", 4800.0, 
                              NULL);
                
  scale = gegl_node_new_child (gegl,
                               "operation", "gegl:scale",
                               "x", 0.9,
                               "y", 0.9,
                               "filter", "linear", 
                               "hard-edges", FALSE, 
                               NULL);
                
  sharp = gegl_node_new_child (gegl,
                               "operation", "gegl:unsharp-mask",
                               "std-dev", 1.0, // diameter 7 mask in gegl
                               NULL);

  save = gegl_node_new_child (gegl,
                              "operation", "gegl:save",
                              //"operation", "gegl:png-save",
                              //"bitdepth", 8,
                              "path", argv[2], 
                              NULL);

  gegl_node_link_many (load, crop, scale, sharp, save, NULL);
 
  //gegl_node_dump( gegl, 0 );

  gegl_node_process (save);

  //gegl_node_dump( gegl, 0 );
                
  g_object_unref (gegl);

  gegl_exit ();

  return (0);
}