Basic Digital Imaging

From VipsWiki


This page has been written as part of the NIP2 Beginners Guide.


The topic of digital imaging is large and complex: this document is intended to provide a brief introduction to some of the common concepts. It will begin by discussing how and why images are captured before going on to define several common pieces of terminology used when working with digital images.

Capturing or Digitising Images

Before you can begin any form of digital image processing you need some digital images. These can be taken directly with a digital camera, captured from a video signal or scanned from an existing photographic print, negative or transparency. There are too many different pieces of equipment available in this field to describe them here in any detail, so this document covers only the more generally relevant concepts. For specific details relating to your own equipment please refer to the documentation supplied by the manufacturer.

What do you want digital images for?

With an unlimited budget, new fast computers and highly trained dedicated staff it is possible to capture fantastic high quality images, in a variety of different regions of the electromagnetic spectrum[1] and at regular intervals in the life span of the object or scene being imaged. The images could all be neatly catalogued and organised in well designed searchable databases, to be found instantly, on demand and printed at whatever size required. However, this is rarely the case, if ever.

When it comes to digitising images there are several important questions to consider.

  1. What are the images for and what quality of images are required?
  2. What can my equipment or resources achieve?
  3. How are the captured images going to be stored?

The answers to these questions will determine the resolution, bit-depth, file type and size of the images that are required and that can be captured and stored. To understand these answers, a basic understanding of what resolution, bit-depth, file type and image size mean is also required.

What is a digital image?

An example of a simple 3 × 3 pixel, black and white image. The number shown in each square represents the digital value of that shade.

A black and white (or mono) digital image is built up from small defined patches called pixels, arranged together like a large set of mosaic tiles. Each pixel is stored as a number, which is converted into a specific shade when the image is viewed on a computer. The number of pixels in an image defines its resolution, and the range of numbers used to define the available shades is described as the bit-depth. Colour images are achieved by combining specific sets of mono images in what are called bands. Thus a mono image is known as a one-band image, whereas a standard RGB[2] colour image, which is composed of three different bands, is known as a 3-band image.
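
As a rough illustration of the description above, a mono image can be held as a grid of numbers and a colour image as a set of such grids. The pixel values and the helper name describe are invented for this sketch; they are not taken from the figure or from any particular software.

```python
# A 3 x 3 mono (one-band) image: each number is the stored value of one pixel.
# The values are made up for illustration; 0 is black and 255 is white.
mono = [
    [0, 128, 255],
    [64, 128, 192],
    [255, 128, 0],
]

# A 3-band image is three one-band images of the same size, one per band.
# Re-using the same grid for every band is only a placeholder.
rgb = {"red": mono, "green": mono, "blue": mono}


def describe(bands):
    """Return (width, height, number of bands) for a list of one-band images."""
    height = len(bands[0])
    width = len(bands[0][0])
    return width, height, len(bands)


print(describe([mono]))              # the one-band image
print(describe(list(rgb.values())))  # the three-band image
```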

What is resolution?

The term resolution is generally used to describe how much detail there is in an image: increasing the resolution increases the detail. Beyond this point, however, the use of the term becomes less clear, especially when it comes to printing. When viewed on different computers a single image can appear to have changed size, and if sent to a printer it can come out in a variety of different sizes. The size of an image can be defined in terms of pixels or megapixels, in centimetres or inches, as so many dots, points or pixels per inch or centimetre, or even by how many kilobytes or megabytes of space it takes up. Hopefully we can clear this up.

Image resolution

A digital image has a specific size, determined by the number of pixels in the image. This is expressed as width × height (e.g. 1000 × 1000 pixels), or sometimes as the total number of pixels, which in this case would be 1000000 pixels or 1 Megapixel. For example, a 6 Megapixel image has a total of 6000000 pixels, perhaps arranged as 3000 × 2000 pixels. The perceived size of the image will depend on how large each of the individual pixels is displayed.

This diagram shows the visual effect of changing resolution, from 256 × 256 down to 8 × 8. If the display size of the pixels is fixed the image will appear to shrink, but if the image is kept at the same display size the pixels will become more and more visible.
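
The pixel counts quoted in this section follow from a single multiplication. The short sketch below uses a hypothetical helper, megapixels, purely to make the arithmetic explicit.

```python
def megapixels(width, height):
    """Total pixel count of an image, expressed in millions of pixels."""
    return width * height / 1_000_000


print(megapixels(1000, 1000))  # the 1000 x 1000 example
print(megapixels(3000, 2000))  # one possible shape for a 6 Megapixel image
```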

Screen or Monitor resolution

The image displayed on a computer monitor, like any digital image, is composed of a large number of small dots or pixels. The theoretical maximum number of pixels a monitor can display will depend on the smallest point of information the monitor can resolve. As an example, think of a screen as a 32 × 32 array of lights; the smallest detail it could display would be a single light. Therefore, if each of these lights were controlled individually we would have a screen resolution of 32 × 32 screen pixels and could display a full 32 × 32 pixel digital image. Alternatively, if the lights were controlled in small groups, for example groups of four, this would reduce the screen resolution from 32 × 32 to only 16 × 16 screen pixels (see the example figure). Now why, if we can display 32 pixels, would we want to be limited to 16?

Example images displayed on a theoretical 32 × 32 pixel screen. [A]: Shows a 32 × 32 pixel digital image, with screen resolution set to 32 × 32 pixels. [B]: Shows a 16 × 16 pixel digital image, with screen resolution set to 16 × 16 pixels. [C]: Shows a 16 × 16 pixel digital image, with screen resolution set to 32 × 32 pixels. [D]: A detail of a 64 × 64 pixel image with the zoom set to 100% or 1:1, so that one image pixel is displayed by one screen pixel. [E]: A 64 × 64 pixel image with the zoom set to 50% or 1:2, so that four image pixels are displayed by one screen pixel. [F]: A detail of a 64 × 64 pixel image with the zoom set to 200% or 2:1, so that one image pixel is displayed by four screen pixels.

If we have a 16 × 16 pixel digital image and display it on our screen, set to a resolution of 16 × 16 screen pixels, the image will fill the display. If the screen resolution is changed back to 32 × 32 screen pixels the same 16 × 16 pixel digital image will only fill the middle quarter of the screen. Practically, this means that at higher screen resolutions all of the icons and text on a computer screen will appear smaller, so more digital information can be displayed at the same time, bigger pictures, more text etc. However, there is no point in displaying more information if it is too small to read. Therefore, the screen resolution used on most monitors is a compromise, balancing how much can be displayed with what is comfortable to view[3].

At the time of writing typical screen resolutions range from 1024 × 768 to 1600 × 1200; higher or lower screen resolutions are possible depending on the hardware being used[4].

Example images displayed on a theoretical 64 × 64 pixel screen showing how detail can be lost when resizing an image. [A]: Shows a 64 × 64 pixel digital image. [B]: Shows image [A] resized to 32 × 32 pixels. [C]: Shows image [B] resized back to 64 × 64 pixels.

Display Size: Zoom controls

Once a screen resolution has been set, it controls the size of many aspects of the information displayed on a computer: windows, menus, icons, text, etc. However, most image viewing software lets you alter the display size of an image by using zoom controls. Zoom controls do not change the number of pixels in an image; they alter the size at which each pixel is displayed. Zoom can be described in different ways, but generally it is given either as a percentage or as a ratio describing the relationship between screen pixels and image pixels. If one image pixel is displayed by one screen pixel the zoom is said to be 100% or 1:1. If you zoom out from an image then one screen pixel will represent more than one image pixel, so the percentage decreases (50%, 25%, etc.) or the ratio falls (1:2, 1:4, etc.). Zooming in has the reverse effect: one image pixel will be represented by an increasing number of screen pixels, so the percentage increases (200%, 400%, etc.) or the ratio rises (2:1, 4:1, etc.).

If the image being viewed has a higher resolution than the screen on which it is being displayed then all of the detail in the image cannot be displayed at the same time. For example, if a 64 × 64 pixel image is displayed on our 32 × 32 pixel screen only 32 × 32 pixels can be displayed at any one time. With a zoom of 100% a 32 × 32 pixel detail of our image will be displayed. If we zoom out to 50% the computer will display a simplified 32 × 32 pixel version of the entire image, and if we zoom in to 200% the computer will display a 16 × 16 pixel detail of the image, with each image pixel represented by four screen pixels (see the figure).
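
The relationship between zoom and the portion of an image that fits on the screen can be sketched as follows. The function name visible_image_pixels is hypothetical, and the sketch assumes square screens and zoom values that divide evenly.

```python
def visible_image_pixels(screen_size, zoom_percent):
    """Image pixels (along one side) that fit on the screen at a given zoom.

    At 100% one image pixel maps to one screen pixel; at 50% two image
    pixels share each screen pixel along each side; at 200% each image
    pixel is spread over two screen pixels along each side.
    """
    return screen_size * 100 // zoom_percent


screen = 32  # the theoretical 32 x 32 pixel screen used in the text
for zoom in (50, 100, 200):
    side = visible_image_pixels(screen, zoom)
    print(f"{zoom}%: {side} x {side} image pixels shown")
```

For the 64 × 64 pixel image in the text, the 50% result covers the whole image, while 100% and 200% show progressively smaller details.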

Resizing Images

In addition to the zoom controls most image processing software will also allow you to resize your image, that is, to change the total number of image pixels. This process is very similar to zooming, except that instead of comparing screen and image pixels the comparison is between old image pixels and new image pixels. In real image processing the resizing of images can be quite mathematically complex, but if a basic method is used to shrink a 64 × 64 pixel image to 32 × 32 pixels then each group of four pixels in the original image will be averaged to produce one pixel in the new image, leading to a loss of detail. If the 32 × 32 image is then increased in size back to 64 × 64, each pixel will be duplicated to produce four pixels in the new image; because of this duplication there is no real increase in detail, the image just gets bigger. The effect of decreasing the size of an image can be seen in the figure.
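
The basic method described above can be sketched in a few lines. This is a minimal illustration using invented function names and a made-up 4 × 4 image with even dimensions; real software generally uses more sophisticated resampling.

```python
def shrink_by_half(image):
    """Average each 2 x 2 block of pixels into one pixel (integer average)."""
    out = []
    for y in range(0, len(image), 2):
        row = []
        for x in range(0, len(image[0]), 2):
            total = (image[y][x] + image[y][x + 1]
                     + image[y + 1][x] + image[y + 1][x + 1])
            row.append(total // 4)
        out.append(row)
    return out


def enlarge_by_two(image):
    """Duplicate every pixel into a 2 x 2 block of identical pixels."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


original = [
    [0, 255, 10, 10],
    [255, 0, 10, 10],
    [50, 50, 200, 100],
    [50, 50, 100, 200],
]
small = shrink_by_half(original)
restored = enlarge_by_two(small)
print(small)
print(restored == original)  # the fine checkerboard detail is not recovered
```

The restored image has the original number of pixels, but the top-left checkerboard has been averaged into a single flat grey.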

Examples of images with different bit-depths.

Print Resolution

When it comes to printing images we need to start thinking about physical dots or points on a piece of paper, rather than display pixels on a screen. Print resolution is used to describe the number of dots a printer is producing. It is normally expressed as so many dots or points per printed unit distance, i.e. Dots Per Inch (DPI), or Points Per Inch (PPI)[5]. Some systems may well be using pixels per cm now, but DPI is still very common so we will continue using it in this document.

As an example, if we have a printer that is set to 100 DPI, and we print a 100x100 pixel digital image, the printed image will cover one square inch of paper. If we change the printer resolution to 50 DPI and print the same image the physical dots will be twice as big so the printed image will take up four square inches. And as you would expect if we set the printer resolution to 200 DPI the printed image will cover only a quarter of a square inch.
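
The arithmetic in this example can be checked with a short sketch. The helper printed_size_inches is hypothetical and assumes that each image pixel is printed as exactly one dot.

```python
def printed_size_inches(width_px, height_px, dpi):
    """Printed width and height in inches when one image pixel becomes one dot."""
    return width_px / dpi, height_px / dpi


for dpi in (50, 100, 200):
    width, height = printed_size_inches(100, 100, dpi)
    print(f"{dpi} DPI: {width} x {height} inches, {width * height} square inches")
```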

Unfortunately, real printing is slightly more complex than this. When an image is sent to a printer the software involved will often ask the user how big they want the print to be, either in physical units, (centimetres or inches, etc) or relative to the size of the paper. This means that the printer software is offering to resize your image prior to printing. As noted before, resizing an image will not create any new information, just change the size. There may be some clever image processing going on, but generally the printer will just be changing how many printed pixels are used to represent each image pixel, the physical equivalent of the image processing zoom controls. For most applications these printed images look fine, but if problems occur it would be good to examine how much resizing the printer is doing.

To ensure more controllable results, it is better to check that your image is the correct size before it is sent to the printer. For example, if you want to print an image of 6 × 8 inches on a printer set to 300 DPI you will need an 1800 × 2400 pixel image, which is about 4.3 Megapixels[6].

How many DPI should I use?

Modern printers, even in the under £100 bracket, are advertised as capable of printing at resolutions into the thousands of DPI; for example, one £60 printer advertised a maximum setting of 4800 DPI. If we consider the previous example, printing a 6 × 8 inch picture at 300 DPI required an 1800 × 2400 pixel image. Printing the same picture at 4800 DPI would need a 28800 × 38400 pixel image, which is more than 1000 Megapixels. This would be a very big image, well beyond the capacity of all current commercial cameras.[7]
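
The image sizes quoted above can be reproduced with the sketch below. The helper pixels_needed is a hypothetical name, and the calculation assumes one image pixel per printed dot.

```python
def pixels_needed(width_in, height_in, dpi):
    """Image size in pixels for a print of the given size at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


for dpi in (300, 4800):
    width, height = pixels_needed(6, 8, dpi)
    total = width * height / 1_000_000
    print(f"{dpi} DPI: {width} x {height} pixels, {total:.1f} Megapixels")
```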

Generally an image resolution of 300 DPI will be sufficient to produce good colour prints on most printers, and up to 900 DPI for high contrast graphics or text; the human eye can't really see more detail than this. Depending on your printer, one thing that higher DPI settings can change is how well your printer blends its dots together or deals with flat areas of pale colours, so you may still get better prints at higher DPI settings. A good method for checking your printer is to print the same image several times with different settings. Decide on a fixed print size (4 × 4 inches, a full or half page, etc.), then print the image several times at that fixed size but with different DPI values and see if you can see any differences.

What is bit-depth?

The shade of each pixel, in a black and white digital image, is defined by a number; in most cases the lower the number the darker the pixel and the higher the number the brighter the pixel[8] (see the 3 × 3 pixel example figure above). In each digital image there is a fixed range of available numbers, or shades of grey, which is referred to as the bit-depth of an image and is dependent on how many "bits"[9] of information are available to define the number.

One digital "bit" of information can either be on or off, 1 or 0, and therefore a pixel in a 1-bit image can either be white or black, equivalent to 2^1 (2) levels of information. As the number of available digital "bits" in an image increases, the total number of available shades increases. For example, a 4-bit image can have 2^4 (16) shades and an 8-bit image can have 2^8 (256), see figure. Generally, images with higher bit-depths can record more subtle changes in shade or colour.
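
The number of shades doubles with every extra bit. A minimal sketch, using a hypothetical helper named levels:

```python
def levels(bit_depth):
    """Number of distinct values that bit_depth bits can represent."""
    return 2 ** bit_depth


for bits in (1, 4, 8):
    print(f"{bits}-bit image: {levels(bits)} possible shades")
```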

As with resolution, there can be a difference between the bit-depth of an image and the bit-depth of your screen or print system. The vast majority of printers and monitors all output 8-bit images but this is not a serious problem as the vast majority of digital images produced by consumers are also 8-bit.

Dynamic Range and Histograms

A standard 1-band, 8-bit image can contain up to 2^8 or 256 levels of information; this means that pixel values can range from 0 (black) up to 255 (white). However, images do not always make use of all of the available levels. One of the easiest ways to see which levels have been used in a particular image is to examine its histogram. A histogram is a graph of the number of pixels against pixel value (or level of grey). The range of pixel values used in an image, and shown in the histogram, is termed the dynamic range of the image. Some example histograms, with their associated images, are given in the figure. Photographs, negatives, slides, transparencies, etc. also have a dynamic range; this refers to the range of shades recorded in the particular photographic media.

As well as displaying the dynamic range, a histogram can also show if a captured image is under- or over-exposed. If an image has a relatively high number of pixels with a value of 0 (black), i.e. most of the pixels are displayed on the left-hand side of the histogram, then the image is under-exposed. If the opposite is true, and an image has a relatively high number of pixels with a value of 255 (white), then the image is over-exposed. If an 8-bit image has a high number of pixels with values of both 0 and 255, this indicates that the object being imaged has a dynamic range greater than the 2^8 (256) levels available.

Three versions of a mono, 8-bit image and the corresponding histograms. As the dynamic range decreases, going from A to C, the image becomes flatter due to the loss of tonal range.
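
A histogram and the range of levels in use can be computed with very little code. The sketch below uses made-up pixel values and hypothetical helper names; what counts as a "relatively high" share of clipped pixels is left to the reader.

```python
from collections import Counter


def histogram(pixels, levels=256):
    """Count how many pixels take each value from 0 to levels - 1."""
    counts = Counter(pixels)
    return [counts.get(value, 0) for value in range(levels)]


def used_range(hist):
    """Lowest and highest pixel values that actually occur in the image."""
    used = [value for value, count in enumerate(hist) if count > 0]
    return used[0], used[-1]


def clipped_fraction(hist):
    """Fraction of pixels at the darkest and at the brightest level."""
    total = sum(hist)
    return hist[0] / total, hist[-1] / total


# Made-up pixel values from a tiny 8-bit mono image.
pixels = [0, 0, 0, 12, 40, 40, 90, 130, 200, 255]
hist = histogram(pixels)
print(used_range(hist))
print(clipped_fraction(hist))
```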

Correctly exposing an object with strong dark and light areas can be very difficult and the results are often a compromise to ensure that particular areas of interest are correctly exposed. For many applications this is fine and perfectly acceptable images can be produced. However, for some applications, such as conservation technical imaging, it would be preferable to be able to capture more of the available information. This can be achieved by capturing several different 8-bit exposures or using a higher bit-depth capture system.

Working with lower bit-depth images

To simplify image processing systems, images with bit-depths less than eight are normally treated as 8-bit, with a limited or very specific dynamic range.[10]

Working with higher bit-depth images

Again to simplify image processing systems, images with a bit-depth greater than 8 but less than or equal to 16 are all treated as 16-bit images. A standard, 1-band, 16-bit image can have up to 2^16 or 65536 different levels of information, allowing for a much higher definition of tonal change than is possible with 8-bit images. As noted above, most monitors and printers are 8-bit, so how can one work with 16-bit images? The answer is that an 8-bit version of a 16-bit image needs to be produced so that it can be viewed. With printing this will be a fixed conversion, simplifying the 16-bit image to 8 bits of information. However, on a computer screen this conversion can be dynamic, allowing many different 8-bit versions of a single 16-bit image to be viewed. Three different 8-bit views of the same 16-bit image can be seen in the figure. So although the full dynamic range of a 16-bit image cannot be viewed at the same time, all of the information can be recorded and sections of it displayed as required.

Three different 8-bit images produced from a single 16-bit image. [A]: The image has been scaled to display the mid tones. [B]: The image has been scaled to show the detail in the highlights. [C]: The image has been scaled to show the detail in the darker areas.
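
One simple way to produce different 8-bit views of 16-bit data is to choose a window of 16-bit values and stretch it linearly over 0 to 255. The sketch below assumes this linear windowing; the function name to_8bit, the window limits and the sample values are all invented for illustration, and viewing software may use other mappings.

```python
def to_8bit(value16, low, high):
    """Map a 16-bit value to 0..255 using the window [low, high].

    Values at or below low become 0 and values at or above high become 255.
    """
    if value16 <= low:
        return 0
    if value16 >= high:
        return 255
    return (value16 - low) * 255 // (high - low)


row = [0, 1000, 20000, 40000, 65535]  # made-up 16-bit pixel values

full = [to_8bit(v, 0, 65535) for v in row]            # whole range at once
shadows = [to_8bit(v, 0, 4095) for v in row]          # detail in dark areas
highlights = [to_8bit(v, 32768, 65535) for v in row]  # detail in bright areas

print(full)
print(shadows)
print(highlights)
```

Each view discards information outside its window, which is why no single 8-bit version shows everything the 16-bit image records.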

Different kinds of numbers

In this document we have already introduced the concept of bit-depth, defining the maximum number of pixel values, or shades, available in a particular image. The next stage is to define what kind of numbers the pixel values can be, positive, negative, decimals etc. Three of the more common types of numbers used in conjunction with digital images (unsigned, signed and float), will be described here.

Unsigned numbers are positive integers[11], (0, 3, 20, 34, etc) and their range for a particular image is defined by the bit-depth of an image, for example an 8-bit, unsigned, integer image can contain all values from 0 to 255.

Signed numbers are positive or negative integers (-23, -5, 0, 13, 45, etc.) and their range for a particular image is defined by the bit-depth of the image; for example, an 8-bit, signed, integer image can contain all the values from -128 to 127, including 0.

Finally, float numbers can be positive or negative decimals (-2.3, -0.035, 0, 1, 4.5, etc.), and again the number of levels of information is defined by the bit-depth of an image. However, the actual range of numbers available will depend on how many decimal places are used. It should be noted that float images generally have higher bit-depths (32, 64, etc.) to allow for the large number of possible decimal places.

At the time of writing the values used in most of the images produced by consumer digital products will be 8-bit, unsigned integers.
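
The integer ranges quoted above follow directly from the bit-depth. A minimal sketch, with a hypothetical helper integer_range and assuming the usual two's complement layout for signed values:

```python
def integer_range(bit_depth, signed):
    """Smallest and largest integer that fit in bit_depth bits."""
    if signed:
        half = 2 ** (bit_depth - 1)
        return -half, half - 1
    return 0, 2 ** bit_depth - 1


print(integer_range(8, signed=False))
print(integer_range(8, signed=True))
print(integer_range(16, signed=False))
```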

So how does colour work?

Colour and how it is used and defined can very quickly become complicated. If you would like to read more information related to colour, beyond the very basic comments given below, it is recommended that you consult some of the very good publications covering this subject.[12]

Colour images are produced by combining specific sets of one-band images. These bands can be combined in different ways depending on the colour format of an image. One of the common colour formats, RGB (Red, Green, Blue)[13], will be used as an example to explain this.

RGB images are composed of three separate one-band images. These three bands represent the different amounts of red, green and blue present in a particular image. If any of them are viewed individually, as one-band images, they will appear black and white, like all one-band images, but when combined together a computer can interpret them as colour. A simple example can be seen here.

Two example RGB images; in both cases the colour image represents the full RGB image and the following three mono images represent the three 1-band images which are combined to produce the colour.
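
The idea of combining bands can be sketched with three tiny one-band images. The values and the function name combine_bands are made up for this illustration; they do not correspond to the figure.

```python
# Three 2 x 2 one-band images; on their own each would display as greys.
red = [[255, 0], [0, 255]]
green = [[0, 255], [0, 255]]
blue = [[0, 0], [255, 255]]


def combine_bands(r, g, b):
    """Pair up the three bands so every pixel becomes an (R, G, B) triple."""
    return [
        list(zip(r_row, g_row, b_row))
        for r_row, g_row, b_row in zip(r, g, b)
    ]


rgb = combine_bands(red, green, blue)
for row in rgb:
    print(row)
```

Read as colour, the four pixels are pure red, pure green, pure blue and white.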

There are several different colour formats used in digital imaging which have been developed for different types of work, some for printing, others for colour science, etc. They are all composed of sets of 1-band images, which are just interpreted by the computer in different ways. At this time, most of the consumer equipment on the market will produce RGB type images and the colour format will not limit the common image processing techniques. However, if you are using equipment, or processing techniques that require other colour formats it is advised that you at least familiarise yourself with their basic description to minimise the possibility of complications later on.[14]

Image file types: jpeg, tiff, etc.

So now we have digital images with resolution, bit-depth, one or more bands and various possible colour formats. How do we store them on a computer? This is defined by the file type of an image, which refers to how all of the numbers, representing all of the digital image information, are parcelled up. Some file types have been designed to compress the digital data to save space, others have been designed to record everything in order to preserve detail, and there are many others in between.

More complex pieces of image processing software tend to have their own file formats to make the most of what the software is designed to do, but they all generally also output standard formats so the images can be used elsewhere.

The more common standard image file formats include bitmap, tiff, jpeg and gif. For specific details of these or other image formats please consult the literature or explore the web.


The file size of an image will depend on how the numerical image data has been organised. Some formats, such as jpeg, allow the information to be compressed, taking up less space by using complex algorithms, whereas other formats, such as tiff, typically record all the uncompressed information. Compression that does not lose any of the original information is called lossless compression. Other forms of compression degrade the original information; this is called lossy compression. The default compression set for particular file formats in most image processing software has been chosen to limit any perceptible loss of information, but if the compression is increased the loss of information can become noticeable.
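
The difference between lossless and lossy handling of data can be shown on a short run of made-up pixel values. This sketch uses Python's zlib module for the lossless step and a crude bit-dropping step as a stand-in for lossy compression; neither is how jpeg or tiff files actually work internally.

```python
import zlib

# Made-up pixel data: a short pattern of 8-bit values repeated many times.
pixels = bytes([10, 10, 10, 11, 12, 12, 200, 201] * 100)

# Lossless: the decompressed data is identical to the original.
packed = zlib.compress(pixels)
restored = zlib.decompress(packed)
print(len(pixels), len(packed), restored == pixels)

# A crude lossy step: discard the lowest four bits of every value.
# Fewer distinct values remain and the discarded detail cannot be recovered.
quantised = bytes(value & 0xF0 for value in pixels)
print(sorted(set(pixels)), sorted(set(quantised)))
```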

Camera Output

Digital cameras often come with a choice of options to set the size of captured images. This can be achieved by actually changing the resolution of the captured files or by changing their file format. Generally the highest quality file format offered by cameras will be uncompressed tiff images, followed by a range of jpegs with increasing compression. Higher end digital cameras often come with their own file format to ensure that the maximum amount of information is captured on the camera. More standard versions of the images can then be produced with their bundled software.

What is calibration?

Calibration is the correction of consistent errors. If you know that your equipment is always going to do something slightly wrong, you can correct or calibrate your images to compensate for this. For example, if you have a camera that is slightly more sensitive to yellow light, then all of the images captured with the camera will be slightly yellow. If this is a consistent problem you can calculate how much extra yellow is present in all of your pictures. The calibration process would then be either to remove the extra yellow component, or to add other colours to compensate, in this case blue. Image calibrations are achieved by capturing images of known controlled targets and then using the captured information to correct other images. Some of the more common forms of calibration are correcting for problems in homogeneity of lighting, accuracy of colour and spatial distortions.

Homogeneity Calibration

This process is used to correct for inhomogeneities or unevenness in lighting. It is particularly important when smaller details are mosaiced together to produce larger composite images, as uneven lighting in this situation can lead to a patchwork appearance in the finished composite. The homogeneity of any arrangement of lights can be measured by simply placing a flat, even, grey card in front of the object or area to be imaged. If the lights are homogeneous then the digital image of the grey card will be an even, flat grey. However, this situation is rare. Normally a captured grey card image will display a range of tones defining the varying strength of the light[15] across the image area. The inhomogeneity of a lighting arrangement, recorded with a grey card, will generally stay the same until the relative positions of the camera, lights and target area are changed. Therefore, the calculations required to transform the captured grey image into a flat, even grey can also be used to calibrate any further captured images. If a large number of images are captured at one time, it may be necessary to regularly capture new grey card images to compensate for any gradual change in the lighting over time.
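
One common way to carry out such a correction is to scale every pixel by the ratio between the average grey card value and the grey card value at that position. The sketch below assumes this simple multiplicative model, 8-bit data and a grey card image with no zero values; the function name and the pixel values are invented.

```python
def homogeneity_correct(image, grey_card):
    """Scale each pixel by (mean card level / card level at that pixel)."""
    card_values = [value for row in grey_card for value in row]
    target = sum(card_values) / len(card_values)  # mean grey level of the card
    corrected = []
    for image_row, card_row in zip(image, grey_card):
        corrected.append([
            min(255, round(pixel * target / card))  # clip to the 8-bit maximum
            for pixel, card in zip(image_row, card_row)
        ])
    return corrected


# Made-up 2 x 2 data: lighting is brighter top right and dimmer bottom left.
grey_card = [[100, 120], [80, 100]]
image = [[50, 60], [40, 200]]

print(homogeneity_correct(grey_card, grey_card))  # the card itself becomes flat
print(homogeneity_correct(image, grey_card))
```

Applying the correction to the grey card image itself is a quick check that the result is the flat, even grey described above.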

Colour Calibration

In imaging, the captured colour of an object is produced by a combination of the colour of the light being reflected from, or emitted by, an object and the colour sensitivity of the detector. If you change the lighting or the detector the colours in the final digital image can also change. Colour calibration is the process of correcting the captured colours to account for different lights and detectors. This is done by imaging a target which contains a set of known coloured patches and again the calculations required to transform the captured colours into the known values can be used to calibrate any further images captured under the same conditions.
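
As a heavily simplified illustration, the sketch below derives one scale factor per band from a single known grey patch and applies it to other pixels. Real colour calibration uses many patches and more elaborate transforms; the patch values and function names here are assumptions made for the example only.

```python
def band_gains(captured_patch, known_patch):
    """Per-band scale factors that map the captured patch onto its known value."""
    return tuple(known / captured
                 for captured, known in zip(captured_patch, known_patch))


def apply_gains(pixel, gains):
    """Scale each band of an (R, G, B) pixel, clipping to the 8-bit maximum."""
    return tuple(min(255, round(value * gain))
                 for value, gain in zip(pixel, gains))


captured_grey = (200, 200, 160)  # a grey patch captured with a yellow cast
known_grey = (180, 180, 180)     # the value the patch is known to have

gains = band_gains(captured_grey, known_grey)
print(apply_gains(captured_grey, gains))   # the patch itself is now neutral
print(apply_gains((100, 100, 80), gains))  # another pixel from the same set-up
```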

Spatial Distortion

In a captured image of a painting, each pixel in the image represents a real physical area on the painting. If the image contains no spatial distortion then the relative positions of any two pixels in the image will be the same as the relative positions of the corresponding areas in the painting. If the image is distorted, the relative positions of the image pixels begin to change with respect to the corresponding real areas. One of the most common examples of spatial distortion in photography, especially with wide-angle lenses, is where the very edges of the image begin to appear curved, as if the image were being projected onto a ball rather than a flat screen. The degree of distortion can be measured by comparing the captured image of a computer generated grid with the perfect original.
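
Assuming the positions of the grid crossings have already been located in both the perfect grid and the captured image, the comparison can be as simple as measuring how far each point has moved. The coordinates and the function name below are made up for illustration.

```python
import math


def displacements(ideal_points, captured_points):
    """Distance, in pixels, between each ideal grid point and its captured position."""
    return [
        math.hypot(cx - ix, cy - iy)
        for (ix, iy), (cx, cy) in zip(ideal_points, captured_points)
    ]


ideal = [(0, 0), (10, 0), (0, 10), (10, 10)]
captured = [(0, 0), (10, 0), (0, 10), (13, 14)]  # one corner has shifted

errors = displacements(ideal, captured)
print(errors, max(errors))
```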


  1. For example: visible light, infra-red, ultra-violet, X-rays, etc.
  2. RGB, which stands for Red-Green-Blue, indicates a particular type of colour image and will be explained later.
  3. For details on how to change your screen resolution please consult the help files or manuals supplied with your operating system, (i.e. Windows, Mac OS or Linux).
  4. For an image to appear clear, when displayed using the full size of a display, it will need to have an image resolution equal to or higher than the screen resolution. This is important to remember when preparing images for PowerPoint presentations. At the time of writing typical digital projectors have a resolution of 1024 × 768, though higher resolution equipment is becoming available.
  5. Here, Dots and Points, both refer to exactly the same thing. Printer resolution can also be referred to as "Pixels Per Printed Inch", (PPPI).
  6. A 1800 × 2400 pixel digital image is 1800 pixels wide and 2400 pixels high, a total of 4320000 (1800 × 2400) pixels, or about 4.3 Megapixels
  7. The high DPI values are really an indication of how precise a printer is. A value of 4800 DPI actually indicates that a given printer is able to place a dot of ink in 4800 different positions in every inch, even though there is a high degree of overlap between one positioned dot and the next.
  8. In some types of colour digital image, such as CMYK (a mix of four colours; cyan, magenta, yellow and black), which have been designed for printing rather than viewing on a screen, the relationship between shade and number is actually reversed. Here a higher number is used to indicate that a larger amount of ink will be required to produce a given shade.
  9. All information on computers is stored in the form of binary data, 0s and 1s. A single "bit" is an individual binary object and can be either 0 or 1.
  10. For example a 1-bit image could become an 8-bit image containing pixel values of only 0 and 1. Or a 4-bit image could become an 8-bit image with pixel values limited between 0 and 15.
  11. An integer is any whole number, i.e. no fractions or decimal points.
  12. For example: Billmeyer and Saltzman's Principles of Color Technology, 3rd Edition (March 31, 2000), by Roy S. Berns, ISBN: 047119459X, or The Reproduction of Colour: Sixth Edition (March 2004), by R. W. G. Hunt, ISBN: 0863433685.
  13. Other colour formats can include: Mono, Lab, LCh, XYZ, CMYK, etc. For more information describing these or other colour formats please consult the literature (see footnote 12) or search the web.
  14. See footnote 12.
  15. This can be due to the inhomogeneity of a single light, the angle of incidence of a light, or the combination of a group of lights.