The IplImage Structure

Before reading this page it is highly advisable to quickly review the IplImage structure, a version of which is kept on this page. (Use 'Find' or [Ctrl-F] and type in 'IplImage'.)

OpenCV has a tendency to hide information, especially when you only tend to use the HighGUI functions or the one-liner methods that take 15 parameters. As such, you tend to miss some of the important stuff that can be useful to know when working at the lower levels. This page is an attempt to address the problem by showing you some of the 'inner workings' of the IplImage structure.

It is worth noting that images are not stored in the 'traditional' RGB colour order; they're actually stored in BGR (the other way round). Why this is I'm not entirely sure, but you rarely notice it, as all the methods are written to use BGR as well.

Create An Image

To draw a red square we'll need to start off by creating an image.

IplImage *img = cvCreateImage(cvSize(100, 100), IPL_DEPTH_8U, 3);

This creates an image of width/height 100/100, using 8-bit unsigned integers to represent the colour values, and with 3 colour channels. However, 8-bit unsigned values are not the only type available; values can also be held as 32-bit floating point numbers (IPL_DEPTH_32F) and in a variety of other ways. In each case the depth is written as IPL_DEPTH_<bits>{U|S|F}, where U, S and F stand for unsigned, signed and floating point, e.g. IPL_DEPTH_8U, IPL_DEPTH_16S or IPL_DEPTH_64F.
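To make the naming scheme a little more concrete, here is a small sketch of how the number of bytes per colour value can be read off a depth constant. The DEPTH_* names are hypothetical stand-ins so that the snippet compiles without OpenCV; the values are, as far as I know, the ones OpenCV's C headers give to the matching IPL_DEPTH_* constants, so check them against your own headers before relying on them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for OpenCV's IPL_DEPTH_* constants.
   Assumption: the low bits hold the bit count and the top bit marks
   a signed integer type, as in OpenCV's C headers. */
#define DEPTH_SIGN 0x80000000u
#define DEPTH_8U   8u
#define DEPTH_8S   (DEPTH_SIGN | 8u)
#define DEPTH_16U  16u
#define DEPTH_16S  (DEPTH_SIGN | 16u)
#define DEPTH_32S  (DEPTH_SIGN | 32u)
#define DEPTH_32F  32u
#define DEPTH_64F  64u

/* Bytes used by one colour value of the given depth. */
size_t bytes_per_value(unsigned int depth)
{
    unsigned int bits = depth & 0xFFu;   /* strip the sign flag */
    assert(bits != 0u && bits % 8u == 0u);
    return bits / 8u;
}

/* Non-zero when the depth describes a signed integer type. */
int depth_is_signed(unsigned int depth)
{
    return (depth & DEPTH_SIGN) != 0u;
}
```

So an 8U value takes one byte and a 32F value takes four, which matters later on when looping over the raw data.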

Also notice that it's a pointer to an image - all images should be created in this way when using OpenCV as most (if not all) of its methods take image pointers as parameters in order to modify images directly.

Image Data

Images are not stored as an array of pixel structures. Instead, imageData is one flat array of colour values in which the channels of each pixel sit next to each other. This means less processor overhead, as you're not constantly dereferencing pointers. The values of each pixel are stored in BGR order (as mentioned above).

e.g. IplImage's imageData field looks like this...

imageData[0] imageData[1] imageData[2] imageData[3] imageData[4] imageData[5]
blue         green        red          blue         green        red
(colour values go in these elements)

as opposed to this:

imageData[0]              imageData[1]
-> red -> green -> blue   -> red -> green -> blue
(colour values go in these elements)

Greyscale image structures differ very slightly - instead of having three channels they have just the one (for brightness) that can be accessed in the same way.
      i.e. cvCreateImage(cvSize(100, 100), IPL_DEPTH_8U, 1)
In this case the first pixel would be imageData[0], the second would be imageData[1] and so on.
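The layout described above boils down to one line of index arithmetic. The following is only a sketch: value_offset is a hypothetical helper (not an OpenCV function), and it assumes 8-bit values in a tightly packed buffer, i.e. it ignores any row padding.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of one colour value in a tightly packed, interleaved buffer.
   'pixel' counts pixels from the start of the buffer. For a 3-channel
   BGR image 'channel' is 0 = blue, 1 = green, 2 = red; for a greyscale
   image it is always 0. */
size_t value_offset(size_t pixel, int channels, int channel)
{
    assert(channels > 0 && channel >= 0 && channel < channels);
    return pixel * (size_t)channels + (size_t)channel;
}
```

For example, the red value of the first pixel of a colour image is at offset 2, the blue value of the second pixel is at offset 3, and pixel 5 of a greyscale image is simply at offset 5.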

Finally, images in OpenCV are padded. By default each row of an IplImage is padded out so that its length in bytes (held in the widthStep field) is a multiple of 4. Other libraries and file formats may align their rows differently or not at all, which means that if you ever get round to converting between image structures, you can get some rather interesting skewing effects if you try to simply copy the data arrays over. (This was discovered whilst trying to convert between Leeds' libRTImage library and OpenCV. If you're interested, this is what I came up with.)
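To avoid the skewing, the copy has to be done a row at a time, stepping through each buffer by its own row length. Here is a minimal sketch of the idea; padded_row_bytes and copy_rows are hypothetical helpers written for this page, not OpenCV functions, and the 4-byte alignment is passed in as a parameter rather than assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Round a row length in bytes up to the next multiple of 'align'. */
size_t padded_row_bytes(size_t row_bytes, size_t align)
{
    assert(align > 0);
    return (row_bytes + align - 1) / align * align;
}

/* Copy 'rows' rows of 'row_bytes' useful bytes each between two buffers
   whose rows start 'src_step' and 'dst_step' bytes apart. Any padding
   bytes at the end of a row are left alone. */
void copy_rows(unsigned char *dst, size_t dst_step,
               const unsigned char *src, size_t src_step,
               size_t row_bytes, size_t rows)
{
    size_t y;
    assert(row_bytes <= src_step && row_bytes <= dst_step);
    for (y = 0; y < rows; y++)
        memcpy(dst + y * dst_step, src + y * src_step, row_bytes);
}
```

For an image that is 3 pixels wide with 3 one-byte channels, a row holds 9 useful bytes but occupies 12 once padded to a multiple of 4; with an IplImage you would pass img->widthStep as the step for that side of the copy.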

Direct Pixel Access

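So to get our red square going we'll just have to set every third value, starting at index 2 (red comes last in BGR order). Direct access to the pixels is possible through the imageData field, and the number of bytes in the image (img->imageSize) can be used as a quick way of bounding the for loop. This shortcut relies on the rows having no padding, which holds here because 100 pixels * 3 bytes = 300 bytes is already a multiple of 4.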

img->imageData[i] = value;

so we get:

int i;
for (i = 2; i < img->imageSize; i+=3)
   img->imageData[i] = 255;
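The loop above is fine for a 100-pixel-wide image, but for widths where the rows are padded it is safer to walk the image row by row. The sketch below shows one way of doing that. MiniImage is a hypothetical stand-in that carries only the IplImage fields the loop needs, so the snippet compiles without OpenCV; with the real structure you would pass an IplImage pointer instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the IplImage fields used below. */
typedef struct {
    int width, height, nChannels;
    int widthStep;      /* bytes from the start of one row to the next */
    char *imageData;
} MiniImage;

/* Set the red channel of every pixel in an 8-bit, 3-channel BGR image,
   leaving blue, green and any row padding untouched. */
void fill_red(MiniImage *img, unsigned char level)
{
    int x, y;
    assert(img != NULL && img->nChannels == 3);
    for (y = 0; y < img->height; y++) {
        unsigned char *row = (unsigned char *)img->imageData
                           + (size_t)y * (size_t)img->widthStep;
        for (x = 0; x < img->width; x++)
            row[x * 3 + 2] = level;   /* index 2 within a pixel is red */
    }
}
```

Casting to unsigned char also sidesteps the fact that imageData is a plain char pointer, for which 255 may not be a representable value.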

(For a finished file which displays the square in a window, click here.)

It is worth noting that while most images and methods in OpenCV use or return 8-bit unsigned data (e.g. cvLoadImage returns an IPL_DEPTH_8U image by default), the structure itself is not tied to that type. imageData isn't declared as int or float; it's actually a char pointer to the raw bytes of the image, whatever the depth.

It would seem that this is done for versatility, but it causes a little confusion when you use your first 32-bit float image. A 32F image can hold any float value, but cvShowImage maps the range 0 to 1 onto black to white, so we have to scale our colour values accordingly (with a rather high degree of accuracy, as you would expect). We also have to change the way the for loop is bounded: imageSize is measured in bytes, and as there are now four bytes per colour value (floats are four bytes each), indexing the data as floats all the way up to imageSize runs off the end of the buffer a quarter of the way through the loop, which usually ends in a segmentation fault. Instead we can use the image's width and height attributes, multiplying by 3 so that all channels are filled. Finally, imageData needs to be cast to a float pointer so that the values are stored in the correct format. The following code should clarify things.

int i;
for (i = 0; i < img->width*img->height*3; i+=3) {
   ((float*)img->imageData)[i] = 64/256.0;      /* blue */
   ((float*)img->imageData)[i+1] = 196/256.0;   /* green */
   ((float*)img->imageData)[i+2] = 256/256.0;   /* red */
}
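As with the 8-bit case, a row-by-row version is a little more robust, because it doesn't depend on the rows being free of padding. What follows is just a sketch: MiniImage is a hypothetical stand-in for the handful of IplImage fields involved, and the image is assumed to be 32-bit float with 3 channels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the IplImage fields used below. */
typedef struct {
    int width, height, nChannels;
    int widthStep;      /* row length in BYTES, not in floats */
    char *imageData;
} MiniImage;

/* Fill a 32-bit float, 3-channel image with a single BGR colour. */
void fill_bgr_float(MiniImage *img, float b, float g, float r)
{
    int x, y;
    assert(img != NULL && img->nChannels == 3);
    for (y = 0; y < img->height; y++) {
        /* Step to the row in bytes first, then treat it as floats. */
        float *row = (float *)(img->imageData
                               + (size_t)y * (size_t)img->widthStep);
        for (x = 0; x < img->width; x++) {
            row[x * 3 + 0] = b;
            row[x * 3 + 1] = g;
            row[x * 3 + 2] = r;
        }
    }
}
```

The important detail is the order of operations: the row offset is added while the pointer is still a char pointer, and only then is it cast to float.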


The orange square program shows the complete version of the above. Click here to see it.

Image Representations

There are several colour space conversions available within OpenCV, through use of the cvCvtColor function:

cvCvtColor(source, destination, space_code);

Here, space_code typically names the source colour space and the desired colour space (CV_BGR2GRAY, for example), but note that in each case the source and destination images must have the correct number of channels. Codes exist for conversions involving the following colour spaces:

CIE XYZ
YCrCb JPEG (a.k.a. YCC)
HSV
HLS
CIE Lab
CIE Luv
Bayer (a pattern widely used in CCD and CMOS cameras)

However, whilst IplImage's attributes include one for colorModel, OpenCV completely ignores this when displaying or processing an image. Instead it chooses to assume BGR order, and this can lead to strange output if an image holding anything else is displayed using cvShowImage.
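The most common case of this is data that arrives in RGB order from another library: displayed as-is, red and blue come out swapped. Within OpenCV, cvCvtColor with the CV_RGB2BGR code handles this; the sketch below shows what such a swap amounts to on the raw bytes. swap_red_blue is a hypothetical helper written for this page, and it assumes 8-bit values, 3 channels and no row padding.

```c
#include <assert.h>
#include <stddef.h>

/* Swap the first and third value of every pixel in a tightly packed
   8-bit, 3-channel buffer, turning RGB order into BGR or back again. */
void swap_red_blue(unsigned char *data, size_t pixel_count)
{
    size_t i;
    assert(data != NULL);
    for (i = 0; i < pixel_count; i++) {
        unsigned char tmp = data[i * 3];
        data[i * 3] = data[i * 3 + 2];
        data[i * 3 + 2] = tmp;
    }
}
```

The middle (green) value of each pixel stays where it is, so running the function twice gives back the original buffer.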