The basics of background subtraction

Mar 25, 2009. Posted under: Linux/Unix, OpenCV, Tutorials, Works

This tutorial explains the basics of background subtraction. First of all, we need to define what a background is and what a foreground is.

We consider the background to be the pixels of the image without motion, and the foreground to be the pixels with motion. The simplest background model assumes that the brightness of each background pixel varies independently with a normal distribution. We can then build our statistical model of the background by accumulating several dozen frames and their squares, that is:

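$\displaystyle{S(x,y)=\sum_{f=1}^N p_f(x,y)}$
$\displaystyle{Sq(x,y)=\sum_{f=1}^N p_f(x,y)^2}$

Where:

$\displaystyle{f}$ is the frame index, $\displaystyle{N}$ is the total number of frames, and $\displaystyle{p_f(x,y)}$ is the brightness of pixel $\displaystyle{(x,y)}$ in frame $\displaystyle{f}$.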
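The two sums can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: frames are small nested lists of brightness values instead of real images, and the function name `accumulate` is hypothetical. With OpenCV you would use the library accumulators instead.

```python
def accumulate(frames):
    """Return the per-pixel sum S and sum of squares Sq over a list of frames.

    Each frame is a list of rows; each row is a list of brightness values.
    """
    height, width = len(frames[0]), len(frames[0][0])
    S = [[0.0] * width for _ in range(height)]
    Sq = [[0.0] * width for _ in range(height)]
    for frame in frames:
        for y in range(height):
            for x in range(width):
                p = float(frame[y][x])
                S[y][x] += p        # running sum of brightness
                Sq[y][x] += p * p   # running sum of squared brightness
    return S, Sq


# Three hypothetical 1x2 frames: one noisy pixel, one perfectly static pixel
frames = [[[10, 100]], [[12, 100]], [[14, 100]]]
S, Sq = accumulate(frames)
print(S)   # [[36.0, 300.0]]
print(Sq)  # [[440.0, 30000.0]]
```

Only these two accumulators have to be kept in memory, not the individual frames.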

Then the average is:

$\displaystyle{M(x,y)=\frac{S(x,y)}{N}}$

Figure: BG subtraction, mean pixel

We also need the standard deviation for our statistical model:

$\displaystyle{\sigma(x,y)=\sqrt{ \frac{Sq(x,y)}{N} - M(x,y)^2 }}$
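As a rough sketch of these two formulas, the mean and the standard deviation can be derived from the accumulated sums like this. The function name `mean_and_sigma` and the sample values are my own; the clamp to zero is also my addition, to guard against tiny negative values caused by floating-point rounding.

```python
import math


def mean_and_sigma(S, Sq, N):
    """Compute the per-pixel mean M and standard deviation sigma from the sums."""
    M = [[s / N for s in row] for row in S]
    sigma = [
        # Sq/N - M^2 is the variance; clamp at 0 before taking the square root
        [math.sqrt(max(sq / N - m * m, 0.0)) for sq, m in zip(sq_row, m_row)]
        for sq_row, m_row in zip(Sq, M)
    ]
    return M, sigma


# Hypothetical sums of three 1x2 frames with values (10, 12, 14) and (100, 100, 100)
S = [[36.0, 300.0]]
Sq = [[440.0, 30000.0]]
M, sigma = mean_and_sigma(S, Sq, 3)
print(M)      # [[12.0, 100.0]]
print(sigma)  # first pixel is about 1.63, second pixel is 0.0
```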

Once we have our statistical background model, we can take an image and split it into background and foreground. The foreground pixels are the P(x,y) that meet this condition:

$\displaystyle{ | M(x,y) - P(x,y) | \ge \lambda \sigma(x,y) }$
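The condition is evaluated independently for every pixel. A minimal sketch, assuming nested lists for the image and the model, a hypothetical function name `foreground_mask`, and a default threshold constant of 3 chosen for illustration:

```python
def foreground_mask(frame, M, sigma, lam=3.0):
    """Return True where |M - P| >= lam * sigma (foreground), False elsewhere."""
    return [
        [abs(m - p) >= lam * s for p, m, s in zip(f_row, m_row, s_row)]
        for f_row, m_row, s_row in zip(frame, M, sigma)
    ]


# Hypothetical 1x2 model and a new frame
M = [[12.0, 100.0]]
sigma = [[2.0, 1.0]]
frame = [[15, 180]]
mask = foreground_mask(frame, M, sigma)
print(mask)  # [[False, True]]
```

Note that a pixel whose sigma is exactly 0 satisfies the condition even when P equals M, so practical code may want a small lower bound on sigma. That lower bound is a suggestion of mine, not part of the model described here.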

To optimize the operations a little, we can square both sides (both are non-negative) and evaluate this equivalent condition instead, which avoids computing a square root:

$\displaystyle{ {| M(x,y) - P(x,y) |}^2 \ge {(\lambda \sigma(x,y))}^2 }$

$\displaystyle{ {| M(x,y) - P(x,y) |}^2 \ge \lambda^2 \sigma(x,y)^2 }$

Where $\displaystyle{ \sigma(x,y)^2 }$ is

$\displaystyle{\sigma(x,y)^2 = \frac{Sq(x,y)}{N} - M(x,y)^2 }$

And $\displaystyle{ \lambda }$ is a constant or a variable; we can set it to 3, which is the well-known “three sigmas” rule.
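The squared form works directly on the variance, so no square root is needed per pixel. The sketch below is only an illustration: the function name `foreground_mask_squared` and the sample values are my own assumptions.

```python
def foreground_mask_squared(frame, M, var, lam=3.0):
    """Return True where (M - P)^2 >= lam^2 * sigma^2, using the variance directly."""
    lam2 = lam * lam  # lambda^2, e.g. 9 for the three sigmas rule
    return [
        [(m - p) ** 2 >= lam2 * v for p, m, v in zip(f_row, m_row, v_row)]
        for f_row, m_row, v_row in zip(frame, M, var)
    ]


# Hypothetical 1x2 model: mean and variance (sigma^2) per pixel
M = [[12.0, 100.0]]
var = [[4.0, 1.0]]
frame = [[15, 180]]
mask_sq = foreground_mask_squared(frame, M, var)
print(mask_sq)  # [[False, True]]
```

For the first pixel, (12 - 15)^2 = 9 is below 9 * 4 = 36, so it stays background; for the second, (100 - 180)^2 = 6400 is far above 9 * 1 = 9, so it is foreground.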

Figure: BG subtraction, standard deviation comparison

OK, now that we have the theory, let’s implement it.

To implement it, OpenCV gives us some functions: cvAcc and cvSquareAcc.

    cvAcc(frame, acc, NULL);                     // acc   += frame
    cvSquareAcc(frame, sqacc, NULL);             // sqacc += frame^2
    N++;
    cvConvertScale(acc, M, 1.0 / N, 0);          // M = acc / N (the mean)
    cvConvertScale(sqacc, sqaccM, 1.0 / N, 0);   // sqaccM = sqacc / N
    cvMul(M, M, M2, 1);                          // M2 = M^2
    cvSub(sqaccM, M2, sig2, NULL);               // sig2 is sigma squared (the variance)

    // Foreground detection condition
    cvConvertScale(sig2, lambda_sig2, 9.0, 0);   // lambda^2 * sigma^2, with lambda = 3
    cvSub(M, frame, leftCondition, NULL);        // M - P
    cvMul(leftCondition, leftCondition, leftCondition2, 1);   // (M - P)^2

    // Compare leftCondition2 >= lambda_sig2, pixel by pixel,
    // to detect foreground.
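Since the snippet stops before the final comparison, here is a small plain-Python sketch of the same steps, including that last per-pixel comparison. It is not the original implementation: the class name `RunningBackgroundModel`, the 0/255 mask values and the nested-list images are assumptions of mine, and it calls no OpenCV function.

```python
class RunningBackgroundModel:
    """Per-pixel mean/variance background model, updated one frame at a time."""

    def __init__(self, height, width, lam=3.0):
        self.n = 0
        self.lam2 = lam * lam  # lambda^2
        self.acc = [[0.0] * width for _ in range(height)]
        self.sqacc = [[0.0] * width for _ in range(height)]

    def update(self, frame):
        """Add one frame to both sums (the role of cvAcc and cvSquareAcc)."""
        self.n += 1
        for y, row in enumerate(frame):
            for x, p in enumerate(row):
                self.acc[y][x] += p
                self.sqacc[y][x] += p * p

    def foreground(self, frame):
        """Return 255 where (M - P)^2 >= lambda^2 * sigma^2, else 0.

        Requires at least one previous call to update().
        """
        mask = []
        for y, row in enumerate(frame):
            mask_row = []
            for x, p in enumerate(row):
                m = self.acc[y][x] / self.n                 # mean
                sig2 = self.sqacc[y][x] / self.n - m * m    # variance
                mask_row.append(255 if (m - p) ** 2 >= self.lam2 * sig2 else 0)
            mask.append(mask_row)
        return mask


model = RunningBackgroundModel(height=1, width=2)
for frame in ([[10, 100]], [[12, 102]], [[14, 104]]):
    model.update(frame)
mask = model.foreground([[13, 180]])
print(mask)  # [[0, 255]]
```

The first pixel stays within three sigmas of its mean and is marked as background; the second jumps from about 102 to 180 and is marked as foreground.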

Results:

Note:
This code is an example and is not complete.
The next tutorial will cover a more advanced background subtraction.

• Thanks for sharing some theory behind this subject. Looking forward to the more advanced background subtraction article!

• Hey would like to see an example of this working

• Hi,
which literature are you using for the theory?

• Why? This theory appears in the old OpenCV documentation PDF.

• Hello damiles,

I’ve to say your blog is very useful for those of us who are starting out in computer vision. I’m also using OpenCV for a project in which I have to recognize people in a 3D environment (I know it’s not easy). I’d like to ask you one question: isn’t calling cvQueryFrame every time in a loop to get a frame from the camera too computationally expensive? Or does OpenCV manage it so that the operating system can take control?

• Thanks a lot.
Could you explain more about how the code works?

I’m waiting for your next post on this problem. I intend to use this to make the classifier more robust (in my hand posture recognition program).

• Hi,

I’m a PhD student and I want to have background subtraction for my processing before the end of the month. Your method is very interesting. Is it possible to get the code to run tests on my project?

• hey,

got pretty much all of it but got a simple question for the last part when you do “//Compare leftCondition > lambda_sig2
//to detect foreground.”
Do you do a pixel or an image comparison? am not too sure how to do a pixel comparison so it will be great if you can explain it.

• It is by pixel, cli624.

• Complete state of the art in the field in the following publications:

S. Elhabian, K. El-Sayed, S. Ahmed, “Moving Object Detection in Spatial Domain using Background Removal Techniques – State-of-Art”, Recent Patents on Computer Science, Volume 1, Number 1, pages 32-54, January 2008.

T. Bouwmans, F. El Baf, B. Vachon, “Background Modeling using Mixture of Gaussians for Foreground Detection – A Survey”, Recent Patents on Computer Science, Volume 1, No 3, pages 219-237, November 2008.

T. Bouwmans, F. El Baf, B. Vachon, “Statistical Background Modeling for Foreground Detection: A Survey”, Handbook of Pattern Recognition and Computer Vision, World Scientific Publishing, Volume 4, Part 2, Chapter 3, pages 181-199, January 2010.

• What about the cvRunningAvg() function? I’m using it in my application for background subtraction but I’m getting an unhandled exception. I would be grateful if you could help.

• hi Sam, please send me the piece of code and the error you get.

• Can you please show us an example of background subtraction? How can we get a mask? I am doing an ancient coins recognition system. In my case, I have images of coins. How can I segment the coin from background?

I suggest you use the circle Hough transform to detect circular things in the image.

• Dear Damiles

Thank you for such basic tutorials for beginners in computer vision like me!

I found this tutorial really useful. Can you share with us the code that implements your tutorial?

Regards
-Tai

• Hi Tai, I wrote this tutorial a long time ago and I can’t find the code, but I will look for it again.

Regards Damiles

• Hi Damiles,
Your subject is very interesting. I used your code in my project about person fall detection using a Kinect camera.
But I would like to know how, or with which data, you generated your BG subtraction mean pixel graph?

I don’t understand, because your mean pixel value must be the same for all frames. The mean image is normally calculated once for all frames, no?

Thanks

Regards !

• Now that cv2 is out and the interface has changed, I wonder if you would recast your ideas in the new form?

    import cv2
    import numpy as np

    cap = cv2.VideoCapture(0)
    BGsample = 30  # capture 30 frames for BG sample
    dim = (int(cap.get(4)), int(cap.get(3)))  # NumPy shape is (height, width)
    print(dim)
    acc = np.zeros(dim, np.float32)    # 32 bit accumulator
    sqacc = np.zeros(dim, np.float32)  # 32 bit accumulator
    success, img = cap.read()
    if success:
        for i in range(BGsample):
            success, img = cap.read()
            frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            cv2.accumulate(frame, acc)
            cv2.accumulateSquare(frame, sqacc)
        # BG samples gathered – now process
        M = acc*1/float(BGsample)
        sqaccM = sqacc*1/float(BGsample)
        M2 = M*M
        sig2 = sqaccM-M2  # variance: mean of squares minus squared mean

        # begin looking for FG
        success, img = cap.read()
        frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

• This is working quite well but I don’t have the maths right for the FG detection:

    import cv2
    import numpy as np

    if __name__ == '__main__':
        cap = cv2.VideoCapture(1)
        cv2.namedWindow("input")
        cv2.namedWindow("sig2")
        cv2.namedWindow("detect")
        BGsample = 20
        success, img = cap.read()
        width = int(cap.get(3))   # np.zeros needs integer dimensions
        height = int(cap.get(4))
        key = -1
        if success:
            acc = np.zeros((height, width), np.float32)    # 32 bit accumulator
            sqacc = np.zeros((height, width), np.float32)  # 32 bit accumulator
            for i in range(20):
                a = cap.read()  # dummy to warm up sensor
            for i in range(BGsample):
                success, img = cap.read()
                frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                cv2.accumulate(frame, acc)
                cv2.accumulateSquare(frame, sqacc)
            #
            M = acc/BGsample
            print("div", M[0][0])
            sqaccM = sqacc/float(BGsample)
            M2 = M*M
            sig2 = sqaccM-M2
            #
            # do FG detection
            while(key < 0):
                success, img = cap.read()
                frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                # Ideally we create a mask for future use that is B/W for FG objects
                # (using erode or dilate to remove noise)
                grey = cv2.morphologyEx((M+sig2-frame)/60, cv2.MORPH_DILATE,
                                        cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3)),
                                        iterations=3)
                cv2.imshow("input", frame)
                cv2.imshow("sig2", sig2/60)
                cv2.imshow("detect", grey/20)
                key = cv2.waitKey(1)
        cv2.destroyAllWindows()

• Sorry for responding so late.

The mean and the standard deviation are computed from the N previous frames.

• I think this is a working solution.
But I would prefer output was available as a mask – for further processing
and that I could use erode or dilate to remove pixel noise.
    import cv2
    import numpy as np

    if __name__ == '__main__':
        cap = cv2.VideoCapture(1)
        cv2.namedWindow("input")
        #cv2.namedWindow("sig2")
        cv2.namedWindow("detect")
        BGsample = 20  # number of frames to gather BG samples from at start of capture
        success, img = cap.read()
        width = int(cap.get(3))   # np.zeros needs integer dimensions
        height = int(cap.get(4))
        if success:
            acc = np.zeros((height, width), np.float32)    # 32 bit accumulator
            sqacc = np.zeros((height, width), np.float32)  # 32 bit accumulator
            for i in range(20):
                a = cap.read()  # dummy to warm up sensor
            # gather BG samples
            for i in range(BGsample):
                success, img = cap.read()
                frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                cv2.accumulate(frame, acc)
                cv2.accumulateSquare(frame, sqacc)
            #
            M = acc/float(BGsample)
            (mu, sigma) = cv2.meanStdDev(M)
            print(type(mu), type(sigma), mu.shape, sigma.shape)
            sqaccM = sqacc/float(BGsample)
            M2 = M*M
            sig2 = sqaccM-M2
            #sig2 = 3* sig2
            # have BG samples now
            # coerce mean etc into 8bit image space
            detectmin = cv2.convertScaleAbs(M-sig2)
            detectmax = cv2.convertScaleAbs(M+sig2)
            # start FG detection
            key = -1
            while(key < 0):
                success, img = cap.read()
                frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                # Ideally we create a mask for future use that is B/W for FG objects
                # (using erode or dilate to remove noise)
                level = cv2.inRange(frame, detectmin, detectmax)
                cv2.imshow("input", frame)
                #cv2.imshow("sig2", M/200)
                cv2.imshow("detect", level)
                key = cv2.waitKey(1)
        cv2.destroyAllWindows()

• needs formatting – can’t determine proper tag to use to preserve spacing.
also you could delete the older posts (and this one)