We treat the pixels of an image that show no motion as background, and the pixels that show motion as foreground. The simplest background model assumes that the brightness of each background pixel varies independently, following a normal distribution. We can then build a statistical model of the background by accumulating several dozen frames, together with their squares:

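$latex \displaystyle{S(x,y)=\sum_{f=1}^N p_f(x,y)}$

$latex \displaystyle{Sq(x,y)=\sum_{f=1}^N p_f(x,y)^2}$

Where:

$latex \displaystyle{f = \text{frame index}; \quad N = \text{total number of frames}; \quad p_f(x,y) = \text{brightness of pixel } (x,y) \text{ in frame } f}$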

Then the average is:

$latex \displaystyle{M(x,y)=\frac{S(x,y)}{N}}$

[caption id="attachment_160" align="aligncenter" width="300" caption="BG subtraction mean pixel"][/caption]

We also need the standard deviation for our statistical model:

$latex \displaystyle{\sigma(x,y)=\sqrt{ \frac{Sq(x,y)}{N} - M(x,y)^2 }}$
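As a rough illustration of the formulas so far, here is a minimal NumPy sketch (Python rather than the OpenCV C API). The function name `background_model` and the synthetic frames are assumptions made for this example and are not part of OpenCV.

```python
import numpy as np

def background_model(frames):
    """Return per-pixel mean M and standard deviation sigma of a frame stack.

    frames: array of shape (N, height, width), one grayscale frame per entry.
    """
    frames = np.asarray(frames, dtype=np.float64)
    n = frames.shape[0]
    s = frames.sum(axis=0)            # S(x,y): sum of pixel values
    sq = (frames ** 2).sum(axis=0)    # Sq(x,y): sum of squared pixel values
    mean = s / n                      # M(x,y)
    var = sq / n - mean ** 2          # sigma^2 = Sq/N - M^2
    var = np.maximum(var, 0.0)        # rounding can push tiny variances below zero
    return mean, np.sqrt(var)

# Synthetic static scene: constant brightness 100 plus Gaussian noise (std 5).
rng = np.random.default_rng(0)
frames = 100.0 + 5.0 * rng.standard_normal((50, 4, 6))
M, sigma = background_model(frames)
```

The result should agree with `frames.mean(axis=0)` and `frames.std(axis=0)`, because the formula above is the population (divide by N) standard deviation.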

Once we have the statistical background model, we can take an image and split it into background and foreground. The foreground pixels are the P(x,y) that meet this condition:

$latex \displaystyle{ | M(x,y) - P(x,y) | \ge \lambda \sigma(x,y) }$

To optimize the operations a little, we can evaluate the squared form of the condition instead, which avoids computing a square root:

$latex \displaystyle{ {| M(x,y) - P(x,y) |}^2 \ge {(\lambda \sigma(x,y))}^2 }$

$latex \displaystyle{ {| M(x,y) - P(x,y) |}^2 \ge \lambda^2 \sigma(x,y)^2 }$

Where $latex \displaystyle{ \sigma(x,y)^2 }$ is

$latex \displaystyle{\sigma(x,y)^2 = \frac{Sq(x,y)}{N} - M(x,y)^2 }$

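And $latex \displaystyle{ \lambda }$ is a constant or a variable. We can set it to 3, which is the well-known "three sigmas" rule: for a normally distributed pixel, about 99.7% of background values fall within three standard deviations of the mean.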
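A small sketch of the squared test, again in NumPy. The helper name `foreground_mask` and the toy values are assumptions for illustration only.

```python
import numpy as np

def foreground_mask(frame, mean, var, lam=3.0):
    """True where |M - P|^2 >= lam^2 * sigma^2 (the squared form of the test)."""
    diff = mean - np.asarray(frame, dtype=np.float64)
    return diff * diff >= (lam * lam) * var

# Toy 2x3 model: mean 100 and sigma 2 everywhere (so the variance is 4).
mean = np.full((2, 3), 100.0)
var = np.full((2, 3), 4.0)
frame = np.array([[100, 103, 106],
                  [ 94,  99, 120]])
mask = foreground_mask(frame, mean, var)
# Absolute deviations are 0, 3, 6 / 6, 1, 20 and the threshold is 3 * 2 = 6,
# so mask is [[False, False, True], [True, False, True]].
```

One thing to watch: with `>=`, a pixel whose variance is exactly zero is flagged as foreground even when it equals the mean. Adding a small floor to the variance is one possible workaround, but that is a design choice and not something the formula above prescribes.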

[caption id="attachment_161" align="aligncenter" width="300" caption="BG subtraction standard deviation compare"][/caption]

OK, now that we have the theory, let's implement it.

To develop it, OpenCV gives us some functions: cvAcc and cvSquareAcc.

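`cvAcc(frame, acc, NULL); // acc += frame`

`cvSquareAcc(frame, sqacc, NULL); // sqacc += frame * frame`

`N++;`

`cvConvertScale(acc, M, (double)(1.0/N), 0); // M = acc / N`

`cvConvertScale(sqacc, sqaccM, (double)(1.0/N), 0); // sqaccM = sqacc / N`

`cvMul(M, M, M2, 1); // M2 = M * M`

`cvSub(sqaccM, M2, sig2, NULL); // sig2 is the variance, sigma^2`

`// Foreground detection condition`

`cvConvertScale(sig2, lambda_sig2, (double)9, 0); // lambda^2 * sigma^2 with lambda = 3`

`cvSub(M, frame, leftCond, NULL); // frame must have the same depth as M; convert it first if needed`

`cvMul(leftCond, leftCond, leftCond2, 1); // leftCond2 = (M - frame)^2`

`// Compare leftCond2 >= lambda_sig2 pixel by pixel (for example with cvCmp and CV_CMP_GE)`

`// to detect the foreground.`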

Results:

Note:

This code is an example and is not complete.
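Since the listing leaves the final comparison as a comment, here is a hedged NumPy sketch of the same steps, including the per-pixel comparison. The class name `RunningBackground` and its methods are hypothetical. They mirror the accumulate, scale, subtract and compare steps but are not OpenCV calls.

```python
import numpy as np

class RunningBackground:
    """Accumulates frames one at a time, like cvAcc / cvSquareAcc followed by N++."""

    def __init__(self, shape):
        self.acc = np.zeros(shape, np.float64)    # running sum S
        self.sqacc = np.zeros(shape, np.float64)  # running sum of squares Sq
        self.n = 0

    def update(self, frame):
        f = np.asarray(frame, dtype=np.float64)
        self.acc += f
        self.sqacc += f * f
        self.n += 1

    def model(self):
        mean = self.acc / self.n
        sig2 = np.maximum(self.sqacc / self.n - mean * mean, 0.0)
        return mean, sig2

    def foreground(self, frame, lam=3.0):
        mean, sig2 = self.model()
        left = (mean - np.asarray(frame, dtype=np.float64)) ** 2
        # Element-wise comparison: 255 marks foreground, 0 marks background.
        return np.where(left >= lam * lam * sig2, 255, 0).astype(np.uint8)

bg = RunningBackground((2, 2))
for value in (10, 12, 10, 12):          # four flat frames alternating 10 / 12
    bg.update(np.full((2, 2), value))
test_frame = np.array([[11, 13], [20, 14]])
mask = bg.foreground(test_frame)
# Mean is 11 and variance is 1, so pixels at least 3 away from 11 are foreground:
# mask is [[0, 0], [255, 255]].
```

The comparison is done pixel by pixel, and the result is an 8-bit mask image of the same size as the frame.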

The next tutorial will cover a more advanced background subtraction.

## About David Millán Escrivá

David completed his studies at the Universidad Politecnica de Valencia in IT with a Master's degree in artificial intelligence, computer graphics, and pattern recognition, focusing on pattern recognition and Computer Vision. David has more than 15 years of experience in IT, with more than ten years of experience in Computer Vision, computer graphics, and pattern recognition, working on different projects and start-ups, applying his knowledge of Computer Vision, optical character recognition, and augmented reality. He is the co-author of two OpenCV books and a reviewer of a few more.

thanks for sharing some theory behind this subject. looking forward to the more advanced background subtraction article! :)

Hey, I would like to see an example of this working

hi,

which literature are you using for the theory?

Why? This theory appears in the old OpenCV documentation PDF.

Hello damiles,

I've to say your blog is very useful for those of us who are starting out in computer vision. I'm using OpenCV as well for a project in which I have to recognize people in a 3D environment (I know it's not easy). I'd like to ask you one question: the method used to get a frame from the camera, calling cvQueryFrame every time in a loop, isn't it too expensive computationally speaking? Or does OpenCV manage it in order to let the operating system take control?

Thanks a lot.

could you explain more about how the code works?

I'm waiting for your next post on this problem. I intend to use this to make the classifier more robust (in my hand posture recognition program)

Hi,

I'm a PhD student and I want to have background subtraction for my processing before the end of the month. Your method is very interesting... is it possible to get the code to do tests on my project?

Thank you in advance for your help.

hey,

got pretty much all of it but got a simple question for the last part when you do "//Compare leftCondition > lambda_sig2

//to detect foreground."

Do you do a pixel or an image comparison? I am not too sure how to do a pixel comparison, so it would be great if you could explain it.

Many thanks in advance!

It is by pixel, cli624.

Complete state of the art in the field in the following publications:

S. Elhabian, K. El-Sayed, S. Ahmed, “Moving Object Detection in Spatial Domain using Background Removal Techniques - State-of-Art”, Recent Patents on Computer Science, Volume 1, Number 1, pages 32-54, January 2008.

T. Bouwmans, F. El Baf, B. Vachon, “Background Modeling using Mixture of Gaussians for Foreground Detection - A Survey”, Recent Patents on Computer Science, Volume 1, No 3, pages 219-237, November 2008.

T. Bouwmans, F. El Baf, B. Vachon, “Statistical Background Modeling for Foreground Detection: A Survey”, Handbook of Pattern Recognition and Computer Vision, World Scientific Publishing, Volume 4, Part 2, Chapter 3, pages 181-199, January 2010.

what about the cvRunningAvg() function? I'm using that in my application for background subtraction but I'm getting an unhandled exception. I will be grateful if you can help.

hi Sam, please send me the piece of code and the error you get.

Can you please show us an example of background subtraction? How can we get a mask? I am doing an ancient coins recognition system. In my case, I have images of coins. How can I segment the coin from background?

Hey Damiles,

could you please tell me how I get the individual pixel values of an IplImage, and after that convert the pixel values array back into an IplImage? How can I do it? please reply ASAP :)

@Nadeeshani

I suggest you use the circle Hough transform to detect circular things in the image.

Dear Damiles

thank you for such basic tutorials for beginners at computer vision like me!

I found this tutorial really useful, can you share with us the code that implements your tutorial?

Regards

-Tai

hi Tai, I did this tutorial a long time ago and I can't find the code, but I will look for it again.

Regards, Damiles

Hi Damiles,

Your subject is very interesting. I used your code in my project about person fall detection using a Kinect camera.

But I would like to know how, or with which data, you generated your BG subtraction mean pixel graph?

I don't understand, because your mean pixel value must be the same for all frames. The mean image is normally calculated once for all frames. No?

Thanks

Regards !

Now that cv2 is out and the interface has changed. - I wonder if you would recast your ideas in the new form ?

    import cv2
    import numpy as np

    cap = cv2.VideoCapture(0)
    BGsample = 30  # capture 30 frames for BG sample
    success, img = cap.read()
    # property 3 is the frame width, 4 the frame height; NumPy wants (rows, cols) as ints
    dim = (int(cap.get(4)), int(cap.get(3)))
    print(dim)
    acc = np.zeros(dim, np.float32)  # 32 bit accumulator
    sqacc = np.zeros(dim, np.float32)  # 32 bit accumulator
    if success:
        for i in range(BGsample):
            success, img = cap.read()
            frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            cv2.accumulate(frame, acc)
            cv2.accumulateSquare(frame, sqacc)
        # BG samples gathered - now process
        M = acc / float(BGsample)
        sqaccM = sqacc / float(BGsample)
        M2 = M * M
        sig2 = sqaccM - M2  # variance: mean of squares minus squared mean
        # begin looking for FG
        success, img = cap.read()
        frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        ...

This is working quite well but I don't have the maths right for the FG detection working:

    import cv2
    import numpy as np

    if __name__ == '__main__':
        cap = cv2.VideoCapture(1)
        cv2.namedWindow("input")
        cv2.namedWindow("sig2")
        cv2.namedWindow("detect")
        BGsample = 20
        success, img = cap.read()
        width = int(cap.get(3))   # cap.get() returns floats; NumPy shapes need ints
        height = int(cap.get(4))
        key = -1
        if success:
            acc = np.zeros((height, width), np.float32)  # 32 bit accumulator
            sqacc = np.zeros((height, width), np.float32)  # 32 bit accumulator
            for i in range(20):
                a = cap.read()  # dummy to warm up sensor
            for i in range(BGsample):
                success, img = cap.read()
                frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                cv2.accumulate(frame, acc)
                cv2.accumulateSquare(frame, sqacc)
            #
            M = acc / BGsample
            print("div", M[0][0])
            sqaccM = sqacc / float(BGsample)
            M2 = M * M
            sig2 = sqaccM - M2
            #
            # do FG detection
            while key < 0:
                success, img = cap.read()
                frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                # Ideally we create a mask for future use that is B/W for FG objects
                # (using erode or dilate to remove noise)
                grey = cv2.morphologyEx((M + sig2 - frame) / 60, cv2.MORPH_DILATE,
                                        cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)),
                                        iterations=3)
                cv2.imshow("input", frame)
                cv2.imshow("sig2", sig2 / 60)
                cv2.imshow("detect", grey / 20)
                key = cv2.waitKey(1)
        cv2.destroyAllWindows()

Sorry for responding so late,

The mean and the standard deviation of the normal model are computed from the N previous frames.
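To illustrate that reply: if the model is built from only the N most recent frames, the mean image changes from frame to frame. Below is a minimal NumPy sketch of such a sliding window. The class name `SlidingBackground` and the `window` parameter are assumptions for this example, not code from the post.

```python
from collections import deque
import numpy as np

class SlidingBackground:
    """Mean and variance over the last `window` frames only."""

    def __init__(self, shape, window):
        self.frames = deque()
        self.window = window
        self.acc = np.zeros(shape, np.float64)     # sum over the window
        self.sqacc = np.zeros(shape, np.float64)   # sum of squares over the window

    def update(self, frame):
        f = np.asarray(frame, dtype=np.float64)
        self.frames.append(f)
        self.acc += f
        self.sqacc += f * f
        if len(self.frames) > self.window:
            old = self.frames.popleft()            # forget the oldest frame
            self.acc -= old
            self.sqacc -= old * old

    def model(self):
        n = len(self.frames)
        mean = self.acc / n
        var = np.maximum(self.sqacc / n - mean * mean, 0.0)
        return mean, var

bg = SlidingBackground((1, 2), window=3)
for v in (0, 10, 20, 30, 40):
    bg.update(np.full((1, 2), v))
mean, var = bg.model()   # built from the last three frames: 20, 30, 40
```

Keeping the frames in a queue costs memory proportional to the window size. It lets the oldest frame be subtracted exactly instead of approximated.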

I think this is a working solution.

But I would prefer output was available as a mask - for further processing

and that I could use erode or dilate to remove pixel noise.

    import cv2
    import numpy as np

    if __name__ == '__main__':
        cap = cv2.VideoCapture(1)
        cv2.namedWindow("input")
        #cv2.namedWindow("sig2")
        cv2.namedWindow("detect")
        BGsample = 20  # number of frames to gather BG samples from at start of capture
        success, img = cap.read()
        width = int(cap.get(3))   # cap.get() returns floats; NumPy shapes need ints
        height = int(cap.get(4))
        if success:
            acc = np.zeros((height, width), np.float32)  # 32 bit accumulator
            sqacc = np.zeros((height, width), np.float32)  # 32 bit accumulator
            for i in range(20):
                a = cap.read()  # dummy to warm up sensor
            # gather BG samples
            for i in range(BGsample):
                success, img = cap.read()
                frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                cv2.accumulate(frame, acc)
                cv2.accumulateSquare(frame, sqacc)
            #
            M = acc / float(BGsample)
            (mu, sigma) = cv2.meanStdDev(M)
            print(type(mu), type(sigma), mu.shape, sigma.shape)
            sqaccM = sqacc / float(BGsample)
            M2 = M * M
            sig2 = sqaccM - M2
            # have BG samples now
            # three-sigma band around the mean (sig2 is the variance, so take its root),
            # coerced into 8bit image space
            band = 3 * np.sqrt(np.maximum(sig2, 0))
            detectmin = np.clip(np.ceil(M - band), 0, 255).astype(np.uint8)
            detectmax = np.clip(np.floor(M + band), 0, 255).astype(np.uint8)
            # start FG detection
            key = -1
            while key < 0:
                success, img = cap.read()
                frame = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                # Ideally we create a mask for future use that is B/W for FG objects
                # (using erode or dilate to remove noise)
                # level is 255 where the pixel lies inside the background band
                level = cv2.inRange(frame, detectmin, detectmax)
                cv2.imshow("input", frame)
                #cv2.imshow("sig2", M/200)
                cv2.imshow("detect", level)
                key = cv2.waitKey(1)
        cv2.destroyAllWindows()

needs formatting - can't determine proper tag to use to preserve spacing.

please fix

also you could delete the older posts (and this one)

I am a PhD student and I want to experiment with background subtraction. I would like to see an example of this working. Regards

what does it mean to compute the mean of all frames... why do we need it? And how could I compute all the pixels from a video sequence?

actually I'm trying to implement the paper efficient background subtraction and shadow removal for monochromatic sequence....

please help me

also I need some theory on background subtraction as early as possible....

how could the implementation be done in Matlab?


should you not be using more color frequencies? Into unseen reds especially?
