Possible race condition in ConnectionTCPBase::cleanup() (gloox 0.95 stable)
From: "Ilian Jeri Pinzon" <ilianpinzon@xxxxxxxxx>
Date: Mon, 22 Oct 2007 13:50:43 +0200 (CEST)
Hi,

I'm encountering a possible race condition when disconnecting using gloox
v0.95 stable on Windows XP with VS 2005. ConnectionTCPBase::cleanup()
crashes when it locks m_sendMutex/m_recvMutex after the ConnectionTCPBase
object has already been destroyed.

First of all, I'm calling ClientBase::connect() synchronously, but from a
separate thread so that I still get asynchronous behavior. I'm not sure if
this is the right way to disconnect, but I'm doing it from my class's
destructor. Here's the pseudocode:

a. client->disconnect()
b. block here until we receive onDisconnect()
c. delete client
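
For illustration, steps (a) to (c) could be sketched roughly as below. This
is a stand-alone model and does not use gloox: FakeClient, DisconnectWaiter
and shutdown_sequence() are hypothetical names, and the fake disconnect()
simply invokes the callback on the calling thread. It assumes C++14.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the client; NOT the gloox API.
struct FakeClient {
    std::function<void()> onDisconnect;  // invoked when a disconnect is reported
    void disconnect() { if (onDisconnect) onDisconnect(); }
};

// Lets one thread block until another signals that onDisconnect() fired.
class DisconnectWaiter {
public:
    void signal() {
        {
            std::lock_guard<std::mutex> lock(m_);
            done_ = true;
        }
        cv_.notify_all();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return done_; });
    }
    bool signaled() {
        std::lock_guard<std::mutex> lock(m_);
        return done_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool done_ = false;
};

// Steps (a) to (c) from the list above; returns true once the callback was seen.
bool shutdown_sequence() {
    DisconnectWaiter waiter;
    auto client = std::make_unique<FakeClient>();
    client->onDisconnect = [&waiter] { waiter.signal(); };
    assert(client->onDisconnect);  // callback must be registered before (a)

    client->disconnect();          // (a)
    waiter.wait();                 // (b) block until onDisconnect() was signaled
    bool seen = waiter.signaled();
    client.reset();                // (c) delete client
    return seen;
}
```

Calling shutdown_sequence() from a test program runs the three steps once.
Note that the wait in (b) only proves the callback fired, not that every
thread has left the client.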

When step (a) runs, two things happen simultaneously:

(1) Connection::recv() stops and calls ClientBase::notifyDisconnect().
(2) ClientBase::disconnect() also calls ClientBase::notifyDisconnect().

On my machine, the client is already being destructed (step c) even though
(1) has not completed yet, because (2) has already signaled onDisconnect().
This causes the crash in ConnectionTCPBase::cleanup() when (1) finishes.
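
A possible application-side way to avoid this ordering problem, sketched
under assumptions: because connect() blocks on the worker thread, joining
that thread before step (c) guarantees that everything running inside
connect() has returned, including any cleanup that follows (1). The code
below is a stand-alone model, not gloox code. FakeClient, its members and
safe_teardown() are hypothetical, and the sleep only simulates (1)
finishing late.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical model of a client with a blocking connect(); NOT the gloox API.
class FakeClient {
public:
    // Blocks until disconnect() is requested, then does late cleanup,
    // mimicking path (1).
    void connect() {
        while (!stop_.load()) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        // Simulate (1) still running after the disconnect was requested.
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        std::lock_guard<std::mutex> lock(sendMutex_);  // touches a member mutex
        cleanedUp_.store(true);
    }
    // Path (2): only requests the stop in this model.
    void disconnect() { stop_.store(true); }
    bool cleanedUp() const { return cleanedUp_.load(); }
private:
    std::atomic<bool> stop_{false};
    std::atomic<bool> cleanedUp_{false};
    std::mutex sendMutex_;
};

// Returns true if the late cleanup had finished before the client was destroyed.
bool safe_teardown() {
    auto client = std::make_unique<FakeClient>();
    FakeClient* raw = client.get();
    std::thread worker([raw] { raw->connect(); });

    client->disconnect();      // (a)
    assert(worker.joinable());
    worker.join();             // wait for connect() itself to return
    bool finished = client->cleanedUp();
    client.reset();            // (c) no thread is inside the client any more
    return finished;
}
```

The design choice here is to treat the return of connect() as the point
where deletion becomes safe, instead of the onDisconnect() callback.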

My temporary fix is to remove notifyDisconnect() in (2) and rely on (1)'s
notifyDisconnect() only. You might have a better solution.

- Ilian