PDA

View Full Version : Fix QueryPerformaceCounter going Backwards


gnif
2009-02-09, 01:46
I am posting this here as I do not know the best way to code this into XBMC and someone else that knows more about it should really do this..

Please see the thread:
http://xbmc.org/forum/showthread.php?p=281375&posted=1#post281375

I have detailed a solution to fix the backwards time issue that is caused on multi-core systems from querying the clock of random cores from differing threads.

It will need translating where possible to SDL and an initialisation function setup to start the thread.

gnif
2009-02-09, 03:15
Here is a complete sample program


/************************************************** *************************
* Copyright (C) 2009 by Geoffrey McRae *
* geoff@rabidhost.com *
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU General Public License as published by *
* the Free Software Foundation; either version 2 of the License, or *
* (at your option) any later version. *
* *
* This program is distributed in the hope that it will be useful, *
* but WITHOUT ANY WARRANTY; without even the implied warranty of *
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *
* GNU General Public License for more details. *
* *
* You should have received a copy of the GNU General Public License *
* along with this program; if not, write to the *
* Free Software Foundation, Inc., *
* 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. *
************************************************** *************************/

/*
* This program is a working example of getting the high resolution time
* using a thread locked to a single core to avoid the time going backwards
* on SMP systems
*/

#define _GNU_SOURCE
#include <unistd.h>
#include <pthread.h>
#include <time.h>
#include <string.h>
#include <stdio.h>

pthread_cond_t thread_time_cond1 = PTHREAD_COND_INITIALIZER;
pthread_mutex_t thread_time_mutex1 = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t thread_time_cond2 = PTHREAD_COND_INITIALIZER;
pthread_mutex_t thread_time_mutex2 = PTHREAD_MUTEX_INITIALIZER;

struct timespec *thread_time_data;
clockid_t thread_time_clk_id;
int thread_time_ret;

void* thread_time_func(void *data) {
/* Set our affinity to one core so the time never goes backwards */
unsigned long mask = 1;
pthread_setaffinity_np(pthread_self(), sizeof(mask), (cpu_set_t*)&mask);

/* Lock the primary mutex for our condition */
pthread_mutex_lock(&thread_time_mutex1);
while(1) {
/* Wait until someone wants the current time */
pthread_cond_wait(&thread_time_cond1, &thread_time_mutex1);
thread_time_ret = clock_gettime(thread_time_clk_id, thread_time_data);

/* Lock the 2nd mutex, signal the 2nd cond, and unlock */
pthread_mutex_lock(&thread_time_mutex2);
pthread_cond_signal(&thread_time_cond2);
pthread_mutex_unlock(&thread_time_mutex2);
}

pthread_exit(NULL);
return NULL;
}

int clock_gettime_core(clockid_t clk_id, struct timespec *tp) {
int ret;

/* Lock the mutex so we dont change things in the thread mid stride */
pthread_mutex_lock(&thread_time_mutex1);
thread_time_clk_id = clk_id;
thread_time_data = tp;

/* Signal the thread to run */
pthread_mutex_lock(&thread_time_mutex2);
pthread_cond_signal(&thread_time_cond1);
pthread_mutex_unlock(&thread_time_mutex1);

/* Wait for the thread to tell us our data is ready */
pthread_cond_wait(&thread_time_cond2, &thread_time_mutex2);
ret = thread_time_ret;

/* Finally unlock the mutex so this function can be called again */
pthread_mutex_unlock(&thread_time_mutex2);
return ret;
}

pthread_mutex_t thread_test_mutex = PTHREAD_MUTEX_INITIALIZER;
unsigned long long old;
unsigned long long cur;

void* test_thread(void *data) {
struct timespec ts;
while(1) {
pthread_mutex_lock(&thread_test_mutex);
clock_gettime_core(CLOCK_REALTIME, &ts);
cur = ((unsigned long long)ts.tv_sec * 1000000000L) + ts.tv_nsec;
if (cur < old) printf("Time went backwards. diff = %qu, cur = %qu\n", ((unsigned long long)old - cur), cur);

/* Print out the time every second to show that we have not deadlocked */
if ((cur / 1000000000L) != (old / 1000000000L))
printf("Sec: %lld\n", cur);
old = cur;

pthread_mutex_unlock(&thread_test_mutex);
}

pthread_exit(NULL);
return NULL;
}

int main(int argc, char **argv) {
pthread_attr_t thread_attr;
pthread_t thread_time;

/* Create the time thread */
pthread_attr_init(&thread_attr);
pthread_attr_setdetachstate(&thread_attr, PTHREAD_CREATE_DETACHED);
pthread_create(&thread_time, &thread_attr, thread_time_func, NULL);

/* Create the test threads */
old = 0;
pthread_t thread_test;
pthread_create(&thread_test, NULL, test_thread, NULL);
pthread_create(&thread_test, NULL, test_thread, NULL);
pthread_join(thread_test, NULL);

return 0;
}


Compile with:

gcc clock.c -o clock -lrt -lpthread


In theory you could remove the pthread_setaffinity_np call as the same thread will be always used to query the time, however if the CPU scheduler decides to move the thread to another core, the time may jump backwards.

gnif
2009-02-09, 06:35
I have built a shared library I am going to LD_PRELOAD into XBMC that will override the clock_gettime syscall with my version and test the kind of performance I get.

Hopefully the logs will no longer report the time going backwards, and things will still run well ;)

I will post the results ASAP.

gnif
2009-02-09, 06:46
It works!

I no longer have misspelt "backwords" errors in my log... since I am remote to the PC however I can not report as to the performance till I get home, stay tuned.

gnif
2009-02-10, 07:18
Performance testing has proved that this is not a viable solution.

However, further investigation of how clock_gettime works and the various different clocks that can be used, AFAIK there is no reason to use CLOCK_PROCESS_CPU_ID as the source, since through experimentation it seems that CLOCK_REALTIME has better precision, and does not suffer from the backwards problem.

I am updating XBMC to use CLOCK_REALTIME to see how this affects things.

gnif
2009-02-10, 07:46
Ok, from what I understand

CLOCK_PROCESS_CPUTIME_ID is the number of elapsed cycles the current core has done, and can go backwards when the processes is moved to another core.

CLOCK_THREAD_CPUTIME_ID is the number of elapsed cycles that the current thread has done on the current core, and can also go backwards when the thread is moved to another core.

CLOCK_REALTIME is the system wide clock and will not ever go backwards between cores, BUT, it can go backwards or jump forwards if the system time is changed.

CLOCK_MONOTONIC is like the CLOCK_REALTIME except it is un-affected by time changes and is not the current time, but the time since an unspecified time (ie, system boot).

So, the best way to fix this problem without performance degradation is to change the QueryPerformanceCounter function to always query CLOCK_MONOTONIC. The other CPUTIME clocks are not part of the POSIX standard anyway and may not be portable between platforms.

althekiller
2009-02-11, 00:31
From what I can tell XBMC always uses CLOCK_MONOTONIC already. The bUseHighRes parameter is used to choose CLOCK_PROCESS_CPUTIME_ID, but grep doesn't turn up any instances where it is true.

gnif
2009-02-11, 01:14
Yes, I found this last night, so it is not the root cause of the issue. However, since the "HighPrecision" mode is no higher precision then MONOTONIC, and it introduces problems on multi core systems, the option should not do anything for Linux.

I also found the reason the logs were reporting the time as going backwards.

The function GetAbsoluteClock does not acqure a mutex lock before it tries to update m_systemFrequency, or m_systemOffset if they are not set. This causes a race condition where one thread will be mid way updating those values, while another is also trying to do so.

I am about to submit a patch to trac that resolves this issue, and also fixes the spelling error of "Backwords"

Trac: http://xbmc.org/trac/ticket/5883

tslayer
2009-02-11, 01:53
But if you change the spelling of Backwords, we won't know it was elupus who wrote the code. It is his signature spelling :)

gnif
2009-02-11, 02:52
But at least those who search for this problem will find it :).

Now I am trying to track down an audio sync problem with my SB Live Platinum with SPDIF output. I thought it would be related to the clock issue, but obviously not :(.

Alteast my logs are no longer filling with "time went backwords" errors anymore :)

althekiller
2009-02-11, 03:30
Committed in r17646. Sorry, I just had to leave the typo ;)

gnif
2009-02-11, 05:15
Yay, my first accepted patch :)

I noticed the few modifications you made (changed clock var to a constant), I will be more meticulous next time. :)

Maxim
2009-02-11, 05:45
But if you change the spelling of Backwords, we won't know it was elupus who wrote the code. It is his signature spelling :)

Ok ok. Errors are best when not seen anyway, right? :)