Turbo Hand Grabber works 4 times faster for my database but sometimes it stuck dead on certain points.
So i did some research and here it is what i found so far:
You are using NpgsqlDataReader gatherReader which "fetch" 1000 hands (public List<DataHand> GetNextUnprocessedHands(int bufferSize)) and return it one by one with DataHand GetNextHand() whenever it asked by NoteParser. So why are you wasting so much memory for these handhistories?I think noone can handle 1000 threads parser. And i think it has nothing do to with saving parsed results to database every 1000s hand (which would be great to see as parameter, especially when it produces large output and sometimes i get OutOfMemory exceptions and tbh HM2 sucks at memory management).
And every time i hit "stop" button or close HM2 gatherReader does not stop processing hands and i have to kill it in task manager and in running queries.
Code:
public void StopProcessing(object o)
{
try
{
lock (this.gatherLockObject)
{
if (this.gatherReader != null)
{
try
{
this.gatherReader.Close();
}
catch
{
}
}
try
{
GatherHelper.Connection.ClearPool();
GatherHelper.Connection.Dispose();
}
catch
{
}
lock (this.noteLock)
{
}
}
}
catch
{
}
}
doesnt look complete, mb some logging? =)
Queries.
Code:
SELECT h.handhistory_id AS pokerhand_id, h.handhistory,
(case when g.pokergametype_id = 0 then 'nl' when g.pokergametype_id = 1 then 'nl' when g.pokergametype_id = 2 then 'fl' else 'nl' end) as GameType, 'cash' as tabletype,
g.istourney, (case when g.pokergametype_id IN (3,4,5,9,10,11,13,14,15,16,17,18,19,25,26,29) then true else false end) as IsOmaha,
g.pokergametype_id
FROM handhistories h
INNER JOIN gametypes g on g.gametype_id = h.gametype_id
WHERE handhistory_id > 10721663
ORDER BY handhistory_id
I think you should get rid of JOINs whenever possible, its very heavy query on large databases. (could be tricky with filters on). gametypes table is kinda small and it its better to get it once and cache it. And implement all that "pokergametype_id" logic in handgetter (GetNextUnprocessedHands?), noneed to delegate buisiness logic to database - its wrong.
Updating data.
Its very slow to insert data into notecaddy_data/scatter player by player. Its much better to do INSERT ... VALUES(player_id1, data1),(player_id2, data2) ON DUPLICATE KEY UPDATE; since it has player_id as primary key. Or make a Procedure in Postgres, or both.
Oh yes, please do "verbose logging" checkbox in settings, not just "enable logging". It would help alot to find bottlenecks and debug errors.