问题
I'm using Postgresql on Windows in a C# application. The problem I'm having is really weird and can be described as follows:
- I restart my Windows
- I run the program
- One specific query fails:
SELECT COUNT(*) AS c FROM files WHERE total_bytes IS NOT NULL
- I run the program again and everything works normally
Weird notes:
- I tried making another query before that one (even using the same table) and it worked:
SELECT COUNT(*) AS c FROM files
- I wasn't able to reproduce the error restarting Postgresql. It only happens on a Windows reboot. And it happens only once.
The exception traceback:
Npgsql.NpgsqlException: Exception while reading from stream
at Npgsql.ReadBuffer.Ensure(Int32 count, Boolean dontBreakOnTimeouts)
at Npgsql.NpgsqlConnector.DoReadMessage(DataRowLoadingMode dataRowLoadingMode, Boolean isPrependedMessage)
at Npgsql.NpgsqlConnector.ReadMessageWithPrepended(DataRowLoadingMode dataRowLoadingMode)
at Npgsql.NpgsqlConnector.ReadMessage(DataRowLoadingMode dataRowLoadingMode)
at Npgsql.NpgsqlConnector.ReadExpecting[T]()
at Npgsql.NpgsqlDataReader.NextResultInternal()
at Npgsql.NpgsqlDataReader.NextResult()
at Npgsql.NpgsqlCommand.Execute(CommandBehavior behavior)
at Npgsql.NpgsqlCommand.ExecuteDbDataReaderInternal(CommandBehavior behavior)
at Npgsql.NpgsqlCommand.ExecuteDbDataReader(CommandBehavior behavior)
at System.Data.Common.DbCommand.ExecuteReader()
at Npgsql.NpgsqlCommand.ExecuteReader()
at DriveShare.Database.Postgresql.ExecuteQuery(NpgsqlCommand command) in c:\projetos\driveshareclient\DriveShare\DriveShare\Database\Postgresql.cs:line 216
at DriveShare.Database.Postgresql.Query(String sql, Object[] args) in c:\projetos\driveshareclient\DriveShare\DriveShare\Database\Postgresql.cs:line 72
at DriveShare.Database.Postgresql.QueryOne(String sql, Object[] args) in c:\projetos\driveshareclient\DriveShare\DriveShare\Database\Postgresql.cs:line 83
at DriveShare.Database.Postgresql.QueryValue(String key, String sql, Object[] args) in c:\projetos\driveshareclient\DriveShare\DriveShare\Database\Postgresql.cs:line 97
at DriveShare.Database.Postgresql.QueryValue(String key, String sql) in c:\projetos\driveshareclient\DriveShare\DriveShare\Database\Postgresql.cs:line 92
at DriveShare.Database.FileIndexDataSet.CountIndexedFiles() in c:\projetos\driveshareclient\DriveShare\DriveShare\Database\FileIndexDataSet.cs:line 89
at DriveShare.Engine.DriveShareEngine.Start() in c:\projetos\driveshareclient\DriveShare\DriveShare\Engine\DriveShareEngine.cs:line 156
at DriveShareWebService.Program.Main(String[] args) in c:\projetos\driveshareclient\DriveShare\DriveShareWebService\Program.cs:line 19
Since I have to keep the program working, I wrote a workaround to make the app retry that query before proceeding. I'm not proud of that:
public void WaitForConnection()
{
int limitSeconds = 3 * 60;
var start = DateTime.Now;
while (true)
{
try
{
Log.WaitingForDatabaseConnection();
Query("SELECT COUNT(*) AS c FROM files WHERE total_bytes IS NOT NULL");
Log.DatabaseConnectionAquired();
break;
}
catch (Exception e)
{
var wastedTime = DateTime.Now - start;
if (wastedTime.TotalSeconds > limitSeconds)
throw;
else
Log.Exception(e);
}
Thread.Sleep(1000);
}
}
I'm using Npgsql (in a thin abstraction class) to connect to Postgresql. Postgresql logs show some entries about winsock errors that I do not yet understand:
2016-08-16 10:14:34 BRT LOG: database system was shut down at 2016-08-16 10:12:07 BRT
2016-08-16 10:14:34 BRT FATAL: the database system is starting up
2016-08-16 10:14:34 BRT LOG: MultiXact member wraparound protections are now enabled
2016-08-16 10:14:34 BRT LOG: sistema de banco de dados está pronto para aceitar conexões
2016-08-16 10:14:34 BRT LOG: autovacuum launcher started
2016-08-16 10:17:16 BRT LOG: could not receive data from client: unrecognized winsock error 10053
2016-08-16 10:17:27 BRT LOG: could not send data to client: unrecognized winsock error 10054
2016-08-16 10:17:27 BRT STATEMENT: SELECT path FROM files
2016-08-16 10:17:27 BRT FATAL: connection to client lost
2016-08-16 10:17:27 BRT STATEMENT: SELECT path FROM files
2016-08-16 10:17:27 BRT LOG: could not receive data from client: unrecognized winsock error 10053
2016-08-16 10:17:27 BRT LOG: unexpected EOF on client connection with an open transaction
2016-08-16 10:17:33 BRT LOG: unexpected EOF on client connection with an open transaction
2016-08-16 10:25:14 BRT LOG: could not receive data from client: unrecognized winsock error 10053
2016-08-16 10:25:15 BRT LOG: could not receive data from client: unrecognized winsock error 10053
2016-08-16 10:25:15 BRT LOG: unexpected EOF on client connection with an open transaction
2016-08-16 10:26:30 BRT LOG: could not send data to client: unrecognized winsock error 10054
2016-08-16 10:26:30 BRT FATAL: connection to client lost
2016-08-16 10:26:50 BRT LOG: could not send data to client: unrecognized winsock error 10054
2016-08-16 10:26:50 BRT FATAL: connection to client lost
2016-08-16 10:26:50 BRT LOG: could not receive data from client: unrecognized winsock error 10053
2016-08-16 10:26:50 BRT LOG: unexpected EOF on client connection with an open transaction
2016-08-16 10:27:06 BRT LOG: could not send data to client: unrecognized winsock error 10054
2016-08-16 10:27:06 BRT FATAL: connection to client lost
2016-08-16 10:27:06 BRT LOG: could not send data to client: unrecognized winsock error 10054
2016-08-16 10:27:06 BRT FATAL: connection to client lost
2016-08-16 10:27:30 BRT LOG: pedido de desligamento rápido foi recebido
2016-08-16 10:27:30 BRT LOG: interrompendo quaisquer transações ativas
2016-08-16 10:27:30 BRT LOG: autovacuum launcher shutting down
2016-08-16 10:27:30 BRT ERROR: canceling statement due to user request
2016-08-16 10:27:30 BRT LOG: autovacuum launcher shutting down
2016-08-16 10:27:30 BRT LOG: shutting down
2016-08-16 10:27:30 BRT LOG: database system is shut down
I don't expect someone to know what exactly what my problem is. I was just hoping someone might have had some similar issues that could shed some light in it.
回答1:
With some help I found the solution in the Npgsql docs, here.
Npgsql by default comes with some timeout parameters for connection and commands. After a Windows reboot, first access to the table was very slow, triggering the command timeout.
With additional parameters on the connection string I was able to change those settings higher and solve my problem:
connectionString += ";Timeout=180;Command Timeout=180";
Bonus tip: a Postgresql function pg_sleep(seconds)
helped me reproduce the problem without actual reboots. Very helpful:
SELECT pg_sleep(60);
来源:https://stackoverflow.com/questions/38977119/postgresql-fails-specific-query-one-time-after-windows-reboot