Question
I'm testing Google's BatchRequest (C#) with InsertAllRequest. Once a batch contains more than 60 requests (~30,000 BigQuery table rows, more than 10,880,366 bytes of HTTP Content-Length in total), the exception below is thrown. The result is the same with my firewall turned off. Fixes I found online, such as turning HTTP keep-alive off, don't apply here because I have no control over how the API uses HttpClient.
System.Net.Http.HttpRequestException: Error while copying content to a stream. ---> System.IO.IOException: Unable to read data from the transport connection: An established connection was aborted by the software in your host machine. ---> System.Net.Sockets.SocketException: An established connection was aborted by the software in your host machine
at System.Net.Sockets.Socket.BeginReceive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, AsyncCallback callback, Object state)
at System.Net.Sockets.NetworkStream.BeginRead(Byte[] buffer, Int32 offset, Int32 size, AsyncCallback callback, Object state)
--- End of inner exception stack trace ---
at System.Net.GZipWrapperStream.EndRead(IAsyncResult asyncResult)
at System.Net.Http.HttpClientHandler.WebExceptionWrapperStream.EndRead(IAsyncResult asyncResult)
at System.Net.Http.StreamToStreamCopy.StartRead()
--- End of inner exception stack trace ---
at Microsoft.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at Microsoft.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccess(Task task)
at Microsoft.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
at Microsoft.Runtime.CompilerServices.ConfiguredTaskAwaitable`1.ConfiguredTaskAwaiter.GetResult()
at Google.Apis.Requests.BatchRequest.<ExecuteAsync>d__3.MoveNext() in c:\ApiaryDotnet\default\Src\GoogleApis\Apis\Requests\BatchRequest.cs:line 175
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
at BigData.BigQuery.API.BigQueryHelper.<InsertBatchAsync>d__1f.MoveNext() in c:\Users\fionazhao\Documents\BigDataCode\Framework\BigData\BigQuery\API\BigQueryHelper.cs:line 197
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
Here is the Google API code that threw this exception:
public async Task ExecuteAsync(CancellationToken cancellationToken)
{
    if (Count < 1)
        return;
    ConfigurableHttpClient httpClient = service.HttpClient;
    var requests = from r in allRequests
                   select r.ClientRequest;
    HttpContent outerContent = await CreateOuterRequestContent(requests).ConfigureAwait(false);
    var result = await httpClient.PostAsync(new Uri(batchUrl), outerContent, cancellationToken)
        .ConfigureAwait(false);
    result.EnsureSuccessStatusCode();
    // Get the boundary separator.
    const string boundaryKey = "boundary=";
    var fullContent = await result.Content.ReadAsStringAsync().ConfigureAwait(false);
    var contentType = result.Content.Headers.GetValues("Content-Type").First();
    var boundary = contentType.Substring(contentType.IndexOf(boundaryKey) + boundaryKey.Length);
    int requestIndex = 0;
    // While there is still content to read, parse the current HTTP response.
    while (true)
    {
        cancellationToken.ThrowIfCancellationRequested();
        var startIndex = fullContent.IndexOf("--" + boundary);
        if (startIndex == -1)
        {
            break;
        }
        fullContent = fullContent.Substring(startIndex + boundary.Length + 2);
        var endIndex = fullContent.IndexOf("--" + boundary);
        if (endIndex == -1)
        {
            break;
        }
        HttpResponseMessage responseMessage = ParseAsHttpResponse(fullContent.Substring(0, endIndex));
        if (responseMessage.IsSuccessStatusCode)
        {
            // Parse the current content object.
            var responseContent = await responseMessage.Content.ReadAsStringAsync().ConfigureAwait(false);
            var content = service.Serializer.Deserialize(responseContent,
                allRequests[requestIndex].ResponseType);
            allRequests[requestIndex].OnResponse(content, null, requestIndex, responseMessage);
        }
        else
        {
            // Parse the error from the current response.
            var error = await service.DeserializeError(responseMessage).ConfigureAwait(false);
            allRequests[requestIndex].OnResponse(null, error, requestIndex, responseMessage);
        }
        requestIndex++;
        fullContent = fullContent.Substring(endIndex);
    }
}
Answer 1:
That's just above 10 MB, which is the HTTP request size limit for streaming inserts. Adjust your code to stay within these limits:
Maximum row size: 1 MB
HTTP request size limit: 10 MB
Maximum rows per second: 100,000 rows per second, per table. Exceeding this amount will cause quota_exceeded errors.
Maximum rows per request: 500
Maximum bytes per second: 100 MB per second, per table. Exceeding this amount will cause quota_exceeded errors.
https://cloud.google.com/bigquery/streaming-data-into-bigquery
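One way to stay within the limits listed above is to split the payload before sending, so that no single HTTP request exceeds the row-count or byte cap. The following is a minimal sketch, not part of the Google client library: the class name InsertAllChunker and the method Chunk are hypothetical, rows are assumed to be already serialized to JSON strings, and the 9 MB cap is an assumed safety margin below the 10 MB limit to leave room for headers and the multipart envelope. The same grouping could be applied to whole InsertAllRequest bodies when deciding how many of them to put into one BatchRequest.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not part of the Google API client library).
public static class InsertAllChunker
{
    // Row limit quoted in the answer above.
    public const int MaxRowsPerRequest = 500;

    // Assumption: stay below the 10 MB HTTP request limit to leave headroom
    // for headers and envelope overhead; tune this for your payloads.
    public const long MaxBytesPerRequest = 9L * 1024 * 1024;

    // Groups serialized rows into chunks that respect both caps.
    // A single row larger than maxBytes is still emitted, alone in its own chunk.
    public static IEnumerable<List<string>> Chunk(
        IEnumerable<string> serializedRows,
        int maxRows = MaxRowsPerRequest,
        long maxBytes = MaxBytesPerRequest)
    {
        var current = new List<string>();
        long currentBytes = 0;

        foreach (var row in serializedRows)
        {
            long rowBytes = Encoding.UTF8.GetByteCount(row);
            bool full = current.Count >= maxRows || currentBytes + rowBytes > maxBytes;

            // Close the current chunk before adding a row that would overflow it.
            if (current.Count > 0 && full)
            {
                yield return current;
                current = new List<string>();
                currentBytes = 0;
            }

            current.Add(row);
            currentBytes += rowBytes;
        }

        // Emit the final, possibly partial, chunk.
        if (current.Count > 0)
            yield return current;
    }
}
```

Each chunk returned by Chunk would then be sent as its own request. The byte count here covers only the row strings, so the real request will be somewhat larger; that is why the sketch uses a cap below 10 MB rather than the limit itself.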
Source: https://stackoverflow.com/questions/32914863/google-api-batchrequest-an-established-connection-was-aborted-by-the-software-i