How to wait for Azure Search to finish indexing a document (for integration testing purposes)?

Submitted by ぐ巨炮叔叔 on 2019-11-30 23:48:36
Heather Nakama

If your service has multiple search units, there is no way to determine when a document has been fully indexed. This is a deliberate decision to favor increased indexing/query performance over strong consistency guarantees.

If you're running tests against a single-unit search service, your approach (polling for the document's existence with a query rather than a lookup) should work.

Note that this will not work on a free tier search service, because it is hosted on multiple shared resources and does not count as a single unit. You'll see the same brief inconsistency that you would with a dedicated multi-unit service.

Otherwise, one possible improvement would be to use retries along with a smaller sleep time.

The other answer by @HeatherNakama was very helpful. I want to add to it, but first a paraphrased summary:

There is no way to reliably know when a document is ready to be searched on all replicas, so polling in a spin-wait loop until the document is found can only work against a single-replica search service. (Note: the free tier search service is not single-replica, and you can't do anything about that.)

With that in mind, I've created a sample repository with Azure Search integration tests that roughly works like this:

private readonly ISearchIndexClient _searchIndexClient;

private void WaitForIndexing(string id)
{
    // For the free tier, or a service with multiple replicas, resort to this:
    // Thread.Sleep(2000);

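    // Poll with exponential backoff: sleep 25, 50, 100, ... 1600 ms
    // (7 attempts, about 3.2 seconds in total) before giving up.
    var wait = 25;

    while (wait <= 2000)
    {
        Thread.Sleep(wait);

        // FilterForId is async; .Result blocks until the query completes.
        var result = fixture.SearchService.FilterForId(id);
        if (result.Result.Results.Count == 1) return;
        if (result.Result.Results.Count > 1) throw new Exception("Unexpected results");

        // Not found yet: double the delay before the next attempt.
        wait *= 2;
    }

    throw new Exception("Found nothing after waiting a while");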
}

public async Task<DocumentSearchResult<PersonDto>> FilterForId(string id)
{
    if (string.IsNullOrWhiteSpace(id) || !Guid.TryParse(id, out var _))
    {
        throw new ArgumentException("Can only filter for guid-like strings", nameof(id));
    }

    var parameters = new SearchParameters
    {
        Top = 2, // We expect only one, but return max 2 so we can double check for errors
        Skip = 0,
        Facets = new string[] { },
        HighlightFields = new string[] { },
        Filter = $"id eq '{id}'",
        OrderBy = new[] { "search.score() desc", "registeredAtUtc desc" },
    };

    var result = await _searchIndexClient.Documents.SearchAsync<PersonDto>("*", parameters);

    if (result.Results.Count > 1)
    {
        throw new Exception($"Search filtering for id '{id}' unexpectedly returned more than 1 result. Are you sure you searched for an ID, and that it is unique?");
    }

    return result;
}

This might be used like this:

[SerializePropertyNamesAsCamelCase]
public class PersonDto
{
    [Key] [IsFilterable] [IsSearchable]
    public string Id { get; set; } = Guid.NewGuid().ToString();

    [IsSortable] [IsSearchable]
    public string Email { get; set; }

    [IsSortable]
    public DateTimeOffset? RegisteredAtUtc { get; set; }
}

[Theory]
[InlineData(0)]
[InlineData(1)]
[InlineData(2)]
[InlineData(3)]
[InlineData(5)]
[InlineData(10)]
public async Task Can_index_and_then_find_person_many_times_in_a_row(int count)
{
    await fixture.SearchService.RecreateIndex();

    for (int i = 0; i < count; i++)
    {
        var guid = Guid.NewGuid().ToString().Replace("-", "");
        var dto = new PersonDto { Email = $"{guid}@example.org" };
        await fixture.SearchService.IndexAsync(dto);

        WaitForIndexing(dto.Id); // WaitForIndexing takes the document id, not the DTO

        var searchResult = await fixture.SearchService.Search(dto.Id);

        Assert.Single(searchResult.Results, p => p.Document.Id == dto.Id);
    }
}

I have tested and confirmed that this reliably stays green on a Basic tier search service with 1 replica, and intermittently becomes red on the free tier.

If waiting is needed only in tests, use a FluentWaitDriver or a similar component in the test code; I wouldn't pollute the app itself with thread delays. Azure Search indexing will show an acceptable delay of a few milliseconds to a few seconds, depending on the nature of your search instance.
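
To make the "wait only in tests" idea concrete, here is a minimal sketch of a generic polling helper that could live in the test project. The names TestWait and UntilAsync are hypothetical (they are not part of the Azure Search SDK or of xUnit), and the backoff policy is just one reasonable choice, not the only one.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical test-only helper; not part of any Azure or xUnit library.
public static class TestWait
{
    // Polls `condition` until it returns true. After each failed attempt the
    // helper waits, then doubles the delay. It gives up with a TimeoutException
    // once the next delay would push the elapsed time past `timeout`.
    public static async Task UntilAsync(
        Func<Task<bool>> condition,
        TimeSpan timeout,
        TimeSpan initialDelay)
    {
        var stopwatch = Stopwatch.StartNew();
        var delay = initialDelay;

        while (true)
        {
            if (await condition()) return;

            if (stopwatch.Elapsed + delay > timeout)
            {
                throw new TimeoutException(
                    $"Condition was not met within {timeout.TotalMilliseconds} ms.");
            }

            await Task.Delay(delay);
            delay = TimeSpan.FromTicks(delay.Ticks * 2);
        }
    }
}
```

Assuming the FilterForId method from the earlier answer, a test could pass a condition such as `async () => (await fixture.SearchService.FilterForId(dto.Id)).Results.Count == 1` with a timeout of a few seconds and an initial delay of 25 ms. Because the helper awaits instead of blocking, it avoids calling .Result on the search task. As noted above, this can only be reliable against a single-replica service.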
