Text Indexer Failure after upgrading to 3.4.0

The text indexer won’t start after upgrading:

System.ArgumentException: can only update existing binary-docvalues fields!
   at Lucene.Net.Index.IndexWriter.UpdateBinaryDocValue(Term term, String field, BytesRef value)
   at Squidex.Domain.Apps.Entities.Contents.Text.IndexState.Index(Guid id, Byte draft, Term term, Byte forDraft, Byte forPublished) in /src/src/Squidex.Domain.Apps.Entities/Contents/Text/IndexState.cs:line 55
   at Squidex.Domain.Apps.Entities.Contents.Text.TextIndexContent.UpdateFor(Byte draft, Byte forDraft, Byte forPublished) in /src/src/Squidex.Domain.Apps.Entities/Contents/Text/TextIndexContent.cs:line 152
   at Squidex.Domain.Apps.Entities.Contents.Text.TextIndexContent.Copy(Boolean fromDraft) in /src/src/Squidex.Domain.Apps.Entities/Contents/Text/TextIndexContent.cs:line 99
   at Squidex.Domain.Apps.Entities.Contents.Text.TextIndexerGrain.CopyAsync(Guid id, Boolean fromDraft) in /src/src/Squidex.Domain.Apps.Entities/Contents/Text/TextIndexerGrain.cs:line 103
   at Squidex.Domain.Apps.Entities.Contents.Text.OrleansCodeGenTextIndexerGrainMethodInvoker.Invoke(IAddressable grain, InvokeMethodRequest request) in /src/src/Squidex.Domain.Apps.Entities/obj/Release/netstandard2.0/Squidex.Domain.Apps.Entities.orleans.g.cs:line 1314
   at Orleans.Runtime.GrainMethodInvoker.Invoke()
   at Squidex.Infrastructure.Orleans.StateFilter.Invoke(IIncomingGrainCallContext context) in /src/src/Squidex.Infrastructure/Orleans/StateFilter.cs:line 21
   at Orleans.Runtime.GrainMethodInvoker.Invoke()
   at Squidex.Infrastructure.Orleans.LoggingFilter.Invoke(IIncomingGrainCallContext context) in /src/src/Squidex.Infrastructure/Orleans/LoggingFilter.cs:line 30
   at Orleans.Runtime.GrainMethodInvoker.Invoke()
   at Squidex.Infrastructure.Orleans.LocalCacheFilter.Invoke(IIncomingGrainCallContext context) in /src/src/Squidex.Infrastructure/Orleans/LocalCacheFilter.cs:line 32
   at Orleans.Runtime.GrainMethodInvoker.Invoke()
   at Orleans.Runtime.GrainMethodInvoker.Invoke()
   at OrleansDashboard.Metrics.GrainProfilerFilter.Invoke(IIncomingGrainCallContext context)
   at Orleans.Runtime.GrainMethodInvoker.Invoke()
   at Orleans.Runtime.InsideRuntimeClient.Invoke(IAddressable target, IInvokable invokable, Message message)
   at Orleans.OrleansTaskExtentions.<ToTypedTask>g__ConvertAsync|4_0[T](Task`1 asyncTask)
   at Squidex.Domain.Apps.Entities.Contents.Text.GrainTextIndexer.On(Envelope`1 event) in /src/src/Squidex.Domain.Apps.Entities/Contents/Text/GrainTextIndexer.cs:line 82
   at Squidex.Infrastructure.EventSourcing.Grains.EventConsumerGrain.DispatchConsumerAsync(Envelope`1 event) in /src/src/Squidex.Infrastructure/EventSourcing/Grains/EventConsumerGrain.cs:line 245
   at Squidex.Infrastructure.EventSourcing.Grains.EventConsumerGrain.<>c__DisplayClass12_0.<<OnEventAsync>b__0>d.MoveNext() in /src/src/Squidex.Infrastructure/EventSourcing/Grains/EventConsumerGrain.cs:line 90
--- End of stack trace from previous location where exception was thrown ---
   at Squidex.Infrastructure.EventSourcing.Grains.EventConsumerGrain.DoAndUpdateStateAsync(Func`1 action, String caller) in /src/src/Squidex.Infrastructure/EventSourcing/Grains/EventConsumerGrain.cs:line 178

Hi Jaben,

I have seen it before, but I cannot reproduce it locally yet. But I am relatively sure that it has nothing to do with 3.4.

My only theory is a corrupted index in some cases. I will see if I can create a workaround for that. I guess the only real workaround would be to log an error and skip the exception, because essentially it means that the document has not been indexed. We could also reindex that document, but we need a guarantee that it does not reindex a lot of documents.
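
A minimal sketch of the log-and-skip idea described above, written in Python rather than the C# of the actual code base. `FakeIndex`, `update_doc_value`, and `safe_update` are hypothetical names for illustration, not Squidex or Lucene.NET APIs, and the stand-in raises `ValueError` where Lucene.NET throws `ArgumentException`.

```python
import logging

logger = logging.getLogger("text_indexer")


class FakeIndex:
    """Hypothetical stand-in for the index; not a Lucene.NET API."""

    def __init__(self):
        self.doc_values = {}

    def update_doc_value(self, doc_id, value):
        # Mirrors the failure in the stack trace: only existing entries can be updated.
        if doc_id not in self.doc_values:
            raise ValueError("can only update existing binary-docvalues fields!")
        self.doc_values[doc_id] = value


def safe_update(index, doc_id, value, skipped):
    """Try the update; on failure, log an error and record the id instead of stopping."""
    try:
        index.update_doc_value(doc_id, value)
        return True
    except ValueError as ex:
        logger.error("Skipping document %s: %s", doc_id, ex)
        skipped.append(doc_id)
        return False
```

Collecting the skipped ids would make it possible to reindex only those documents later, which keeps the amount of reindexing bounded.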

You could try to restart the indexer.

Hey Sebastian,

Thanks for getting back to me so quickly on this one.

I did try the various buttons: restarting the index, etc. I think it’s some corrupt data – but hard to tell what to do about it. Now that I’m running production on 3.4.0 – I definitely don’t want to roll back. How can I tell what is corrupt?

Thanks!

Jaben

It is only a theory so far. But the error says that there is no document to update. The index is flushed to the file periodically, so in theory it can happen that a document is not written to the file. And I think this happened here. But I am not sure yet.

Do you have a (small) backup where I can reproduce it?

I think the best bet is to delete all Index_{ID}.zip entries in your asset folder and then rerun the indexer.

As I see it the exception happens only if the field is unknown (see https://github.com/apache/lucenenet/blob/ecf62129099d04a95edce3fca1243c7b4d2fac6b/src/Lucene.Net/Index/IndexWriter.cs#L1931) and I have no idea how this can be the case. I see only the following options:

  1. Events are not in order, because you have a MongoDB replica set and the clocks between the servers are not synced.

  2. … That's it
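
One way to check the first option is to scan the stored events in their stored order and flag any whose timestamp is earlier than one already seen. The sketch below assumes each event has been reduced to an `(event_id, timestamp)` pair; that shape and the function name are hypothetical, not the Squidex event store schema.

```python
from datetime import datetime, timezone


def find_out_of_order(events):
    """Return (position, event_id) for events older than the latest timestamp seen so far."""
    out_of_order = []
    latest = None
    for position, (event_id, timestamp) in enumerate(events):
        if latest is not None and timestamp < latest:
            # Stored after a newer event: a hint that server clocks disagree.
            out_of_order.append((position, event_id))
        else:
            latest = timestamp
    return out_of_order
```

An empty result does not rule out clock drift, it only means the stored order and the timestamps agree for the events that were checked.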


Thanks for getting back to me on this, Sebastian.

Where is the Asset folder? I’m using mongo grid fs. Would this be in GridFS or the mongo asset collection?

Yes, in GridFS then.

But before you do that it would be great to get a backup.
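
For the GridFS case, a rough sketch of how the index files could be listed and then removed. It assumes the files are stored under names matching `Index_<id>.zip`, as mentioned above, and that `fs` behaves like pymongo's `gridfs.GridFS` (`find()` yields files with `_id` and `filename`, `delete(file_id)` removes one). The function name and the `dry_run` flag are hypothetical.

```python
import re

# Assumption: index archives are stored under names like "Index_<id>.zip".
INDEX_FILE_PATTERN = re.compile(r"^Index_.+\.zip$")


def delete_index_files(fs, dry_run=True):
    """Return the matching file names; delete them only when dry_run is False."""
    # Collect the matches first so nothing is deleted while iterating.
    matches = [
        (grid_file._id, grid_file.filename)
        for grid_file in fs.find()
        if INDEX_FILE_PATTERN.match(grid_file.filename or "")
    ]
    if not dry_run:
        for file_id, _ in matches:
            fs.delete(file_id)
    return [name for _, name in matches]
```

Running it with the default `dry_run=True` first shows what would be removed. With pymongo, `fs` would be created with `gridfs.GridFS(db, collection=...)`; the database and bucket names depend on your Squidex configuration and are not given in this thread.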


k, will do – but it will recreate the indexes?

It should, my idea was to delete the indexes and then restart the event consumer.


Well, I feel pretty silly – I just hit reset on the broken text indexer and it seems to have fixed itself.

Maybe a good message to add with the error? “Try and reset the indexer as a first step”

Ugg, sorry to bug. :laughing:

I thought you had already tried that. But good point. I will make the error message clearer.
