After upgrading from v4.1.3 to 6.5.0, some data is lost or in a strange half-published state

I have…

  • [x] Checked the logs - Nothing beyond lots of INFO log noise, though this seems like it could be a data problem from migrations and not a bug that causes a crash

I’m submitting a…

  • [x] Regression (a behavior that stopped working in a new release)
  • [ ] Bug report
  • [ ] Performance issue
  • [ ] Documentation issue or request

Current behavior

I upgraded from 4.1.3 to 6.5.0 and changed my GraphQL queries to the newer format (a few things were renamed since 4.1.3). For a lot of data, everything seems fine. But I am noticing some odd inconsistencies where some data is missing, or the Published version is somehow lost or regressed.

Here is a picture of the Admin tool which shows a value for the Message Template field on the Published version.

However, when I query via GraphQL, this shows as null:

$ curl $GRAPHQL_API_URL -H 'X-Languages: en-US' -H 'Content-Type: application/json' -H "Authorization: Bearer $READER_TOKEN" --data-raw $'{ "query": "{ queryInventoryContents { flatData { messageTemplate } } }" }' | jq
{
  "data": {
    "queryInventoryContents": [
      {
        "flatData": {
          "messageTemplate": null
        }
      }
    ]
  }
}

But, if I add the “X-Unpublished: true” header, that field’s data shows up.

$ curl $GRAPHQL_API_URL -H 'X-Languages: en-US' -H 'Content-Type: application/json' -H "Authorization: Bearer $READER_TOKEN" --data-raw $'{ "query": "{ queryInventoryContents { flatData { messageTemplate } } }" }' -H 'X-Unpublished: true' | jq
{
  "data": {
    "queryInventoryContents": [
      {
        "flatData": {
          "messageTemplate": "Browse our inventory of {ITEM_COUNT} items!"
        }
      }
    ]
  }
}

Prior to the upgrade, this entity was published.

It feels like it has something to do with data that may have been corrupted during the upgrade from 4.1.3 to 6.5.0, and/or something to do with published/draft content that is somehow not coming through correctly.
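The comparison between the two curl calls above can be scripted so that it does not have to be done by eye. This is only a rough sketch: the headers and query are copied from the curl commands, while the function names (`fetch_items`, `is_empty`, `fields_lost_when_published`) are made up for illustration and are not part of Squidex.

```python
import json
import urllib.request

# Query copied from the curl calls above.
QUERY = "{ queryInventoryContents { flatData { messageTemplate } } }"


def fetch_items(url, token, unpublished=False):
    """POST QUERY to the GraphQL endpoint and return the queryInventoryContents list."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
        "X-Languages": "en-US",
    }
    if unpublished:
        # Same header as in the second curl call.
        headers["X-Unpublished"] = "true"
    body = json.dumps({"query": QUERY}).encode("utf-8")
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["data"]["queryInventoryContents"]


def is_empty(value):
    """Treat null, empty strings and empty lists as 'missing'."""
    return value is None or value == "" or value == []


def fields_lost_when_published(published, unpublished):
    """Return the flatData field names that have a value only in the unpublished response."""
    return sorted(
        name
        for name, value in unpublished.items()
        if not is_empty(value) and is_empty(published.get(name))
    )
```

Calling `fetch_items` once with and once without `unpublished=True`, then passing the two `flatData` objects of the same item to `fields_lost_when_published`, should list `messageTemplate` for the inventory item shown above.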

Expected behavior

I would have expected data that was Published at v4.1.3 to remain published after the upgrade.

Minimal reproduction of the problem

My testing is on a non-production database, and if it would help debugging, I could provide a MongoDB dump from before and after the upgrade privately.

Environment

  • [x] Self hosted with docker
  • [ ] Self hosted with IIS
  • [ ] Self hosted with other version
  • [ ] Cloud version

Version: Bug appeared after upgrading to 6.5.0 from 4.1.3

Browser:

Not specific to any browser since this is an API call issue

It looks like your content has a draft version that is not shown properly in the UI. There was a bug like this a long time ago, but I am not sure whether that is what you are seeing.

The only way to investigate this is with a MongoDB dump. You can send it to me via PM.

Do you have the ID of such a content item?

The Inventory schema is a single-element content that is suffering from this bug. Its ID is 18f2a9cf-ffb8-47bd-a808-ce9f56e87218

I don’t remember the details of this problem, but I know that it can be solved by just publishing this item, and that works in my test. It seems that only one item is affected anyway, so I am not sure it is worth investigating further.

Are you saying that publishing the item after the upgrade will fix it, or before?

This data you’re looking at is from a development instance. I haven’t yet run the upgrade against production, so I’m trying to avoid any downtime or blank pages when we go live with the upgrade.

Is there something in the MongoDB records I can look for that would help me find the problem, before and/or after the upgrade? For example, if publishing an entity is the way to fix it, then perhaps I need to run a query before the upgrade of all published content, then run the upgrade, and re-run that query, then manually go in and hit the Publish button on anything that disappeared. If that’s a solution, do you have a recommended MongoDB query?
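The before/after idea in the paragraph above boils down to a set difference. A minimal sketch, assuming the IDs of published content have already been exported to two lists by whatever means (API or MongoDB); how to collect them is not covered here, and the function name is hypothetical:

```python
def unpublished_by_upgrade(published_before, published_after):
    """Return the IDs that were published before the upgrade but are not afterwards."""
    return sorted(set(published_before) - set(published_after))


# Example input: the first ID is the inventory item from this thread,
# the second one is a made-up placeholder.
before = [
    "18f2a9cf-ffb8-47bd-a808-ce9f56e87218",
    "00000000-0000-0000-0000-000000000001",
]
after = ["00000000-0000-0000-0000-000000000001"]

to_republish = unpublished_by_upgrade(before, after)
```

Everything in `to_republish` would then be a candidate for a manual Publish in the admin tool.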

Thanks again for your help on this!

You can run this query, which I used, against the States_Contents_Published3 collection:

{ "do" : { $eq : {} } }
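If you would rather run this from a script than from the Mongo shell, the same filter can be passed to a driver. A sketch only: it assumes a PyMongo-style collection object with a `find(filter, projection)` method, and `find_empty_do` is a made-up helper name. Note that in Python the `$eq` operator has to be quoted.

```python
# Same filter as above, written as a Python dict.
EMPTY_DO_FILTER = {"do": {"$eq": {}}}


def find_empty_do(collection):
    """Return the _id of every document whose `do` field is an empty object."""
    return [doc["_id"] for doc in collection.find(EMPTY_DO_FILTER, {"_id": 1})]


# Possible usage with PyMongo (connection string and database name are
# assumptions and depend on your deployment):
#
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017")["<your database>"]
#   print(find_empty_do(db["States_Contents_Published3"]))
```

The helper only reads data, so it should be safe to run against a copy of the database before and after the upgrade.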

Thanks, that brings up the inventory record and when I manually published, it worked on the site. However, there are still other entities suffering from this same problem:

  1. GlobalMetaData is another single-entity collection that isn’t coming through correctly, but it doesn’t show up in the results of the empty "do" query. In fact, this is a weird one because some data is coming through: when I look at what’s returned from the query, I see values in two list fields (Linked Data and Meta Tags), but all the other string fields are empty even though they have a value in the admin tool.
  2. The Page entity with slug "about-us/why-weller" is also in this weird state, where some of the field values are coming through in the published entity but the “Sections” list is empty.

I’m trying to familiarize myself with the structure of the data right now to see whether there’s any other query I can come up with that would highlight which items have problems.

I suppose, one way about it would be to get rid of all drafts prior to the upgrade, then do the upgrade, find anything that is a draft, and publish it. Does that sound reasonable? If so, do you have a query for “get me all drafts” in both 4.1.3 data and 6.5?

I would put Squidex in readonly mode. Then make a copy of the database and set up a second deployment with the update. That way you have all the time you need to migrate the data. When you are done, change the binding in your load balancer or DNS, and if something goes wrong you can roll back.

Ok, I can live with that. Are stored assets safe to leave untouched during this time? I just want to make sure that the upgrade process doesn’t do anything like move around or rename assets.

Yes, assets are never ever overwritten.

Just a follow-up: I have performed the recommended upgrade approach of setting the MongoDB instance to readonly, then cloning and performing the upgrade offline, then switching back-ends once complete, and it was successful. Along the way I manually published about two dozen entities that had gone unpublished in the upgrade, and I have verified that, once manually published, all data was intact.

Thanks again for your assistance.

Thanks for your update. Great that it worked.