[SOLVED] 500 error when getting /apps

stefanolsenn · October 26, 2020, 8:48am

I have…

[ x ] Checked the logs and have uploaded a log file and provided a link because I found something suspicious there. Please do not post the log file in the topic because very often something important is missing.

I’m submitting a…

[ ] Regression (a behavior that stopped working in a new release)
[ x ] Bug report
[ ] Performance issue
[ ] Documentation issue or request

Current behavior

This morning when I opened Squidex, I got a 500 error when I tried to access the /apps endpoint.
There was no activity over the weekend, and it was working fine when I left work on Friday.

Expected behavior

Able to get the /apps endpoint - I have the logs this time from the app-service - I’ll PM them if requested

Minimal reproduction of the problem

I think it’s quite hard to reproduce - I’ve done nothing out of the ordinary to the system lately…

Environment

[ x ] Self hosted with docker
[ ] Self hosted with IIS
[ ] Self hosted with other version
[ ] Cloud version

Version: 5.0.0

Browser:

[ x ] Chrome (desktop)
[ ] Chrome (Android)
[ ] Chrome (iOS)
[ ] Firefox
[ ] Safari (desktop)
[ ] Safari (iOS)
[ ] IE
[ ] Edge

Others:

I updated the container image to 5.2.0 and restarted the app-service, and was able to fetch the /apps endpoint now, and get to the squidex dashboard/overview

stefanolsenn · October 26, 2020, 8:53am

I tend to get a lot of these time-out exceptions:

[…] Response did not arrive on time in 00:00:30 for message: NewPlacement Request S127.0.0.1:11111:341397831*cli/ […]

where the endpoint is:

"requestPath": "/apps/app-name-here/ui/settings/me",

Sebastian · October 26, 2020, 9:11am

What happens when you restart your container? Btw: I always ask for the logs

stefanolsenn · October 26, 2020, 9:16am

https://pastebin.com/KKCR4P6F - This is what I could gather in the app-service

After I restarted the container/app-service (and in this case updated the docker image to 5.2.0) - it’s working again

Sebastian · October 26, 2020, 9:36am

Okay, there is not that much I can do about it, right now…

stefanolsenn · November 4, 2020, 8:56am

I’m still getting this error from time to time. I haven’t added any new apps since the last error I got. This morning we got the same error again where we couldn’t query the /apps endpoint

We can still query the content though. Do you have the slightest idea on how to improve this? It’s hosted on Azure if you need to know

If I restart the web app, it works again.

Sebastian · November 4, 2020, 9:29am

I have never seen it. It could be something like a sleep mode that causes the app not to wake up properly.

stefanolsenn · November 4, 2020, 9:33am

Thanks for getting back.
We already have this option enabled on our app:

Any other suggestions?

Sebastian · November 4, 2020, 9:44am

I think this setting could make it even worse. In IIS this prevents the web server from shutting down your application if there are no requests.

But there is also a sleep mode or so on the machine level. But it should not happen. You have only one node, right?

stefanolsenn · November 4, 2020, 9:47am

Oh. Should I try to disable it?

Yes, only one node. Hosted as a container webapp on Azure, and a mongodb container instance hooked up to it.

Current version is 5.2.0.

Sebastian · November 4, 2020, 9:52am

I am not 100% sure what happens. I have posted it in Github: https://github.com/dotnet/orleans/issues/6820

You can try a monitoring service and just query your service every 5 minutes or so to keep it alive. Just an idea.
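As a rough illustration of that keep-alive idea, here is a minimal Python sketch that queries a URL at a fixed interval and records the status codes. It is not part of Squidex; the function names (`ping`, `keep_alive`) and the URL in the usage note are placeholders, and the 5-minute interval is simply taken from the suggestion above.

```python
import time
import urllib.error
import urllib.request


def ping(url, timeout=10.0):
    """Return the HTTP status code for a GET on url, or 0 if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with a 4xx/5xx status (e.g. the 500 on /apps).
        return error.code
    except OSError:
        # Connection refused, DNS failure, or timeout (URLError is an OSError).
        return 0


def keep_alive(url, interval_seconds=300, rounds=None, fetch=ping, sleep=time.sleep):
    """Query url every interval_seconds and return the list of status codes.

    rounds=None loops forever; fetch and sleep are injectable so the loop
    can be exercised without network access or real waiting.
    """
    statuses = []
    count = 0
    while rounds is None or count < rounds:
        status = fetch(url)
        statuses.append(status)
        print(f"{url} -> {status}")
        count += 1
        if rounds is None or count < rounds:
            sleep(interval_seconds)
    return statuses
```

For example, `keep_alive("https://squidex.example.com/", rounds=12)` would query the (hypothetical) host once every 5 minutes for about an hour. A hosted uptime monitor does the same job without you having to run the script somewhere.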

stefanolsenn · November 4, 2020, 10:03am

That’s great! Just ping me on github if I need to answer or provide anything!

Funny that you mention it, I’ve already set up such a service.

Let’s see how this goes for now. Thanks a lot for the service!

stefanolsenn · November 25, 2020, 9:38am

Hi again!

Since last time, I’ve set up an uptime robot that pings both the mongodb container instance and the Squidex web app every minute.

This morning I noticed that a ping on the mongodb port had timed out. So the container instance was unresponsive for a maximum of 1 min - probably less, since the next ping was OK.

Then I checked the Squidex app - it throws the same 500 error that I previously reported in this thread.
So I guess it has something to do with the db connection. I don’t know which call causes Squidex to go into a failed state - but if the request fails, I have to restart the Squidex app before I can use it again. The good thing is that all the content is still queryable.

I checked the logs for the mongodb container instance, and I couldn’t see anything that stood out.

So, a feature request from me: could you add some kind of retry policy for the requests that fail, so that it does not go into a failed state if a single request fails? I do not have enough insight into Squidex to say whether it is Orleans or Squidex that should be patched for this. Either way, I think having this feature would improve the system.
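To make the request concrete, this is roughly the kind of retry policy meant here, as a minimal Python sketch. It is not Squidex or Orleans code (Squidex itself is a .NET application); `call_with_retry`, its parameters, and the choice of exponential backoff are assumptions made purely for illustration.

```python
import time


def call_with_retry(
    operation,
    attempts=3,
    base_delay=0.5,
    retry_on=(ConnectionError, TimeoutError),
    sleep=time.sleep,
):
    """Call operation(), retrying transient failures with exponential backoff.

    Waits base_delay, then 2 * base_delay, then 4 * base_delay, ... between
    attempts. The last failure is re-raised so the caller still sees the error
    instead of the component silently staying in a failed state.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original exception
            sleep(base_delay * 2 ** (attempt - 1))
```

With a policy like this, a database that is unreachable for a few seconds would cost one or two retries rather than leaving the caller broken until a restart. Which exceptions count as transient is the important design choice; the two listed are only placeholders.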

In the meantime I think I will migrate the mongodb instance to another cloud provider. I do not have the time to investigate further why the mongodb becomes unresponsive at random times.

:: Update - I didn’t restart the web app right away. I tried to clear localstorage, and was able to access the /app endpoint after the login attempt. Strange…

Sebastian · November 25, 2020, 10:30am

Sure, I could do that, but I would also think about a liveness probe in your case. This would solve the problem I guess.

stefanolsenn · November 25, 2020, 10:52am

So a liveness probe as part of the mongodb container instance?

Sebastian · November 25, 2020, 1:32pm

No, it should observe squidex and then it will just restart it.
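The logic of such a probe can be sketched in a few lines of Python. This is only an illustration of the idea, not a Squidex feature: `LivenessProbe`, `http_ok`, and the failure threshold are hypothetical, and how the restart is actually triggered depends on the hosting platform. Orchestrators such as Kubernetes provide liveness probes built in, so in practice you would configure one there rather than write your own.

```python
import urllib.error
import urllib.request


def http_ok(url, timeout=10.0):
    """Treat any answer below 500 as alive; 5xx or no answer counts as a failure.

    Assumption: an unauthenticated request may get a 4xx, which still shows
    the app is responding, whereas the failure in this thread was a 500.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as error:
        return error.code < 500
    except OSError:
        return False


class LivenessProbe:
    """Restart the observed app after several consecutive failed checks."""

    def __init__(self, check, restart, failure_threshold=3):
        self.check = check              # callable returning True when the app is healthy
        self.restart = restart          # callable that restarts the app (platform-specific)
        self.failure_threshold = failure_threshold
        self.failures = 0

    def tick(self):
        """Run one check; return True if a restart was triggered."""
        if self.check():
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.restart()
            self.failures = 0
            return True
        return False
```

Calling `tick()` on a schedule, with `check` bound to something like `lambda: http_ok(url)` for the failing endpoint, gives the behavior described above: the probe watches Squidex, not MongoDB, and restarts Squidex once it stops answering. Requiring several consecutive failures avoids restarting on a single slow response.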

stefanolsenn · December 28, 2020, 7:44am

Just an update in case anybody else experiences this strange error.

I migrated the mongoDB instance to a vm on Google Cloud (e2-medium (2 vCPUs, 4 GB memory)), and this seems to have fixed the problems we were having.

The previous deployment was a 2gig container instance on Azure. MongoDB is pretty memory hungry, so I just decided to scale up to 4gig.

system · December 30, 2020, 7:54am

This topic was automatically closed after 2 days. New replies are no longer allowed.