502 Bad Gateway on the Public Bug Tracker

Air Force One
Posts: 105
Joined: 6. Oct 2017, 16:54
Primary OS: MS Windows other
VBox Version: PUEL
Guest OSses: Windows
Location: Germany

502 Bad Gateway on the Public Bug Tracker

Post by Air Force One »

Over the last couple of months, I have occasionally seen the above-mentioned error on the public bug tracker. Sometimes it disappears within a couple of minutes, but sometimes it takes a few hours to resolve. In some cases, the whole site (not just the public bug tracker) is unreachable.

I understand that this is open source and partly free for some users. But could somebody analyze the reasons for this and fix it? It would be very nice if this simply worked. ;-)
scottgus1
Site Moderator
Posts: 20965
Joined: 30. Dec 2009, 20:14
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Windows, Linux

Re: 502 Bad Gateway on the Public Bug Tracker

Post by scottgus1 »

The whole www.virtualbox.org website goes 502 at times. The admins are aware of it, but they haven't been able to make it stop happening. Probably something outside their control. I'll ping them about it.
Air Force One
Posts: 105
Joined: 6. Oct 2017, 16:54
Primary OS: MS Windows other
VBox Version: PUEL
Guest OSses: Windows
Location: Germany

Re: 502 Bad Gateway on the Public Bug Tracker

Post by Air Force One »

I'm not an administrator, and I don't try to tell other people how to do their jobs. But I think a 502 is caused by some trouble in communication with the server, so there should be something in the logs. And it looks like I'm not the only one experiencing this, so this isn't region-dependent.

It isn't a 500 error, so it's not an error with the server itself or the application running on the server.
scottgus1
Site Moderator
Posts: 20965
Joined: 30. Dec 2009, 20:14
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Windows, Linux

Re: 502 Bad Gateway on the Public Bug Tracker

Post by scottgus1 »

Curiously, I'm not getting 502 or any other errors on the bugtracker just now. I'm in northeast USA. Maybe a regional problem? Are you getting it right now? Are you OK saying what approximate part of the planet you're on?
Air Force One
Posts: 105
Joined: 6. Oct 2017, 16:54
Primary OS: MS Windows other
VBox Version: PUEL
Guest OSses: Windows
Location: Germany

Re: 502 Bad Gateway on the Public Bug Tracker

Post by Air Force One »

I've updated my signature, so you can see that I live in the same part of the globe as fth0 does. The InnoTek HQ isn't far from here. ;-) Do you think this is geo-dependent? Like a content delivery network having trouble in Europe?

Another issue is the download limit. If I try to download more than 5 files simultaneously, I have to wait, and some of the downloads are then interrupted. Normally, all the test and development builds are released at the same time, so there are three versions with three Windows files each. I can start the first four downloads without any trouble; after that I can click, but I have to wait. Then some of the files are interrupted, and I have to repeat the download. Usually these are the ISO files.

The download speed is also an issue. The official releases download really quickly, but the test and development builds top out at about 350 KB/s on my side. So one can see that resources are being saved everywhere, and I can probably understand this. ;-)
mpack
Site Moderator
Posts: 39156
Joined: 4. Sep 2008, 17:09
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Mostly XP

Re: 502 Bad Gateway on the Public Bug Tracker

Post by mpack »

If it was server dependent then I'd think we should all get the same error. We have seen geo-dependent routing errors before.

I'm in the UK, no error from here. In fact BugTracker access seems quite snappy.

If you have the means to route the request through a VPN in a different location then that might be an interesting test.
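A minimal sketch of how such a comparison could be scripted, using only the Python standard library. The helper names `probe` and `classify` are made up for illustration, and the hints returned by `classify` are rough rules of thumb rather than a diagnosis.

```python
import urllib.error
import urllib.request


def classify(status):
    """Map an HTTP status code (or None) to a rough hint about the failure."""
    if status is None:
        return "no HTTP response (DNS, routing, or timeout)"
    if status in (502, 503, 504):
        return "gateway or proxy got no usable answer from a backend"
    if 500 <= status < 600:
        return "error reported by the server or application"
    if 200 <= status < 400:
        return "ok"
    return "client-side error"


def probe(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx answers; the code is still useful.
        return err.code
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, and timeouts.
        return None
```

Running `classify(probe("https://www.virtualbox.org/"))` once directly and once through the VPN, at roughly the same time, would show whether both routes see the same status.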
scottgus1
Site Moderator
Posts: 20965
Joined: 30. Dec 2009, 20:14
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Windows, Linux

Re: 502 Bad Gateway on the Public Bug Tracker

Post by scottgus1 »

At 2:18 PM Eastern Time USA post daylight savings, 11 minutes after Mpack's post above, the Bugtracker is up and alive in Northeast USA.
fth0
Volunteer
Posts: 5616
Joined: 14. Feb 2019, 03:06
Primary OS: Mac OS X other
VBox Version: PUEL
Guest OSses: Linux, Windows 10, ...
Location: Germany

Re: 502 Bad Gateway on the Public Bug Tracker

Post by fth0 »

Before the move of www.virtualbox.org to new servers in early June 2023, multi-hour outages happened regularly, perhaps about once a month. After the move, I've noticed multi-hour outages only once or twice, and they usually happened worldwide (according to isitdownrightnow.com, for example), so the move was a big improvement from my POV.

FWIW, I've seen the "502 Bad Gateway" occasionally during the last multi-hour outage a few weeks ago (~2023-10-31).
klaus
Oracle Corporation
Posts: 1073
Joined: 10. May 2007, 14:57

Re: 502 Bad Gateway on the Public Bug Tracker

Post by klaus »

First of all: we're very sorry that the stability of www.virtualbox.org has not improved since the move to the new setup in July (in the last few weeks it has been noticeably worse than the average before). We hoped that we'd "lose" the bug affecting www.virtualbox.org specifically.

The "502 Bad Gateway" error is just a new symptom of the old issue: somehow the Oracle HTTP Server worker thread pool gets completely depleted. In the old setup this resulted in an unresponsive website; with the new setup, the load balancer shows this error message when all application servers are unresponsive.

It appears that the issue is a "cooperation" of the Trac application and OHS, but the issue is hard to track down because it takes between hours and weeks for it to manifest.

We're trying to find a way to avoid this problem, and will probably run some experiments later this week.
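To illustrate the mechanism described above, here is a toy model (not the actual OHS or load balancer code) of backends with fixed-size worker pools. The class `Backend` and the function `route` are invented for this sketch; a real load balancer would also use health checks and timeouts.

```python
class Backend:
    """Toy application server with a fixed number of worker threads."""

    def __init__(self, name, workers):
        self.name = name
        self.free_workers = workers

    def accept(self):
        """Claim a worker for a request; return False if the pool is depleted."""
        if self.free_workers == 0:
            return False
        self.free_workers -= 1
        return True

    def finish(self):
        """Return a worker to the pool once its request completes."""
        self.free_workers += 1


def route(backends):
    """Hand a request to the first backend with a free worker, else answer 502."""
    for backend in backends:
        if backend.accept():
            return 200
    return 502
```

If requests hang and never call `finish()`, every pool eventually drains and `route` answers 502 for all further requests, even though the servers themselves are still up.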
klaus
Oracle Corporation
Posts: 1073
Joined: 10. May 2007, 14:57

Re: 502 Bad Gateway on the Public Bug Tracker

Post by klaus »

Took longer than I thought (the devil is in the details, as usual) to switch our setup over from using mod_wsgi to gunicorn... I tested as much as I could today and it worked smoothly, but I don't want to ruin someone's weekend.

Therefore I came up with a compromise: I've put this in action on one of the backend servers at approx 15:30 UTC (a little before I sent this message). This will achieve two things: if I made a mistake then it will not break things completely, and additionally even a single server could keep the site alive if the others are going into the 'all worker threads hang' state.
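For readers unfamiliar with gunicorn: its configuration file is plain Python. The fragment below is only a sketch of settings that bound how long a stuck worker can stay stuck. All values are assumptions chosen for illustration, not the configuration actually used on virtualbox.org.

```python
# gunicorn.conf.py (hypothetical values, for illustration only)
import multiprocessing

# Assumed local address that a front-end proxy would forward requests to.
bind = "127.0.0.1:8000"

# Number of worker processes; (2 x cores) + 1 is a common starting point.
workers = multiprocessing.cpu_count() * 2 + 1

# With the default sync worker class, a worker that stays silent longer
# than this many seconds is killed and restarted by the master process.
timeout = 60

# Seconds a worker gets to finish in-flight requests after a restart signal.
graceful_timeout = 30

# Recycle each worker after this many requests, with random jitter so that
# the workers do not all restart at the same moment.
max_requests = 1000
max_requests_jitter = 100
```

Such a file would be passed with `gunicorn -c gunicorn.conf.py module:app`, where `module:app` is a placeholder for the WSGI application.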
scottgus1
Site Moderator
Posts: 20965
Joined: 30. Dec 2009, 20:14
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Windows, Linux

Re: 502 Bad Gateway on the Public Bug Tracker

Post by scottgus1 »

Thanks very much, Klaus, for taking a whack at this!
klaus
Oracle Corporation
Posts: 1073
Joined: 10. May 2007, 14:57

Re: 502 Bad Gateway on the Public Bug Tracker

Post by klaus »

Before I totally forget: since yesterday all servers are running the new config.

If you saw a 502 today... that was for a totally different reason and also caused the forums to be down. The DB server had trouble and went into offline mode. The true root cause is unclear; the applications worked as usual before and after, without human intervention.
klaus
Oracle Corporation
Posts: 1073
Joined: 10. May 2007, 14:57

Re: 502 Bad Gateway on the Public Bug Tracker

Post by klaus »

The last 2 weeks were pretty rough for virtualbox.org. Six outages affecting both www.virtualbox.org and forums.virtualbox.org are not what we consider normal. The high-level reason is that the DB server is going down due to running out of disk space. Not that it is short on space (we're using little more than 10% of the capacity), but suddenly it starts writing to temporary tables as fast as it can. As a result, things are slower than usual for about an hour, and then it is down.

We're working both on investigating what triggers this "out of the blue" and on speeding up the reaction time to the outages, which is much too long.
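Since the disk takes about an hour to fill, one possible early warning is to sample free space periodically and extrapolate. This is a sketch under that assumption, not the monitoring actually in use; `free_bytes` and `seconds_until_full` are hypothetical helper names.

```python
import shutil


def free_bytes(path):
    """Return the free space, in bytes, on the filesystem containing path."""
    return shutil.disk_usage(path).free


def seconds_until_full(free_then, free_now, interval_s):
    """Extrapolate time until the disk is full from two free-space samples.

    free_then and free_now are byte counts taken interval_s seconds apart.
    Returns None if free space is not shrinking.
    """
    consumed = free_then - free_now
    if consumed <= 0:
        return None
    rate = consumed / interval_s  # bytes written per second
    return free_now / rate
```

An alert could fire when the estimate drops below some threshold, for example 30 minutes, which would leave time to react before the DB server goes offline.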