I would also put into question if you _really_ need to check for updates every 5 minutes. Once per startup is already enough, and if you're concerned about users who leave it on for days, it could easily be daily or even less often.
Lammy 6 hours ago [-]
A 5 minute update check interval is usage-reporting in disguise. Way fewer people would turn off a setting labeled “check for updates” than one labeled “report usage statistics”.
bilekas 6 hours ago [-]
Don’t give them ideas!!
YetAnotherNick 6 hours ago [-]
Do they say that they don't do any usage reporting?
TowerTall 4 hours ago [-]
from their FAQ on the buttom of the fronpage:
Screen Studio can collect basic usage data to help us improve the app, but you can opt out of it during the first launch. You can also opt out at any time in the app settings.
Spivak 6 hours ago [-]
Eh, this one is probably ignorance over malice. It's super common to see people who need to make an arbitrary interval choice go with 300 out of habit.
If the update interval had been 1 day+, they probably wouldn't have noticed after one month when they had a 5 minute update check interval.
karhuton 6 hours ago [-]
To be as user friendly as possible, always ask if user wants automatic background updates or not. If you can’t update without user noticing it, please implement manual updates as two mechanisms:
1) Emergency update for remote exploit fixes only
2) Regular updates
The emergency update can show a popup, but only once. It should explain the security risk. But allow user to decline, as you should never interrupt work in progress. After decline leave an always visible small warning banner in the app until approved.
The regular update should never popup, only show a very mild update reminder that is NOT always visible, instead behind a menu that is frequently used. Do not show notification badges, they frustrate people with inbox type 0 condition.
This is the most user friendly way of suggesting manual updates.
You have to understand, if user has 30 pieces of software, they have to update every day of the month. That is not a good overall user experience.
zveyaeyv3sfye 3 hours ago [-]
> You have to understand, if user has 30 pieces of software, they have to update every day of the month. That is not a good overall user experience.
That's not an user issue tho, it's a "packaging and distribution of updates" issue which coincidentally has been solved for other OS:es using a package manager.
tom1337 4 hours ago [-]
I'd also question if the updater needs to download the update before the user saying they want it. Why not check against a simple endpoint if a newer version is available and if so, prompt the user that an update could be downloaded and then download it. This would also allow the user to delay the update if they are on metered connections.
m3adow 6 hours ago [-]
First thing I thought as well. Every 5 minutes for a screen recording software is an absurd frequency. I doubt they release multiple new versions per day.
ghurtado 6 hours ago [-]
IIRC, Every 5 minutes used to be the standard interval between email checks, back in the days of dialup and desktop email clients.
How the times have changed ..
lucb1e 6 hours ago [-]
It's near-instant now not usually because of more incessant polling, but because it simply keeps the connection open (can last many hours without sending a single byte, depending also on the platform) and writes data onto it as needed (IMAP IDLE). This has gotten more efficient if anything
pjmlp 5 hours ago [-]
And because how expensive they were in Portugal, I never done it, it was always on manual.
wodenokoto 5 hours ago [-]
Depends on the application. I have my browser running for months at a time.
atoav 6 hours ago [-]
Yeah but that should be a variable anyways. Maybe even a variable provided by the server. But in this case it should be on demand. with the old version cached and only downloading the new one when there is a new version once a day.
atoav 6 hours ago [-]
Yeah but that should be a variable anyways. Maybe even a variable provided by the server.
jve 6 hours ago [-]
> Screen Studio is a screen recorder for macOS. It is desktop app. It means we need some auto-updater to allow users to install the latest app version easily.
No, it doesn't mean that.
Auto updater introduced series of bad outcomes.
- Downloading update without consent, causing traffic for client.
- Not only that, the download keeps repeating itself every 5 minutes? You did at least detect whether user is on metered connection, right... ?
- A bug where update popup interrupts flow
- A popup is a bad thing on itself you do to your users. I think it is OK when closing the app and let the rest be done in background.
- Some people actually pay attention to outgoing connections apps make and even a simple update check every 5 minutes is excessive. Why even do it while app is running? Do on startup and ask on close. Again some complexity: Assume you're not on network, do it in background and don't bother retrying much.
- Additional complexity for app that caused all of the above. And it came with a price tag to developer.
Wouldn't app store be perfect way to handle updates in this case to offload the complexity there?
HelloNurse 5 hours ago [-]
App store updates are perfect: no unnecessary complications, no unnecessary work (assuming Screen Studio is published and properly updated in the app store), and the worst case scenario is notifications about a new Screen Studio version ending up in a Screen Studio recording in progress.
Thinking of it, the discussed do-it-yourself update checking is so stupid that malice and/or other serious bugs should be assumed.
socalgal2 5 hours ago [-]
Does the app store handle staged rollouts?
That was a thing I thought was missing from this writeup. Ideally you only roll up the update to a small percent of users. You then check to see if anything broke (no idea how long to wait, 1 day?). Then you increase the percent a little more (say, 1% to 5%) and wait a day again and check. Finally you update everyone (who has updates on)
dahcryn 3 hours ago [-]
yes obviously something as mature as the App store supports phased rollout. I believe it is even the default setting once you reach certain audience sizes. Updates are always spread over 7 days slowly increasing the numbers
djxfade 4 hours ago [-]
Yes it does support this
Nition 3 hours ago [-]
While we're listing complaints... 250MB for a screen recorder update?
donatj 6 hours ago [-]
I am always kind of a stickler about code reviews. I once had a manager tell me that I should leave more to QA with an offhand comment along the lines of "what is the worst that could happen" to which I replied without missing a beat "We all lose our jobs. We are always one bad line of code away from losing our jobs"
The number of times I have caught junior or even experienced devs writing potential PII leaks is absolutely wild. It's just crazy easy in most systems to open yourself up to potential legal issues.
canucker2016 21 minutes ago [-]
...And if there's no one around to review the code?
The website makes it seem like it's a one person shop.
jarym 6 hours ago [-]
Just amazed that ‘better testing’ isn’t one of the takeaways in the summary.
Just amazed. Yea ‘write code carefully’ as if suggesting that’ll fix it is a rookie mistake.
So so frustrating when developers treat user machines like their test bed!
fifilura 5 hours ago [-]
Contrarian approach: $8000 is not a lot in this context. What did the CEO think of this? Most of the time it is just a very small speed bump in the overall finances of the company.
Avoidable, unfortunate, but the cost of slowing down development progress e.g. 10% is much higher.
But agree that senior gatekeepers should know by heart some places where review needs to be extra careful. Like security pitfalls, exponential fallback of error handling, and yeah, probably this.
Klaster_1 5 hours ago [-]
How do you adjust your testing approach to catch cases like this? In my experience, timing related issues are hard to catch and can linger for years unnoticed.
doix 5 hours ago [-]
I would mock/hook/monkey patch/whatever the functions to get the current time/elapsed time, simulate a period of time (a day/week/month/year/whatever), make sure the function to download the file is called the correct amount of times. That would probably catch this bug.
Although, after such a fuck up, I would be tempted to make a pre-release check that tests the compiled binary, not any unit test or whatever. Use LD_PRELOAD to hook the system timing functions(a quick google shows that libfaketime[0] exists, but I've never used it), launch the real program and speed up time to make sure it doesn't try to download more than once.
> We decided to take responsibility and offer to cover all the costs related to this situation.
Good on them. Most companies would cap their responsibility at a refund of their own service's fees, which is understandable as you can't really predict costs incurred by those using your service, but this is going above and beyond and it's great to see.
weird-eye-issue 6 hours ago [-]
"Luckily, it was not needed"
sevg 6 hours ago [-]
Why in the world does it need to check for updates every 5 minutes?
The author seemed to enjoy calculating the massive bandwidth numbers, but didn’t stop to question whether 5 minutes was a totally ridiculous.
The scale is astounding. I was briefly interested in the person that caused the error then immediately realized it was irrelevant because if a mechanism doesn't exist to catch an issue like that, then any company is living on borrowed time.
voidUpdate 5 hours ago [-]
whyyy does wikipedia not redirect mobile links to the desktop website when you have a desktop UA?
wodenokoto 3 hours ago [-]
Because people on desktops asking for the mobile site should be able to view the mobile site.
The url specifically asks Wikipedia to serve the mobile site.
voidUpdate 3 hours ago [-]
Well when I follow a desktop link on my phone, it redirects me to the mobile version, despite the URL specifically asking to serve the desktop site, it just doesn't work the other way around. Plus I never asked to see the mobile site, I followed a link someone else posted
xigoi 4 hours ago [-]
Why do they have a separate mobile website at all instead of writing proper CSS to make one website work on all devices?
They had no cost usage alerts. So they even did not know that the thing was happening, just realized with the first bill.
I think that is the essence of what is wrong with the cloud costs. Defaulting to possibility for everyone to scale rapidly while in reality 99% have quite predictable costs month over month.
saretup 6 hours ago [-]
Not to mention the cost users paid to download 250 MB every 5 minutes.
pests 6 hours ago [-]
It was mentioned, at the bottom. One customer even had their ISP cancel their service.
ghurtado 6 hours ago [-]
It seems a bit self centered to make their lost $8000 the focus of the article.
The title should have been: "how a single line of code cost our users probably more than $8000"
moi2388 2 hours ago [-]
Why on earth are you checking for updates every 5 minutes to begin with?!
Seriously this alone makes me question everything about this app.
pornel 6 hours ago [-]
It would also be nice if the update archive wasn't 250MB. Sparkle framework supports delta updates, which can cut down the traffic considerably.
mathverse 6 hours ago [-]
This is an electron app.
dahcryn 3 hours ago [-]
which is their design choice, not an obligation.
Electron really messed up a few things in this world
mimimi31 6 hours ago [-]
>Add special signals you can change on your server, which the app will understand, such as a forced update that will install without asking the user.
I understand the reasoning, but that makes it feel a bit too close to a C&C server for my liking. If the update server ever gets compromised, I imagine this could increase the damage done drastically.
bryanrasmussen 6 hours ago [-]
I read that as "A single line of code costs $8000" which, from the comments seems like a few others had the same thought. Reading the article it is not costs and the original title is "One line of code that did cost $8,000", so as some others have pointed out it is a bug that cost $8000.
aranw 4 hours ago [-]
I have Screen Studio and I don't leave it open but all I wish for now is that you disable the auto updater. Provide an option for it to be disabled and allow for manual update checking. Checking for an update every 5 minutes is total overkill and downloading the update automatically is just bad. What if I was on mobile internet and had limited bandwidth and usage. The last thing I want is an app downloading it's own update without my consent and knowledge.
hardwaresofton 6 hours ago [-]
Bugs are great chances to learn.
What might be fun is figuring out all the ways this bug could have been avoided.
Another way to avoid this problem would have been using a form of “content addressable storage”. For those who are new, this is just a fancy way of saying make sure to store/distribute the hash (ex. Sha256) of what you’re distributing and store it on disk in a way that content can be effectively deduplicated by name.
It’s probably not so easy as to make it a rule, but most of the time, an update download should probably do this
ghurtado 6 hours ago [-]
> out all the ways this bug could have been avoided.
The most obvious one is setting up billing alerts.
Past a certain level of complexity, you're often better off focusing on mitigation that trying to avoid every instance of a certain kind of error.
HelloNurse 31 minutes ago [-]
Note that billing alerts protect against unexpected network traffic, not directly against bugs and bad design in the software. Update checking remains a terrible idea.
jumploops 6 hours ago [-]
Oh boy, I know of at least one case where a single line of code cost ~$500k…
Curious where the high-water mark is across all HNers (:
explodes 4 hours ago [-]
Others have reported higher already, but for data:
Our team had a bug that cost us about $120k over a week.
Another bug running on a large system had an unmeasurable cost. (Could $K, could be $M)
agos 4 hours ago [-]
I would be surprised if half of the user on this site did _not_ create or personally see a bug where a line cost way more than $8000
short_sells_poo 5 hours ago [-]
$1.2mln, gone in about 30 minutes.
indrex 3 hours ago [-]
Plenty of (valid) criticism in the comments, but I appreciate the developer for publishing it.
insin 5 hours ago [-]
Knowing where to put the line: $7999 (is sadly not the story)
handfuloflight 6 hours ago [-]
Would have cost $0 on Cloudflare's R2.
poleguy 6 hours ago [-]
Ever consider not using cloud for everything? Hosting this on traditional hosting would have limited the problem and the cost.
M95D 6 hours ago [-]
And in that case, the problem would not be discovered until 1) someone opened a bug report, which rarely happens, because any competent user would just disable auto-updates, and 2) that bug report would be investigated, which also rarely happens.
cess11 4 hours ago [-]
It's not like you are forbidden to monitor your services just because you didn't put them in big clown.
I assume most of that 2PB network traffic was not egress, right? Otherwise how did it "only" cost you $8k on Google Cloud?
Even at a cost of 0.02$ per GB, which is usually a few times lower than the actual prices I could find there, that would still result in an invoice of about $40k...
byyll 5 hours ago [-]
I'll let my employer know to update my salary or reduce my workload.
navigate8310 6 hours ago [-]
So did you pay or Google showed you mercy by chewing their potential earnings?
cyprx 6 hours ago [-]
meanwhile the CTOs plan to apply AI into their production codebases :)
bilekas 6 hours ago [-]
> While refactoring it, I forgot to add the code to stop the 5-minute interval after the new version file was available and downloaded.
I’m sorry but it’s exactly cases like these that should be covered by some kind of test, especially When diving into a refactor. Admittedly it’s nice to hear people share their mistakes and horror stories, I would get some stick for this at work.
jer0me 6 hours ago [-]
(2023)
suraci 6 hours ago [-]
A bug cost $8000
jiggawatts 6 hours ago [-]
These articles are great, but I have to one-up the blog: I recently helped a small dev team clean up a one-line mistake that cost them $95,000... which they didn't notice for three months.
The relevance is that instead of checking for a change every 5 minutes, the delay wasn't working at all, so the check ran as fast as possible in a tight loop. This was between a server and a blob storage account, so there was no network bottleneck to slow things down either.
It turns out that if you read a few megabytes 1,000 times per second all day, every day, those fractions of a cent per request are going to add up!
silverfrost 5 hours ago [-]
In other news a screen recorder app is a 250MB (presumably compressed) download...
zveyaeyv3sfye 3 hours ago [-]
FWIW, OBS is ~150 MB, not an electron app and actually open source.
Screen Studio can collect basic usage data to help us improve the app, but you can opt out of it during the first launch. You can also opt out at any time in the app settings.
If the update interval had been 1 day+, they probably wouldn't have noticed after one month when they had a 5 minute update check interval.
1) Emergency update for remote exploit fixes only
2) Regular updates
The emergency update can show a popup, but only once. It should explain the security risk. But allow user to decline, as you should never interrupt work in progress. After decline leave an always visible small warning banner in the app until approved.
The regular update should never popup, only show a very mild update reminder that is NOT always visible, instead behind a menu that is frequently used. Do not show notification badges, they frustrate people with inbox type 0 condition.
This is the most user friendly way of suggesting manual updates.
You have to understand, if user has 30 pieces of software, they have to update every day of the month. That is not a good overall user experience.
That's not an user issue tho, it's a "packaging and distribution of updates" issue which coincidentally has been solved for other OS:es using a package manager.
How the times have changed ..
No, it doesn't mean that.
Auto updater introduced series of bad outcomes.
- Downloading update without consent, causing traffic for client.
- Not only that, the download keeps repeating itself every 5 minutes? You did at least detect whether user is on metered connection, right... ?
- A bug where update popup interrupts flow
- A popup is a bad thing on itself you do to your users. I think it is OK when closing the app and let the rest be done in background.
- Some people actually pay attention to outgoing connections apps make and even a simple update check every 5 minutes is excessive. Why even do it while app is running? Do on startup and ask on close. Again some complexity: Assume you're not on network, do it in background and don't bother retrying much.
- Additional complexity for app that caused all of the above. And it came with a price tag to developer.
Wouldn't app store be perfect way to handle updates in this case to offload the complexity there?
Thinking of it, the discussed do-it-yourself update checking is so stupid that malice and/or other serious bugs should be assumed.
That was a thing I thought was missing from this writeup. Ideally you only roll up the update to a small percent of users. You then check to see if anything broke (no idea how long to wait, 1 day?). Then you increase the percent a little more (say, 1% to 5%) and wait a day again and check. Finally you update everyone (who has updates on)
The number of times I have caught junior or even experienced devs writing potential PII leaks is absolutely wild. It's just crazy easy in most systems to open yourself up to potential legal issues.
The website makes it seem like it's a one person shop.
Just amazed. Yea ‘write code carefully’ as if suggesting that’ll fix it is a rookie mistake.
So so frustrating when developers treat user machines like their test bed!
Avoidable, unfortunate, but the cost of slowing down development progress e.g. 10% is much higher.
But agree that senior gatekeepers should know by heart some places where review needs to be extra careful. Like security pitfalls, exponential fallback of error handling, and yeah, probably this.
Although, after such a fuck up, I would be tempted to make a pre-release check that tests the compiled binary, not any unit test or whatever. Use LD_PRELOAD to hook the system timing functions(a quick google shows that libfaketime[0] exists, but I've never used it), launch the real program and speed up time to make sure it doesn't try to download more than once.
[0] https://github.com/wolfcw/libfaketime
Good on them. Most companies would cap their responsibility at a refund of their own service's fees, which is understandable as you can't really predict costs incurred by those using your service, but this is going above and beyond and it's great to see.
The author seemed to enjoy calculating the massive bandwidth numbers, but didn’t stop to question whether 5 minutes was a totally ridiculous.
https://en.m.wikipedia.org/wiki/Knight_Capital_Group#2012_st...
440m usd
The url specifically asks Wikipedia to serve the mobile site.
Previous discussion: https://news.ycombinator.com/item?id=35858778
I think that is the essence of what is wrong with the cloud costs. Defaulting to possibility for everyone to scale rapidly while in reality 99% have quite predictable costs month over month.
The title should have been: "how a single line of code cost our users probably more than $8000"
Seriously this alone makes me question everything about this app.
Electron really messed up a few things in this world
I understand the reasoning, but that makes it feel a bit too close to a C&C server for my liking. If the update server ever gets compromised, I imagine this could increase the damage done drastically.
What might be fun is figuring out all the ways this bug could have been avoided.
Another way to avoid this problem would have been using a form of “content addressable storage”. For those who are new, this is just a fancy way of saying make sure to store/distribute the hash (ex. Sha256) of what you’re distributing and store it on disk in a way that content can be effectively deduplicated by name.
It’s probably not so easy as to make it a rule, but most of the time, an update download should probably do this
The most obvious one is setting up billing alerts.
Past a certain level of complexity, you're often better off focusing on mitigation that trying to avoid every instance of a certain kind of error.
Curious where the high-water mark is across all HNers (:
Our team had a bug that cost us about $120k over a week.
Another bug running on a large system had an unmeasurable cost. (Could $K, could be $M)
Set up daily emails.
Set up cost anomaly alerts.
I’m sorry but it’s exactly cases like these that should be covered by some kind of test, especially When diving into a refactor. Admittedly it’s nice to hear people share their mistakes and horror stories, I would get some stick for this at work.
The relevance is that instead of checking for a change every 5 minutes, the delay wasn't working at all, so the check ran as fast as possible in a tight loop. This was between a server and a blob storage account, so there was no network bottleneck to slow things down either.
It turns out that if you read a few megabytes 1,000 times per second all day, every day, those fractions of a cent per request are going to add up!
https://obsproject.com/