Today's Massive AWS Outage That Took Down Your Favorite Sites Is Still Going On (www.cnet.com)
from return2ozma@lemmy.world to technology@lemmy.world on 20 Oct 20:50
https://lemmy.world/post/37627525

#technology

Eldritch@piefed.world on 20 Oct 21:21 next collapse

It's almost like we shouldn't rely on just a few central sites. And that everything should be democratized and federated.

sp3ctr4l@lemmy.dbzer0.com on 21 Oct 01:54 collapse

I2P

geti2p.net/en/

Basically, what if the entire internet was torrents, everyone was seeding / routing to everyone else, oh and also it’s more private/secure than Tor or VPNs?

Downside is it is quite slow… but if it caught on more widely, that could be alleviated somewhat.

errer@lemmy.world on 21 Oct 02:19 collapse

I think I’d really need to know how “somewhat” alleviated it is to have any interest, given the status quo is like, a second.

sp3ctr4l@lemmy.dbzer0.com on 21 Oct 03:19 collapse

Uh, no clue, the math on that would be very difficult to calculate, exceedingly complicated.

Have you ever been able to accurately predict the actual speed at which you download a torrent that’s the size of a whole day’s worth of your regular internet usage?

It’s basically a dynamic mesh network, you could run the math on 100 different scenarios, get 100 different results, and also have no clue which scenario is more or less realistic.

The way I2P works is by, step one, encrypting your traffic, step two, bundling that into a bigger packet together with the traffic of peers near you on the network, which then gets its own layer of encryption around all of it, and then that gets sent somewhere else.

So, upside is, even if your packets are intercepted… it’s basically impossible to figure out which subpart of the bigger packet is whose.

You only have the keys to your part of that bigger packet.
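
To make that layering a bit more concrete, here is a toy sketch in Python. It is only a model of the idea described above, not I2P's actual protocol or cryptography: the XOR "cipher" is deliberately simplistic and insecure, and every name in it (`keystream_xor`, `bundle`, `unbundle`) is made up for illustration.

```python
import hashlib
import json
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256-derived keystream.

    Applying it twice with the same key returns the original data.
    This is NOT secure and is NOT what I2P uses; it only stands in
    for "some encryption" so the layering can be shown.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def bundle(messages: list[tuple[bytes, bytes]], outer_key: bytes) -> bytes:
    """Encrypt each (key, plaintext) pair, shuffle, then wrap the lot again."""
    inner = [keystream_xor(key, text).hex() for key, text in messages]
    # Shuffle so position in the bundle says nothing about the sender.
    secrets.SystemRandom().shuffle(inner)
    return keystream_xor(outer_key, json.dumps(inner).encode())


def unbundle(packet: bytes, outer_key: bytes) -> list[bytes]:
    """Strip only the outer layer; the inner parts stay encrypted."""
    return [bytes.fromhex(h) for h in json.loads(keystream_xor(outer_key, packet))]
```

Whoever holds the outer key can split the bundle into parts, but each part is still ciphertext, and a given inner key only turns one of them back into readable text.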

Downside of all this is that all that packet bundling takes time, and is dynamically reconfigured, so… yeah, doing a ‘from first principles’ estimate is… I dunno, find a chaos mathematics specialist for a more precise answer?

Possibly also worth mentioning: You can use I2P as basically something like Tor/a VPN to access the non-I2P net, the normal internet. You do this by using what is called an outproxy.

Theoretically an outproxy could be just giving all its packets right over to the NSA, but again, you’ve got that kind of encrypted packet sausage going on, and the network is much more complex and distributed than the Tor network’s smaller number of more centralized nodes.

Xanthobilly@lemmy.world on 21 Oct 06:07 collapse

What do you recommend to read up on it more? I’ve read the wiki and this. I’m wondering if there’s more to understand about browsing or connecting with content or like-minded people.

zip@lemmy.blahaj.zone on 21 Oct 07:04 next collapse

I’m wondering the same thing! If you get an answer, would you mind letting me know, if it’s not too much trouble? I’ve read a lot about it, but it still feels like I’m missing/not understanding most of it. That may just be a ‘me & my crappy brain’ issue, though.

sp3ctr4l@lemmy.dbzer0.com on 21 Oct 09:42 collapse

I am not sure about further ‘reading’, but the youtuber Mental Outlaw has a number of videos that do a pretty decent job of introducing and explaining I2P as a concept, as well as some videos that walk you through at least one way to do the actual setup process.

Though, some of those may be slightly out of date by now, those vids are I think a few years old at this point.

There’s also the difference between I2P proper, which I believe is still done wholly in Java, and i2pd, which is basically the same I2P, but rewritten in C++.

And, depending on how you’re going to want to use I2P on your system… as in uh, just a shunt or mode for a specific program, vs trying to reroute your whole OS’s traffic through it… that gets messy and complicated fast, depending on your setup, and exactly what you want to do.

Also, depending on your ISP / router situation, you may or may not have to futz with opening a port for I2P on your router firewall.

expatriado@lemmy.world on 20 Oct 21:37 next collapse

your favorite sites

looks at list

nop

mycodesucks@lemmy.world on 20 Oct 21:56 next collapse

mushroommunk@lemmy.today on 20 Oct 22:04 next collapse

I get what you’re saying, and agree, but there were many more, Ancestry.com and findagrave.com and many more were also down (while I’m in the middle of an ancestry fact finding trip). It really was massive.

bamboo@lemmy.blahaj.zone on 21 Oct 03:16 collapse

Ancestry.com and findagrave.com are kinda the funniest examples that could be picked from the sites being affected today. Obviously there’s the parallel of AWS being dead today, but I also can’t imagine those sites get so many updates that being off them for a while would make you miss something timely. I totally hate being in the groove when something out of my control impedes my workflow, don’t get me wrong, and can totally see how the outages would be annoying.

artyom@piefed.social on 21 Oct 01:56 collapse

Only site that got me was Riverside ☹️

Today@lemmy.world on 20 Oct 21:46 next collapse

Worst part for me was when the rewards app at the tea shop wouldn’t work.

sentient_loom@sh.itjust.works on 20 Oct 23:04 next collapse

What kind of tea did you get?

Today@lemmy.world on 20 Oct 23:39 collapse

Iced combo of lemonade, plain, and blueberry. Crazy refreshing.

sentient_loom@sh.itjust.works on 21 Oct 01:30 collapse

I think that’s called juice lol

sounds good though

victorz@lemmy.world on 20 Oct 23:10 collapse

I hated it when I was trying to brake to avoid hitting that pedestrian at the crosswalk, and the brake pedal input did a roundtrip to AWS before activating the wheel brakes. For user statistics, for my safety. Not at all for AI training, we swear.

Oh well, had no choice but to drive-by that old hag.

MadMadBunny@lemmy.ca on 20 Oct 21:48 next collapse

Okay, who messed with the Network Switches?

lnxtx@sopuli.xyz on 20 Oct 21:50 collapse

It’s always DNS™

frustrated_phagocytosis@fedia.io on 20 Oct 22:12 next collapse

I literally noticed zero difference. But it sounds bad. Have they tried shoving more AI in there to fix the problem?

dissentiate@lemmy.dbzer0.com on 20 Oct 23:04 next collapse

shalafi@lemmy.world on 21 Oct 01:01 collapse

My DNS is a Lightsail instance out west, no issue.

yoshisaur@lemmy.blahaj.zone on 20 Oct 23:06 next collapse

Can’t even do any of my work for college until the outage is over

MyNameIsAtticus@lemmy.world on 20 Oct 23:10 collapse

Same. I’m sitting in my college’s library right now trying to work and the outage threw a wrench in all of my plans. I’m thankful I downloaded the files to my hard drive though so I can do most of the work on Pen and Paper

yoshisaur@lemmy.blahaj.zone on 20 Oct 23:39 collapse

Yeah. I’m really glad that I already finished all my work that was due today. Otherwise I’d be screwed. I was trying to get a head start on some stuff due Wednesday, but I guess I can’t until AWS is back up

SparkyBauer44@lemmy.world on 20 Oct 23:38 next collapse

“Sir. Sir? Sir. (sigh) have you tried turning the internet off and on again?”

HootinNHollerin@lemmy.dbzer0.com on 21 Oct 00:14 next collapse

fubarx@lemmy.world on 21 Oct 01:57 next collapse

AWS salespeople, meeting customers today.

PoopingCough@lemmy.world on 21 Oct 02:36 collapse

Today should have been easy for them tbh. “See? That’s why you should pay us more money to have active infra in our other regions to fail over to!”

other_cat@piefed.zip on 21 Oct 02:33 next collapse

The laundry room monitor in my building went down lol… (The monitor being the thing that lets you check a website to see if any of the machines are free or when they’re done, etc.)

MSids@lemmy.world on 21 Oct 04:16 next collapse

I don’t even want to hear an argument for moving back on prem with how badly Broadcom/VMware ripped our eyes out this year. 350% increase over 2 years ago, and I still have to buy the hardware, secure it in a room, power it, buy redundant Internet and networking equipment, get a backup facility, buy and test a generator/UPS, and condition the damn air. Oh then every few years we have to switch out all the hardware when it stops getting vendor support.

At least everyone was all in the same boat today, and we all know what was broken.

Brkdncr@lemmy.world on 21 Oct 05:25 collapse

Moving to Nutanix soon. Love their product. Proxmox looks good on paper too, just not mature enough in the enterprise to bet my paycheck on it.

Cloud infrastructure is expensive.

zergtoshi@lemmy.world on 21 Oct 06:26 collapse

Lucky me needs Proxmox only for self-hosting and loves it :)

weariedfae@sh.itjust.works on 21 Oct 04:40 next collapse

I had surgery today and couldn’t pick up my meds at the pharmacy because my insurance uses AWS somewhere in the billing process. We had to pay out of pocket and pray we get reimbursed because they’re expensive. This took 6 phone calls to find out and overall, sucked. I didn’t think AWS going down could affect my damn insurance.

zip@lemmy.blahaj.zone on 21 Oct 07:07 collapse

Jeez, that’s messed up. I’m so sorry! I hope you’re able to get reimbursed without much more trouble, and I hope your recovery goes well!

renegadespork@lemmy.jelliefrontier.net on 21 Oct 04:56 next collapse

I can hear the smug grins on homelabber’s/self-hoster’s faces from here.

Mangoholic@lemmy.ml on 21 Oct 06:30 next collapse

Not like their systems never have downtime.

SkaveRat@discuss.tchncs.de on 21 Oct 08:00 collapse

They’d type up a really angry reply, but they need to fix their router config real quick

towerful@programming.dev on 21 Oct 09:04 collapse

Oh look, fediverse is still working.
You can share in the smug grin

Flames5123@sh.itjust.works on 21 Oct 05:03 collapse

As someone that works at another Amazon AWS dependent org, it also took us out. It was awful. Nothing I could do on my end. Why the fuck didn’t it get rolled back immediately? Why did it go to a second region? Fucking idiots on the big teams side.

I got paged 140 times between 12 and 4 am PDT. Then there was another one where I had to hand it off at 7am because I needed fucking sleep. And they handled it until 1pm. I love my team, but it’s so awful that this even was able to happen. All of our fuck-ups take 5-30 mins to roll back or manually intervene. This took them 2+ hours, and it was painful. Then it HAPPENED AGAIN! Like what the fuck.

return2ozma@lemmy.world on 21 Oct 06:31 next collapse

Oof.

douglasg14b@lemmy.world on 21 Oct 08:23 collapse

This is a good reason to start investing in multi region architecture at some point.

Not trying to be smug here or anything, but we updated a single config value, made a PR, and committed the change and we were switched over to a different region in a few minutes. Smooth sailing after that.

(This is still dependent to some degree on AWS in order to actually execute the failover, something we’re mulling over how to solve)
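
For anyone wondering what "a single config value" can look like, here is a rough Python sketch of the general pattern. It is not the poster's actual setup: the `Config` class, the `active_region` field, the endpoint URLs, and the region names are all hypothetical placeholders.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Config:
    # Hypothetical deployment config: one value picks the serving region.
    active_region: str
    endpoints: dict[str, str]


def resolve_endpoint(config: Config) -> str:
    """Return the endpoint for the active region, failing loudly on a typo."""
    try:
        return config.endpoints[config.active_region]
    except KeyError:
        raise ValueError(f"no endpoint configured for {config.active_region!r}") from None


# Example values only; both regions must already be deployed and warm.
ENDPOINTS = {
    "us-east-1": "https://api.us-east-1.example.internal",
    "us-west-2": "https://api.us-west-2.example.internal",
}

primary = Config(active_region="us-east-1", endpoints=ENDPOINTS)
# The "failover PR" in this sketch is a one-line change to active_region.
failed_over = replace(primary, active_region="us-west-2")
```

The code change is the easy part. The sketch assumes the second region is already running with replicated data, and that whatever deploys the config change does not itself depend on the region that is down.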

Now, our work demands we invest in such things, we’re even investing in multi-cloud (an actual nightmare). Not everyone can do this, and some systems are just not built to be able to, but if it’s within reach it’s probably worth it.

Flames5123@sh.itjust.works on 21 Oct 08:38 collapse

Last night from 12-4am, it was almost every region impacted so it didn’t help that much.

But we do have failovers for customers that they need to activate to just start working in another region.

But our canaries and infrastructure alarms cannot do that since they are for alerts in the region.