If you have the funds, please support the Internet Archive.
Just did yesterday after putting it off for some months
With the Internet Archive’s headquarters being in the US, what would happen if the administration went after them? Would people from other countries be able to keep the project going?
We need to distribute all that data, for the sake of history in the future. Data hoarding and torrenting are a service to humanity.
Not just data, but important scientific research too. The Republicans want to drive the US back to the stone age, because that’s when they were last relevant.
😇 in my mind, I included those datasets in data
But sure! I agree! We need free and open science, worldwide
People need to understand the art of scientific thinking
That’s unfortunately a very valid point. IIRC the big problem IA has is the sheer amount of disk space required to store everything.
I wish I had the necessary petabytes of storage to at least store an offline copy. I wonder how many disks that would be and how many redundant disks you’d need.
Here’s this from 2021. They say they had about 200PB of raw storage across some 20k spinning drives at the time of writing (with more being added constantly, about 25%/yr), with capacities mixed from 4TB to 16TB, across 750 servers housed in about 75 racks. I have 6x16TB WD Red Pros that ran me about $355/ea new with tax, and my bill was a smidge over $2100. Assuming you used only 16TB drives, you’d need about 12,500 of them (200PB / 16TB), which would run you about $4,437,500 without a bulk discount. How much of that is redundancy I’m not sure, and that’s just the HDDs, not the hardware to actually run everything between storage enclosures, OS, disks, memory, clustering, etc. They say a single copy on 16TB drives would be about 15 racks, but how that breaks down I’m not sure.
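If anyone wants to redo that math with other drive sizes or prices, here’s a rough Python sketch of the same back-of-the-envelope calculation. It assumes decimal units (1PB = 1000TB), takes the 200PB raw figure and my $355/drive price at face value, and ignores redundancy, enclosures, and servers entirely. The function names are just made up for illustration.

```python
import math

TB_PER_PB = 1000  # assumption: decimal units, as drive vendors use


def disks_needed(raw_pb: float, disk_tb: float) -> int:
    """Number of drives needed to hold raw_pb petabytes, rounded up."""
    return math.ceil(raw_pb * TB_PER_PB / disk_tb)


def drive_cost(n_disks: int, price_per_disk: float) -> float:
    """Cost of the drives alone, with no bulk discount."""
    return n_disks * price_per_disk


def projected_raw_pb(raw_pb: float, years: int, growth: float = 0.25) -> float:
    """Raw storage after compounding the ~25%/yr growth mentioned above."""
    return raw_pb * (1 + growth) ** years


if __name__ == "__main__":
    disks = disks_needed(200, 16)    # 200PB on 16TB drives
    total = drive_cost(disks, 355)   # $355 per drive, retail
    print(f"{disks:,} disks, ${total:,}")  # 12,500 disks, $4,437,500
```

Swap in `projected_raw_pb(200, n)` for the first argument if you want a guess at what it would take n years after that 2021 figure, assuming the growth rate holds.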
I once made this calculation for a database of 700TB, even that blew my mind 🤣
Depends on what the cops would do.