Wasabi’s pay-as-you-go pricing is summarized in the table below. This price does not include applicable taxes or optional services such as Wasabi’s Premium Support plan, Wasabi Direct Connect, and the Wasabi Ball Transfer Appliance.
$5.99 per TB/month (minimum)
($0.0059 per GB/month)
Ingress and egress data: free (as long as total ingress/egress data does not exceed your stored data volume)
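As a quick back-of-the-envelope example (my arithmetic, not Wasabi's published table): storing 5 TB for a month at this rate comes to 5 × $5.99 = $29.95, before taxes and any optional services.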
rsync.net is not built on a cloud platform - we own (and have built) all of our own platform. Our storage arrays, as we call them, are 2U “head units” with sixteen 2.5" drive slots and one or more SAS HBAs.
ZFS makes good use of fast cache storage, so after using up two drive slots for our boot mirror (which is always a mix of two totally different SSDs) we have room for up to 14 more SSDs for read (L2ARC) and write (SLOG) cache. ZFS is also RAM-hungry, so we choose a motherboard that can support up to 2TB of RAM.
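For readers unfamiliar with the ZFS terms, log (SLOG) and cache (L2ARC) devices are attached to a pool roughly as follows; the pool and device names are placeholders, not rsync.net's actual layout:

# Add a mirrored SLOG (synchronous write log) and an L2ARC read cache to a pool
zpool add tank log mirror /dev/ada2 /dev/ada3
zpool add tank cache /dev/ada4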
Attached to these head units, externally, are one or more JBOD chassis filled with hard drives. JBOD stands for “just a bunch of disks”; these are typically 4U chassis containing 45 or 60 drives that attach back to the head unit with SAS cables.
rsync.net has no firewalls and no routers. In each location we connect to our IP provider with a dumb, unmanaged switch.
This might seem odd, but consider: if an rsync.net storage array is a FreeBSD system running only OpenSSH, what would the firewall be? It would be another FreeBSD system with only port 22 open. That would introduce more failure modes, fragility and complexity without gaining any security.
- Restic is multi-threaded, borg is not. This translates to restic being extremely fast in comparison to borg, but borg having less impact on average on CPU usage while running. This limitation in borg is actually a direct consequence of the next point.
- Borg does actual deduplication, while restic only does classic incremental backups. With restic, you store a copy of every file, but the files are reference counted so that each version of a file only gets stored once. Borg, however, operates on blocks, not files, and deduplicates within individual backups. So if you have a dozen copies of the same data in your backup, restic stores each copy, but borg only stores the first and makes all the others references to that. The main benefit of this is that borg produces much smaller backups when you have lots of duplicate data and actually does more space efficient incremental backups (because it only stores what actually changed, not the whole changed file).
- Borg supports compression, while Restic seemingly does not (and doesn't handle sparse files very well either). This too has a huge impact on space efficiency, and may explain why restic is lightning fast on my systems when compared to borg. //
Austin, I have to correct you:
Restic does indeed do deduplication at the block level. It uses a rolling-hash algorithm (Rabin fingerprinting) as its chunker.
In short, a rolling-hash algorithm reacts to patterns within the file and cuts it at those points. If two files contain the same patterns, there is a high chance the cuts land at the same positions, which lets it deduplicate files whose data is not aligned to any fixed block size.
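You can see the effect of this chunking on an existing repository by comparing restic's size-accounting modes; the repository path below is just an example:

# Logical (restore) size of the data referenced by the repository's snapshots
restic -r /srv/restic-repo stats --mode restore-size

# Size of the unique, deduplicated blobs actually stored in the repository
restic -r /srv/restic-repo stats --mode raw-data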
Encrypted, Deduplicated, and Compressed Data Backups Using Your Own Cloud Storage
KopiaUI
Kopia comes with a user-friendly desktop app for Windows, macOS, and Linux that lets you create snapshots, define policies, and restore files quickly.
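The same functionality is available from the kopia command line; a minimal sketch using a hypothetical local repository path:

# Create a repository on a local (or mounted remote) filesystem path
kopia repository create filesystem --path /mnt/backup/kopia-repo

# Snapshot a directory, then list the snapshots stored so far
kopia snapshot create /home/user/documents
kopia snapshot list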
LTO-9 tapes hit the shelves with a supposed 45TB compressed capacity //
LTO-9 cartridges offer 50% more capacity than LTO-8 tapes, with 18TB native capacity per cartridge, supposedly rising to a whopping 45TB compressed (at a 2.5:1 ratio). Fujifilm says these gains were achieved using barium ferrite (BaFe) particles, which are carefully distributed across the surface of the tape, creating a smooth magnetic layer. //
The next generation tape is also faster than its predecessor, reaching transfer rates of up to 1,000MB/sec compressed (and 440MB/sec native), as compared with 750MB/sec (360MB/sec native) on offer with LTO-8.
New LTO-9 drives are fully backward compatible with LTO-8 cartridges, which should make data migration relatively simple for storage administrators. //
With LTO-9 tapes finally hitting the shelves, questions will also be asked about the long-term future of the hard disk drive, the largest of which have a capacity of 18TB. The archival market is dominated by high-capacity tape and the falling price of SSDs has applied pressure from the opposite direction, squeezing hard drives further into niche markets.
The constraints of physics suggest hard drive capacity cannot keep up with the evolution of tape. According to the LTO Program roadmap, LTO-10 tapes are set to offer an incredible 90TB capacity per cartridge, and tapes as large as 580TB have even been created in lab settings.
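Some back-of-the-envelope arithmetic based on the figures above:

18 TB native × 2.5 compression ratio ≈ 45 TB compressed per cartridge
18 TB ÷ 440 MB/sec ≈ 41,000 seconds, i.e. roughly 11.5 hours to fill one cartridge at the native rate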
A common system task is backing up files – that is, copying files with the ability to go back in time and restore them. For example, if someone erases or overwrites a file but needs the original version, then a backup allows you to go back to a previous version of the file and restore it. In a similar case, if someone is editing code and discovers they need to go back to a version of the program from four days earlier, a backup allows you to do so. The important thing to remember is that backups are all about copies of the data at a certain point in time.
In contrast to backing up is “replication.” A replica is simply a copy of the data as it existed when the replication took place. Replication by itself does not allow you to go back in time to retrieve an earlier version of a file. However, if you have a number of replicas of your data created over time, you can, in effect, go back and retrieve an earlier version of a file: you need to know when the replica was made, and then you can copy the file from that replica.
This is how many modern file system backup programs work. On day 1 you make an rsync copy of your entire file system:
backup@backup_server> DAY1=`date +%Y%m%d%H%M%S`
backup@backup_server> rsync -av -e ssh earl@192.168.1.20:/home/earl/ /var/backups/$DAY1/
On day 2 you make a hard link copy of the backup, then a fresh rsync:
backup@backup_server> DAY2=`date +%Y%m%d%H%M%S`
backup@backup_server> cp -al /var/backups/$DAY1 /var/backups/$DAY2
backup@backup_server> rsync -av -e ssh --delete earl@192.168.1.20:/home/earl/ /var/backups/$DAY2/
“cp -al” makes a hard link copy of the entire /home/earl/ directory structure from the previous day, then rsync runs against the copy of the tree. If a file remains unchanged then rsync does nothing — the file remains a hard link. However, if the file’s contents changed, then rsync will create a new copy of the file in the target directory. If a file was deleted from /home/earl then rsync deletes the hard link from that day’s copy.
In this way, the $DAY1 directory has a snapshot of the /home/earl tree as it existed on day 1, and the $DAY2 directory has a snapshot of the /home/earl tree as it existed on day 2, but only the files that changed take up additional disk space. If you need to find a file as it existed at some point in time you can look at that day’s tree. If you need to restore yesterday’s backup you can rsync the tree from yesterday, but you don’t have to store a copy of all of the data from each day, you only use additional disk space for files that changed or were added.
I use this technique to keep 90 daily backups of a 500GB file system on a 1TB drive.
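Put together, a daily job implementing this scheme might look something like the sketch below; the host, paths, and 90-day retention mirror the description above, and the head/xargs options assume GNU coreutils:

#!/bin/sh
# Hard-link snapshot backup of /home/earl, keeping the 90 most recent daily trees
TODAY=`date +%Y%m%d%H%M%S`
DEST=/var/backups
LAST=`ls -1d $DEST/2* 2>/dev/null | tail -n 1`

# Seed today's tree with hard links to the most recent backup, if there is one
[ -n "$LAST" ] && cp -al "$LAST" "$DEST/$TODAY"

# Bring today's tree up to date; unchanged files remain hard links
rsync -av -e ssh --delete earl@192.168.1.20:/home/earl/ "$DEST/$TODAY/"

# Prune everything but the 90 most recent daily trees
ls -1d $DEST/2* | head -n -90 | xargs -r rm -rf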
One caveat: The hard links do use up inodes. If you’re using a file system with a fixed number of inodes set at creation time, such as ext3 or ext4, you should allocate extra inodes on the backup volume when you create it. If you’re using a file system that allocates inodes dynamically, such as XFS, ZFS or Btrfs, then you don’t need to worry about this.
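A quick way to keep an eye on inode consumption (the mount point is just an example):

# Report inode usage rather than block usage for the backup volume
df -i /var/backups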
Hardware failure and a careless user feeling adventurous with powerful utilities such as dd and fdisk can lead to data loss in Linux. Not only that, sometimes spring cleaning a partition or directory can also lead to accidentally deleting some useful files. Should that happen, there’s no reason to despair. With the PhotoRec utility, you can easily recover a variety of files, be it documents, images, music, archives and so on.
Developed by CGSecurity and released under the GPL, PhotoRec is distributed as a companion utility of Testdisk, which can be used to recover and restore partitions. You can use either of these tools to recover files, but each has a job that it’s best suited for. Testdisk is best suited for recovering lost partitions. //
Although initially designed to only recover image files (hence the name), PhotoRec can be used to recover just about any manner of file.
Even better, PhotoRec works by ignoring the underlying filesystem on the specified partition, disk or USB drive. Instead, it focuses on the unique signatures left by the different file types to identify them. This is why PhotoRec can work with FAT, NTFS, ext3, ext4 and other partition types. //
The greatest drawback of PhotoRec – if any tool that can seemingly pull deleted files out of the digital ether can have a drawback – is that it doesn’t retain the original filenames. This means that recovered files all sport a gibberish alpha-numeric name. If this is a deal-breaker for you, consider using Testdisk first to recover your lost files.
To install TestDisk, open a terminal window and update the software repositories before installing the testdisk package.
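On a Debian or Ubuntu based system, for example, that amounts to the following; most other distributions package it under the same name in their own package managers:

sudo apt update
sudo apt install testdisk

# PhotoRec ships in the same package; both tools run as interactive console programs
sudo photorec
sudo testdisk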
YAHB - Yet Another Hardlink-based Backup-tool
YAHB is a deduplicating file copy tool, intended for backup use. Deduplication works at the file level using NTFS hard links.
- Backups by Google One is replacing your existing Android system backup method.
- This update is rolling out to many Android phones and will happen in the background.
- Backed-up items now include MMS, photos, videos, wallpapers, several system settings, and more.
- You do not need to be a Google One subscriber to use this service.
When it comes to having a backup plan, Navy SEALs go by the rule that “Two is one and one is none.” They’re not often one-upped, but in the world of computer backup, even two is none. The gold standard until recently has been the 3-2-1 rule—three copies of your data on two different media with one copy stored off-site. //
Cloud backups are sometimes tied to a company’s Active Directory, and they’re often not virtually isolated from a company’s production network. //
Even as emerging technology has changed the way backup strategies are implemented, the core principles of a 3-2-1 backup strategy still hold up:
- You should have multiple copies of your data.
- Copies should be geographically distanced.
- One or more copies should be readily accessible for quick recoveries in the event of a physical disaster or accidental deletion.
But these principles now need to account for an additional layer of protection: one or more copies should be physically or virtually isolated in the event of a digital disaster like ransomware that targets all of their data, including backups. //
What Is 3-2-1-1-0?
A 3-2-1-1-0 strategy stipulates that you:
- Maintain at least three copies of business data.
- Store data on at least two different types of storage media.
- Keep one copy of the backups in an off-site location.
- Keep one copy of the media offline or air gapped.
- Ensure all recoverability solutions have zero errors. //
What Is 4-3-2?
If your data is being managed by a disaster recovery expert like Continuity Centers, for example, your backups may be subscribing to the 4-3-2 rule:
- Four copies of your data.
- Data in three locations (on-prem with you, on-prem with an MSP like Continuity Centers, and stored with a cloud provider).
- Two locations for your data are off-site.
dolsh
David Murphy
1/06/21 10:08pm
“all of my personal data sits within my main C:\Users folder”
Rookie mistake. If you’re seriously going to reinstall Windows regularly, or as I’d prefer it, if you’re going to use Windows 10 regularly, do not use your boot drive as User storage. Most purchased PCs support this now (though not all laptops). I have another disagreement with the article, but I’ll get to that at the end.
Your boot drive should be windows and applications only. A separate physical disk is your User folder, Documents, Media, Games, etc.
This makes backup really easy... //
With the above, it only takes about a half hour to reinstall, configure, and kick off restoring applications and games. I used to do this quite often. Now, it’s really only when I have a significant enough hardware upgrade to warrant it.
So my main disagreement is that you need to reinstall Windows 10 at all. There was a time when applications embedded themselves in system startup, and even a technical Windows user would find their Windows installation slowing down over time. This just doesn’t happen with Windows 10. My current installation dates back to the Windows 7 to Windows 10 upgrade. The IT departments of my last two companies have experienced the same thing across thousands of desktops. The tools built into Windows 10 allow for managing application installs and determining what’s running much better than several years ago. I’ve found that when applications slow Windows down, I can remove them and performance returns. That wasn’t always possible. There was a time when you needed to know what sysinternals and hijack-this were to keep it all running well.
It seems like a stupid question, if you’re not an IT professional – and maybe even if you are – how much storage does it take to store 1TB of data? Unfortunately, it’s not a stupid question in the vein of “what weighs more, a pound of feathers or a pound of bricks”, and the answer isn’t “one terabyte” either. I’m going to try to break down all the various things that make the answer harder – and unhappier – in easy steps. Not everybody will need all of these things, so I’ll try to lay it out in a reasonably likely order from “affects everybody” to “only affects mission-critical business data with real RTO and RPO defined”. //
TL;DR: If you have 280GiB of existing data, you need 1TB of local capacity. //
8:1 rule of thumb
Based on the same calculations and with a healthy dose of rounding, we come up with another really handy, useful, memorable rule of thumb: when buying, you need eight times as much raw storage in production as the amount of data you have now.
So if you’ve got 1TiB of data, buy servers with 8TB of disks – whether it’s two 4TB disks in a single mirror, or four 2TB disks in two mirrors, or whatever, your rule of thumb is 8:1. Per system, so if you maintain hotspare and DR systems, you’ll need to do that twice more – but it’s still 8:1 in raw storage per machine.
Dell PowerEdge R740xd2 rack server
Actually 1 per rack across 20 racks. The Backblaze Vault architecture uses one drive from each of 20 servers to form a tome, which is the storage unit.
The xd2 model holds 26 3.5-inch drives. That's what we use.
A 3-2-1 strategy means having at least three total copies of your data, two of which are local but on different mediums (read: devices), and at least one copy off-site. //
There is no such thing as a perfect backup system, but the 3-2-1 approach is a great start for the majority of people and businesses. Even the United States government recommends this approach: in a 2012 publication for US-CERT (United States Computer Emergency Readiness Team) titled Data Backup Options, Carnegie Mellon recommended the 3-2-1 method.
Given the present non-existence of any perfect agreement on where applications should store their cached information, I propose a very simple convention that will at least allow such information to be identified effectively. Regardless of where the application decides to (or is configured to) place its cache directory, it should place within this directory a file named:
CACHEDIR.TAG
This file must be an ordinary file, not for example a symbolic link. Additionally, the first 43 octets of this file must consist of the following ASCII header string:
Signature: 8a477f597d28d172789f06886806bc55
Case is important in the header signature, there can be no whitespace or other characters in the file before the 'S', and there is exactly one space character (ASCII code 32 decimal) after the colon. The header string does not have to be followed by an LF or CR/LF in the file in order for the file to be recognized as a valid cache directory tag. The hex value in the signature happens to be the MD5 hash of the string ".IsCacheDirectory". This signature header is required to avoid the chance of an unrelated file named CACHEDIR.TAG being mistakenly interpreted as a cache directory tag by data management utilities, and (for example) causing valuable data not to be backed up.
The content of the remainder of the tag file is currently unspecified, except that it should be a plain text file in UTF-8 encoding, and any line beginning with a hash character ('#') should be treated as a comment and ignored by any software that reads the file.
We will henceforth refer to a file named as specified above, and having the required signature at the beginning of its content, as a cache directory tag.
For the benefit of anyone who happens to find and look at a cache directory tag directly, it is recommended that applications include in the file a comment referring back to this specification. For example:
Signature: 8a477f597d28d172789f06886806bc55
# This file is a cache directory tag created by (application name).
# For information about cache directory tags, see:
# http://www.brynosaurus.com/cachedir/
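As a concrete illustration, a conforming tag can be created like this; the application name and cache path are placeholders, not part of the specification:

# Mark a cache directory so CACHEDIR.TAG-aware backup tools will skip it
mkdir -p ~/.cache/myapp
printf 'Signature: 8a477f597d28d172789f06886806bc55\n# This file is a cache directory tag created by myapp.\n' > ~/.cache/myapp/CACHEDIR.TAG

# Per the specification above, the signature is the MD5 hash of ".IsCacheDirectory"
printf '.IsCacheDirectory' | md5sum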
Ransomware is becoming the number one threat to data, which makes it essential to ensure that bad actors don’t encrypt your backup data along with your primary data when they execute ransomware attacks. If they succeed at that, you will have no choice but to pay the ransom, and that will encourage them to try it again.
The key to not having to pay ransom is having the backups to restore systems that ransomware has encrypted. And the key to protecting those backups from ransomware is to put as many barriers as you can between production systems and backup systems. Whatever you do, make sure that the only copy of your backups is not simply sitting in a directory on a Windows server in the same data center you are trying to protect.
Sanoid is a policy-driven snapshot management tool for ZFS filesystems. When combined with the Linux KVM hypervisor, you can use it to make your systems functionally immortal.
[sanoid rollback demo: rolling back a full-scale cryptomalware infection in seconds, in real time]
More prosaically, you can use Sanoid to create, automatically thin, and monitor snapshots and pool health from a single eminently human-readable TOML config file at /etc/sanoid/sanoid.conf. (Sanoid also requires a "defaults" file located at /etc/sanoid/sanoid.defaults.conf, which is not user-editable.) //
Sanoid also includes a replication tool, syncoid, which facilitates the asynchronous incremental replication of ZFS filesystems.
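A minimal sketch of how these are typically run; the pool, dataset, and host names are placeholders, not taken from the Sanoid documentation:

# Take and prune snapshots according to the policies in /etc/sanoid/sanoid.conf
sanoid --cron

# Asynchronously replicate a dataset (and its snapshots) to a backup host
syncoid tank/vm/webserver root@backuphost:backup/vm/webserver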
The backup solution space is crowded. There are a multitude of applications, commands, and methodologies that are available. Choosing the right one is a daunting task that can definitely leave your head spinning. After having tried a few of the options out there, I have settled with Restic and here’s why.
First and foremost, Restic is completely open source. This is important, especially when you are choosing a solution that will handle your very own data, which oftentimes is personal and sensitive in nature. That’s the whole reason it is being backed up in the first place…because you deem it important. So, when it comes to backups, it is a requirement that the solution be non-proprietary, open source, and auditable. There should not be any concern of nefarious data harvesting or data siphoning to an overly interested third party. Your data should remain your data.
Deduplication
Encryption
Restic is compatible with Linux, BSD, Mac, and Windows. In addition to being able to create backups to local, SFTP, and REST servers, it is also capable of backing up to major cloud storage providers such as Backblaze B2, Wasabi, OpenStack Swift, Amazon S3, Google Cloud, Microsoft Azure, etc.
Simplicity and Sophistication
Restic provides you with a set of commands that can be used as building blocks for a variety of backup strategies, from simple to complex.
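A minimal sketch of those building blocks in action against a hypothetical SFTP target:

# Initialize an encrypted repository over SFTP, then back a directory up into it
restic -r sftp:user@backup.example.net:restic-repo init
restic -r sftp:user@backup.example.net:restic-repo backup /home/user/documents

# List the stored snapshots and restore the most recent one to a scratch directory
restic -r sftp:user@backup.example.net:restic-repo snapshots
restic -r sftp:user@backup.example.net:restic-repo restore latest --target /tmp/restore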
Snapshots
By creating a public/private SSH keypair, and uploading the public key to your rsync.net filesystem, you can allow your backup process to authenticate without your password.
Generating the SSH Keypair
First, log into your unix system as the user that your backups will run under. So, if your backups will run as the root user (which is very common) you need to log in as root.
Now run the following command:
ssh-keygen -t rsa -b 4096
Accept the defaults - do not change the filenames or file locations. It is very important that the resultant private and public keys reside in your home directory's .ssh directory, i.e. ~/.ssh (which is the default).
DO NOT enter a passphrase - just hit enter twice, leaving an empty passphrase.
Uploading Your Public Key
Upload your newly created public key using this command:
scp ~/.ssh/id_rsa.pub 123@tv-s009.rsync.net:.ssh/authorized_keys
DO NOT change the permissions on the uploaded file, before or after the upload
DO NOT change the permissions on your home directory, or your .ssh directory
NOTE: 123@tv-s009 is most certainly NOT your login ID or hostname - please change them.
Testing Your Passwordless Login
Test that your key works by ssh'ing to your rsync.net filesystem (from your local system, as the user who created/uploaded the key):
ssh 123@tv-s009.rsync.net ls
You should not be asked for a password
Multiple Keys (optional)
It is possible to upload multiple public keys to your rsync.net account, allowing one or more users on one or more computer systems to log in without a password. However, you cannot just follow the above instructions over and over again, because each time you follow them, you will overwrite the previous key.
Instead, do this:
- For the first user on the first computer system, follow the instructions above exactly.
- For each subsequent user (possibly on different computer systems), replace the 'scp' step in the above instructions with:
cat ~/.ssh/id_rsa.pub | ssh 123@tv-s009.rsync.net 'dd of=.ssh/authorized_keys oflag=append conv=notrunc'
- Repeat this process for each user until you have a fully populated authorized_keys file in your rsync.net account.
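As a quick sanity check (not part of the official instructions), you can pull the file back down with scp and count the lines; there should be one per uploaded key:

scp 123@tv-s009.rsync.net:.ssh/authorized_keys /tmp/authorized_keys.check
wc -l /tmp/authorized_keys.check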