Two Months With CrashPlan


It has been about two months since I started using CrashPlan to back up all of my data to the cloud. A few weeks ago I wrote the article – CrashPlan and Why The Cloud Makes Sense – to highlight my initial experiences and provide an overview of what CrashPlan has to offer. My initial backup set was 1.8TB and it took about 7 weeks to finish. During that time throughput was all over the map. Most of the time it was 1.5Mbps or less, and for several weeks it rarely got above 700-800Kbps. That is, until a couple of weeks ago when I changed the Data De-duplication setting from Automatic to Minimal…

Dedupe Config Change To The Rescue

A couple of weeks ago only 800GB or so of the total 1.8TB had been backed up. Throughput rarely exceeded 700-800Kbps and the estimated time remaining was measured in months. I did some research and came across this forum post by JC-Austin (located on page 2 – January 21, 2011 at 15:43)…or read JC-Austin (TechNazgul)’s blog post CrashPlan Online Backup: Maximizing Upload Speeds. I went ahead and changed the Data De-duplication setting (in Settings | Backup | Advanced settings) from Automatic to Minimal and the difference was incredible! Immediately the throughput shot up to 20Mbps (I have 25Mbps upstream with Verizon FiOS)! Over the next couple of weeks throughput rarely dropped below 10Mbps and the estimated time remaining went from months to days. This past weekend the initial 1.8TB backup was completed.

If you aren’t getting the throughput you think you should with CrashPlan, I definitely recommend changing the dedupe setting from Automatic to Minimal (to play it safe, restart the CrashPlan service after you make the change) and seeing if your results improve. It doesn’t hurt to give it a shot. There’s some speculation that the total size of your backup set may impact dedupe performance: the larger the backup set, the slower throughput seems to get. There’s some logic to this, as more data would need to be scanned for dedupe to do its job. But as I said, it’s speculation at this point.
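To put the change in perspective, here’s a quick back-of-the-envelope ETA calculation. This is just a rough sketch; the roughly-1TB-remaining figure and the before/after throughputs are approximations from my own experience, not anything CrashPlan’s estimator actually computes:

```python
# Rough upload ETA at a given sustained throughput.
def upload_eta_days(remaining_tb: float, throughput_mbps: float) -> float:
    remaining_bits = remaining_tb * 1e12 * 8        # TB -> bits (decimal units)
    seconds = remaining_bits / (throughput_mbps * 1e6)
    return seconds / 86400                          # seconds -> days

before = upload_eta_days(1.0, 0.75)   # ~750 Kbps before the dedupe change
after = upload_eta_days(1.0, 10.0)    # ~10 Mbps after the change
print(f"before: {before:.0f} days, after: {after:.1f} days")
# before: 123 days, after: 9.3 days
```

That’s exactly the “months to days” difference I saw in the client.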

Phase Two

After the initial 1.8TB was completed this past weekend I proceeded to add a few more folders to the backup, bringing the total to 2.4TB. These folders are located on network shares and I followed these directions to get that working with CrashPlan. As of this writing, all but 76GB of the total 2.4TB has been backed up to CrashPlan Central. The additional 600GB (minus the 76GB still remaining) took only about 4-5 days, and that includes many interruptions in connectivity to CrashPlan Central, which I will talk about in a moment.

But What About Restoring Files?

I will admit right now that I have not done exhaustive testing of file restorations with CrashPlan. What I have done is a single-file restore test, both logged into CrashPlan via web browser and via the CrashPlan software. In both cases it’s a piece of cake. When you restore via web browser you get a compressed ZIP file containing the files you chose to restore. When you restore via the CrashPlan software you get the original file(s) (no ZIP) and can restore either to the original location or to another location you specify. Both restoration methods are fairly intuitive. In both cases you can choose to restore the most recent version of the file(s) or a back-dated version. When restoring via the CrashPlan software you can also specify whether to overwrite or rename if files of the same name already exist in the destination.

Other Thoughts and Concerns

The past few days have been hit or miss with regard to connectivity to CrashPlan Central. I’m aware that Mozy recently eliminated unlimited backups, and it is possible a surge of customers joining CrashPlan is affecting connectivity (purely speculation on my part). Over the past few days there have been numerous multi-hour periods with no connectivity to CrashPlan Central, and therefore no backups (or restorations) during that time. The remaining 76GB would have been done a day or two ago had it not been for these constant interruptions. At this point it’s just a nuisance, as long as the trend does not continue beyond a few days. Aside from the past few days, my experience with CrashPlan since changing the dedupe setting to Minimal has been extremely positive.



Tags: backup, Carbonite, cloud, CrashPlan, de-duplication, dedupe, file, file sync, Mozy, online, storage

  • Jim

    Are there any downsides to changing the data de-duplication configuration to achieve an increase in upload speed? Thanks for helping.

    • d.k.sutton

      Not that I’m aware of. I should note that since Mozy’s recent mass migration of users, my performance has been back to what it was before the dedupe change. It might not correlate, but that’s about when it changed. I don’t see any reason not to give the setting a try to see if it makes a difference, but for me it no longer does.

  • Dan Sydnes

    According to CrashPlan, here are the settings:

    Automatic – Full de-duplication is used when backing up over an Internet connection; Minimal is used when backing up directly to disk or over a LAN.

    Full – It is 100% effective, but is CPU-intensive. It is a little slower, but saves bandwidth and disk at the destination.

    Minimal – About 90% effective, it uses several methods to identify duplicate data. It is less CPU-intensive and will speed up initial backup significantly, typically 400% on a single-processor system.

    De-duplication is a balance between upload bandwidth, CPU horsepower, and disk speed.

    My guess is that “Full” (or “Automatic” when streaming to CrashPlan’s cloud) works on a block-level (4KB or 8KB chunks), while “Minimal” works on a file-level.

    Hashing calculations are processor-intensive. So a 5MB music file would require a single calculation for the “Minimal” setting, but a whopping 640 calculations for the “Full” setting (assuming 8KB blocks).
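To make that arithmetic concrete, here’s a quick sketch. The 8KB block size and the file-level-vs-block-level split are the assumptions stated above, not documented CrashPlan behavior:

```python
import math

# Hash calculations needed per file under each assumed strategy:
# block-level hashes every fixed-size chunk; file-level hashes the file once.
def block_hashes(file_size_bytes: int, block_size_bytes: int = 8 * 1024) -> int:
    return math.ceil(file_size_bytes / block_size_bytes)

five_mb = 5 * 1024 * 1024
print(block_hashes(five_mb))  # 640 hash calculations at 8KB blocks ("Full")
print(1)                      # vs. a single whole-file hash ("Minimal")
```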

    De-duplication also builds a dictionary of “seen” hashes. Disk speed is critical because as the file or data block is read & its hash calculated, the dictionary has to be scanned for a potential match. This causes the disk to jump from data to dictionary and back again. Compared to file-level, block-level hashing geometrically increases the number of look-ups.

    So why do block-level hashing at all? Two use cases come to mind:

    1) Your data includes lots of similar–but not identical–files. For example, multiple revisions of an Excel spreadsheet or email messages with graphic signatures. Block-level de-duplication significantly reduces the data size for transmission & backup storage.

    2) Workstation horsepower outweighs upload bandwidth. For example, standard DSL or multiple clients sharing the same connection.
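The dictionary-of-seen-hashes idea can be sketched in a few lines. This is a minimal illustration of block-level de-duplication under the assumptions above (8KB blocks, a hash dictionary), not CrashPlan’s actual implementation:

```python
import hashlib

BLOCK_SIZE = 8 * 1024  # assumed 8KB chunks

def dedupe(data: bytes) -> tuple[int, int]:
    """Return (total_blocks, unique_blocks) for a byte stream."""
    seen = set()  # the dictionary of previously seen hashes
    total = 0
    for start in range(0, len(data), BLOCK_SIZE):
        block = data[start:start + BLOCK_SIZE]
        total += 1
        seen.add(hashlib.sha256(block).digest())  # only new hashes get stored
    return total, len(seen)

# Two near-identical "revisions" of a file: only the changed block is new.
rev1 = b"A" * BLOCK_SIZE * 4
rev2 = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
total, unique = dedupe(rev1 + rev2)
print(total, unique)  # 8 blocks scanned, only 2 unique blocks to upload
```

That’s use case 1 in miniature: eight blocks scanned, but only two would actually need to be transmitted and stored.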

  • I’m on the demo trial at the moment and the last few days have been pretty bad. CrashPlan have admitted they are having problems and said they are working on it. They suggested I could start again with another server, but that would mean trashing the last twelve days of uploads and starting from scratch. I’m just curious to know how you are getting on now and if this is normal?

    • I got through the pain of uploading the initial backup. That was over a year ago now. Since then I’ve rarely had a reason to interface with CrashPlan. I have it running on a VM with mapped shares to my ReadyNAS storage and CrashPlan does its thing with no fuss. I get the daily email from CrashPlan confirming things are still working, and that is usually where my interaction with CrashPlan ends. 🙂 Since I’m only adding 500MB here, 3GB there, performance hasn’t been much of a concern. I figured even if it takes a few hours it will eventually get uploaded. I’ll take a look at the logs later tonight and reply back with insights (if any) on how it’s been performing lately.

      • Thanks for the feedback. You have a much faster upload connection than I do and I’m waiting for a fibre option that should give 20mbps upload speeds to go live in this area but that might be a few months. 

        In the meanwhile I have a 1mbps upload speed so I’ve been forced to cut down the amount of data I select for the initial backup or it will literally take months. The option to send in a hard drive isn’t available because I’m in the UK. 

        I don’t expect it to work perfectly 100% of the time because that would be unrealistic, but there have been problems since day one, and it’s feeling remarkably similar to the experience I had with Mozy the last time I tried online backups a few years ago, so I wondered if I was just unlucky or if this is widespread.

        For the last few days the destination drive has been disconnected almost constantly and Crashplan tell me they have now upgraded the server, however they need to work on the I/O, which will take some time. I’m no server specialist but I thought that upgrading the server hardware would have already fixed the I/O problem. 

        The part I find confusing is the data I am receiving inside the main application. Last night it said the upload was about 12% complete and now it’s less than 2% done. A few hours ago it said I had uploaded 54 Gigs in total and now it’s just 45 Gigs. 

        Hopefully it will all be fixed and I want this to work but I just changed my settings to match your suggestions and I’m simply waiting for normal service to resume. 

        • There was a brief period of time towards the end of my initial upload where I had a problem connecting to their service. I think it spent the better part of a day or two disconnected. I haven’t seen that since, however. Although, that could happen now and I might not even notice.

          One thing that can be outside their control is the route you take to get to their services. Since your data is traveling a greater distance, that “might” mean more hops and more potential for a router along the way to be the culprit. I know a year ago when I was regularly hitting the Crashplan forums I saw people posting about routing issues with particular ISPs from particular locations.

          Huh, that is odd. I don’t think I’ve ever seen the app flip out like that.

          Good luck. Hope you get it resolved.

    • I just checked the CrashPlan history. Last week a 700MB file uploaded in 8 minutes. So that is almost 12Mbps throughput. My FiOS upstream speed is 25Mbps. So obviously it didn’t use all of it but with CrashPlan I’ve considered anything north of 10Mbps to be really good. Towards the end of my initial upload I sometimes only saw 1Mbps or less. If you can get past that initial upload (maybe take advantage of their drive seeding option) the speed becomes way less of a concern….as long as it works.
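For anyone who wants to check that math, here’s the conversion (assuming decimal megabytes; this is just the arithmetic from my comment, nothing CrashPlan-specific):

```python
# Sanity-check of the 700MB-in-8-minutes figure.
def throughput_mbps(megabytes: float, minutes: float) -> float:
    bits = megabytes * 1e6 * 8            # MB -> bits
    return bits / (minutes * 60) / 1e6    # bits/sec -> Mbps

print(f"{throughput_mbps(700, 8):.2f} Mbps")  # 11.67 Mbps, i.e. almost 12
```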

  • Fred

    Thanks a million! My upload speed went from ~2.5Mbps to > 50Mbps, sometimes as high as 250Mbps (I’m on a 300Mbps optical)!

  • livestrong2109

    I’ve currently got CrashPlan backing up several PCs to a little home office server with a crazy large mirrored storage pool. My friends and family love backing up to the thing as it costs them nothing, and I love the compression and the simplicity of setting everything up and letting them manage their own keys. All in all it’s an amazing addition to my server setup.