How to consolidate files and remove duplicates

Glenn Fleishman
30 March, 2015

This week we tackle some of your storage questions, including consolidating, de-duplication and management.

Too many drives

Reader Jon writes that he has a pile of partly full external USB drives formatted for use with Windows that he almost always mounts on a Mac, and expects there is a lot of duplication among the different drives he’s using.

I need some guidance on the best approach to consolidating all of the data down to one drive, best disk file system recommendations, parsing through the data for duplicates and deleting them, and then trying to somehow, magically, organise, filter or quickly search all of the content.

Let’s take these in order, and the recommendations will work for any assemblage of drives, Mac HFS+ or Windows formatted.

Consolidating: The easiest way to consolidate would be the tedious but necessary task of copying everything on to a single drive. As astounding as it sounds, you can purchase a 5TB (yes, that’s five whole terabytes) external USB 3.0-connected drive for a reasonable price right this moment if you need that much storage. We live in astonishing times.

To keep each drive’s content distinct, I recommend dragging the entire mounted drive volume onto the new disk. This will create a folder with the drive’s entire contents. You should name those folders descriptively, if they aren’t already, so you know to which drive they correspond. (Later, erase your old USB drives, and donate them to a charity if you no longer need them!)
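
For readers comfortable with a script, the same folder-per-drive copy can be sketched in Python. This is only an illustration of the idea, not a replacement for dragging in the Finder: the function name `consolidate` and the volume names in the comment are hypothetical, and the sketch does not skip unreadable files or resume an interrupted copy.

```python
import shutil
from pathlib import Path


def consolidate(sources, destination):
    """Copy each source volume into its own named folder under destination."""
    destination = Path(destination)
    copied = []
    for source in map(Path, sources):
        # The folder takes the volume's name, so each drive stays distinct.
        target = destination / source.name
        if target.exists():
            raise FileExistsError(f"{target} already exists; rename it first")
        shutil.copytree(source, target)
        copied.append(target)
    return copied


# Hypothetical usage; substitute the names of your own mounted volumes:
# consolidate(["/Volumes/Photos2012", "/Volumes/OldBackup"], "/Volumes/BigDrive")
```

Refusing to overwrite an existing folder is a deliberate choice here: two drives with the same name would otherwise be merged silently.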

File system: Given that this reader is dealing with Windows files and wrote that he already has Paragon Software Group’s NTFS for Mac, NTFS is likely the best format to standardise on. NTFS handles OS X files just fine, and you’ll still be able to mount the drive on a Windows system. A converse option is to standardise on OS X’s HFS+ and use Paragon’s HFS for Windows.

Removing duplicates: You have a variety of software to choose from that finds and removes duplicates. All of these tools analyse each file’s contents rather than relying on the file name or other information, so only byte-identical files are matched as duplicates. After finding copies, you can choose how, or whether, to remove the overlaps. Check out either Duplicate Detective (US$3) or Gemini (US$10). They have slightly different features.
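
To make "byte-identical" concrete, here is a minimal Python sketch of content-based matching. It is not how Duplicate Detective or Gemini are implemented (their internals aren't described here); it simply groups files by size, then by a SHA-256 digest of their contents, and treats matching digests as identical. The function names are my own, and a cautious tool might add a final byte-for-byte comparison before deleting anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return groups of paths under root whose contents have the same digest."""
    # Files of different sizes cannot be identical, so group by size first
    # and only hash the files that share a size with at least one other file.
    by_size = defaultdict(list)
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(group) for group in by_hash.values() if len(group) > 1)
    return sorted(groups)
```

The sketch only reports groups; it deletes nothing, which leaves the decision about what to remove with you.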

WhatSize (US$30) includes de-duplication as part of its toolkit, and allows ‘hard linking’, a Unix method of making multiple file-system links to one set of data. This reduces storage usage while still keeping an entry for the file at each place it existed in the original folder structure.
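
As a rough illustration of what hard linking does, the Python sketch below replaces one duplicate with a hard link to the other. It is not WhatSize's method, and the function name is hypothetical. Two caveats apply: hard links only work between files on the same volume, on a file system and driver that support them, and because both names then point at the same data, editing the file through one name changes it under the other.

```python
import filecmp
import os


def replace_with_hard_link(original, duplicate):
    """Replace duplicate with a hard link to original; both paths stay visible."""
    if os.path.samefile(original, duplicate):
        return  # already the same underlying file, nothing to do
    # Compare the full contents, not just size and timestamps.
    if not filecmp.cmp(original, duplicate, shallow=False):
        raise ValueError("files differ; refusing to link them")
    temporary = duplicate + ".linktmp"
    os.link(original, temporary)      # a second name for the original's data
    os.replace(temporary, duplicate)  # swap the link in where the duplicate was
```

Creating the link under a temporary name first means the duplicate is only replaced once the link exists, so a failure partway through does not lose the file.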

Organise, filter and search: Organising is an idiosyncratic task, as you have to have a goal as to how you’re sorting files. I’ve largely given up sorting files into folders, relying instead on Spotlight.

Deleting downloads

Sally Greer asks:

Can I delete my downloads? They are getting massive, and since I was a PC user before, I don’t know the rules of the Downloads folder.

This may seem like a simple question to many readers, but I find folks who are new computer users or new to Mac get tripped up on the things that old Mac hands already know.

Every OS X user’s home folder contains a Downloads folder, in which Safari and other browsers preferentially drop files you download. I recommend pruning this folder regularly, as it can become massive over time as successive versions of a piece of software download into it.

If you have a large enough drive, you might not care about deletion. But it does impose a burden on your backups. Even though an inert downloaded file only needs to be stored once with any software or service that creates a base set of files and then incremental differences later – including Time Machine – it still occupies space. For online backups, it’s more data that you’re uploading and storing remotely.

I’d argue that if you don’t need a file after installing software, delete it, and empty the trash. If you do, copy it elsewhere so that you know it’s available. Most software you purchase can be downloaded fresh, or a full copy is downloaded with each update. I recently found several different downloaded versions of Adobe Lightroom, each a minor update but also each comprising hundreds of megabytes!

To check if you’ve left giant downloads scattered about that you no longer need, the folks at the Omni Group offer a free utility called OmniDiskSweeper. It’s not a de-duplicator, like the software above. Rather, it can examine a drive and list files from the largest to smallest, making it easy to see giant lumps you no longer need.
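
The largest-first listing is simple enough to sketch in Python for anyone who prefers a script. This is only an approximation of the idea, not OmniDiskSweeper's implementation, and the function name and default limit are my own choices.

```python
import os


def largest_files(root, limit=20):
    """Return up to limit (size_in_bytes, path) pairs under root, largest first."""
    sizes = []
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                # lstat reports a symbolic link's own size rather than following it.
                sizes.append((os.lstat(path).st_size, path))
            except OSError:
                continue  # skip files that vanish or cannot be read
    sizes.sort(reverse=True)
    return sizes[:limit]


# Hypothetical usage against your own Downloads folder:
# for size, path in largest_files(os.path.expanduser("~/Downloads")):
#     print(f"{size / 1_000_000:10.1f} MB  {path}")
```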
