Mailing List Archive


Re: [tlug] Find symlinks, or what should be symlinks...

On 2014-04-22 17:11, Darren Cook wrote:
In setting up my new notebook, I copied over the big Projects directory
tree. My first try (using rsync, via a NAS disk) had problems
(everything had the 'x' permission set, and symlinks didn't get copied).

Perhaps both issues are due to a non-Linux filesystem...?

So, I set up sshd, and "scp -rp"-ed the directory tree. It took a while
longer than expected,

The primary reason is likely that `rsync` compressed the data for transmission (it does so when invoked with `-z`) while `scp` did not.

These are now both 2MB files, instead of one being a link to the other.

How would you fix this?

I could delete and start again, using rsync (with it set to keep
symlinks within the same disk).

...Would rsync, run against the existing tree, replace the above with a
symlink automatically? If so, not starting again and instead just
running rsync might be perfect?

`rsync` is probably the easiest solution. As Kalin pointed out, `rsync` may be able to fix things without having to transfer the files again, and the `--dry-run` option can help you tune the command before actually running it.

If the initial `rsync` issues were indeed due to a non-Linux filesystem, however, you will likely need to find a different way. (Why do you need to use `rsync` "via a NAS disk"?) Perhaps you can `rsync` directly from the source to the target machine, or perhaps you can switch to a different intermediate (Linux) filesystem.

If you have to start over, and you are unable to use `rsync`, then an easy solution would be to create a compressed archive of the source data and `scp` it to the target machine.
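One common pattern for that is to stream a tar archive, since tar preserves symlinks and permissions by default. A local demonstration in scratch directories (over the network you would insert ssh into the pipeline, e.g. `tar czf - -C /old Projects | ssh notebook 'tar xzf - -C /new'`; hostname and paths are placeholders):

```shell
# Scratch directories standing in for the old and new machines.
old=$(mktemp -d); new=$(mktemp -d)
mkdir "$old/Projects"
echo data > "$old/Projects/notes.txt"
ln -s notes.txt "$old/Projects/latest"   # symlink that must be preserved

# Stream the archive from one tree into the other; symlinks and
# permission bits pass through the tar stream unchanged.
tar czf - -C "$old" Projects | tar xzf - -C "$new"

ls -l "$new/Projects"   # "latest -> notes.txt" survives as a symlink
```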

Or I could run some clever bash script (??) to find all symlinks on the
old machine. Then I have a list of what I need to fix manually.

Note that you do not need a clever script to do this, as `find` suffices:

    $ find /path/to/source -type l -printf '%p\t%l\n'
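The listing does not have to stay a manual to-do list, either: if the tree layout is identical on both machines, the same output can drive the fix directly. A rough sketch, demonstrated in a scratch tree standing in for both machines (paths are placeholders):

```shell
# Build the link list "on the old machine".
tree=$(mktemp -d)
echo data > "$tree/real.txt"
ln -s real.txt "$tree/copy"               # the link on the old machine
find "$tree" -type l -printf '%p\t%l\n' > links.txt

# Simulate the botched copy: the symlink arrived as a regular file.
rm "$tree/copy"; echo data > "$tree/copy"

# Replay the list "on the new machine": -f replaces the duplicate
# file, -n avoids dereferencing if the link points at a directory.
while IFS=$'\t' read -r link target; do
    ln -sfn "$target" "$link"
done < links.txt
```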

Or I started wondering if there is a tool to hunt for duplicate files
and sub-directories in a directory tree? That might give me an optimum
list of what should be symlinked, and at least I'd then know the size of
the problem.

I would not go down this route unless necessary. If I did, I would likely use Python, but it is of course possible to approach it using standard Linux utilities:

    $ find /path/to/dest -type f -exec md5sum {} + \
      | sort | uniq -Dw 32 > dups.txt

Files with the same content will have the same checksum. Note that there will likely be files with the same content (such as empty files!) that should not become symlinks.
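The empty-file caveat, at least, is easy to handle up front by excluding zero-length files from the scan (one possible refinement, not from the original mail; demonstrated in a scratch directory):

```shell
# Scratch directory with genuine duplicates and empty decoys.
dest=$(mktemp -d)
echo data > "$dest/a"; echo data > "$dest/b"   # genuine duplicates
: > "$dest/empty1"; : > "$dest/empty2"         # empties to be ignored

# ! -empty skips zero-length files, which would otherwise all hash
# identically and flood the duplicate list.
find "$dest" -type f ! -empty -exec md5sum {} + \
  | sort | uniq -Dw 32 > dups.txt
```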

Good luck,

