How to exclude files or directories when using the tar command to create tarballs on a Linux system

1. Purpose

In this post, I will demonstrate how to exclude files or directories when using the tar command to create tarballs on a Linux system.

For example, I have a directory named A as follows:

A
├── DEPENDENCIES
├── a.log
├── b.log
├── c.log
├── maven
│   ├── d.log
│   └── org.apache.httpcomponents
│       ├── e.log
│       └── httpclient
│           ├── f.log
│           ├── pom.properties
│           └── pom.xml
└── workdir
    ├── m1.cache
    ├── m2.cache
    └── m3
        └── m3.cache

5 directories, 12 files

Now I want to create a tarball of this directory recursively, but I don't want to include the *.log files or the workdir directory. In other words, I want tar to apply these two exclusions:

  • Exclude *.log files in the root directory and all subdirectories recursively
  • Exclude the workdir directory and its contents totally

So the contents of the tarball should look like this:

A
├── DEPENDENCIES
└── maven
    └── org.apache.httpcomponents
        └── httpclient
            ├── pom.properties
            └── pom.xml

3 directories, 3 files

How can we do this?

2. Environment

  • Linux or macOS

3. Solution

3.1 Exclude the log files in all directories recursively

Let's exclude the log files first. Run this tar command from A's parent directory:

tar --exclude="*.log" -zcvf test.tar.gz A

Explanation:

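  • Put the '--exclude' option before the directory to archive so that it applies to everything under that directory; placing it at the head of the command, as here, is the simplest way to ensure that
  • Quote the pattern, in the style --exclude='xxx', so that the shell does not expand the wildcard before tar sees it
  • The --exclude="*.log" option defines the pattern '*.log', so every file whose name ends in .log is skipped when creating the tarball
  • The -zcvf options mean: create (c) a gzip-compressed (z) archive, verbosely (v), in the file (f) named next, here test.tar.gz
  • The final 'A' is the directory to archive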
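To check the effect of these options without reading the verbose output, one option is to list the finished archive with tar -tzf. The sketch below is only an illustration: it builds a small throwaway tree rather than the full directory A, and the temporary directory and the variable names demo_dir, listing and log_count are assumptions of this example, not part of the original setup.

```shell
# Build a small throwaway tree that mimics part of directory A.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir -p A/maven
touch A/DEPENDENCIES A/a.log A/maven/d.log A/maven/pom.xml

# Same command as above, minus -v so the terminal stays quiet.
tar --exclude="*.log" -zcf test.tar.gz A

# -t lists the archive members instead of creating or extracting.
listing=$(tar -tzf test.tar.gz)
printf '%s\n' "$listing"

# Count the members whose names still end in .log.
log_count=$(printf '%s\n' "$listing" | grep -c '\.log$')
echo "log entries left in the archive: $log_count"
```

Listing the archive this way shows what was actually stored, which is a more direct check than scanning the verbose output of the create step.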

After execution, we get this result:

a A
a A/.DS_Store
a A/workdir
a A/maven
a A/DEPENDENCIES
a A/maven/org.apache.httpcomponents
a A/maven/org.apache.httpcomponents/httpclient
a A/maven/org.apache.httpcomponents/httpclient/pom.xml
a A/maven/org.apache.httpcomponents/httpclient/pom.properties
a A/workdir/m1.cache
a A/workdir/.DS_Store
a A/workdir/m2.cache
a A/workdir/m3
a A/workdir/m3/m3.cache

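As you can see, all log files are excluded from the result. But the workdir directory is still there, and it should be excluded too. (The .DS_Store entries are hidden metadata files created by the macOS Finder, which is why they do not appear in the tree above.)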

3.2 Exclude the workdir from the result

Now we want to exclude a directory from the result as well. How can we do that?

If you want to exclude more than one pattern, add one '--exclude' option per pattern to your command, as follows:

tar --exclude="*.log" --exclude="workdir" -zcvf test.tar.gz A

Here we add a second '--exclude' option to the previous command. Here is the result:

a A
a A/.DS_Store
a A/maven
a A/DEPENDENCIES
a A/maven/org.apache.httpcomponents
a A/maven/org.apache.httpcomponents/httpclient
a A/maven/org.apache.httpcomponents/httpclient/pom.xml
a A/maven/org.apache.httpcomponents/httpclient/pom.properties

It works!
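When the list of patterns grows, repeating '--exclude' by hand gets tedious. As a rough sketch, a small shell function could turn a list of patterns into one '--exclude' option each. The function name make_tarball, its argument order and the temporary directory are hypothetical choices for this example; they are not part of tar.

```shell
# make_tarball ARCHIVE DIR [PATTERN...]
# Hypothetical helper: creates a gzip-compressed ARCHIVE of DIR,
# passing one --exclude option to tar for each PATTERN.
make_tarball() {
  archive=$1
  dir=$2
  shift 2
  # Rewrite the remaining arguments in place: each PATTERN is appended
  # as --exclude=PATTERN and the original is shifted off the front.
  for pattern in "$@"; do
    set -- "$@" "--exclude=$pattern"
    shift
  done
  tar "$@" -zcf "$archive" "$dir"
}

# Recreate part of the example layout in a throwaway directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir -p A/maven/org.apache.httpcomponents/httpclient A/workdir/m3
touch A/DEPENDENCIES A/a.log A/maven/d.log \
      A/maven/org.apache.httpcomponents/httpclient/pom.xml \
      A/workdir/m1.cache A/workdir/m3/m3.cache

# Equivalent to: tar --exclude="*.log" --exclude="workdir" -zcf test.tar.gz A
make_tarball test.tar.gz A "*.log" "workdir"

contents=$(tar -tzf test.tar.gz)
printf '%s\n' "$contents"
```

The patterns stay quoted all the way through, so the shell never expands '*.log' against the current directory before tar receives it.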

4. How does it work?

To avoid operating on files whose names match a particular pattern, use the '--exclude' option.

  • '--exclude=pattern' Causes tar to ignore files that match the pattern.

    The '--exclude=pattern' option prevents any file or member whose name matches the shell wildcard (pattern) from being operated on. For example, to create an archive with all the contents of the directory 'src' except for files whose names end in '.o', use the command: tar -cf src.tar --exclude='*.o' src
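The 'src' example from the manual can be tried out in a scratch directory. The following is a minimal sketch, assuming a made-up layout (main.c, main.o, lib/util.c and lib/util.o are invented file names, and demo_dir and members are variable names chosen for this example). Note that here the '--exclude' option comes after '-cf src.tar' but still before the 'src' argument.

```shell
# Create a scratch 'src' directory holding sources and object files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir -p src/lib
touch src/main.c src/main.o src/lib/util.c src/lib/util.o

# The command quoted from the tar manual: an uncompressed archive
# of 'src' that skips every name ending in .o at any depth.
tar -cf src.tar --exclude='*.o' src

# List the members to see which files were stored.
members=$(tar -tf src.tar)
printf '%s\n' "$members"
```

The single quotes around '*.o' matter: without them, the shell could replace the pattern with matching file names from the current directory before tar runs.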

You can get more information about tar's exclusion options from this document.

5. Summary

In this post, I demonstrated how to exclude files or directories that match a pattern when creating a tarball with the tar command. You can do this by adding one or more '--exclude' options at the head of the tar command. For more information, you can refer to this document. Thanks for reading.