commit 2c954112417c5101887d3a789cfd97f44d875390
parent 4590e214c6140e6d71896c8133120ebc2af287a1
Author: sin <sin@2f30.org>
Date: Wed, 21 Mar 2018 15:20:07 +0000
Add README
Diffstat:
2 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
@@ -3,7 +3,7 @@ PREFIX = /usr/local
SRC = dedup.c
OBJ = dedup.o
BIN = dedup
-DISTFILES = $(SRC) LICENSE Makefile arg.h tree.h
+DISTFILES = $(SRC) LICENSE Makefile README arg.h tree.h
CFLAGS = -g -Wall
CPPFLAGS = -I/usr/local/include
diff --git a/README b/README
@@ -0,0 +1,32 @@
+dedup is a simple data deduplication program. It is designed to be
+used in a pipeline with tar/gpg etc.
+
+dedup only handles a single file at a time, so using tar is advised.
+For example, to dedup a tar file you can invoke dedup as follows:
+
+ tar cf - ~/bak | dedup
+
+This will create .{cache,index,store} files in the current directory. The
+store file contains all the unique blocks. The index file contains
+all the revisions of files that have been deduplicated. Each revision
+is identified by its SHA256 hash. The cache file is only used to
+speed up block comparison.
+
+To list all known revisions run:
+
+ dedup -l
+
+You will get a list of hashes. Each hash corresponds to a single file
+(in this case, a tar archive).
+
+To extract a file from the deduplicated store run:
+
+ dedup -e <hash> > bak.tar
+
+You can mix dedup with other programs like gpg(1). For example, to
+perform a remote backup you can use the following command:
+
+ tar cf - ~/bak | gpg -c | ssh user@host dedup
+
+Cheers,
+sin
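
The store/index split described in the README can be illustrated with a small sketch. This is not dedup's implementation: the fixed block size, the in-memory dictionaries standing in for the store and index files, and the names BlockStore, add, list_revisions and extract are all assumptions made for illustration. It only shows the idea that unique blocks are kept once and that each revision is identified by the SHA256 hash of its input.

```python
import hashlib

# Hypothetical fixed block size; how dedup actually splits its input
# is not described in the README.
BLOCK_SIZE = 4096


class BlockStore:
    """In-memory stand-in for the store and index files (illustrative only)."""

    def __init__(self):
        self.blocks = {}     # block SHA256 hex -> block bytes (the "store")
        self.revisions = {}  # input SHA256 hex -> ordered block hashes (the "index")

    def add(self, data: bytes) -> str:
        """Deduplicate one input stream and return its revision hash."""
        refs = []
        for off in range(0, len(data), BLOCK_SIZE):
            block = data[off:off + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # Keep only the first copy of each distinct block.
            self.blocks.setdefault(digest, block)
            refs.append(digest)
        rev = hashlib.sha256(data).hexdigest()
        self.revisions[rev] = refs
        return rev

    def list_revisions(self):
        """Return known revision hashes, in insertion order (like `dedup -l`)."""
        return list(self.revisions)

    def extract(self, rev: str) -> bytes:
        """Rebuild the original input from its blocks (like `dedup -e <hash>`)."""
        return b"".join(self.blocks[d] for d in self.revisions[rev])
```

In this sketch, adding two inputs that share a leading block stores that block only once, while both inputs can still be extracted unchanged by their revision hashes.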