public inbox for cygwin@cygwin.com
* workflow idiom to compare zip/tgz with folder subtree
@ 2015-09-22 11:50 Paul
  2015-09-22 13:20 ` Andrey Repin
  0 siblings, 1 reply; 14+ messages in thread
From: Paul @ 2015-09-22 11:50 UTC (permalink / raw)
  To: cygwin

I currently take snapshots of selected portions of a folder subtree using
zip files.  Sometimes, I use command-line zip, but other times I'll use the
Windows Compressed Zip folder.  I find myself frequently unzipping the
snapshots into temp folders just so that I can use the unix diff utility
(via cygwin) to see what the differences are with the live folder tree.  Is
there a way to take regular snapshots that I can compare with the live
folder subtree without having to unpack it?  I'm not married to zip. 
However, tar doesn't quite do it.  My vague recollection is that it will
check that the files in a tgz snapshot match what is in the live folder
subtree, but it won't report what is in the live folder subtree which isn't
in the tgz snapshot.
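
For reference, the two-way check described above can be sketched roughly as
follows. This is only a sketch under stated assumptions: it assumes GNU tar
(for the --compare/-d mode), an archive created with
"tar -czf snap.tgz -C TREE .", and a made-up function name, compare_snapshot.
The first step is tar's own comparison; the second step covers the files
that tar does not report, by comparing file listings.

```shell
# Sketch: compare a .tgz snapshot with the live tree without unpacking it.
# Assumes GNU tar and an archive created with: tar -czf snap.tgz -C TREE .
# The function name compare_snapshot is illustrative, not a standard tool.
compare_snapshot() {
    archive=$1    # path to the .tgz snapshot
    tree=$2       # live directory the snapshot was taken from

    # Direction 1: archive members that are missing or different on disk.
    # --compare (-d) reads the archive in place; nothing is extracted.
    # tar exits non-zero when it finds differences, so ignore the status.
    tar -dzf "$archive" -C "$tree" 2>&1 || true

    # Direction 2: live files absent from the snapshot, which tar does
    # not report. Compare sorted file listings instead.
    tmp=$(mktemp -d) || return 1
    tar -tzf "$archive" | grep -v '/$' | LC_ALL=C sort > "$tmp/snap"
    (cd "$tree" && find . -type f | LC_ALL=C sort) > "$tmp/live"
    LC_ALL=C comm -13 "$tmp/snap" "$tmp/live" | sed 's/^/not in snapshot: /'
    rm -rf "$tmp"
}
```

Usage would be something like "compare_snapshot snap.tgz ~/work".  It only
lists which files differ; seeing the differences themselves still means
pulling the old copy out of the archive.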

I've also posted this to:
http://answers.microsoft.com/en-us/windows/forum/windows_7-files/workflow-idiom-to-compare-ziptgz-with-folder/631c39ae-609d-402f-88ba-2681a836f11e?tm=1442922342392


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: workflow idiom to compare zip/tgz with folder subtree
  2015-09-22 11:50 workflow idiom to compare zip/tgz with folder subtree Paul
@ 2015-09-22 13:20 ` Andrey Repin
  2015-09-23  1:07   ` Paul
  0 siblings, 1 reply; 14+ messages in thread
From: Andrey Repin @ 2015-09-22 13:20 UTC (permalink / raw)
  To: Paul, cygwin

Greetings, Paul!

> I currently take snapshots of selected portions of a folder subtree using
> zip files.  Sometimes, I use command-line zip, but other times I'll use the
> Windows Compressed Zip folder.  I find myself frequently unzipping the
> snapshots into temp folders just so that I can use the unix diff utility
> (via cygwin) to see what the differences are with the live folder tree.  Is
> there a way to take regular snapshots that I can compare with the live
> folder subtree without having to unpack it?  I'm not married to zip. 
> However, tar doesn't quite do it.  My vague recollection is that it will
> check that the files in a tgz snapshot match what is in the live folder
> subtree, but it won't report what is in the live folder subtree which isn't
> in the tgz snapshot.

Git, Subversion... basically any sane VCS out there.


-- 
With best regards,
Andrey Repin
Tuesday, September 22, 2015 16:10:42

Sorry for my terrible english...



* Re: workflow idiom to compare zip/tgz with folder subtree
  2015-09-22 13:20 ` Andrey Repin
@ 2015-09-23  1:07   ` Paul
  2015-09-23  1:13     ` Eliot Moss
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Paul @ 2015-09-23  1:07 UTC (permalink / raw)
  To: cygwin

Andrey Repin <anrdaemon <at> yandex.ru> writes:
> Git, Subversion... basically any sane VCS out there.

Ah, yes....I've managed to avoid version control all these years
because I wanted the convenience of bash file management and changing
things on a whim as I see fit.  And for lack of time to learn yet
another system.  I see the lesson here.  The right tool for the right
job, and there's no getting around the learning curve and the
sacrifice in flexibility.  OK.  I have it on my radar.  I will have to
find time to try it out.  Thank you!



* Re: workflow idiom to compare zip/tgz with folder subtree
  2015-09-23  1:07   ` Paul
@ 2015-09-23  1:13     ` Eliot Moss
  2015-09-23  9:20       ` Andrey Repin
  2015-09-23  9:20     ` Andrey Repin
  2015-09-24 14:32     ` Warren Young
  2 siblings, 1 reply; 14+ messages in thread
From: Eliot Moss @ 2015-09-23  1:13 UTC (permalink / raw)
  To: cygwin

On 9/22/2015 9:06 PM, Paul wrote:
> Andrey Repin <anrdaemon <at> yandex.ru> writes:
>> Git, Subversion... basically any sane VCS out there.
>
> Ah, yes....I've managed to avoid version control all these years
> because I wanted the convenience of bash file management and changing
> things on a whim as I see fit.  And for lack of time to learn yet
> another system.  I see the lesson here.  The right tool for the right
> job, and there's no getting around the learning curve and the
> sacrifice in flexibility.  OK.  I have it on my radar.  I will have to
> find time to try it out.  Thank you!

There are also various backup tools based on rsync and compression.
One of these is called duplicity, and it supports encryption as well.
But I suspect there are a number of these and that you can find one
that matches your task ...

Eliot Moss


* Re: workflow idiom to compare zip/tgz with folder subtree
  2015-09-23  1:13     ` Eliot Moss
@ 2015-09-23  9:20       ` Andrey Repin
  0 siblings, 0 replies; 14+ messages in thread
From: Andrey Repin @ 2015-09-23  9:20 UTC (permalink / raw)
  To: Eliot Moss, cygwin

Greetings, Eliot Moss!

>>> Git, Subversion... basically any sane VCS out there.
>>
>> Ah, yes....I've managed to avoid version control all these years
>> because I wanted the convenience of bash file management and changing
>> things on a whim as I see fit.  And for lack of time to learn yet
>> another system.  I see the lesson here.  The right tool for the right
>> job, and there's no getting around the learning curve and the
>> sacrifice in flexibility.  OK.  I have it on my radar.  I will have to
>> find time to try it out.  Thank you!

> There are also various backup tools based on rsync and compression.
> One of these is called duplicity, and it supports encryption as well.
> But I suspect there are a number of these and that you can find one
> that matches your task ...

It seems he needs comparison rather than backup.
I don't know of any backup tools that offer a differential view against backup
content. Not that I know many backup tools, though...


-- 
With best regards,
Andrey Repin
Wednesday, September 23, 2015 12:13:01

Sorry for my terrible english...



* Re: workflow idiom to compare zip/tgz with folder subtree
  2015-09-23  1:07   ` Paul
  2015-09-23  1:13     ` Eliot Moss
@ 2015-09-23  9:20     ` Andrey Repin
  2015-09-24 14:32     ` Warren Young
  2 siblings, 0 replies; 14+ messages in thread
From: Andrey Repin @ 2015-09-23  9:20 UTC (permalink / raw)
  To: Paul, cygwin

Greetings, Paul!

> Andrey Repin <anrdaemon <at> yandex.ru> writes:
>> Git, Subversion... basically any sane VCS out there.

> Ah, yes....I've managed to avoid version control

/facepalm.
'nuff said.

> all these years
> because I wanted the convenience of bash file management and changing
> things on a whim as I see fit.  And for lack of time to learn yet
> another system.  I see the lesson here.  The right tool for the right
> job, and there's no getting around the learning curve and the
> sacrifice in flexibility.  OK.  I have it on my radar.  I will have to
> find time to try it out.  Thank you!

A VCS can be as flexible as you make it, from local file-based repositories to
distributed networked storage with master/slave failover.
Or as rigid as you need it, e.g. with commit hooks.

If all you need is a local VCS, check Subversion. http://svnbook.org/
The "SVN book" will get you started in no time, and it covers the essential basics
that will remain valuable even if you later need to switch to some other VCS.


-- 
With best regards,
Andrey Repin
Wednesday, September 23, 2015 11:58:33

Sorry for my terrible english...



* Re: workflow idiom to compare zip/tgz with folder subtree
  2015-09-23  1:07   ` Paul
  2015-09-23  1:13     ` Eliot Moss
  2015-09-23  9:20     ` Andrey Repin
@ 2015-09-24 14:32     ` Warren Young
  2015-09-25  0:51       ` Paul
  2 siblings, 1 reply; 14+ messages in thread
From: Warren Young @ 2015-09-24 14:32 UTC (permalink / raw)
  To: The Cygwin Mailing List

On Sep 22, 2015, at 7:06 PM, Paul wrote:
> 
> Andrey Repin writes:
>> Git, Subversion... basically any sane VCS out there.
> 
> I've managed to avoid version control all these years
> because I wanted the convenience of bash file management and changing
> things on a whim as I see fit.

The only file management tasks that VCSes force you to do through the VCS are file moves/renames and deletions, and that’s only because a VCS can’t work out how to manage the files you told it to manage if there isn’t a file where you told it to expect one.  All changes to the file *content* can be, and normally *are*, made outside the VCS.

Normally you check your changes into the VCS shortly after you make them and are happy with the changes, but it’s quite possible to put off check-ins for weeks or months.  I don’t do that at work on source code repositories, but I have one repo at home that backs up changes to things like ~/bin which sometimes lags way behind “current” like that.

That’s where the VCS’s diff command comes in handy.  It answers the question, “What did I change in this file 4 weeks ago?”

If you were using zip as the archiving format because you want a single file you can move between systems, I recommend that you look at Fossil, rather than the more popular VCSes:

   http://fossil-scm.org/

Fossil’s repository is a single well-structured, compressed file, which makes it easy to back up, move to other machines, etc.  (It’s a SQLite database file, if that means anything to you.)

If you actually need ZIP files (or tarballs) for some reason, there is a Fossil command to get a particular point in history as an archive.

One of those points in history is called “tip”, meaning the state of the whole repository as of the most recent checkin, which means it’s a single command to get a zip file of all files at the tip of the Fossil repository.
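
A minimal sketch of that single command, wrapped in a helper so that the
output file's extension picks the format.  The helper name export_tip and the
file names are hypothetical; "fossil zip" and "fossil tarball" are the Fossil
subcommands doing the actual work, and -R names the repository file so the
command can be run from outside a checkout.

```shell
# Sketch: export the most recent check-in ("tip") as a single archive.
# export_tip and the file names are hypothetical; "fossil zip" and
# "fossil tarball" are the Fossil subcommands doing the work.
export_tip() {
    repo=$1   # path to the Fossil repository file
    out=$2    # archive to write; the extension picks the format

    case $out in
        *.zip) fossil zip tip "$out" -R "$repo" ;;
        *)     fossil tarball tip "$out" -R "$repo" ;;
    esac
}
```

For example, "export_tip ~/repos/project.fossil /tmp/project-tip.zip" would
write a ZIP of the tip, assuming a repository exists at that path.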

Subversion is a bit simpler to use than Fossil, but its default storage format is a big pile o’ files, which means you pretty much need to do repository management through the svnadmin tool.

Git is even worse than svn in that the pile o’ files is in the same tree as the working file set, instead of a separate tree.

Git is also more complicated than Fossil:

   http://fossil-scm.org/index.html/doc/trunk/www/fossil-v-git.wiki

> And for lack of time to learn yet another system.

You should be able to get started with any sane VCS in maybe half an hour.  Learning all the ins-and-outs will take time, but there’s power in mastering the details.

In terms of complexity, Subversion < Fossil < Git.

The only reason for someone with simple needs to go with Git is that you need the interoperability it provides, since it’s becoming the lingua franca of the developer world.  There’s something to be said for going with the standard, even if it’s a PITA in some ways.  

But I’m not telling a Windows user something they don’t already know with that, am I? :)

* Re: workflow idiom to compare zip/tgz with folder subtree
  2015-09-24 14:32     ` Warren Young
@ 2015-09-25  0:51       ` Paul
  2015-09-25  1:50         ` Andrey Repin
  2015-09-25 16:37         ` Warren Young
  0 siblings, 2 replies; 14+ messages in thread
From: Paul @ 2015-09-25  0:51 UTC (permalink / raw)
  To: cygwin

Eliot Moss wrote:
> There are also various backup tools based on rsync and compression.
> One of these is called duplicity, and it supports encryption as
> well.  But I suspect there are a number of these and that you can
> find one that matches your task ...

Andrey Repin wrote:
> It seems he needs comparison rather than backup.  I don't know of any
> backup tools that offer a differential view against backup content.
> Not that I know many backup tools, though...

Warren Young suggested: fossil

Thank you all.  I've perused and pondered.  There is a key constraint
that I neglected to mention.  I am shuttling incremental work back and
forth between two locations using disc.  At one of the sites, the only
possible tools are M$ Office and a snapshot of Cygwin.  The full copy
of the working hierarchy exists at the two sites (almost identical).
The more restrictive site is the authoritative home of the historical
snapshots, though I may have mini-snapshots at the alternative site.
The comparison of the working file hierarchy with snapshots lets me
vet what needs to be shuttled back and forth; the majority of the
differences will not be relevant as the hierarchy exists at both sites.
I use the same archival scheme for local snapshots and for shuttling
work between sites, though the content is not the same (I won't take
an entire local snapshot with me on disc most of the time).

Most of the files are not software, though parallels can be drawn:
Long SQL scripts, Matlab scripts, images, data files, VBA, Matlab
files, text files, LaTeX files, image files, and M$ Office files
(Access, Excel, Word, Powerpoint, PST).  This is not a development
environment, it is an analysis environment (with code hackery to that
end).  However, the evolution of files and version control
requirements probably overlap (I can only guess as I've never worked
in a regulated code development environment, relying instead on my own
ad hoc snapshots & incrementals).  One difference from the days when I
wrote "real" (compiled) code is that I'm not just archiving source
code; some of the files are images, databases, etc., and take up a lot
of space.  I end up creating incrementals a lot more, or simply
leaving the big files out of the snapshot routine (relying on very old
snapshots).  My analysis strategy is strongly influenced by this; I
try to avoid computational approaches that rely on intermediately
generated data that need to be archived.  As much as possible,
everything should be quickly generatable from raw client input data
files.  Been able to get away with that so far, with a great deal of
effort.

I rely a lot on bash hackery, even though I'm no graybeard.  "find",
"diff -qr", and "xargs" are indispensable, and using vim window
splitting, it is very efficient to browse the diff output and warp to
discrepant text files, and even delve into zip files to open their
content, and then use vimdiff to cruise the discrepancies.  The
synergy between vim & bash is (to me) like magic, scripting up copies
and such and piping them to bash.  For the most part, however, you
need to unpack the snapshot (or rebuild it from incrementals).  Andrey
is right, the main thing causing me to put the question out there is
the desire to avoid this.

I noticed that fossil & cvs are part of cygwin.  I will have to bite
the bullet & try a few baby steps at some point.



* Re: workflow idiom to compare zip/tgz with folder subtree
  2015-09-25  0:51       ` Paul
@ 2015-09-25  1:50         ` Andrey Repin
  2015-09-25 11:33           ` Paul
  2015-09-25 15:19           ` Warren Young
  2015-09-25 16:37         ` Warren Young
  1 sibling, 2 replies; 14+ messages in thread
From: Andrey Repin @ 2015-09-25  1:50 UTC (permalink / raw)
  To: Paul, cygwin

Greetings, Paul!

> I noticed that fossil & cvs are part of cygwin.  I will have to bite
> the bullet & try a few baby steps at some point.

If anything, I would NOT recommend CVS to anyone making their first steps into
VCS world.
Subversion is way more consistent, better thought out and has about the same
usability characteristics where they are comparable. (And don't forget the
marvelous svnbook.org ...)
The only reason I was using CVS up until a month ago for some of my projects
is because I was lazy and did not convert them to Subversion ten years ago.


-- 
With best regards,
Andrey Repin
Friday, September 25, 2015 04:36:25

Sorry for my terrible english...



* Re: workflow idiom to compare zip/tgz with folder subtree
  2015-09-25  1:50         ` Andrey Repin
@ 2015-09-25 11:33           ` Paul
  2015-09-25 12:05             ` Andrey Repin
  2015-09-25 15:19           ` Warren Young
  1 sibling, 1 reply; 14+ messages in thread
From: Paul @ 2015-09-25 11:33 UTC (permalink / raw)
  To: cygwin

Andrey Repin <anrdaemon <at> yandex.ru> writes:
> If anything, I would NOT recommend CVS to anyone making their first
> steps into VCS world.  Subversion is way more consistent, better
> thought out and has about the same usability characteristics where
> they are comparable. (And don't forget the marvelous svnbook.org
> ...) The only reason I was using CVS up until a month ago for some
> of my projects is because I was lazy and did not convert them to
> Subversion ten years ago.

Ah, OK.  Well thanks for the heads up.  Time is so hard to come by
these days that knowing about any non-starters is very helpful.  Thank
you.

BTW, is your sig obsolete? What's this about terrible english?



* Re: workflow idiom to compare zip/tgz with folder subtree
  2015-09-25 11:33           ` Paul
@ 2015-09-25 12:05             ` Andrey Repin
  0 siblings, 0 replies; 14+ messages in thread
From: Andrey Repin @ 2015-09-25 12:05 UTC (permalink / raw)
  To: Paul, cygwin

Greetings, Paul!

> BTW, is your sig obsolete? What's this about terrible english?

One of the ancient teachers (I forgot which one) once walked along the beach with
his students, discussing their lessons and whatever life brought up.
And one of his students exclaimed:
- You know so much, is there something left in this world that you don't know?
The teacher turned to him and said:
- Let me offer you an example.
He then drew some circles in the beach's sand.
- Let's pretend that the sand on this beach is everything in the world that
could be understood. Then this circle symbolizes everything that I know. And
this one symbolizes what you know. Where they intersect, we both have shared
knowledge. But you surely know something that I do not, just as I know things you
have not experienced or heard of yet.
- Yes, that's understandable, - the student replied.
- Now, look at the borders of these circles. If so much of the unknown borders
your knowledge, then how do I feel looking around mine?


-- 
With best regards,
Andrey Repin
Friday, September 25, 2015 14:54:56

Sorry for my terrible english...



* Re: workflow idiom to compare zip/tgz with folder subtree
  2015-09-25  1:50         ` Andrey Repin
  2015-09-25 11:33           ` Paul
@ 2015-09-25 15:19           ` Warren Young
  1 sibling, 0 replies; 14+ messages in thread
From: Warren Young @ 2015-09-25 15:19 UTC (permalink / raw)
  To: cygwin

On Sep 24, 2015, at 7:39 PM, Andrey Repin <anrdaemon@yandex.ru> wrote:
> 
>> I noticed that fossil & cvs are part of cygwin.  I will have to bite
>> the bullet & try a few baby steps at some point.
> 
> I would NOT recommend CVS to anyone making their first steps into
> VCS world.

No new repos should be created in CVS, for any reason.

The only reason the tool is still being maintained is to serve old repos that have not converted for one reason or another.

> Subversion is way more consistent, better thought out and has about the same
> usability characteristics where they are comparable.

Yes.

There is no case where CVS has any material advantage over Subversion, with the exception of installation and build simplicity, and that’s irrelevant in 2015 when every OS (or OS-like, in the case of Cygwin) distro has easy ways to get pre-packaged Subversion.

And if build and distribution simplicity matters, Fossil beats even CVS:

    $ cygcheck -l fossil | wc -l
    4
    $ cygcheck -l cvs | wc -l
    38

While writing this message, I tried looking up the CVS home page, whose name I forgot since leaving it for Subversion a dozen years ago.  It wasn’t even on the first page of Google results, even though Google knows full well I’m a software developer, based on past search history.

And lest you think it was a problem of insufficient Google juice to the old CVS home page, the Wikipedia page for CVS-the-VCS (as opposed to CVS-the-pharmacy) wasn’t on the first page of results, either.

That should tell you something.

Then when you finally arrive at the page, the link to the documentation is a broken link into the Wayback Machine, because all the sites that used to host the docs have disappeared due to lack of interest.

(And yes, I’m aware that Fossil isn’t on the first page of Google results, either, being pushed off by Fossil the fashion wear company.  But then, Fossil today doesn’t have the popularity that CVS once had, so there’s no reason to expect it to be there yet, with such a generic name.)

> The only reason I was using CVS up until a month ago for some of my projects
> is because I was lazy and did not convert them to Subversion ten years ago.

I don’t know about “lazy”.  Some conversions are just plain difficult.  Cygwin is one, as are FreeBSD and several others I could dig up who converted many years past the peak popularity of Subversion.


* Re: workflow idiom to compare zip/tgz with folder subtree
  2015-09-25  0:51       ` Paul
  2015-09-25  1:50         ` Andrey Repin
@ 2015-09-25 16:37         ` Warren Young
  2015-10-05  0:39           ` Paul
  1 sibling, 1 reply; 14+ messages in thread
From: Warren Young @ 2015-09-25 16:37 UTC (permalink / raw)
  To: The Cygwin Mailing List

On Sep 24, 2015, at 6:50 PM, Paul wrote:
> 
> I am shuttling incremental work back and
> forth between two locations using disc.

In that case, you want a distributed version control system (DVCS), not a centralized one.  That rules out Subversion.  (And CVS.)  Fossil and Git are DVCSes, so they’ll work in a case like this.  Mercurial and Bazaar (a.k.a. bzr) are also DVCSes, and both are also in the Cygwin package repo.

I don’t know how to use any of the other three available DVCSes for a task like yours, but it’s certainly easy enough with Fossil.

The command flow looks like this, assuming the removable disk is called R:, and using “f” as a short alias for “fossil”:

   f new /cygdrive/r/shared-project.fossil
   cd ~/shared-project
   f open /cygdrive/r/shared-project.fossil
   f add *
   f ci -m 'initial checkin'

Now everything in ~/shared-project is copied into the Fossil repo on the R: volume.  When you get to the remote site:

   cd ~/shared-project
   f open /cygdrive/r/shared-project.fossil

Now you have a copy of all the files from the R: drive.  If you open a Fossil repo within an existing tree that previously wasn’t under Fossil management, it will ask whether you want to overwrite the preexisting files or leave them alone.  If you leave them alone, a subsequent “fossil diff” will show how your preexisting files differ from the ones in the Fossil repo on R:.

After you make changes to files at either site, say “fossil ci” and it will open a text editor for you to describe your changes.   (Or use the -m option, as above.)

Then back at the other site:

   cd ~/shared-project
   f up

Now all your remote changes are synchronized.

If all that looks complicated, realize that there are only a few day-to-day commands: f ci, f up, f diff.
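
To tie this back to the original question (what is in the live tree but not
in the snapshot, and what has changed), here is a rough sketch of a "what
differs?" check to run inside an open Fossil checkout.  It relies on the
"changes", "extras", and "diff" subcommands; the helper name whats_different
is made up for illustration.

```shell
# Sketch: report how the live tree differs from the last check-in.
# whats_different is a hypothetical helper name; run it inside an
# open Fossil checkout such as ~/shared-project.
whats_different() {
    echo "== managed files edited, added, or deleted =="
    fossil changes

    echo "== files in the tree that Fossil is not tracking =="
    fossil extras

    echo "== line-by-line differences =="
    fossil diff
}
```

The "extras" step is the part that plain tar comparison lacks: it lists files
that exist in the working tree but were never added to the repository.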

> the majority of the
> differences will not be relevant as the hierarchy exists at both sites.

If you’re saying that there are files that need to be semi-synchronized between the sites, so that only *some* changes to individual files need to be copied, then Fossil is probably going to fight you.

If you’re saying instead that some files in a given tree are sync’d and some aren’t, that’s easy.  That’s actually the normal way to use Fossil, since with software development projects, you typically only store original source files, and never store anything that can be re-generated from those sources.

(Some projects bend that rule a bit, storing both configure.ac and configure, for example.)

> Most of the files are not software, though parallels can be drawn:
> Long SQL scripts, Matlab scripts, images, data files, VBA, Matlab
> files, text files, LaTeX files, image files, and M$ Office files
> (Access, Excel, Word, Powerpoint, PST).

Most of those things are sensible to store in Fossil.

The main thing you want to avoid storing is large binary files whose content changes substantially and frequently.

Uncompressed image files (e.g. TIFF without compression) are fine, because probably only *parts* of the image change from one update to the next, so Fossil will store only the differences, then compress that difference, so that you effectively get TIFF-with-compression, and more efficiently than storing a series of separately-compressed TIFFs besides.

Compressed image files (e.g. PNG) can be okay, as long as they change rarely.  The problem with compressed images is that the compression algorithm can change every byte in the file just because a single pixel changed, so the whole image has to be stored in the Fossil repo again.

That said, your existing ZIP archival scheme may be re-copying unchanged images already, in which case Fossil will actually be more efficient, since versions where an image is unchanged refer back to the previously-stored copy of that image.

Besides TIFF, another image file format you might consider is PSD, which can be either compressed or uncompressed.  (Photoshop > Preferences > File Handling > Disable Compression of PSD and PSB Files.)  Plus, PSD layers ensure that only a changed layer needs to be stored separately, rather than the whole thing if you *do* use PSD compression.

MS Office docs are a similar problem to compressed PSD, since they’re just specially-structured ZIP files.  Unchanged assets within, say, a PPTX file shouldn’t be re-copied into the Fossil repo on checkin, but Fossil will still have to store more data than it would if you could get an uncompressed PPTX file.

By comparison, the LaTeX documents are wonderful for Fossil, since they’re uncompressed text, so you’ll get massive compression from them.  Not just the normal 2:1 you typically get for text, but potentially many times that because of the delta compression.

> This is not a development
> environment, it is an analysis environment (with code hackery to that
> end).  However, the evolution of files and version control
> requirements probably overlap

Yes, version control systems are good for more than just software source code.

> One difference from the days when I
> wrote "real" (compiled) code

SQL, VBA, and MatLab are real code.  Don’t let anyone tell you different.

> As much as possible,
> everything should be quickly generatable from raw client input data
> files.

That strategy matches exactly with what you want for a VCS: store the source data, not the data generated from it, unless it just takes too much time and effort to re-generate it.

> using vim window
> splitting, it is very efficient to browse the diff output

While you can still do that with Fossil, it’s probably better to switch to either “fossil gdiff” coupled with a graphical diff utility of your choice, or to use “fossil ui”, which will let you view diffs of checked-in versions in a browser, either inline or side-by-side, your choice.

On Windows, fossil gdiff defaults to WinDiff, which you may already have installed, since MS distributes it with some other software:

  https://en.wikipedia.org/wiki/WinDiff

A lot of people prefer Beyond Compare or Meld, both of which can be configured to act as the graphical diff handler for Fossil.

It should be possible to use Vim’s vimdiff feature this way, too.
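
As a sketch of that configuration step: Fossil reads the program to launch
from its "gdiff-command" setting.  The helper name set_gdiff_tool is
hypothetical, and the tool names in the comment are only examples; whether a
given tool behaves well when launched this way is something to try out
rather than take on faith.

```shell
# Sketch: pick the external program that "fossil gdiff" launches.
# set_gdiff_tool is a hypothetical helper; gdiff-command is the Fossil
# setting, and --global applies it to every repository for this user.
set_gdiff_tool() {
    tool=$1   # e.g. "meld" or "vimdiff"; must be on PATH
    fossil settings gdiff-command "$tool" --global
}
```

After "set_gdiff_tool meld", a plain "fossil gdiff" in a checkout should hand
each changed file pair to that program.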

> try a few baby steps at some point.

One of the smarter things you can do with Fossil is to use several repositories, one for each focused project, instead of trying to store everything in a single “world” repo.

So, put one project under Fossil management today.  Sync it back and forth, work out the kinks.  Put another project under Fossil in a separate repository next week.  Add more repos as you become comfortable with the process.

Fossil makes managing multiple active repos easy with its “all” command, which lets you do common things to all of the repositories.  “fossil all sync” is a common incantation, for example, meaning “Update all the local Fossil checkouts with the changes from the master repos.”

* Re: workflow idiom to compare zip/tgz with folder subtree
  2015-09-25 16:37         ` Warren Young
@ 2015-10-05  0:39           ` Paul
  0 siblings, 0 replies; 14+ messages in thread
From: Paul @ 2015-10-05  0:39 UTC (permalink / raw)
  To: cygwin

Warren Young <wyml <at> etr-usa.com> writes:
Lots of good background for newbies to version control apps.

Warren, thanks for the comprehensive map of version control
apps for newbies.  To be honest, I'm not sure when I will have
a chance to get spun up on one.  But I know where there is a
good intro now.


