Bug 342372 - support gitattributes
Summary: support gitattributes
Status: RESOLVED FIXED
Alias: None
Product: JGit
Classification: Technology
Component: JGit
Version: unspecified
Hardware: All All
Importance: P3 enhancement with 117 votes
Target Milestone: ---
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: helpwanted
Duplicates: 421364
Depends on: 521455 486560 486563 486628 499615 537410
Blocks: 357039 452968 420799 421364 470333
Reported: 2011-04-10 12:43 EDT by Christian Halstrick CLA
Modified: 2019-03-20 18:58 EDT (History)
CC List: 95 users

See Also:


Attachments
result of last run of the performance analysis script at https://gist.github.com/b2f67f9ee921bb5e31bc (11.23 KB, text/plain)
2014-11-03 08:33 EST, Christian Halstrick CLA
no flags Details
Updated result of last run of the performance analysis script (points to the last patch set of each review) (7.23 KB, text/plain)
2014-11-04 05:00 EST, Arthur Daussy CLA
no flags Details
binary patched plugins (stable-4.1) for testing the functionality (4.55 MB, application/x-stuffit)
2015-10-30 09:26 EDT, Ivan Motsch CLA
no flags Details
example .gitattributes for testing (623 bytes, text/plain)
2015-10-30 09:31 EDT, Garret Wilson CLA
no flags Details
performance-check after eol-fix patch (6.64 KB, text/plain)
2015-11-03 08:19 EST, Ivan Motsch CLA
no flags Details

Description Christian Halstrick CLA 2011-04-10 12:43:14 EDT
JGit should support gitattributes[1]. At least it should be possible to specify the "diff" property. This would make it possible to mark certain files as "binary" and to prevent content merges on these files. When one knows that for certain file types (e.g. files with specific file endings) the content merge produces wrong content, and that for those files it is better to rely on manual conflict resolution, marking those files as binary would solve the issue. Such a problem is described in [2].

[1] http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html
[2] http://dev.eclipse.org/mhonarc/lists/egit-dev/msg02058.html
Comment 1 Robin Stocker CLA 2011-05-04 11:43:36 EDT
Is this bug the cause for jgit inserting conflict markers into jars on merge? Or put another way, does jgit currently support binary file detection at all?
Comment 2 Alvaro Sanchez-Leon CLA 2011-05-18 12:08:40 EDT
The "merge" property is equally important, since merges are at the repo level. It's fairly common to expect different strategies within the same merge.

Specifying the merge strategy per file type will be needed in R4E, as review metadata files will need to use a different strategy than review target files.
Comment 3 Carsten Pfeiffer CLA 2011-12-01 03:48:48 EST
A draft is already available at

http://egit.eclipse.org/r/#change,1614 and
http://egit.eclipse.org/r/#change,1615
Comment 4 Robert Niemi CLA 2011-12-01 10:24:41 EST
The proposed draft is more than a year old. Is it a dead end, or did the developer give up / couldn't find time to continue the work?
Comment 5 Carsten Pfeiffer CLA 2011-12-01 10:29:48 EST
(In reply to comment #4)
I suppose the latter. But maybe he can answer that himself?
Comment 6 Alan McGovern CLA 2012-02-28 20:57:03 EST
Is there any chance support for this could be added? The lack of support for .gitattributes is causing the crlf settings for my repo to be ignored which in turn is causing a lot of grief.
Comment 7 Miles Parker CLA 2012-07-05 13:07:18 EDT
Hi all, just a note that we (Tasktop) anticipate providing this as part of a new effort. (Thanks to Ericsson for their support.) Sorry, can't give more of a timeline for that, except "this year", but wanted to give people a heads up.
Comment 8 Lothar Werzinger CLA 2012-10-29 12:31:20 EDT
As gitattributes are the recommended way to ensure the right line endings in the repository across platforms / users I would be really interested in getting support for gitattributes in Eclipse. @miles: is there an ETA?

https://help.github.com/articles/dealing-with-line-endings

Git allows you to set the line ending properties for a repo directly using the text attribute in the .gitattributes file. This file is committed into the repo and overrides the core.autocrlf setting, allowing you to ensure consistent behaviour for all users regardless of their git settings.
Comment 9 Miles Parker CLA 2012-10-29 12:45:43 EDT
As the end of the year is rapidly approaching, I can safely say "less than two months" now. :) It's on the plan.

(This is probably obvious, but note that you can maintain the .gitattributes file from w/in Eclipse.)

cheers,

Miles
Comment 10 Adib Saikali CLA 2013-01-09 23:38:04 EST
> (This is probably obvious, but note that you can maintain the .gitattributes
> file from w/in Eclipse.)

Miles can you expand on what you mean by maintaining .gitattributes from w/in eclipse?
Comment 11 Gunnar Wagenknecht CLA 2013-01-29 09:24:55 EST
(In reply to comment #10)
> Miles can you expand on what you mean by maintaining .gitattributes from
> w/in eclipse?

The first part is about adding support for reading and writing .gitattributes from JGit. No Eclipse UI is planned at the moment.
Comment 12 Gunnar Wagenknecht CLA 2013-01-29 16:04:28 EST
FWIW, I started work on looking at the reviews and updating it to the latest JGit code.

https://git.eclipse.org/r/#/c/1614/
https://git.eclipse.org/r/#/c/1615/

There are still a few open ends. At this stage, it's still a draft.
Comment 13 Gunnar Wagenknecht CLA 2013-03-10 12:34:38 EDT
I've rewritten the support, greatly inspired by how git ignores are implemented in JGit. The latest patch set is attached to:

  https://git.eclipse.org/r/1614

Once progress is made we can look into supporting actual attributes within JGit. I'd appreciate it if someone compiled a list of essential attributes that are worth looking at from a JGit perspective. Note, I'm really looking for the ones that can be supported out-of-the-box (line endings?).
Comment 14 Tobias Oberlies CLA 2013-06-14 05:40:06 EDT
Although the comments make me think that gitattributes are not supported, I've observed at least some effects in JGit 3.0.0.201305281830-rc when changing the .gitattributes file in a project:
- Setting "*.bat eol=crlf text" triggers the conversion to LF in the repository (as implied by the text attribute)
- Setting "*.bat eol=crlf" has no effect, so setting the eol attribute does not imply the text attribute (as documented and implemented in native git).

Could someone from the JGit project clarify the status of the current .gitattributes support?
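
For reference, the gitattributes documentation describes `eol` as enabling end-of-line conversion, which effectively sets `text` when `text` was left unspecified. A minimal sketch of that rule, with hypothetical helper names (`parse_attr_line`, `effective_text`) that are not part of JGit or git, and ignoring `!attr`, quoting and macros:

```python
def parse_attr_line(line):
    """Split one .gitattributes line into (pattern, attributes).

    Attribute values are True (set), False (unset via a leading '-'),
    or a string (name=value). Illustrative only, not a full parser.
    """
    pattern, *tokens = line.split()
    attrs = {}
    for tok in tokens:
        if tok.startswith("-"):
            attrs[tok[1:]] = False
        elif "=" in tok:
            name, value = tok.split("=", 1)
            attrs[name] = value
        else:
            attrs[tok] = True
    return pattern, attrs


def effective_text(attrs):
    """Return the effective 'text' state: True, False, 'auto' or None.

    Assumption taken from the gitattributes documentation: an explicit
    eol=lf / eol=crlf counts as 'text' when 'text' itself is unspecified.
    """
    text = attrs.get("text")
    if text is None and attrs.get("eol") in ("lf", "crlf"):
        return True
    return text


pattern, attrs = parse_attr_line("*.bat eol=crlf")
print(pattern, effective_text(attrs))
```

Under this reading, "*.bat eol=crlf" and "*.bat eol=crlf text" should behave the same, while "-text eol=crlf" stays unconverted.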
Comment 15 Gunnar Wagenknecht CLA 2013-06-14 06:06:16 EDT
Tobias, the review is still pending. It adds support for JGit for *parsing* the .gitattributes file. I did not find any other line of code in JGit that reads/parses .gitattributes files.
Comment 16 Christian Halstrick CLA 2013-06-20 10:58:03 EDT
I can confirm that gitattributes are not yet supported by E/Jgit.
Comment 17 Michael Vorburger CLA 2013-11-08 14:20:47 EST
> I'd appreciate it if someone compiled a list of essential attributes that are worth looking at from a JGit perspective. Note, I'm really looking for the ones that can be supported out-of-the-box (line endings?).

IMHO definitely the line ending stuff - I've just created new Bug 421364 specifically about that.
Comment 18 Peter Dahm CLA 2014-01-10 10:32:15 EST
Can you give any hint if and when this bug will be fixed?
Comment 19 Lars Vogel CLA 2014-03-25 08:19:22 EDT
*** Bug 421364 has been marked as a duplicate of this bug. ***
Comment 20 Aaron Digulla CLA 2014-04-09 12:22:54 EDT
Standard test case:

- I've set the line endings for the project to "Unix". This works for text files and Java sources.
- I've committed launch configs inside of the project to share them with the whole team. Eclipse will change the line endings of these to "platform", no matter what the project setting is.

That means every time a Unix or Windows developer opens this project, the line endings of the launch config change.
Comment 21 Robin Rosenberg CLA 2014-04-09 15:16:15 EDT
(In reply to Aaron Digulla from comment #20)
> Standard test case:
> 
> - I've set the line endings for the project to "Unix". This works for text
> files and Java sources.
> - I've committed launch configs inside of the project to share them with the
> whole team. Eclipse will change the line endings of these to "platform", no
> matter what the project setting is.
> 
> That means every time a Unix or Windows developer opens this project, the
> line endings of the launch config change.

There is a workspace setting which is used for this, I believe.
Comment 22 Arthur Daussy CLA 2014-10-23 05:48:16 EDT
Hi,
As mentioned on the mailing list, I have rebased and reworked review 1614:
   https://git.eclipse.org/r/#/c/1614/
 I also added a new review regarding the resolution of git attributes with review:
  https://git.eclipse.org/r/#/c/35377/

I will now start to work on the first git attribute implementation.
Comment 23 Christian Halstrick CLA 2014-11-03 08:31:46 EST
This is the continuation of a performance analysis first presented in gerrit codereview (https://git.eclipse.org/r/#/c/35665). It fits better here because it touches multiple proposals in our review system:

I am interested in the performance effects of gitattributes support and this ident support. Since jgit is also used on server side on repos without attributes I want to make sure we don't destroy performance here.
I took a clean, non-bare linux repo as test repo. I compared the performance of "git status" of native git and multiple jgit versions (plain vanilla, basic attributes support, ident support). I also counted with the strace utility on a linux box how many filesystem calls we do.

The bottom line: In a repo without gitattributes these changes don't influence performance much. That's great! Good job.
On a linux repo with gitattributes and a "*.c ident" rule, the runtime of git status rises by a factor of 4. We are opening 20000 more files. And we are reporting files as dirty which native git thinks are clean.

Here is a table summarizing the performance. The value "runtime(s)" is the wall clock time needed to execute the command, measured with "time". The values "#lstat"/"#open" are the total numbers of lstat/open system calls (get metadata for a file/open a file) we do during execution of "jgit status", measured with "strace".

"jgit status" on clean, non-bare linux repo
===========================================
native git (runtime(s)/#lstat/#open): 0.1s / 48012 / 749
jgit w/o gitattributes (runtime(s)/#lstat/#open): 2.4s / 51203 / 511
jgit with basic gitattributes support (runtime(s)/#lstat/#open): 2.7s / 51203 / 511
jgit with gitattributes & ident support (runtime(s)/#lstat/#open): 2.5s / 51203  / 511

"jgit status" on clean, non-bare linux repo with a root .gitattributes with "*.c ident"
=======================================================================================
native git (runtime(s)/#lstat/#open): 0.1s / 48013 / 750
jgit w/o gitattributes (runtime(s)/#lstat/#open): 2.4s / 51203 / 511
jgit with basic gitattributes support (runtime(s)/#lstat/#open): 3.1s / 51204 / 514
jgit with gitattributes & ident support (runtime(s)/#lstat/#open): 8.5s / 73823  / 23134

My test script runs on linux and is downloadable from https://gist.github.com/b2f67f9ee921bb5e31bc. I attached a trace of the last run.
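
The "factor 4" and "20000 more files" figures can be recomputed from the second table. A small sketch using only the numbers above (the variable names are just for illustration):

```python
# Rows from the table for the repo with a root .gitattributes ("*.c ident").
# Each tuple is (runtime in seconds, number of lstat calls, number of open calls).
jgit_plain = (2.4, 51203, 511)      # jgit w/o gitattributes
jgit_ident = (8.5, 73823, 23134)    # jgit with gitattributes & ident support

# Ratio of wall clock times and the additional system calls.
slowdown = jgit_ident[0] / jgit_plain[0]
extra_lstat = jgit_ident[1] - jgit_plain[1]
extra_open = jgit_ident[2] - jgit_plain[2]

print(f"slowdown: {slowdown:.1f}x")
print(f"extra lstat calls: {extra_lstat}")
print(f"extra open calls: {extra_open}")
```

So the measured slowdown is about 3.5x, with roughly 22600 additional lstat and open calls each.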
Comment 24 Christian Halstrick CLA 2014-11-03 08:33:12 EST
Created attachment 248322 [details]
result of last run of the performance analysis script at https://gist.github.com/b2f67f9ee921bb5e31bc
Comment 25 Arthur Daussy CLA 2014-11-04 05:00:38 EST
Created attachment 248345 [details]
Updated result of last run of the performance analysis script (points to the last patch set of each review)

Here is the result of the script when using all the last patch set of 3 related reviews.
Comment 26 Tobias Oberlies CLA 2014-11-21 11:02:27 EST
(In reply to comment #23)
> On a linux repo with gitattributes and a "*.c ident" rule, the runtime of git
> status rises by a factor of 4. We are opening 20000 more files. And we are reporting
> files as dirty which native git thinks are clean.
Could the "ident" support be tracked in a separate bug? I'm pretty sure that only a minority of the 41 people wanting .gitattributes support actually need that feature.
Comment 27 Arthur Daussy CLA 2014-11-24 03:49:55 EST
Hi,

 I have created a different bug for the implementation of the "Ident" attribute:
 https://bugs.eclipse.org/bugs/show_bug.cgi?id=452968

However, the "ident" attribute development is mainly used to validate the architecture choices made for the implementation of the git attributes.

Regards,

Arthur
Comment 28 Chris Aniszczyk CLA 2015-01-05 11:18:07 EST
Progress was recently made on the Gerrit review; filed a CQ:
https://dev.eclipse.org/ipzilla/show_bug.cgi?id=9078

Should go into 3.7
Comment 29 Chris Aniszczyk CLA 2015-01-08 11:53:44 EST
merged into master: c185484dcfb52aaae818bc111824f1a31ec0f806
Comment 30 Arthur Daussy CLA 2015-01-08 12:54:46 EST
Isn't it a bit too early to close this bug?

https://git.eclipse.org/r/#/c/1614/ was the first step to compute the git attributes. This review was really about computing the attributes in a separate tree. If I understand the specification correctly, git attributes should be computed from both the working tree and the index [1]. This is why I would suggest waiting for the review "https://git.eclipse.org/r/#/c/35377/" to be integrated before closing this bug. This review is about moving the computation of git attributes to the TreeWalk. From this object we can access both the WorkingTreeIterator and the DirCacheIterator. This is why I thought it was a good candidate for implementing the logic described by [1].

Is that something that sounds reasonable to you?

[1] "When the .gitattributes file is missing from the work tree, the path in the index is used as a fall-back. During checkout process, .gitattributes in the index is used and then the file in the working tree is used as a fall-back." taken from https://www.kernel.org/pub/software/scm/git/docs/gitattributes.html
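
The fall-back rule quoted in [1] can be sketched as follows. This is only an illustration of the documented lookup order, with a hypothetical function name; it is not how JGit's TreeWalk implements it:

```python
def pick_attributes_source(operation, in_worktree, in_index):
    """Choose where a directory's .gitattributes would be read from.

    operation   -- "checkout" or any other command name (e.g. "status")
    in_worktree -- True if .gitattributes exists in the working tree
    in_index    -- True if .gitattributes exists in the index
    Returns "worktree", "index", or None if neither copy exists.
    """
    if operation == "checkout":
        # During checkout the index copy is preferred, working tree is the fall-back.
        order = [("index", in_index), ("worktree", in_worktree)]
    else:
        # Otherwise the working tree copy is preferred, index is the fall-back.
        order = [("worktree", in_worktree), ("index", in_index)]
    for name, present in order:
        if present:
            return name
    return None


print(pick_attributes_source("status", False, True))
print(pick_attributes_source("checkout", True, True))
```

Both calls above resolve to the index copy: the first because the working tree file is missing, the second because checkout prefers the index.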
Comment 31 Chris Aniszczyk CLA 2015-01-08 13:38:25 EST
Fair enough
Comment 32 Ats CLA 2015-03-05 10:25:45 EST
I didn't quite understand if the enhancement for this plugin that is requested by this bug is now supposed to be implemented, or not. My first impression was that the enhancement was merged into master, but the bug was reopened just because of code review. I guess I got the wrong impression:

I tried out the latest EGit release with latest Eclipse JEE release, but it seemed that .gitattributes is still not considered and line endings in repository are changed to CRLF if workspace contains file with CRLF line endings.

----------

Details: 

1) Installation: I installed EGit version 3.7.0.201502260915-r through Eclipse marketplace on top of freshly downloaded Eclipse Luna SR2 4.4.2 https://www.eclipse.org/downloads/packages/eclipse-ide-java-ee-developers/lunasr2

2) .gitattributes file:
* text eol=crlf
*.java text

3) Previous commit for java file was made through SourceTree so that line endings in workspace are CRLF and in repository LF, as expected.

4) When I added one line and committed through Eclipse, all line endings in the workspace remained the same (CRLF) as expected, but line endings in the repository were changed from LF to CRLF <- THIS IS NOT EXPECTED!
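
For clarity, the expected behaviour with "* text eol=crlf" can be sketched as a pair of conversions: normalize to LF when content goes into the repository, convert to CRLF when it is written to the workspace. The function names are hypothetical, and the sketch ignores lone CR characters and binary detection, which a real implementation has to handle:

```python
def to_repository(data: bytes) -> bytes:
    """Normalize CRLF to LF, as expected for a 'text' file on add/commit."""
    return data.replace(b"\r\n", b"\n")


def to_workspace(data: bytes, eol: str) -> bytes:
    """Convert repository content to the line ending requested by 'eol'."""
    normalized = data.replace(b"\r\n", b"\n")
    if eol == "crlf":
        return normalized.replace(b"\n", b"\r\n")
    return normalized


workspace_content = b"class A {\r\n}\r\n"
committed = to_repository(workspace_content)
print(committed)
print(to_workspace(committed, "crlf"))
```

With these rules the workspace keeps CRLF while the committed blob contains only LF, which is what SourceTree produced in step 3.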
Comment 33 Arthur Daussy CLA 2015-03-10 03:07:59 EDT
Hi,

No, the computation of git attributes is not integrated yet. 2 patchsets are waiting for a review (https://git.eclipse.org/r/#/c/35377/ and https://git.eclipse.org/r/#/c/35665/6). Moreover, those reviews only introduce the backend to support gitattributes. Each git attribute (end of line, merge etc.) will need to be implemented separately in subsequent reviews.

Regards,

Arthur Daussy
Comment 34 Matthias Kuchem CLA 2015-04-30 07:28:41 EDT
Can we get a new Target Milestone here? 3.7 is already released and this isn't fixed.
Comment 35 Gunnar Wagenknecht CLA 2015-04-30 07:51:57 EDT
Removing myself from assignee as I'm no longer driving this. Anyone who is interested, please feel free to take over.
Comment 36 Andre Bossert CLA 2015-05-25 14:04:47 EDT
We also have issues with automatic merges in JGit for some special files like IBM Rhapsody (rpy, sbs) that need their own merge tools to be called.
With .gitattributes:
*.rpy diff=binary
*.sbs diff=binary

we are able to make Git "skip" changing such files during merge / rebase etc. For command line Git (Windows msysgit) and other tools like TortoiseGit or SourceTree we have the support and it works. Now, while switching to Eclipse EGit / JGit, we have issues in our merge workflows.

Can we help to test / verify the patches if available?

P.S.: for the whole workflow also external diff / merge tools should be supported, see related:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=356832
Comment 37 Nathan Deckard CLA 2015-06-16 11:37:47 EDT
We use gitattributes in combination with smudge/clean filters to be applied to specific file types. We need support for both.
Comment 38 Matthias Sohn CLA 2015-06-17 03:33:12 EDT
We started working on JGit support for clean/smudge filters as a first step to support GitHub's LFS extension, which allows storing large files on a separate LFS server while keeping track of versions in git; see https://dev.eclipse.org/mhonarc/lists/jgit-dev/msg02908.html and bug 470333
Comment 39 Srgjan Srepfler CLA 2015-07-10 04:21:08 EDT
This is the main reason why everyone on Windows hates git after a line endings fiasco.
Comment 40 Eclipse Genie CLA 2015-08-16 17:30:13 EDT
WARNING: this patchset contains 1472 new lines of code and may require a Contribution Questionnaire (CQ) if the author is not a committer on the project. Please see: https://wiki.eclipse.org/Project_Management_Infrastructure/Creating_A_Contribution_Questionnaire
Comment 41 Garret Wilson CLA 2015-08-26 09:41:55 EDT
I'm getting ready to migrate a huge (since 2003) Subversion repository over to Git. And of course I was planning on having our team use my IDE of choice, Eclipse.

But Eclipse+EGit cannot even handle the standard way for specifying line endings, which is .gitattributes? Seriously?

And this was promised by the end of the year---in 2012!?? (See Comment 9.)

And then someone opened Bug 421364 (see  Comment 17) to split out the line ending functionality, and the next year someone marked it as a duplicate?

So in short, is Eclipse not serious about Git support? This is fundamental configuration for Git. The only other way to prevent major "grief" is to dictate that all developers manually set their line endings to Unix, even on Windows? Am I then expected to fly around the country and verify their Eclipse settings, or just wait until someone forgets and we have to do a history filter?

Or do we just need to go with an IDE that supports Git?
Comment 42 Matthias Sohn CLA 2015-08-26 11:12:24 EDT
(In reply to Garret Wilson from comment #41)
> I'm getting ready to migrate a huge (since 2003) Subversion repository over
> to Git. And of course I was planning on having our team use my IDE of
> choice, Eclipse.
> 
> But Eclipse+EGit cannot even handle the standard way for specifying line
> endings, which is .gitattributes? Seriously?
> 
> And this was promised by the end of the year---in 2012!?? (See Comment 9.)

this was promised by a contributor who walked away

> And then someone opened Bug 421364 (see  Comment 17) to split out the line
> ending functionality, and the next year someone marked it as a duplicate?
> 
> So in short, is Eclipse not serious about Git support? This is fundamental
> configuration for Git. The only other way to prevent major "grief" is to
> dictate that all developers manually set their line endings to Unix, even on
> Windows? Am I then expected to fly around the country and verify their
> Eclipse settings, or just wait until someone forgets and we have to do a
> history filter?

Obviously none of those who contribute to JGit found this important enough
to spend the scarce time they can dedicate to JGit on finishing support for
gitattributes.

If this is important for you feel free to contribute the missing pieces.
Comment 43 Garret Wilson CLA 2015-08-26 11:17:44 EDT
Well then let me at least know what the situation is, so I can be sure of a workaround.

Which EOL settings does JGit support?

* core.eol?
* core.autocrlf (global config)?
* core.safecrlf (global config)?
* core.autocrlf (repository config)?
* core.safecrlf (repository config)?
Comment 44 Garret Wilson CLA 2015-08-27 11:39:15 EDT
We may have some new developers that I could assign to this in the upcoming months. This feature is important to us, as it is an essential part of Git and relates to every single check-in of Java files.

So that I can better understand the situation, could someone at least answer my questions in Comment 43 regarding current eol/autocrlf support?
Comment 45 Paul Verest CLA 2015-09-04 14:06:30 EDT
@all

There is Bug 476615 - Editor for .git* files (.gitignore .gitmodules .gitattributes)
it would be better if Editor comes with functionality support
Comment 46 Matthias Sohn CLA 2015-09-06 18:35:13 EDT
(In reply to Paul Verest from comment #45)
> @all
> 
> There is Bug 476615 - Editor for .git* files (.gitignore .gitmodules
> .gitattributes)
> it would be better if Editor comes with functionality support

editor should be discussed in Bug 476615, this bug here is tracking work on the gitattributes support in JGit which has no UI itself
Comment 47 Garret Wilson CLA 2015-09-06 18:38:26 EDT
I've already mentioned I may be able to find resources to work on this. To get me oriented, can no one answer my questions in Comment 43?

Which EOL settings does JGit support?

* core.eol?
* core.autocrlf (global config)?
* core.safecrlf (global config)?
* core.autocrlf (repository config)?
* core.safecrlf (repository config)?

Is it that no one really knows how EGit handles EOL anywhere, or that no one really cares if anyone fixes this or not?
Comment 48 Matthias Sohn CLA 2015-09-06 19:25:07 EDT
(In reply to Garret Wilson from comment #47)
> I've already mentioned I may be able to find resources to work on this. To
> get me oriented, can no one answer my questions in Comment 43?
> 
> Which EOL settings does JGit support?
> 
> * core.eol?
> * core.autocrlf (global config)?
> * core.safecrlf (global config)?
> * core.autocrlf (repository config)?
> * core.safecrlf (repository config)?
> 
> Is it that no one really knows how EGit handles EOL anywhere, or that no one
> really cares if anyone fixes this or not?

AFAIK JGit supports core.autocrlf but neither safecrlf nor eol.

The only JGit commit mentioning safecrlf I found is this one

jgit (master)]$ git log | grep -i -A 10 -B 15 safecrlf

commit 76dd9d1d46007fc49639d264631658114f4fbd24
Author: Robin Rosenberg <robin.rosenberg@dewire.com>
Date:   Mon Oct 31 23:30:11 2011 +0100

    Support more of AutoCRLF

    This patch introduces CRLF handling to the DirCacheCheckout and
    WorkingTreeIterator supporting the AutoCRLF for add, checkout
    reset and status and hopefully some other places that depende
    on the underlying logic of the affected API's.

    The patch includes test cases for the Status command provided by
    Tomasz Zarna for bug 353867.

    The core.eol and core.safecrlf options are not yet supported.

    Bug: 301775
    Bug: 353867
    Change-Id: I2280a2dc0698829475de6a662a6c6e80b1df7663

I don't know how perfect autocrlf support is since I don't use Windows so I lack first-hand experience

Commit 7df17e57d4e736336de6b95810daf076e9b7dded looks interesting for implementing support for core.eol
Comment 49 Eclipse Genie CLA 2015-10-28 08:52:36 EDT
WARNING: this patchset contains 1501 new lines of code and requires a Contribution Questionnaire (CQ), as author arthur.daussy@obeo.fr is not a committer on jgit/jgit. Please see: https://wiki.eclipse.org/Project_Management_Infrastructure/Creating_A_Contribution_Questionnaire
Comment 50 Arthur Daussy CLA 2015-10-29 04:01:09 EDT
Great I see that this review is moving on!
I'm a committer on another Eclipse project; should I still fill out a CQ? If so, could you explain how to do it? I have read the wiki page but was unable to find a link to create a CQ (if I understand correctly, only a JGit committer can initiate one?).

Regards,

Arthur
Comment 51 Matthias Sohn CLA 2015-10-29 05:10:42 EDT
The CQ 9120 for this change was already approved in January :-)

The recently introduced automation by Genie isn't smart enough to understand that the CQ linked in the commit message was already approved hence it added a comment that a CQ is required.

Since we now work on support for clean/smudge filters to enable LFS support we found that we need a few more tweaks and a few more tests before we can submit the work you started.

Find the complete change series here
https://git.eclipse.org/r/#/c/50372

We think we can submit this change very soon as we now reached a state with good test coverage and all tests now succeed.
Comment 52 Arthur Daussy CLA 2015-10-29 05:42:57 EDT
I thought so but I wanted to be sure to avoid preventing any progress on this review.

Thanks

Arthur
Comment 53 Ivan Motsch CLA 2015-10-30 09:23:02 EDT
The jgit/egit code I found seems well-structured to me, has good code style and is easy to understand. Good work.
Now I created a patch for jgit/egit version stable-4.1.

- The patch introduces basic attribute support and detailed End-of-line-conversion.
- I separated the Ignore and Attributes concept from the Treewalkers and dircheckout code and added it as a facility to the Repository. I think that this is more the way other implementations are doing it due to the fact that attribute and ignore decisions are only valid when including the whole tree of files in the complete git repo at once.
- I created many tests covering everything I did and all classes and concepts I introduced and implemented.

Basic Features:
- attributes support including global attributes file, config/info/attributes file, working-dir/working-subdir .gitattributes files
- macros ('binary' and custom macros [attr]foo bar)
- automatic refresh using jgit FileSnapshot concept

End-of-line conversion support
- detect native eol
- config properties 'core.eol', 'core.autocrlf'
- .gitattributes such as 'eol=lf', 'eol=crlf', 'text', '-text', 'binary', 'text=auto'

All Junit tests are green, except the following two:
- testTrailingSpaces(org.eclipse.jgit.ignore.IgnoreNodeTest)
comment: this test always fails on windows due to the fact that windows(7) cannot create directory names with trailing spaces
- testParseHistory(org.eclipse.jgit.patch.EGitPatchHistoryTest)
comment: this uses the *nix external git command, which I don't currently have linked

Attached are the patched binary Plug-Ins in case someone wants to test this patch without checking out all git files and do the building...
Please give feedback. The patched version is based on stable-4.1 (eclipse 4.5 = mars)

I will also submit a Gerrit Change shortly that can be reviewed.

We are using that code currently in approx. 15 larger projects and it solves our issues with line conversion and mixed java, c# and signed vsto file environments.
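
To illustrate the precedence and macro handling listed under "Basic Features", here is a minimal sketch. It assumes the order given in the gitattributes documentation (global file lowest, then .gitattributes files from the top level down to the file's directory, then info/attributes highest) and the built-in 'binary' macro expanding to '-diff -merge -text'. All names are hypothetical, patterns are matched against the file name only, and custom [attr] macros are not handled; this is not the code from the patch:

```python
import fnmatch

# Built-in macro as described in the gitattributes documentation.
MACROS = {"binary": {"diff": False, "merge": False, "text": False}}


def parse_rules(text):
    """Parse attribute file content into a list of (pattern, attrs) rules."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *tokens = line.split()
        attrs = {}
        for tok in tokens:
            if tok.startswith("-"):
                attrs[tok[1:]] = False
            elif "=" in tok:
                name, value = tok.split("=", 1)
                attrs[name] = value
            else:
                attrs[tok] = True
                attrs.update(MACROS.get(tok, {}))  # expand known macros
        rules.append((pattern, attrs))
    return rules


def resolve(filename, sources):
    """Merge attributes for filename.

    sources is ordered from lowest to highest precedence; within one
    source, later lines override earlier ones, attribute by attribute.
    """
    result = {}
    for rules in sources:
        for pattern, attrs in rules:
            if fnmatch.fnmatchcase(filename, pattern):
                result.update(attrs)
    return result


global_rules = parse_rules("* text=auto\n*.png binary\n")
dir_rules = parse_rules("*.bat text eol=crlf\n")
info_rules = parse_rules("*.bat eol=lf\n")
print(resolve("run.bat", [global_rules, dir_rules, info_rules]))
print(resolve("logo.png", [global_rules]))
```

In this example info/attributes wins for 'eol' on run.bat, while 'text' still comes from the directory-level file, because overriding happens per attribute.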
Comment 54 Ivan Motsch CLA 2015-10-30 09:26:19 EDT
Created attachment 257648 [details]
binary patched plugins (stable-4.1) for testing the functionality

see Comment 53
Comment 55 Garret Wilson CLA 2015-10-30 09:30:12 EDT
I appreciate that someone is getting a chance to address this.

I'd like to attach a sample .gitattributes file for you to test with the patch. Why? Because it seems that every time (in whatever product) someone says "such and such is fixed", I try it out in *my* situation and it breaks, and they say, "oh, it doesn't support such-and-such feature". ;)
Comment 56 Garret Wilson CLA 2015-10-30 09:31:40 EDT
Created attachment 257649 [details]
example .gitattributes for testing

This is the .gitattributes file we intend to use in a really big Subversion -> Git conversion very soon. I'd like to know if it works correctly with the patch. Thanks.
Comment 57 Eclipse Genie CLA 2015-10-30 09:49:03 EDT
New Gerrit change created: https://git.eclipse.org/r/59345

WARNING: this patchset contains 3038 new lines of code and requires a Contribution Questionnaire (CQ), as author ivan.motsch@bsiag.com is not a committer on jgit/jgit. Please see: https://wiki.eclipse.org/Project_Management_Infrastructure/Creating_A_Contribution_Questionnaire
Comment 58 Eclipse Genie CLA 2015-10-30 09:51:16 EDT
New Gerrit change created: https://git.eclipse.org/r/59346
Comment 59 Ivan Motsch CLA 2015-10-30 09:52:18 EDT
(In reply to Garret Wilson from comment #56)
> Created attachment 257649 [details]
> example .gitattributes for testing
> 
> This is the .gitattributes file we intend to use in a really big Subversion
> -> Git conversion very soon. I'd like to know if it works correctly with the
> patch. Thanks.



Thank you very much. Yes, I will check it right now.
Comment 60 Garret Wilson CLA 2015-10-30 09:55:10 EDT
Note that while my example .gitattributes file just sets binary/text for each file (remember that "text" is an alias; see .gitattributes online documentation), it also sets specific EOL handling for *.bat and *.sh files.
Comment 61 Garret Wilson CLA 2015-10-30 09:57:10 EDT
Sorry, I misspoke. Instead of "text", it is "binary" that is an alias (that is, a macro). See http://git-scm.com/docs/gitattributes .
Comment 62 Ivan Motsch CLA 2015-10-30 10:00:41 EDT
(In reply to Garret Wilson from comment #60)
> Note that while my example .gitattributes file just sets binary/text for
> each file (remember that "text" is an alias; see .gitattributes online
> documentation), it also sets specific EOL handling for *.bat and *.sh files.

I can confirm to you that the patch will cover all your settings in the .gitattributes file. Except the diff=... attributes.

My patch (so far) only solves the end-of-line issues. The diff is not yet part of it.

For the various end-of-line attributes in your .gitattributes file, I have in fact full test coverage which tests exactly these cases: eol=lf, text=auto, binary etc.
Comment 63 Ivan Motsch CLA 2015-10-30 10:02:27 EDT
If you can, please review the file (gerrit) org.eclipse.jgit.util.io.StreamConversionFactory.java
which does the decision part of the stream conversions.
Comment 64 Garret Wilson CLA 2015-10-30 10:23:41 EDT
> I can confirm to you that the patch will cover all your settings in the
> .gitattributes file. Except the diff=... attributes.

Ivan, that is super-awesome!!!! Thanks for checking this for me.

The diff= stuff I put in for completeness. I'm not even sure how they work. I'm most interested in the EOL stuff for the moment, so this is great!!

I hope this goes smoothly, so that by the time we have everything converted over to Git for our developers, Eclipse will be working with .gitattributes as well.
Comment 65 Ivan Motsch CLA 2015-10-30 10:28:14 EDT
Andrey Loskutov commented on the Gerrit change: The patch doubles "jgit status" execution time on a huge (~8 GB) repository, from ~6 to ~12 seconds.

This is correct. In fact, I do a FileSnapshot isModified check on every access to the file. There is huge potential for performance improvement.

However, unless there is a basic "go", I will not invest much time (and money) in performance improvements.
Much can be done, but there should be "enough" interest in this :-)
Comment 66 Andrey Loskutov CLA 2015-10-30 10:31:40 EDT
(In reply to Ivan Motsch from comment #65)
> Andrey Loskutov commented on the Gerrit change: The patch doubles "jgit status"
> execution time on a huge (~8 GB) repository, from ~6 to ~12 seconds.
> 
> This is correct. In fact, I do a FileSnapshot isModified check on every access
> to the file. There is huge potential for performance improvement.

Ivan, I haven't reviewed your code, but I've just tested it on our repository (~8 GB) and it doubles the "jgit status" execution time (from ~6 to ~12 seconds). This is not acceptable IMHO. FYI: jgit master is already 6 times slower than the git CLI, which does the same work in 1 second.
 
> However unless there is a basic "go" I will not invest much time (and money)
> in performance improvements.
> There can be done much but there should be "enough" interest in this :-)

I'm not sure if you have seen another patch here - https://git.eclipse.org/r/35377/ - which *also* adds support for git attributes and does not add any significant performance regression.
Comment 67 Ivan Motsch CLA 2015-10-30 10:39:20 EDT
Yes, I saw that other patch, but it does not cover even the minimum needed for correct end-of-line handling and correct override/precedence of the various attribute files.

I checked out the other patch but found it too difficult to add the aforementioned features to it, due to the lack of generalization in the iterator locations.

These two things interoperate but should imho not be combined in code.
As mentioned, I think much can be done to improve performance, but until that happens I/we need a solution in the first place that handles crlf, eol and the end-of-line attributes all at once.
This is one. It works and I hope it will help others in having a first step into a viable solution.

I will be happy to help improve this even more. jgit/egit is a great tool and I like using it.
Comment 68 Andrey Loskutov CLA 2015-10-30 10:44:47 EDT
(In reply to Ivan Motsch from comment #67)
> yes I saw that other patch, but that covers not at least the minimum to
> support for correct end-of-line handling and correct override / precedence
> of the various attribute files.

Maybe you could comment about issues/design problems you've found directly on the patch?
Comment 69 Christian Halstrick CLA 2015-11-02 04:34:47 EST
For a few days now I have been working on adding "git lfs" functionality to jgit. For that reason I also need working gitattributes. The attributes handling was too basic for my needs, so I based my work on Arthur's open change https://git.eclipse.org/r/#/c/35377/. The idea was that for a good attributes implementation we need to search for them in the working tree, the index and maybe even in other trees. The place in jgit which knows about all the different places to look for attributes seems to be the treewalk. Since performance is a very big topic in JGit, we thought that integrating the attributes handling into the treewalk would allow us to build a well-performing implementation. In a Checkout or Add operation we would like to look at the expensive working tree only once. Scanning the working tree of the linux repo once to search for dirty files and conflicts, and then scanning it again to find gitattributes, should be avoided.

But I do agree that the current implementation https://git.eclipse.org/r/#/c/35377 is quite complicated. In particular I don't like (anymore :-() that attribute handling is also spread into all the iterators. I influenced parts of this decision, so shame on me.

I'll start reviewing your change https://git.eclipse.org/r/#/c/59345/. Maybe you can write comments in https://git.eclipse.org/r/#/c/35377.
Comment 70 Ivan Motsch CLA 2015-11-02 04:57:36 EST
Thank you for the comment.
I added a draft comment to https://git.eclipse.org/r/#/c/35377/17/org.eclipse.jgit/src/org/eclipse/jgit/treewalk/WorkingTreeIterator.java

I am not sure if I did it the right way. Can you see that comment?
Comment 71 Andrey Loskutov CLA 2015-11-02 05:12:58 EST
(In reply to Ivan Motsch from comment #70)
> Thank you for the comment.
> I added a draft comment to
> https://git.eclipse.org/r/#/c/35377/17/org.eclipse.jgit/src/org/eclipse/jgit/
> treewalk/WorkingTreeIterator.java
> 
> I am not sure if I did it the right way. Can you see that comment?

Nope. You have to "Reply" in Gerrit to publish your draft comments.
Comment 72 Ivan Motsch CLA 2015-11-02 11:34:09 EST
I created a performance optimized version of the patch.
This includes most of the proposals I made in the Gerrit comment from Comment 70.
That reduced the git status command on my local 30000+ file git repo from 4.5sec to 3.1sec.
gerrit https://git.eclipse.org/r/59345 
Could you verify this with your reference linux test?
Comment 73 Ivan Motsch CLA 2015-11-03 08:17:19 EST
here is the performance test of patchset3 (see attachment performance-check-eol-fix.txt).

original jgit without eol-support: 5.676s
patch set 3 with eol-support: 6.764s

I used the Java 7+ Files.walkFileTree without traversing files, but only the directories, probing each one for .gitattributes/.gitignore files. These are the 10036 'stat error' hits. This includes an ad-hoc subtree ignore when a .gitignore was found.

Now with this patch a single Files.walkFileTree is done in addition to the existing WorkingTreeIterator.

Maybe a combination of this and the working tree iterator would boost the performance. Files.walkFileTree is massively faster than doing a File.listFiles style traversal.
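
A directory-only probe of this kind might look like the sketch below. It is an illustration and not the code from the patch: the class name AttributeFileScanner and the method findControlFiles are hypothetical, and skipping the .git directory is an assumption added for the example. Files.walkFileTree and SimpleFileVisitor are standard java.nio.file API. Note that the walker still enumerates the file entries of each directory; the sketch simply does no work for them.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JGit.
final class AttributeFileScanner {

    /** Returns every .gitattributes/.gitignore file below root, probing per directory. */
    static List<Path> findControlFiles(Path root) throws IOException {
        List<Path> found = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                // Assumption for this sketch: never descend into the repository metadata.
                Path name = dir.getFileName();
                if (name != null && name.toString().equals(".git")) {
                    return FileVisitResult.SKIP_SUBTREE;
                }
                // Probe the directory for the two control files instead of
                // inspecting every regular file it contains.
                for (String candidate : new String[] {".gitattributes", ".gitignore"}) {
                    Path p = dir.resolve(candidate);
                    if (Files.isRegularFile(p)) {
                        found.add(p);
                    }
                }
                return FileVisitResult.CONTINUE;
            }
            // visitFile is inherited unchanged: it returns CONTINUE and does nothing else.
        });
        return found;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : findControlFiles(root)) {
            System.out.println(p);
        }
    }
}
```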
Comment 74 Ivan Motsch CLA 2015-11-03 08:19:04 EST
Created attachment 257703 [details]
performance-check after eol-fix patch
Comment 75 Eclipse Genie CLA 2015-11-17 19:40:03 EST
Gerrit change https://git.eclipse.org/r/35377 was merged to [master].
Commit: http://git.eclipse.org/c/jgit/jgit.git/commit/?id=12280c02dbb8e4ac10893fbbd415be757afab4c1
Comment 76 Garret Wilson CLA 2015-11-19 15:23:29 EST
Hi, all. It looks like this is being worked on, and for that I am grateful.

I just wanted to check and see what the latest status is. Has it been placed in some public release stream yet, and is there a way to get a version of the Eclipse plugin that supports .gitattributes now? If not, what's the timeline?

Thank you immensely.
Comment 77 Garret Wilson CLA 2015-11-19 15:26:40 EST
(sigh) Nothing simple ever seems to work.

Today I saw that Eclipse offered me Eclipse Git Team Provider 4.1.1.201511131810-r and Java implementation of Git 4.1.1.201511131810-r. I got excited thinking it now included .gitattributes support. So I tried to upgrade and got:

An error occurred while collecting items to be installed
session context was:(profile=SDKProfile, phase=org.eclipse.equinox.internal.p2.engine.phases.Collect, operand=, action=).
No repository found containing: osgi.bundle,org.sonatype.m2e.egit,0.14.0.201509090157
No repository found containing: org.eclipse.update.feature,org.sonatype.m2e.egit.feature,0.14.0.201509090157

:(
Comment 78 Matthias Sohn CLA 2015-11-19 17:57:04 EST
(In reply to Garret Wilson from comment #76)
> Hi, all. It looks like this is being worked on, and for that I am grateful.
> 
> I just wanted to check and see what the latest status is. Has it been placed
> in some public release stream yet, and is there a way to get a version of
> the Eclipse plugin that supports .gitattributes now? If not, what's the
> timeline?

We now have partial support for attributes in the nightly build which is available here
http://download.eclipse.org/egit/updates-nightly

Ivan Motsch is working on more changes adding some of the missing pieces, these changes are still in review:
https://git.eclipse.org/r/#/q/project:jgit/jgit+is:open+owner:%22Ivan+Motsch%22
Comment 79 Matthias Sohn CLA 2015-11-19 17:58:05 EST
The next release is 4.2, planned for mid-December.
Comment 80 Matthias Sohn CLA 2015-11-19 17:59:52 EST
(In reply to Garret Wilson from comment #77)
> (sigh) Nothing simple ever seems to work.
> 
> Today I saw that Eclipse offered me Eclipse Git Team Provider
> 4.1.1.201511131810-r and Java implementation of Git 4.1.1.201511131810-r I
> got excited thinking it now included .gitattributes support. So I tried to
> upgrade and got:
> 
> An error occurred while collecting items to be installed
> session context was:(profile=SDKProfile,
> phase=org.eclipse.equinox.internal.p2.engine.phases.Collect, operand=,
> action=).
> No repository found containing:
> osgi.bundle,org.sonatype.m2e.egit,0.14.0.201509090157
> No repository found containing:
> org.eclipse.update.feature,org.sonatype.m2e.egit.feature,0.14.0.201509090157
> 
> :(

looks like you have installed some m2e egit integration which restricts allowed egit versions, either you uninstall this m2e egit integration or you need to also add the update site for a newer version of it which is compatible with egit 4.1.1
Comment 81 Garret Wilson CLA 2015-11-19 18:54:31 EST
> looks like you have installed some m2e egit integration which restricts allowed
> egit versions ...

How can I tell what that would be?
Comment 82 Matthias Sohn CLA 2015-11-20 03:31:31 EST
(In reply to Garret Wilson from comment #81)
> > looks like you have installed some m2e egit integration which restricts allowed
> > egit versions ...
> 
> How can I tell what that would be?

- click "Help > Installation Details"
- select tab "Features" and check if you have a feature named "org.sonatype.m2e.egit.feature"

You probably need a newer version of that one, the latest version accepts all EGit/JGit versions between 3.0 and 5.0 (checked this by looking into the OSGi manifest it contains). You can install this feature in the following way (if m2e is already installed):

- click "Preferences > Maven > Discovery"
- click "Open catalog"
- install the m2e-egit

I don't know why m2e uses its own custom way to install extensions ...
Comment 83 Garret Wilson CLA 2015-11-28 14:22:04 EST
> check if you have a feature named "org.sonatype.m2e.egit.feature"

Yes; I had:
Maven SCM Handler for EGit	0.14.0.201504071521	org.sonatype.m2e.egit.feature.feature.group	Sonatype, Inc.
    Eclipse Git Team Provider	4.1.0.201509280440-r	org.eclipse.egit.feature.group	Eclipse EGit

> install the m2e-egit

I went through "Preferences > Maven > Discovery" as indicated, but when "m2e Team providers" was listed I was given only the option to cancel.

I checked for updates just now and I was given the following options:

Eclipse Git Team Provider	4.1.1.201511131810-r
Java implementation of Git	4.1.1.201511131810-r
Maven SCM Handler for EGit	0.14.0.201509090157

I installed "Eclipse Git Team Provider" and "Java implementation of Git" separately and restarted Eclipse. Then I tried to install the new version of "Maven SCM Handler for EGit". It failed again with:

An error occurred while collecting items to be installed
session context was:(profile=SDKProfile, phase=org.eclipse.equinox.internal.p2.engine.phases.Collect, operand=, action=).
No repository found containing: osgi.bundle,org.sonatype.m2e.egit,0.14.0.201509090157
No repository found containing: org.eclipse.update.feature,org.sonatype.m2e.egit.feature,0.14.0.201509090157

Come on, this is ridiculous. I need the "Maven SCM Handler for EGit" because I want to check out a Maven project directly from Git. (Surely this must be the most common configuration in existence! A project using Maven and Git---imagine that!!)

So the plugin that lets me check out a Maven project from Git is now incompatible with the Java implementation of Git on Eclipse??

Should I file a bug? Where?
Comment 84 Garret Wilson CLA 2015-11-28 14:28:32 EST
P.S. I understand that the Maven SCM issue is not the fault of JGit; sorry for taking this thread off topic. If someone could point me to where I file the Maven SCM bug, I would appreciate it. Thanks.
Comment 85 Matthias Sohn CLA 2015-12-07 08:13:39 EST
(In reply to Garret Wilson from comment #84)
> P.S. I understand that the Maven SCM issue is not the fault of JGit; sorry
> for taking this thread off topic. If someone could point me to where I file
> the Maven SCM bug, I would appreciate it. Thanks.

I think you can file the maven scm integration issue here
https://bugs.eclipse.org/bugs/enter_bug.cgi?product=m2e
Comment 86 Garret Wilson CLA 2015-12-23 15:50:13 EST
>> I just wanted to check and see what the latest status is.
> Next release is 4.2 planned for mid of december

Any news if this release went out, and if JGit now indeed supports .gitattributes?
Comment 87 Matthias Sohn CLA 2015-12-23 16:53:32 EST
(In reply to Garret Wilson from comment #86)
> >> I just wanted to check and see what the latest status is.
> > Next release is 4.2 planned for mid of december
> 
> Any news if this release went out, and if JGit now indeed supports
> .gitattributes?

current planned date for release review for 4.2 is Jan 6
https://projects.eclipse.org/projects/technology.jgit/releases/4.2/plan

.gitattributes support for clean and smudge filters will be available in 4.2, further enhancements are in review:
https://git.eclipse.org/r/#/c/60617
https://git.eclipse.org/r/#/c/60635
Comment 88 Garret Wilson CLA 2015-12-23 17:07:39 EST
> .gitattributes support for clean and smudge filters will be available in 4.2

I'm afraid I don't understand; I don't know what clean and smudge filters are.

The important question for me (and the most fundamental use of .gitattributes---and indeed for Git in general): If I commit and push a file from inside Eclipse using EGit, will JGit 4.2 process my EOL settings in accordance with my .gitattributes settings?
Comment 89 Matthias Sohn CLA 2015-12-23 17:35:09 EST
(In reply to Garret Wilson from comment #88)
> > .gitattributes support for clean and smudge filters will be available in 4.2
> 
> I'm afraid I don't understand; I don't know what clean and smudge filters
> are.

that's explained here
https://git-scm.com/book/en/v2/Customizing-Git-Git-Attributes#Keyword-Expansion

> The important question for me (and the most fundamental use of
> .gitattributes---and indeed for Git in general): If I commit and push a file
> from inside Eclipse using EGit, will JGit 4.2 process my EOL settings in
> accordance with my .gitattributes settings?

no, this is not yet ready, AFAIK Ivan is working on that
Comment 90 Garret Wilson CLA 2015-12-23 17:41:23 EST
> no, this is not yet ready, AFAIK Ivan is working on that

Ack. Sigh.

This means that if I have something like the following in my .gitattributes, EGit is guaranteed to screw it up one way or another---whether I'm on Windows, Linux, or Macintosh:

*.bat eol=crlf
*.sh eol=lf

Am I correct that EGit will corrupt the content in my Git repository with those .gitattributes settings (regardless of my autocrlf setting)? Isn't adding CRLF to a Linux shell script or using LF for a Windows batch file corrupting the content?

We're coming up on... five years! No, really---this bug was filed almost five years ago!

Please, please---can someone give this more priority?
Comment 91 Christian Halstrick CLA 2015-12-28 03:50:35 EST
We are adding support for .gitattributes currently. But as you can read in [1], a .gitattributes file can contain a lot of different stuff (filter definitions which modify content during checkin/checkout, merge drivers, the "ident" attribute for expansion of $Id$ during checkout, macros). We will not support everything, but the features you seem to be interested in ("eol=" or the "text" attribute) are under development now. You may check progress here [2]. Other features of .gitattributes like filters are already supported now.

To make sure we are on the right track and the feature we are implementing will help you I would like to understand your last comments. Maybe you found another bug we have not yet seen. You say that EGit screws up your files. I always thought that as long as EGit has not learned about .gitattributes and eol handling and as long as core.autocrlf=false then EGit will simply not modify any content during checkout/checkin. Files are checked in/out byte-by-byte as they are stored in the  repo or stored in the filesystem. If I checkin a .bat file with crlf it will be checked out with crlf everywhere. Is it this behavior that we checkout the files byte-by-byte in the way you checked them in which screws up your files? Or is EGit modifying the content in another way which screws up files? 


[1] http://git-scm.com/docs/gitattributes
[2] https://git.eclipse.org/r/#/c/60635
Comment 92 Oti Humbel CLA 2015-12-31 05:08:18 EST
(In reply to Christian Halstrick from comment #91)
> You say that EGit screws up your files. I
> always thought that as long as EGit has not learned about .gitattributes and
> eol handling and as long as core.autocrlf=false then EGit will simply not
> modify any content during checkout/checkin. Files are checked in/out
> byte-by-byte as they are stored in the  repo or stored in the filesystem. If
> I checkin a .bat file with crlf it will be checked out with crlf everywhere.
> Is it this behavior that we checkout the files byte-by-byte in the way you
> checked them in which screws up your files? Or is EGit modifying the content
> in another way which screws up files? 

We have many developers, some of them working on Windows, some of them working on Linux. Some of them use git from the command line, some of them use EGit. Our plan was to normalize line endings per repository by using one .gitattributes file (* text=auto). This works great for the command line users: on Linux they have Linux style line endings, on Windows they have Windows style line endings. But as soon as the EGit users came into play, we heard complaints from users 'destroying' each other's line endings. On one hand, this might be because we did not control the core.autocrlf settings of each user. On the other hand, some platform specific editors do their own magic with line endings, producing unexpected diffs.
Comment 93 Garret Wilson CLA 2015-12-31 10:44:14 EST
From Comment 91:

> You say that EGit screws up your files. I always thought that as long as EGit
> has not learned about .gitattributes and eol handling and as long as
> core.autocrlf=false then EGit will simply not modify any content during
> checkout/checkin.

That's where you are assuming things. Why assume we are using core.autocrlf=false? You're also assuming that no other tools are being used in the workflow.

It is notoriously difficult to ensure core.autocrlf=false across all developer machines. This is one of the problems .gitattributes was meant to solve---it allows us to control EOL handling consistently without relying on brittle environment settings.

Moreover even if we could ensure that core.autocrlf=false, there's no guarantee that other non-Git tools will obey this setting. In fact it's almost guaranteed that they will *not*. Which means the other tools may switch the line endings. If another text editor on a Windows machine, for example, creates a .txt file on Windows, it will use crlf---and congratulations, by using core.autocrlf=false you've just *guaranteed* that incorrect settings get into the repository.

Let's go back to my example on a system that has core.autocrlf=true:

*.bat eol=crlf
*.sh eol=lf

On a Windows machine EGit (if I understand the current situation correctly) will check in both a .bat and a .sh file using CRLF---which is incorrect for the .sh file. On a Linux machine EGit will check in both files using LF---which is incorrect for the .bat file.
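
For reference, the behaviour that Git's own gitattributes documentation describes for these two rules can be modelled roughly as follows. This is a simplified sketch and not JGit code: EolAttributeDemo, eolFor, toRepository and toWorkingTree are hypothetical names, and the suffix check stands in for real gitattributes pattern matching. Setting eol marks a path as text, so content is normalized to LF when it is added, and eol then selects the line ending written to the working tree on checkout.

```java
// Hypothetical model of "*.bat eol=crlf" and "*.sh eol=lf"; not part of JGit.
final class EolAttributeDemo {

    /** Line-ending style requested by an "eol=" attribute. */
    enum Eol { LF, CRLF }

    /** Simplified lookup: a suffix check instead of real pattern matching. Returns null if no rule applies. */
    static Eol eolFor(String path) {
        if (path.endsWith(".bat")) {
            return Eol.CRLF;
        }
        if (path.endsWith(".sh")) {
            return Eol.LF;
        }
        return null;
    }

    /** Check-in direction for a path with an eol rule: CRLF is normalized to LF. */
    static String toRepository(String workingTreeContent) {
        return workingTreeContent.replace("\r\n", "\n");
    }

    /** Checkout direction: LF becomes CRLF only when eol=crlf was requested. */
    static String toWorkingTree(String repositoryContent, Eol eol) {
        if (eol == Eol.CRLF) {
            // Normalize first so an existing CRLF is not turned into CR CR LF.
            return repositoryContent.replace("\r\n", "\n").replace("\n", "\r\n");
        }
        return repositoryContent;
    }

    public static void main(String[] args) {
        String stored = toRepository("@echo off\r\necho hi\r\n");
        System.out.println(toWorkingTree(stored, eolFor("build.bat")).contains("\r\n")); // true
        System.out.println(toWorkingTree(stored, eolFor("build.sh")).contains("\r\n"));  // false
    }
}
```

In this model the result depends only on the attribute and not on the platform or on core.autocrlf, which is the consistency being asked for here.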

But I'm just rehashing explanations given surely hundreds of times in various articles and conversations. This is a well-known problem---and one that .gitattributes was meant to solve. It provides a common configuration that all tools can use to provide consistent expected behavior. But it only works if all the tools in the chain play the game, and right now EGit is the tool in the chain fouling things up.
Comment 94 Christian Halstrick CLA 2016-01-11 03:08:46 EST
(In reply to Garret Wilson from comment #93)

Garret, I totally agree that having full support for line-ending handling in .gitattributes would be great and that EGit should support it. I gave links showing where we work on it. I also know these external tools/eclipse plugins which don't stick to the line endings they find in a file but always write their own fixed line-ending type or the current platform's default line ending. That will corrupt .sh files if you work on Windows. And correct line-ending handling with .gitattributes seems to me to be the ultimate solution.

> From Comment 91:
> 
> > You say that EGit screws up your files. I always thought that as long as EGit
> > has not learned about .gitattributes and eol handling and as long as
> > core.autocrlf=false then EGit will simply not modify any content during
> > checkout/checkin.
> 
> That's where you are assuming things. Why assume we are using
> core.autocrlf=false? You're also assuming that no other tools are being used
> in the workflow.

No, you get me wrong. I am not assuming that you have core.autocrlf set to false. I just wanted to double-check that it is not EGit which corrupts files when autocrlf is false, just to make sure we don't have another bug in EGit/JGit which corrupts files, because there have been multiple comments that EGit corrupts the files. But if I understand your comments correctly, you agree that when autocrlf is false EGit will not modify (corrupt?) the content.
And yes, I know that it is impossible to control autocrlf settings of each developer in a big project. And yes, tools which corrupt line endings will always exist and therefore we need good line-ending handling in EGit with .gitattributes. I agree to that.
Comment 95 Sebastien Bonami CLA 2016-01-15 18:10:33 EST
What's the best workaround to this right now? Thanks.
Comment 96 Anselm D. CLA 2016-01-20 10:00:14 EST
if clean and smudge filters are available in 4.2 for jgit, does this mean they are available for egit out of the box?
Comment 97 Matthias Sohn CLA 2016-01-20 10:29:27 EST
(In reply to Anselm D. from comment #96)
> if clean and smudge filters are available in 4.2 for jgit, does this mean
> they are available for egit out of the box?

yes, we implemented this to allow using the git-lfs extension with EGit.

The git filter needs to be configured in the usual way and EGit needs to be able
to find it on the PATH seen by Eclipse.
Comment 98 Anselm D. CLA 2016-01-22 03:54:29 EST
I am having difficulty running the clean and smudge filters.

I started with the JGit plugin nightly build for Eclipse Mars from 19 January, and it does not work.

So I am now using a JGit and EGit clone from 19 January and starting a debugging session from Eclipse.
As a test I added a ./jgit/.gitattributes file and, in addition, the configuration in git config.

For the test I did a replace from HEAD for a random file and stopped at my breakpoint:

DirCacheCheckout.checkoutEntry(Repository, DirCacheEntry, ObjectReader, boolean, String) line: 1239	

which is this in the source code
	public static void checkoutEntry(Repository repo, DirCacheEntry entry,
			ObjectReader or, boolean deleteRecursive,
			String smudgeFilterCommand) throws IOException {

The String smudgeFilterCommand is null. 

Ok, this is because it is called by 
public static void checkoutEntry(Repository repo, DirCacheEntry entry,
			ObjectReader or, boolean deleteRecursive) throws IOException {
   checkoutEntry(repo, entry, or, deleteRecursive, null);
}

This is the stack trace:
Thread [Worker-2] (Suspended (breakpoint at line 1239 in DirCacheCheckout))	
	DirCacheCheckout.checkoutEntry(Repository, DirCacheEntry, ObjectReader, boolean, String) line: 1239	
	DirCacheCheckout.checkoutEntry(Repository, DirCacheEntry, ObjectReader, boolean) line: 1199	
	CheckoutCommand.checkoutPath(DirCacheEntry, ObjectReader) line: 473	
	CheckoutCommand.access$3(CheckoutCommand, DirCacheEntry, ObjectReader) line: 471	
	CheckoutCommand$2.apply(DirCacheEntry) line: 464	
	DirCacheEditor.applyEdits() line: 175	
	DirCacheEditor.finish() line: 128	
	DirCacheEditor(BaseDirCacheEditor).commit() line: 273	
	DirCacheEditor.commit() line: 123	
	CheckoutCommand.checkoutPathsFromCommit(TreeWalk, DirCache, RevCommit) line: 468	
	CheckoutCommand.checkoutPaths() line: 406	
	CheckoutCommand.call() line: 204	
	DiscardChangesOperation.discardChanges(Repository, Collection<String>) line: 201	
	DiscardChangesOperation.discardChanges(IProgressMonitor) line: 162	
	DiscardChangesOperation.access$0(DiscardChangesOperation, IProgressMonitor) line: 150	
	DiscardChangesOperation$1.run(IProgressMonitor) line: 143	
	Workspace.run(IWorkspaceRunnable, ISchedulingRule, int, IProgressMonitor) line: 2241	
	DiscardChangesOperation.execute(IProgressMonitor) line: 146	
	DiscardChangesActionHandler$1.runInWorkspace(IProgressMonitor) line: 57	
	DiscardChangesActionHandler$1(InternalWorkspaceJob).run(IProgressMonitor) line: 39	
	Worker.run() line: 55	


Can anyone give me a hint? Wrong version, wrong configuration?
Comment 99 Matthias Sohn CLA 2016-01-22 04:30:12 EST
ok, let me describe the steps:

- Install git-lfs as described on its home page https://git-lfs.github.com/ 

- Configure the lfs filter in ~/.gitconfig

[filter "lfs"]
	required = true
	smudge = git-lfs smudge %f
	clean = git-lfs clean %f

- Open a terminal and run git-lfs to prove it's installed and available on the path

$ git-lfs
git-lfs/1.1.0 (GitHub; darwin amd64; go 1.5.1)
git lfs <command> [<args>]

Git LFS is a system for managing and versioning large files in
...

- Start Eclipse from the same terminal to ensure that Eclipse sees the same PATH as your terminal
- Upgrade JGit/EGit to 4.2.0 from http://download.eclipse.org/egit/updates (I just released it :-)
- Create an empty repository
- in order to track e.g. pdf files in lfs from the terminal cd to the repository's root path and run
$ git lfs track *.pdf
- this creates .gitattributes
- Commit this change using EGit
- copy some pdf file into your project
- add it to the index using EGit
- the blob for this file should now be stored under .git/lfs, e.g.:
$ find .git/lfs/
.git/lfs/
.git/lfs//objects
.git/lfs//objects/1e
.git/lfs//objects/1e/d5
.git/lfs//objects/1e/d5/1ed5d8c45c2e3cef22467d305ae1116047ef9d953aabfa2b0015f60218bd2723

does this work for you ?
Comment 100 Anselm D. CLA 2016-01-22 04:58:56 EST
(In reply to Matthias Sohn from comment #99)

> 
> - Install git-lfs as described on its home page https://git-lfs.github.com/ 
...
> does this work for you ?

I will try it, thank you. So git-lfs is required to use the smudge  and clean filters or is it an example for smudge and clean?
Comment 101 Matthias Sohn CLA 2016-01-22 05:27:52 EST
(In reply to Anselm D. from comment #100)
> (In reply to Matthias Sohn from comment #99)
> 
> > 
> > - Install git-lfs as described on its home page https://git-lfs.github.com/ 
> ...
> > does this work for you ?
> 
> I will try it, thank you. So git-lfs is required to use the smudge  and
> clean filters or is it an example for smudge and clean?

JGit does support invoking smudge and clean filters now but doesn't have any filter implementation currently. I.e. it now supports this generic git extension mechanism [1]
but it doesn't provide an implementation for such an extension.

git-lfs is one such extension implementation extending git to allow storing large objects in a different storage than the git repository itself. It does so by extending "git add" (triggered by the configured clean filter) and "git checkout" (triggered by the configured smudge filter).

You could also implement other clean/smudge filters which e.g. encrypt/decrypt the file content when it's stored/retrieved in git. 

[1] described in https://git-scm.com/book/en/v2/Customizing-Git-Git-Attributes#Keyword-Expansion
Comment 102 Christian Halstrick CLA 2016-01-22 08:06:43 EST
Here is an example demonstrating how jgit supports e.g. clean filters. This stupid filter replaces 'a' with 'x' during add (only the first 'a' on each line, since the sed expression has no g flag).

[~/tmp]$ curl 'https://repo.eclipse.org/content/groups/releases//org/eclipse/jgit/org.eclipse.jgit.pgm/4.2.0.201601211800-r/org.eclipse.jgit.pgm-4.2.0.201601211800-r.sh' -o jgit.sh -s
[~/tmp]$ mkdir test
[~/tmp]$ cd test
[~/tmp/test]$ ../jgit.sh init
Initialized empty Git repository in ...
[~/tmp/test (master)]$ git config filter.a2b.clean 'sed s/a/x/'
[~/tmp/test (master)]$ echo "*.txt filter=a2b" >.gitattributes
[~/tmp/test (master)]$ ../jgit.sh add .gitattributes
[~/tmp/test (master)]$ echo "abcdefg" >bob.txt
[~/tmp/test (master)]$  ../jgit.sh add bob.txt
[~/tmp/test (master)]$ rm bob.txt
[~/tmp/test (master)]$ git checkout -- bob.txt
[~/tmp/test (master)]$ cat bob.txt
xbcdefg
Comment 103 Anselm D. CLA 2016-01-22 09:31:53 EST
(In reply to Christian Halstrick from comment #102)
> Here is an example demonstrating how jgit supports e.g. clean filters. This
> stupid filter replaces all 'a' by 'x' during add.
> 
> [~/tmp]$ curl
> 'https://repo.eclipse.org/content/groups/releases//org/eclipse/jgit/org.
> eclipse.jgit.pgm/4.2.0.201601211800-r/org.eclipse.jgit.pgm-4.2.0.
> 201601211800-r.sh' -o jgit.sh -s
> [~/tmp]$ mkdir test
> [~/tmp]$ cd test
> [~/tmp/test]$ ../jgit.sh init
> Initialized empty Git repository in ...
> [~/tmp/test (master)]$ git config filter.a2b.clean 'sed s/a/x/'
> [~/tmp/test (master)]$ echo "*.txt filter=a2b" >.gitattributes
> [~/tmp/test (master)]$ ../jgit.sh add .gitattributes
> [~/tmp/test (master)]$ echo "abcdefg" >bob.txt
> [~/tmp/test (master)]$  ../jgit.sh add bob.txt
> [~/tmp/test (master)]$ rm bob.txt
> [~/tmp/test (master)]$ git checkout -- bob.txt
> [~/tmp/test (master)]$ cat bob.txt
> xbcdefg

Thank you, your git-lfs example works as far as the test you gave me goes. But I am at the first step of my own example: I would like to call one shell script on Windows for clean and one for smudge. I do not have a real implementation for the scripts yet.

But whatever I do, the clean filter is called, but not the smudge filter. Even in your example, when I monitored the system I found a call to git-lfs clean %f but none to git-lfs smudge %f.

So, first question: if I do a "replace revision" "with head" on the pdf file in Eclipse EGit, I thought it should call the smudge filter?
Comment 104 Anselm D. CLA 2016-01-22 09:43:14 EST
Sorry i did it wrong #103 was
(In reply to Matthias Sohn from comment #101)
> (In reply to Anselm D. from comment #100)
> > (In reply to Matthias Sohn from comment #99)
Comment 105 Anselm D. CLA 2016-01-26 03:00:50 EST
(In reply to Anselm D. from comment #103)
> (In reply to Christian Halstrick from comment #102)
> > [~/tmp/test (master)]$ git config filter.a2b.clean 'sed s/a/x/'
> ...
> > [~/tmp/test (master)]$ git checkout -- bob.txt
> > [~/tmp/test (master)]$ cat bob.txt


Hi Christian,

I have some difficulty understanding it.

As I understand it (see Git - Git Attributes
https://git-scm.com/book/en/v2/Customizing-Git-Git-Attributes)

"Figure 8-2. The “smudge” filter is run on checkout."
"Figure 8-3. The “clean” filter is run when files are staged."

In your example the clean filter is used with a checkout and not the smudge filter. I am confused. Can you please give me a hint?
Comment 106 Anselm D. CLA 2016-01-26 03:38:10 EST
OK, I understand now: the add calls the clean filter.
Should the checkout call the smudge filter?
Comment 107 Christian Halstrick CLA 2016-01-26 04:54:25 EST
Right. When you transfer new content from the filesystem to the git repo (as it is done during "git add") we call the clean filter. When content is retrieved from the git repo and written to the filesystem then we call smudge. "git checkout" will call the smudge filter on the files which have to be updated.
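
The two directions can be modelled with a small sketch. It is only an illustration of the rule above and has nothing to do with JGit's actual classes: FilteredStore and its methods are hypothetical, and an in-memory map stands in for the repository. The two lambdas in main stand in for filter commands that replace 'a' with 'x' on the way in and 'x' with 'y' on the way out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical in-memory model of clean (on add) and smudge (on checkout).
final class FilteredStore {
    private final Map<String, String> blobs = new HashMap<>();
    private final UnaryOperator<String> clean;
    private final UnaryOperator<String> smudge;

    FilteredStore(UnaryOperator<String> clean, UnaryOperator<String> smudge) {
        this.clean = clean;
        this.smudge = smudge;
    }

    /** "add": filesystem -> repository, so the clean filter runs. */
    void add(String path, String workingTreeContent) {
        blobs.put(path, clean.apply(workingTreeContent));
    }

    /** The content exactly as the repository holds it, with no filter applied. */
    String stored(String path) {
        return blobs.get(path);
    }

    /** "checkout": repository -> filesystem, so the smudge filter runs. */
    String checkout(String path) {
        return smudge.apply(blobs.get(path));
    }

    public static void main(String[] args) {
        FilteredStore repo = new FilteredStore(
                s -> s.replace('a', 'x'),   // stand-in for a clean command
                s -> s.replace('x', 'y'));  // stand-in for a smudge command
        repo.add("bob.txt", "abcdefg");
        System.out.println(repo.stored("bob.txt"));   // xbcdefg
        System.out.println(repo.checkout("bob.txt")); // ybcdefg
    }
}
```

If no smudge filter is configured, the checkout simply returns the stored content, which is why a file added through a clean-only filter comes back in its cleaned form.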
Comment 108 Anselm D. CLA 2016-01-26 05:59:43 EST
(In reply to Christian Halstrick from comment #107)
> Right. When you transfer new content from the filesystem to the git repo (as
> it is done during "git add") we call the clean filter. When content is
> retrieved from the git repo and written to the filesystem then we call
> smudge. "git checkout" will call the smudge filter on the files which have
> to be updated.

OK, this test with jgit works.
I added
 git config filter.a2b.smudge 'sed s/x/y/'

and after the checkout the result is:
$ cat bob.txt
ybcdefg

Nevertheless I am fighting with EGit: 'sed s/a/x/' does not give the correct result. I put cygwin in the system path, so sed should be callable.

How does the filter work? Reading from STDIN and writing to STDOUT?
Comment 109 Matthias Sohn CLA 2016-01-26 06:57:00 EST
(In reply to Anselm D. from comment #108)
> (In reply to Christian Halstrick from comment #107)
> > Right. When you transfer new content from the filesystem to the git repo (as
> > it is done during "git add") we call the clean filter. When content is
> > retrieved from the git repo and written to the filesystem then we call
> > smudge. "git checkout" will call the smudge filter on the files which have
> > to be updated.
> 
> OK, this test with jgit works. 
> i add
>  git config filter.a2b.smudge 'sed s/x/y/'
> 
> and after the checkout the result is:
> $ cat bob.txt
> ybcdefg
> 
> Nevertheless i am fighting with egit 'sed s/a/x/ does not give the correct
> result. I put cygwin in the system path, so sed should be callable. 
> 
> How does the filter works? Reading from STDIN and writing to STDOUT?

yes, it reads from stdin and writes to stdout, the relative path of the file to run through the filter is passed as an argument
Comment 110 Anselm D. CLA 2016-01-26 07:19:55 EST
(In reply to Matthias Sohn from comment #109)
> yes, it reads from stdin and writes to stdout, the relative path of the file
> to run through the filter is passed as an argument

It does not work as I expect. To keep it simple, I defined a filter that only writes to stdout:
echo hello 
The result is an unchanged file.


echo hello > %f
This one gives
hello
as a result, which I can understand, but is this the way it should be defined?
Comment 111 Garret Wilson CLA 2016-01-26 07:35:07 EST
Almost five years ago this issue was opened so that JGit could support the EOL processing in .gitattributes so as not to corrupt our files and foul up our Git workflow.

At some point along the way someone said they fixed it. And then someone said that it was thoroughly tested. I even provided a sample and they said it passed with flying colors.

Now everyone is ignoring the EOL issue and pretending it never happened. They are talking about something that frankly I've never used and no doubt it is useful for some people but it wasn't what this bug was opened for.

The EOL issue has been dropped, and I don't know why, especially when it was supposedly fixed. I fear it will never be brought back.

Wouldn't it be better to open a separate issue for this other tangent, and finish fixing the issue for which this bug was opened, namely correct processing of EOLs based upon the configuration in .gitattributes?
Comment 112 Christian Halstrick CLA 2016-01-26 08:20:11 EST
This bug is named "support gitattributes" and I created this bug (5 years ago!) to track the handling of gitattributes support. Correct EOL handling is one aspect which can be solved by gitattributes. But clean/smudge filters are also inside gitattributes and therefore Anselm's questions regarding this feature are discussed here.

But you are right: we should not discuss both topics in one bug. I'll create a new bug regarding support for clean and smudge filters so we can separate these discussions.
Comment 113 Christian Halstrick CLA 2016-01-26 08:33:46 EST
Created a new bug 486560 for discussing clean/smudge filters in gitattributes. Anselm, further questions about this topic should go there.
Comment 114 Garret Wilson CLA 2016-01-26 08:39:34 EST
OK. So here we are. Almost five years later. Back right where we started.

JGit does not support EOL handling based upon .gitattributes, which is the standard and recommended approach.

Because of this problem, certain configurations are *guaranteed* to corrupt data, as I have pointed out in this bug.

I have offered to devote resources to fixing this, and asked where I should start looking. No one gave me any pointers. Then someone said that it was fixed anyway. I provided a test case. They said it passed. Then I was told that it would be in the next release.

Then it became clear that it wasn't really going to be in the next release, and nobody now even seems to know what happened to the fix that was supposedly already implemented and passing the tests. It's like we rolled back five years and haven't moved an inch.

So where are we on this central Git feature? Do we even know?
Comment 115 Christian Halstrick CLA 2016-01-26 09:27:40 EST
Current work is at https://git.eclipse.org/r/#/c/60635 and predecessors. This waits for reviews, also from me. Shame on me. Vacation and other job topics didn't leave room for that review. But there is light at the end of the tunnel.

Ivan, I think you have implementations for text attribute and binary macro based on your open changes. If you could propose them Garret would have a chance to test it in his environment. Since Garret seems to have complicated setup and wants to help that would speed up things. Any chances?
Comment 116 Ivan Motsch CLA 2016-01-26 09:36:40 EST
Live Tests:
Well, the one thing that can be used for testing right now is the binary patch I attached on 30 Oct 2015. This is what I use myself for combined C#, VBScript and Java development. These binary patched plugins can just be added to the Eclipse IDE in the dropins folder. Just note that these are for proof-of-concept purposes only.

Development:
Since Christian asked me to split up this feature into smaller parts I created
Comment 117 Garret Wilson CLA 2016-01-26 09:38:31 EST
> Garret would have a chance to test it in his environment

I've already provided tests and I was told they passed.

> Garret seems to have complicated setup

I don't have a complicated setup! I have a simple, typical setup! That's why it's such a shame that five years on JGit can't "git with program" and support even simple, typical .gitattributes configurations!!
Comment 118 Ivan Motsch CLA 2016-01-26 09:39:00 EST
...I created multiple Gerrit changes. These have been waiting for review and approval since Jan 07, 2016. Maybe Garret can also take a look at those.
Comment 119 Ivan Motsch CLA 2016-01-26 09:41:54 EST
(In reply to Garret Wilson from comment #117)
> > Garret would have a chance to test it in his environment
> 
> I've already provided tests and I was told they passed.
> 
> > Garret seems to have complicated setup
> 
> I don't have a complicated setup! I have a simple, typical setup! That's why
> it's such a shame that five years on JGit can't "git with program" and
> support even simple, typical .gitattributes configurations!!

Hi Garret, just to let you know, I am not a JGit committer, but a Scout committer. Since this missing feature affects many of us, I decided to help by contributing code. Please help Christian and Co. to review this code so it can be merged asap.

The gerrits are
https://git.eclipse.org/r/#/c/60617/
https://git.eclipse.org/r/#/c/60635/
Comment 120 Garret Wilson CLA 2016-01-26 09:44:11 EST
I'll be happy to look into this, but I don't know how helpful it will be, because I've never seen JGit code before and don't know anything about its architecture. When I was trying to commit resources to this problem last year I was told it was fixed, so I devoted resources elsewhere. But I'll look at the links you sent.
Comment 121 Ivan Motsch CLA 2016-01-26 09:50:23 EST
(In reply to Christian Halstrick from comment #115)
> Current work is at https://git.eclipse.org/r/#/c/60635 and predecessors.
> This waits for reviews also from me. Shame on me. Vacation and other job
> topics didn't leave room for that review. But there is light at the end of
> the tunnel.

I think we are on a good path to a solution. Let's keep going, it's not much further to go :-)
Comment 122 Christian Halstrick CLA 2016-01-26 10:06:01 EST
As usual when developing a new feature it's always very helpful when a user with a real-life setup tests the feature. I think that Ivan is testing this already in productive(?) environments. Garret, I think it would be helpful if you could also test the patched plugins (as they are attached to this bug) with your project. Sharing your .gitattributes is ok, but testing it with your project content would be better. Maybe you could also check that checkout/status/add doesn't have performance surprises in your setup.
Comment 123 Ivan Motsch CLA 2016-01-26 10:12:41 EST
The binary patched jars on this bug will be slower than the "real" JGit, as the Linux tests showed. I notice that in my setup too: some seconds slower.
So I am also hoping (and working) for the better solution we are creating at the moment with the Gerrit parts.
Comment 124 Anselm D. CLA 2016-02-12 06:58:23 EST
(In reply to Matthias Sohn from comment #99)
> ok, let me describe the steps:
> 
> - Install git-lfs as described on its home page https://git-lfs.github.com/ 
> 
> - Configure the lfs filter in ~/.gitconfig
> 
> [filter "lfs"]
> 	required = true
> 	smudge = git-lfs smudge %f
> 	clean = git-lfs clean %f
> 
> - Open a terminal and run git-lfs to prove it's installed and available on
> the path
> 
> $ git-lfs
> git-lfs/1.1.0 (GitHub; darwin amd64; go 1.5.1)
> git lfs <command> [<args>]
> 
> Git LFS is a system for managing and versioning large files in
> ...
> 
> - Start Eclipse from the same terminal to ensure that Eclipse sees the same
> PATH as your terminal
> - Upgrade JGit/EGit to 4.2.0 from http://download.eclipse.org/egit/updates
> (I just released it :-)
> - Create an empty repository
> - in order to track e.g. pdf files in lfs from the terminal cd to the
> repository's root path and run
> $ git lfs track *.pdf
> - this creates .gitattribute
> - Commit this change using EGit
> - copy some pdf file into your project
> - add it to the index using EGit
> - the blob for this file should now be stored under .git/lfs, e.g.:
> $ find .git/lfs/
> .git/lfs/
> .git/lfs//objects
> .git/lfs//objects/1e
> .git/lfs//objects/1e/d5
> .git/lfs//objects/1e/d5/
> 1ed5d8c45c2e3cef22467d305ae1116047ef9d953aabfa2b0015f60218bd2723
> 
> does this work for you ?

This example only calls a clean filter, not a smudge filter, is my interpretation correct?
Comment 125 Garret Wilson CLA 2016-02-12 08:18:32 EST
> This example only calls a clean filter, not a smudge filter, is my interpretation correct?

OK, now I'm confused again. I thought that in Comment 113 clean/smudge filter discussions had been moved to Bug 486560; that this discussion here should be solely for supporting EOL handling in .gitattributes; and that everyone was instructed to move clean/smudge filter handling to Bug 486560. (One more huge shortcoming of Bugzilla is that you can't update the description.)
Comment 126 Matthias Sohn CLA 2016-02-12 08:28:04 EST
(In reply to Garret Wilson from comment #125)
> (One more huge shortcoming of Bugzilla is that you can't update the
> description.)

committers can edit the description, what's your proposal how to change it
Comment 127 Garret Wilson CLA 2016-02-12 08:34:08 EST
> committers can edit the description, what's your proposal how to change it

Frankly I always thought that the description was clear that the .gitattributes support was related to "mark[ing] certain files as 'binary'" and "files with specific file endings". But in Comment 112 Christian seemed to think the description covered "clean/smudge filters".

So whatever change you want to make to the description that makes it clear to others that Bug 342372 is related solely to binary/text EOL support in .gitattributes would be fine. Whatever it takes to keep this bug focused so that maybe someday it will be finished.
Comment 128 Garret Wilson CLA 2016-02-20 00:08:38 EST
> Please help Christian and Co. to review this code so it can be merged asap.

I looked at the two gerrits indicated. As I mentioned I am completely unfamiliar with the codebase, and I wasn't sure of the context of much of the code. I nonetheless found StreamTypeUtil.java, which appears to have something to do with detecting line endings. I added several comments that most probably just betrayed my ignorance of the current code structure. I'm still unsure where the code lies that determines a specific file's EOL specification in .gitattributes based upon the file extension.

I did however find what appears to be a typo that could result in a significant bug in StreamTypeUtil.java:

    String eol = attrs.getValue("eol"); //$NON-NLS-1$
    if (eol != null && "crlf".equals(eol)) //$NON-NLS-1$
      return StreamType.TEXT_LF; //BUG? shouldn't this be TEXT_CRLF?
    if (eol != null && "lf".equals(eol)) //$NON-NLS-1$
      return StreamType.TEXT_LF;

Otherwise I'm completely lost in the code. If you deploy it in a plugin beta, though, I promise I'll update my Eclipse and test it with a few files.

(I don't understand why this bug is poking along. It's still marked as NEW, for crying out loud!)

Let me know what else I can do.
Comment 129 Ivan Motsch CLA 2016-02-22 06:21:24 EST
Thanks for the hint, i verified that.

The existing code is correct. When checking IN to git with the eol=crlf attribute, all CRLF files are converted to LF upon check-in.
When checked OUT they are converted back to CRLF.

When using the 'binary' option the file is checked in 'as is'.

The reason is that the "eol=crlf" option only specifies the line ending format of checked-out files. Inside git they are all normalized/canonicalized to LF.
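
To make the check-in/check-out asymmetry easier to follow, here is a minimal Java sketch of how an eol attribute value could map to a conversion in each direction. It is not the JGit source: the class, the enum and the method names are hypothetical, and files without an eol attribute are simply passed through, which is a simplification.

```java
/** Sketch of eol handling per direction; hypothetical names, not the JGit code. */
final class EolDirectionSketch {
    enum StreamType { TEXT_LF, TEXT_CRLF, DIRECT }

    /** Check-in: text is normalized to LF in the repository for both eol values. */
    static StreamType checkInType(String eol) {
        if ("crlf".equals(eol) || "lf".equals(eol)) {
            return StreamType.TEXT_LF;
        }
        return StreamType.DIRECT; // simplification: no eol attribute, no conversion
    }

    /** Check-out: the eol value selects the working-tree line ending. */
    static StreamType checkOutType(String eol) {
        if ("crlf".equals(eol)) {
            return StreamType.TEXT_CRLF;
        }
        if ("lf".equals(eol)) {
            return StreamType.TEXT_LF;
        }
        return StreamType.DIRECT;
    }

    /** Applies the chosen conversion to in-memory content. */
    static String convert(String content, StreamType type) {
        switch (type) {
            case TEXT_LF:
                return content.replace("\r\n", "\n");
            case TEXT_CRLF:
                // Normalize first so existing CRLF pairs are not doubled.
                return content.replace("\r\n", "\n").replace("\n", "\r\n");
            default:
                return content;
        }
    }
}
```

Under these assumptions, returning TEXT_LF for eol=crlf is the expected result on the check-in path, while the check-out path is where CRLF appears.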
Comment 130 Garret Wilson CLA 2016-02-27 13:52:13 EST
> Thanks for the hint, i verified that.

So... when is this going to be ready for the public?

We'll soon hit the five year mark this bug has been open. And even if it were finished today, it would still be falling behind with the introduction of the new Git Large File Storage (LFS) effort jointly supported by GitHub and BitBucket, which leverages .gitattributes:

https://git-lfs.github.com/
https://confluence.atlassian.com/bitbucketserver/git-large-file-storage-794364846.html

With all this trouble (with no end in sight) coaxing EGit to use .gitattributes for simple things like EOL configuration, it's hard for me to imagine that EGit would ever get around to implementing LFS, and especially not with .gitattributes support. It's a pity all around.
Comment 131 Matthias Sohn CLA 2016-02-27 18:43:00 EST
(In reply to Garret Wilson from comment #130)
> > Thanks for the hint, i verified that.
> 
> So... when is this going to be ready for the public?
> 
> We'll soon hit the five year mark this bug has been open. 

and what did you do to fix it other than complaining that it's not done yet ?

> And even if it
> were finished today, it would still be falling behind with the introduction
> of the new Git Large File Storage (LFS) effort jointly supported by GitHub
> and BitBucket, which leverages .gitattributes:
> 
> https://git-lfs.github.com/
> https://confluence.atlassian.com/bitbucketserver/git-large-file-storage-
> 794364846.html
> 
> With all this trouble (with no end in sight) coaxing EGit to use
> .gitattributes for simple things like EOL configuration, it's hard for me to
> imagine that EGit would ever get around to implementing LFS, and especially
> not with .gitattributes support. It's a pity all around.

The attributes support needed for LFS integration is available since 4.2

I understood EOL attribute support is important for you so why don't you start helping Ivan at least by testing his changes in review ?

I think you should consider to stop complaining and use a more polite tone when talking to those doing the work.
Comment 132 Garret Wilson CLA 2016-02-27 19:08:27 EST
> and what did you do to fix it other than complaining that it's not done yet ?

I could talk about how I offered to devote developers to this (Comment 44), provided a real-life test case for use in verifying the code (Comment 56), provided guidance on implementation (Comment 61), provided clarification on how the lack of .gitattributes support can corrupt files (Comment 90, Comment 93), and performed code reviews as requested by Ivan (Comment 128). But further reply to your comment will only further an argumentative thread that detracts from the main issue: EGit still does not support .gitattributes almost five years after this was filed.

> I understood EOL attribute support is important for you so why don't you start helping Ivan at least by testing his changes in review ?

I stand ready to help Ivan. Ivan, please provide me instructions on how I can acquire and install the latest changes. Please also let me know what is blocking this from going forward, and how I can help.
Comment 133 Matthias Sohn CLA 2016-02-27 19:50:46 EST
In order to prepare testing changes in review:

- Clone the jgit and egit repositories [1] and fetch the change from Gerrit you want to test [2].

- You can run EGit by starting another Eclipse workbench in the debugger from Eclipse [3].

- Or build jgit and egit using Maven [4] and install EGit from
org.eclipse.egit.repository/target/repository

[1] https://wiki.eclipse.org/EGit/Contributor_Guide#Repositories
[2] https://wiki.eclipse.org/EGit/User_Guide#Fetching_a_change_from_a_Gerrit_Code_Review_Server
[3] https://wiki.eclipse.org/EGit/Contributor_Guide#Development_IDE_Configuration
[4] https://wiki.eclipse.org/EGit/Contributor_Guide#Maven_Build_Sequence
Comment 134 Garret Wilson CLA 2016-02-28 10:41:38 EST
Thanks for that very useful info,  Matthias.

I have created an entire test suite for this bug, located at the following repository:

https://bitbucket.org/garretwilson/jgit-bug-342372-test-resources

It allows you to test all combinations of text/binary files; those defined in .gitattributes and those that are not; and those that should have platform-specific line endings and those should have a specific line ending on all platforms.

I have one of my developers working on this now to do the initial tests on an LF-based platform.

Feel free to fork the project and do your own tests. Complete installation and test instructions are provided on the front page of the project and in the readme.md file.
Comment 135 Matthias Sohn CLA 2016-02-28 16:49:20 EST
(In reply to Garret Wilson from comment #134)
> Thanks for that very useful info,  Matthias.
> 
> I have created an entire test suite for this bug, located at the following
> repository:
> 
> https://bitbucket.org/garretwilson/jgit-bug-342372-test-resources
> 
> It allows you to test all combinations of text/binary files; those defined
> in .gitattributes and those that are not; and those that should have
> platform-specific line endings and those should have a specific line ending
> on all platforms.

could you convert this into a unit test and contribute it to jgit ?

> I have one of my developers working on this now to do the initial tests on
> an LF-based platform.

thanks, this helps

step 4. isn't necessary since the local build will have a higher timestamp
than the version installed so that p2 will decide to upgrade EGit to the local build

> Feel free to fork the project and do your own tests. Complete installation
> and test instructions are provided on the front page of the project and in
> the readme.md file.

I'll try that on Mac
Comment 136 Matthias Sohn CLA 2016-02-28 17:34:08 EST
I executed your test on Mac:

- when using jgit and egit current master test fails as expected, test.bat has LF line endings instead of CRLF
- test succeeds with https://git.eclipse.org/r/#/c/60635/ rebased onto master
- test succeeds with changes https://git.eclipse.org/r/#/c/60635/ and https://git.eclipse.org/r/#/c/67085/ rebased onto master
Comment 137 Garret Wilson CLA 2016-02-28 17:43:24 EST
I just had an additional thought: to really make sure everything is working, I suppose the tester would (in addition to the existing instructions) need to:

1. Clone the repository yet again and verify the line endings.
2. *Convert* all the text file line endings to their opposite (i.e. LF->CRLF; CRLF->LF).
3. Commit the changes and push.
4. Clone the repository into yet another directory.
5. Verify that the line endings are correct.

This last set of instructions is needed to make sure that EGit will *correct* incorrect line endings when doing a commit.
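
For the "verify" and "convert" steps above, a small helper can save some manual inspection. The following is only a sketch under simple assumptions (whole file read into memory, no binary detection); the class and method names are invented for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Reports and flips line endings; a test helper sketch, not part of JGit/EGit. */
final class LineEndingCheck {
    enum Eol { LF, CRLF, MIXED, NONE }

    /** Classifies the line endings found in the text. */
    static Eol detect(String text) {
        int crlf = 0;
        int bareLf = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) != '\n') {
                continue;
            }
            if (i > 0 && text.charAt(i - 1) == '\r') {
                crlf++;
            } else {
                bareLf++;
            }
        }
        if (crlf == 0 && bareLf == 0) {
            return Eol.NONE;
        }
        if (crlf > 0 && bareLf > 0) {
            return Eol.MIXED;
        }
        return crlf > 0 ? Eol.CRLF : Eol.LF;
    }

    /** Converts LF to CRLF and CRLF to LF; mixed or single-line text is returned as is. */
    static String flip(String text) {
        switch (detect(text)) {
            case LF:
                return text.replace("\n", "\r\n");
            case CRLF:
                return text.replace("\r\n", "\n");
            default:
                return text;
        }
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            // ISO-8859-1 maps every byte to one char, so CR and LF bytes survive decoding.
            String text = new String(Files.readAllBytes(Paths.get(name)), StandardCharsets.ISO_8859_1);
            System.out.println(name + ": " + detect(text));
        }
    }
}
```

Running it with the paths of the test files as arguments after each clone prints one classification per file.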
Comment 138 Garret Wilson CLA 2016-03-01 00:11:57 EST
I just spent an hour-long screen-sharing session with my developer Salvadore Jefferson. He uses a Mac (LF line endings) and had gone through all the instructions from Matthias for compiling and installing EGit with the indicated patches.

As you know I created a Bitbucket repository with all four combinations of text files: declared as using the platform line ending, declared as using LF, declared as using CRLF, and lastly, not declared in .gitattributes at all (to be auto-detected as text and to use the platform line ending).

1. We cloned the repository using EGit; all line endings were as expected.
2. We modified all text files, pushed, and cloned to another directory; all line endings were as expected.
3. We modified all text files, but then we switched the line endings to their opposites to see if EGit would correct them (as the official Git implementation does). We pushed and cloned to another directory; all line endings were as expected.
4. In the step above I wasn't sure if EGit was correcting the line endings on the way up (so that they would be normalized in the remote repository) or on the way down based upon .gitattributes as happens in step 1 (in which case we would have corrupted files in the remote repository, even if they get corrected on the way down). So I went straight to Bitbucket and downloaded a .zip file of the literal contents of the remote repository. All line endings were as expected; all files were in normalized LF form, except for the .bat file which indicated CRLF in .gitattributes.

(The last step was important, to make sure that the line endings get corrected and stored correctly in normalized form on the remote repository.)

So on an LF-EOL machine such as Linux or Mac, everything functions exactly as expected.

I have not yet tested on a CRLF-EOL machine. I will try my best to do those tests personally, although it may have to wait a few days because of my multitude of projects.
Comment 139 Christian Halstrick CLA 2016-03-01 03:42:47 EST
That is great news. I also spent a lot of time in the last days reviewing the code, and Ivan was quick to react to the comments. We are especially improving the unit tests. But manual tests with EGit done by end-users and not the developers are very, very helpful.

Some aspects are not yet fully looked at (e.g. big (>65k) files, attribute changes when switching branches, dirty worktrees/index (gitattributes in worktree differ from index and/or HEAD)). But these are smaller topics which can be covered fast.
Comment 140 Garret Wilson CLA 2016-03-02 11:01:51 EST
So now I'm going through Matthias' instructions so that I can test this personally on my Windows machine. I have to say that the online Gerrit instructions are exceedingly confusing, especially for those of us who have never heard of Gerrit until now. The best I can figure out is that apparently each change has a revision number, and that we have to go somewhere and find this revision number? (This was not clear at all from the links e.g. https://git.eclipse.org/r/#/c/60617/ given to me.) And which revision number do we use---the latest? I'm sure this would be completely clear to someone who has been using Gerrit...

Anyway about the only way I was able to go forward was to follow the tip of going to the web site and copying the download/checkout "git fetch git://git.eclipse.org/gitroot/jgit/jgit refs/changes/17/60617/11 && git checkout FETCH_HEAD" string, which populates the "Fetch from Gerrit" dialog.

But then what do I do? Apparently this will check out a FETCH_HEAD... but do I then merge this into master? Matthias gave me two change URLs, so I guess I would need to do the same to both of them. (My developer ran into the same issue and he said he was only able to go forward by merging.)

Any background information would be useful. All this Gerrit stuff is new to our team---and the explanations assume a lot of knowledge about Gerrit and its workflow(s). Thank you very much.
Comment 141 Garret Wilson CLA 2016-03-03 15:31:29 EST
> Apparently this will check out a FETCH_HEAD... but do I then merge this into master?

Could someone please clarify the workflow for integrating these Gerrit changes?
Comment 142 Matthias Sohn CLA 2016-03-03 16:49:20 EST
(In reply to Garret Wilson from comment #140)
> So now I'm going through Matthias' instructions so that I can test this
> personally on my Windows machine. I have to say that the online Gerrit
> instructions are exceedingly confusing, especially for those of us who have
> never heard of Gerrit until now. The best I can figure out is that
> apparently each change has a revision number, and that we have to go
> somewhere and find this revision number? (This was not clear at all from the
> links e.g. https://git.eclipse.org/r/#/c/60617/ given to me.) And which
> revision number do we use---the latest? I'm sure this would be completely
> clear to someone who has been using Gerrit...

this is described in [1]. E.g. if you want to fetch
https://git.eclipse.org/r/#/c/60635/
into a new local branch, you just copy the number in the URL to the clipboard (if you are using an EGit nightly build you can also just copy the complete URL or any other URL pointing into the change details, e.g. https://git.eclipse.org/r/#/c/60635/20/org.eclipse.jgit.test/tst/org/eclipse/jgit/api/EolStreamTypeUtilTest.java),
then select the repository this change belongs to, e.g. in the Repositories view,
and click "Fetch from Gerrit...". If you are using the latest nightly build, EGit will automatically start content assist, which will show all available patchsets for the chosen change. Select the one you want (most often the latest one with the highest number) and click OK. This will fetch the selected change/patchset and create a new local branch for it.

Do the same for the second change

> Anyway about the only way I was able to go forward was to follow the tip of
> going to the web site and copying the download/checkout "git fetch
> git://git.eclipse.org/gitroot/jgit/jgit refs/changes/17/60617/11 && git
> checkout FETCH_HEAD" string, which populates the "Fetch from Gerrit" dialog.
> 
> But then what do I do? Apparently this will check out a FETCH_HEAD... but do
> I then merge this into master? Matthias gave me two change URLs, so I guess I
> would need to do the same to both of them. (My developer ran into the same
> issue and he said he was only able to go forward by merging.)
> 
> Any background information would be useful. All this Gerrit stuff is new to
> our team---and the explanations assume a lot of knowledge about Gerrit and
> its workflow(s). Thank you very much.

I think currently these two changes are based on different base versions since they have been started independently. If you want to test the combination of both changes you can just rebase one of them on top of the other one. To do that, check out one of them, then select the other one in the History view and click "Rebase".

Technically you could as well merge the two branches but this creates a merge commit so that the history looks more complex hence I usually prefer rebase over merge for such experiments. If anything goes wrong you can just delete the local branches created for each of these changes and start over.
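
For readers new to Gerrit, the ref in the fetch command quoted above follows a simple pattern: the last two digits of the change number, then the change number, then the patch set. The Java sketch below derives it from a change URL; the class and method names are invented here, and the pattern is inferred from the refs/changes/17/60617/11 example in this thread together with Gerrit's usual naming convention.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Builds Gerrit change refs from a change number; illustrative helper only. */
final class GerritRefSketch {
    private static final Pattern CHANGE_IN_URL = Pattern.compile("/c/(\\d+)");

    /** Extracts the change number from a URL such as https://git.eclipse.org/r/#/c/60617/. */
    static int changeNumber(String url) {
        Matcher m = CHANGE_IN_URL.matcher(url);
        if (!m.find()) {
            throw new IllegalArgumentException("no change number in " + url);
        }
        return Integer.parseInt(m.group(1));
    }

    /** refs/changes/<last two digits>/<change>/<patch set> */
    static String changeRef(int changeNumber, int patchSet) {
        return String.format("refs/changes/%02d/%d/%d", changeNumber % 100, changeNumber, patchSet);
    }

    public static void main(String[] args) {
        int change = changeNumber("https://git.eclipse.org/r/#/c/60617/");
        System.out.println("git fetch git://git.eclipse.org/gitroot/jgit/jgit "
                + changeRef(change, 11) + " && git checkout FETCH_HEAD");
    }
}
```

After the fetch, FETCH_HEAD is only a detached checkout; creating a local branch from it (which is what EGit's "Fetch from Gerrit..." does) makes the rebase described above possible.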


[1] https://wiki.eclipse.org/EGit/User_Guide#Fetching_a_change_from_a_Gerrit_Code_Review_Server
Comment 143 Garret Wilson CLA 2016-05-14 14:22:52 EDT
I thought that I had seen that this issue was resolved and being released---I guess I was mistaken.

Sorry to bother everyone, but could someone give me an update on this issue? I had extensive .gitattribute EOL tests performed on Macintosh, as you can see above. Are you waiting on me to perform the same tests on Windows---am I the one holding this up?
Comment 144 Christian Halstrick CLA 2016-05-15 16:04:35 EDT
I see in https://bugs.eclipse.org/bugs/show_bug.cgi?id=493360 that we still have problems with core.autocrlf, at least on Windows. E/JGit think in certain situations that working-tree files are dirty although they are clean. When autocrlf handling is on we need to do autocrlf conversion even if we only do a git status to see whether files are dirty or not (no checkout, no add, just a git status). And there seem to be situations which are not covered by the tests where E/JGit fail to do this conversion correctly.

Thanks for the help. But currently you can't do much because the problem is reproducible.
Comment 145 Matthias Sohn CLA 2016-05-15 16:33:33 EDT
(In reply to Garret Wilson from comment #143)
> I thought that I had seen that this issue was resolved and being
> released---I guess I was mistaken.
> 
> Sorry to bother everyone, but could someone give me an update on this issue?
> I had extensive .gitattribute EOL tests performed on Macintosh, as you can
> see above. Are you waiting on me to perform the same tests on Windows---am I
> the one holding this up?

git attributes support is available since JGit 4.3, see the 4.3 release notes
https://projects.eclipse.org/projects/technology.jgit/releases/4.3

https://git.eclipse.org/r/#/c/67085/ "Optimize attribute handling" trying to improve performance is still in code review, we should look into that in order to finally close this bug
Comment 146 Michael Haeusler CLA 2016-05-17 05:36:29 EDT
also please note this bug that I reported some weeks ago
https://bugs.eclipse.org/bugs/show_bug.cgi?id=492521
which is related to the recent changes.
Comment 147 Rolf Theunissen CLA 2016-06-23 04:24:11 EDT
The behavior of git and jgit is still different (on Windows). I have set:

core.autocrlf=false
core.eol=native

and a .gitattributes with:
* text=auto

On checkout with git eol's are converted to CRLF, on checkout with jgit eol's are LF (as they are in the archive). 

Setting core.eol to crlf does convert LF to CRLF on checkout with jgit.
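
The behavior expected from native git in this configuration can be written down as a small decision function. The sketch below reflects one reading of the git-config documentation for core.autocrlf and core.eol, not the JGit implementation; it assumes the file is treated as text (for example through `* text=auto` plus content detection) and has no per-path eol attribute. All names are hypothetical.

```java
/** Expected working-tree line ending for a text file; a reading of the git docs, not JGit code. */
final class CheckoutEolSketch {
    /**
     * @param autocrlf value of core.autocrlf ("true", "input", "false" or null)
     * @param coreEol  value of core.eol ("lf", "crlf", "native" or null)
     * @param windows  whether the platform's native line ending is CRLF
     */
    static String expectedEol(String autocrlf, String coreEol, boolean windows) {
        if ("true".equals(autocrlf)) {
            return "CRLF"; // core.eol is ignored in this case
        }
        if ("input".equals(autocrlf)) {
            return "LF";   // core.eol is ignored, no conversion on checkout
        }
        if ("crlf".equals(coreEol)) {
            return "CRLF";
        }
        if ("lf".equals(coreEol)) {
            return "LF";
        }
        return windows ? "CRLF" : "LF"; // "native" or unset
    }

    public static void main(String[] args) {
        // The configuration from the report above, on Windows.
        System.out.println(expectedEol("false", "native", true));
    }
}
```

With the settings from the report, the function yields CRLF on Windows, which matches what git does there and differs from the LF result observed with jgit.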
Comment 148 Garret Wilson CLA 2016-09-27 18:57:51 EDT
> git attributes support is available since JGit 4.3.

Yet this bug, "Bug 342372 - support gitattributes" is still marked as "New"? 

I'm a little confused. Is it done? Is it not done? Is it in EGit 4.4? Are there still bugs in the implementation?
Comment 149 Christian Halstrick CLA 2016-09-28 03:29:14 EDT
Currently JGit supports a certain set of attributes defined in gitattributes. Looking at attributes dealing with line-endings I see that we support "text" (also when set to "auto"), "crlf", "input", "eol" (see the code [1]) and the "binary" macro. Regarding the attributes which are not tied to line-ending handling we support "filter".

The native git attributes man page lists a lot more possible attributes ("ident", "encoding", "delta", "export-ignore",...) and I guess we will never support all. We add them as they are needed.

Regarding bugs: I think the simple use cases of these attributes work. But there are open bugs in specific use cases, e.g. when you define multiple attributes (bug 492521). There are quite a lot of problems on Windows regarding filter command execution. Others want "ident" support (bug 357039).
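
For orientation, each attribute on a .gitattributes line is in one of four states: set ("text"), unset ("-text"), set to a value ("eol=crlf") or unspecified ("!text"). The following Java sketch parses a single line into those states and expands the built-in "binary" macro to -diff -merge -text. It is a simplified illustration with invented names, it ignores quoted patterns and user-defined macros, and it is not how JGit parses the file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Parses the attributes of one .gitattributes line; simplified, hypothetical helper. */
final class AttributeLineSketch {
    /** Returns attribute name -> "set", "unset", "unspecified" or the assigned value. */
    static Map<String, String> parseAttributes(String line) {
        String[] tokens = line.trim().split("\\s+");
        Map<String, String> attrs = new LinkedHashMap<>();
        // tokens[0] is the path pattern, everything after it is an attribute.
        for (int i = 1; i < tokens.length; i++) {
            String t = tokens[i];
            int eq = t.indexOf('=');
            if (t.equals("binary")) {
                // Built-in macro: binary expands to -diff -merge -text.
                attrs.put("binary", "set");
                attrs.put("diff", "unset");
                attrs.put("merge", "unset");
                attrs.put("text", "unset");
            } else if (t.startsWith("-")) {
                attrs.put(t.substring(1), "unset");
            } else if (t.startsWith("!")) {
                attrs.put(t.substring(1), "unspecified");
            } else if (eq > 0) {
                attrs.put(t.substring(0, eq), t.substring(eq + 1));
            } else {
                attrs.put(t, "set");
            }
        }
        return attrs;
    }

    public static void main(String[] args) {
        System.out.println(parseAttributes("* text=auto"));
        System.out.println(parseAttributes("*.bat text eol=crlf"));
        System.out.println(parseAttributes("*.png binary"));
    }
}
```

Which of the parsed attributes then have an effect depends on the list of supported attributes given in this comment.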

[1] https://github.com/eclipse/jgit/blob/master/org.eclipse.jgit/src/org/eclipse/jgit/util/io/EolStreamTypeUtil.java#L145-L253
Comment 150 Björn Michael CLA 2016-09-29 03:15:45 EDT
(In reply to Christian Halstrick from comment #149)
> Currently JGit supports a certain set of attributes defined in
> gitattributes. [...]
> 

That fact should be documented in JGit or EGit FAQ with a reference to this bug on e.g. https://wiki.eclipse.org/EGit/FAQ#How_compatible_is_EGit_with_Git.3F
Comment 151 Christian Halstrick CLA 2016-09-29 11:30:44 EDT
Great idea. I added a paragraph to https://wiki.eclipse.org/EGit/FAQ
Comment 152 Rolf Theunissen CLA 2016-09-30 03:26:24 EDT
(In reply to Christian Halstrick from comment #151)
> Great idea. I added a paragraph to https://wiki.eclipse.org/EGit/FAQ

You should update the table in the section above too. Currently the table 'config' has row 'core.autocrlf' with a description stating that 'gitattributes are not supported yet'
Comment 153 Christian Halstrick CLA 2016-09-30 04:10:55 EDT
Sure. I updated the entry for the config parameter core.autocrlf to not talk about gitattributes anymore. The config parameters and the attributes are different things and now have their own paragraphs.
Comment 154 Garret Wilson CLA 2017-04-21 18:00:25 EDT
Unfortunately, using EGit 4.7.0.201704051617-r in Eclipse 4.6.3, there still seem to be some problems. The problem is in _merging_, not in committing.

I'm on a Windows machine, and I have a .gitattributes file which says this (among other things):

* text=auto
*.html text diff=html
*.js text


1. In `master` I created branch `foo` using Eclipse and made several commits.
2. Then I went back to `master` and made several commits.
3. I wanted to finish `foo` so I merged the latest `master` into `foo` using Eclipse.

It is somewhere along this point that EGit (as far as I can tell) changed all my HTML and JavaScript files from using CRLF to LF in my working directory.

And I had noticed when I was ready to do the merged commit in Eclipse that at first it showed a diff with _all_ lines changed. But then when I brought the diff up again, it only showed the correct lines changed. Very odd.

4. I made one more commit in `foo`.
5. I merged `foo` into `master` _on the remote repository_ using Bitbucket.
6. I switched to `master` using command-line Git.

It is only now that I made some changes and committed them on `master` using command-line Git that it gave me a warning saying that some of my local files use LF. Sure enough, the files affected in the merge from `master` to `foo` by Eclipse had been changed from CRLF to LF in my working directory.

So it appears the correct normalized LF EOL is getting into the repository. The problem is that EGit's merge seems to ignore `.gitattributes` and, as part of the merge process, convert CRLF to LF in my working directory sometimes.

Has anyone verified that all the code for merging correctly follows `.gitattributes`? It looks to me like it's simply pulling out whatever is in the repository and not doing the correct conversions.
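
To narrow down which working directory files were switched from CRLF to LF, a small script along these lines could be run before and after the merge. This is only a sketch: the function names count_line_endings and report are hypothetical, and the default patterns are simply taken from the .gitattributes excerpt above.

```python
from pathlib import Path


def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF and bare CR line endings in raw file content."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LFs that are not part of a CRLF pair
    cr = data.count(b"\r") - crlf   # CRs that are not part of a CRLF pair
    return {"crlf": crlf, "lf": lf, "cr": cr}


def report(root: str, patterns=("*.html", "*.js")) -> None:
    """Print line-ending counts for every matching file below root."""
    for pattern in patterns:
        for path in sorted(Path(root).rglob(pattern)):
            if ".git" in path.parts:
                continue  # skip repository metadata
            print(path, count_line_endings(path.read_bytes()))
```

Calling `report(".")` in the working directory right after the merge, and comparing the output with a run made before it, would show which files changed without waiting for the command-line Git warning.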
Comment 155 Sam Gabriel CLA 2017-05-05 20:10:05 EDT
Our team's biggest problem with Eclipse and autocrlf on Windows is the same as described in comment #154.
Somewhere during a branch merge, the files modified by the merge end up with LF line endings instead of CRLF, which shows up once they are modified locally afterwards. This happens a lot for files that you have never edited in Eclipse before. Once you have edited and modified a file, it does not happen again for that file for some time, until some other event happens.

Again it is not in the committing process but rather in the merging from a branch to the head.
Comment 156 Sam Gabriel CLA 2017-05-05 20:14:22 EDT
There is another bug, #499615, that has more details on the merge issue; however, no work has been done on it yet. Perhaps someone could look into this, since it is much easier to fall into this trap without noticing while committing.
Comment 157 Jonathan Nieder CLA 2018-08-30 23:27:58 EDT
Marking fixed per comment 149. Please open new bugs for additional functionality you'd like.
Comment 158 Ilya Rokhkin CLA 2019-03-20 13:53:11 EDT
We use JGit via Gerrit, and the git archive command does not seem to read .gitattributes from the remote repository before compressing and sending the archive.

Gerrit version: 2.16.6

What steps will reproduce the problem?
1. ~/tmp/AppShell3 % git archive --format=tar.gz --remote=ssh://gerrit-server:29418/Repo1 V_BRANCH1 | tar zxf -
Brings all files with LF line endings.
2. /tmp/AppShell4 % git archive --format=tar.gz --remote=git@gerrit-server:/git/Repo1 V_BRANCH1 | tar zxf -
Regular git archive via ssh port 22, not related to Gerrit, works as expected and brings *.mht files with Windows-style CRLF line endings.
3. This is the .gitattributes file of repository Repo1:
# Declare files that will always have CRLF line endings on checkout.
*.mht eol=crlf

What is the expected output?
To see *.mht files with CRLF line endings.

By default, git archive reads .gitattributes before compressing and sending the archive.
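
To compare the two archives without unpacking them by hand, the line endings of the *.mht entries could be checked with something like the following. This is a rough sketch, not part of Gerrit or JGit: the function name archive_line_endings is hypothetical, and it assumes the whole tar.gz archive fits in memory.

```python
import io
import tarfile


def archive_line_endings(archive_bytes: bytes, suffix: str = ".mht") -> dict:
    """Map each regular file ending in `suffix` to 'crlf', 'lf', 'mixed' or 'none'."""
    result = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar:
            if not member.isfile() or not member.name.endswith(suffix):
                continue
            data = tar.extractfile(member).read()
            crlf = data.count(b"\r\n")
            lf = data.count(b"\n") - crlf   # LFs not preceded by a CR
            if crlf and lf:
                result[member.name] = "mixed"
            elif crlf:
                result[member.name] = "crlf"
            elif lf:
                result[member.name] = "lf"
            else:
                result[member.name] = "none"
    return result
```

The bytes written by either git archive command above can be passed in, for example via `sys.stdin.buffer.read()` in a small wrapper script. With `*.mht eol=crlf` honoured, every *.mht entry would be expected to report 'crlf'.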
Comment 159 Matthias Sohn CLA 2019-03-20 18:58:20 EDT
(In reply to Ilya Rokhkin from comment #158)
> We use jgit via gerrit and git archive command seems do not read .git
> attributes from remote repository before compressing and sending it 
> 
> Gerrit version: 2.16.6
> 
> What steps will reproduce the problem?
> 1. ~/tmp/AppShell3 % git archive --format=tar.gz
> --remote=ssh://gerrit-server:29418/Repo1 V_BRANCH1 | tar zxf -
> Bring all files with eol LF
> 2./tmp/AppShell4 % git archive --format=tar.gz
> --remote=git@gerrit-server:/git/Repo1 V_BRANCH1 | tar zxf -
> Regular git archive via ssh port 22 not related to gerrit, works as
> expected, brings *.mht file eol CRLF windows style
> 3.
> this is the repository Repo1.gitattributes file
> # Declare files that will always have CRLF line endings on checkout.
> *.mht eol=crlf
> 
> What is the expected output?
> To see *.mht files in CRLF eol 
> 
> By default git archive reads .gitattributes before compressing and sending
> archive

Please file a new bug to report this problem.