Bug 286653 - push omits trees/blobs from transmitted pack
Status: RESOLVED FIXED
Product: JGit
Classification: Technology
Component: JGit
Version: unspecified
Hardware: PC Windows Vista
Importance: P3 critical
Target Milestone: 0.7.0
Assigned To: Shawn Pearce CLA Friend
Duplicates: 307778
Depends on:
Blocks: 299726
Reported: 2009-08-14 11:46 EDT by John Bito CLA Friend
Modified: 2010-04-27 16:02 EDT

Attachments
Repo causing bad behavior (873 bytes, application/octet-stream)
2010-01-31 04:24 EST, Robin Rosenberg CLA Friend

Description John Bito CLA Friend 2009-08-14 11:46:39 EDT
Build ID: I20090611-1540

GitHub provided a 370MB tgz that should contain the repository after the corruption.  The repository that was pushed is available.

After pushing the repository, attempts to fetch reported 'remote: aborting due to possible repository corruption on the remote side'.

The folks at GitHub provided the following git-fsck output:
       <mojombo>       git fsck --full
       <mojombo>       broken link from tree
f4f9ecd1875938baa42467dfd6a8134d75fe5de4 to tree
57548924f1eca854dc8db00844f95d3de2c82957
       <mojombo>       broken link from tree
f4f9ecd1875938baa42467dfd6a8134d75fe5de4 to tree
3d1f74522c3e7c3c03390fae376446fda6eed306
       <mojombo>       missing tree 3d1f74522c3e7c3c03390fae376446fda6eed306
       <mojombo>       missing tree 57548924f1eca854dc8db00844f95d3de2c82957
       <mojombo>       dangling commit ab6ce47159c1eaff0e4bae19291679267de9f669

git fsck --full on the local repo produces no output.
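
To illustrate what the "broken link" and "missing tree" messages above mean, here is a minimal Python sketch of a link check over a toy object store. It is not how git fsck is implemented; the `check_links` helper and the dictionary layout are assumptions made for this example. The tree entries are the ones `git ls-tree` prints for f4f9 later in this report, and the two trees GitHub reported as missing are simply left out of the store.

```python
# Toy object store: object id -> list of (type, id, name) entries for a tree,
# or None for a blob. The contents of the subtrees that are present are not
# shown in this report, so they are stored as empty lists (an assumption).
QUERYENGINE = "f4f9ecd1875938baa42467dfd6a8134d75fe5de4"
BIN = "57548924f1eca854dc8db00844f95d3de2c82957"
SRC = "3d1f74522c3e7c3c03390fae376446fda6eed306"

ENTRIES = [
    ("blob", "9324c851e6816962f87cb772ebc34f9c8036d832", ".classpath"),
    ("blob", "1cb465bc1e95c2e088376ac06b363a3aa481c9ce", ".project"),
    ("tree", "f00fa742e906037190ae0cce70ec235fbf6eab83", ".settings"),
    ("blob", "7186ec1ee04722679dd9b8c0567fa522ac0495b3", "asql.jardesc"),
    ("tree", "4b825dc642cb6eb9a060e54bf8d69288fbee4904", "bin-groovy"),
    ("tree", BIN, "bin"),
    ("tree", "1156d24cd387c7278bc536d32b09057556a6c60d", "lib"),
    ("tree", SRC, "src"),
]

# Simulate the remote side: everything arrived except the bin and src trees.
STORE = {QUERYENGINE: ENTRIES}
for kind, oid, _name in ENTRIES:
    if oid not in (BIN, SRC):
        STORE[oid] = [] if kind == "tree" else None


def check_links(store):
    """Report tree entries whose target object is absent from the store."""
    broken, missing = [], set()
    for tree_id, entries in store.items():
        for kind, oid, _name in entries or []:  # blobs (None) have no entries
            if oid not in store:
                broken.append((tree_id, kind, oid))
                missing.add((kind, oid))
    return broken, sorted(missing)


broken, missing = check_links(STORE)
for parent, kind, oid in broken:
    print(f"broken link from tree {parent} to {kind} {oid}")
for kind, oid in missing:
    print(f"missing {kind} {oid}")
```

The local repository passes the same kind of check because it still holds every object; only the receiving side, which got an incomplete pack, has a tree pointing at objects it lacks.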

The repo that was pushed to cause the corruption at GitHub has the tree entry f4f9ecd1875938baa42467dfd6a8134d75fe5de4 in a dangling commit.

$ git fsck --full f4f9ecd1875938baa42467dfd6a8134d75fe5de4
dangling commit 560a391801d1712b625d4ae317a490529a4ccf08

$ git ls-tree f4f9ecd1875938baa42467dfd6a8134d75fe5de4
100644 blob 9324c851e6816962f87cb772ebc34f9c8036d832    .classpath
100644 blob 1cb465bc1e95c2e088376ac06b363a3aa481c9ce    .project
040000 tree f00fa742e906037190ae0cce70ec235fbf6eab83    .settings
100644 blob 7186ec1ee04722679dd9b8c0567fa522ac0495b3    asql.jardesc
040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904    bin-groovy
040000 tree 57548924f1eca854dc8db00844f95d3de2c82957    bin
040000 tree 1156d24cd387c7278bc536d32b09057556a6c60d    lib
040000 tree 3d1f74522c3e7c3c03390fae376446fda6eed306    src

Fortunately, there is exactly one folder with these contents.

$ git show 560a391801d1712b625d4ae317a490529a4ccf08
commit 560a391801d1712b625d4ae317a490529a4ccf08
Author: John W. Bito <jwbito@XXXX>
Date:   Mon Aug 10 14:01:22 2009 -0700

    Now traverse rows using DB cursor.  Enable statement cache.

diff --git a/ADS/testlib/asql.jar b/ADS/testlib/asql.jar
index bb434c7..155b825 100644
Binary files a/ADS/testlib/asql.jar and b/ADS/testlib/asql.jar differ
diff --git a/ADS/testlib/groovy-all-1.6.1.jar b/ADS/testlib/groovy-all-1.6.1.jar
deleted file mode 100644
index a6252c2..0000000
Binary files a/ADS/testlib/groovy-all-1.6.1.jar and /dev/null differ
diff --git a/ADS/testlib/groovy-all-1.7-beta-1-SNAPSHOT.jar b/ADS/testlib/groovy-all-1.7-beta-1-SNAPSHOT.jar
new file mode 100644
index 0000000..4d3fec8
Binary files /dev/null and b/ADS/testlib/groovy-all-1.7-beta-1-SNAPSHOT.jar differ
diff --git a/queryengine/.classpath b/queryengine/.classpath
index 4aab135..9324c85 100644
--- a/queryengine/.classpath
+++ b/queryengine/.classpath
@@ -1,17 +1,17 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<classpath>
-	<classpathentry excluding="ch/viveo/query/test/resources/" kind="src" path="src"/>
-	<classpathentry excluding="bin-groovy/|src/|src/ch/viveo/query/test/resources/" kind="src" path=""/>
-	<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER"/>
-	<classpathentry exported="true" kind="con" path="GROOVY_SUPPORT"/>
-	<classpathentry combineaccessrules="false" kind="src" path="/nb_binding"/>
-	<classpathentry kind="con" path="org.eclipse.jdt.junit.JUNIT_CONTAINER/4"/>
-	<classpathentry kind="con" path="org.eclipse.datatools.connectivity.jdt.DRIVERLIBRARY/Oracle Thin Driver"/>
-	<classpathentry kind="lib" path="/nb_binding/lib/xpp3_min-1.1.4c.jar"/>
-	<classpathentry kind="lib" path="/nb_binding/lib/xstream-1.3.1.jar">
-		<attributes>
-			<attribute name="javadoc_location" value="http://xstream.codehaus.org/javadoc"/>
-		</attributes>
-	</classpathentry>
-	<classpathentry kind="output" path="bin"/>
-</classpath>
+<?xml version="1.0" encoding="UTF-8"?>
+<classpath>
+	<classpathentry excluding="ch/viveo/query/test/resources/" kind="src" path="src"/>
+	<classpathentry excluding="bin-groovy/|src/|src/ch/viveo/query/test/resources/" kind="src" path=""/>
+	<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER"/>
+	<classpathentry combineaccessrules="false" kind="src" path="/nb_binding"/>
+	<classpathentry kind="con" path="org.eclipse.jdt.junit.JUNIT_CONTAINER/4"/>
+	<classpathentry kind="con" path="org.eclipse.datatools.connectivity.jdt.DRIVERLIBRARY/Oracle Thin Driver"/>
+	<classpathentry kind="lib" path="/nb_binding/lib/xpp3_min-1.1.4c.jar"/>
+	<classpathentry kind="lib" path="/nb_binding/lib/xstream-1.3.1.jar">
+		<attributes>
+			<attribute name="javadoc_location" value="http://xstream.codehaus.org/javadoc"/>
+		</attributes>
+	</classpathentry>
+	<classpathentry exported="true" kind="con" path="GROOVY_SUPPORT"/>
+	<classpathentry kind="output" path="bin"/>
+</classpath>
diff --git a/queryengine/.settings/org.codehaus.groovy.eclipse.preferences.prefs b/queryengine/.settings/org.codehaus.groovy.eclipse.preferences.prefs
old mode 100755
new mode 100644
index 1abf79e..c5e9a2e
--- a/queryengine/.settings/org.codehaus.groovy.eclipse.preferences.prefs
+++ b/queryengine/.settings/org.codehaus.groovy.eclipse.preferences.prefs
@@ -1,4 +1,4 @@
-#Tue May 05 14:16:49 PDT 2009
+#Sat Aug 08 14:15:43 PDT 2009
 eclipse.preferences.version=1
 groovy.compiler.output.path=bin-groovy
 support.groovy=true
diff --git a/queryengine/bin-groovy/README b/queryengine/bin-groovy/README
deleted file mode 100644
index 6dd996e..0000000
--- a/queryengine/bin-groovy/README
+++ /dev/null
@@ -1 +0,0 @@
-Folder for Groovy compilation.
diff --git a/queryengine/bin/.settings/org.codehaus.groovy.eclipse.preferences.prefs b/queryengine/bin/.settings/org.codehaus.groovy.eclipse.preferences.prefs
old mode 100755
new mode 100644
index 1abf79e..c5e9a2e
--- a/queryengine/bin/.settings/org.codehaus.groovy.eclipse.preferences.prefs
+++ b/queryengine/bin/.settings/org.codehaus.groovy.eclipse.preferences.prefs
@@ -1,4 +1,4 @@
-#Tue May 05 14:16:49 PDT 2009
+#Sat Aug 08 14:15:43 PDT 2009
 eclipse.preferences.version=1
 groovy.compiler.output.path=bin-groovy
 support.groovy=true
diff --git a/queryengine/src/ch/viveo/query/TableBinding.groovy b/queryengine/src/ch/viveo/query/TableBinding.groovy
index 29e0e58..410ebd6 100644
--- a/queryengine/src/ch/viveo/query/TableBinding.groovy
+++ b/queryengine/src/ch/viveo/query/TableBinding.groovy
@@ -7,6 +7,7 @@ import ch.viveo.ads_sql.TableInterface ;
 import ch.viveo.ads_sql.buffer.ClientBuffer;
 import ch.viveo.ads_sql.exceptions.*;
 import groovy.sql.Sql ;
+import groovy.sql.GroovyResultSet ;
 import java.util.logging.Logger;
 import java.util.logging.Level;
 import java.nio.ByteBuffer;
@@ -19,20 +20,20 @@ import java.lang.IllegalStateException
  *
  */
 public class TableBinding implements TableInterface {
+	private static final Logger log = Logger.getLogger(getClass().getSimpleName());
+
 	private final Structure str;
 	private final Table tab;
 	private final def db;
 	private final def allFieldNames;
-	private final Set<Object> allColumnNames;
+	private final Set<String> allColumnNames;
 	private boolean valid;
 	private boolean open;
 	private def uri;
-	private def rowSet;
+	private GroovyResultSet resultSet;
 	private def queryParams;
 	private def queryKey;
 	private int queryMatchType;
-	private boolean rowSetUsed;
-	private static final Logger log = Logger.getLogger(getClass().getSimpleName());
 	private final String readQuery;
 	private final String maxUriQuery;
 	private final String sequenceQuery;
@@ -195,8 +196,7 @@ public class TableBinding implements TableInterface {
 	
 	void select(int lockOption){
 		log.log(Level.FINEST, "Parameters: {0}", queryParams)
-		rowSet = db.rows(tab.keys.(lockOption ? "queryStringLock" : "queryString") (queryKey, queryMatchType), queryParams);
-		rowSetUsed = false;
+		resultSet = db.rowCursor(tab.keys.(lockOption ? "queryStringLock" : "queryString") (queryKey, queryMatchType), queryParams);
 	}
 	
 	public void selectRange(int keyNumber, int matchOption, ByteBuffer startValues, ByteBuffer endValues) {
@@ -209,29 +209,23 @@ public class TableBinding implements TableInterface {
 	}
 
 	private reset() {
-		rowSet = null;
-		rowSetUsed = false
+		if (resultSet && !resultSet.isClosed())
+			resultSet.close();
 	}
 	
 	public int retrieveUris(ByteBuffer result, int lockOption) {
-		if (null == rowSet) {
+		if (null == resultSet) {
 			if (!queryKey)
 				throw new IllegalStateException("No row set available for ${tab.name}");
 			else {
 				select(lockOption);
-				log.log(Level.FINEST, "Select URIs returns {0}: {1}", rowSet.size(), rowSet)
 			}
 		}
-		if (rowSetUsed)
-			return 0;
-		rowSetUsed = true;
 		IntBuffer urilist = result.asIntBuffer();
 		int count = 0;
-		rowSet.each {
-			if (urilist.position() < urilist.limit()) {
-				urilist.put(it.uri as int)
-				count ++;
-			}
+		while (urilist.position() < urilist.limit() && resultSet.next()) {
+			urilist.put(resultSet.uri as int)
+			count ++;
 		}
 		log.log(Level.FINEST, "retrieveUris returning {0} uris", count);
 		//TODO hold pagination state so next request returns following list elements
@@ -239,6 +233,7 @@ public class TableBinding implements TableInterface {
 	}
 	
 	public void close() {
+		reset();
 		valid = false;
 		open = false;
 		//TODO remove the reference to this in the Process instance
diff --git a/queryengine/src/ch/viveo/query/TableBindingTest.groovy b/queryengine/src/ch/viveo/query/TableBindingTest.groovy
index 8e2a389..aa448d1 100644
--- a/queryengine/src/ch/viveo/query/TableBindingTest.groovy
+++ b/queryengine/src/ch/viveo/query/TableBindingTest.groovy
@@ -8,7 +8,6 @@ import org.junit.Test
 import org.junit.Before
 import org.junit.After;
 
-import groovy.sql.Sql;
 import java.nio.ByteBuffer;
 import java.nio.IntBuffer;
 
@@ -18,6 +17,9 @@ import ch.viveo.ads_sql.TableInterface
 import ch.viveo.ads_sql.buffer.ClientBuffer
 import ch.viveo.ads_sql.test.TestData
 
+import ch.viveo.query.mgr.Process;
+import ch.viveo.query.sql.SqlCursor;
+
 /**
  * @author John W. Bito
  *
@@ -27,13 +29,17 @@ public class TableBindingTest extends GroovyTestCase {
     def db = null;
     def str, res, fdf, keys;
     ByteBuffer buff;
+
+    public static final String TEST_DRIVER = 'oracle.jdbc.driver.OracleDriver';
     
     @Before
     public void setUp() throws Exception {
         resource = TestData.genericResource("clienp01.info.xml", this);
         fdf = new FDFTranslator().readXML(new FileReader(resource));
         str = FDFTranslator.createStructure(fdf); 
-        db = Sql.newInstance("jdbc:oracle:thin:nbquali5/nbquali51@localhost:1521:nb");
+        Class.forName(TEST_DRIVER)
+        System.setProperty(Process.JDBC_DRIVER_PROPERTY, TEST_DRIVER)
+        db = SqlCursor.newInstance("jdbc:oracle:thin:nbquali5/nbquali51@localhost:1521:nb");
         def tab = new Table("uri_clienp01");
 		res = new TableBinding(str, tab, db);
 		keys = new KeyBinding(tab, str, fdf.key.code);
@@ -65,6 +71,10 @@ public class TableBindingTest extends GroovyTestCase {
 		def rec = readLast();
 		String query = "select * from uri_clienp01 where uri=${res.uri}";
 		def sqlData = db.firstRow(query); 
+		sqlData.each { k, v ->
+			println k + "->" + v?.class + "=" + v
+			
+		}
 		assertEquals(sqlData["MAJ0H"], rec["MAJ0H"].value);
 		assertTrue(res.getUnboundColumns().isEmpty());
 		assertTrue(res.getUnboundFields().isEmpty());
@@ -121,7 +131,7 @@ public class TableBindingTest extends GroovyTestCase {
 		
 		res.selectRange(0, TableInterface.MATCHGEKEY1LEKEY2, start, end);
 		res.select(01);
-		assertEquals(12, res.rowSet.size());
+		assertEquals(4, res.queryParams.size());
 	}
 	
 	@Test
diff --git a/queryengine/src/ch/viveo/query/mgr/Process.groovy b/queryengine/src/ch/viveo/query/mgr/Process.groovy
index 9613e2a..5bce4ed 100644
--- a/queryengine/src/ch/viveo/query/mgr/Process.groovy
+++ b/queryengine/src/ch/viveo/query/mgr/Process.groovy
@@ -4,13 +4,13 @@
 package ch.viveo.query.mgr ;
 
 import java.nio.ByteBuffer ;
-import groovy.sql.Sql;
 import java.sql.Connection;
 
 import ch.viveo.ads_sql.FDFTranslator;
 import ch.viveo.ads_sql.TableInterface;
 
 import ch.viveo.query.*;
+import ch.viveo.query.sql.SqlCursor;
 
 /**
  * Handle the lifecycle of PDs (Process Definition) and the resources that map to SQL queries
@@ -28,6 +28,7 @@ public class Process{
 	def dbConnection = null;
 	
 	public static final String CONNECTION_PROPERTY = "ch.viveo.db.connect";
+	public static final String JDBC_DRIVER_PROPERTY = "ch.viveo.db.driver";
 	
 	FDFTranslator loader = null; 
 	
@@ -49,10 +50,13 @@ public class Process{
 	private def getConnection() {
 		if (!dbConnection) {
 			String connSpec = System.getProperty(CONNECTION_PROPERTY);
+			String driverName = System.getProperty(JDBC_DRIVER_PROPERTY);
 			if (!connSpec)
 				throw new IllegalArgumentException("$CONNECTION_PROPERTY not set");
-			Class.forName("oracle.jdbc.driver.OracleDriver"); // TODO setup a proper initialization
-			dbConnection = Sql.newInstance(connSpec);
+			if (driverName)
+				dbConnection = SqlCursor.newInstance(connSpec, driverName);
+			else
+				dbConnection = SqlCursor.newInstance(connSpec);
 		}
 		return dbConnection;
 	}
diff --git a/queryengine/src/ch/viveo/query/sql/SqlCursor.java b/queryengine/src/ch/viveo/query/sql/SqlCursor.java
new file mode 100644
index 0000000..232ccd3
--- /dev/null
+++ b/queryengine/src/ch/viveo/query/sql/SqlCursor.java
@@ -0,0 +1,149 @@
+/**
+ * 
+ */
+package ch.viveo.query.sql;
+
+import java.sql.Connection;
+import java.sql.DriverManager;
+import java.sql.ResultSet;
+import java.sql.SQLException;
+import java.util.List;
+import java.util.Properties;
+
+import javax.sql.DataSource;
+
+import groovy.sql.GroovyResultSet;
+import groovy.sql.GroovyResultSetProxy;
+import groovy.sql.Sql;
+
+/**
+ * @author IBM User
+ *
+ */
+public class SqlCursor extends Sql {
+
+	/**
+	 * @param dataSource
+	 */
+	public SqlCursor(DataSource dataSource) {
+		super(dataSource);
+		setCacheStatements(true);
+	}
+
+	/**
+	 * @param connection
+	 */
+	public SqlCursor(Connection connection) {
+		super(connection);
+		setCacheStatements(true);
+	}
+
+    /**
+     * Creates a new Sql instance given a JDBC connection URL.
+     *
+     * @param url a database url of the form
+     *            <code> jdbc:<em>subprotocol</em>:<em>subname</em></code>
+     * @return a new Sql instance with a connection
+     * @throws SQLException if a database access error occurs
+     */
+    public static SqlCursor newInstance(String url) throws SQLException {
+        Connection connection = DriverManager.getConnection(url);
+        return new SqlCursor(connection);
+    }
+
+    /**
+     * Creates a new Sql instance given a JDBC connection URL
+     * and some properties.
+     *
+     * @param url        a database url of the form
+     *                   <code> jdbc:<em>subprotocol</em>:<em>subname</em></code>
+     * @param properties a list of arbitrary string tag/value pairs
+     *                   as connection arguments; normally at least a "user" and
+     *                   "password" property should be included
+     * @return a new Sql instance with a connection
+     * @throws SQLException if a database access error occurs
+     */
+    public static SqlCursor newInstance(String url, Properties properties) throws SQLException {
+        Connection connection = DriverManager.getConnection(url, properties);
+        return new SqlCursor(connection);
+    }
+
+    /**
+     * Creates a new Sql instance given a JDBC connection URL,
+     * some properties and a driver class name.
+     *
+     * @param url             a database url of the form
+     *                        <code> jdbc:<em>subprotocol</em>:<em>subname</em></code>
+     * @param properties      a list of arbitrary string tag/value pairs
+     *                        as connection arguments; normally at least a "user" and
+     *                        "password" property should be included
+     * @param driverClassName the fully qualified class name of the driver class
+     * @return a new Sql instance with a connection
+     * @throws SQLException           if a database access error occurs
+     * @throws ClassNotFoundException if the class cannot be found or loaded
+     */
+    public static SqlCursor newInstance(String url, Properties properties, String driverClassName)
+            throws SQLException, ClassNotFoundException {
+        loadDriver(driverClassName);
+        return newInstance(url, properties);
+    }
+
+    /**
+     * Creates a new Sql instance given a JDBC connection URL,
+     * a username and a password.
+     *
+     * @param url      a database url of the form
+     *                 <code> jdbc:<em>subprotocol</em>:<em>subname</em></code>
+     * @param user     the database user on whose behalf the connection
+     *                 is being made
+     * @param password the user's password
+     * @return a new Sql instance with a connection
+     * @throws SQLException if a database access error occurs
+     */
+    public static SqlCursor newInstance(String url, String user, String password) throws SQLException {
+        Connection connection = DriverManager.getConnection(url, user, password);
+        return new SqlCursor(connection);
+    }
+
+    /**
+     * Creates a new Sql instance given a JDBC connection URL,
+     * a username, a password and a driver class name.
+     *
+     * @param url             a database url of the form
+     *                        <code> jdbc:<em>subprotocol</em>:<em>subname</em></code>
+     * @param user            the database user on whose behalf the connection
+     *                        is being made
+     * @param password        the user's password
+     * @param driverClassName the fully qualified class name of the driver class
+     * @return a new Sql instance with a connection
+     * @throws SQLException           if a database access error occurs
+     * @throws ClassNotFoundException if the class cannot be found or loaded
+     */
+    public static SqlCursor newInstance(String url, String user, String password, String driverClassName) throws SQLException,
+            ClassNotFoundException {
+        loadDriver(driverClassName);
+        return newInstance(url, user, password);
+    }
+
+    /**
+     * Creates a new Sql instance given a JDBC connection URL
+     * and a driver class name.
+     *
+     * @param url             a database url of the form
+     *                        <code> jdbc:<em>subprotocol</em>:<em>subname</em></code>
+     * @param driverClassName the fully qualified class name of the driver class
+     * @return a new Sql instance with a connection
+     * @throws SQLException           if a database access error occurs
+     * @throws ClassNotFoundException if the class cannot be found or loaded
+     */
+    public static SqlCursor newInstance(String url, String driverClassName) throws SQLException, ClassNotFoundException {
+        loadDriver(driverClassName);
+        return newInstance(url);
+    }
+    
+    public GroovyResultSet rowCursor(String sql, List<Object> params) throws SQLException {
+    	ResultSet rs = executePreparedQuery(sql, params);
+    	return new GroovyResultSetProxy(rs).getImpl();
+    }
+
+}
diff --git a/queryengine/src/nb.sql b/queryengine/src/nb.sql
index 277ff44..7c2c875 100644
--- a/queryengine/src/nb.sql
+++ b/queryengine/src/nb.sql
@@ -100,4 +100,8 @@ delete from clienp01 where uri > 1500;
 desc uri_clienp01;
 
 select distinct nom0x from clienp01
-select uri from clienp01 where nom0x = 'CL99898000'
\ No newline at end of file
+select uri from clienp01 where nom0x = 'CL99898000'
+
+alter table uri_clienp01 add  (time timestamp)
+alter table uri_clienp01 drop column time 
+update uri_clienp01 set time = current_timestamp
\ No newline at end of file

========================

The folder /queryengine/bin-groovy should also have been deleted by the commit.  I may have done 'remove from version control' on /queryengine/bin-groovy/README before deleting the folder from the working copy.  The commit was created by egit.
Comment 1 Robin Rosenberg CLA Friend 2009-09-05 17:01:00 EDT
Also reported at http://code.google.com/p/egit/issues/detail?id=114. 

This is probably a JGit issue.
Comment 2 Shawn Pearce CLA Friend 2009-09-08 11:18:58 EDT
(In reply to comment #0)
> The folks at GitHub provided the following git-fsck output:
>        <mojombo>       git fsck --full
>        <mojombo>       broken link from tree
> f4f9ecd1875938baa42467dfd6a8134d75fe5de4 to tree
> 57548924f1eca854dc8db00844f95d3de2c82957
>        <mojombo>       broken link from tree
> f4f9ecd1875938baa42467dfd6a8134d75fe5de4 to tree
> 3d1f74522c3e7c3c03390fae376446fda6eed306
>        <mojombo>       missing tree 3d1f74522c3e7c3c03390fae376446fda6eed306
>        <mojombo>       missing tree 57548924f1eca854dc8db00844f95d3de2c82957
>        <mojombo>       dangling commit ab6ce47159c1eaff0e4bae19291679267de9f669
...
> $ git ls-tree f4f9ecd1875938baa42467dfd6a8134d75fe5de4
> 100644 blob 9324c851e6816962f87cb772ebc34f9c8036d832    .classpath
> 100644 blob 1cb465bc1e95c2e088376ac06b363a3aa481c9ce    .project
> 040000 tree f00fa742e906037190ae0cce70ec235fbf6eab83    .settings
> 100644 blob 7186ec1ee04722679dd9b8c0567fa522ac0495b3    asql.jardesc
> 040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904    bin-groovy
> 040000 tree 57548924f1eca854dc8db00844f95d3de2c82957    bin
> 040000 tree 1156d24cd387c7278bc536d32b09057556a6c60d    lib
> 040000 tree 3d1f74522c3e7c3c03390fae376446fda6eed306    src

OK, so from this we know that the "bin" and "src" subtrees were not sent to GitHub.  JGit must have assumed they were present on the remote side based on the boundary commits it observed from the remote peer.
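
As a rough sketch of the idea behind that assumption: a push sends the objects reachable from the commits being pushed, minus the objects reachable from commits the remote is known to have. The Python below models this as a set difference over a toy graph. The object names in `GRAPH` and the helpers `reachable` and `objects_to_send` are hypothetical and chosen only for illustration; this is not JGit's PackWriter logic.

```python
def reachable(graph, start_ids):
    """Return every object id reachable from start_ids (inclusive)."""
    seen, stack = set(), list(start_ids)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(graph.get(oid, ()))  # blobs have no outgoing links
    return seen


def objects_to_send(graph, want, have):
    """Objects needed for `want` that are not implied by `have`."""
    return reachable(graph, want) - reachable(graph, have)


# Hypothetical history: one new commit on top of a commit the remote has.
GRAPH = {
    "commit-new": ["commit-old", "tree-new"],
    "commit-old": ["tree-old"],
    "tree-new": ["tree-src-new", "tree-lib"],
    "tree-old": ["tree-src-old", "tree-lib"],
    "tree-src-new": ["blob-changed"],
    "tree-src-old": ["blob-original"],
    "tree-lib": ["blob-jar"],
}

to_send = objects_to_send(GRAPH, want=["commit-new"], have=["commit-old"])
print(sorted(to_send))
```

Any object that is wrongly counted as already present on the remote, or that the sender's walk never visits, is left out of the pack.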

> The repo that was pushed to cause the corruption GitHup has the tree entry
> f4f9ecd1875938baa42467dfd6a8134d75fe5de4 in a dangling commit.
> 
> $ git fsck --full f4f9ecd1875938baa42467dfd6a8134d75fe5de4
> dangling commit 560a391801d1712b625d4ae317a490529a4ccf08

This tells me nothing.  Heck, I'm not even sure what this is saying.  `git fsck --full SHA1` is supposed to verify that SHA1 is structurally sane, all the way back to its roots.  Yet if f4f9 is a tree as shown above, it cannot possibly report about a commit object because a tree does not contain pointers to commits.  So the --full must have dragged in the refs of this repository.  Which means the "dangling commit 560a3" report above is a red herring.

> Fortunately, there is exactly one folder with these contents.
> 
> $ git show 560a391801d1712b625d4ae317a490529a4ccf08
> commit 560a391801d1712b625d4ae317a490529a4ccf08
> Author: John W. Bito <jwbito@XXXX>
> Date:   Mon Aug 10 14:01:22 2009 -0700

This doesn't tell me anything.  I'm not even sure this is related to the repository corruption you observed.

> The folder /queryengine/bin-groovy should also have been deleted by
> the commit.

Ah, now this gem gave me a bit more detail.  You are trying to indirectly say that the tree f4f9 above, whose subtrees "bin" and "src" were missing at GitHub,  is actually the subtree "queryengine" of your project?  I'm making this assumption based upon the bin-groovy entry above.
Comment 3 John Bito CLA Friend 2009-09-08 11:49:18 EDT
I am merely a git user, so I regret that I'm not sure how to find the most useful information for you.

f4f9 is the folder at projectroot/queryengine.  I expected that the commit would remove the folder projectroot/queryengine/bin-groovy (the commit listing shows projectroot/queryengine/bin-groovy/README being deleted).

I'm the only one writing to this repository for the most part (certainly no one else was committing to the branch that was being modified).

The commit should not have sent projectroot/queryengine/bin, since that was already on the repository.  There were a few changes in the commit for projectroot/queryengine/src, but the folder itself was already on the repository and didn't need to change.

I hope that is the clarification you were looking for.
Comment 4 Gergely Kiss CLA Friend 2009-09-19 09:50:20 EDT
I also ran into the same problem on an Ubuntu system, after only 5 pushes. What I did was:
- created a github repo
- git init on command line
- git add in eclipse
- git commit in eclipse
- before pushing, I had to 'git remote add origin...', because egit blew up every time I tried to add a remote repo
- git push in eclipse

And everything is green and shiny. However, when trying to clone to a windows host, I got the 'aborting due to possible repository corruption' with command line git, and a message-less InvocationTargetException with eclipse.

And no wonder it failed, because looking more closely at the repo on GitHub, _half of my code is missing_, including files and whole folders! Clearly, the commits never made it there, yet there were no errors at all.

This is most upsetting, because we planned to use git for our next java project, but with critical issues like this, we can only trust the command line.

Also now I can't hack my hobby project until I get back to the linux box, which makes me sad :(
Comment 5 John Bito CLA Friend 2009-09-24 12:58:45 EDT
Just corrupted the repository again.  This time, the push was from a Solaris machine running 0.5.0.200908282229.

I don't think I'll ask GitHub to preserve anything, since the material I collected last time was never requested.
Comment 6 John Bito CLA Friend 2009-09-24 13:19:03 EDT
Upon further investigation, it seems the push was actually from a Windows box running 0.4.9.200906240051.  Now I'll REALLY make sure that we don't have any old versions lying around...
Comment 7 Robin Rosenberg CLA Friend 2010-01-31 04:24:55 EST
Created attachment 157708
Repo causing bad behavior

I was alerted to this example and repo. This repo (attached as a bundle) has some empty trees that make the revwalk fail.

$ git ls-tree -rt a54f1a85ebf6a7f53aa60a45a1be33f8b078fb7e
040000 tree bfe058ad536cdb12e127cde63b01472c960ea105    A
040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904    A/A
040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904    A/B
100644 blob abbbfafe3129f85747aba7bfac992af77134c607    B

$ git ls-tree -rt a54f1a85ebf6a7f53aa60a45a1be33f8b078fb7e^
040000 tree f8bfbec77c96ba97d47e4c8bc72729ac29905ef1    A
040000 tree 7c178d1296d8b87e83382c324aeb32e2def2a5af    A/A
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    A/A/A
040000 tree 7c178d1296d8b87e83382c324aeb32e2def2a5af    A/B
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    A/B/A
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    B

This was derived from a live repo git://github.com/RdeWilde/rows-server.git with a strange commit at 8beb7a5ebdf136a714f3ba9aad774dc4b09e8fce
Comment 8 Robin Rosenberg CLA Friend 2010-01-31 08:22:43 EST
(In reply to comment #2)
> (In reply to comment #0)
> > 040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904    bin-groovy

4b825dc642cb6eb9a060e54bf8d69288fbee4904 is the empty tree SHA-1
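
For readers who want to check this, a small Python sketch: a git object id is the SHA-1 of a header of the form "<type> <size>" followed by a NUL byte and then the object's content, so a tree with no entries always hashes to the same value. The helper name `git_object_id` is made up for this example.

```python
import hashlib


def git_object_id(kind: str, payload: bytes) -> str:
    """SHA-1 of '<kind> <size>' + NUL + payload, as git names loose objects."""
    header = f"{kind} {len(payload)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


EMPTY_TREE = git_object_id("tree", b"")   # a tree with zero entries
EMPTY_BLOB = git_object_id("blob", b"")   # a zero-length file
print(EMPTY_TREE)
print(EMPTY_BLOB)
```

The second value is the e69de29... id that appears for the zero-length blobs in the ls-tree listing in comment 7.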
Comment 9 Shawn Pearce CLA Friend 2010-01-31 22:03:14 EST
Yesterday on IRC Ilari pointed out to me that the empty tree causes issues:

<Ilari>
Looks like the code won't handle hitting 4b825dc642cb6eb9a060e54bf8d69288fbee4904 (empty tree) too well.
ObjectWalk appears to quit walking that revision and move to next if empty tree is hit. If object walk is part of push, there's only single revision to push and there are modified files after the empty tree....
Well, let's just say hope the receiving end has fsck on receive enabled...
Or maybe it takes two empty trees in a row...
Anyway, empty trees seem to do bad things to object walking machinery.
What creates those empty trees is one matter. But that they screw object walking is more serious bug.

Which agrees with your comments, and that bad repository.

I've previously looked at that same project. He managed to make a gitlink (submodule, mode 160000) point back to another commit in the same repository, and then EGit changed its mode to 100644, causing fsck --full to report the corruption of a blob converted to a commit.  I'll look at it again tomorrow.
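
To make the mode mix-up concrete, here is a short Python sketch of how a tree entry's mode implies the type of object its id should name. The helper `expected_type` is hypothetical and is not taken from git or JGit; it only encodes the standard mode values.

```python
def expected_type(mode: str) -> str:
    """Map a tree-entry mode (octal string) to the object type it implies."""
    file_type = int(mode, 8) & 0o170000  # keep only the file-type bits
    if file_type == 0o040000:
        return "tree"      # subdirectory
    if file_type == 0o160000:
        return "commit"    # gitlink, i.e. a submodule entry
    if file_type in (0o100000, 0o120000):
        return "blob"      # regular file or symlink
    raise ValueError(f"unrecognized mode {mode}")


for mode in ("040000", "100644", "160000"):
    print(mode, expected_type(mode))
```

If an entry's mode is rewritten from 160000 to 100644 while its id still names a commit, a checker expects a blob there and finds a commit instead.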

I'm just glad we have more information on this problem.  The empty tree doesn't come up often.  So I'm not surprised it took so long to track down.
Comment 10 Robin Rosenberg CLA Friend 2010-02-01 01:35:14 EST
I got this from Ilari who also credits Robert de Wilde.
Comment 11 Shawn Pearce CLA Friend 2010-02-02 13:05:10 EST
I have a new test in ObjectWalkTest which shows this behavior.

ObjectWalk is completely thrown off when it hits an empty tree.
It just stops traversing that commit, and all files within it,
which means PackWriter will never include those commits.  It
also means our fsck implementations are bogus, because we don't
do full reachability.
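
A toy model of that failure, using the tree layout from the attachment in comment 7. This is a Python sketch under stated assumptions, not JGit's ObjectWalk code: the `walk` function, the `abort_on_empty_tree` switch and the `ROOT` placeholder key are invented for the illustration, and the switch only mimics the observed symptom (the walk giving up when it reaches an empty tree).

```python
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
TREE_A = "bfe058ad536cdb12e127cde63b01472c960ea105"
BLOB_B = "abbbfafe3129f85747aba7bfac992af77134c607"
ROOT = "root-tree"  # placeholder: the root tree id is not shown in comment 7

# Tree id -> ordered (type, id, name) entries, as printed by git ls-tree -rt.
STORE = {
    ROOT: [("tree", TREE_A, "A"), ("blob", BLOB_B, "B")],
    TREE_A: [("tree", EMPTY_TREE, "A"), ("tree", EMPTY_TREE, "B")],
    EMPTY_TREE: [],
}


class WalkAborted(Exception):
    """Raised by the simulated bug to abandon the rest of the walk."""


def walk(store, root, abort_on_empty_tree=False):
    """Depth-first walk returning object ids in the order first visited."""
    seen = []

    def visit(tree_id):
        entries = store[tree_id]
        if abort_on_empty_tree and not entries:
            raise WalkAborted(tree_id)  # simulated bug: give up entirely
        for kind, oid, _name in entries:
            if oid in seen:
                continue
            seen.append(oid)
            if kind == "tree":
                visit(oid)

    try:
        visit(root)
    except WalkAborted:
        pass
    return seen


complete = walk(STORE, ROOT)
truncated = walk(STORE, ROOT, abort_on_empty_tree=True)
print("never visited:", sorted(set(complete) - set(truncated)))
```

In this model the blob B, which sorts after the directory A, is never reached once the walk gives up at A/A, so a pack built from the truncated walk would omit it.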

I'm working up a fix now.
Comment 12 Shawn Pearce CLA Friend 2010-02-02 17:28:56 EST
I think this is fixed by http://egit.eclipse.org/r/259
Comment 13 Shawn Pearce CLA Friend 2010-02-03 13:37:31 EST
Fixed in commit 7cdf0e2fb1c01632ff1d785a462e09b2e823a763
Comment 14 Shawn Pearce CLA Friend 2010-04-27 16:02:52 EDT
*** Bug 307778 has been marked as a duplicate of this bug. ***