It is well known that two git commits within a single repo can be independent of each other, either by changing different files or by changing separate parts of the same file(s). Conversely, when a commit changes a line, it is "dependent" not only on the commit which last changed that line, but also on any commits which were responsible for providing the surrounding lines of context, because without those previous versions of the line and its context, the commit's diff would not apply cleanly.
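To make the idea of "the line and its context" concrete, here is a minimal sketch of one way to find which lines of the parent revision a commit's diff relies on. It reads the hunk headers of a unified diff and reports the old-side line ranges (changed lines plus context). The function name `parent_ranges` is hypothetical and this is not necessarily how git deps does it. Each resulting range could then be passed to something like `git blame -L <start>,<end> <parent> -- <path>` to discover the commits that provided those lines.

```python
import re

# Matches unified diff hunk headers such as "@@ -10,7 +10,8 @@ def f():".
# A missing count (e.g. "-10") means a count of 1.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parent_ranges(diff_text):
    """Map each old path to the (start, end) line ranges of the parent
    revision covered by the diff's hunks (changed lines plus context)."""
    ranges = {}
    path = None
    old_left = new_left = 0  # lines still expected in the current hunk body
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            # Inside a hunk body: consume lines by count, so that removed
            # content such as "-- comment" (shown as "--- comment") is not
            # mistaken for a file header.
            if line.startswith("-"):
                old_left -= 1
            elif line.startswith("+"):
                new_left -= 1
            elif not line.startswith("\\"):  # "\ No newline at end of file"
                old_left -= 1
                new_left -= 1
            continue
        if line.startswith("--- "):
            name = line[4:].split("\t")[0]
            if name == "/dev/null":
                path = None  # newly added file: nothing in the parent
            else:
                path = name[2:] if name.startswith("a/") else name
            continue
        match = HUNK_RE.match(line)
        if match:
            start = int(match.group(1))
            old_left = int(match.group(2) or 1)
            new_left = int(match.group(4) or 1)
            if path is not None and old_left > 0:
                ranges.setdefault(path, []).append((start, start + old_left - 1))
    return ranges
```

With git's default of three context lines, a one-line change in the middle of a file yields a seven-line range, which is why a commit can depend on commits that never touched the changed line itself.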

As with most dependency relationships, these form a directed acyclic graph. Sometimes it is useful to understand parts of this graph; for example, when porting a commit "A" between git branches via git cherry-pick, it can be useful to programmatically determine in advance the minimum set of other commits on which "A" depends, and which would therefore also need to be cherry-picked to provide the context for "A" to apply cleanly.
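As an illustration of the cherry-pick use case, the following sketch walks such a graph to collect everything commit "A" needs, listing dependencies before the commits that need them. It assumes the direct dependencies have already been computed and are available as a plain dictionary. The names `commits_to_port`, `direct_deps` and `already_present` are hypothetical and are not part of git deps.

```python
def commits_to_port(target, direct_deps, already_present=frozenset()):
    """Return `target` plus all of its transitive dependencies, ordered so
    that every commit appears after the commits it depends on.

    direct_deps:     dict mapping a commit to the commits it depends on
    already_present: commits the destination branch already contains
    """
    order = []
    seen = set(already_present)
    stack = [(target, False)]
    while stack:
        commit, expanded = stack.pop()
        if expanded:
            # All dependencies of this commit have been emitted already.
            order.append(commit)
            continue
        if commit in seen:
            continue
        seen.add(commit)
        stack.append((commit, True))
        for dep in direct_deps.get(commit, ()):
            if dep not in seen:
                stack.append((dep, False))
    return order


deps = {"A": ["B", "C"], "B": ["C"]}
print(commits_to_port("A", deps))  # ['C', 'B', 'A']
```

The traversal is an iterative post-order depth-first search, which yields a valid ordering only because the graph is acyclic. Passing the commits already on the destination branch as `already_present` prunes the walk at those commits.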

Another use case might be to better understand levels of specialism / cross-functionality within an agile team. If I author a commit which modifies (say) lines 34-37 and 102-109 of a file, the authors of the commits it depends on form a list of people I should potentially consider asking to review my commit, since I'm effectively changing "their" code. Monitoring those relationships over time might shed some light on how agile teams should best coordinate efforts on shared code bases.
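A rough sketch of the reviewer idea, assuming the dependency commits and their authors are already known: count how often each author appears and rank them, leaving out the author of the new commit. The function name `suggest_reviewers` and the input shape are hypothetical.

```python
from collections import Counter


def suggest_reviewers(dependency_authors, commit_author):
    """Rank the authors of the commits a change depends on.

    dependency_authors: dict mapping each dependency commit to its author
    commit_author:      author of the new commit, excluded from the result
    """
    counts = Counter(
        author
        for author in dependency_authors.values()
        if author != commit_author
    )
    # most_common() sorts by descending count.
    return [author for author, _ in counts.most_common()]
```

Ranking by the number of dependency commits is only one possible weighting; counting the number of lines each author contributed to the touched ranges might be a better signal.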

I'm sure there are other use cases I haven't yet thought of.

I have written a tool called git deps which automatically walks this graph. Currently the output is text only, but it would be cool to visualise the results, e.g. by generating a static HTML page which uses d3.js to provide a force-directed, zoomable layout of the graph, where individual nodes can be hovered over or clicked to expand and show the commit details. It would be nice to colour the nodes according to the commit author. There is a decent amount of prior art for visualising dependency graphs in this way.
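One possible first step towards such a page is to serialise the graph as JSON in the nodes-and-links shape commonly fed to d3's force layout. The sketch below assumes the dependencies and authors are held in plain dictionaries. The function name `to_d3_graph` and the `id` / `author` / `source` / `target` field names are assumptions chosen to match common d3 examples, not the output format of git deps.

```python
import json


def to_d3_graph(direct_deps, authors):
    """Serialise a dependency graph as {"nodes": [...], "links": [...]} JSON.

    direct_deps: dict mapping a commit to the commits it depends on
    authors:     dict mapping a commit to its author (used for colouring)
    """
    nodes = {}
    links = []
    for commit, deps in direct_deps.items():
        # Register the commit and its dependencies once each, in first-seen order.
        for sha in (commit, *deps):
            nodes.setdefault(
                sha, {"id": sha, "author": authors.get(sha, "unknown")}
            )
        for dep in deps:
            links.append({"source": commit, "target": dep})
    return json.dumps({"nodes": list(nodes.values()), "links": links}, indent=2)
```

On the JavaScript side, the `author` field could drive an ordinal colour scale so that each author's commits share a colour.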

Also, the tool should make better use of pygit2, since pygit2's blame support was not complete when the tool was originally written.

(BTW the dependency graph is likely to be semantically incomplete; for example, it would not auto-detect dependencies between a commit which changes code and another commit which changes documentation or tests to reflect that code change. Incidentally, this is one reason why it is usually a very good idea to logically group such changes together in a single commit.)

Looking for mad skills in:

git python d3js javascript

This project is part of:

Hack Week 11

Comments

    • aspiers
      over 2 years ago by aspiers

      I'd love to know if this work would help the kernel team with their backporting work, which presumably can get pretty hairy.

    • vitezslav_cizek
      over 2 years ago by vitezslav_cizek

      Not just the kernel team, but essentially any packager who needs to backport a fix, which happens quite often (e.g. security bugs). I have been missing a tool like this and I'm pleased to learn about your git-deps.

    • aspiers
      over 2 years ago by aspiers

      Somehow I missed your reply before :) Good to know there is interest! I did not take part in the last hackweek so I may be able to do this for my own hackweek soon.