It is well-known that two git commits within a single repo can be independent from each other, by changing separate files to each other, or changing separate parts of the same file(s). Conversely when a commit changes a line, it is "dependent" on not only the commit which last changed that line, but also any commits which were responsible for providing the surrounding lines of context, because without those previous versions of the line and its context, the commit's diff would not cleanly apply.
As with most dependency relationships, these form a directed acyclic graph. Sometimes it is useful to understand the nature of parts of this graph; for example when porting a commit "A" between git branches via
git cherry-pick, it can be useful to programmatically determine in advance the minimum number of other dependent commits which would also need to be cherry-picked to provide the context for commit "A" to cleanly apply.
Another use case might be to better understand levels of specialism / cross-functionality within an agile team. If I author a commit which modifies (say) lines 34-37 and 102-109 of a file, the authors of the dependent commits forms a list which indicates the group of people I should potentially consider asking to review my commit, since I'm effectively changing "their" code. Monitoring those relationships over time might shed some light on how agile teams should best coordinate efforts on shared code bases.
I'm sure there are other use cases I haven't yet thought of.
I have written a tool called
git deps which automatically walks this graph. Currently the output is text only, but it would be cool to visualise the results, e.g. by generating a static HTML page which uses
d3.js to provide a force-directed, zoomable layout of the graph where individual nodes can be hovered over or clicked to expand to see the commit details. It would be nice to colour the nodes according to the commit author. There is a decent amount of prior art for visualizing dependency graphs in this way, e.g.
Also the tool should make better use of
pygit2 since blame support was not complete when it was originally written.
(BTW the dependency graph is likely to be semantically incomplete; for example it would not auto-detect dependencies between a commit which changes code and another commit which changes documentation or tests to reflect this code. (Incidentally this is one reason why it is usually a very good idea to logically group such changes together in a single commit.)
Looking for mad skills in:
This project is part of:
Hack Week 11