March 22, 2017
Working with Git is great. This modern version control system boasts a distributed architecture and has much to offer compared to many of its alternatives. It keeps track of every change to the files in the repository, lets you go back to a previous state of the project to review what went wrong when nothing seems to work, and enables you to work with a team without much hassle.
But more often than not, a project needs another project (commonly referred to as a library) to function properly. Using third-party libraries is generally favored over creating something from scratch. When such a situation occurs, the most common approach is to use a dependency manager (like CocoaPods, Gradle, Composer, and countless more) to take care of installing and updating the required libraries.
This is a decent enough solution until the library your project depends on releases a new version with fewer bugs, but the dependency manager doesn’t offer it. This can happen because the developer chose to stop distributing the library through that particular dependency manager. Whatever the reason may be, when the dependency manager you’ve been using doesn’t contain the version of the library you need, it’s always a nuisance to implement an alternative.
This becomes exasperating when you end up using multiple dependency managers just to keep your project afloat. This is when you wish there was a way to include the library in your project directly from its own repository, so that you can easily control which version of the library you use. Git takes care of this issue using submodules. A submodule allows you to include a Git repository as a subdirectory of another Git repository, and yet keep the commits of both repositories separate. This can be used to clone the repository of a library (or another one of your own projects) into your current project.
WHAT IS A GIT SUBMODULE?
As mentioned before, a Git submodule lets us add a Git repository as a subdirectory (sub-folder) of another Git repository at a specific path. This subdirectory still contains its own Git repository, which means you can set the subdirectory to the specific commit you want to work with.
A ‘.gitmodules’ file is created in the root of the parent repository to keep track of the configuration of all the submodules. This file maps each submodule’s repository URL to its local subdirectory. Multiple submodules are recorded as multiple entries.
‘.gitmodules’ file in the parent Git directory which has integrated the submodules
A generic .gitmodules file with three different submodule entries will look like:
In the above example, SubModule1, SubModule2, and SubModule3 are integrated into the project via the subdirectories SubModule1Local, SubModule2Local, and SubModule3Local respectively. The ‘url’ entry gives the repository URL from which Git clones the respective submodule.
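To make the mapping concrete, here is a minimal sketch that writes a ‘.gitmodules’ file with those three entries and then queries it. The example.com URLs are placeholders and not real repositories; only the subdirectory names come from the description above.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Hand-written .gitmodules; the example.com URLs are hypothetical placeholders.
cat > .gitmodules <<'EOF'
[submodule "SubModule1Local"]
    path = SubModule1Local
    url = https://example.com/SubModule1.git
[submodule "SubModule2Local"]
    path = SubModule2Local
    url = https://example.com/SubModule2.git
[submodule "SubModule3Local"]
    path = SubModule3Local
    url = https://example.com/SubModule3.git
EOF

# .gitmodules uses Git's config syntax, so `git config -f` can read it.
git config -f .gitmodules --get-regexp '^submodule\..*\.path$'

# Look up the URL that belongs to one local subdirectory.
url1=$(git config -f .gitmodules --get submodule.SubModule1Local.url)
echo "SubModule1Local is cloned from: $url1"
```

Each section name is the submodule’s name, ‘path’ is the local subdirectory, and ‘url’ is where Git clones from.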
WHO IS GIT SUBMODULE INTENDED FOR?
There is a popular opinion that Git submodules are hard for a beginner to comprehend. This is partially true, since Git in itself can be overwhelming to begin with. It is recommended that you are comfortable with the fundamentals of Git before diving into Git submodules, since you’ll be dealing with multiple tracking trees.
In brief, any changes made in the directory of the submodule are committed to the repository of the submodule and not to the main module. The main module tracks which commit of the submodule is checked out, not the actual changes to its contents. The submodule directory acts like a link to whatever repository you’ve decided to include in your project.
For example, consider that I have a project named ‘MainModule’ with URL ‘https://github.com/aju-techjini/MainModule.git’. I would now like to include another project named ‘SubModule1’ with URL ‘https://github.com/aju-techjini/SubModule1.git’ in a folder named ‘SubModule1’.
Once I’ve added the SubModule1 project to the MainModule project in the SubModule1 folder, my project folder looks like:
And my .gitmodules file looks like:
Now when I create a ‘SampleFile’ in the MainModule project folder, it is tracked by the MainModule repository and can be committed as a normal file. But when I create ‘AnotherSampleFile’ in the SubModule1 folder, it isn’t tracked by the MainModule repository. The MainModule repository does recognize that there is modified content in the SubModule1 directory, but it is unable to track or commit those changes. Once you change your current working directory to the SubModule1 folder, you’ll be able to track and commit the file using the SubModule1 repository. The MainModule repository then sees that a new commit has been made in the SubModule1 submodule, and committing in MainModule records the new current commit of SubModule1.
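The tracking boundary described above can be tried out with a small sketch. It makes a few assumptions: a local throwaway repository stands in for the GitHub SubModule1 repository, the commit identity is a dummy one, and ‘-c protocol.file.allow=always’ is passed because recent Git versions refuse to clone submodules from local paths by default.

```shell
set -eu
# Dummy identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Local stand-in for the SubModule1 repository (hypothetical content).
git init -q "$work/SubModule1-origin"
echo "library code" > "$work/SubModule1-origin/README"
git -C "$work/SubModule1-origin" add README
git -C "$work/SubModule1-origin" commit -q -m "Library commit"

# MainModule with SubModule1 added in the SubModule1 folder.
git init -q "$work/MainModule"
cd "$work/MainModule"
git -c protocol.file.allow=always submodule add "$work/SubModule1-origin" SubModule1
git commit -q -m "Add SubModule1"

# One new file in each working tree.
touch SampleFile SubModule1/AnotherSampleFile

# MainModule lists SampleFile as untracked and flags SubModule1 as changed,
# but it never lists AnotherSampleFile itself.
main_status=$(git status --porcelain)
echo "$main_status"

# The submodule's own repository is the one that sees AnotherSampleFile.
sub_status=$(git -C SubModule1 status --porcelain)
echo "$sub_status"

# MainModule stores SubModule1 as a commit pointer (mode 160000), not as files.
git ls-tree HEAD SubModule1
```

The last command is the "link" mentioned earlier: the main repository holds one entry of type ‘commit’ for the whole submodule folder.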
WHEN TO USE GIT SUBMODULES?
-Git submodules are generally considered complicated to work with, and more so as the complexity of the libraries and of your own project increases. The entire purpose of using a version control system and submodules is to make the life of a developer easier. So if your project is better handled by a dependency manager, use a dependency manager rather than submodules.
-When using Git submodules, it’s of utmost importance to keep the submodule contents consistent across all the collaborators. If you have updated a submodule to a different commit, pulling that change doesn’t automatically update the contents of the submodules in your teammates’ working copies. You’ll have to let your teammates know that they need to update their submodules to avoid weird behavior.
-So it’s best to use Git submodules when other solutions are unreliable and the team of collaborators has really good communication (good team communication is recommended regardless of the use of Git submodules).
HOW TO USE GIT SUBMODULES?
Now we’ll look into how to use Git submodules. As discussed before, a Git submodule is considered to be a link to another Git repository. Hence, every submodule will also have a local and server-side repository. We’ll look into how to integrate a submodule, update it and delete it.
-Initializing a submodule: Consider that there are two sample submodules that I have to integrate into my project, which is named ‘SampleProject’. The two sample submodules are ‘SampleModule1’ with URL ‘https://github.com/aju-techjini/SubModule1.git’ and ‘SampleModule2’ with URL ‘https://github.com/aju-techjini/SubModule2.git’.
1)We’ll start by initializing a Git repository for SampleProject. To do this, we run the following command in the terminal from the SampleProject folder:
git init
2)We’ll create a SampleFile for the initial commit using any text editor, then stage and commit the file using the commands:
git add SampleFile
git commit -m "Initial Commit"
My project folder now looks like:
3)Once the Git repository is set up, we can include our submodules in the project. Using the submodule details above, we’ll integrate those repositories into our project as subdirectories:
git submodule add https://github.com/aju-techjini/SubModule1.git SubModule1Local
git submodule add https://github.com/aju-techjini/SubModule2.git SubModule2Local
Here three entries are created: ‘.gitmodules’, which maps each submodule URL to a local directory, and ‘SubModule1Local’ and ‘SubModule2Local’. On disk these two are directories holding the cloned repositories, but the main repository records each of them as a single entry that stores only a commit reference. We’ll be able to check the commit references once we have committed the code.
4)We’ll go ahead and commit the code using the command:
git commit -am "Integrated Submodules"
4.a)We can check the commit difference using the ‘git show’ command, which displays the changes to the three entries mentioned above: ‘.gitmodules’, ‘SubModule1Local’, and ‘SubModule2Local’.
Here you can see that the entries ‘SubModule1Local’ and ‘SubModule2Local’ each add just a single line, ‘+Subproject commit df72b0bca243b0cf0259f58d6fa62666cec943c5’ and ‘+Subproject commit b1d6cb161063750fc004eb26eb6a06965f3ada82’ respectively.
My project folder now looks like:
My .gitmodules file contents are:
Note: you can use the ‘git submodule status’ command to check which commit each submodule currently has checked out, along with its path:
git submodule status
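Steps 1 to 4 can be replayed offline with the sketch below. It assumes two local throwaway repositories in place of the GitHub URLs, a dummy commit identity, and ‘-c protocol.file.allow=always’ because recent Git versions block local-path submodule clones by default. The commit hashes it prints will differ from the ones quoted above.

```shell
set -eu
# Dummy identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Local stand-ins for the two GitHub repositories (an assumption for offline use).
for name in SubModule1 SubModule2; do
    git init -q "$work/$name"
    echo "$name content" > "$work/$name/README"
    git -C "$work/$name" add README
    git -C "$work/$name" commit -q -m "Initial commit of $name"
done

# Steps 1 and 2: create SampleProject and make the initial commit.
git init -q "$work/SampleProject"
cd "$work/SampleProject"
echo "sample" > SampleFile
git add SampleFile
git commit -q -m "Initial Commit"

# Step 3: add both submodules as subdirectories.
git -c protocol.file.allow=always submodule add "$work/SubModule1" SubModule1Local
git -c protocol.file.allow=always submodule add "$work/SubModule2" SubModule2Local

# Step 4: commit, then inspect what the main repository recorded.
git commit -q -am "Integrated Submodules"
git --no-pager show
git submodule status
```

In the ‘git show’ output, each submodule appears as a new entry with mode 160000 and a single ‘+Subproject commit’ line.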
-Updating a submodule: We just saw how to create/initialize submodules. Do remember that when one of the collaborators in the team clones the repository, the submodules will not be initialized. They need to use these commands to fetch the submodules’ content:
git submodule init
git submodule update
Note: An alternative that avoids the submodule init and update commands is to use the --recursive option when cloning the main project that contains the submodules:
git clone --recursive <Project-Url>
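The difference between the two approaches can be seen in a small sketch. It assumes a local throwaway repository as the submodule’s remote, a dummy commit identity, and ‘-c protocol.file.allow=always’, which recent Git versions need before they will clone submodules from local paths.

```shell
set -eu
# Dummy identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Local stand-in for a submodule's remote repository.
git init -q "$work/lib"
echo "library code" > "$work/lib/AnotherSampleFile"
git -C "$work/lib" add AnotherSampleFile
git -C "$work/lib" commit -q -m "Library commit"

# Main project that records the stand-in as SubModule1Local.
git init -q "$work/SampleProject"
cd "$work/SampleProject"
git -c protocol.file.allow=always submodule add "$work/lib" SubModule1Local
git commit -q -m "Integrated Submodules"

# A plain clone creates the submodule directory but leaves it empty.
git clone -q "$work/SampleProject" "$work/plain"
before=$(ls -A "$work/plain/SubModule1Local")
echo "contents after plain clone: '$before'"

# init copies the URL from .gitmodules into .git/config;
# update clones the submodule and checks out the recorded commit.
cd "$work/plain"
git submodule init
git -c protocol.file.allow=always submodule update
ls -A SubModule1Local

# --recursive performs both steps as part of the clone.
git -c protocol.file.allow=always clone -q --recursive "$work/SampleProject" "$work/recursive"
ls -A "$work/recursive/SubModule1Local"
```

Both routes end in the same state: the submodule directory holds the commit that the main project recorded.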
-Removing a submodule: To properly remove a submodule from your project, you can use the following commands (it was a pain to find the proper commands to use for this, thanks to CodeWizard <http://stackoverflow.com/users/1755598/> for clearing it up):
git submodule deinit <SubModule>
git rm <SubModule>
# Note: SubModule (no trailing slash)
# or, if you want to leave it in your working tree
git rm --cached <SubModule>
rm -rf .git/modules/<SubModule>
Commands quoted from http://stackoverflow.com/questions/29850029/
Let’s try removing the SubModule1Local from our project:
git submodule deinit SubModule1Local
git rm SubModule1Local
My project folder now looks like:
My ‘.gitmodules’ file contents are:
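The removal steps can be replayed end to end with the sketch below. As assumptions, it uses a single local throwaway repository as the submodule’s remote, a dummy commit identity, and ‘-c protocol.file.allow=always’ for the local-path clone on recent Git versions.

```shell
set -eu
# Dummy identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Local stand-in for the submodule's remote repository.
git init -q "$work/lib"
echo "library code" > "$work/lib/AnotherSampleFile"
git -C "$work/lib" add AnotherSampleFile
git -C "$work/lib" commit -q -m "Library commit"

# Main project with the submodule already integrated and committed.
git init -q "$work/SampleProject"
cd "$work/SampleProject"
git -c protocol.file.allow=always submodule add "$work/lib" SubModule1Local
git commit -q -m "Integrated Submodules"

# Unregister the submodule from .git/config and empty its working directory.
git submodule deinit SubModule1Local
# Remove the entry from the index and from .gitmodules, then record the removal.
git rm -q SubModule1Local
git commit -q -m "Removed SubModule1Local"
# The submodule's repository data stays here until it is deleted by hand.
rm -rf .git/modules/SubModule1Local

# Nothing should be left in .gitmodules or in the index.
remaining=$(git config -f .gitmodules --get-regexp '^submodule\.' || true)
echo "remaining .gitmodules entries: '$remaining'"
git submodule status
```

The sketch also commits after ‘git rm’, so the removal reaches collaborators the next time the main project is pushed.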
-Working in a submodule: You can work directly in the repository of the submodule by changing the current working directory to the submodule’s subdirectory. Once the changes are made, we can commit them in the submodule’s repository and push them.
Then, to update the commit in your main project, just change the current working directory back to your project path and make a commit.
Let’s walk through an example:
1)We’ll change the working directory to the SubModule1Local folder and edit the file named ‘AnotherSampleFile’ that is already present there. Using ‘git status’ we can see the file whose content has been changed:
2)Let’s change the directory to the main folder ‘SampleProject’ and run the command ‘git status’ again to check the changes that the SampleProject repository is tracking:
Here the SampleProject repository sees that the SubModule1Local submodule has modified content, but that content can’t be tracked using this repository. Let’s try to commit the changes to the SampleProject repository and see what happens:
git commit -am "SubModule Changed"
As we see here, the changes made in the submodule can’t be tracked or committed to the main module.
3)Let’s change our directory back to the submodule directory and commit the changes:
git commit -am "File Change for demo"
The changes are committed to the SubModule1 repository locally; they will not be available to other collaborators until you push them using the ‘git push’ command. If you decide not to push the changes, the link in the main module for this submodule will point to a commit that doesn’t exist on the remote, which will lead to inconsistencies.
Using ‘git status’ will inform you about the number of commits yet to be pushed to a remote.
4)Let’s change our working directory back to the SampleProject folder and use ‘git status’ to track the status of our repo.
Here we see that the repo tracks that SubModule1Local submodule has a different commit now. We need to commit this change and push so that the repo is updated about the commit change in the submodule.
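All collaborators (once they pull the changes in the main module) have to run the ‘git submodule update’ command to update the submodule to the correct commit.
As mentioned before, if that commit isn’t pushed in the submodule repository, it’ll cause an inconsistency: collaborators won’t be able to check out the commit the main module points to. Also note that ‘git submodule update’ checks out the recorded commit directly, so the submodule will show that it has a detached HEAD.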
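The whole round trip, including the teammate’s side, can be sketched as follows. The assumptions are: a local bare repository plays the submodule’s server-side repository so that ‘git push’ has somewhere to go, a second clone named ‘teammate’ plays a collaborator, the commit identity is a dummy one, and ‘-c protocol.file.allow=always’ is passed because recent Git versions block local-path submodule transfers by default.

```shell
set -eu
# Dummy identity so commits work even where Git is not configured.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Bare stand-in for the submodule's server-side repository.
git init -q "$work/seed"
echo "v1" > "$work/seed/AnotherSampleFile"
git -C "$work/seed" add AnotherSampleFile
git -C "$work/seed" commit -q -m "Library commit"
git clone -q --bare "$work/seed" "$work/SubModule1.git"

# Main project with the submodule integrated.
git init -q "$work/SampleProject"
cd "$work/SampleProject"
git -c protocol.file.allow=always submodule add "$work/SubModule1.git" SubModule1Local
git commit -q -m "Integrated Submodules"

# A teammate's copy, cloned before any further changes.
git -c protocol.file.allow=always clone -q --recursive "$work/SampleProject" "$work/teammate"

# Steps 1 and 3: edit, commit and push inside the submodule.
cd "$work/SampleProject/SubModule1Local"
echo "v2" >> AnotherSampleFile
git commit -q -am "File Change for demo"
git push -q origin HEAD
pushed=$(git rev-parse HEAD)

# Step 4: the main project sees a new submodule commit; record it.
cd "$work/SampleProject"
git status --short
git commit -q -am "SubModule Changed"

# The teammate pulls the main project, then updates the submodule.
cd "$work/teammate"
git -c protocol.file.allow=always pull -q
stale=$(git -C SubModule1Local rev-parse HEAD)
git -c protocol.file.allow=always submodule update
fresh=$(git -C SubModule1Local rev-parse HEAD)
echo "before update: $stale"
echo "after update:  $fresh"
```

Until ‘git submodule update’ runs, the teammate’s submodule stays on the old commit even though the main project already points at the new one.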
CONS OF GIT SUBMODULES
Now we know what Git submodules are, and when and how to use them. Let’s now look at a few of the reasons why this solution may cause issues during the development of a project.
a)We’ve been talking about how susceptible Git submodules are to inconsistencies. That seems to be almost the only problem that threatens Git submodules. Listed here are some of the scenarios in which this issue occurs:
-When one of the collaborators has made changes to the submodule but doesn’t push the submodule repo, other collaborators will not be in sync.
-When one of the collaborators has committed a change to a submodule and the others don’t run the ‘git submodule update’ command, their repos will have inconsistent commits.
-When two different branches are merged, the submodules remain unchanged, so the submodule update command must be run afterwards without fail.
b)It’s almost impossible to track the entire project at once when using submodules. Changes in the submodules aren’t tracked by the main module, and you must change the current working directory to look at them.
c)It’s hard to use IDEs when using Git Submodules and developers are forced to use the shell to maintain consistency.