The Pitfalls of Preliminary Over-Modularization in Android Projects

Modularization of Android applications has become a popular topic in the Android community over the past couple of years. In the past, you wouldn’t hear much on this subject, but today it feels like an app isn’t even a “hello world” if it doesn’t have several modules.

Don’t take my word for that, though. Search for “Android modularization” in Google and set the time range to 2016. You’ll find very few resources. But if you change the time range to 2019, you’ll find lots of articles and tutorials on this subject, including a flood of presentations by Googlers at various conferences.

In my opinion, the hype around modularization causes much more harm than good. It makes modularization look like something that all Android projects need, while, in practice, that’s not exactly the case and it’s important to understand the nuances. Therefore, in this article, I’ll explain the risks associated with preliminary modularization and over-modularization. In addition, I’ll address several widespread myths about modularization circulating in the Android community.

Modularization in Android

We say that an Android application is modularized if it consists of more than one Gradle module. [I know that build systems other than Gradle exist, but they are far less popular, so I’ll ignore them for the sake of this post.]

A Gradle module is basically a standalone project. When you create a new Android project in Android Studio, the wizard automatically adds one module for you. This single module is usually called “app”. As long as you don’t add new modules, your app is not modularized.

[Terminology clarification: in Gradle’s official terminology, modules are called projects. However, in this article, whenever I say “project”, I mean the higher-level structure that includes one or more Gradle projects. Whenever I say “module”, I mean an individual Gradle project.]

The simplest way to modularize your application is to right-click in the Project navigator in Android Studio and then click New->Module. Then choose the type of module you’d like to add, configure the module’s attributes and voilà, your app is modularized. Android Studio will also include your newly created module in the top-level project’s settings.gradle. This will allow you to make use of the newly created module in other modules.

For example, if you add a new module and call it newmodule, then, when you want to use it from the app module, you need to add the following line to the dependencies section in app’s build.gradle file:

[code language="groovy"]
// app/build.gradle, inside the dependencies { } block
implementation project(":newmodule")
[/code]
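
For context, the settings.gradle entries that make this dependency resolvable might look like the sketch below. It assumes the two module names used above (app and newmodule); the exact form that Android Studio generates can differ between versions.

```groovy
// settings.gradle in the top-level project (sketch; module names taken from the example above)
include ':app'
include ':newmodule'
```

A module that isn’t listed here isn’t part of the build, so project(":newmodule") wouldn’t resolve without the second line.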

As you see, adding new modules to your application isn’t difficult. It takes just a minute. But this fact doesn’t explain why you’d want to have multiple modules to begin with. Therefore, in the following sections, I’ll discuss the most common reasons for modularization.

Shorter Build Times

The main reason for the popularity of modularization is that it supposedly reduces build times.

As many Android developers know, on bigger projects, build times become the main limiting factor of developers’ productivity. It’s so frustrating when you need to wait a minute or more for the app to build after changing a single character in the code. In this situation, you’re pretty much guaranteed to be unable to reach the state of “flow”. Furthermore, long build times become especially problematic if you want to practice Test-Driven Development, for which a short feedback cycle is essential. I, personally, find it very difficult to do TDD if it takes more than ~30 seconds to build the app and execute tests.

So, long build times are a real problem. How does modularization help with it?

Well, Gradle has several mechanisms that reduce the amount of work that needs to be done on incremental builds. These mechanisms can be especially beneficial in modularized projects. For example, if you change code in one module, other modules that don’t depend on the changed module won’t need to be rebuilt. In fact, even modules that do depend on the changed module might not need to be rebuilt if the change is ABI compatible.
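
To make the ABI point a bit more concrete, here is a minimal sketch of how a module’s dependency declarations influence what its consumers can see. The module names (:feature, :core, :utils) are hypothetical, and the snippet assumes a module type that supports the api configuration (for example, an Android library module).

```groovy
// build.gradle of a hypothetical library module ":feature"
dependencies {
    // Exposed on the compile classpath of modules that depend on :feature,
    // so an ABI change in :core can force those consumers to recompile.
    api project(':core')

    // Kept off the consumers' compile classpath, so changes in :utils
    // normally require recompiling only :feature itself.
    implementation project(':utils')
}
```

This is only one of the mechanisms involved, but it shows why the same change can be cheap or expensive depending on how the modules are wired together.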

Sounds great, right? Unfortunately, it’s not that simple.

See, modularization by itself introduces additional build time overhead. Therefore, for example, extracting each source file into a standalone module is not a good idea at all. This raises the question: “what’s the right size for a module, then?” Unfortunately, as far as I know, this question has no definitive answer and there isn’t even a rule of thumb for that. Therefore, by making your modules too small, you can easily shoot yourself in the foot.

In addition, Gradle’s incremental compilation and compilation avoidance can be “broken” by incorrect configuration, misbehaving dependencies, non-incremental annotation processors and, probably, more factors. When this happens, modularization becomes a net negative contributor to build times. Some developers say: “well, just don’t break things then”, but it’s really not that simple. Gradle is an extremely complex beast and most developers (including myself) aren’t very proficient with it. Bigger companies often employ “infrastructure” teams who take care of builds, but that’s not an option for the vast majority of projects out there.

What I’m trying to say here is that modularization can easily make your build times worse. This point is rarely ever mentioned in blogs and conference talks, but you should definitely be aware of it. Therefore, even though adding new modules to the project is simple, doing that without hurting yourself is actually quite difficult and can take considerable effort.

Build Time Stats

What I find especially troubling is that advocates of modularization don’t usually provide any quantitative context. Modularization is just advertised as some kind of best practice, while, in fact, in many cases you don’t need modularization at all.

To illustrate my point, I’m going to list some real numbers from real Android projects in this section.

Application A.

  • 27 KLOC (thousands of lines of source code, excluding tests)
  • Java
  • 1 module
  • 2 incremental annotation processors
  • Incremental build: 2s

This is a relatively small, but also quite complex, application in terms of interfaces to external components and configurations. Having build times under five seconds is the dream of all Android developers. I’ll use this application as a baseline for comparison.

It should be clear that investing any effort into modularization of this application for the sake of achieving better build times would be a total waste of time. Even reducing the build time by 50% (an unreasonably high number) wouldn’t justify the investment.

Application B.

  • 67 KLOC
  • Java
  • 1 module
  • 4 non-incremental annotation processors
  • Incremental build: 10s

This is a medium-sized financial application. As you can see, the increase in build time is disproportionate to the increase in the app’s size compared to application A. In general, I found that build times don’t scale linearly with lines of code, but in this case the discrepancy is too big. In my estimation, the additional overhead can be attributed to non-incremental annotation processing, but I didn’t test this hypothesis.

Is it worth trying to modularize this app to reduce the build time? Well, it’s a question of ROI.

Let’s say I can reduce the build times by 50% if I invest 8 hours of work into extraction of logic into child modules (again, an unreasonably optimistic assumption). After how many builds will my efforts pay off? 5760! That’s probably more builds than this app saw during its entire lifetime (three years). So, I, personally, don’t think this app needs any modularization.
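
The 5760 figure follows from simple arithmetic, spelled out here under the same assumptions as in the text (8 hours invested, a 10-second incremental build cut by 50%):

```latex
\[
t_{\text{saved}} = 0.5 \times 10\,\text{s} = 5\,\text{s per build},
\qquad
N_{\text{break-even}}
  = \frac{8\,\text{h} \times 3600\,\text{s/h}}{5\,\text{s}}
  = \frac{28800\,\text{s}}{5\,\text{s}}
  = 5760 \text{ builds}.
\]
```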

Application C.

  • 87 KLOC
  • Java
  • 4 modules (app: 61 KLOC; X: 2 KLOC; Y: 6 KLOC; Z: 18 KLOC)
  • no annotation processors (yes, you can do that)
  • Incremental build after change in module “app”: 12s
  • Incremental build after change in module “X”: 16s

This is a very interesting and very complex medical application. It was effectively written as a single-module application, with modules Y and Z containing some specialized third-party code which is rarely ever changed. I myself added module X into the mix.

What’s interesting to note here is that the existence of module X evidently hurts the build times. Note how any change in this tiny module disproportionately increases the build time by more than 30% compared to module app! You’d hope that, at least, it reduces the build times when the change is made in module app, but even that isn’t true. The two thousand lines of code that currently reside in X wouldn’t make any noticeable difference in build time if they resided in app, but the configuration overhead associated with one additional module is quite significant.

To test this hypothesis, I moved the contents of module X back into module app and deleted X. Then I tested the build performance again. After this refactoring, changes in both the original files from module app and in files that were migrated from X led to 10-11s incremental builds. So, yeah, the existence of this additional module makes the performance worse in all situations.

And that’s really the main point I want to make in this article: modularization can easily make your build times longer. You can’t just add several modules, move some files there and expect that your builds will automatically get faster.

By the way, you might wonder why module X is kept in this app if it increases the build times. Well, it’s because it wasn’t added to improve the build times in the first place. See, this app is complex and quite difficult to maintain. Just to give you an idea: MainActivity has 5 KLOC and MainFragment has another 5 KLOC. So, I added module X to host newly written, clean logic, and to slowly migrate existing refactored features into it. In other words, modularization in this case was used to introduce an architectural boundary between the “legacy” and “new” parts of the app. Since the increase in build times is not too dramatic in absolute terms, it’s a justified trade-off in this case.

Google IO Scheduler 2019:

All the above apps use Java exclusively. To get an idea of how this works out with Kotlin, let’s take a look at Google’s IO Scheduler app.

  • 35 KLOC
  • Kotlin
  • 6 modules (mobile: 28 KLOC; shared: 7 KLOC; model: 428 LOC; ar: 80 LOC; androidTest-shared: 24 LOC; test-shared: 0 LOC (only test code))
  • 4 annotation processors
  • Incremental build after change in module “mobile”: 7-21s
  • Incremental build after change in module “shared”: 12s
  • Incremental build after change in module “model”: 12-31s

The first interesting thing to note is the inconsistency in build times. In most cases, the build times are closer to the lower number, but once in a while something happens and it takes 2-3 times longer to build incrementally. I didn’t invest time to look into that, but feel free to leave a comment if you know what the problem is.

Now, even assuming that only the lower ends of the build time ranges are relevant, these stats are still problematic. See, this app isn’t much bigger than the baseline application A, but its incremental build takes three to six times longer. Just think about this crazy ratio for a moment.

What could be the reasons for such a bad build performance?

Well, for one, IO Scheduler is written in Kotlin, which takes much more time to compile than Java. Then there is Kapt, which, according to Uber’s benchmarks, can basically double the compilation times. [Side-note: I’m still amazed that Google hides these crucially important facts from Android developers while they promote Kotlin.] Then there is the fact that this app uses more annotation processors. So, there are at least three factors that could’ve contributed to longer builds.

But, just like with application C, it’s evident that modularization makes matters worse. For example, when you make a change in module model, which contains just 400 lines of code, it takes ~60% more time to build the app compared to a change in module mobile. And that’s not even the smallest module in the app. I’m not even sure why IO Scheduler needs all these modules to begin with. The existence of the shared module hints at some kind of code sharing, but, even if that’s the case, the application would need just one additional module to share code, not five!

And there is more. If you just rebuild IO Scheduler without changing anything, it will still take four seconds because seven tasks will be executed. That’s your first sign that the incremental build is broken. Again, I didn’t invest time to explore why Google’s app is that broken, so please write a comment if you know the reason.

All in all, IO Scheduler is terrible in terms of build times and it employs a nonsensical modularization strategy which clearly makes the build times even worse. In addition, it looks like the incremental build in this app is broken. Given the fact that Googlers often point developers to this application as a reference, I’d expect to see a project of higher quality.

Edit:

This specific discussion of IOScheduler generated a surprising amount of feedback. Therefore, I decided to invest more time into this app and perform a more nuanced analysis. You can read about the results here.

Moonshot:

The last example that I want to show you is a small app that shows scheduled launches of SpaceX rockets. I guess it’s a must-have for SpaceX fans out there 🙂

Now, there is a huge difference between this application and the previous examples because Moonshot is a pet-project of a single developer: Kshitij Chauhan. I truly respect developers who self-educate and test new tools in pet-projects, instead of turning their professional codebases into kitchen sinks of the latest and greatest tech. Therefore, while I’m going to criticize the structure of this project, I really like it.

  • 12 KLOC
  • Kotlin (incl. in Gradle files)
  • 18 modules
  • 4 non-incremental annotation processors
  • Incremental build after change in module “app”: 20s
  • Incremental build after change in module “core”: 25s-1m15s (sometimes build simply fails with exception)

In my opinion, this pet-project clearly demonstrates the danger of following the “latest and greatest” trends in Android development. Its build times are at least 10 times longer than what I’d expect for a project of this size. In addition, I’ve never experienced so many outright build failures, though I’m not sure whether it’s related to the project, or to the latest “stable” release of Android Studio.

Why is this app’s build performance so poor? Well, just like IO Scheduler, it uses Kotlin and Kapt. Moreover, it uses Kotlin even in Gradle configuration files, which is known to cause additional build overhead. In addition, the annotation processors employed in this app are non-incremental (probably due to Epoxy).

And, of course, there is modularization. Having 18 modules for a 12 KLOC project is crazy over-modularization which is very expensive in terms of build times. I guess the developer implemented the commonly recommended “modularize by feature” approach, but, just like many other developers, shot himself in the foot with a shotgun.

One very interesting experiment you can run is to rebuild this app without changing any code. It will show you that all 499(!) tasks are up-to-date. So, the incremental build isn’t broken in this app, as opposed to Google’s IO Scheduler. However, even though no tasks get executed, such a build still takes 3 seconds on my monstrous desktop (i9, 32GB RAM, the fastest SSD and motherboard I could afford). In other words: doing nothing with this 12 KLOC application takes Gradle more time than re-building the baseline application A, which has 27 KLOC, after a change.

How can this even be possible?

Well, these three seconds, which is a lot of time, are spent on the so-called configuration phase. And the length of this phase depends on the number of modules in your app. So, it’s a kind of constant tax for modularized apps. Gotcha.
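
If you want a rough feel for this tax in your own project, one option is to time it from settings.gradle. The sketch below is an approximation under stated assumptions, not a precise profiler: it measures from the moment the settings script starts running until all modules have been evaluated, and the variable names are made up for illustration. Gradle’s built-in --profile report gives more detailed numbers.

```groovy
// settings.gradle (sketch): approximate the time Gradle spends before any task runs.
// configurationStart is a hypothetical name; the clock starts when this script is
// evaluated, so the result is a rough estimate rather than an exact phase duration.
def configurationStart = System.currentTimeMillis()

// projectsEvaluated fires once every module's build script has been evaluated.
gradle.projectsEvaluated {
    def elapsedMs = System.currentTimeMillis() - configurationStart
    println "Configuration took roughly ${elapsedMs} ms"
}
```

Running a no-change build with this in place, before and after adding or removing a module, is one way to see how much each module contributes.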

You can find some additional build stats here and here. Although they’re not directly comparable (because they’re measured on different machines), you can still see from there that over-modularization can be a problem.

All in all, as you can see, you don’t really need modularization for a wide range of small- and medium-sized projects. Furthermore, by adding unneeded modules, you can easily make your build times much longer than they need to be.

Better Architecture

Another popular claim about modularization in Android is that it automatically makes your architecture better. The argument goes like this: since Gradle doesn’t allow circular dependencies between modules, you’ll have less coupling and a better structure of inter-dependencies.

Well, Gradle indeed doesn’t allow circular references between modules, which is indeed a good thing in my opinion. To say that it automatically makes the architecture better, however, is a baseless statement, as far as I’m concerned.

For example, in the previous section I discussed two overly-modularized open-source apps (IO Scheduler and Moonshot). You saw how modularization hurt their build times, but, maybe, these apps gained something architecture-wise? Well, in my opinion, modularization didn’t make their architecture any better. In fact, modularization made both these codebases harder to explore and reason about. In these cases, modularization complicated the apps for no reason.

Can you use modularization to improve architecture? Yes, definitely. Is this process automatic or even reasonably simple? Not at all. In fact, it’s an extremely difficult task that requires a lot of experience, thought and discipline.

I won’t go into more details here, but I’ll give just one simple piece of advice: if the structure of modules in your application follows the structure of the application’s screens, you’re on the wrong path. Screens are very poor candidates to represent an app’s top-level architectural boundaries.

Big Applications

As you might’ve noticed, I only discussed small- and medium-sized applications so far. What happens when your app gets larger than, say, 100 KLOC in size? Well, at some point, modularization becomes essential to keep build times sane. However, even then you need to be careful with how you modularize your application and keep an eye on performance metrics.

To be honest, I don’t have experience with huge apps having 500+ KLOC, so I’m not in a position to recommend anything in these special cases. However, my gut feeling says that many of these projects suffer from over-modularization as well. I just can’t see how it makes sense to have a hundred, or even many hundreds, of small modules. You can surely find meaningful architectural boundaries within projects that can be emphasized by extracting reasonably-sized modules, all the while having reasonable build times. In my estimation, if the average size of a module in your project is below 10 KLOC, it’s a bad sign.

But, again, these are just speculations of mine, not based on any real-world experience.

Summary

My goal with this post was to warn you about the pitfalls of preliminary and over-modularization. Unfortunately, these two anti-patterns are very popular in the Android community today and their drawbacks are not discussed at all.

In my estimation, most Android projects can happily proceed as single-module applications for a very long time. Maybe even indefinitely. To demonstrate this point, I showed you quantitative metrics from real-world projects. Unfortunately, once your project grows beyond a certain size, you’ll need to modularize. Still, the best strategy in light of this fact is to wait until you experience real performance issues and you can’t find simpler ways to address them.

In my opinion, having tens of modules in your project is nothing to boast about. Having a few meaningful modules which emphasize important architectural boundaries within your codebase is what you should aim for.

Hopefully, you understand now that modularization is not a simple concept and it’s not a silver bullet. Instead, proper modularization is an art and a science at the same time, mixed with some amount of black magic. Therefore, approach it with caution. In the next article, I’ll list cases when modularization is justified and explain what benefits you get out of it.

As always, thank you for reading. If you have any comments or questions, you can leave them below.

6 comments on "The Pitfalls of Preliminary Over-Modularization in Android Projects"

  1. In a very “big” application, build time under modularization means something different. Throughout your article you focus on compiling/building the entire repository. However, in a real-world scenario, a modularized app sometimes doesn’t need to compile all modules…

    For example, take https://www.ctrip.com/, a Chinese OTA app. This single app contains more than 20 features, like flight, train, hotel and travel ticket booking, etc. Each one of them is an independent business unit, and they all contribute code to the same repository, but in different modules. So, assume I’m a developer who works in the flight ticket department: during development I can just switch the other modules off without compiling them at all. That removes the other features, but it’s fine until I need to start activities from another module.

    This is a very big extra benefit we get from modularizing a very, very big app. Just FYI.

    • Hello Qing,
      That’s definitely a good use of modularization, which I also covered in the following article about valid reasons to modularize Android projects.

  2. Hi Vasiliy,

    It was really interesting to go through this post. However, I would like to know how we can reduce the build time for a modularized project which can’t be simplified in one go and which is integrated with Firebase Crashlytics for logging. I have disabled Crashlytics for debug builds, but it did not make a difference. So, I would like to know whether the build time increase is due to over-modularization of the project or not.

    Thank you,
    Kiramai
