Choosing the correct file structure in your projects
When talking about software structure we often think in terms of design patterns and software architecture in general. The actual file structure within your code base is less talked about, but is equally important. Good choices go a long way towards software with good long-term readability and maintainability. In this article we would like to compare the two conceptual approaches of vertical and horizontal slicing and help you decide on a fitting approach for your use case.
Software development comes with a wide range of challenges. We probably all know a version of the joke about the two hard problems in computer science (naming, cache invalidation and off-by-one errors). There is indeed a certain difficulty in giving things good names that capture the essential function, distinguish the thing from similar things (choosing "thing", for example, seems to be a bad idea) and are still concise enough not to pollute your code and file structure (I'm looking at you, Java, with your ridiculously long albeit maximally specific class names: InternalFrameInternalFrameTitlePaneInternalFrameTitlePaneMaximizeButtonWindowNotFocusedState). When it comes to file structure, choosing good names is just one of the problems. Oftentimes, when starting a new project, developers start arguing about the best way to divide the project's code files into meaningful chunks. It usually boils down to the question of slicing either vertically or horizontally. In this article I would like to talk about those two approaches, weigh their advantages and disadvantages and find out whether it is really just a question of personal taste or whether there are meaningful consequences to the choice you make in your project.
Slicing the cake
I have to admit, I quite like the analogy of a cake when talking about software structure.
To build a layered cake, you add layer by layer, starting from the bottom and working your way up to the top. In the end you have assembled multiple horizontal slices, with each slice being prepared beforehand (baking the cake layers, mixing the fillings and the frosting) in its own procedure. To satisfy the preferences of every guest, you might decide to add a cherry on top of one piece, a waffle or a piece of chocolate on another and maybe a gummy bear on yet another. Now that we have prepared our cake we can finally get to the software side of things:
When building software we often think in terms of software layers. An example of a simple (backend) architecture could be:
- Data/Persistence Layer: Manages access to data through database communication, file system access or access to internal or external APIs
- Business Layer: Services that run business logic to allow for processing data
- API Layer: Web-APIs for a client to access and manipulate data
Like the layers in the cake, the layers within the software can be developed independently, possibly by different parts of the team. Technologies and technological aspects separate the horizontal layers, and to get working software (or a cake), all layers need to be put together and connected.
As in the cake where different processes are required to prepare the different layers, the horizontal layers in software development are separated by different technological concepts.
If we create a file structure with those technological/horizontal layers in mind, it might look like this:
model/
|-- Classroom.cs
|-- Student.cs
repository/
|-- ClassroomRepository.cs
|-- StudentRepository.cs
service/
|-- ClassroomService.cs
|-- StudentService.cs
api/
|-- ClassroomApi.cs
|-- StudentApi.cs
Technologically driven file structure (horizontal slices)
We could see the cherry or the gummy bear on top of a piece of the cake as the distinguishing factor between the pieces. Our guests represent the business side of things that either consume the cherry piece or the piece with the gummy bear. In our software the different entities within the business domain (Classroom, Student) represent the single distinguishable pieces of the cake.
Although every single layer matters from the point of view of our guests, the more important thing is to receive a full piece with all layers. To serve a piece of the cake, we need to create vertical slices. In our software we need components from all the layers for a specific entity to work properly. So a vertical slice gives us the full stack required to process a single business entity or workflow.
A domain-driven file structure would represent the business-side distinction of the entities (the pieces of the cake) and would then look like this:
Classroom/
|-- Classroom.cs
|-- ClassroomRepository.cs
|-- ClassroomService.cs
|-- ClassroomApi.cs
Student/
|-- Student.cs
|-- StudentRepository.cs
|-- StudentService.cs
|-- StudentApi.cs
Domain-driven file structure (vertical slices)
Keeping things together
With either approach we try to group files together. With vertical slices our grouping is specific to the business domain, whereas with horizontal slices we group by technologies and technological aspects within our software. It is not the first time we reference Kent C. Dodds, who makes a good point about the advantages of colocation: keeping things together that change together drastically improves readability and maintainability. The obvious question now arises: in which ways does our software change?
Adding to the cake
Having a working architecture and a first version of the software running is usually not the end of the story. And this is the part where you might regret the decisions you made earlier about the file structure. We can imagine two ways of adding to the cake.
Case 1: Extending the domain to encompass a new business aspect
Thinking in pieces of cake this could be represented as extending the cake sideways (I will leave it to your imagination how exactly you might do this) to allow for another piece or vertical slice. The important thing is, this new slice will be mainly composed of the same (horizontal) layers except for the custom topping.
Using a domain-driven file structure (vertical slices) we could simply select a similar entity as a template for all the files required by the new entity. Updating an existing domain entity would be just as easy because we could find all the files close by. This is a good example of cohesive files (cohesion: the degree to which things change together) being colocated.
Using a technology-driven file structure we would need to make sure that every representation of the model (*Repository.cs, *Service.cs, *Api.cs) is updated or added accordingly. This would not only be inconvenient from the developer's perspective but would also carry the risk of missing one of the many folders in which files need to be adjusted.
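The risk of forgetting one of the layer folders can be made tangible with a small script. The following is only a sketch, written in Python for brevity even though the example files are C#: it assumes the folder names and file-name suffixes from the technology-driven tree above, and the `Teacher` entity as well as the names `LAYER_SUFFIXES` and `missing_layers` are made up for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Assumption: each layer folder uses one file-name suffix, as in the
# example tree of this article (model files carry no suffix).
LAYER_SUFFIXES = {
    "model": "",
    "repository": "Repository",
    "service": "Service",
    "api": "Api",
}


def missing_layers(paths):
    """Return {entity: [layers that have no file for this entity]}."""
    found = defaultdict(set)
    for raw in paths:
        path = PurePosixPath(raw)
        layer = path.parent.name
        suffix = LAYER_SUFFIXES.get(layer)
        if suffix is None:
            continue  # file is not inside one of the known layer folders
        stem = path.stem
        if suffix and not stem.endswith(suffix):
            continue  # file does not follow the naming convention
        entity = stem[: len(stem) - len(suffix)] if suffix else stem
        found[entity].add(layer)
    return {
        entity: [layer for layer in LAYER_SUFFIXES if layer not in layers]
        for entity, layers in found.items()
        if len(layers) < len(LAYER_SUFFIXES)
    }


if __name__ == "__main__":
    # Hypothetical new entity "Teacher" whose service file was forgotten.
    project = [
        "model/Classroom.cs", "model/Student.cs", "model/Teacher.cs",
        "repository/ClassroomRepository.cs", "repository/StudentRepository.cs",
        "repository/TeacherRepository.cs",
        "service/ClassroomService.cs", "service/StudentService.cs",
        "api/ClassroomApi.cs", "api/StudentApi.cs", "api/TeacherApi.cs",
    ]
    print(missing_layers(project))  # {'Teacher': ['service']}
```

In a domain-driven structure the same gap would be visible at a glance, because the Teacher folder would simply contain one file fewer than its neighbours.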
Case 2: Adding a new feature that requires the layer infrastructure of the project to be adjusted or new layers to be added
With a domain-driven file structure this would require checking the corresponding file (e.g. the *Repository.cs file) in every entity folder. With a technology-driven file structure (horizontal slices), on the other hand, the files that change together would again be colocated. As we can see, the cohesion of the files changes depending on the context.
So it seems one factor for choosing the approach for structuring our code base is the kind of changes we expect to encounter during the lifetime of our application.
At this point I would like to add that in any case there should still be a place for shared logic. For example all *Repository.cs files could either inherit from a shared base class BaseRepository.cs or use a shared module to communicate with the database. For a technologically driven file structure the infrastructure files could be found in the same folders as the implementations. The adjusted file structures might look like this:
model/
|-- Classroom.cs
|-- Student.cs
repository/
|-- ClassroomRepository.cs
|-- StudentRepository.cs
|-- BaseRepository.cs
service/
|-- ClassroomService.cs
|-- StudentService.cs
api/
|-- ClassroomApi.cs
|-- StudentApi.cs
|-- BaseApi.cs
Technologically driven file structure (horizontal slices) with shared infrastructure
Using a domain-driven file structure we could place infrastructure code in a separate folder.
domain/
|-- Classroom/
|   |-- Classroom.cs
|   |-- ClassroomRepository.cs
|   |-- ClassroomService.cs
|   |-- ClassroomApi.cs
|-- Student/
|   |-- Student.cs
|   |-- StudentRepository.cs
|   |-- StudentService.cs
|   |-- StudentApi.cs
infrastructure/
|-- BaseRepository.cs
|-- BaseApi.cs
Domain-driven file structure (vertical slices) with shared infrastructure
While using shared modules does increase coupling, it allows us to achieve the best of both worlds when combined with a domain-driven file structure: domain-entity-specific files stay colocated, and when a new feature affects a whole horizontal layer we can still adjust that layer indirectly by editing the shared code.
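To illustrate the shared base class idea, here is a minimal sketch. It is written in Python rather than the C# that the file names suggest, an in-memory dictionary stands in for the database, and all method names (`add`, `get`, `all`, `find_by_name`) and entity fields are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Generic, List, Optional, TypeVar

T = TypeVar("T")


class BaseRepository(Generic[T]):
    """Shared persistence logic (would live in infrastructure/).

    Assumption: an in-memory dict replaces real database access.
    """

    def __init__(self) -> None:
        self._rows: Dict[int, T] = {}
        self._next_id = 1

    def add(self, item: T) -> int:
        """Store an item and return its generated id."""
        item_id = self._next_id
        self._rows[item_id] = item
        self._next_id += 1
        return item_id

    def get(self, item_id: int) -> Optional[T]:
        """Return the item with this id, or None if it does not exist."""
        return self._rows.get(item_id)

    def all(self) -> List[T]:
        """Return all stored items in insertion order."""
        return list(self._rows.values())


@dataclass
class Student:
    name: str


@dataclass
class Classroom:
    room: str


class StudentRepository(BaseRepository[Student]):
    """Entity-specific code (would live in the Student/ folder)."""

    def find_by_name(self, name: str) -> List[Student]:
        return [student for student in self.all() if student.name == name]


class ClassroomRepository(BaseRepository[Classroom]):
    """Needs nothing beyond the shared behaviour."""
```

A change to `BaseRepository`, for example adding paging to `all`, would reach every entity's repository at once, while `find_by_name` stays next to the other Student files.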
Agile vs. Waterfall or Vertical vs. Horizontal?
I completely agree with Jerry Virgo, who wrote a really insightful article about this topic, that it is not surprising that in bigger teams with split responsibilities the code base will reflect this with predominantly horizontally layered file structures. Smaller teams tend to require full-stack development, which is why in start-up businesses you will encounter vertical slicing more often than not.
The choice of file structure can also reflect the prevailing development methodology. A modern agile approach models new features or workflows as user stories, where each user story could be represented as a vertical slice. With a mainly vertical file structure, adding a new vertical slice is easy and convenient. In projects following a waterfall approach, whole technologically coherent parts (horizontal slices) of the system may be built up one by one in stages, with the end of each stage marked by finishing one complete horizontal layer. There may even be separate teams working on the different horizontal layers.
So it's a piece of cake?
The short and unsurprising answer is: No. You will almost never find projects where you have completely independent vertical slices and, having selected a vertical slicing approach in your file structure, you might struggle to find proper ways to integrate cross-entity functionality cleanly into your code base. User stories may and will create slices that either don't go all the way through (imagine the 4 CRUD stories for an entity, only having separated services and APIs but being run on a shared model and repository) or that span multiple of your existing slices (user story of assigning one entity to another: assigning a Student to a Classroom).
While in this article I mainly focussed on the extremes of either slicing completely vertically or completely horizontally, there may well be mixed approaches. You might for example separate your project into multiple modules that each take their own approach.
Maybe one day we will see a different approach to managing our code files. The first thing that comes to my mind is tagging files with both technological (Repository, Service, API) and business-domain (Classroom, Student) tags, a trend that has been visible for a while in the management of image files. Depending on the use case we could then group either by technological tags or by business-domain tags, but this would certainly come with its own challenges.
The convention of naming files by a combination of entity and layer type (ClassroomRepository.cs) is already a kind of tagging by domain and technology. And with modern IDEs offering powerful project-wide quick searches for files, we can already partially overcome the boundaries that a fixed file structure imposes on us. For now we still have to select one approach for our projects. The mode of development, the team size and the requirements for long-term flexibility should be taken into account when doing this, as there can be long-term benefits to choosing the right one for your application.
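How far the naming convention already works as tagging can be sketched in a few lines. The following Python snippet is only an illustration: it assumes that every file is named `<Entity><Layer>.cs` with the layer suffixes used in this article, treats files without a suffix as the model, and the function names `tags` and `group_by` are made up.

```python
import re
from collections import defaultdict

# Assumption: "<Entity><Layer>.cs", where a missing layer suffix means "Model".
FILE_NAME = re.compile(
    r"^(?P<entity>[A-Z][A-Za-z0-9]*?)(?P<layer>Repository|Service|Api)?\.cs$"
)


def tags(file_name):
    """Split a file name into its (entity, layer) tags."""
    match = FILE_NAME.match(file_name)
    if match is None:
        raise ValueError(f"{file_name!r} does not follow the naming convention")
    return match.group("entity"), match.group("layer") or "Model"


def group_by(file_names, key):
    """Group file names by their 'entity' tag or by their 'layer' tag."""
    if key not in ("entity", "layer"):
        raise ValueError("key must be 'entity' or 'layer'")
    position = 0 if key == "entity" else 1
    groups = defaultdict(list)
    for name in file_names:
        groups[tags(name)[position]].append(name)
    return dict(groups)


if __name__ == "__main__":
    files = [
        "Classroom.cs", "ClassroomRepository.cs",
        "Student.cs", "StudentService.cs",
    ]
    print(group_by(files, "entity"))  # vertical view
    print(group_by(files, "layer"))   # horizontal view
```

The same flat list of files yields a vertical view when grouped by entity and a horizontal view when grouped by layer, which is roughly what a project-wide file search for "Student" or "Repository" gives you today.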
Enjoy your cake! 🍰