My Shortcut of Choice: Reading the Source Code

Ever been stuck troubleshooting and browsing the internet for a solution? Well, there is a shortcut out there, and it does not require web searches or prompts.

Emmanouil Gkatziouras


Jul. 24, 24 · Opinion


There is a great post on c2.com. c2.com is one of those golden blogs of the past, just like Coding Horror and Joel on Software. You might have stumbled upon them before, especially if you have been around for a long time.

In the past, it was the norm to encourage individuals to read the source code and figure out how things work. From time to time I see a trend against it, including rants about open-source software and its documentation, which feels weird since having the source code available is essentially the ultimate form of documentation.

Apart from being something that is encouraged as a good practice, I believe it’s the natural way for your troubleshooting to evolve.

Over the last few years, I’ve caught myself relying mostly on reading the source code instead of StackOverflow, a generative AI solution, or a Google search. Going straight to the repository of interest has been waaaaaay faster.

There are various reasons for that.

Your Problems Get More Niche

One of the reasons we get the search results we get is popularity. More individuals are searching for Spring Data JPA repositories than for NamedQueries on Hibernate. As the software product you develop advances, the issues you need to tackle become more specific. If you want to understand how the Pub/Sub thread pool is used, chances are you will get tons of search results on getting started with Pub/Sub but none answering your question. And that’s ok: the more things advance, the more niche a situation gets.

The same thing applies to Gen AI-based solutions. These solutions have been of great help, especially the ones that crunched vast amounts of open-source repositories, but the results are still skewed toward what is most common in the data they have ingested.

We could spend hours battling with search and prompts, but going for the source would be way faster.

Buried Under Search Engine Optimization

The moment you go for the second page on a search engine you know it’s over. The information you are looking for is nowhere to be found. On top of that, you get bombarded with sites popping up with information irrelevant to your request. This drains your attention, and it’s also frustrating, since a hefty amount of time is spent sorting through the results in the hope of maybe getting your answer.

You Want the Truth

LLMs are great. We are privileged to have this technology in this era. What an LLM returns depends on the data it was trained on. Since ChatGPT has crunched GitHub, the results can be way closer to what I am looking for. This can get me far in certain cases, but not in cases where accuracy is needed. LLMs make stuff up, and that’s ok: we are responsible adults, and it’s our duty to validate a prompt’s response as well as extract the value that is there. If you are interested in how many streams the BigQuery connector for Apache Beam opens on the stream API, there’s no alternative to reading the source code. The source code is the source of truth.

The same applies to that exotic tool you recently discovered, which synchronizes data between two cloud buckets. When you want to know how many operations occur so you can keep the bills low, there is no alternative to checking the source code.
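Going to the source does not always mean opening a browser. As a small illustration, here is a minimal sketch in Python (my choice for brevity, not a language this post is otherwise about) that prints where a library function lives and what it actually does. The helper name `show_source` is hypothetical; `inspect.getsourcefile` and `inspect.getsourcelines` are standard-library calls, and `queue.Queue.put` is just a convenient target.

```python
import inspect
import queue


def show_source(obj):
    """Return the file path, first line number, and source text of obj."""
    path = inspect.getsourcefile(obj)
    lines, start = inspect.getsourcelines(obj)
    return path, start, "".join(lines)


if __name__ == "__main__":
    # Jump straight to the implementation instead of searching for a description of it.
    path, start, source = show_source(queue.Queue.put)
    print(f"{path}:{start}")
    print(source)
```

The same habit carries over to any ecosystem: an IDE's "go to definition" on a dependency, or a quick search in the project's repository, plays the same role.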

The Quality of the Code Is Great

It’s mind-blowing how easy it is to navigate the source code of open-source projects nowadays. Good code quality and sound practices are widespread. Most projects have a structure that is pretty much predictable. Also, the presence of extensive testing helps a lot, since the test cases act as a specification of how a component should behave. If you think about it, on the one hand I have the choice of issuing multiple search requests or various prompts and then refining them until I get the result I want. On the other hand, all I have to do is search a project with a predictable structure.

There’s Too Much Software Out There

Overall, there is way too much software out there, and documenting all of it fully would be a Herculean effort. Also, no matter how many software blogs are out there, they won’t focus on that specific need of yours. The more specialized a piece of software is, the less likely it is to be widely documented. I don’t see this as a negative. Actually, it’s a positive that we have software components available to tackle niche use cases. Having that software is already a win; having to read its source is part of using it.

Devil Is in the Details

It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so.

It is common to assume that a software component operates in a specific way and to act on this assumption. The same assumption can also be found in other sources. But what if that module that you thought was thread-safe is not? What if you have to commit a transaction while you assume that the transaction is auto-committed once you exit a block? Usually, if something is not in the documentation in bold letters, we fall back on certain assumptions. Checking the source is the one thing that can protect you from false assumptions. It’s all about understanding how things work and respecting their peculiarities.
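To make the auto-commit question concrete, here is a minimal sketch using Python's standard `sqlite3` module with its default transaction handling. It is only an illustration of the kind of assumption worth checking, not an example from any of the projects mentioned above; the table name `notes` and the helpers `count_rows` and `demo` are hypothetical.

```python
import os
import sqlite3
import tempfile


def count_rows(path):
    """Open a fresh connection and count the rows that were actually persisted."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
    finally:
        conn.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")

        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE notes (body TEXT)")
        conn.commit()

        # Assumption under test: "closing the connection saves my work".
        conn.execute("INSERT INTO notes VALUES ('never committed')")
        conn.close()  # no commit() was called, so the pending insert is rolled back
        after_close = count_rows(path)

        # Using the connection as a context manager commits on a clean exit,
        # but it does not close the connection; that is still our job.
        conn = sqlite3.connect(path)
        with conn:
            conn.execute("INSERT INTO notes VALUES ('committed on exit')")
        conn.close()
        after_with = count_rows(path)

        return after_close, after_with


if __name__ == "__main__":
    print(demo())
```

Both behaviors are easy to get wrong from memory, and both are settled quickly by reading the module's documentation and source rather than relying on what "usually" happens.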

Overall, the more I embraced checking the source code, the less frustrating things became. Somehow it is my shortcut of choice. Tools and search can fail you, but the source code can’t let you down. It’s the source of truth, after all.


Published at DZone with permission of Emmanouil Gkatziouras, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
