5 common software product challenges that require innovative, focused action

16 November 2020

Over the past 20+ years, I have been deeply involved in designing, building and delivering systems in several sectors, across a multitude of technologies and methodologies, with varying team sizes and skill levels and differing levels of client sophistication. Predominantly these have been Java-based solutions within financial services. For the past three years I have worked directly on the product side for a vendor who ships to clients. It’s been a great experience with lots of learning, laced with a healthy dose of fun and the chance to work with an awesome team of winners.

Time is a great teacher and, over years of exposure, you start to develop “spidey senses” that something, or things, or everything, is not quite right. You have a gut feeling that you cannot quell or quiet, and it normally turns out that your teammates are experiencing the same phenomenon. Once this happens, it signals the start of your next journey, assuming you are all passionate enough about addressing the challenges! Here I’ll talk about some of the key signs that we have experienced to some degree and addressed as we continue to innovate our collaborative payments platform, IPF.

1. Diagnosing problems requires your most experienced and highly skilled team members

You are following DevOps practices and have a great, helpful and skilled team of engineers, business analysts, architects and product owners, and yet it still seems that issue analysis and diagnosis consume far too much mental energy and suck in your most experienced team members. You have even built custom support tooling and used well-known observability products and/or services. However, there are still times when the team struggles to narrate exactly what has happened. Reactive, async, non-blocking programming models can make this much harder. Generally, this points to a complex, fragmented codebase with logic scattered in many places and little adherence to sound design principles such as SOLID.

Make supportability a quality attribute

To remedy this situation, we made ‘supportability’ a quality attribute and a part of every ticket sign-off. Adopting Domain Driven Design (DDD), building modular components with an emphasis on contracts, using event sourcing for state and history, and collecting lots of metrics from our streaming pipelines has given us a super-rich set of information. This really helps us to quickly understand what has happened in all situations. OK, we admit in desperate times we have “grep’d the logs”… we’re all human.
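To make the "event sourcing for state and history" idea concrete, here is a minimal sketch in plain Java. Every name in it (PaymentEvent, PaymentJournal, PaymentStatus and the three event records) is hypothetical and chosen for illustration; it is not IPF's model or API. It assumes Java 16 or later for records. The point is only that the current state is derived by replaying the stored events, so the full history is always there when you need to narrate what happened.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical payment events, for illustration only (not IPF's actual model).
interface PaymentEvent {}
record PaymentReceived(String paymentId, long amountMinorUnits) implements PaymentEvent {}
record PaymentValidated(String paymentId) implements PaymentEvent {}
record PaymentSettled(String paymentId) implements PaymentEvent {}

enum PaymentStatus { NEW, RECEIVED, VALIDATED, SETTLED }

// An append-only journal: events are the source of truth, state is derived from them.
final class PaymentJournal {
    private final List<PaymentEvent> events = new ArrayList<>();

    void append(PaymentEvent event) {
        events.add(event);
    }

    // The full, ordered history of what happened to the payment.
    List<PaymentEvent> history() {
        return List.copyOf(events);
    }

    // Current state is rebuilt by replaying every event in order.
    PaymentStatus currentStatus() {
        PaymentStatus status = PaymentStatus.NEW;
        for (PaymentEvent event : events) {
            if (event instanceof PaymentReceived) {
                status = PaymentStatus.RECEIVED;
            } else if (event instanceof PaymentValidated) {
                status = PaymentStatus.VALIDATED;
            } else if (event instanceof PaymentSettled) {
                status = PaymentStatus.SETTLED;
            }
        }
        return status;
    }
}
```

Appending a PaymentReceived and then a PaymentValidated event leaves currentStatus() at VALIDATED, while history() still holds both events for anyone diagnosing an issue later.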

2. The architecture isn’t easy to communicate

You may have been in those meetings, or even just casual conversations, where you try to articulate how your product is architected and what patterns are built in. Because you know the codebase and how it really works intimately, you feel uncomfortable and can’t quite express it in language that resonates with the listener while staying 100% factual. Architecture has many strands, and structure, boundaries, intention, abstractions, patterns and naming all spring to mind, especially when you are communicating the details to other people. If it is difficult to explain and you get lots of questions, then either you need to revisit what you are saying or there are underlying reasons which could point to issues with your product architecture or even the target solution architecture. You might be fortunate enough to have a comprehensive and well-governed set of design decisions that show your thinking at the time. However, that still doesn’t help if you are having to rely on a historic decision log to defend your explanation.

As we looked to evolve the core architecture of IPF, we took time to reflect on our challenges, revisit our customers’ needs, assess our strategic direction, build a few POCs, collaborate (and disagree) in workshops, assess scores of design decisions, adopt new techniques (e.g. event storming) and adhere to guiding, fundamental principles and approaches (Domain Driven Design (DDD) and hexagonal architecture). As a result, we arrived at the product architecture we have today: “reactive, event sourced, microservices that orchestrate the payments domain modelled in our Payments Domain Specific Language (DSL)”.

Adopt new design approaches

DDD really forces you to articulate your core domain and the interactions between domains in a model that best represents it, expressed through the language of experts to be ultimately enshrined in your codebase. Hexagonal architecture, in conjunction with DDD, focused our minds on layering, modularity, interfaces and dependencies to ensure the inside, or domain (core), is unaware of the specifics of the outside and wild things like networks. This series is a fantastic exploration of these approaches and much more.
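A small sketch may help show what "the inside is unaware of the outside" looks like in code. The names below (PaymentNotifier, SettlementService, InMemoryNotifier) are hypothetical and invented for this illustration; they are not taken from IPF. The domain core owns an interface (a port) and depends only on that, while an adapter on the outside supplies the implementation, whether that is a network call in production or an in-memory list in a test.

```java
import java.util.ArrayList;
import java.util.List;

// Port: an interface owned by the domain, expressed in domain language.
interface PaymentNotifier {
    void notifySettled(String paymentId);
}

// Domain core: depends only on the port and knows nothing about networks or frameworks.
final class SettlementService {
    private final PaymentNotifier notifier;

    SettlementService(PaymentNotifier notifier) {
        this.notifier = notifier;
    }

    void settle(String paymentId) {
        // Domain rules would live here; this sketch just announces the outcome via the port.
        notifier.notifySettled(paymentId);
    }
}

// Adapter: lives outside the core. This one records calls in memory,
// whereas a production adapter might publish to a message broker instead.
final class InMemoryNotifier implements PaymentNotifier {
    final List<String> sent = new ArrayList<>();

    @Override
    public void notifySettled(String paymentId) {
        sent.add(paymentId);
    }
}
```

Because SettlementService only sees the port, swapping InMemoryNotifier for a real adapter requires no change to the domain code, which is the dependency direction the hexagonal approach is after.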

3. To understand one thing, you have to understand everything

This sign can emerge when you are making a change that on the surface appears simple or when you helpfully volunteer to investigate an issue your teammate may be having. You start clicking through the code, opening more and more class files, config files, decompiling dependencies, scribbling notes, looking at git histories and then back to the source JIRA tickets until you realise that you are hopelessly lost and out of capacity to hold any more active relationships in your head. You discover corners of the system from the early days and a tangled web of interdependencies. On the one hand, you have uncovered a lot, including crumbs that could be useful in the future, but on the other hand, six hours have passed and you are now more confused about a way forward given all of these extra considerations. Don’t get me wrong, I think anyone working on a product should have a broad and, over time, deep knowledge beyond just the area or service they are working on, but the sheer cognitive load and the extra time spent getting new team members up to speed make the work inefficient and unenjoyable.

Apply DDD

DDD has acted as an unwavering guide that, combined with great engineering, discipline, and consensually agreed choices from my earlier point, has led to a more modular, clearer codebase that is a joy to work with. Going through a major innovation cycle is a great way for the whole team to feel a renewed sense of ownership and purpose. It is even easier to ensure everything is of the highest quality with deviations known, documented, deliberate and short lived.

4. Documentation is a secondary concern and doesn’t match the codebase

I’m sure there is no shortage of documentation where you are, but it is likely to be out of date, incomplete and not well filed. There is always point-in-time documentation that can stand on its own for things like design decisions or requirements and maybe conceptual topics that change less frequently. However, it is rarer to come across accurate, living documentation for a production system or product on its journey to maturity. When you are up against it, trying to meet that impossible, immovable milestone, the focus, quite pragmatically, is on delivering, leaving little time for shiny up-to-date documents that truly reflect the codebase. Corners will likely have been cut and outstanding actions left unrecorded for subsequent revisits.

Make documentation a first class citizen

In addressing this one, we promoted documentation to a first-class citizen, giving it equal importance to code and taking inspiration from the great documents we admiringly refer to over at Lightbend, the wonderful creators of Akka (amongst other great things), whose technology underpins our product’s core. Each repository has living ‘ascii docs’, which enable team members to quickly get up to speed and enhance, or just use, particular components safely and self-sufficiently without much assistance. We also generate valuable pictures from PlantUML, Graphviz and dependency plugins to help surface the information quickly. A final decision was to use the simple yet splendid C4 Model created by Simon Brown to describe the overall architecture of our product plus the solution spaces in which IPF is integrated.

5. Performance of the system doesn’t feel right

Last but not least, and I really did have to trim a few more challenges off to get this past marketing, is our old friend performance. It is a sometimes overloaded term that generally manifests itself as builds taking too long, nodes consuming more resources than you think they should (“but I’m mostly just mapping, so why do I need 1.27GB of heap”), and an inability to attain throughput targets even in optimal conditions. There are many contributing factors and reasons as to why your overall system or product doesn’t perform as you feel it should. However, looking at certain graphs from your probes and metrics prompts you to ask “what is it doing, why does it need so much grunt?”. With enough profiling and experimentation plus obsessive determination, you will home in on the source(s). Maybe things can be improved with config tuning, a different framework, or just a few code tweaks to remove that blocking method call or strong memory reference. Most likely, though, you will be forced to take a step back and challenge the overall processing solution that forms the distributed system that your product lives in.
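As an illustration of the kind of small code tweak that removes a blocking method call, here is a sketch using only the JDK's CompletableFuture. The class and method names (AccountNameLookup, fetchAccountName, describeBlocking, describeAsync) are hypothetical, and the "remote call" is a stand-in that simply returns a string. It is not how IPF does it, just one common shape of the change.

```java
import java.util.concurrent.CompletableFuture;

final class AccountNameLookup {
    // Hypothetical stand-in for a slow remote call; name and result are illustrative only.
    static CompletableFuture<String> fetchAccountName(String accountId) {
        return CompletableFuture.supplyAsync(() -> "Account " + accountId);
    }

    // Blocking style: join() parks the calling thread until the result arrives.
    static String describeBlocking(String accountId) {
        return fetchAccountName(accountId).join().toUpperCase();
    }

    // Non-blocking style: the transformation is attached as a callback,
    // and the caller receives a future without waiting for the result.
    static CompletableFuture<String> describeAsync(String accountId) {
        return fetchAccountName(accountId).thenApply(String::toUpperCase);
    }
}
```

Both methods produce the same value; the difference is that describeAsync never holds a thread while waiting, which matters when many requests are in flight on a small thread pool.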

Performance, resilience and scalability

With the nature of our product and the demanding, very visible banking payments service that it powers, cross-functionals (especially performance, resilience and scalability) are at the forefront of our minds. We make sure that automated tests are in place, maintained, executed and listened to regularly so as to better understand the behaviour of our product from the exterior and interior (JVM / Akka / Spring) viewpoints. As a result, we are very pleased that the performance improvements we expected to see as part of our recent innovation arrived and, in many cases, exceeded what we thought possible. As per all of the points above, the team’s dedication, pride, ownership and downright awesomeness have played the pivotal role in these accomplishments.

Thanks for reading…

…it has been a therapeutic experience to write and I hope you have learned about some of the approaches that have worked for us when faced with challenges common to building and operating innovative software. This is the first in a series of blogs written on our journey to version two of The Icon Payments Framework (IPF); keep an eye out for the second in the series, coming soon.

Please get in touch if you wish to continue the conversation or indeed would like to learn more about our future-proofed payments technology platform and how it can help your payments transformation journey.


Simon Callan