How do you holistically document microservices in a multi-repo setup?
from onlinepersona@programming.dev to programming@programming.dev on 30 Apr 16:48
https://programming.dev/post/13467309

Let’s say I have a few microservices in different repositories that communicate over HTTP using JSON. Some services are triggered directly by other microservices, but others can be triggered by events: a timer going off, a file being dropped into a bucket, a firewall rule blocking a threshold number of packets, etc.

Is there a way to document the microservices together in one holistic view? More concretely: how do you visualise the data, its schema (fields, types, …), and its flow between the microservices?


Bonus (optional) question: is there a way to handle schema updates? For example, generating code from the documentation and triggering a CI build in the affected repos to make sure they still work after the update.

Anti Commercial-AI license

#programming


deegeese@sopuli.xyz on 30 Apr 16:54

Generate code from documentation

Let me stop you right there. If you want to generate API bindings, they should be generated from the code, along with the documentation, not the other way around.

onlinepersona@programming.dev on 30 Apr 16:57

Generate API bindings from code? Then what’s the code for? Do you have an example?

OpenAPI can generate bindings from its spec, but a spec only seems to describe a single microservice.

Anti Commercial-AI license

breadsmasher@lemmy.world on 30 Apr 17:28

I feel like this is just confusing specification with documentation.

Code is generated from a specification, and then documented. Swagger driven by the same specification could maybe, sort of, serve as documentation.
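
Roughly, the spec-first flow could look like this (just a sketch; the openapi-generator CLI, the target language, and the paths are assumptions on my part):

```python
# Sketch of a spec-first flow: generate typed client bindings from the same
# OpenAPI spec that drives the documentation. Assumes openapi-generator-cli
# is installed; paths and the target language are placeholders.
import subprocess

subprocess.run(
    [
        "openapi-generator-cli", "generate",
        "-i", "service-a/openapi.yaml",   # the single source of truth
        "-g", "python",                   # language of the generated bindings
        "-o", "clients/service-a",        # where the generated package lands
    ],
    check=True,  # fail the build if the spec is invalid
)
```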

onlinepersona@programming.dev on 30 Apr 17:39

For a while PeerTube solely used OpenAPI to document the project. That’s spec, documentation, and code generation in one. Dunno when they switched to a separate documentation tech + OpenAPI, but it’s there.

breadsmasher@lemmy.world on 30 Apr 17:44

Yeah, that’s more what I was thinking.

MagicShel@programming.dev on 30 Apr 16:57

OpenAPI will let you generate both controllers and Swagger documentation from a single YAML configuration. That’s probably not the whole answer, but it covers the hard part. Then you just need something to index all the Swagger docs.

This presumes Java. I don’t know about other ecosystems.

onlinepersona@programming.dev on 30 Apr 17:09

OpenAPI unfortunately doesn’t provide an overview of the different microservices. For example, I won’t see that $ServiceConsumer consumes messages from $ServiceProducer. It only gets more complicated as more microservices are added.

It could be part of the code generation solution, but I do wonder if OpenAPI is the only solution out there.

Anti Commercial-AI license

MagicShel@programming.dev on 30 Apr 17:15

I would create a Jenkins task that runs during deployment and does whatever magical thing updates your central index. That’s going to be implementation-dependent. I once worked on a custom workflow and documentation repository that did basically this, but I don’t have more info because I was only there a few weeks before getting moved to the contract I had actually been hired for. It would’ve been more complicated because they had API preview docs for things still under development.

Point is it was a custom solution and I’m not aware of an existing product.
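
Just to sketch the idea (every URL and field name here is invented), the deploy step might be no more than pushing the service’s spec to whatever central index you run:

```python
# Sketch of a deploy-time step that registers a service's OpenAPI spec with a
# central index. The index endpoint and payload shape are made up; adapt them
# to whatever catalogue you actually run.
import json
import urllib.request

SERVICE_NAME = "orders-service"                           # set per repo
INDEX_URL = "https://docs.example.internal/api/services"  # hypothetical endpoint

with open("openapi.json") as f:
    spec = json.load(f)

payload = json.dumps({"name": SERVICE_NAME, "spec": spec}).encode()
req = urllib.request.Request(
    INDEX_URL,
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
urllib.request.urlopen(req)  # let the deploy fail if the index rejects the spec
```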

onlinepersona@programming.dev on 30 Apr 17:42

I don’t mind using multiple compatible technologies: OpenAPI for the services, and something else that consumes the OpenAPI JSON or even JSON Schema to connect the different projects together and provide an overview.
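
For instance, a small script could stitch the per-service specs into one overview (just a sketch; the x-consumes extension is an invented convention, since plain OpenAPI has no cross-service field):

```python
# Sketch: build a cross-service overview from per-service OpenAPI files.
# Assumes each spec carries a custom "x-consumes" extension listing the
# services it calls; OpenAPI itself has no such field, so this is a
# convention you would have to add yourself.
import json
from pathlib import Path

edges = []
for spec_path in Path("specs").glob("*.json"):  # one spec per microservice
    spec = json.loads(spec_path.read_text())
    service = spec["info"]["title"]
    for upstream in spec.get("x-consumes", []):
        edges.append((service, upstream))

# Emit a Graphviz dot file so the overview can be rendered as a graph.
print("digraph services {")
for consumer, producer in edges:
    print(f'  "{consumer}" -> "{producer}";')
print("}")
```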

Anti Commercial-AI license

johnydoe666@lemmy.dbzer0.com on 30 Apr 17:23

We’re using Backstage in combination with OpenAPI. The schemas are documented in OpenAPI, but how services are connected is handled by Backstage, which crawls all repositories and puts everything together into nice graphs we can traverse easily.

onlinepersona@programming.dev on 30 Apr 17:29

That sounds great! I’ll look into Backstage. It’s backstage.io?

Anti Commercial-AI license

aes@programming.dev on 01 May 07:46

Backstage has become quite misaligned with what we were originally trying to do. Originally, we were trying to inventory and map the service ecosystem, to deal with a few concrete problems. For example, when developing new things, you had to go through the village elders and the grapevine to find out what everyone else was doing. Another serious problem was not knowing / forgetting that we had some tool that would’ve been very useful when the on-call pager went off at fuck you dark thirty.

A reason we could build that map in System-Z (the predecessor of Backstage) is that our (sort of) HTTP/2 had a feature to tell us who had called methods on a service. (you could get the same from munging access logs, if you have them)

Anyway, the key features were that you could see what services your service was calling, who was calling you, and how those other systems were doing, and that you could see all the tools (e.g. build, logs, monitoring) your service was connected to. (for the ops / on-call use case)

A lot of those tool integrations were just links to “blahchat/#team”, “themonitoring/theservice?alerts=all” or whatever, to hotlink directly into the right place.

It was built on an opt-in philosophy, where “blahchat/#team” was the default, but if (you’re John-John and) you insist that the channel for ALF has to be #melmac, you can have that, but you have to add it yourself.

More recently, I’ve seen Swagger/OpenAPI used to great effect. I still want the map of who’s calling whom, and I strongly recommend mechanizing how that’s made (extract it from logs or something; don’t rely on hand-drawn maps). I want to like C4, but I haven’t managed to get any use out of it. Just throw it in a Graphviz dot file.

Oh, one trick that’s useful there: local maps. For each service S, get the list of everything that connects to it. Make a subset graph of those services, but make sure to include the other connections between those, the ones that don’t involve S. (“oh, so that’s why…”)
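
In code, the trick is tiny; something like this sketch (with a hand-written call table standing in for whatever you extract from logs):

```python
# Sketch of the "local map" trick: for a service S, take everything that talks
# to S (in either direction), then keep every edge among those neighbours,
# including the ones that don't involve S itself.
calls = {                       # caller -> callees, e.g. extracted from access logs
    "web": {"orders", "users"},
    "orders": {"users", "billing"},
    "billing": {"users"},
    "cron": {"orders"},
}

def local_map(s: str) -> set[tuple[str, str]]:
    neighbours = {s}
    neighbours |= calls.get(s, set())                                  # who S calls
    neighbours |= {c for c, targets in calls.items() if s in targets}  # who calls S
    return {
        (caller, callee)
        for caller, targets in calls.items()
        for callee in targets
        if caller in neighbours and callee in neighbours
    }

for edge in sorted(local_map("orders")):
    print(edge)   # includes billing -> users: "oh, so that's why..."
```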

spacedogroy@feddit.uk on 30 Apr 19:27

Diagrams. Loads and loads of diagrams. One for each use-case.

Then I’d have one diagram drawing out the dependencies between all the services at the broadest level. Although, depending on how messy your architecture is, it can be very difficult to read, in my experience.

onlinepersona@programming.dev on 30 Apr 20:21

You do this all manually?

Anti Commercial-AI license

spacedogroy@feddit.uk on 30 Apr 21:07

More or less. Excalidraw for quick and dirty diagrams, or PlantUML + the C4 plugin for larger, longer-lived diagrams; I’ve used both with some success.

senkora@lemmy.zip on 04 May 05:08

I just gave PlantUML + the C4 Plugin a try and generally liked it, thank you for the rec!

It seems like a good tool although it inherits all the joys and pains of automatic graph layout.

I think I’ll keep it in my arsenal for detailed diagrams that can handle being a little aesthetically wonky.

I hadn’t heard of C4 before and it seems like a solid idea.

RonSijm@programming.dev on 01 May 10:26

I manually redraw my service architecture because I can create higher-quality documentation that way than by trying to auto-generate it.

But you can get a baseline depending on which cloud you use. For example, in AWS you can use Workload Discovery, which generates a system overview.

Bonus (optional) question: Is there a way to handle schema updates? For example generate code from the documentation that triggers a CI build in affected repos to ensure it still works with the updates.

Yes. For example, if your build server exposes the API with an OpenAPI schema, you can use the build server to generate a client library such as a NuGet or npm package.

Then, in the API consumer, you can add a build step that checks whether there are new versions of the client library, or set up Dependabot to create PRs that update those dependencies.
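
The version check can be a small CI script; a rough sketch (the package name is a placeholder, and a private registry would need its own URL and auth):

```python
# Rough CI check: fail the consumer's build if a newer version of the generated
# client library has been published. The package name is hypothetical and the
# public PyPI JSON API stands in for whatever registry you actually publish to.
import json
import sys
import urllib.request
from importlib.metadata import version

PACKAGE = "orders-service-client"   # hypothetical generated client package

installed = version(PACKAGE)
with urllib.request.urlopen(f"https://pypi.org/pypi/{PACKAGE}/json") as resp:
    latest = json.load(resp)["info"]["version"]

if installed != latest:
    print(f"{PACKAGE}: installed {installed}, latest is {latest}; update and re-run the tests")
    sys.exit(1)
```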

Falst@lemmy.world on 01 May 08:07

If you don’t mind the runtime overhead, OpenTelemetry would do the job (maybe with some manual instrumentation for things like timers) and can build a service map.
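
The manual instrumentation for a timer-triggered job is small; a rough sketch (service and span names are placeholders, and a real setup would export to your collector rather than the console):

```python
# Sketch: manually instrument a timer-triggered job so it shows up in the
# OpenTelemetry service map alongside the HTTP-triggered services.
# Service and span names are placeholders; swap the console exporter for an
# OTLP exporter pointed at your collector in a real setup.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

provider = TracerProvider(resource=Resource.create({"service.name": "nightly-cleanup"}))
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("nightly-cleanup")

def on_timer_fired() -> None:
    # One span per run; HTTP calls made inside this block get linked to it
    # if the downstream services are instrumented too.
    with tracer.start_as_current_span("cleanup-run"):
        ...  # the actual job
```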

IMO, however, if your services are closely tied together, how about grouping them into one or a few mono-repositories? Or at least start designing your bounded contexts so that documenting by hand doesn’t become a maintenance burden.