Recap of webinar: “Feature flags & authorization: Key tools for modern development”

In our recent webinar, we had the pleasure of hosting Ben Rometsch, the CEO and co-founder of Flagsmith, for a conversation about the role of feature flags and authorization in modern application development, as well as a demonstration of both solutions. As pioneers in these technologies, both Cerbos (authorization) and Flagsmith (feature flags) aim to simplify the complexities developers face today. This blog post will recap the key points from the webinar, illustrating how developers can leverage these technologies to build more dynamic, secure, and efficient applications.

Webinar recap

The essence of feature flags and authorization

The webinar kicked off with a deep dive into feature flags and authorization, with insights from both Alex Olivier, our Chief Product Officer at Cerbos, and Ben. They discussed where the two overlap and where they differ, and how, used effectively, they can transform the development lifecycle: enhancing flexibility, reducing deployment risk, and giving more granular control over features and user access within applications.

Aspect	Authorization	Feature Flags
Definition	Manages user permissions and access to resources within an application.	Allows dynamic enabling and disabling of features in an application.
Objective	To secure applications by ensuring only authorized users can access specific resources or perform certain actions.	To manage software releases and testing more flexibly by controlling feature availability without deploying new code.
How It Works	Defines policies that specify who can do what under which conditions.	Integrates flags in the code that can toggle features on or off for different user segments or environments.
Typical Use Cases	User role management; access control to features or data; compliance with data access policies	A/B testing; gradual feature rollout; managing beta features; environment-specific feature deployment
Benefits	Enhances security; fine-grained access control; helps comply with regulatory requirements	Reduces risk in deploying new features; improves user experience by customizing features; facilitates faster feedback on new features
Challenges	Complexity in managing detailed policies; risk of overly permissive or restrictive access controls	Complexity in managing numerous flags; risk of technical debt if flags are not properly managed or retired
Overlap	Both can be used to control what different users can see or do in an app, but through different mechanisms. Both involve some form of toggle or switch, but with different focuses (security vs. flexibility).

Practical demonstrations and real-world applications

One of the highlights of the webinar was a live demonstration by Ben on how Flagsmith operates. He showcased how easily feature flags can be managed across different environments, allowing developers to control feature rollouts, perform A/B testing, and customize user experiences without needing to deploy new code. This capability is crucial for testing in production and ensures that new features meet quality standards before full-scale implementation.

Alex then took the reins to demonstrate how Cerbos integrates with Flagsmith to handle authorization decisions influenced by feature flags. He detailed a scenario within a demo application where access to certain features, like an "expenses" section, is controlled both by feature flags and complex authorization policies. This example illustrated how combining feature flags with robust authorization checks can secure applications and tailor user experiences based on real-time data and user roles.
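The layering described in that scenario can be sketched in a few lines of Python. This is a minimal illustration, not the demo application's code: the `User` class, the `expenses_section` flag name, and the role names are all assumptions made for the example, and `can_view_expense` stands in for what would be a call to an external authorization service.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    id: str
    roles: set[str]
    # Flag states as they might be fetched from a flag service for this user.
    flags: dict[str, bool] = field(default_factory=dict)


def expenses_enabled(user: User) -> bool:
    """Feature flag check: is the expenses section rolled out to this user?"""
    return user.flags.get("expenses_section", False)


def can_view_expense(user: User, expense_owner_id: str) -> bool:
    """Authorization check (hypothetical rules standing in for a policy service)."""
    if "admin" in user.roles or "finance" in user.roles:
        return True
    return "employee" in user.roles and user.id == expense_owner_id


def show_expense(user: User, expense_owner_id: str) -> bool:
    # The flag decides whether the feature exists for this user at all;
    # the authorization check decides whether this user may see this record.
    return expenses_enabled(user) and can_view_expense(user, expense_owner_id)
```

Keeping the two checks as separate functions mirrors the point made in the webinar: the flag controls release, while the security decision stays with the authorization layer.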


The importance of SDKs

Both speakers emphasized the importance of SDKs in streamlining integration efforts. They highlighted how Flagsmith and Cerbos provide extensive support for multiple frameworks and languages, ensuring that developers can easily integrate these tools into their existing infrastructure.

Conclusion

The webinar not only highlighted the technical capabilities of Cerbos and Flagsmith but also reinforced the importance of these tools in modern software development. By decoupling deployment from feature release and abstracting authorization into manageable policies, developers can achieve greater agility and safety in their development processes. This approach minimizes disruptions caused by new features and ensures compliance with security policies, ultimately leading to more robust and user-centric applications.

For those interested in learning more, detailed documentation and community resources are available on our respective websites, and we welcome all to join our communities to continue the discussion.

Thank you to everyone who joined the webinar, and we look forward to seeing how you implement these strategies in your projects to enhance both the developer and end-user experience!

Useful resources

Cerbos

Cerbos Hub - find out more
Cerbos Hub - documentation
Book a 30 minute free workshop. Let us help you build or review your first policy
Join our Slack Community to get all your questions answered, and keep up-to-date with latest developments

Flagsmith

Transcript

Alex: Hello to everyone that's joining us. We will get started in one minute's time, so go and grab yourself a quick drink and get comfortable. I'm really looking forward to our chat today around feature flags and authorization with Flagsmith. Excellent, so we will kick this off. Good morning, good afternoon, good evening, wherever in the world you are. Thank you for joining us for this webinar today about feature flags and authorization and what they unlock for you and your modern development. I am Alex Olivier. I'm the chief product officer and co-founder here at Cerbos, which is an authorization solution.

I'm very pleased we have Ben Rometsch joining us. He's the CEO and co-founder of Flagsmith. Hey, Ben, how are you doing?

Ben: I'm good. Yeah. I was just moaning to you about the British weather earlier. And it's still raining. So other than that, I'm great.

Alex: Yeah, unfortunately, I'm in the UK as well, just north of London, and have equally bad weather. So I promise we won't talk about the weather too much, because that's too British of us.

So today's topic. I'll be absolutely tea tea. There's a bit too late. Yeah, maybe tea is a bit after tea time. So yeah, today we're going to talk about authorization and feature flags. Ben, I think you are best placed of anyone, given what you're doing with Flagsmith and your involvement with some of the open projects around feature flags as well. So please give us a bit of an introduction: how you got into this world of feature flags, and what your mission is for making everyone's lives better.

Ben: Yeah, awesome. So yeah, I'm one of the co-founders and CEO of Flagsmith. We are a commercial open source feature flagging project; you'll find us on github.com/Flagsmith. My background: I was running a software agency in London for 25 years, and Flagsmith was a side project that we started at that agency to build a feature flagging platform in our own vision, basically. That started as a side project and was open sourced.

It started to get some interest, and then just before COVID, I started working on that project full time by myself. So there are very lonely memories in that little room during various COVID lockdowns. And now we're a team of 15 all around the world as a remote-first company, working on, you know, the platform, the SDKs, all that sort of stuff.

And I'm also on the governance board of OpenFeature, which has been a really fascinating project to work on. OpenFeature, which we'll maybe talk about a little bit towards the end, is kind of analogous to OpenTelemetry, if people know about that. It's trying to build a common SDK standard for feature flags, to try and help prevent flagging companies from having to reinvent the SDK wheel all the time, and to help with avoidance of vendor lock-in and all that sort of stuff.

So most of my day job is Flagsmith, but yeah, it's fun to work on OpenFeature and, you know, meet people from very, very different disciplines and whatnot. Yep.

Alex: Yeah, we're super excited to speak to you, because I feel like there are a lot of similarities with what we do at Cerbos. You know, our timelines as companies are the same.

We are a commercial open source solution for authorization. The Cerbos project is up on GitHub again, at github.com/cerbos. We have our open core, and then we have the commercial control plane, Cerbos Hub, that sits on top. And we're equally working on these kinds of standards as well.

We're part of the OpenID working group around standardizing authorization, so it's always nice to speak to someone that's basically at the same point. We are, you know, trying to build this layer that makes developers' lives easier as much as we can and, you know, hopefully build a business around it at the same time.

As we hinted at the start, one of the reasons why we wanted to do this session is that when we're talking to users that are trying to implement authorization, one of the topics that comes up quite a lot is: OK, what's authorization? What's feature flags? What's the scope of each?

Where's the overlap? You know, are they two sides of the same coin? Are they separate concerns? And I guess to kick that off, Ben, it'd be good if you go through some of the use cases you see. When should you use feature flags? Where do you see feature flags commonly used?

And then we can have a bit of an overlay of how that works with authorization as well.

Ben: Yeah, okay. So when you shouldn't use them is when you're doing anything around security. That would be my first suggestion, well I mean, there's always exceptions in engineering, right? But generally if you're worried about the security of something then generally, yeah, don't, don't go here.

There's dragons here. Go and use a tool that's designed for that. But feature flagging is an interesting engineering concept in that it's very, very simple. The core use case is very, very simple and the benefit that you get from that core use case generally is as much as we see from the people that use Flagsmith and OpenFeature is really, really you know, a lot of value added to teams.

So that core use case is trying to decouple deploying software and releasing features. That's the general use case for using feature flags. The back story to Flagsmith, and the reason that we built it at an agency as a side project, was that we just had enterprise customers who could not release versions of their applications without something burning or setting on fire or, you know, teams getting fried having to work weekends, because they were working in larger organizations with multiple dependencies amongst teams.

And their feature releases were tied and coupled to their application deployments. And quite often those deployments were happening, you know, at three in the morning by some poor DevOps or SRE person. You know, and then things break and all this sort of stuff. So feature flagging, which we'll go through in a little bit, really just allows you to... that's the core, you know. If you only think about or remember one thing from this call, it would be that they just decouple deploying your software from releasing your features.

And that one very, very simple step of progress, especially amongst larger teams or teams that have to go through a lot more ceremony to release their software, you know, because they're in a bank or financial services institution or whatever, can be transformational to those teams. So it's not much software, and it's a lot about process and humans. I think maybe with Cerbos, you know, the ratio is a little bit less, but yeah, software people, engineers, always like to try and, you know, slap loads of software on stuff, but feature flags are pretty simple.

And they're very much around helping teams become more efficient as teams of humans, as opposed to, you know, some heavy lift of software to try and do something incredible. Yeah.

Alex: Yeah, I certainly have many dark memories of doing those feature releases and things going catastrophically wrong and trying to roll back at 2 am.

Ben: Yeah, database migration reversals and all that sort of stuff. And that's another thing that we've experienced as well: there's a huge amount of legacy code out there, right? New code is like the tip of the iceberg, and legacy code, legacy infrastructure, legacy frameworks, legacy languages, legacy databases, that's the vast majority of software engineering out there.

And I think that's something that the discipline forgets a lot. You know, those people working in those teams kind of get forgotten about. So yeah, feature flags can be really, really valuable to folk like that because, you know, if it's dozens of hours of labor to get a release out, then you really magnify the value of what you get out of feature flags.

Alex: Yeah, and I really like the framing of decoupling releases. That is kind of the story we've been telling with Cerbos all along as well, which is around the security side of things, not the feature flag side of things. So what we do say is Cerbos is here to help you with authorization, and by authorization, to be very clear, we're talking about: can this user do this particular action in this particular part of the system?

An input to that which we see quite a lot is feature flags. You know, has this person got the feature flag enabled, or do they have this particular trait set upon them? That's an input to make an authorization decision. If you imagine a typical SaaS application, you have a user, and they belong to a project. Like inside of Flagsmith, which we'll go through shortly: you have an environment, you're inside a project, etc.

So I have a role within that, and you need this kind of layer which defines: OK, can Alex or can Ben do the enable-flag action in this environment under these kinds of scenarios? Traditionally you end up having to hard-code all this into the application code: if user role equals x then allow the action, if user role equals y, etc., etc. That's certainly what I did. Similarly to feature flags, you end up hard-coding some of this logic in and trying to hack around it, to a point. The Cerbos approach is to decouple that out into a standalone externalized service, similar to what you're doing with feature flags. You then have your authorization policy written as actual policy files rather than code. So it's not hard-coded in your application; you've decoupled it, and your authorization policies can evolve and change independently of your application code.

And to pick up on legacy: a lot of our users that we see implementing Cerbos are going through these big migration-type projects, or moving to a new platform, and they're heavily using feature flags to point users to new parts of the system or the old parts of the system. And one of those components you need to abstract as well: much like authentication, you need to abstract authorization, so it works consistently across them. That's why we more often than not end up having these discussions around "should this be a feature flag or should this be an authorization piece", and why we thought it would be an interesting discussion to have with you.

You know, where are those boundaries? Where do they lie? And I think the way you put it works: if it's security, that's your authorization. If it's more about the controls part, then feature flags kind of make more sense. I don't know what you're seeing when users are implementing flags, what kind of best practices you're pointing them towards.
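The contrast Alex draws between hard-coded role checks and externalized policy can be sketched as follows. This is an illustration of the general idea only: the `POLICY` dictionary is a made-up structure, not Cerbos's policy format, and `is_allowed` is a hypothetical stand-in for a call to a separate decision service.

```python
# Hard-coded style: every rule change means changing and redeploying the app.
def can_enable_flag_hardcoded(role: str, environment: str) -> bool:
    if role == "admin":
        return True
    if role == "developer" and environment != "production":
        return True
    return False


# Externalized style: the rules are data that could live in a policy file
# and be evaluated by a separate service, independent of application code.
POLICY = {
    "enable_flag": [
        {"roles": {"admin"}, "environments": {"staging", "production"}},
        {"roles": {"developer"}, "environments": {"staging"}},
    ],
}


def is_allowed(policy: dict, role: str, action: str, environment: str) -> bool:
    """Return True if any rule for the action matches the role and environment."""
    for rule in policy.get(action, []):
        if role in rule["roles"] and environment in rule["environments"]:
            return True
    return False
```

With the second form, tightening or loosening who may enable a flag in production is an edit to `POLICY`, and the call sites in the application stay unchanged.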

Ben: Yeah, I mean, it really depends on where the customer's coming from. Like it's very common for us to have and work with customers who, who are coming from like a big legacy, sorry, homegrown thing that they've built. That's just kind of sort of grown out of control. No one wants to look after it.

You know, there's no standardization around languages. There's no kind of SDKs to think of. It's quite common for fairly hairy things to be happening around, like controlling application behavior with like environment variables. Like modifying environment variables and then like bouncing services to, to try and modify application behavior.

So it really does depend. It's kind of so interesting how, if you squint, Cerbos and Flagsmith are kind of almost identical. Yeah, it's amazing. If you start off with Flagsmith, if you're starting a new project and you're literally, you know, at git init or whatever, there's really no drama or fanfare or, you know, anything at all, really.

It just becomes part of your workflow. I'm assuming it's, you know, pretty, pretty similar with you guys. You know, the flip side of that is if you're coming from a legacy application you know, that's maybe, you know, that's the other thing as well. It's quite common for a legacy platform to have like four different ways of remotely modifying application behavior, like environment variables might be one.

There might be, you know, a config file somewhere on someone's laptop; there might be an admin interface to something. So if you're coming from that, then the best practices are very different. It's just about trying to chip away at that kind of edifice, slowly moving across from where you're coming from to where you're going to. And that's why OpenFeature is so important as well, because the goal of OpenFeature is that you're never going to have to do that again, right? You can go back to your own homegrown platform if you want; you just need to write an OpenFeature provider.

So yeah, it's, it's, it's really, it really, you know, it really is the case of, of where you're coming from. Yeah.
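The provider idea Ben mentions can be illustrated with a small Python sketch. The interface below is hypothetical and deliberately simplified; it is not the OpenFeature SDK's actual API, and the class names, the `new_checkout` flag, and the `FEATURE_` environment-variable prefix are all assumptions for the example.

```python
from typing import Protocol


class FlagProvider(Protocol):
    """Hypothetical vendor-neutral interface (not the real OpenFeature API)."""

    def get_boolean(self, key: str, default: bool) -> bool: ...


class HomegrownProvider:
    """Adapter over a legacy in-house flag store (here, just a dict)."""

    def __init__(self, store: dict[str, bool]) -> None:
        self._store = store

    def get_boolean(self, key: str, default: bool) -> bool:
        return self._store.get(key, default)


class EnvVarProvider:
    """Adapter over environment-variable style configuration."""

    def __init__(self, env: dict[str, str]) -> None:
        self._env = env

    def get_boolean(self, key: str, default: bool) -> bool:
        raw = self._env.get(f"FEATURE_{key.upper()}")
        return default if raw is None else raw.lower() in {"1", "true", "on"}


def new_checkout_enabled(provider: FlagProvider) -> bool:
    # Application code depends only on the interface, so swapping the
    # backend means swapping the provider, not rewriting call sites.
    return provider.get_boolean("new_checkout", False)
```

Because `new_checkout_enabled` only knows about the interface, a migration from environment variables to a flag platform (or back to a homegrown store) touches the provider and nothing else.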

Alex: Yeah. Should we? I'm going to hand you the reins, Ben, and you can take us through Flagsmith, for example.

Ben: Let's yeah, it's always good to give people some concrete examples of what things look like.

So hopefully I'm going to share my screen now. And actually, we just redesigned this UI, this navigation area. So I'm really excited every time I log into the platform; it's loads better, and we spent years trying to solve this problem. So basically, the way Flagsmith works... this is going to be a super high-level, quickfire demo of a couple of ways that you can use feature flags, and hopefully it will give people some ideas about how they might be able to use them.

So this is the Flagsmith application. And Flagsmith organizes itself into projects. Projects are generally applications or properties that you, that your teams are working on. And so if I click into a project we're looking at the Flagsmith website itself. So we use Flagsmith to deliver features into the platform itself, which is meta and fun.

And this is what we're looking at here now. So projects have environments, and flags live within a project. But the values of flags can change depending on the environment you're looking at. So a super simple workflow would be: you start working on a new feature, and you come into the Flagsmith platform and create a new flag for that feature.

And then as you can see here, these are the different environments that we've got within the platform. We only have two kind of, you know, production-like infrastructure environments, staging and production. And then these are the features that are associated with the platform.

So you can see here, this is a metadata feature that we're working on at the moment; we've been working on it quite a while. If I change the environment from, say, staging to production here, you can see that the flag names are the same, but the values that they might take are different.

So that allows us to go through and, you know, as we're approaching and finishing a certain feature, we might turn it on in staging, or we might turn it on for a set of users in production. What feature flag platforms then do, once you've achieved decoupling deployment and release, is allow you to be much more expressive about how you release features.

So you can release features to, you know, a different class of users: users on different platforms, maybe mobile devices, or in different geographical regions, or who've got access to a different plan, for example. So just to give you a super quick idea, if I come to the demo environment and I go to demo.flagsmith.com, you will see that this is exactly the same front end, but we're now looking at the demo environment within Flagsmith, and I've got a butter bar here. The butter bar allows us to control messaging to the users within the platform. And currently within the demo environment, the butter bar has no value.

So flags, and this is common amongst many flag providers, aren't necessarily just booleans. They could be textual values, structured data, or multivariate values if you want to drive an A/B test. So if I say "welcome to our stream"... you can see here currently there's no butter bar. But if I update this feature value, you can see here that the butter bar appears on the demo application and our text appears.

And so all I'm doing is I'm, I'm remotely controlling the behavior of the application in this example, the demo environment of the, the application. Using flags and flag values to modify that application behavior. So when it comes to decoupling deployment and release this, you know, the canonical example would be you deploy your application with the flag for the feature disabled.
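To make the butter bar example concrete, here is a minimal Python sketch of a flag that carries a value and has a separate state per environment. It is an in-memory illustration under assumed names (`FlagState`, `FLAGS`, `render_banner`, `set_value`), not how Flagsmith stores or serves flags.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlagState:
    enabled: bool
    value: Optional[str] = None  # flags can carry a value, not just on/off


# One flag name, with a separate state per environment.
FLAGS = {
    "butter_bar": {
        "staging": FlagState(enabled=True, value="Staging build"),
        "demo": FlagState(enabled=True, value=None),
        "production": FlagState(enabled=False),
    },
}


def render_banner(environment: str) -> str:
    """Return the banner text for an environment, or '' when nothing should show."""
    state = FLAGS["butter_bar"][environment]
    if not state.enabled or not state.value:
        return ""
    return state.value


def set_value(flag: str, environment: str, value: str) -> None:
    """Simulate editing the flag value in a dashboard; no redeploy involved."""
    FLAGS[flag][environment].value = value
```

Calling `set_value("butter_bar", "demo", "Welcome to our stream")` changes what `render_banner("demo")` returns while the application code, and the other environments, stay exactly as deployed.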

And then, you know, if you want to, your SRE and DevOps folk can do that at three in the morning, if that's just the way you roll. And then when the engineering team and the product team come in at a reasonable hour, they can just come in and turn on this feature in production and track the behavior of that feature.

They can look at the telemetry coming out of the platform and see that they're happy with it. And then obviously, if they're like, "Oh, this really isn't working, there's some consideration we didn't make with regards to the production environment," they can just turn it off and it will disappear.

So just to give you a super quick example, that's modifying the behavior of the application for a particular environment, so everyone accessing that environment can see that change. But you can also apply user context to that feature value.

So I can say, for example, in the demo environment... I've just found my particular user that I'm identifying. I can come in and provide a specific butter bar value for my individual user in that demo environment. So I can turn it on just for me, which is super powerful: testing in production against production data, production APIs and services.

I can do this just as an individual user. And then I can also provide context about users. We call that context traits. These are key-value pairs of data that you provide via our SDKs to the flag evaluation engine. Just very briefly, I don't want to spend too long on this demo.

We can define a segment which makes use of those traits. Segments are just collections or groups of users based on a set of rules. And here we have a Flagsmith team segment. Kyle has been demoing this, so we can delete him. This is a simple segment, and it is basically just saying: if the email trait of the user matches this regular expression, then include them in this Flagsmith team segment.

So effectively, anyone logging in with a flagsmith.com email address is going to be a member of this segment. Once I've defined that, I can come back to my butter bar, and I can basically say to the platform: right, for Flagsmith team members, I want it to say "Welcome Flagsmith team". And I will update that segment override for the butter bar.

And you can see that this message is showing. This is going to show to everyone who logs into the platform with a flagsmith.com domain and no one else. So that just gives you a bit of an idea. You know, a lot of different flag providers are kind of similar in this way, where you can be much more expressive about how you release features and who you're releasing features to. That then opens the doors for doing things like A/B and multivariate tests, using the flag engine to bucket your user behaviors.
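The identity override and the email-based segment from the demo can be sketched together in Python. The regular expression, the user IDs, and the resolution order (individual override first, then segment, then the environment value) are assumptions chosen for the illustration rather than a description of Flagsmith's evaluation engine.

```python
import re
from typing import Optional

# Segment rule: the "email" trait must match this regular expression.
FLAGSMITH_TEAM = re.compile(r".*@flagsmith\.com$")

ENVIRONMENT_DEFAULT = "Welcome to our stream"
SEGMENT_OVERRIDE = "Welcome Flagsmith team"
IDENTITY_OVERRIDES = {"ben": "Only Ben sees this"}  # per-user override


def butter_bar_for(user_id: str, traits: dict[str, str]) -> str:
    """Resolve the banner value: identity override, then segment, then default."""
    identity_value: Optional[str] = IDENTITY_OVERRIDES.get(user_id)
    if identity_value is not None:
        return identity_value
    if FLAGSMITH_TEAM.match(traits.get("email", "")):
        return SEGMENT_OVERRIDE
    return ENVIRONMENT_DEFAULT
```

Traits arrive as plain key-value pairs, so the same function handles anonymous users (no `email` trait) by falling through to the environment-wide value.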

Doing things like percentage rollout. So saying I want 1 percent of my user population to see this feature. And then, you know, if nothing is on fire, I want to, I want to up that to 10 to 50 to a hundred percent, very, very expressive around this sort of stuff. So the initial use case, very, very simple.
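A percentage rollout of the kind Ben describes is commonly implemented by hashing each user into a stable bucket. The sketch below shows that general technique in Python; it is not taken from Flagsmith's implementation, and the function names and the `flag:user` hashing scheme are assumptions.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> float:
    """Map a (flag, user) pair to a stable number in [0, 100)."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    # Use the first 32 bits of the hash as a fraction of the full range.
    return int(digest[:8], 16) / 0x100000000 * 100


def in_rollout(flag_name: str, user_id: str, percentage: float) -> bool:
    """True when the user's bucket falls inside the rollout percentage."""
    return bucket(flag_name, user_id) < percentage
```

Because a user's bucket never changes, raising the percentage from 1 to 10 to 50 only ever adds users: everyone who saw the feature at 10 percent still sees it at 50 percent.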

And, you know, we generally see customers taking on board those simple use cases, but then as they get more confident with the pattern and the approach, they start to use the platform in much more expressive ways. And yeah, you can download everything that I've just shown you, and this is all open source.

You can grab it, have it up and running on your laptop in a few minutes, and yeah, start hacking around with the tool.

Alex: So as someone that is always looking at how we're rolling out Cerbos as our product, you know, we've got new features coming out very, very soon. And one thing that we're actively talking about is: OK, how do we control these, and how do we feature flag these? And my brain immediately runs to the concern of how do I not end up with a proliferation of too many feature flags and too many segments. You know, I can see that list getting longer and longer and longer.

Have you got any best practices or recommendations for, you know, when to use a feature flag? What's the hygiene around them? When do you clear them up? Those sorts of things.

Ben: Yeah. So there's a lot of tooling in Flagsmith to help you with that. And there are generally different classes of flags.

So there are long-lived flags, you know, like the butter bar, for example. That's going to live for the lifetime of the platform, I would expect, because it's always useful to be able to say, you know, I need to tell users in a particular environment something, optionally, right? So those flags would live in the platform and in your code, you know, forever. But then other types of flags... you know, if you are using it to roll out a feature, and then everyone's been using that feature, you're happy with it, the Jira ticket has been long closed, and there you go, then yeah, there's tooling around identifying stale flags, you know, to promote cleanup of that code.

So yeah, like the, you know, but there's, there's a lot of different. Our customers use the platform in very, very different ways. You know, because it's such a general tool. So yeah, it's always interesting to see how people are using it. And, you know, there are fairly obvious you know um, like, like bad ways to use the platform.

And we've seen some kind of like. Crazy, you know, we've seen people use it to do local string localization for example, things like that. Yeah. Just because, you know, if you can dynamically, you know, control the behavior application, then you can start doing all sorts of stuff. And sometimes we get asked, you know, is there a, is there a language runtime in the you know, on the, on the client side and things like that, like, you know, can we feed it code and things like that?

And it's like, it's not really you know, what, what we've designed it for, but yeah, you know, there's, it's generally quite obvious if you're, if you're doing something wrong or if you're, you know, you're doing an anti pattern, but there's a lot of stuff that we've written and thought about and published on our, on our website about best practices.

And actually OpenFeature really is kind of like that. We are really trying to distill down, you know, a bunch of different projects and their learnings and try and get a common denominator of the best way to do things, yeah.

Alex: And what does a typical integration look like? I'm assuming you've got SDKs for most of the languages, APIs and such.

You know, what does it take to actually go and integrate Flagsmith into your application?

Ben: Yeah, so if you're using our SaaS platform, it's literally, you know, within five minutes you can have the SDK running in your JavaScript application, your React Native application, your iOS application.

And then, yeah, like the actual code that you're writing is really, really simple. It's, you know, it's a conditional, but it's like, Here's an environment key, grab my flags. What's the state, what's the value of this flag? Is it, is it on or off or, you know, just display, emit the textual value somewhere into the application.
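The shape of that integration code can be sketched in Python. The client below is an in-memory stand-in written for this example; its class and method names are hypothetical and do not reproduce any real SDK, where the client would fetch flags over the network using the environment key.

```python
class InMemoryFlagClient:
    """Stand-in for a flag SDK client; a real one would fetch flags remotely."""

    def __init__(self, environment_key: str, flags: dict[str, tuple[bool, str]]) -> None:
        self.environment_key = environment_key  # identifies which environment to read
        self._flags = flags  # name -> (enabled, value)

    def is_enabled(self, name: str) -> bool:
        return self._flags.get(name, (False, ""))[0]

    def get_value(self, name: str, default: str = "") -> str:
        enabled, value = self._flags.get(name, (False, default))
        return value if enabled else default


def page_header(client: InMemoryFlagClient) -> list[str]:
    """Build header lines, gated by flags."""
    lines = ["Dashboard"]
    if client.is_enabled("metadata"):  # simple on/off conditional
        lines.append("Metadata panel")
    banner = client.get_value("butter_bar")  # emit a textual flag value
    if banner:
        lines.append(banner)
    return lines
```

The application side is just the conditional and the value lookup in `page_header`; everything about who sees what lives behind the client.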

So yeah, there's, you know, the, the SDKs are very, very lightweight. They have like, you know, a handful of methods. And, you know, I guess very similar to yours, right? It's like, Most of the time they're just, they're just, you know, your, your, your SDK is either giving a true or false, right? Like, yeah. So the, the the actual amount of kind of like stuff you need to get clear in your head around a code level is very, very, very simple.

Alex: Yeah. And actually, we had a question come in during that. You mentioned you've got the SaaS offering as well, but there's the open source project. There's a question here from Raganath: can you run Flagsmith inside of your own cluster?

Ben: Yeah, you certainly can. The most common method of deploying Flagsmith, if you're not running on our SaaS, so, you know, if you want to run it yourself in your own infrastructure, is through Kubernetes. So yeah, we've got a Helm chart. You know, we regularly deploy directly into our customers' infrastructure, in one of the cloud providers, or they run it on their bare metal.

Actually, that's been one of the interesting things. You know, when we started on the project five years ago, the deployment story for on-premise was much more complicated. There was Rancher, there were a couple of other orchestration frameworks, there was Docker Swarm, and it was a much more complicated answer to that question of how do we deploy Flagsmith.

But it's been kind of interesting. Like, yeah, I can't remember the last time someone wanted to deploy it not in Kubernetes. So yeah, we've got very much first-class support for Kubernetes. And that's the most common way; you know, it exists in Kubernetes way more than anywhere else, for sure.

Alex: Yeah, and that's kind of what we're seeing as well with people deploying Cerbos. We have a similar story with Helm charts and those kinds of things. I would say the one other environment we see quite a lot is the serverless container runtimes, say Cloud Run or Amazon ECS, as I would say the number two spot. But I think we're now mostly beyond the days of bare metal machines and running executable binaries directly on them, which is a relief, especially now you've got to worry about ARM or x86 instruction sets and all that fun stuff.

Cool. So thank you for that, Ben, and anyone that's listening in, feel free to fire away questions in the chat. We'll get back to those.

I'm going to jump back in and drive for a bit, so bear with me. When we were preparing for this session, and this is kind of the start, we used to get asked a fair bit: what's authorization, what should be feature flags, and how do those two things work together?

And so what we have is a bit of a demo that combines Flagsmith with Cerbos and gives that end-to-end flow around using feature flags to influence authorization decisions, and then using authorization decisions to influence how the user experience behaves.

So just a bit of scene setting before we dive into that and show it off. I've said this authorization word a few times now. Unfortunately it sounds very similar to authentication, and it's even worse when it gets reduced down to AuthZ and AuthN. They are two different concepts.

So authentication, just to make sure we're on the same page, is about ensuring the person is who they say they are. You challenge someone to provide a credential, username, password, etc., and you get back an identity that you're pretty sure is who they say they are. Authorization is about: okay, you now know who someone is, can they actually do something, can they perform a particular action inside of a system?

And this probably looks very similar to what you'd do with the Flagsmith SDK: you end up with these if statements in your code. If you were to hard-code authorization logic, similar to feature flags, you're going to put in things like: okay, let's work out whether this person is an employee. So you'd have, if the user's email includes the company domain name, the kind of example you were showing, then they should be able to access a certain capability. Or you might look at what package someone has, or work out whether someone belongs to a certain group and whether they should have access or not. You may go down to things that are more data locality or data security specific, and generate audit logs around access controls.

This logic gets kind of brittle as you start having more and more of these requirements come in. Getting very fine-grained authorization logic is actually a bit of a maintenance overhead, and any time you want to change that authorization logic, you're going to have to update your codebase all over the place. Similarly with flags: if you want to enable a feature, you can now just flick a switch in Flagsmith versus having to do a whole new release of the application.

So the problem with that approach is that authorization logic is now going to be spread out, if you hard-code it, across all your different apps and services. You might have a React Native app, an iOS app, an Android app, a web app, and some API access, and you need to have standardized authorization across all of those. So updating it is going to be very painful: you ultimately have to take that business logic and convert it into n languages, based on however many services or frameworks you have in your stack.
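To make the hard-coded approach concrete, here is a small Python sketch of the kind of if statements being described. The domain `example.com`, the plan name, the group name, and the function names are all assumptions made up for illustration; they are not taken from the demo.

```python
COMPANY_DOMAIN = "example.com"  # assumed placeholder domain


def can_access_internal_tools(user: dict) -> bool:
    # Hard-coded rule: anyone with a company email counts as an employee.
    return user["email"].endswith("@" + COMPANY_DOMAIN)


def can_export_reports(user: dict) -> bool:
    # Hard-coded rule: depends on the package someone has and a group they belong to.
    return user["plan"] == "enterprise" and "analysts" in user["groups"]


employee = {"email": "sam@example.com", "plan": "enterprise", "groups": ["analysts"]}
outsider = {"email": "pat@other.org", "plan": "free", "groups": []}

employee_checks = (can_access_internal_tools(employee), can_export_reports(employee))
outsider_checks = (can_access_internal_tools(outsider), can_export_reports(outsider))
```

Every rule here lives in application code, so changing the domain check or the plan requirement means editing and redeploying each service that carries its own copy of these functions.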

And then documentation becomes a bit of a nightmare, because there's logic spread out across different parts of your codebase. So what Cerbos does for authorization, similar to Flagsmith with feature flags, is extract all of that out and decouple it into something that's much more standardized, a single pane of glass, as it were, to see exactly what's going on.

When it comes to authorization, Cerbos does this via policy. So rather than hard-coding who can do what inside of a system, you define your authorization logic as static policy files. We use YAML. I think YAML is here to stay, love it or hate it, as a way of defining conventions. Cerbos uses YAML as well as Common Expression Language for defining conditions, and basically your policy files are defining: here are the different resource types in my system, here are the different actions that are possible, and under which conditions. You can do your simple role-based access control checks, so does this person have a role or not?

Or you can get more to the fine grained attribute based control checks, where you're actually inspecting individual attributes about the user making the request or the resource they're trying to access. And where this overlaps with kind of feature flags is these attributes can come from anywhere. They can come from your identity provider, they can come from your internal database, or in this case, they could come from your feature flagging system.

So attributes about the user can include what flags they have turned on, and then you can start making authorization decisions at the API security level. Should this request be allowed or not? That can depend on whether a particular flag is enabled.

So in terms of the demo architecture, here's how this works. Imagine we're looking at a single request going through the system, and we're going to look at how that intersects with the feature flags, how it intersects with Cerbos, and how it intersects with the rest of your app.

So you have your end users, and they're on one of those client devices. That client device might have feature flags locally on it as well, using one of the Flagsmith SDKs, but ultimately there are going to be API calls going off to some backend infrastructure. And that's probably going to be running in a Kubernetes cluster in some hyperscaler somewhere.

That request comes in, and when it does, you know a few different things. You know who's making the request, because that request is going to be authenticated. So you can go to your authentication provider, Okta, Auth0, Cognito, Firebase, these kinds of solutions, and look up that user's identity.

You can also then, using the Flagsmith SDKs, go and grab what feature flags that user has enabled, or whatever config they have enabled. So using your butter bar example, you can go and fetch: okay, the butter bar is enabled, and it has these particular values. And these are basically attributes that you can associate with the user.

The other thing, you know, based on the request is what resource they're trying to access. So we're going to be looking at like an expenses tracking application. So imagine you've just gone to a conference, you want to file your expenses for work, and that needs to be approved by you know, finance or your manager, et cetera.

So based on the request, you might have an approve endpoint that is going to do the approve action. So, you know, from that request, you can go out to the database, go and fetch that particular record, and then make an authorization decision. And then rather than hard coding it, the Cerbos approach decouples that out into a standalone service that runs right alongside your application.

So your application is now going to send a request from your service to a Cerbos instance. In that payload will be the principal, the user making the request: who they are, their ID, their teams, their groups, and what feature flags they have enabled at that moment in time. And also the resource they're trying to access or do an action upon. So here's an expense, ID 123, it's about 10,000 for this event, et cetera. And then what actions are being attempted: create, read, update, delete, approve, deny, comment, flag, whatever is relevant for that particular business domain. And that goes off to that authorization decision point.
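The payload described here can be sketched as a plain data structure. The Python below is only an illustration: the field names (`principal`, `resource`, `attr`, `flags`) and the helper `build_check_request` are assumptions chosen for readability, not a statement of the exact request schema of the Cerbos API or its SDKs.

```python
def build_check_request(identity: dict, flags: dict, expense: dict, actions: list) -> dict:
    """Assemble a hypothetical authorization check payload.

    identity: who the authenticated user is (from the identity provider)
    flags:    feature flag states for that user (from the flag service)
    expense:  the record fetched from the database
    actions:  the actions being attempted
    """
    return {
        "principal": {
            "id": identity["id"],
            "roles": identity["roles"],
            "attr": {
                "department": identity["department"],
                # Flags travel as ordinary attributes of the principal.
                "flags": dict(flags),
            },
        },
        "resource": {
            "kind": "expense",
            "id": expense["id"],
            "attr": {"amount": expense["amount"], "ownerId": expense["ownerId"]},
        },
        "actions": list(actions),
    }


request = build_check_request(
    identity={"id": "sally", "roles": ["user", "finance"], "department": "finance"},
    flags={"expenses": True, "dark_mode": False},
    expense={"id": "123", "amount": 10_000, "ownerId": "frank"},
    actions=["view", "approve"],
)
```

The design point is that the flag states are passed in alongside identity attributes, so the decision point never needs to talk to the flag service itself.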

And what's loaded into that is the policies. So those kind of YAML definition files I showed you before, that contains all the business logic. So now in this example, we're actually going to go through and actually make authorization decisions based upon not only who the user is and what role they have, but whether they have particular flags enabled on them.

And again, these flags are all being sourced from Flagsmith via their SDKs. The authorization point will make a decision and create an audit log of that decision. And I saw you have audit log capabilities inside of Flagsmith as well. This really captures the decision: this user tried to do this action on this resource, and it was either allowed or denied by this particular policy. What comes back to the application is a simple boolean: yes, this action should be allowed, or no, it shouldn't, for this particular reason. So architecturally, that's how we're sticking things together for this particular example.

So here's our little demo application.

It's a basic React application. It's talking to a backend that's running as a container, and there's data behind it. The only difference here is I can very quickly switch between different user roles, and we can see different parts of this application being shown and hidden, enabled and disabled, and access granted based upon those authorization policies.

So this is all wired up to Flagsmith. So here's our Flagsmith demo environment. I'm going to use the dark mode example, so I'm going to go and turn on dark mode as a feature flag. I'm going to go turn that on in my environment. I go and reload my application. And dark mode is enabled by a flag.

And so that's more of a visual type check. What we've also wired up here is more like developer tools. So we have like this debug panel, which I can go and turn on, and I can obviously turn this on based on the targeting like you were showing us, Ben. Refresh the page. Now we get this kind of demo panel that shows us all the authorization decisions that are going on, et cetera.

But the one I want to focus on is this expenses. So inside our application we have this expenses section, which currently, regardless of who I'm looking at, um, there's, there's no data. There's like, there's no information beyond here. So what's happening behind the scenes every time this page is loading is grabbing that user's identity.

It's saying to the backend, go and fetch all the expenses that this user should see. And that request is going off to Cerbos to make an authorization check. So I pulled up our authorization policy behind the scenes. So this is in Cerbos Hub, which is our management control plane for authorization. And this is our resource policy for an expense resource.

So here we have an expense resource. We have some different rules and some different actions. To view a resource, in this case an expense, you must be the owner, you must be in the finance team, you must be a manager. But we also have this very blanket rule at the top that says all actions should be denied unless the expenses flag is enabled for the person making the request. So to tell you what's going on: the app's loading, we're authenticating, we're going out to Flagsmith and grabbing the flags that user has enabled.

We're passing them into Cerbos, and then we're making an authorization decision. Firstly, we check whether the expenses flag is enabled, and if it isn't, we deny everything and thus present the empty state we see on the screen. But if I go and flick that switch now and enable it, that goes through, the backend will update its settings, and what we'll see in a couple of seconds is that we're now getting data back.
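The logic of that policy can be paraphrased in a few lines of Python. This is a sketch of the decision logic only, not Cerbos policy syntax (which is YAML with CEL conditions), and it assumes the owner, finance, and manager conditions are alternatives, that finance and manager are roles, and that the flag arrives as a principal attribute named `expenses`.

```python
def can_view_expense(principal: dict, expense: dict) -> bool:
    # Blanket rule: deny everything unless the expenses flag is on for this user.
    if not principal["attr"]["flags"].get("expenses", False):
        return False

    # Otherwise, allow the owner, the finance team, or a manager (assumed alternatives).
    is_owner = expense["ownerId"] == principal["id"]
    has_privileged_role = bool({"finance", "manager"} & set(principal["roles"]))
    return is_owner or has_privileged_role


expense = {"id": "123", "ownerId": "frank", "amount": 12_000}

finance_flag_off = {"id": "sally", "roles": ["finance"], "attr": {"flags": {"expenses": False}}}
finance_flag_on = {"id": "sally", "roles": ["finance"], "attr": {"flags": {"expenses": True}}}
stranger_flag_on = {"id": "zed", "roles": ["user"], "attr": {"flags": {"expenses": True}}}

decisions = [
    can_view_expense(finance_flag_off, expense),  # flag off: empty state
    can_view_expense(finance_flag_on, expense),   # flag on, finance role
    can_view_expense(stranger_flag_on, expense),  # flag on, but no matching rule
]
```

With the flag off, every check is denied regardless of role, which corresponds to the empty state shown in the demo.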

So here we're combining feature flagging with authorization policies in Cerbos into one solution. We have a policy-driven approach for API access, for how things are controlled and updated, but we're also influencing that based on the flags that came from Flagsmith.

So it combines those two systems together, and other parts of the application we can easily control by flags also. So what if we wanted to change this logic, or do something a bit more interesting than just showing and hiding data in the response here?

So if you're looking at this particular expense, we have this one for Global Airlines for 12,000. I'm currently viewing this as someone that's a member of the finance team, and we can see it, we can go into it. But we can see here, I can't actually approve it. Why is that? Well, we have some rules in our policy.

We can look at the approve action, which says that someone who has the role of finance is allowed to approve only if the expense is less than ten thousand dollars. The one we're looking at was twelve thousand dollars, thus this rule doesn't pass, so the action shouldn't be allowed. I can go and update this, move this to 20,000, and we can see inside of our editor here that the approve action is now allowed, based on our logic. Change it back, and you can see that will update. So we get that real-time feedback, like you were seeing with the flags: as you change things, things update, and you get a similar flow with Cerbos. I can now go and commit this change and push this policy.

So if I do that and show you how things are wired up (if you don't like code, maybe look away), essentially what's going on inside our application to enforce this is: when a request comes in, we authenticate that request, so we've got a user's identity. We've instantiated the Flagsmith SDK, so we're pulling through the flags. Then in our request to Cerbos, we construct what we call our principal object. So who's the user making the request? We've got their ID, we've got their roles, we've got some attributes about them, what department they're in, etc., and then what flags they've currently got enabled, based upon what's coming back. So again, combining those two.

So if I go and make that change in my policies, I change this to 20,000. I can go and push that change, and the updated logic goes out. This is all going up to my GitHub repo. And then we have Cerbos Hub, which is our SaaS offering that manages your policies for you, and we're kicking off a whole CI process. So we're updating that policy, the logic's changing, builds are going in. Actually, in this case, tests have failed.

Almost like I planned it. So a test-driven approach to authorization is something that you get with Cerbos. We can test different states of the feature flags, making sure that the policy is behaving correctly based on those flags. In this case, this particular test is failing because the test says the action should be denied, but actually it's allowed. If we go and look at that test case, we're saying for a finance team member, over 10,000 should be denied. I should go and update that and say allow. So we didn't update our test case; we have now. Push that change, sync that up, and then our builds go off.

So what's happening behind the scenes? Cerbos Hub is building our policies, running our tests, and then it's going to distribute them to all of the connected instances. My demo environment has a couple of Cerbos PDPs (Policy Decision Points) running. This build will go through, and you can see it's already generated, all the tests have passed, and we update all the instances that are listening, in this case for the Flagsmith policies. It goes out and pushes and deploys.

And then in a few seconds, and there we go, you can see it's already updated. So we're updating our authorization policy without having to redeploy our application. It's the exact same story that you get with Flagsmith, where you're decoupling deployment from configuration. So here we're joining together the feature flags, which are enabling parts of the application, with the authorization policies of Cerbos, to give that joined-up experience where you can do feature flagging and security as one holistic solution.
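The approve rule and the failing test from the demo can be sketched as follows. This is a Python paraphrase under stated assumptions, not Cerbos policy or test-suite syntax: the function `can_approve`, the `TEST_CASES` table, and the `run_tests` helper are hypothetical names, and the expected outcomes mirror what was described (a finance member may approve only expenses below the limit).

```python
def can_approve(roles: list, amount: int, limit: int) -> bool:
    # Finance may approve only if the expense is below the limit.
    return "finance" in roles and amount < limit


# (description, roles, amount, expected decision)
TEST_CASES = [
    ("finance member, 12,000 expense", ["finance"], 12_000, False),
    ("finance member, 5,000 expense", ["finance"], 5_000, True),
    ("regular user, 5,000 expense", ["user"], 5_000, False),
]


def run_tests(limit: int) -> list:
    """Return the descriptions of the test cases that fail for a given limit."""
    return [
        description
        for description, roles, amount, expected in TEST_CASES
        if can_approve(roles, amount, limit) != expected
    ]


failures_original_policy = run_tests(limit=10_000)
failures_after_change = run_tests(limit=20_000)
```

Raising the limit to 20,000 without touching the expectations makes the 12,000 case fail, which is the situation in the demo: the policy change was intentional, so the test expectation had to be updated before the build could pass.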

Cool. So we have some questions coming in. Ben, I don't know if you wanted to jump in then.

Ben: Yeah. Well, one of the things that I'm interested in: the demo is a great example of separation of concerns and, I would say, of why you shouldn't use feature flags to deal with permissioning.

Can you talk a little bit about how that would be handled, because Flagsmith runs on the client side and the server side, right? I'm curious to know the specific things you need to be conscious of if you were managing Cerbos in that environment as well, because there you're running on an untrusted runtime, right?

Alex: Yeah, great question. So authorization must be done in an environment where you can trust the identity and you can trust the inputs. So you're always going to be doing this from the backend side of things. When the request comes in from the client, you might pass attributes about a resource or a principal, but you should always be verifying those at the backend and then doing the check against the Cerbos instance.

Now, there is a whole class of authorization checks which don't necessarily need that rigor. For example, as I was showing, those approve and deny buttons showing up and hiding based upon user roles. With the server-side version of it, a policy decision point of Cerbos that's running in your backend, your backend service is making the API call to Cerbos and the checks are being done server side.

On the client side, if you want to check a permission, you would normally have to round trip to your backend, so client to backend to Cerbos, back to the client, etc. That works and it's certainly a viable solution. But for those types of checks, it doesn't really matter if the permission is maybe wrong, because you're going to re-verify on the backend anyway.

We offer what we call an embeddable policy decision point or embedded policy decision point where we actually, as part of our SaaS offering, we take your same policies that you've written once and we compile them for two outputs. One is for the version that's running in the backend. But we also compile it down to a, what we call our embedded policy decision point, which is actually powered by WebAssembly.

And you can push that down to the client. So it could be a browser, but equally it could be a mobile app, or a serverless edge function is another use case for it. That allows you to very quickly do authorization checks locally on the client or on the edge, where you already know the user and you already know the resource, because you've already loaded it into the web page, onto the client.

You can very quickly fire off those checks locally and do those more conditional, UI-based things. And given you have a client SDK with Flagsmith, you have all the feature flags available on the client anyway, so it completes the journey. You do the checks on the client side using the embedded SDK, using the client input from Flagsmith. Then on the server side, you revalidate that check with the actual confirmed data from your real database, rather than something that potentially could have been untrusted, let's say.
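The split between a fast client-side check and an authoritative server-side recheck can be sketched like this. Everything here is a simplified assumption for illustration: the in-memory `DATABASE`, the shared `can_approve` rule, and the function names are hypothetical, and in practice the client side would run the compiled policy bundle rather than a Python function.

```python
# Trusted server-side data (stand-in for the real database).
DATABASE = {"expense-123": {"amount": 12_000, "ownerId": "frank"}}

APPROVAL_LIMIT = 10_000


def can_approve(roles: list, amount: int) -> bool:
    # The same rule is evaluated in both places.
    return "finance" in roles and amount < APPROVAL_LIMIT


def client_should_show_approve_button(roles: list, client_side_expense: dict) -> bool:
    # Runs on an untrusted runtime, using whatever data the client has loaded.
    # Good enough for showing or hiding UI, not for enforcement.
    return can_approve(roles, client_side_expense["amount"])


def server_handle_approve(verified_roles: list, expense_id: str) -> int:
    # Runs in the backend: re-fetch the record and recheck before acting.
    record = DATABASE[expense_id]
    if not can_approve(verified_roles, record["amount"]):
        return 403
    return 200


# A tampered client claims the expense is only 500.
tampered = {"id": "expense-123", "amount": 500}
button_visible = client_should_show_approve_button(["finance"], tampered)
server_status = server_handle_approve(["finance"], "expense-123")
```

Even if the client-side result is wrong because its inputs were manipulated, the backend decision is made against the stored record, so the worst case is a button that leads to a rejected request.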

Ben: Yeah, that's really interesting. I mean, Flagsmith's got a similar notion, where our server-side SDKs can run the entire evaluation engine within the SDK. There are actually some flag providers that are using WebAssembly to do that. We decided to implement our evaluation engine natively in the languages that we support on the server side, but because they're running in a trusted environment, they can basically grab the entire set of data around a particular environment and then do those evaluations all in memory.

So they get extreme performance, and our larger customers that are running fleets of pods in the tens of thousands are generally running our SDKs in what we call local evaluation mode, because it will scale forever, basically. So yeah, it's really interesting. Again, there's a very reflective similarity between the two platforms in terms of whether they're running in trusted or untrusted environments.
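The idea behind local evaluation, fetching the environment's data once and then evaluating in memory, can be illustrated with a small Python sketch. The `LocalEvaluator` class, the shape of the environment document, and the trait-based override format are all invented for this example and do not reflect the real Flagsmith environment document or SDK interfaces.

```python
from typing import Callable, Optional


class LocalEvaluator:
    """Hypothetical evaluator that holds a whole environment's flag data in memory."""

    def __init__(self, fetch_environment: Callable[[], dict]):
        self._fetch_environment = fetch_environment
        self._environment: Optional[dict] = None

    def refresh(self) -> None:
        # One network round trip for the whole environment (simulated here).
        self._environment = self._fetch_environment()

    def is_enabled(self, flag_name: str, traits: dict) -> bool:
        if self._environment is None:
            self.refresh()
        flag = self._environment["flags"].get(flag_name)
        if flag is None:
            return False
        # The first matching override wins; otherwise use the environment default.
        for override in flag.get("overrides", []):
            if traits.get(override["trait"]) == override["equals"]:
                return override["enabled"]
        return flag["enabled"]


calls = {"count": 0}


def fetch_environment() -> dict:
    # Stand-in for a call to the flag service; counts how often it is hit.
    calls["count"] += 1
    return {
        "flags": {
            "expenses": {
                "enabled": False,
                "overrides": [{"trait": "department", "equals": "finance", "enabled": True}],
            }
        }
    }


evaluator = LocalEvaluator(fetch_environment)
results = [evaluator.is_enabled("expenses", {"department": "finance"}) for _ in range(1000)]
```

A thousand evaluations cost a single fetch in this sketch, which is the property that lets this mode scale with the number of pods; a real deployment would also refresh the cached data periodically.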

Alex: Yeah, I'm sure you get this question as well. We're now in a world where client applications aren't necessarily always online and connected. How do you handle the degraded connection state, let's say, or the classic gone-through-a-tunnel, haven't-got-a-signal kind of thing? So I think there's always going to be this hybrid deployment model where you do the check on the client but then reconfirm or recheck on the backend as well. And then also...

Ben: With that as well, there are frameworks like Next.js, where it's kind of interesting how sometimes it can be quite hard to reason about where the code that you're writing is actually going to be run.

Is it going to be run on a GitHub CI runner, or is it going to be run in a browser, or what have you? Next.js is actually one of the frameworks that we have first-class support for. And yeah, it's kind of hairy sometimes trying to figure out how to reason about what a reasonable expectation of behavior would be for those sorts of things.

Yeah, definitely those things are becoming more complicated. That's one of the things: when we started working on Flagsmith, we were thinking about the API and the performance of the API and the dashboard, what that would look like and what the functionality would be. But actually a huge amount of the energy of the engineering team goes into the SDKs, like supporting frameworks like React Native and things like that.

And again, that's one of the reasons that we're putting whatever weight we can muster behind OpenFeature. A good example is OpenTelemetry, right? It would be a big lift for us to put OpenTelemetry tracing into all of our SDKs.

That's something that is being worked on at the OpenFeature level, and then with our provider, which is like a little plug-in for whatever language you're talking about, we'd only have to worry about OpenFeature. That's a delegated concern. So yeah, that's been one of the things that's been a constant sort of surprise, almost: how much work the SDKs take, and also the expectation of quality from people using the platform in terms of SDKs.

A few years ago, we went through and redesigned the interfaces to all of the server-side SDKs, when we built out local evaluation mode, actually, to make them all consistent. Obviously some are camel case, some are snake case, et cetera, and some have slightly different idioms around how you would expect them to work based on the language.

But yeah, like, and that was, you know, it was a huge amount of work and, and our API is, is really small as well, you know.

Alex: Yeah, similar thing. We have these two API endpoints, and we've actually done a lot of work to generate our SDKs from the underlying protobuf gRPC definitions that the PDP works with, which sometimes works and sometimes doesn't, depending on how nicely your language of choice supports gRPC.

But actually, speaking of SDKs, or sorry, frameworks, we've had some questions come in. There was a question for Flagsmith: how does the @Configuration annotation in Spring Boot differ from feature flags?

Ben: Ah, yeah. So that's a good question. The main thing is that, as far as I understand it, you'd need to do a redeployment for a change of that annotation to take effect.

You could possibly wire up that annotation code to a flag; I guess that's something that would be feasible. The thing that you need to get your head around, and this is a consideration you have to make when you're implementing flagging tools or tools like Cerbos, is that you are giving up a little bit of control of the runtime behavior, the code paths in your application. You are ceding those to a flag engine or an authorization engine.

And, you know, you do, it is a little bit of a a leap of faith in, in terms of doing that. And I think slowly, but surely you know, engineering teams and, and, and companies are coming around to the, the, the idea that that the benefits outweigh you know, the, the, the negative aspects of that.

But you know, there are considerations you have to make around. Like a good example is like the support team that, you know, is supporting your customers using the application and they, you know, they log in as that, you know, particular organization and see a different set of buttons as, as someone else.

And I'm sure you talk to your customers about the same thing. It's not just the engineering team that this is relevant to. It flows through everyone involved in the organization who has some connection to, or touches upon, that application in some way, because it's important for everyone involved to understand what's going on, really.

Yeah. And, you know, once, once people get used to it and it becomes part of that regular, regular sort of like life cycle of developing and deploying your application, then everyone gets it, you know, gets into it. But it's important that it's not just yes, it's not just the engineers that it's, it's it's, it's important for,

Alex: Yeah, it's very easy to get involved with just the technology without realizing it's the people and process around it as well that make these things successful across the board, and that ultimately enable you to keep delivering value and delightful user experiences.

Now the question came in around, what's the most unique or interesting way you've seen feature flags being used? We hinted at this earlier.

Ben: I get asked this a lot. I think, I mean, we've had, we did have one customer who was having some performance issues. And we couldn't get to the bottom of it and then they did a screen share with us.

And we discovered that they had something like 13,000 feature flags, because they were doing flag names like spanish.dashboard.logout. I should have used an example of a... I don't speak Spanish, so I don't know the Spanish.

Alex: Come on the spot.

Ben: You know, and then german.dashboard.logout. And so, yeah, we were a little bit like, oh, we never really thought about that use case. So that was an interesting one. And that's the other interesting thing about a general tool: we never thought that someone would put 13,000 flags into one environment, right?

But of course people will, because if there are lots of people using the platform, they'll do things that work for them, right? So that's been another thing that I've always been surprised about: the failure modes, the performance considerations you have to make, dealing with people who are on... a lot of mobile devices in the world are on very data-limited network connections as well, right?

You can't just assume that they can download two megabytes of configuration data. If you're in London, great. But yeah, that's been another thing that's been interesting as well. I need to mine the team, actually, for a better answer to this question, because I'm sure there are some better ones out there, but the 13,000-flag localization strings is the one that comes to mind.

Alex: A couple more in here about the Cerbos side of things, around deployments. Cerbos runs, similarly to Flagsmith's offering, directly in your Kubernetes cluster, so you deploy it as another service. We actually recommend running it as a sidecar, because that way, as your application scales, Cerbos scales with it. You always have the same-node guarantee that Cerbos is running right alongside your app, which keeps latency nice and snappy.

And then around the SDKs and automated testing of Cerbos rules: yes. As you're writing your policies, as you saw in the IDE, we give you that feedback loop visually, but we also have CLI tooling, again as part of the open source project, that you can run locally in your dev environment. We have VS Code support and such, so you get nice IntelliSense and autocomplete of your policies, to fit into your actual workflows.

Cool. So we've got one minute left. Ben, thank you very much for joining me today. Where can people find you? Where can people find out more about Flagsmith and OpenFeature?

Ben: Yeah, so flagsmith.com, otherwise I'm going to get told off by Anna, who runs marketing at Flagsmith. github.com/flagsmith is where our source code is.

And then we have a Discord community that you can find the link to anywhere, I think, at docs.flagsmith.com. If you want to ask questions or help work on the platform, Discord's the best place to go for that.

Alex: Great. And for Cerbos, you can find us at cerbos.dev. We have a Slack community, not Discord though (that's up for debate), where we all hang out, with the team around the world. You can also book a free workshop where you get to speak to, unfortunately, me or someone else on the team, throw your authorization requirements at us, and we'll get you up and running and writing policies in no time. So finally, thank you very much again to anyone listening in. We will be sending out a page with this webinar's recording shortly, along with all the links and resources we spoke about. And with that, enjoy the rest of your days and evenings, take care, and speak to you soon. Thanks.

Ben: Yep. Thanks Alex. Cheers
