Let’s say you have a website receiving 1 million requests per day. 0.01% of those are admin requests that need to know if you are an admin to execute them. It would be wasteful to check with the database whether you are an admin on every request when only a tiny minority of them needs to know that. So for 999,900 of the requests, `isAdmin` will be `null`. We don’t know if the user is an admin and we don’t need to know.
That is all beside the point. There’s no real advantage to using null instead of defaulting to false there… Defaulting to false is semantically more accurate and also less wasteful, in that other code does not have to worry about nulls, which always leads to unnecessary overhead when false is already equivalent in your proposed example.
It is neither more accurate nor less wasteful. If you default it to `false` and you have 100 admin requests in the session where that variable is stored, you would have to check with the database 100 times, because you are making `false` mean two things: “the user is not an admin” and “we don’t know if the user is an admin”. Whereas if you set it to `true` or `false` the first time we need the information, subsequent requests don’t need to check anymore.
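A minimal sketch of that caching argument, assuming a hypothetical `Session` class and a `lookupAdmin` callback standing in for the database query (neither name comes from the thread):

```kotlin
// Hypothetical session object: isAdmin stays null until a request actually needs it.
class Session(private val userId: Int, private val lookupAdmin: (Int) -> Boolean) {
    // null = "not checked yet"; true/false = answer cached from the first lookup
    var isAdmin: Boolean? = null
        private set

    fun requireAdminStatus(): Boolean {
        // Run the lookup only while the value is still unknown, then cache the result.
        return isAdmin ?: lookupAdmin(userId).also { isAdmin = it }
    }
}

fun main() {
    var lookups = 0
    // The lambda plays the role of the database and always answers "not an admin".
    val session = Session(userId = 42) { lookups++; false }
    repeat(100) { session.requireAdminStatus() }
    println(lookups) // prints 1: a cached false is not mistaken for "unknown"
}
```

With a non-nullable field defaulting to `false`, the first answer and the default would be indistinguishable, so this sketch would have no way to tell that the lookup had already run.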
> does not have to worry about nulls
I am used to null-safe languages where there is no such thing as worrying about nulls, because not checking for null on a nullable type is a compile error, so it’s impossible to forget about the null case. Maybe that’s why I don’t see any issue with using them.
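For readers who have not used such a language, here is roughly what that looks like in Kotlin. The function names are made up for illustration, and treating "unknown" as "deny" in `canDelete` is just one possible policy:

```kotlin
fun describe(isAdmin: Boolean?): String =
    // A `when` used as an expression must be exhaustive:
    // removing the null branch turns this into a compile error.
    when (isAdmin) {
        true -> "admin"
        false -> "not an admin"
        null -> "unknown, not checked yet"
    }

fun canDelete(isAdmin: Boolean?): Boolean {
    // `if (isAdmin)` would not compile here, because a Boolean? is not a Boolean.
    // The code has to state what "unknown" means; this sketch denies access.
    return isAdmin == true
}
```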
Those are all just external implementation details. Not sure if it was you or someone else, but the main argument in defense of the OP, as in it being reasonable, was that the name is wrong: that it ought to be isAdmin. None of what you just described should have anything to do with the user being or not being an admin. In place of checking “isAdmin” for null, the semantical and resourcewise equivalent would be a third variable for “admin rights having been validated” or whatever. Conflating it in this one variable while renaming it to isAdmin or similar would make even less sense… What if somewhere else in the code you have to check whether the initial validations have been made (while the actual role, or whether the user is an admin or not, is irrelevant)? You’d have to check if isAdmin equals null, which in that context would be confusing, ambiguous (i.e. someone reading that bit will not know this is what is being checked without additional documentation) and just a code smell in general. You do want to make the important things unambiguous and self-documenting, even more so the bigger the codebase is and the more contributors there are across its lifetime and in parallel at any given time.
But if we go with the original meaning of roles overall, then the union type is just a code smell that warrants a proper role thing around it.
Then you’d need to do something else.
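One possible reading of a more self-documenting wrapper is a small sealed type. This is only a sketch: `AdminCheck`, `NotChecked` and `Checked` are hypothetical names, not something the commenter proposed in these words.

```kotlin
// Hypothetical type that names the "has it been validated?" question explicitly.
sealed class AdminCheck {
    object NotChecked : AdminCheck()                         // no lookup has happened yet
    data class Checked(val isAdmin: Boolean) : AdminCheck()  // lookup done, answer stored
}

// Reads as what it checks, instead of comparing isAdmin against null.
fun hasBeenValidated(check: AdminCheck): Boolean = check is AdminCheck.Checked

fun isKnownAdmin(check: AdminCheck): Boolean = when (check) {
    AdminCheck.NotChecked -> false
    is AdminCheck.Checked -> check.isAdmin
}
```

Note that this still has exactly three states (not checked, checked and admin, checked and not admin), the same as a `Boolean?`; what changes is that the states carry names.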
> the semantical and resourcewise equivalent would be a third variable
So you are advocating for:
```kotlin
// User and getUsers() are assumed to be defined elsewhere.
data class Filters(
    val isAdmin: Boolean = false,
    val filterByAdmin: Boolean = false,
    val isConfirmed: Boolean = false,
    val filterByConfirmed: Boolean = false,
)

fun filterUsers(users: List<User>, filters: Filters): List<User> {
    return users
        // Each value is only honoured when its companion filterBy flag is set.
        .filter { !filters.filterByAdmin || it.isAdmin == filters.isAdmin }
        .filter { !filters.filterByConfirmed || it.isConfirmed == filters.isConfirmed }
}

fun getAdmins() {
    val users = getUsers()
    val filters = Filters(isAdmin = true, filterByAdmin = true)
    val admins = filterUsers(users, filters)
    println("Admins: $admins")
}
```
Over:
```kotlin
// User and getUsers() are assumed to be defined elsewhere.
data class Filters(
    val isAdmin: Boolean? = null,
    val isConfirmed: Boolean? = null,
)

fun filterUsers(users: List<User>, filters: Filters): List<User> {
    return users
        // null means "do not filter on this field".
        .filter { filters.isAdmin == null || it.isAdmin == filters.isAdmin }
        .filter { filters.isConfirmed == null || it.isConfirmed == filters.isConfirmed }
}

fun getAdmins() {
    val users = getUsers()
    val filters = Filters(isAdmin = true)
    val admins = filterUsers(users, filters)
    println("Admins: $admins")
}
```
To me, `Filters(isAdmin = true)` is a very easy-to-use API, whereas `Filters(isAdmin = true, filterByAdmin = true)`, with the additional variable to avoid nullable booleans, is more verbose for not much benefit, and it brings ambiguity. What if someone writes `Filters(isAdmin = true)` but forgets they need to set `filterByAdmin = true` for it to actually work? Easy mistake to make. We can prevent these mistakes by removing the default values so they have to be specified at the call site, but then you need `Filters(isAdmin = true, filterByAdmin = true, isConfirmed = false, filterByConfirmed = false)`, which is very verbose. Having two variables also allows your system to get into invalid states:
isAdmin = true, adminRightsHaveBeenValidated = false
isAdmin = true, filterByAdmin = false
What do these mean? It’s better for invalid states to be unrepresentable. Since these states are mutually exclusive, we should have only 3 states, not the 4 you get with 2 booleans. You could achieve that with an enum `True`, `False`, `None`, but then you are just reinventing the wheel of nulls. You also get the issue that now you have to remember to always update both variables together.
It all comes back to your point:
> it’d be idiomatic and reasonable to assume it to be false if we have no data
You want to have ambiguous states, where a single value represents both “we have no data” and “we have the data, the answer is no”.
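The forgotten-flag mistake described above can be reproduced in a few lines. This is a self-contained sketch: the `User` class, the sample data and the names `FlagFilters` / `NullableFilters` are made up here so that both designs can sit side by side.

```kotlin
data class User(val name: String, val isAdmin: Boolean)

// Two-flag design: isAdmin is only honoured when filterByAdmin is also true.
data class FlagFilters(val isAdmin: Boolean = false, val filterByAdmin: Boolean = false)

// Nullable design: null means "do not filter on this field".
data class NullableFilters(val isAdmin: Boolean? = null)

fun filterWithFlags(users: List<User>, f: FlagFilters): List<User> =
    users.filter { !f.filterByAdmin || it.isAdmin == f.isAdmin }

fun filterWithNullable(users: List<User>, f: NullableFilters): List<User> =
    users.filter { f.isAdmin == null || it.isAdmin == f.isAdmin }

fun main() {
    val users = listOf(User("ana", isAdmin = true), User("bob", isAdmin = false))

    // filterByAdmin was forgotten, so the isAdmin value is silently ignored.
    println(filterWithFlags(users, FlagFilters(isAdmin = true)).map { it.name })        // [ana, bob]

    // With the nullable design there is no second flag to forget.
    println(filterWithNullable(users, NullableFilters(isAdmin = true)).map { it.name }) // [ana]
}
```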
Precisely my point.
And I’m not advocating for any of that. That’s just weird design, both of them, and as such a good example of something that warrants a bigger redesign in general.
Just advocating for clear, sensible, self-documenting and most importantly, expandable and maintainable code.
What’s idiomatic varies between languages, and the conventions aren’t the same even within one language when arguing across disciplines. This discussion seems to be more about different educations. I can get your point, but from my personal experience in academia and working in the field it sounds undesirable. But that’s just it: my, as in extremely limited, perspective. From your POV, what you argue here is probably just as correct as what I argue is from mine; it’s just a difference in the segment of the field we work in, I suppose. Or plain old cultural differences.
Whichever it is, I bet we both can find better use for our time. I’m thankful for the time and effort though, even if I wasn’t persuaded. Sorry to have prolonged it so.