I don't find any of the arguments here particularly convincing. This claim in particular is weird:
> Similarly, when you use len() to check a sequence for emptiness, you are reaching out for a more powerful tool than necessary. As a result it may make the reader question the intent behind it, e.g. if there is a possibility of handling objects other than a built-in sequence type here?
Given that checking for truthiness is less strict than a length test, by the same token, whenever you use it, you're reaching for an even more powerful tool than necessary. And, if anything, seeing `not items` is what makes me question the intent - did the author mean to also check for None etc here, or are they just assuming that it's never going to be that? And sure, well-written code will have other checks and asserts about it - but when I'm reading your code, I don't know if it's missing an assert like that because you intended it, or because you couldn't be bothered to write it.
OTOH len() is very explicit about what is actually checked, and will fail fast and loudly if the argument is not actually a sequence.
Also note that it's not, strictly speaking, an either-or - you can use `not len(x)` instead of `len(x) == 0` if you want a distinctive pattern specifically for empty collection checks.
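A minimal sketch of that fail-fast difference (variable names here are illustrative):

```python
# `not x` silently accepts None along with empty sequences;
# `not len(x)` raises immediately if x has no length.
items = []
assert not items          # truthiness: empty list is falsy
assert not len(items)     # length check: also true, but stricter

maybe_items = None
assert not maybe_items    # passes silently: None is falsy too

try:
    not len(maybe_items)  # fails fast: None has no len()
except TypeError as e:
    print(e)
```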
For me, len() is much less ambiguous.
Also in Pandas if you try to check the truth value of a dataframe to see if it’s empty, it will fail. It will say “the truth value of a dataframe is ambiguous”.
df.empty is less ambiguous but you have to remember it specifically for dataframes.
But len(df) > 0 almost always works for any type of collection.
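The uniformity claim is easy to check against the built-in collection types (pandas itself is left out here to keep the sketch dependency-free; per the pandas docs, `len(df)` counts rows):

```python
# len() behaves the same way on every built-in collection type,
# so `len(c) > 0` is a single pattern for non-emptiness checks.
samples = [
    [1, 2, 3],      # list
    (1, 2),         # tuple
    {"a": 1},       # dict (counts keys)
    {1, 2, 3},      # set
    "abc",          # str (counts characters)
    b"\x00\x01",    # bytes
    range(5),       # range
]
for c in samples:
    assert len(c) > 0, f"{type(c).__name__} reported empty"
print("len(c) > 0 held for all", len(samples), "sample types")
```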
I don't agree with this at all, and I wonder if it reflects the other languages you use that may have shaped your assumptions. `if mylist` feels very much like Common Lisp to me. In much of the code I've seen, the value would never be None because an empty list was definitely created.
Ah yes, reusing an empty list NIL as the false value because having a separate #f atom is bad for whatever reason.
But it's only one value in the entire system. There is only one nil; there is no other false but nil. And there is no other empty list but nil.
An empty array or string are not false. Zero is not false.
In Lisp dialects with nil, we don't have endless discussions about how to test for false, or for empty, as the case may be.
It is very convenient and makes the code terse, and that matters.
Python's truthiness behavior was the trigger for one of my worst ever bugs early in my career, which not only pulled in senior engineering but also marketing/comms and legal to help sort out the mess. Not a fan!
I think this needs more explanation to know if this is a good argument or not.
Keep in mind that truthiness comes from __bool__ and is overridable, so separate from Python itself, a lot of library authors have made questionable decisions here. A perennial contender is https://github.com/psf/requests/issues/2002.
You know you're going to need to provide us with a little snippet demonstrating this behavior now, right?
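Not the real requests code, but a minimal stand-in for the pattern the linked issue complains about: `Response.__bool__` is defined in terms of the HTTP status, so a perfectly valid 404 response tests as falsy.

```python
# Illustrative stand-in only -- not actual requests internals.
class FakeResponse:
    def __init__(self, status_code):
        self.status_code = status_code

    @property
    def ok(self):
        # mirrors the "status below 400 means success" convention
        return self.status_code < 400

    def __bool__(self):
        return self.ok   # the questionable decision

resp = FakeResponse(404)   # a real, non-None response object...
if not resp:               # ...that nevertheless looks like "no response"
    print("treated as missing, but status_code is", resp.status_code)
```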
I'm mostly annoyed that 'if len(items) == 0' / 'if len(items) > 0' aren't presented as options.
If we're talking about readability, they're far clearer than either of the options in the article and require no pre-knowledge of truthiness rules.
I agree that `len(seq) == 0` is more readable. I don't mind the recommended truth test of the sequence itself, but I have no idea who would use their "wrong" option with the length as a truth value. Or maybe POSIX exit codes (0 is success) have made me shy about using integers as truth values.
They are presented as options; it's just that their performance sucks. From the article:
I feel like, if you care about performance, using Python at all is a mistake.
For most applications, the choice of algorithms and data structures has more impact on performance than programming language.
The one big exception to this I've found is working with image data. When you have over a million pixels, you'll have a slow time transforming them unless you can process each one within nanoseconds. Even a compiled language like Rust will struggle with image data if you build it in debug mode. So there's really always going to be a place for optimized implementations of those sorts of things.
Then caring about a micro-optimisation over readability would still be the wrong call.
When you profile your code and find that all you gotta do is change a few if statements for a 2x perf boost, that will be a happy day.
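If you do want numbers for this particular pair, `timeit` makes the comparison cheap to run; no figures are claimed here, since absolute timings vary by machine and interpreter version:

```python
import timeit

# Time both spellings of the emptiness check on an empty list.
lst = []
t_truthy = timeit.timeit("not lst", globals={"lst": lst}, number=1_000_000)
t_len = timeit.timeit("len(lst) == 0", globals={"lst": lst}, number=1_000_000)
print(f"not lst:       {t_truthy:.3f}s")
print(f"len(lst) == 0: {t_len:.3f}s")
```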
Isn't this just a Perl feature (arrays are also their length and zero values are falsy)? I can't help but feel Python is getting closer to Perl as time goes on. Ironic, since its original goal was to be simple and distinct from Perl. What was the saying again? "There should be one-- and preferably only one --obvious way to do it." Honestly, I think it was all this "Zen" stuff that led Python down the path to weirdness. This article reads like a monk interpreting sacred text. I can think of no good reason for all this malarkey.
This can't be Python "getting closer" to Perl, since it's been this way in Python since inception, and, I'm pretty sure, in Perl as well. Both languages have always been exactly where they are now on this topic, with no motion, probably since the beginning, and certainly for multiple decades.
I'm not particularly well versed in the history of either language. I just know that Python was supposed to be "simple and intuitive", but I've had quite a different experience of it and this has often been down to Perl-like things going on.
My personal opinion is that Python gave up on "simple and intuitive" a long time ago, and people still citing that as a guiding principle of the language really need to sit down and read the Python documentation again from start to finish, and then ponder for a moment. If they still need a clue, they are invited to do the same with, say, the Python 2.1 documentation, and ponder for a few more moments. It clearly isn't a guiding principle, and hasn't been for a while. That is not, on its own, necessarily a bad thing. It just means it isn't a guiding principle anymore, or at the very least, it has moved way down the priority list. Plenty of languages have changed their guiding principles over time.
However, this is not an example of that. "if list:" has been the idiomatic Python since inception, and "if len(list):" has been an unnecessary complication for the same period of time. Python's "preferably one way to do it" has never been about "there is literally only one syntactically valid way to do it", for fairly obvious reasons if you think about it.
Certainly "there should be one, and preferably only one, obvious way to solve a problem" hasn't been the case for a while, or maybe ever. See: TFA.
Perhaps it's just because I'm not Dutch.
> "if list:" has been the idiomatic Python since inception
I would argue the reverse. It was a bad idea to begin with and the start of something worse.
> Plenty of languages have changed their guiding principles over time.
What would you say is the guiding principle of Python now, then?
"Idiomatic" doesn't mean "good". It means the normal way within the language. Personally after using both languages like Python with "truthiness" and languages that rigidly require all if clauses to evaluate to a boolean, rigidly and directly, I say the latter is unambiguously superior in practice. The former leads to surprises, sometimes even creating security vulnerabilities when a user can wedge an unexpected value into an if statement somehow.
It has unambiguously been idiomatic Python since the beginning. My opinion is that it is bad, but that's much more an opinion than the fact it has been idiomatic. And to Python's credit, part of the reason why I am so sure it's bad is precisely the experience I gained in Python using it. At the time Python was implementing the principle, I don't think the general experience of the programming language community was strong enough to know that it was a bad idea.
Python's guiding principle right now seems to be the same guiding principle as almost every other language, "let's solve as many problems as possible by adding features to the language". If a year goes by without at least one major new feature, the language must be "dead" or "failing". As the years wear on and so many languages have piled up so many features, I wonder when people will finally look around and realize that all these features, for all their superficial appeal in the small scale, are not generally helping them write better programs, or write programs they couldn't have written before, and often harm their programs on larger scales. There are exceptions. I have a hard time imagining any modern language without some concept of closures, for instance. I could name a few more; some sort of easy polymorphism (there's a few ways to get there but you need to take at least one of them... but preferably not all of them...), some sort of concurrency solution in this era, solutions for memory safety (again, multiple solutions, but you need at least one of them). But so many of these features are, in my opinion, not a net positive, their benefits far smaller than meets the eye and the costs so much greater.
Well, lists in Python are not their length. They are convertible to a boolean via __bool__(), however, which is how the "if" tests them (or any other object that has this method) for "truthiness".
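A sketch of the protocol the parent describes: `if` consults `__bool__` when present, falls back to `__len__` (nonzero length means truthy) when it isn't, and treats the object as truthy by default when neither exists.

```python
class WithBool:
    def __bool__(self):
        return True            # `if` uses this directly

class WithLenOnly:
    def __len__(self):
        return 0               # empty -> falsy via the __len__ fallback

class Plain:
    pass                       # neither method -> truthy by default

assert bool(WithBool()) is True
assert bool(WithLenOnly()) is False   # no __bool__, so len() == 0 decides
assert bool(Plain()) is True
```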
`if x: foo()` is a cancer on the Python community. Devs often use it with the intention of handling x being None, and carelessly lump in zero and empty lists/strings at the same time. Endless bugs.
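A sketch of that lumping, with a hypothetical handler: the truthy check cannot distinguish "absent" (None) from "present but empty or zero".

```python
def handle(x):
    if x:                      # buggy: meant "x was provided"
        return f"got {x!r}"
    return "missing"

assert handle(None) == "missing"   # intended
assert handle(0) == "missing"      # bug: 0 is a real value
assert handle("") == "missing"     # bug: "" is a real value

def handle_fixed(x):
    if x is not None:          # tests only for absence
        return f"got {x!r}"
    return "missing"

assert handle_fixed(0) == "got 0"
assert handle_fixed("") == "got ''"
```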
Yup!
Python’s “truthiness” is a cutesy feature that is just an excuse for bugs in your code. It’s opaque and too magic, hurts readability, and causes endless confusion.
Just use a normal check, like everyone is expecting to see.
“Oh but what if it’s not a sequence”, well then you have bigger problems. Why are you emptiness testing something that may-or-may-not-be-a-sequence? Maybe solve that problem first.
> Python’s “truthiness” is a cutesy feature that is just an excuse for bugs in your code. It’s opaque and too magic, hurts readability, and causes endless confusion.
Indeed. Relying on truthiness has always felt very un-Pythonic to me, not least because it contradicts several principles in the Zen of Python:
• Explicit is better than implicit.
• Special cases aren’t special enough to break the rules.
• There should be one — and preferably only one — obvious way to do it.
I would prefer a standard list method/function to test for emptiness, which would be both readable and efficient.
It won't be efficient, since it's Python, but it could be readable.
Holy moly, that meme about type checkers and variable names - someone is arguing for Hungarian notation in 2024?!
Someone argued user_list, user_count, and has_users are clearer than users, users, and users. Will you argue the opposite?
The original Hungarian notation was a less readable implementation of the same idea. The Hungarian notation most people hated replaced functional types like count and index with data types like unsigned long, and used them everywhere.
tl;dr: do not use `if len(list) == 0`; use `if not list`!
You just have to not write any bugs in your code. Also, use type checking everywhere. And rewrite your mind, too.
The benefits are worth it...! Oh well.
> Also, use type checking everywhere
I can't think of anything more Pythonic than that!
Use the simplest syntax and check the code works via unit testing like you should be doing. Don't statically type your code as it increases the bugs by a factor of 3x typically.
How does statically typing your code "increase[] your bugs by a factor of 3x" when it has no runtime effect? What orifice are you getting the 3x from?
Typically, software developers write 3x the number of lines of code when using static typing compared to duck typing. It's just the nature of the static typing code style. Write 3x the lines, get 3x the bugs.
The programming language with the fewest measured bugs in practice is Clojure, because it is duck typed and because it doesn't use OOP. Both static typing and OOP have a significant, measurable negative effect on code correctness.
Clojure has the fewest measured bugs in practice? Where does that statistic come from?
I too can invent numbers.
My invented numbers say there's no meaningful difference in line counts between statically typed and dynamically typed languages.
So how will we prove any of us wrong?