I write a lot of Python. I hate it when people use “X is more pythonic” as some kind of argument for what is the better solution to a problem. I also have a hang-up with people acting like Python has any form of type safety, instead of just embracing duck typing. This lands us at the following:
The article states that “you can check a list for emptiness in two ways: `if not mylist` or `if len(mylist) == 0`”. Already here, a fundamental mistake has been made: you don’t know (and shouldn’t care) whether `mylist` is a list. These two checks are not different ways of doing the same thing, but two different checks altogether. The first checks whether the object is “falsey”; the second checks whether the object has a well-defined length that is zero. These are two completely different checks, which often (but far from always) overlap. Embrace the duck type: type-safe Python is a myth.
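To make the divergence concrete, here’s a minimal sketch (the class name is made up) of two objects where the falsey check and the length check disagree:

```python
# Two objects where `not obj` and `len(obj) == 0` give different answers.

gen = (x for x in [])   # a generator over an empty iterable
print(not gen)          # False -- generators are always truthy...
# len(gen)              # ...and have no length at all: raises TypeError

class ZeroButTruthy:
    """Zero length, yet truthy: __bool__ takes priority over __len__."""
    def __len__(self) -> int:
        return 0
    def __bool__(self) -> bool:
        return True

obj = ZeroButTruthy()
print(not obj)          # False -- the falsey check says "non-empty"
print(len(obj) == 0)    # True  -- the length check says "empty"
```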
> type safe python is a myth

Sure, but type hints provide a ton of value in documenting for your users what the code expects. I use type hints everywhere, and it’s fantastic! Yes, there’s no guarantee that the types are correct, but with static analysis and the assumption that your users want their code to work correctly, there’s a very high chance that the types are correct.
That said, I lie about types all the time. For example, if my function accepts a class instance as an argument, the intention is that the code accepts any class implementing the same methods as the one I’ve named in the parameter list; you don’t necessarily have to pass an instance of that class (or one of its subclasses). But I feel like putting something reasonable there makes a lot more sense than nothing, and I can clarify in the docstring that I really just need something that looks like that object. One of these days I’ll get around to switching that to `Protocol` classes to reduce type errors.

That said, I don’t type hint everything. A lot of private methods and private functions don’t have types, because they’re usually short and aren’t used outside the class/file anyway, so what’s the point?
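For reference, a minimal sketch of that `Protocol` approach (the names `SupportsRead` and `consume` are made up for illustration):

```python
from typing import Protocol

class SupportsRead(Protocol):
    """Structural type: anything with a matching read() counts."""
    def read(self, size: int = -1) -> bytes: ...

def consume(source: SupportsRead) -> bytes:
    # No inheritance needed: file objects, sockets, or any duck with
    # a compatible read() method will satisfy the checker.
    return source.read()
```

This puts “something that looks like this object” directly in the signature, instead of lying with a concrete class and correcting it in the docstring.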
Type hints are usually great, as long as they’re kept up to date and the IDE interprets them correctly. Recently I’ve had some problems with PyCharm acting up and insisting that matplotlib doesn’t accept numpy arrays, leading me to just disable the type checker altogether.
All in all, I’m a bit divided on type hints, because I’m unsure whether the (huge) value added by correct type hints outweighs the frustration I’ve experienced from incorrect ones. For now I’m leaning towards “type hints are good, as long as you never blindly trust them and only treat them as a coarse indicator of what some dev thought at some point.”
> leading me to just disable the type checker altogether.
The better option is to just put `# type: ignore` on the statements where it gets confused, and add hints for your code. I’ve done that for SQLAlchemy before they got proper type hinting, and it worked pretty well.

That said, a type hint is just that, a hint. It shouldn’t be relied on to be 100% accurate (i.e. lots of `foo: list` should actually be `foo: list | None`), but if you use a decent static analysis tool, you should catch the worst of it. We use pyright, which is built into the VSCode extension Pylance. It works incredibly well, though it’s a bit too strict in many cases (e.g. when things can be `None` but generally aren’t).

So yeah, never blindly trust type hints, but do use them everywhere. The more hints you have, the more the static analysis can help, and disabling them on a case-by-case basis is incredibly easy. You’ll probably still get some runtime exceptions that correct type checking could have caught, but it’s a lot better than having a bunch of verbose checks everywhere that make no sense. A good companion to type checks is robust unit tests with reasonable data (i.e. try to exercise the boundaries of what users can input).
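As an illustration (the function names here are made up), a targeted suppression plus an honest `| None` hint might look like:

```python
import json

def parse_port(raw: str) -> int:
    # json.loads returns Any; a strict checker may flag the implicit
    # cast, so we silence this one line rather than the whole file.
    return json.loads(raw)  # type: ignore

def load_names(path: str) -> list[str] | None:
    """Honest hint: this really can return None, not just a list."""
    try:
        with open(path) as f:
            return f.read().splitlines()
    except FileNotFoundError:
        return None

names = load_names("users.txt")
if names is not None:   # the checker forces narrowing before use
    print(len(names))
```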
As it stands, we very rarely get runtime exceptions due to poor typing because our type hints are generally pretty good and our unit test cases back that up. Don’t blindly trust it, and absolutely read the docs for anything you plan to use, but as long as you are pretty consistent, you can start making some assumptions about what your data looks like.
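A tiny pytest-style sketch of what “exercise the boundaries” means in practice (the function under test is hypothetical):

```python
import pytest

def head(items: list[int]) -> int:
    """Hypothetical function under test."""
    return items[0]

def test_head_typical():
    assert head([1, 2, 3]) == 1

def test_head_empty():
    # The boundary case: an empty list should fail loudly, and the
    # test pins down *how* it fails.
    with pytest.raises(IndexError):
        head([])
```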
From that little image, they’re happy it takes a tenth of a fucking second to check if a list is empty?
What kind of dorito chip is that code even running on?
I know I’m gonna get downvoted to oblivion for this, but… Serious question: why use Python if you’re concerned about performance?
It’s all about trade-offs. Here are a few reasons why one might care about performance in their Python code:
- Performance is often more tied to the code than to the interpreter: an O(n³) algorithm in blazing-fast C won’t necessarily perform any better than an O(n log n) algorithm in Python.
- Just because this particular Python code isn’t particularly performance constrained doesn’t mean you’re okay with it taking twice as long.
- Rewriting a large code base can be very expensive and error-prone. Converting small, very performance-sensitive parts of the code to a compiled language while keeping the bulk of the business logic in Python is often a much better value proposition.
These are also performance benefits one can get essentially for free with linter rules.
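For what it’s worth, the kind of micro-difference the article’s chart is about is easy to measure yourself with `timeit`; absolute numbers will vary wildly by machine and interpreter:

```python
import timeit

mylist: list[int] = []

# Total seconds for one million calls of each emptiness check.
print(timeit.timeit(lambda: not mylist))         # falsey check
print(timeit.timeit(lambda: len(mylist) == 0))   # length check
```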
Anecdotally: in my final year of university I took a computational physics class. Many of my classmates wrote their simulations in C or C++. I would rotate between Matlab, Octave and Python. During one of our labs where we wrote particle simulations, I wrote and ran Octave and Python simulations in the time it took my classmates to write their C/C++ versions, and the two fastest simulations in the class were my Octave and Python ones, respectively. (The professor’s own sim came in third place). The overhead my classmates had dealing with poorly optimised code that caused constant cache misses was far greater than the interpreter overhead in my code (though at the time I don’t think I could have explained why their code was so slow compared to mine).
The graph makes no sense. Did a generative AI make it?
I could have tripped, knocked over my keyboard, cried for 13 straight minutes on the floor, picked my keyboard back up, accidentally hit the enter key making a graph and it would have made more sense than this thing.
-2x faster. What does that even mean?
There’s probably a `from relativity import *` in there somewhere.