Python multi-level break and continue

norhi999 · on Sept 4, 2022

I don't get why they discarded

for/while... as label1:

__for/while... as label2:

___break label1

suggestion. It actually seems a good enough idea to implement that. And it's rather concise. I often need to break deep inner loops to outermost and doing it with flags is... strange.
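
For anyone who hasn't run into it, the flag workaround referred to here looks roughly like this. It is only a minimal sketch: `grid`, `target`, and `done` are made-up names for illustration.

```python
# Hypothetical data, only for illustration.
grid = [[1, 2], [3, 4], [5, 6]]
target = 4

found = None
done = False  # flag that tells the outer loop to stop as well
for r, row in enumerate(grid):
    for c, value in enumerate(row):
        if value == target:
            found = (r, c)
            done = True
            break  # leaves only the inner loop
    if done:
        break  # second break, needed to leave the outer loop

print(found)
```

The extra variable and the second `break` are the bookkeeping a labelled break would remove.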

montroser · on Sept 4, 2022

Agreed -- Perl does it with labels this way, and JavaScript too. It has always seemed totally sensible in the occasional real-world examples I've seen in the wild.

hansvm · on Sept 4, 2022

Mildly on-topic, Zig has basically that behavior, and it's really smooth in my experience.

eesmith · on Sept 4, 2022

Could you give a real-world example of how it proved useful?

Eg, quoting the text:

> Angelico noted that he has used the Pike programming language, which does have a labeled break. He found that he had used the feature twice in all of the Pike code he has written. Neither of the uses was particularly compelling in his opinion; one was in a quick-and-dirty script and the other is in need of refactoring if he were still working on that project, he said. That was essentially all of the real-world code that appeared in the discussion.

IshKebab · on Sept 4, 2022

Rust supports this. It appears to be used more than twice...

https://grep.app/search?q=break%20%27%5Cw&regexp=true&case=t...

eesmith · on Sept 4, 2022

My example was "twice by one developer", not "twice across all indexed repos."

A spot check shows that quite a few of the hits in your link exist specifically to test that tooling handles Rust's multi-level break syntax correctly, like https://github.com/rust-lang/rust-analyzer/blob/master/crate... , https://github.com/rust-lang/rustfmt/blob/master/tests/sourc... , https://github.com/rust-lang/rust/blob/master/src/tools/rust... , https://github.com/rust-lang/rust/blob/master/src/tools/rust... and likely more.

Another is a translation of BASIC code to Rust, using break as a form of goto. https://github.com/coding-horror/basic-computer-games/blob/e... . The Python version at https://github.com/coding-horror/basic-computer-games/blob/e... doesn't use that approach.

The example at https://github.com/tokio-rs/mio/blob/master/tests/tcp.rs is a nice one

    // Wait for our TCP stream to connect
    'outer: loop {
        poll.poll(&mut events, None).unwrap();
        for event in events.iter() {
            if event.token() == Token(1) && event.is_writable() {
                break 'outer;
            }
        }
    }

though it can be replaced with a helper-function (note: I don't fully know what the code is doing, but the following looks right):

    def find_writable_event(events):
        # True as soon as one event matches, False if none do
        for event in events:
            if event.token() == Token(1) and event.is_writable():
                return True
        return False

    while True:
        poll.poll(events, None)
        if find_writable_event(events):
            break

So while there's likely a good real-world example of something which can't easily be re-written in an alternative form for Python, simply pointing to a grep result isn't all that persuasive.

IshKebab · on Sept 5, 2022

Sure, but I don't think you can really evaluate a feature based on how many times one developer uses them. Nor even on how many times they are used in total.

There are a ton of Python features (and misfeatures) that I'm sure most Python devs never use. __subclasses__ for example, or sitecustomize.py.

Similarly I'm pretty sure Rust's i128 is almost never used but it would be really weird to omit it.

eesmith · on Sept 5, 2022

What they can evaluate is that no one has yet come up with a real-world example where it's a useful addition to Python, and at least some of those people involved in the discussion have experience with multi-level break from other languages.

Hence my "Could you give a real-world example of how it proved useful?"

Convincing examples might change their minds!

Simply saying "here's how to find Rust programs which use that construct" is neither informative nor convincing.

With __subclasses__ there are definite real-world examples where they are useful. We can see the motivation in the commit history:

  commit 1c45073aba8f4097b4dc74d4fd3b2f5ed5e5ea9b
  Author: Guido van Rossum <guido@python.org>
  Date:   Mon Oct 8 15:18:27 2001 +0000

    Keep track of a type's subclasses (subtypes), in tp_subclasses, which
    is a list of weak references to types (new-style classes).  Make this
    accessible to Python as the function __subclasses__ which returns a
    list of types -- we don't want Python programmers to be able to
    manipulate the raw list.

    In order to make this possible, I also had to add weak reference
    support to type objects.

    This will eventually be used together with a trap on attribute
    assignment for dynamic classes for a major speed-up without losing the
    dynamic properties of types: when a __foo__ method is added to a
    class, the class and all its subclasses will get an appropriate tp_foo
    slot function.

and the commit logs show an example of use:

  Author: Victor Stinner <victor.stinner@gmail.com>
  Date:   Fri Mar 25 17:36:33 2016 +0100
    ...
    * Use __subclasses__() to get resource classes instead of using an hardcoded
      list (2 shutil resources were missinged in the list!)
    ...

In addition, we can find third-party packages which use it.

I've never used 128-bit integers, but it seems many people do have real-world cases for them.

What do you think is a convincing real-world example that should motivate its inclusion in Python?

IshKebab · on Sept 5, 2022

> What do you think is a convincing real-world example that should motivate its inclusion in Python?

That Rust example is a decent one. Also the various "find something in a nested structure" examples people have posted.

(And the fact that you can achieve the same result in a more awkward way by putting it in a function and using `return` is irrelevant because there are many many language features that are just convenient sugar: +=, lambdas, even while loops!)

eesmith · on Sept 5, 2022

The various examples are synthetic.

The use of a helper-function is given in the lwn article as a reason for not supporting multi-level break:

] The solution to "Python needs a way to jump out of a chunk of code" is usually to put the chunk of code into a function, then return out of it.

More specifically:

] To make this proposal convincing, we need a realistic example of an algorithm that uses it, and that example needs to be significantly more readable and maintainable than the refactorings into functions, or the use of try…except (also a localised goto).

] If you intend to continue to push this idea, I strongly suggest you look at prior art: find languages which have added this capability, and see why they added it.

The multi-level break in that Rust example may be "more readable and maintainable than the refactorings into functions", but it is not "significantly more readable."
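
For reference, the try…except alternative mentioned in the quoted text can be sketched as follows. This is an illustration only: the `Found` exception, `grid`, and `target` are hypothetical names, not anything from the article.

```python
class Found(Exception):
    """Hypothetical exception used only to carry a result out of both loops."""

    def __init__(self, position):
        super().__init__(position)
        self.position = position


# Hypothetical data, only for illustration.
grid = [[1, 2], [3, 4], [5, 6]]
target = 4

try:
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value == target:
                raise Found((r, c))  # jumps straight past both loops
except Found as exc:
    found = exc.position
else:
    found = None  # both loops finished without a match

print(found)
```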

IshKebab · on Sept 6, 2022

Frankly that's just broken logic. You could easily say:

> The solution to "Python needs a way to increment variables" is usually to do `a = a + 1`.

> To make this proposal convincing, we need a realistic example of an algorithm that uses it, and that example needs to be significantly more readable and maintainable than `a = a + 1`.

Why does `+=` get into the language but labelled breaks (a reasonably standard feature) don't? This isn't hypothetical - not all languages have +=. Matlab for example requires you to do `a = a + 1`.

eesmith · on Sept 6, 2022

https://peps.python.org/pep-0203/ says:

> The idea behind augmented assignment in Python is that it isn’t just an easier way to write the common practice of storing the result of a binary operation in its left-hand operand, but also a way for the left-hand operand in question to know that it should operate on itself, rather than creating a modified copy of itself.

Here's an example of how "a += b" is not syntactic sugar for "a = a + b". First, "a = a + b", which rebinds 'a' to a new object while leaving 'b' bound to the original:

  >>> a = b = [9,8]
  >>> a + b
  [9, 8, 9, 8]
  >>> a = a + b
  >>> a
  [9, 8, 9, 8]
  >>> b
  [9, 8]

Second, "a += b", which keeps both a and b bound to the same object:

  >>> a = b = [9,8]
  >>> a += b
  >>> a
  [9, 8, 9, 8]
  >>> b
  [9, 8, 9, 8]

IshKebab · on Sept 8, 2022

Nah that's not a good enough example to motivate adding += to Python. You can just do `a.extend(b)`!

eesmith · on Sept 8, 2022

It wasn't meant to be a convincing argument.

It was meant to show your comment at https://news.ycombinator.com/item?id=32741165 wasn't a relevant parallel, because it ignored how "a+=1" and "a=a+1" are different.

For a more convincing use, consider NumPy arrays, where a+=1 re-uses the same (potentially very large) array, while a=a+1 creates a new array.

  import numpy
  a = b = numpy.array([[1,2], [3, 9]])
  a += 1  # modify in-place
  a = a + 1 # create a new array

  >>> a
  array([[ 3,  4],
         [ 5, 11]])

  >>> b
  array([[ 2,  3],
         [ 4, 10]])

In-place modification can improve performance over using intermediate/temporary arrays.

benj111 · on Sept 4, 2022

I can see circumstances where you've just got one loop, it would be easier to follow.

shusaku · on Sept 4, 2022

goto label1

dataflow · on Sept 4, 2022

> But the arguments given in support of the feature were generally fairly weak; they often used arbitrary, "made up" examples that demonstrated a place where multi-level break could be used, but were not particularly compelling.

> To make this proposal convincing, we need a realistic example of an algorithm that uses it, and that example needs to be significantly more readable and maintainable than the refactorings into functions, or the use of try…except (also a localised goto).

How about this:

  found = None

  for r, row in enumerate(table.rows):
    for c, cell in enumerate(row):
      if search_query.matches(cell.value):
        found = (r, c)
        break 2

  logger.log("Found cell: {}".format(found))
  return found

BiteCode_dev · on Sept 4, 2022

EDIT: made a mistake, see my next comment below. I won't update this comment so that other readers can see the code for the 2 use cases.

Python has itertools.product and unpacking for this case:

    from itertools import product

    space = product(enumerate(table.rows), enumerate(row))
    for (r, row), (c, cell) in space:
        if search_query.matches(cell.value):
            logger.info("Found cell: (%s,%s)", r, c)
            return (r, c)

Note that we don't need the "found" variable either to get the same behavior. We can also use break + else, but in this case, the return is cleaner since None is the default return value for Python functions. Additionally, the logger should do the interpolation, not format().

I think the Python team is having to do the difficult job of filtering new features, and it makes sense they don't introduce it in this case. We already have all the facilities in the language to deal with most nested loop cases, and those are rare cases anyway.

progval · on Sept 4, 2022

Not in this case. 'row' is an item in the 'table.rows' iterable.

BiteCode_dev · on Sept 4, 2022

Right, I misread the code, my bad.

For this case, one should use a nested comprehension:

    space = ((r, c, cell) for r, row in enumerate(table.rows) for c, cell in enumerate(row))
    for (r, c, cell) in space:
        if search_query.matches(cell.value):
            logger.info("Found cell: (%s,%s)", r, c)
            return (r, c)

Or see masklinn's comment if you don't need the logger.

In all languages, you separate code for preparing your input and code for processing it. In python, it's also a good idea to do that at the loop level:

- prepare your stream of data so that it's normalized using itertools, generators, a function, a numpy array, etc

- do a single loop pass for your processing

This way you will obtain a single iterable, and will be able to get the most out of the language tooling:

- you can pass it to map/filter, itertools or another comprehension

- you can convert to a list, a tuple, a Counter, etc. And slice it

- you can use "else" in the loop, which is an underused construct, but very useful when you need break

- you can use advanced unpacking

- you can call iter() and/or next(). This is what masklinn does in his example.

The point being, nested loops are rare enough, and we have tools to deal with them in an idiomatic way in Python, so not adding new syntax for a rare case that usually has a good solution makes sense to me.
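
As an illustration of the "single loop plus else" idea above, here is a minimal sketch. `grid` and `target` are made-up names standing in for the table and the search query.

```python
# Hypothetical data, only for illustration.
grid = [[1, 2], [3, 4], [5, 6]]
target = 4

# Step 1: prepare one flat stream of (row index, column index, value).
cells = (
    (r, c, value)
    for r, row in enumerate(grid)
    for c, value in enumerate(row)
)

# Step 2: a single loop, so a plain `break` is enough.
for r, c, value in cells:
    if value == target:
        found = (r, c)
        break
else:
    # Runs only when the loop finished without hitting `break`.
    found = None

print(found)
```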

masklinn · on Sept 4, 2022

At that point you might as well make the entire thing into a gencomp inside a `next()` call, the explicit for loop is redundant.

BiteCode_dev · on Sept 4, 2022

Yes, if you don't need the logger, it is indeed cleaner.

masklinn · on Sept 4, 2022

In that case a flatmap does the job fine[0]. In fact you don't even need a flatmap:

    found = next((
        (r, c)
        for r, row in enumerate(table.rows)
        for c, cell in enumerate(row)
        if search_query.matches(cell.value)
    ), None)

Or just return at the location where you found a match. That way you can also provide a better error message in the failure case.

[0] well map + itertools.chain.from_iterable, IIRC the stdlib doesn't have a flatmap per se, though more_itertools probably does.
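
A rough sketch of the flatmap variant from the footnote, assuming the stdlib pieces meant are `map` and `itertools.chain.from_iterable`. The `flatmap` and `cells_of` helpers and the sample data are hypothetical.

```python
from itertools import chain


def flatmap(func, iterable):
    """Hypothetical helper: map func over iterable, then chain the results lazily."""
    return chain.from_iterable(map(func, iterable))


def cells_of(indexed_row):
    """Turn one (row index, row) pair into a stream of (r, c, value) tuples."""
    r, row = indexed_row
    return ((r, c, value) for c, value in enumerate(row))


# Hypothetical data, only for illustration.
grid = [[1, 2], [3, 4], [5, 6]]
target = 4

found = next(
    ((r, c) for r, c, value in flatmap(cells_of, enumerate(grid)) if value == target),
    None,
)
print(found)
```

Everything stays lazy, so the scan stops at the first match, same as the nested generator expression.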

_t4za · on Sept 4, 2022

That doesn't look to be "significantly more readable" than:

  def find_cell(table):
    for r, row in enumerate(table.rows):
      for c, cell in enumerate(row):
        if search_query.matches(cell.value):
          return (r, c)

  found = find_cell(table)
  logger.log("Found cell: {}".format(found))

dataflow · on Sept 4, 2022

It sure looks more readable/maintainable to me than if it was refactored into a function. Could you try refactoring it and showing what it would look like in your mind that would be similar in that respect?

ajanuary · on Sept 4, 2022

They…they did show it refactored…

dataflow · on Sept 4, 2022

No, they edited their comment after I requested it without indicating that they did so, which makes me look dumb.

_t4za · on Sept 4, 2022

Sorry, didn't see your message. You're right I edited it after, but I didn't refresh the page to see your comment.

ajanuary · on Sept 4, 2022

Apologies

jasonhansel · on Sept 4, 2022

Wouldn't you just write:

  found = next((
    (r, c)
    for r, row in enumerate(rows)
    for c, cell in enumerate(row)
    if search_query.matches(cell.value)
  ), None)

ajanuary · on Sept 4, 2022

  def _find_cell(table, search_query):
      for r, row in enumerate(table.rows):
        for c, cell in enumerate(row):
          if search_query.matches(cell.value):
            return (r, c)
      return None

  found = _find_cell(table, search_query)
  logger.log("Found cell: {}".format(found))
  return found

dataflow · on Sept 4, 2022

Okay, so we managed to add cruft to the code without making it significantly worse. Now is this a change you would actually prefer as an improvement if someone asked you to review the original version?

It may be worth observing Gouvernathor does not exactly write "factored-out" code like this either: https://github.com/Gouvernathor/renpy-ParliamentDiagram/blob...

ajanuary · on Sept 4, 2022

But not making it significantly worse is the bar requested:

> that example needs to be significantly more readable and maintainable

Yes, things can be slightly tidier with labelled breaks in some contexts. But that’s not a particularly high bar. There are tons of language features for which that’s true. Adding everyone’s pet nicety to the language makes it a huge language, which isn’t what they’re aiming for. If that’s what you want, go use Perl or C#, both of which are very fine languages.

dataflow · on Sept 4, 2022

> But not making it significantly worse is the bar requested. [...] Adding everyone’s pet nicety to the language makes it a huge language, which isn’t what they’re aiming for.

I might risk going on a rant here, but if the improvement from break-label is insignificant enough that it wouldn't meet that bar, I'm not sure how a ton of other features already in the language ever met it. Did for-else and while-else clauses (which somehow pretty much every other language on earth gets by without) genuinely meet this bar? How about the (pitfall-ridden!) match-case and walrus operator that they added quite recently? Did assignments-in-conditionals suddenly meet this bar in 2019 but not meet it in the decades prior where they refused to consider it compelling enough? What about annotations—couldn't you already do f = annotate(f)?

joshuamorton · on Sept 4, 2022

You're arguing mostly about cases where something was added two decades ago. The bar for new features should go up as time goes on. But importantly "it was a mistake to add for-else" is an argument to not add a similarly not useful and confusing feature.

Assignment expressions passed basically because there are around 2 common places where they clean up code. But unlike labeled break, those cases are relatively common. I dislike the walrus operator and don't think it was worth adding to the language, but I've still run into cases where it's been valuable. That's never been true for labelled break.

ledauphin · on Sept 4, 2022

you're arguing against your own position now.

You are unimpressed with various marginal features that have been added but still think this other marginal feature is the special one that should be added.

In other words, the "but everyone else jumped off the bridge" argument is not especially compelling.

dataflow · on Sept 4, 2022

No, I'm saying their bar for adding things to the language (i.e. their "significance" threshold, which needs to be defined consistently somehow) seems to be lower in reality than your interpretation—or my default one, for that matter. I'm saying if I use the same threshold they utilize in reality, then yes, the improvement is very much "significant" enough to add.

still_grokking · on Sept 5, 2022

> Okay, so we managed to add cruft to the code without making it significantly worse. Now is this a change you would actually prefer as an improvement if someone asked you to review the original version?

Added cruft? I don't think so.

The second version without the break is imho a clear improvement! It's pure functional code (ignoring the logging).

Compared to the imperative code which uses mutation of state defined outside of the function, which makes the inner workings of this algo even harder to follow, it's an obvious improvement.

Line count stays the same. Therefore I would in any case prefer the second, refactored version.

There may be cases where a multi-level break or even a goto would yield a better solution, sure. But those cases are rare and likely not present in Python at all as you do such things usually only to avoid a few machine code instructions; something that is irrelevant in Python as it's anyway slow, no matter what you do, so such optimizations would not make any sense at all.

joshuamorton · on Sept 4, 2022

There's a number of options here that I'd suggest: first would be that you're doing something wrong with the api of the table, it should be indexable as `table[r][c]` or `table[r, c]`, and you really should opt for that if at all possible, then this becomes super straightforward.

But if you really need this to work on a 2-D array that is only an Iterable, and somehow isn't Sized or Sequence, this still works:

    def flatten(table):
        for r, row in enumerate(table.rows):
            for c, cell in enumerate(row):
                yield (r, c, row, cell)

    def query(table):
        for (r, c, row, cell) in flatten(table):
            if search_query.matches(cell.value):
                return (r, c)

This process is generalizable, you can pretty much always separate the iterate and test steps by following this pattern.

dataflow · on Sept 4, 2022

(1) I dare say this is less readable than the original. Not un-readable, but definitely less readable. Just put them side-by-side if this isn't obvious.

(2) You probably don't want to call it something generic like "flatten" if it's going to do something so specific like concatenate table.rows. Note that I'm not just nitpicking on the name here. Rather, my bigger point is your factored-out functions aren't really as reusable as your name suggests. In this particular case you can trivially get around it by passing table.rows directly, but that obviously wouldn't generalize if you needed anything else in the table in the process.

> This process is generalizable

(3) I'd question that too. Imagine if you had a per-row operation too. Like maybe:

  found = None
  for r, row in enumerate(table.rows):
    for c, cell in enumerate(row):
      if search_query.matches(cell.value):
        found = (r, c)
        break 2
    logger.log("Finished row #{}".format(r + 1))

Are you really going to factor it out like this?

  def flatten(table):
    for r, row in enumerate(table.rows):
      for c, cell in enumerate(row):
        yield (r, c, row, cell)
    logger.log("Finished row #{}".format(r + 1))

joshuamorton · on Sept 4, 2022

(1) yes my way is clearer, side by side this is very obvious. Certainly it is not worse.

(2) sure, but this doesn't change anything. Factoring a big ugly function out into two smaller easy functions is...fine, even if each is only used once. The name I picked wasn't to imply generalization, it was because I needed a name.

(3) yes.

But again the whole reason this is unreadable at all is because you've created an intentionally obtuse api. We're returning the row and column, so clearly either we're able to index into this thing normally, or it's a temp we've created, in which case we can modify its api and index into it normally.

dataflow · on Sept 4, 2022

> But again the whole reason this is unreadable at all is because you've created an intentionally obtuse api. [...] we can modify its API

Do you never consume APIs you have no control over? And is every sequence you use an indexable list?

And no, there was nothing intentionally obtuse in my comment. It was incredibly straightforward and readable code. We obviously disagree, but regardless, I don't enjoy the bad-faith accusations, so I'll stop here.

joshuamorton · on Sept 4, 2022

> Do you never consume APIs you have no control over? And is every sequence you use an indexable list?

I already explained this. You're returning an index. There's no reason to do that unless you can use the index. Somewhere, the api allowed you to use the index, so in this case, you've got control over the relevant parts of the api that don't!

> And is every sequence you use an indexable list?

Yes, or it's small enough that I can store or transform it into one, or it's big enough that I need to use specialized tooling that doesn't play well with break or continue.

> It was incredibly straightforward and readable code

I agree, there's nothing wrong with the code as written. It's fine. The obtuseness was the example. I'll reiterate: you're returning an index into some table. What are you going to do with that index, as you're claiming the table you pulled that index from cannot be indexed into for some reason? The code is fine, the example is obtuse.

dotancohen · on Sept 4, 2022

Throw it in a function and return early:

  def findIt(query, rows, logger):
    for r, row in enumerate(rows):
      for c, cell in enumerate(row):
        if query.matches(cell.value):
          logger.log("Found cell: {}".format((r, c)))
          return (r, c)  # early return replaces the "break 2"

    return None

  found = findIt(search_query, table.rows, logger)

progval · on Sept 4, 2022

You can use a generator expression:

  all_matches = (
    (r, c)
    for r, row in enumerate(table.rows)
    for c, cell in enumerate(row)
    if search_query.matches(cell.value)
  )
  found = next(all_matches, None)
  if found:
    logger.log("Found cell: {}".format(found))
  return found

It is lazy, so it stops as soon as the first match is found.

dataflow · on Sept 4, 2022

This doesn't generalize well. Add a statement or something in the loop and it clashes with this.

jdougan · on Sept 4, 2022

I suspect the Pythonistas would suggest something like the below is somehow better. I'm not confident in its superiority, particularly since python doesn't really optimize.

  def searchRow(row,r):
    found = None
    for c, cell in enumerate(row):
      if cell.value == expected:
        found = (r, c)
        break
    return found

  found = None
  for r, row in enumerate(self.rows):
    found = searchRow(row, r)
    if found is not None:
      break
  logger.log("Found item: {}".format(found))
  return found

artificialidiot · on Sept 4, 2022

I think it boils down to "early exit" vs. "single return" style. More than likely, the average programmer will pick the style they've been taught at school or whichever was preached more.

I think early exit constructs improve readability and python already offers a lot of syntax open to abuse at this point. (things you can write with nested list/dict comprehensions with if conditions/expressions scattered..)

benstopics · on Sept 6, 2022

The first thing that came to mind was either for/else, try/except, or refactoring out to a function which were all mentioned in the article. This solves the problem in all cases except a multi-level continue as it was called, and the solution by Python Millionaire was essentially goto labels. The reason high level programming languages moved away from goto labels is because they create spaghetti code. Goto labels were before for and while loops and are not necessary now, but this is a shallow answer because goto labels do work.

To dig deeper into why goto labels are bad and the alternative, what goto labels allow you to do is to create a state machine in procedural code. The more explicit way to do this is to create a dictionary object that tracks the state and a while loop with state-based logic that includes a halting condition. But according to Python Millionaire this would not be “Pythonic” which is basically just a blanket term for saying it is built into the language and therefore designed to be simple and easy to read. However, I would argue that goto statements are inherently un-Pythonic because they encourage spaghetti code. I don’t see how for/else or try/except is not Pythonic, but also I don’t see the aversion to refactoring to a separate function, which is the accepted cross-language way to refactor and simplify code. It even lets you write a built-in comment for what the function does, the function name. That being said, an easy way to determine if the function failed is to either throw an exception or simply return a result like True/False which would determine whether to break out of the parent loop.
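
To make the "dictionary plus while loop" idea concrete, here is one possible sketch. The layout of `state`, the phase names, and the sample data are all assumptions made for illustration.

```python
# Hypothetical data, only for illustration.
grid = [[1, 2], [3, 4], [5, 6]]
target = 4

# All loop state lives in one dictionary; "phase" holds the halting condition.
state = {"r": 0, "c": 0, "found": None, "phase": "scan"}

while state["phase"] != "halt":
    r, c = state["r"], state["c"]
    if r >= len(grid):
        state["phase"] = "halt"  # ran past the last row: no match
    elif c >= len(grid[r]):
        state["r"], state["c"] = r + 1, 0  # end of row: move to the next one
    elif grid[r][c] == target:
        state["found"] = (r, c)
        state["phase"] = "halt"  # match: stop immediately
    else:
        state["c"] = c + 1  # keep scanning the current row

print(state["found"])
```

It works, but it is noticeably more ceremony than either a helper function or a labelled break for the same search.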

greatgib · on Sept 11, 2022

Personally I like very much the solution proposed at the end of comments:

https://lwn.net/Articles/907510/

With something like: For loop... As my_loop: my_loop.continue()

henrydark · on Sept 4, 2022

Python has coroutines (in the original sense), so one can implement almost Break-break. For example, it's easy to implement an object so that this will work as expected:

  from easytowritelibrary import LoopManager
  lm = LoopManager()
  for x in lm(a1):
    for y in lm(a2):
      if cond:
        lm.break_(2); break
      block()

It can support stuff like:

  for x in lm(a1, "a1"):
    for y in lm(a2, "a2"):
      if cond1:
        lm.break_("a1"); break
      if cond2:
        lm.continue_("a1"); break
      block()

It can't handle "for-else" correctly, but maybe that's OK.
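
One way such an object might be written, as a rough sketch rather than a finished library: it covers only the numeric `break_` form, and every name beyond those used above is an assumption. It relies on a plain generator being resumed only when the loop asks for its next item.

```python
class LoopManager:
    """Hypothetical sketch of a helper for breaking out of several loops."""

    def __init__(self):
        self._pending = 0  # enclosing loops that still have to stop

    def __call__(self, iterable):
        for item in iterable:
            yield item
            # Execution resumes here when the loop asks for its next item.
            if self._pending:
                self._pending -= 1
                return  # end this loop instead of producing another item

    def break_(self, levels):
        # The caller's own `break` exits the innermost loop; the wrappers
        # around the remaining levels stop themselves when resumed.
        self._pending = levels - 1


lm = LoopManager()
visited = []
for x in lm(range(3)):
    for y in lm(range(3)):
        if (x, y) == (1, 1):
            lm.break_(2)
            break
        visited.append((x, y))

print(visited)
```

Two caveats with this sketch: any statements in the outer body after the inner loop still run once before the outer loop stops, and an `else` clause on the outer loop would fire because that loop ends by exhaustion, not by `break`.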

mangecoeur · on Sept 4, 2022

You might as well implement goto while you’re at it.

analog31 · on Sept 4, 2022

At least GOTO is readable to a layperson in terms of what happens after the break. And Python has sprawled to the point where it's important to make things readable by people who have not memorized the entire language.

chkas · on Sept 4, 2022

I argue for a "break" with a number that indicates how many loops you want to break out of. I have this in my programming language https://easylang.online. Each "break" needs the number, so also a "break 1".

  for i = 1 to 50
    for j = 1 to 50
      if i * i + j * j * j = 80036
        break 2
      end
    end
  end
  print i & " " & j

ajanuary · on Sept 4, 2022

1) I need to count the loops to work out where it breaks. 2) It’s sensitive to refactoring changing the nesting.

That seems strictly worse than labelled breaks. What are the upsides?

chkas · on Sept 4, 2022

You don't have to make up a name again. The namespace is usually pretty full anyway. If you accept a "break", which is a "break 1", you must consistently also allow a "break 2". Many things are sensitive to refactoring that changes the nesting.

datalopers · on Sept 4, 2022

PHP does this. Works fine, though it’s admittedly rarely needed in the first place.

shusaku · on Sept 4, 2022

FYI Fortran has this feature so if you feel like python is really missing this feature try using a more advanced language

rich_sasha · on Sept 4, 2022

Walrus operator? In. F string interpolation? In. Pattern matching? Hacked in.

Genuinely useful feature that is awkward to emulate? Nah.

Let's face it, everyone writes a double or triple nested for loop every now and then.

stinos · on Sept 4, 2022

> every now and then

Exactly; at least for me the other features you mention I use daily (or when considering coding hours only, multiple times per hour), whereas I cannot even exactly remember when the last time was I had to break out of a nested loop. I'm not saying it's not useful, just that I don't think that code frequency is a useful argument to make here, especially not in comparison with the features you mention.

rich_sasha · on Sept 4, 2022

See I'm the other way round. I don't use walrus op, I dislike f strings because the "operator" f is to the left, whereas I'm used to "filling" a string from the right. Pattern matching scares me, since unless you're working with data classes, the contents of a class can be only loosely related to its constructor. Loops I write daily.

This feels like a bunch of people projecting their subjective biases on what code should look like on everyone else. Python 2/3 all over again (in a small teacup).

joshuamorton · on Sept 4, 2022

It is weird to add a feature to the language that would be ~immediately banned by every style guide worth its salt.

Python is already expressive enough that unlabeled break and continue are usually the wrong choice.

rich_sasha · on Sept 4, 2022

> unlabeled break and continue are usually the wrong choice.

Citation needed. You can of course write code so they are not needed, but it won't necessarily be clearer, more efficient or "better" in any way.

That sometimes they facilitate spaghetti code is no reason to ban them.

joshuamorton · on Sept 4, 2022

> Citation needed. You can of course write code so they are not needed, but it won't necessarily be clearer, more efficient or "better" in any way.

If efficiency truly matters, you should not be using break or continue, you should be using a vectorized processing library like numpy or pandas.

The pattern I describe in https://news.ycombinator.com/item?id=32710822 works for continue, and the same general concept works for break (and note that I've made this as unappealing as possible for my refactor, and it's still not really worse, but could easily be made even better):

    for i in x:
        if cond(i):
            break
        return do_a_thing(x)

can become

    def find(x):
        for i in x:
            if cond(i):
                return i

    do_a_thing(find(x))

and these two patterns generalize to most more complex examples, and are fundamentally easier to follow, as the code is doing fewer things at any given time!

Like break and return from a loop are equivalent modulo implicit state, and the same is true for continue and a conditional generator.
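To make the continue half of that concrete, here is a minimal sketch. The names `should_skip` and `handle` are hypothetical stand-ins invented for illustration, and the skip condition is an arbitrary assumption:

```python
def should_skip(item):
    # Hypothetical condition: skip negative numbers.
    return item < 0


def handle(item):
    # Hypothetical per-item work.
    return item * 2


def with_continue(items):
    results = []
    for item in items:
        if should_skip(item):
            continue  # jump to the next item
        results.append(handle(item))
    return results


def with_generator(items):
    # The filter lives in a generator expression, so the loop body has one path.
    wanted = (item for item in items if not should_skip(item))
    return [handle(item) for item in wanted]
```

Both functions should produce the same list for the same input; the second just moves the skip decision out of the loop body.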

rich_sasha · on Sept 4, 2022

They are not necessarily easier to follow, since you chop your logic into different functions. This may be better, but isn't necessarily. Esp if the loop has some (gasp) state which you then need to pass around to the outsourced functions.

Not all loops are numeric; in fact most aren't (since vectorisable stuff already is vectorised).

joshuamorton · on Sept 4, 2022

> Esp if the loop has some (gasp) state which you then need to pass around to the outsourced functions.

Being explicit about which state needs to be passed is almost universally better than having implicit state in a large complex function. If you're dealing with a complicated function tracking a large amount of state, separating the state needed to iterate over the loop from the state needed to validate whatever condition you're checking is less complex than keeping it all in one larger function. The number of potential interactions grows exponentially in the number of local variables. Less state (fewer locals) means exponentially less complexity.

When dealing with very small numbers of locals, the additional confusion from a second function might win out, but any situation where you're forced to pass state between the loop and the conditional is going to be over the threshold where multiple functions are easier for most people to follow.

And that's just stateful interactions, the functions I've written have precisely one entrance and exit point, and precisely one path through them, while break and continue create opportunities for more complicated control flow.

You're arguing that a function with more local state and more complex control structures should be simpler in some cases, but I don't see that happening as the state and control structures get even more complex. This is of course made obvious by the fact that the first code example I posted contains a bug, while the second avoids it.

rich_sasha · on Sept 4, 2022

You're arguing a very patronizing point of view: you know what is better for my code.

I find these things are not universal. Passing state around can be very confusing. Not on trend with everything being "pure" and functional but true pragmatically speaking.

Certainly compared to the walrus operator or f-strings, labelled break/continue are very valid tools, which have different tradeoffs from the alternative ways of emulating them, so sometimes they will be better.

joshuamorton · on Sept 4, 2022

I am arguing the very obviously true point that some ways of writing a piece of code are easier to understand for most people.

I'm not arguing anything about your code, I've only provided my own code examples. There's no reason for that to be patronizing.

My point was and is that labelled break and continue are almost always a code smell, and would be banned by any good style guide, much like C++ has many misfeatures that are banned by most style guides. You may not be most people, and break and continue may be clearer to you, but that's not a good reason to add a feature to a language. Not every language needs to be perfectly catered to your needs. Certainly not every language should (or needs to) cater to all of my needs, and again there's nothing patronizing about expressing that.

Adding a misfeature doesn't make much sense, and you've done nothing to demonstrate that labelled break and continue are useful or make code clearer. Nor even that unlabelled break and continue are particularly valuable. Instead you've claimed I'm insulting you.

ledauphin · on Sept 4, 2022

"Citation needed" is an unproductive demand when we're discussing our opinions about programming language style and design. Or do you think your own assertions about labeled break should be met with the same demand?

rich_sasha · on Sept 4, 2022

"I would find this useful" is self-supporting.

"Break and continue are almost always the wrong choice" is not, and indeed dissing one of the oldest and most retained features in programming language history requires substantial justification.

BiteCode_dev · on Sept 4, 2022

Python has had itertools.product for ages for this use case.

masklinn · on Sept 4, 2022

product only covers independent nested iterations, some of them have dependencies.

But for those you've got itertools.chain, or generator comprehensions.
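A rough sketch of the generator-comprehension route for a dependent inner loop, where the inner iterable comes from the outer variable. The function name `first_cell` and the sample data in the usage note are made up for illustration:

```python
def first_cell(rows, test):
    # The inner loop iterates over `row` itself, so product() does not apply;
    # a generator expression still flattens both levels into one loop.
    pairs = ((row, cell) for row in rows for cell in row)
    found = None
    for row, cell in pairs:
        if test(cell):
            found = (row, cell)
            break  # a single break now leaves the whole search
    return found
```

For example, `first_cell([[1, 2], [3, 4, 5], [6]], lambda c: c > 3)` stops at the cell 4 in the second row.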

hk1337 · on Sept 4, 2022

Two nested loops is ugly but tolerable, 3+ is disgusting. Refactor it so you don’t have to go 3 deep.

feet · on Sept 4, 2022

What method would you use to iterate over a list of lists?

wrigby · on Sept 4, 2022

It definitely doesn't solve all cases, but quite a lot of nested loops can be eliminated with `itertools.chain`:

  for item in itertools.chain.from_iterable(list_of_lists):
      # Do something with item

hk1337 · on Sept 4, 2022

A list of lists is two deep. While not preferable, two nested loops is probably going to be the solution, but if I can come up with another, I would rather use that: perhaps a function to process each row instead of two loops in a single function.

feet · on Sept 4, 2022

If you have a function with a loop to iterate the inner list and place that function within another loop iterating the outer list, isn't that the same thing?

What would you have the inner function do?

barrkel · on Sept 4, 2022

It's more tedious in C derivatives, where loops and switches share the break keyword. Where that isn't the case, it's much harder to justify multi-level break. Refactor to a function and return instead.
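A minimal sketch of that refactor, assuming a hypothetical search over two sequences (`find_pair` and `test` are illustrative names only):

```python
def find_pair(a1, a2, test):
    for x in a1:
        for y in a2:
            if test(x, y):
                return x, y  # leaves both loops at once
    return None  # nothing matched
```

The return plays the role of the multi-level break, and the caller decides what to do with `None`.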

awinter-py · on Sept 4, 2022

have always wanted `break break` syntax for this kind of thing

  for x in a1:
    for y in a2:
      if test(y):
        break break # break both outer loops

in general, I wonder if there's a class of 'tree-like' control structures which can break multiple levels under certain circumstances

an application might be scope management parsing -- I tried to build something like this a while ago, not clear that it's better than parser-generators but it's certainly different

BiteCode_dev · on Sept 4, 2022

Itertools is your friend:

    from itertools import product

    for x, y in product(a1, a2):
        if test(y):
            break

Not only do you get the behavior you want, but you also get easier-to-read code with one less level of indentation. As a bonus, you can now use "else" (which most people don't know can be used with "for" and not just "if").
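As a sketch of the "else" part (the wrapper function `find_first` is a hypothetical name, only there to make the snippet self-contained):

```python
from itertools import product


def find_first(a1, a2, test):
    for x, y in product(a1, a2):
        if test(y):
            result = (x, y)
            break
    else:
        # Runs only when the loop finished without hitting break.
        result = None
    return result
```

One caveat: product reads its input iterables fully before yielding anything, so this does not suit infinite iterators.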

Why introduce a new syntax when a library does the job? A new syntax is a huge bar to pass for a language, and nested loops are rare.

Itertools is a gem, and any python dev will benefit greatly from mastering it.

rich_sasha · on Sept 4, 2022

It is markedly less efficient since you will test y every time for each x, as opposed to once. O(1) vs O(n)

henrydark · on Sept 4, 2022

Specifically the original double loop example also has this behavior and complexity.

If the loops were written as

  for y in a2:
    if cond:
      break
    for x in a1:
      # something

then they would have O(|a2|) complexity instead of O(|a1||a2|), but then we also don't need break break

tempxyz · on Sept 4, 2022

Swift lets you declare a label, e.g. "outer", in front of the outer for loop, then you can write "break outer".

Gare · on Sept 4, 2022

So do Java and JavaScript.

tempxyz · on Sept 4, 2022

Cool :D.

nemoniac · on Sept 4, 2022

Common Lisp has a system of Conditions and Restarts which can do this and more.

The link below describes an example of handling errors which arise when reading a collection of files line by line.

https://gigamonkeys.com/book/beyond-exception-handling-condi...

phoe-krk · on Sept 4, 2022

No, you don't even need the condition system to break out of nested iteration.

    (loop named loop-a do
      (loop named loop-b do
        (loop named loop-c do
          (return-from loop-b)))
      (return-from loop-a 42))

Under the hood, this uses CL:BLOCK; you can also use it yourself to mark where you want to be able to return from within the lexical scope of your code.

jeffybefffy519 · on Sept 4, 2022

Could just raise StopIteration?

BiteCode_dev · on Sept 4, 2022

The first loop would catch it, it would not reach the next. But you can indeed use any exception. However, itertools.product does a better job.
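For completeness, the "use any exception" route might look roughly like this. `Found` and `search` are hypothetical names invented for the sketch:

```python
class Found(Exception):
    """Hypothetical exception used only to unwind the nested loops."""


def search(a1, a2, test):
    try:
        for x in a1:
            for y in a2:
                if test(x, y):
                    raise Found((x, y))  # unwinds both loops
    except Found as exc:
        return exc.args[0]  # the (x, y) pair passed to Found
    return None
```

It works, but it is noticeably heavier than either the product version or a plain return from a helper function.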