Why is the for loop hateful?

Original · 2012/07/12 12:45

Closures in Java have recently become a hot topic: some prominent figures are drafting a proposal to add closures to a future version of Java. The proposed closure syntax and the accompanying language extension, however, have been heavily criticized by many Java programmers.

Not long ago, Elliotte Rusty Harold, a prolific author of dozens of programming books, published a post questioning the value of closures in Java, asking in particular, "Why is the for loop hateful?" [http://justjavac.com/other/2012/05/15/whats-wrong-with-the-for-loop.html]:

I don't understand why some people are so eager to eliminate the for loop. What exactly are they against? This is hardly the first time computer-science theorists have taken aim at the for loop (or something like it).

To be fair, it would be unjust to say that Elliotte simply fails to see the value of closures in the abstract. His main complaint is that even after Bruce Tate (the Jolt Award-winning, best-selling author of Better, Faster, Lighter Java) recently wrote about this very topic on developerWorks, he still could not see the value of closures in Java.

(Bruce's examples use Ruby):

Table 1 The simplest closure

 3.times {puts "Inside the times method."}

Result:

 Inside the times method.
 Inside the times method.
 Inside the times method.

times is a method on the object 3. It executes the code in the closure three times. {puts "Inside the times method."} is the closure: an anonymous function passed into the times method that prints a static sentence. This code is tighter and simpler than the equivalent traditional for loop, shown in Table 2:

Table 2: The loop without a closure

 for i in 1..3
   puts "Inside the times method."
 end

With such a lifeless introduction to closures, I can hardly see their real value. At best, this first comparison shows a subtle difference. Most of the other examples in Bruce's developerWorks article are little better: either vague or uninspiring.

I don't intend to dwell on Elliotte's confusion over this Ruby-style closure; being overly picky about such things is pointless.

Nor do I want to wade into the current debate over closure syntax in Java, or over whether Java should have closures at all. I have no position in that argument, and frankly, I don't care how or when those questions get settled.

However, Elliott raised an important question: why is the for loop hateful?

Here is a common example:

 double sum = 0;
 for (int i = 0; i < array.length; i++) {
     sum += array[i];
 }

What's wrong with it? I've been programming for many years, and this idiom is comfortable at a glance: obviously, it adds up the values in an array.

But when I actually read this code, there are thirty-odd tokens scattered across these four lines that I have to parse. Yes, a few characters could be shaved off with syntactic shortcuts. Still, for something as simple as adding up an array, that is a lot to write, and a lot to get exactly right.

Don't believe me? Here is another example from Elliotte's article, copied verbatim:

 String s = "";
 for (int i = 0; i < args.length; i++) {
     s += array[i];
 }

Spot the bug? If that code compiles and slips past code review, it could take weeks to find, and weeks more to ship a patch. And these are simple for loops.

Imagine how much worse things get as for loop bodies grow larger, or become nested. (If this kind of bug still doesn't worry you because you consider it a mere typo, think about how many times you've made exactly that slip in a for loop.)

If you could write a simple for loop as a single line, with less repetition and fewer characters, it would be easier to read and easier to write. Being more concise, it would offer fewer chances to introduce bugs, and the bugs that did appear would be easier to find.

How do closures help? Here is the first example, written in Haskell:

 total = sum array

OK, I'm cheating. The sum function doesn't use a closure itself; it is defined in terms of fold, which does accept a closure:

 total = foldl (+) 0 array

Here is the second example, first in its common idiomatic form and then with an explicit closure:

 s = concat array
 s = foldr (++) [] array
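For readers who would rather see this in Java than Haskell: the Stream API that eventually shipped in Java 8 offers exactly this shape of operation. Its reduce method takes a seed value and a closure, just like foldl. A minimal sketch (the variable names and sample data are mine):

```java
import java.util.Arrays;
import java.util.List;

public class FoldDemo {
    public static void main(String[] args) {
        List<Integer> array = Arrays.asList(1, 2, 3);

        // reduce(seed, closure) plays the role of foldl (+) 0 array
        int total = array.stream().reduce(0, (acc, x) -> acc + x);

        // the string version plays the role of foldr (++) [] array
        // (reduce requires an associative operation, so left vs right
        //  folding gives the same answer here)
        String s = Arrays.asList("a", "b", "c")
                         .stream()
                         .reduce("", (acc, x) -> acc + x);

        System.out.println(total); // prints 6
        System.out.println(s);     // prints abc
    }
}
```

Note that the closure names the accumulator and the element once; there is no index variable to mistype, which is exactly the class of bug the args/array example above fell into.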

I admit that explaining the power of closures to programmers who are more comfortable with for loops by way of strange functions named foldl and foldr may not mean much. But these functions highlight the key weakness of the for loop: it conflates three separate kinds of operation (filtering, reduction, and transformation).

Both for loops above take a list of values and reduce it to a single value. Functional programmers call these operations "folds".

Here is how a fold works. It starts with an operation (a closure) and a seed value, plus the first element of the list. The operation is applied to the seed and the first element, producing a new seed. The fold then applies the operation to that new seed and the next element of the list, and so on until the list is exhausted. The result of the final application is the result of the fold.

Here is a demonstration:

 s = foldl (+) 0       [1, 2, 3]
   = foldl (+) (0 + 1) [2, 3]
   = foldl (+) 1       [2, 3]
   = foldl (+) (1 + 2) [3]
   = foldl (+) 3       [3]
   = foldl (+) (3 + 3) []
   = foldl (+) 6       []
   = 6

Haskell provides many fold functions. foldl starts at the front of the list and works toward the end; foldr starts at the last element and works from back to front. There are many other variations, but these two are the most basic.

Of course, folds are very primitive operations. If you abandoned the for loop only to replace it with assorted foldl and foldr incantations everywhere, you would be just as confused.

In practice, higher-level operations such as sum, prod, and concat are themselves defined in terms of folds. Code written with these higher-level reductions is more concise, easier to read, easier to write, and easier to understand.
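The same layering works in Java terms. A sketch using the Java 8+ Stream API (the helper names sum and concat are my own): once reduce exists, the higher-level operations are one-liners defined on top of it, and calling code never sees the fold.

```java
import java.util.Arrays;
import java.util.List;

public class HigherLevelOps {
    // Higher-level reductions defined once, in terms of the same
    // underlying fold (reduce)
    static int sum(List<Integer> xs) {
        return xs.stream().reduce(0, Integer::sum);
    }

    static String concat(List<String> xs) {
        return xs.stream().reduce("", String::concat);
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3)));     // prints 6
        System.out.println(concat(Arrays.asList("a", "b"))); // prints ab
    }
}
```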

Of course, not all for loops are reductions. Consider this one:

 for (int i = 0; i < array.length; i++) {
     array[i] *= 2;
 }

This is a transformation. Functional programmers call it a map:

 new_array = map (*2) array

The map function visits each element of a list, applies a function to it, and builds a new list out of the results. (In some languages, the replacement happens in place.) It's an easy operation to understand. The sort function has a similar shape: it takes a list and returns (or modifies) a list.
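The map operation also translates directly into Java. A sketch using the Java 8+ Stream API (the sample data is mine):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MapDemo {
    public static void main(String[] args) {
        List<Integer> array = Arrays.asList(1, 2, 3);

        // map applies the closure to every element and builds a new list;
        // the original list is left untouched
        List<Integer> doubled = array.stream()
                                     .map(x -> x * 2)
                                     .collect(Collectors.toList());

        System.out.println(doubled); // prints [2, 4, 6]
    }
}
```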

The third type of for loop is filtering.

Here is an example.

 int tmp[] = new int[nums.length];
 int j = 0;
 for (int i = 0; i < nums.length; i++) {
     if ((nums[i] % 2) == 1) {
         tmp[j] = nums[i];
         j++;
     }
 }

This is a very simple operation, but written as a for loop with two independent counters, the incidental complexity completely obscures what it does. If filtering is a fundamental operation, it ought to be as easy to use as a fold or a map. And it is:

 odds = filter (\i -> (i `mod` 2) == 1) nums
 odds = filter isOdd nums  -- more idiomatic form
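And for the Java version, again a sketch using the Java 8+ Stream API: the closure states the condition, and the temporary array and second counter from the loop above simply disappear.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FilterDemo {
    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(1, 2, 3, 4, 5);

        // filter keeps the elements for which the closure returns true:
        // no temporary array, no second counter to get wrong
        List<Integer> odds = nums.stream()
                                 .filter(i -> i % 2 == 1)
                                 .collect(Collectors.toList());

        System.out.println(odds); // prints [1, 3, 5]
    }
}
```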

At its core, this is what's wrong with the for loop: it conflates (at least) three separate kinds of operation while emphasizing an incidental detail, traversing a sequence of values.

fold, map, and filter really are three different ways of processing a list of data, and they deserve to be handled separately. Passing a closure into the loop makes it much easier to separate the what from the how. Each time I traverse a list, I can supply an anonymous function or reuse a common named one (such as isOdd, (+), or sqrt).

Closures are not a terribly profound concept, but when they are embedded deeply in a language and its standard library, we no longer need to clutter our code with these low-level operations. Instead, we can build higher-level operations that say what we mean, such as sum and prod.

More importantly, thinking in these terms makes it easier to reason about more complex operations, such as transforming a tree, filtering a vector, or folding a list into a hash.
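"Folding a list into a hash" sounds exotic, but it is just another reduction. One hedged sketch in Java 8+ terms (the word list and the grouping key are my own illustration):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupDemo {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("ant", "bee", "cat", "ape");

        // Reduce a list into a hash: group words by their first letter
        Map<Character, List<String>> byFirstLetter = words.stream()
            .collect(Collectors.groupingBy(w -> w.charAt(0)));

        System.out.println(byFirstLetter.get('a')); // prints [ant, ape]
    }
}
```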

Finally, Elliotte also raised the issue of parallel execution on multi-core processors, claiming that code like 3.times is "less efficient" than a for loop.

Unfortunately, I think he misses the point here. Yes, some operations must run serially while others can run in parallel, but with nothing to go on except a for loop, determining which is which is a hard compiler-optimization problem. If instead you decompose your loops into operations that may be parallelizable (like map and filter) and operations that are inherently sequential (like foldl and foldr), the compiler has a much easier time telling them apart.

Moreover, if you know your data better than the compiler does, you can explicitly request that a particular map run sequentially or in parallel.
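This is in fact how things later worked out in Java: with Java 8+ streams (a sketch under that assumption), the caller flips the same map between sequential and parallel execution without touching the closure, just by choosing stream() or parallelStream().

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ParallelDemo {
    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(1, 2, 3, 4);

        // The same map, explicitly requested in parallel;
        // collect still assembles the results in encounter order
        List<Integer> squares = nums.parallelStream()
                                    .map(x -> x * x)
                                    .collect(Collectors.toList());

        System.out.println(squares); // prints [1, 4, 9, 16]
    }
}
```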
