Tensor considered harmful (seas.harvard.edu)
46 points by singhrac on Jan 4, 2019 | 20 comments


I believe historically dimensions were indexed because the exact order made a huge difference for performance. Hand-coded CUDA would only work for the exact order it was designed for.

Now that nearly all frameworks have one or more optimisation passes that can transpose, rearrange, or split data transparently for performance, this reason seems to be going away.
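
For illustration, a minimal PyTorch sketch (my example, not the commenter's) of how logical dimension order is now decoupled from physical layout:

    import torch

    x = torch.randn(8, 3, 32, 32)  # NCHW logical order
    y = x.permute(0, 2, 3, 1)      # NHWC view: no data is moved, only strides change
    print(y.is_contiguous())       # False: same storage, new indexing

    # Frameworks can also rewrite the physical layout behind the same logical shape:
    z = x.to(memory_format=torch.channels_last)
    print(z.shape)                 # torch.Size([8, 3, 32, 32]): logical order unchanged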


How does this compare with xarray?


Can those developers who write deep learning libraries stop calling multi-dimensional arrays "tensors"? This marketing gimmick is really annoying. One may represent a tensor as a multi-dimensional array, but multi-dimensional arrays per se are not tensors. Calling them tensors is like calling every object a "vector". While technically correct (over any field, every set generates a free vector space, so every object --- whatever it is --- is technically a vector in some vector space), this is not how we think in normal contexts. The case of arrays vs. tensors is similar. Calling them tensors is just bending the language to make arrays sound fashionable.

Actually, I suspect that those "tensors" in TensorFlow or the like have nothing to do with the tensors in physics/mathematics at all. The only connection of arrays to "tensors" that I can find in various deep learning tutorials is low-rank approximation of a matrix (but even this can be put in matrix language without mentioning any tensor). Apparently all other operations in those "tensor" libraries are ordinary array operations.
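
For instance, the low-rank approximation mentioned above really can be stated purely in matrix language; a minimal NumPy sketch (my illustration) via truncated SVD:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.random((6, 5))

    # Best rank-1 approximation in the least-squares sense (Eckart-Young),
    # with no tensor vocabulary in sight:
    U, s, Vt = np.linalg.svd(A)
    A1 = s[0] * np.outer(U[:, 0], Vt[0])
    print(np.linalg.matrix_rank(A1))  # 1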


Completely agree! It feels like a gimmick to make things sound fancier than they are.

A key property of (actual) tensors is how they transform under coordinate transformations. One can store the components in multi-dimensional arrays but that doesn't fully describe the tensor.

Vectors, which are a kind of tensor (of type (1,0), i.e. maps from the dual space to the field), also transform in a specific way under coordinate (change-of-basis) transformations.

In physics, every quantity has certain transformation properties under rotations, Lorentz transformations, and internal symmetry transformations, and that puts a tight constraint on the quantities that can be constructed; the "tensorness" or "vectorness" of quantities has deep meaning. The use in deep learning completely bypasses all this meaning and instead defines a tensor to be a high-dimensional matrix.


Coming from a math background, I found the name "tensor" to be intuitive. If height is a vector space of dimension h, width is a vector space of dimension w, and rgb is a vector space of dimension 3, then the space of h-by-w RGB images can be thought of as the tensor product of height, width, and rgb.

I think "tensor" is the abstract idea, where "multi-dimensional array" is the implementation.


Thanks, but I don't think so. On the one hand, feature spaces in machine learning are usually not conceptual vector spaces (e.g. in a tall/short classification, it doesn't make sense to add two "tall"s together or multiply a "tall" by a scalar). In general, features have different data types: some booleans, some integers, some real numbers. We only lump them together, cast them into a homogeneous data type, and store them in a multidimensional array solely for interpolation purposes.

On the other hand, since you come from a math background, I suppose you know the concept of "free vector space" (which is usually introduced along with the universal mapping property of tensor product). As said in a previous comment, technically every object can be viewed as a vector, and hence a tensor. A duck is a tensor, a bird that quacks like a duck is a tensor, a Java SimpleBeanFactoryAwareAspectInstanceFactory is a tensor, a Neo Armstrong Cyclone Jet Armstrong Cannon is a tensor, a red-black tree is a tensor. But we don't call them tensors, because normally we don't do anything tensorial to them.

Then why should we treat multidimensional arrays differently? Apart from the mere fact that a tensor in physics/mathematics can be stored numerically as a multidimensional array, my impression is that (please correct me if I'm wrong) the coinage of the term in machine learning libraries shows a complete disregard for the original physical or mathematical concept. Is "multidimensional array" a name so hard to understand that we need a new one? Does the new name in any way enhance one's understanding of machine learning in a way the old name cannot? I wonder.


So I would disagree that the machine learning terminology was coined without regard to the mathematical concept. I would argue that multi-dimensional arrays are very naturally thought of as concrete realizations of tensors, just like how floating point numbers are concrete realizations of real numbers.

It's true that you can take the "free vector space" on any set under the sun, but I disagree with the statement "every object can be viewed as a vector." Rather, you can construct a vector space and associate to each object an element of that vector space, but the object itself is not a vector.

As opposed to an array of floats, which itself really does look like, and is fruitful to think of as, a vector.

Edit: Oh, also regarding the heterogeneous type stuff. I think most would agree that calling a bag of ints, floats, bools, and strings a "tensor" would not make much sense.


Out of curiosity, what would you prefer? I think MultiDimensionalArray is a bit verbose, and numpy already has a hold on just `array`.

I don't think the name comes from any claims or delusions about what's being represented; it's just a convenient term. I can understand the frustration, though.


I don't have a preferred term. You may call it multiarray, mdarray, ndarray, hdarray, MDA, etc.

I don't agree that "tensor" is a good convenient term. E.g. C++'s std::vector does not really support vector operations, but most programmers can guess that it is a 1D array in terms of memory layout. "Tensor"? It's just a meaningless term to most beginners. If you think MultiDimensionalArray is too verbose, you may call it "MDA". It is shorter, conveys the meaning of the term better than "tensor" does, and it's by no means misleading.


To precisely the extent that a vector is an array, we have also that a tensor is a multidimensional array. A multidimensional array-of-scalars type is precisely the tensor product of the appropriate one-dimensional array-of-scalars types, in the standard math sense of "tensor product": linear maps out of the former are in natural correspondence with multilinear maps out of the latter.

The only things I see to complain about are that vector spaces don't have to come with a choice of ordered finite basis and that arrays don't have to hold scalars from a field. But if you're already happy with the conflation of "vector" and "array", nothing new happens when you then speak of multidimensional arrays as "tensors"; it's just continuing the same conflation.
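
To make that correspondence concrete, a minimal NumPy sketch (my illustration): a bilinear map with coefficient matrix M, evaluated on a pair of vectors, agrees with the linear functional it induces on their outer product:

    import numpy as np

    rng = np.random.default_rng(0)
    u, v = rng.random(3), rng.random(4)
    M = rng.random((3, 4))  # coefficients of a bilinear map B(u, v)

    bilinear = u @ M @ v                      # B evaluated directly
    linear = np.tensordot(M, np.outer(u, v))  # the induced linear map on u (x) v
    print(np.isclose(bilinear, linear))       # True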


> But if you're already happy with the conflation of "vector" and "array"

Hmm, no, I am not happy with that. All I'm saying is that while std::vector is a bad name, most people can immediately get that it has an array-like memory layout. So there is at least some merit in this inaccurate name. This is not the case with "tensor", a term that most programmers had never heard of (at least before the advent of TensorFlow).


DenseMatrix. Also, tensor implies something about how to interpret the data that is not usually appropriate.


ndarray or narray sounds good to me.


The most abstract mathematical definition of a tensor I know of is as a multilinear map of some kind.

The tensor data structures are of course just multidimensional arrays, but so are tensors in physics computations. Complex numbers are usually stored as pairs of floats, just like two-dimensional vectors. It's the extra structure and operations that define what they are.
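
A tiny sketch of that point (my example): a pair of floats only becomes a complex number once you impose complex multiplication on it:

    # The pair is just data; the multiplication rule is the extra structure:
    def cmul(a, b):
        # (a0 + a1*i) * (b0 + b1*i) = (a0*b0 - a1*b1) + (a0*b1 + a1*b0)*i
        return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])

    print(cmul((1.0, 2.0), (3.0, 4.0)))  # (-5.0, 10.0), i.e. -5 + 10i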

I think the "tensor" in TensorFlow comes from tensor network theory from the '80s. https://en.wikipedia.org/wiki/Tensor_network_theory


I ain't saying that deep learning is unrelated to tensor algebra. E.g. low-rank approximation of a matrix can be reformulated as a tensor problem. And the following discussion on Cross Validated suggests that some deep learning researchers do try to do something clever using true tensors:

https://stats.stackexchange.com/questions/198061/why-the-sud...

But my point is that the current "tensor" nomenclature in deep learning software has little to do with tensors in physics or mathematics. The name is just confusing if not misleading.


What uses of MDArrays in deep learning don't correspond to tensor manipulations? Many aspects of deep learning theory are discussed in a coordinate-free manner as tensors or multilinear maps, with the MDArray simply referring to the coordinate representation. The operations on the MDArray then refer to a change in representation of the underlying linear-algebraic object, or a discretized version of a linear-algebraic operation on the underlying object. Hence I find the naming entirely appropriate.
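
For instance (my sketch, not the commenter's): a fully connected layer, written in coordinates, is just a contraction of the input array with the array representing a linear map:

    import numpy as np

    x = np.random.rand(32, 784)       # coordinates of a batch of 32 input vectors
    W = np.random.rand(784, 10)       # coordinate array of a linear map
    y = np.einsum('bi,io->bo', x, W)  # contract over the shared index i
    print(y.shape)                    # (32, 10)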


> What uses of MDArrays in deep learning don't correspond to tensor manipulations?

That's news to me. All genuinely tensor-related operations in machine learning that I was aware of were about dimension reduction (tensor decompositions or low-rank tensor approximations). Maybe things have changed and I stand corrected.


Remember that NNs do inference forward, but learning happens backwards.
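
The backward pass is built from the same coordinate contractions as the forward pass; a minimal NumPy sketch (my illustration) for a linear layer y = x @ W:

    import numpy as np

    x = np.random.rand(32, 784)
    W = np.random.rand(784, 10)
    grad_y = np.random.rand(32, 10)             # upstream gradient dL/dy

    grad_x = np.einsum('bo,io->bi', grad_y, W)  # dL/dx = grad_y @ W.T
    grad_W = np.einsum('bi,bo->io', x, grad_y)  # dL/dW = x.T @ grad_y
    print(grad_x.shape, grad_W.shape)           # (32, 784) (784, 10)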


In many cases, technology is no different from those people who run conferences for "real estate advice": the advice is incredibly simple and straightforward, but I don't get to charge you $2000 a ticket if I can explain it to you in the elevator.

Every time I try to learn machine learning, I run into this problem. Nobody actually tells you how it works (because it's actually pretty straightforward); they want you to attend their course or purchase their book.


If you're interested in a very practical, explain-it-clearly, no-marketing resource for deep learning, I can highly recommend https://www.fast.ai/

I did the first version of the course, and I felt that at the end I legitimately knew enough to read current papers and see through any hype presented to me anywhere else.



