Feedback #3
The description says you collect feedback, but doesn't specify how it should be provided, so I should open an issue ... I guess?
First, nice job @zdevito! torchdim looks very promising; in particular, indexing looks very friendly.
Unforeseen axes
Curious how you plan to implement operations that introduce a new axis, like boolean indexing or bincount / unique / set-like operations.
One possible way would be to return a new axis object along with the result, but this has issues:
```python
x1, axis1 = bincount(x)
x2, axis2 = bincount(x)
x1 + x2  # do they have different axes or the same axis?
```

This can be solved by adding one more argument or by allowing manual 'coalescing' of axes.
Concatenation / chunking of named axis
Again, curious about your thoughts here.
Multi-axes
Cases where a single function should deal with tensors of several possible dimensionalities are frequent.
Potentially you can leave those problems to positional axes, but I'd recommend exploring the direction of multi-axes:
```python
(Q[b, qaxes, [head, c]] * K[b, kaxes, [head, c]]).sum(c).order(b, *qaxes, *kaxes, head)
# * not allowed in indexing
(Q.index(b, *qaxes, [head, c]) * K.index(b, *kaxes, [head, c])).sum(c).order(b, *qaxes, *kaxes, head)
```

This has a very 'pythonic' look; under the hood, iterating over a multi-axis would yield a single helper object, which would designate the position among the other axes.
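The 'iterating yields helper objects' idea can be sketched with a toy class. The names `Dim` and `MultiDim` are illustrative, not torchdim's API:

```python
class Dim:
    """Hypothetical first-class dimension."""
    def __init__(self, name):
        self.name = name

class MultiDim:
    """Hypothetical multi-axis: a pack of axes of dynamic length.
    Unpacking it (*qaxes) yields per-slot helper objects that
    designate their position among the pack's axes."""
    def __init__(self, name, n):
        self.dims = [Dim(f"{name}{i}") for i in range(n)]
    def __iter__(self):
        return iter(self.dims)

qaxes = MultiDim("q", 2)
print([d.name for d in qaxes])  # each element stands for one slot of the pack
```

A function written against `*qaxes` would then work unchanged for any dimensionality of the pack.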
Delayed computations
It is a super-clever trick to delay multiplication until a possible summation follows, but making it a single operation is more predictable:
```python
x = a * b
result = (x * c).sum(i, j)  # here einsum-ification probably happens
x + 1  # user actually expected that one to be materialized
```

Just placing that in a function does not look worse to my eye, but I'm open to other opinions.
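Such an explicit `sum_product` can be sketched over plain arrays with `np.einsum`. The signature is illustrative; with first-class dims the `axes` labels would be carried by the dims themselves rather than passed in:

```python
import numpy as np

def sum_product(sum_axes, *tensors, axes):
    """Illustrative sum_product: tensors[k] is labeled by the axis names
    in axes[k]; names listed in sum_axes are contracted, all others kept."""
    kept = []
    for labels in axes:
        for name in labels:
            if name not in sum_axes and name not in kept:
                kept.append(name)
    spec = ",".join("".join(labels) for labels in axes) + "->" + "".join(kept)
    return np.einsum(spec, *tensors)

a = np.ones((2, 3))
b = np.ones((2, 3))
c = np.ones((2, 3))
r = sum_product(["i", "j"], a, b, c, axes=[["i", "j"]] * 3)
# r == 6.0: the elementwise product a*b*c summed over both i and j
```

The point is that the contraction happens at one explicit call site, so nothing is left lazily unmaterialized.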
```python
sum_product([i, j], a, b, c)
```

Calling functions
```python
batch, inputs, hidden, classes = dims(4)
print(loss(w1[inputs, hidden], w2[hidden, classes], images[batch, inputs], labels[batch]))
```
Can you provide a more complete example here? It is unclear how the loss function can take a matmul of images and w1, because it needs to sum over the hidden variable, but it was not passed to the function.
More broadly, there should be some contract for how the callee interprets its inputs (from this example it seems it deals only with non-named axes, and the behavior of named axes is left to the calling function, but maybe I misunderstand). More examples would be very helpful here.
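For concreteness, here is one possible reading of that contract, written over plain arrays: the callee works purely positionally, and all the summations it needs happen inside via matmuls. The two-layer form and all shapes are my assumptions, not the author's:

```python
import numpy as np

def loss(w1, w2, images, labels):
    """One guess at the elided loss body, purely positional.
    Assumed (hypothetical) shapes: w1 (inputs, hidden), w2 (hidden, classes),
    images (batch, inputs), labels (batch,)."""
    hidden = np.maximum(images @ w1, 0)   # (batch, hidden): matmul sums over inputs
    logits = hidden @ w2                  # (batch, classes): matmul sums over hidden
    logits = logits - logits.max(axis=1, keepdims=True)
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

w1 = np.zeros((4, 3))
w2 = np.zeros((3, 2))
images = np.ones((2, 4))
labels = np.array([0, 1])
# zero weights give uniform logits, so the loss is log(num_classes) = log 2
print(loss(w1, w2, images, labels))
```

Under this reading the named axes only matter to the caller; the open question above is whether that is actually the intended contract.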
Interaction with deep learning blocks
Can you explain how DL operations (e.g. convolution) would handle named dimensions (and whether they would)?
Add Dims context manager
```python
with dims(6) as (h2, w2, c, b, h, w):
    <computations>
```
The suggestion may sound a bit strange, but here is the rationale:
If you don't have an axis object, you can't manipulate it, so the whole tensor becomes non-manipulable.
I expect users would commonly return created tensors without order-ing them first, and then deal with downstream problems (since those will look like scalars to outer code, they will not error out in most operations, and users will then chase the skipped order).
Exiting the context manager should deallocate all tensors that use axis objects created within it => more efficient memory management almost for free, plus in a large number of cases you can point the user to the problem immediately.
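A minimal sketch of what such a context manager could look like, with a hypothetical `Dim` and an `alive` flag standing in for actual deallocation:

```python
from contextlib import contextmanager

class Dim:
    """Hypothetical first-class dimension; `alive` tracks validity."""
    def __init__(self, name):
        self.name = name
        self.alive = True

@contextmanager
def dims(n):
    """Hypothetical dims context manager: creates n axis objects and
    invalidates them on exit, so tensors that escaped without order-ing
    can be caught (and their storage reclaimed) immediately."""
    ds = [Dim(f"d{i}") for i in range(n)]
    try:
        yield ds
    finally:
        for d in ds:
            d.alive = False  # tensors bound to d could be deallocated here

with dims(2) as (h, w):
    assert h.alive and w.alive
assert not h.alive and not w.alive  # escaped axes are detectably dead
```

Any operation on a tensor whose axes are no longer `alive` could then raise immediately at the escape point, rather than failing far downstream.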
Using Better Terminology
'Flattening and Splitting Dims'. Neither term is suitable in this context. Yes, those are torch ops, but they become inappropriate as you move from discussing old-style ops to operations that are focused on axes. For instance, the phrase 'flatten the dimensions' does not make sense, as dimensions/axes are already flat.
Einops uses the terminology 'composition and decomposition of axes', because 1) it is obvious that when you compose you get fewer axes, 2) it hints that the original content is preserved, 3) wording: decomposition reverses composition, even kids know that (compare that to flatten vs split dims), 4) you can refer to a 'composed axis' and 'composing axes', which is helpful in discussing code. Let's use this better terminology.
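The terminology can be illustrated even with plain reshape, which is what composition and decomposition lower to:

```python
import numpy as np

x = np.arange(6).reshape(2, 3)        # axes: h, w
composed = x.reshape(6)               # compose h and w -> one axis, fewer axes
decomposed = composed.reshape(2, 3)   # decomposition reverses composition
assert (decomposed == x).all()        # the original content is preserved
```

Composing 'h' and 'w' gives one composed axis; decomposing it by the same sizes recovers the original tensor exactly, which is the round-trip property the wording is meant to convey.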