
A few responses here, with the intent of helping you leave the language 
study with a few more questions answered.  I agree that most of the things 
that you point out could be better documented/rationalized.

Thanks for your thoughts -- we really appreciate them,
-Brad

------------

> * I know we discussed this in class, but while writing my toy program, I 
> could not keep straight the difference between calling functions with [] 
> versus ().  Most of the time, it didn't matter -- most of my functions were 
> one-argument anyway, but I was confused by the documentation about 
> string.substring:  you show both foo.substring(i) and foo.substring[i], but 
> only foo.substring[i..j], so I wasn't ever sure whether ranges were treated 
> as single variables or whether they were zipped over...

Here's how I remember the zipper vs. tensor semantics -- it's based 
strongly on how the choice of syntax fell out:  Our belief was that for a 
function call with two promoted arguments -- say, exp(A, B) -- the natural 
interpretation seemed to be to zipper the arguments and create a result of 
the same shape (e.g., exp(A(1), B(1)), exp(A(2), B(2)), ...).  Meanwhile, 
array slicing notation, which is a tensor product, more traditionally uses 
square brackets -- A[2:n-1, 2:n-1] in Fortran, for example.

Stylistically, when I'm slicing an array, I almost always use square 
brackets, even if its a 1D slice like A[i, 1..n].  This is equivalent to 
A(i, 1..n) of course, but I tend to use the square brackets because it 
looks more like a slice to me.

I suspect that the author of the substring code (which may have been me, 
though I don't think it was), was falling into a similar mode, thinking of 
a substring as being a slice of a string.  At some point, I think we even 
supported substring slicing via the indexing function, so this may have 
been a leftover from that conversion.  All that said, I suspect that I'd 
probably write it using parenthesis today now that it looks more like a 
method and I don't think of it as being a promotion.  Do you remember 
where you saw that?  Language spec or example codes?


> * The semantics of begin, cobegin, forall and coforall need some 
> clarification.  For instance, the spec should say explicitly something like, 
> "Coforall <cond> { <body> } is sugar for forall <cond> { begin { <body> } }", 
> rather than just the English prose it currently has.

Agreed.  This makes sense to me.  I'd probably describe it less in terms 
of sugar and more interms of "semantically equivalent to", since that 
suggests that we might implement it intelligently (using a tree of spawns 
rather than a linear sequence of them, e.g.).  Sugar implies strongly to 
me that it is probably just implemented by making the conversion you 
specified.

Note that the sugar you give above is incomplete because it doesn't 
capture the join at the bottom of the forall loop.  The rewrite of this 
would probably involve capturing some sort of termination condition into a 
sync variable and waiting after the loop for all the sync variables to be 
filled.  The tedium of writing this common idiom all of the time was part 
of the motivation for introducing the coforall (along with the observation 
that we could probably do the spawn and joins more efficiently for a large 
group of tasks than a beginning or lazy user would bother to.


> * I ran across a bug in the implementation of coforall loops where if your 
> iteration variable has structure (a tuple or something), the current compiler 
> doesn't privatize the destructured values properly.  Combining the first and 
> third points above, why can't you say "'for <tuple> in <2+D domain> { <body> 
> }' is sugar for 'for temp in <2+D domain> { const <tuple> = temp; <body> }'. 
> In all cases, temp is locally scoped to the current iteration of the loop." 
> Then your compiler bug goes away, because your semantic support for loops is 
> simplified.

As I understand it, this rewrite is essentially what we do, but a 
"privatize me" flag on the original variables wasn't getting propagated to 
the temporary variables appropriately, causing the temps to be allocated 
on the stack rather than the heap in the generated code, which is what 
resulted in having multiple threads share the values.  So yes, we think in 
terms of these rewrites, but that doesn't imply that the compiler will be 
bug-free.  Steve's already working on the fix for this.


> * I'd like to hierarchically define domains.  That is, given domains D1, D2, 
> I want to define var D3 = [D1, D2] and have it be the cartesian product of 
> them.  It's of no big deal to me whether you flatten those domains (e.g. if 
> D1 and D2 were a 1-D and a 2-D domain, whether D3 would be a standard 3-D 
> domain or a 1D x (2-D) domain) with coordinates ((x), (y, z)).

We do too.  This is on the todo list.


> * File IO is sorely lacking :-p

Agreed.


> * Idiomatically, I wanted the ability to construct lists of values and fold 
> over them.  The best I came up with was to define sparse domains and add 
> elements to them, then iterate over indices in the sparse domain.  That seems 
> rather "meta" -- I want a list of those values, not have those values be 
> indices to some other array.

I'd suggest thinking of the domains truly as 1st class index sets, and not 
simply a means of creating an array.  For example, I use associative 
domains to implement sets of values all the time, and never allocate 
arrays over them.  Along those same lines, I think sparse arithmetic 
domains are a very reasonable way to store a sorted list of elements from 
a bounded set of indices, even if they're never used to implement an 
array.

All that said, it is our intention to support a standard list class in the 
standard libraries (and in fact, I think we have a first draft already 
there -- I'm not sure if it's documented).  We used to have a sequence 
type in the language, but it was one thing that's been pruned out of the 
language and put into the library with the goal of making the language 
smaller.


> * Also, once I'd defined a sparse domain, and an array using it, every time I 
> added an element to the domain that array must be reallocated, no?  I'd like 
> a way to say, "Here's a dense domain, and a sparse domain over it, and an 
> array over *that*; now hang on a sec while I initialize the sparse domain... 
> ok, *now* go ahead and allocate that array once and for all, thanks."

Does anything prevent you in this case from moving the declaration of the 
array after the sparse domain's initialization?  If not, I'd suggest doing 
that -- declare the dense domain, the sparse domain, initialize the sparse 
domain, now declare the sparse array.  That way, the allocation happens in 
one fell swoop as you'd like.

In any case, I believe an important compiler optimization for sparse 
domains/arrays is going to be to batch up reallocations like this on the 
user's behalf (and this is the sort of thing that fits into that second 
"optional interface" category of a distribution -- a sparse implementation 
that wants to support this kind of optimized addition would support it, 
but there would be no requirement to support modification of more than one 
index at a time.

Even with such an optimization, as a distrustful user, I'd still move the 
array declaration after the sparse domain's initialization, if I had the 
power to do so.

Or, can the sparse domain be initialized on its declaration line rather 
than after its declaration (using an iterator, e.g.)?  This would be the 
ideal way to do it since it would allow the sparse domain to be declared 
const, which should result in other optimization and readability benefits.


> * There doesn't seem to be an easy way to re-order iterators, or if there is 
> I didn't understand it clearly.  For instance, given the iteration [1..2, 
> 1..4], which yields the pairs (1,1), (1,2), (1,3), (1,4), (2,1), (2,2), 
> (2,3), (2,4) in that order, I'd like a simple way to "take the transpose" of 
> that and get column major order.  If I'd had that, then I wouldn't have been 
> bitten by the coforall bug above -- I could have used a forall loop instead. 
> :)  I suspect I could work around this by having done forall (y,x) in [1..4, 
> 1..2], which swaps the variables and also the iterator, but that feels 
> not-clean, somehow.

If you're doing a forall loop then by definition you're abdicating control 
over the iteration order of the loop (and since our forall loops are 
currently sequential, you're going to get our default iteration order of 
RMO in practice).  You can regain this control by authoring the 
distribution for the domain over which you're iterating, in which case 
you're specifying the implementation of forall loops.

A lighter-weight way to do this would be to simply write your own 
standalone iterator:

 	def CMO(D: domain(2)) {
 	  for i in D.dim(2) {
 	    for j in D.dim(2) {
 	      yield (i,j);
 	    }
 	  }
 	}

and invoke it:

 	for (x,y) in CMO([1..2, 1..4]) {
 	  ...
 	}

This looks simple because it's all sequential -- making it parallel would 
be harder, but given that you have some pretty heavy constraints on the 
order of the parallelism, that seems appropriate.  Implementing parallel 
wavefront style loops has similar issues -- one technique is to use an 
array of sync variables to control when each iteration can legally fire.



> * The syntax isn't well specified in the spec, I think due to its change over 
> time.

Hmm, I don't think it's changed that much since the spec was written. 
Could you clarify for me in what way it seems poorly specified and how it 
could be improved?  Are you referring to the syntax "diagrams" or the 
supporting text that describes them?

> But it wasn't clear why you had an optional "then" keyword, for 
> instance; pick one syntax (e.g. C's, since you're very close to it already) 
> and stick to it.

For single-statement conditionals, there's a syntactic ambiguity due to
Chapel's use of square brackets for both function arguments and forall
loops.  For example:

 	if A [D] write("hi");

Could be:

 	if A[D] {
 	  write("hi");
 	}

or:

 	if A {
 	  [D] write("hi");
 	}

So optional keywords like "then" (and its partners in crime) are used to 
support the single statement cases of control flow:

 	if A[D] then write("hi");

 	if A then [D] write("hi");

Since the single statement could itself be a compound statement, this
permits:

 	if A then { [D] write("hi"); write("there"); }

but of course, the then isn't strictly required in this case.  The result 
is if you use curly brackets religiously, you'll end up being very C-like. 
If you like to avoid curly brackets in single-statement cases, you need to 
use the keywords.

This rationale should be captured in the spec if it isn't already.

If you have thoughts about getting rid of the then altogether, that would 
of course be of interest.



> * There's a heavy learning curve, I think, which is aggravated by the lack of 
> examples.  A lot of the code I wrote was written via copy-paste-modify, and I 
> wasn't always sure what syntax was critical and what was style and what was 
> optional.

Are you talking about examples in the spec (which I'd completely agree 
with), or examples in the release itself (which I could also agree with, 
but would need some suggestions about what examples you'd like to see 
included, or how the existing examples could be made more instructive -- 
we're happy to add more, but don't want to overwhelm anyone by providing 
more than they can wrap their heads around).



