Mochabot log - CommonJS IRC channel: #commonjs on irc.freenode.net

2010-01-24:

[0:40] <inimino> immutable ByteStrings mean that equality on byte strings at least has the possibility of being defined sensibly as it is for strings and numbers
[0:40] <inimino> I think that's worth keeping
[0:41] <inimino> if you're only going to have one, lose the mutable variant
[1:09] <Dantman> FWIW, Buffer doesn't have the issues ByteArray has in terms of []
[1:15] <ashb> inimino: mutable in length or mutable in content or both?
[2:24] <inimino> ashb: both
[2:25] <ashb> they are of minimal use if they are RO
[2:25] <ashb> you need some form of writable binary buffers
[3:14] <Wes-_> ondras: there are performance and logical advantages to having immutable types
[3:16] <Wes-_> inimino: you can't lose the mutable variant, it's too useful for a billion different reasons
[3:17] <inimino> Wes-_: very likely true
[3:19] <inimino> Wes-_: but if I had to pick just one, I wouldn't have any hesitation
[3:19] <Wes-_> inimino: Really? I'd pick the opposite one
[3:19] <Wes-_> OTOH, I've worked with multi-megabyte mutables
[3:19] <inimino> well, we seem to do fine without mutable strings
[3:19] <Wes-_> I'd hate to have to copy them every time I want to change a byte
[3:19] <inimino> well, you'd just use something more efficient in that case, like an array
[3:20] <Wes-_> inimino: mutable strings are a bad argument
[3:20] <Wes-_> one of the big performance problems with ie6 javascript is immutable strings
[3:20] <Wes-_> so that innerHTML + "crap" is expensive
[3:20] <Wes-_> the other browsers *cheat*
[3:20] <inimino> what's wrong with cheating?
[3:20] <inimino> optimization is all about cheating and not getting caught
[3:21] <Wes-_> Mozilla's implementation looks at the string, checks for references, extends the memory allocation if it's modified
[3:21] <Wes-_> inimino: we're bound by not wanting to modify the JS interpreter
[3:21] <Wes-_> at least I am
[3:21] <Wes-_> And I'm not sure you can cheat without doing that
[3:22] <inimino> ah, well, I'd rather design a long-term nice API and let interpreters make it fast in time
[3:22] <Wes-_> and in the short term... how do you go about working with mutable byte things?
[3:22] <inimino> but anyway, we'll probably have mutable and immutable variants and everyone will be happy (except minimalists)
[3:23] <inimino> Wes-_: well, depends on the use case
[3:23] <Wes-_> let's say a 1 GB memory mapping where you want to change the 7th byte
[3:23] <inimino> you can always split it up
[3:24] <inimino> it's probably coming in from somewhere in chunks anyway, just hang on to them
[3:24] <Wes-_> letting the kernel virtualize memory mappings is bound to be far saner IMO
[3:25] <inimino> by memory mappings
[3:25] <inimino> are you talking about exposing a mmap'd file directly to JavaScript?
[3:25] <Wes-_> sure, why not?
[3:25] <Wes-_> it's very convenient
[3:26] <inimino> well, for that, I'd want a mutable or a different API altogether
[3:26] <Wes-_> The only tricky part, frankly, was deciding on a sane syntax to get ByteArrays without tying the mmap code to binary/b
[3:26] <Wes-_> Hmm, true, I suppose I could use a different data type
[3:27] <Wes-_> But ByteArray does the job ATM
[4:15] <Wes-_> Anybody have any fs-base tests yet?
[8:29] <ondras> Wes-_: so far, I do not see any advantage of having an immutable type
[8:29] <ondras> Wes-_: and as far as gpsee implementation goes, in my opinion BA performs the same as BS
[8:30] <ondras> because they both share the same low-level storage mechanism - the byteThing (sp?)
[9:46] <ondras> Wes-_: as for the efficient ByteString.slice() stuff, I believe that this can be realized with some kind of lightweight "range" wrapper, which represents [storage, start, end] triple and proxies all requests to it
[10:52] <kriskowal> ondras: the "range" wrapper is speced in both D and E
[10:52] <kriskowal> and C, i think
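
The "range" wrapper mentioned above, a [storage, start, end] triple that proxies reads to shared storage, might look roughly like the sketch below. The name ByteRange and its methods are hypothetical and are not taken from binary/b, GPSEE, or v8cgi; a plain array stands in for the low-level storage.

```javascript
// Hypothetical ByteRange: a lightweight view over shared storage.
// Nothing here comes from a real CommonJS proposal; it only illustrates the idea.
class ByteRange {
  constructor(storage, start = 0, end = storage.length) {
    this.storage = storage; // shared with other views, never copied
    this.start = start;
    this.end = end;
  }

  get length() {
    return this.end - this.start;
  }

  // Reads are proxied to the underlying storage with an offset.
  get(index) {
    if (index < 0 || index >= this.length) {
      throw new RangeError("index out of range");
    }
    return this.storage[this.start + index];
  }

  // slice() only builds another triple, so it costs the same for any size.
  // Unlike Array.prototype.slice, negative indices are simply clamped to 0.
  slice(from = 0, to = this.length) {
    const clamp = (n) => Math.min(Math.max(n, 0), this.length);
    const a = clamp(from);
    const b = Math.max(a, clamp(to));
    return new ByteRange(this.storage, this.start + a, this.start + b);
  }
}

const storage = [10, 20, 30, 40, 50];
const whole = new ByteRange(storage);
const middle = whole.slice(1, 4); // views 20, 30, 40
const inner = middle.slice(1, 2); // views 30
```

Because every view shares one storage object, this is only safe without copying if the storage is never mutated, or if callers accept that a write shows up in every view.
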
[11:18] <ondras> kriskowal: yeah, but they all specify both mutable + immutable types
[11:18] * ondras does not want both
[11:19] <kriskowal> would it be consequential to have one type and alias it as both names?
[11:19] <kriskowal> m, not a good idea
[11:19] <kriskowal> (it's getting late)
[11:19] <kriskowal> (0321)
[11:19] <ondras> kriskowal: I have ~95 of binary/b completed in v8cgi now
[11:19] <ondras> 95%
[11:19] <kriskowal> ah, no kidding
[11:20] <ondras> http://code.google.com/p/v8cgi/wiki/API_Binary
[11:20] <ondras> but I see two problematic areas with that now
[11:20] <kriskowal> we have most of it in narwhal too, but i think we're pretty flexible
[11:20] <ondras> 1) BS + BA seems too much to me
[11:20] <ondras> 2) there are several places which I find too php-ish
[11:21] <ondras> (too much ducktyping for instance)
[11:21] <kriskowal> water, water, everywhere?
[11:21] <ondras> well
[11:21] <kriskowal> not a drop to drink?
[11:22] <ondras> methods accepting many argument types
[11:22] <ondras> constructors accepting many argument types
[11:22] <ondras> ability to convert everything to everything with a single call
[11:22] <kriskowal> seems consistent with the aesthetic of the language
[11:22] <spoob> kriskowal; quick question. I see you co-wrote a file to bring various javascripts up to the ECMA standard for commonjs/narwhal. My question is whether it is safe to load this file in the client side as a general platform leveller
[11:23] <kriskowal> spoob, it should work. i haven't tested it
[11:23] <ondras> kriskowal: for instance, what is the purpose of specifying a BA/BS as a first argument to .indexOf() ?
[11:23] <ondras> and how should this behave?
[11:23] <kriskowal> and, it doesn't give you security. it just makes secure code not throw errors
[11:24] <kriskowal> search for a substring.
[11:24] <kriskowal> i forget who asked for that.
[11:24] <kriskowal> but it is consistent with string
[11:24] <ondras> ah
[11:24] <ondras> so the whole sub-bs/sub-ba must be found
[11:24] <kriskowal> yes
[11:24] <ondras> as opposed to .split()
[11:24] <ondras> where any of the requested bytes must be found?
[11:24] <kriskowal> ah, split.
[11:25] <kriskowal> that's probably what's written in binary/b. i'm not sure whether i've changed it.
[11:25] <ondras> okay
[11:25] <kriskowal> you're right that they should be consistent, and i'm of the opinion that it should be substring search
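
A minimal sketch of the substring-style search discussed here, where the whole needle sequence must match, as with String.prototype.indexOf. The function name indexOfBytes is hypothetical, and plain arrays stand in for ByteString/ByteArray.

```javascript
// Hypothetical helper: find the first position where every byte of `needle`
// occurs contiguously in `haystack`. Returns -1 when there is no such position.
function indexOfBytes(haystack, needle, from = 0) {
  if (needle.length === 0) {
    return Math.min(from, haystack.length); // mirrors "abc".indexOf("")
  }
  for (let i = from; i + needle.length <= haystack.length; i++) {
    let j = 0;
    while (j < needle.length && haystack[i + j] === needle[j]) {
      j++;
    }
    if (j === needle.length) {
      return i;
    }
  }
  return -1;
}

const found = indexOfBytes([1, 2, 3, 2, 3, 4], [2, 3, 4]);
const missing = indexOfBytes([1, 2, 3], [3, 2]);
```

This is the naive quadratic scan; a real implementation could use a faster algorithm, but the matching rule would be the same.
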
[11:25] <ondras> hmhm
[11:25] <ondras> extendLeft/Right, why these?
[11:25] <kriskowal> we can worry about having a binary equivalent of regex parsing some other day, imo
[11:26] <kriskowal> ashb asked for them
[11:26] <ondras> hm.
[11:26] <kriskowal> i think i dropped them in D or E
[11:26] <ondras> imho push/unshift just do the job
[11:26] <ondras> well, never mind
[11:26] <ondras> I will finish the implementation
[11:26] <ondras> and discuss the need for immutable type with others
[11:27] <kriskowal> there are strange issues with overloading push and unshift
[11:27] <ondras> ?
[11:27] <kriskowal> i mean, purely theoretical problems paralleling array
[11:28] <kriskowal> [1,2,3].push([4,5,6]) produces [1,2,3,[4,5,6]]
[11:28] <ondras> that is because array can store other arrays
[11:28] <kriskowal> [].push.apply([1,2,3],[4,5,6]) produces [1,2,3,4,5,6]
[11:28] <ondras> yes
[11:28] <kriskowal> so, the question is whether it would be wise to collapse both forms
[11:29] <ondras> and [].push(1,2,3) produces [1,2,3]
[11:29] <kriskowal> if we are using Array as a stand-in for ByteArray on the browser-side, which we might for something like an md5 algorithm, it would be unwise to use push in that way
[11:29] <kriskowal> of course, it would be unwise to use extend* since those don't exist :P
[11:29] <ondras> well, in my opinion, the behavior should match the need
[11:30] <ondras> you want to add something to binary array by appending it at the end -> you use push
[11:30] <ondras> binary.push([...]) logically collapses that
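
The two Array behaviours being compared, plus one possible "collapsing" push, can be sketched as follows. The helper pushBytes is hypothetical and only illustrates what a byte container's push could do; it is not the binary/b definition.

```javascript
// Plain Array.push stores the argument as a single nested element.
const nested = [1, 2, 3];
nested.push([4, 5, 6]); // nested is now [1, 2, 3, [4, 5, 6]]

// push.apply spreads the second array into separate arguments.
const flat = [1, 2, 3];
Array.prototype.push.apply(flat, [4, 5, 6]); // flat is now [1, 2, 3, 4, 5, 6]

// Hypothetical collapsing push: array arguments are appended element by
// element, because a byte container cannot hold a nested array anyway.
function pushBytes(target, ...items) {
  for (const item of items) {
    if (Array.isArray(item)) {
      target.push(...item);
    } else {
      target.push(item);
    }
  }
  return target.length; // same return convention as Array.prototype.push
}

const bytes = [1, 2, 3];
pushBytes(bytes, [4, 5], 6); // bytes is now [1, 2, 3, 4, 5, 6]
```

Code written against the collapsing form would behave differently if a plain Array were substituted for the byte container, which is the portability concern raised above.
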
[11:30] <kriskowal> in any case, i'm thinking of dropping push/unshift altogether
[11:30] <ondras> heh
[11:30] <kriskowal> doesn't make much sense to do those things with a fixed-width buffer
[11:31] <ondras> that is true
[11:31] <ondras> and that is also the reason why I personally would drop fixed-width / fixed contents stuff
[11:31] <ondras> well, I have to leave, good nite!
[11:31] <kriskowal> night
[11:32] <kriskowal> or morning, as it were
[11:32] <ondras> noon here
[11:32] <ondras> 12.34
[11:32] <ondras> detach
[17:34] <Wes-_> ondras: ByteStrings are somewhat more efficient than ByteArray in GPSEE ATM, but will be *much* more efficient when I'm done with them
[17:35] <Wes-_> ondras: big advantages to immutable types, BTW, are algorithmic, although you won't see as many in JS as languages with threads
[17:39] <ondras> Wes-_: I would be happy to accept that, but I need some examples/explanations for these statements
[17:39] <ondras> please excuse my lameness, but so far I truly do not see any advantage of BS above BA
[17:42] <Wes-_> ondras: real debate is not what's on .prototype anyhow, it's mutability AFAIC (prototypes are almost completely exchangeable)
[17:42] <Wes-_> with immutable objects, you can pass around references to objects, without worrying about other functions mutating them on you
[17:42] <Wes-_> also, *proper* immutables are directly equal, i.e. === when contents are same
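
As a rough illustration of the "pass references without worrying" point, the sketch below uses Object.freeze on a plain array as a stand-in for an immutable byte string. That substitution is an assumption for illustration only; binary/b does not define ByteString this way.

```javascript
// A frozen array standing in for an immutable byte string (assumption).
const shared = Object.freeze([1, 2, 3]);

// A callee that tries to modify the data it was handed.
function tamper(bytes) {
  "use strict"; // in strict code, writing to a frozen object throws
  try {
    bytes[0] = 99;
    return "mutated";
  } catch (e) {
    return e instanceof TypeError ? "rejected" : "other error";
  }
}

const outcome = tamper(shared);
```

The caller keeps its guarantee without making a defensive copy, because the write is rejected by the object itself and does not depend on the callee's good behaviour.
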
[17:43] <Wes-_> binary/b doesn't mandate this, probably because it's hard to implement
[17:43] <ondras> Wes-_: I understand that there are situations where you want to pass around reference without having it modified.. but is it not your responsibility to not modify the data?
[17:44] <Wes-_> ondras: that's kind of a short-sighted approach, especially with multi-thread concurrent systems
[17:44] <ondras> hm
[17:45] <ondras> this is not my area of expertise
[17:45] <ondras> so I cannot provide any arguments here
[17:45] <Wes-_> yeah, it's really a CS question
[17:45] <Wes-_> not my area either
[17:45] <ondras> but I strongly believe that some kind of lock may surely solve this
[17:45] <Wes-_> locks are always a performance killer
[17:45] <ondras> as for the ===, I recall someone saying that modifying this one is not possible in all engines
[17:45] <ondras> (and == is not possible ATM in v8 as well)
[17:46] <Wes-_> Actually, it IS possible in all of them, IIRC
[17:46] <Wes-_> The thing is, you can't do it the cheating way
[17:46] <ondras> two or three?
[17:46] <Wes-_> i.e. operator overloading
[17:46] <Wes-_> three
[17:46] <ondras> wow
[17:46] <Wes-_> so, what you have to do is intern all your bytestrings
[17:46] <Wes-_> that means, any time you new a ByteString, you look it up in a table
[17:46] <Wes-_> if your program has ever seen that bytestring before, you return a pointer to THAT one
[17:47] <ashb> that seems as expensive as locks
[17:47] <Wes-_> So, it's not always a brand-new object
[17:47] <ondras> Wes-_: hm, but this works only in the case of immutable objects
[17:47] <ondras> so it is not a generally working approach
[17:47] <Wes-_> ashb: it's not cheap, but it's a one-time lookup per instance
[17:47] <ashb> true
[17:47] <Wes-_> ashb: and it's probably about the price of malloc anyhow
[17:48] <Wes-_> ondras: That's one of the defining characteristics of properly implemented immutables (interning)
[17:48] <ondras> (btw: is there some change in es5 regarding "return" in construct calls?)
[17:48] <Wes-_> You can also same on memory, BTW, by de-interning with the garbage collector, instead of holding weak references
[17:49] <Wes-_> ondras: not as far as I know, why?
[17:49] <Wes-_> s/same memory/save memory/
[17:49] <ondras> Wes-_: I was just interested... the "can return only typeof()=='object'" behavior is something unorthodox and interesting
[17:50] <Wes-_> ondras: that's ES3 behaviour too
[17:50] <ondras> yeah
[17:50] <ondras> the only one I know about
[17:50] <Wes-_> ondras: you can't return anything in the constructor other than the object that's being constructed
[17:50] <ondras> no
[17:50] <ondras> that is not true :)
[17:50] <ondras> you can return anything which is typeof object
[17:50] <ashb> yeah - i've returned random functions from a ctor and it works fine
[17:50] <ondras> "functions" ?
[17:50] <Wes-_> Interestingly, that means it might not be possible to make interned bytestrings in the browser
[17:51] <Wes-_> ...
[17:51] <Wes-_> REALLY?
[17:51] <ashb> http://github.com/ashb/juice/blob/cde8b8ec2a5da3b017845904e142b3fe77f6a5c0/lib/juice/application.js#L6-15
[17:51] * Wes-_ goes to test
[17:51] <ashb> and that is called with new Application( module );
[17:52] <Wes-_> son of a bitch!
[17:52] * Wes-_ learns something new today
[17:52] <ashb> of course should test that in something other than spidermonkey :)
[17:52] <ondras> ashb: ah, maybe because function(){} instanceof Object
[17:52] <ashb> ondras: yeah probably
[17:52] <Wes-_> Hmm, but I can't return an intrinsic, makes sense...
[17:52] <ashb> the only thing that won't be is unboxed primitives
[17:52] <ondras> ashb: it works in v8 as well
[17:52] <ashb> cool
[17:52] <ashb> thats all i care about then
[17:53] <ondras> let me just lookup it in ecma262
[17:54] <Wes-_> So, yeah, ByteString === doesn't seem possible in browser
[17:54] <Wes-_> but I am convinced it is possible everywhere else
[17:55] <ondras> Wes-_: why not? new ByteString() can return some other object
[17:55] <ashb> would cause a caching issue tho
[17:55] <Wes-_> ashb: caching issue?
[17:55] <ashb> cache bloat
[17:55] <Wes-_> ondras: Hmm - playing
[17:56] <Wes-_> ashb: that's what gc is for
[17:56] <ashb> since no weak refs in browsers
[17:56] <ondras> afk
[17:56] <ashb> yeah but to return an existing object you have to hold a reference to it
[17:56] <Wes-_> If you don't have a reference to the interned object, you can free it
[17:56] <Wes-_> OIC
[17:56] <Wes-_> yeah, in browsers that will be an issue
[17:56] <Wes-_> *Hmm*
[17:56] <Wes-_> stupid browsers
[17:56] <Wes-_> ES-next is looking at weak refs, FWIW
[17:57] <Wes-_> as a strawman, anyhow IIRC
[17:57] <ashb> yeah
[17:57] <ashb> think its being discussed at this meeting or the next
[17:57] <Wes-_> okay -- ondras you're right
[17:58] <Wes-_> you *can* return interned objects from the browser
[17:58] <Wes-_> but ash's issue is key: they will bloat
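
A minimal sketch of interning in plain JavaScript, assuming a content-keyed lookup table and a constructor that returns the previously seen instance. The names internTable and ByteString are illustrative only, and the string key is a simplification.

```javascript
// Strong-reference intern table: content key -> canonical instance.
// Entries are never removed, which is exactly the bloat problem noted above.
const internTable = new Map();

function ByteString(bytes) {
  const key = Array.from(bytes).join(",");
  const existing = internTable.get(key);
  if (existing) {
    return existing; // `new` yields this object instead of a fresh one
  }
  this.bytes = Object.freeze(Array.from(bytes));
  Object.freeze(this);
  internTable.set(key, this);
}

const a = new ByteString([1, 2, 3]);
const b = new ByteString([1, 2, 3]);
const c = new ByteString([1, 2, 4]);
// a === b holds because both names refer to the same interned object;
// a === c does not, because the contents differ.
```

Since the Map holds a strong reference to every instance ever created, nothing is collectable while the table lives; avoiding that needs weak references or collector support, which is the gap discussed in the log.
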
[18:07] <ondras> re
[18:07] * ashb really dislikes how tightly the 'in' operator binds
[18:07] <ashb> i always want to right if (! x in y )
[18:08] <ashb> er ,write
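
The precedence trap can be shown in a few lines. Unary `!` binds tighter than `in`, so `!x in y` is parsed as `(!x) in y`; the variable names below are just for illustration.

```javascript
const y = { a: 1 };
const x = "b"; // not a key of y

// Parsed as (!x) in y: !x is false, so this asks whether "false" is a key of y.
const unparenthesized = !x in y;

// What was actually meant: is x absent from y?
const parenthesized = !(x in y);
```

Here `unparenthesized` is false and `parenthesized` is true, so the two forms disagree for exactly the case the check is supposed to catch.
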
[18:08] <ondras> If Type(result) is Object then return result.
[18:08] <ondras> ha
[18:08] <ondras> this is what ecma262 says
[18:09] <ondras> this probably corresponds to instanceof
[18:09] <ondras> okay then
[18:09] <ondras> returning functions is ok
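
The constructor-return rule quoted from ECMA-262 can be checked with a short sketch; the constructor names are made up for illustration. A returned object or function replaces the newly constructed object, while a returned primitive is ignored.

```javascript
function ReturnsFunction() {
  return function greet() { return "hi"; };
}

function ReturnsObject() {
  return { tag: "other" };
}

function ReturnsPrimitive() {
  this.tag = "constructed";
  return 42; // not an Object, so `new` discards it and yields `this`
}

const f = new ReturnsFunction();
const o = new ReturnsObject();
const p = new ReturnsPrimitive();
```

Note that `o instanceof ReturnsObject` is false here, since the returned object never had the constructor's prototype in its chain.
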
[18:09] <ondras> hmmm!
[18:10] <ondras> never the less
[18:10] * ondras is still not convinced about the usefullness of immutable binary data type
[18:10] <ondras> *usefulness
[18:13] <ashb> optimization of a form
[18:13] <ashb> but yes, mutability is the first thing i want
[18:19] <ondras> Wes-_: btw, as soon as I finish with binary/b (very soon), I will start looking into a ffi module :)
[18:20] <Wes-_> ondras: I think it would be excellent if you stole lots of my code
[18:20] <ondras> :)
[18:20] <ondras> actually, the gpsee implementation of ffi looks rather complex on first sight
[18:21] <Wes-_> It does a lot
[18:21] <Wes-_> and spidermonkey's API doesn't help there
[18:21] <ondras> hm
[18:21] * ondras still believes that it should be very easy to do in v8
[18:21] <ondras> including the .finalizeWith trick
[18:21] <Wes-_> GPSEE's FFI spends a significant hunk of its code volume in two places: argument coercion, and making sure ~ 100% of POSIX is reflected properly
[18:22] <ondras> but it is very possible that I will change my opinion once I start actually coding something
[18:22] <Wes-_> Also, handling structs
[18:22] <Wes-_> And casting from one type to another
[18:22] <Wes-_> shit, there's a lot of stuff in there
[18:22] <ondras> :)
[18:22] <Wes-_> It didn't seem like much when I wrote it!
[18:23] <Wes-_> ondras: an interesting, and doable approach, would be a straight-up port
[18:23] <ondras> hmhm
[18:23] <Wes-_> the spidermonkey-specific parts are argument coercion and defining properties on objects
[18:23] <ondras> the thing is that I also want to actually *learn* something about the whole ffi stuff
[18:24] <Wes-_> (most of those definitions are lazy as well)
[18:24] <Wes-_> ondras: Don't worry, there is no way in hell you can make it work without understanding it :)
[18:24] <ondras> hehe
[18:24] <ondras> the iconv code, for instance, could have been taken as a whole iirc
[18:24] <Wes-_> You really do need to understand your underlying engine *and* how FFI works
[18:24] <ondras> (but I refactored it completely)
[18:25] <Wes-_> ondras: That code needed it anyhow
[18:25] <Wes-_> binary/b is really quite messy
[18:25] <ondras> :)
[18:25] * ondras still thinking about his own binary proposal
[18:25] <ondras> with one, simple Binary object
[18:25] <ondras> with factory methods
[18:25] <Wes-_> too many layers of patches, too many authors, but I don't want to invest heavily until binary/X is finalized
[18:25] <ondras> with a single constructor
[18:25] <ondras> ...
[18:25] <ondras> (/me hates when constructor works with 5 different ways of calling)
[18:25] <Wes-_> ondras: GPSEE basically does that by having "ByteThings"
[18:25] <ondras> yeah
[18:26] <ondras> I have a very similar approach
[18:26] <ondras> an instance of ByteStorage c++ class
[18:26] <Wes-_> Except byte things in GPSEE-core are much more general than in the binary module
[18:26] <Wes-_> One of the key things for GPSEE was making byte things compatible at the *C* level
[18:26] <Wes-_> That way, I can cast between objects in JS land and get the methods that I want
[18:26] <ondras> Binary.fromString(encoding), Binary.fromArray(array), new Binary(length)
[18:26] <Wes-_> and use call/apply
[18:26] <ondras> this is my preferred api.
[18:27] <ashb> ondras: the first one needs a string too :)
[18:27] <ondras> ashb: good point! :)
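
The single-constructor-plus-factories shape described here might be sketched as below, with fromString taking the string as well as the encoding. Everything in it is an assumption for illustration: the class is not the v8cgi implementation, Uint8Array stands in for the native storage, and only UTF-8 is handled.

```javascript
// Hypothetical Binary type: one constructor, named factories for conversions.
class Binary {
  constructor(length) {
    this.bytes = new Uint8Array(length); // zero-filled storage
  }

  get length() {
    return this.bytes.length;
  }

  static fromArray(array) {
    const result = new Binary(array.length);
    result.bytes.set(array); // copies the values into the new storage
    return result;
  }

  static fromString(string, encoding = "utf-8") {
    if (encoding.toLowerCase() !== "utf-8") {
      throw new Error("this sketch only supports utf-8");
    }
    return Binary.fromArray(new TextEncoder().encode(string));
  }
}

const fromText = Binary.fromString("hé", "utf-8"); // "é" takes two bytes in UTF-8
const fromList = Binary.fromArray([1, 2, 3]);
const empty = new Binary(4);
```

Each entry point accepts exactly one kind of input, which avoids a constructor that has to guess among several calling conventions.
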
[18:28] <ondras> Wes-_: I also cheated a bit by adding some of the common methods directly to Binary.prototype
[18:28] <Wes-_> ondras: That's not cheating IMO
[18:28] <ondras> :)
[18:28] <ashb> its exactly what we did too
[18:28] <Wes-_> or wait
[18:28] <ondras> it is highly logical
[18:28] <Wes-_> Mmm - it makes methods harder to steal
[18:28] <Wes-_> if you don't know what prototype they will be on
[18:29] <Wes-_> I'd recommend grafting them down myself, rather than letting inheritance take over
[18:29] <Wes-_> otherwise, you kill users' ability to call/apply
[18:29] <ondras> http://code.google.com/p/v8cgi/source/browse/trunk/src/lib/binary/binary.cc#132
[18:30] <ondras> Wes-_: how comes?
[18:30] <ondras> var codeAt = ByteArray.prototype.codeAt, although this is defined as Binary.prototype.codeAt
[18:31] <Wes-_> Oh okay, as long as that is true I see no problem at all
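
A small sketch of why borrowing still works when the method lives on a shared base prototype. The constructors are simplified stand-ins, not the v8cgi classes, and the body of codeAt is an assumption.

```javascript
// Shared base: common methods are defined once on Binary.prototype.
function Binary() {}
Binary.prototype.codeAt = function (index) {
  return this.bytes[index];
};

// ByteArray inherits from Binary and defines no codeAt of its own.
function ByteArray(bytes) {
  this.bytes = bytes;
}
ByteArray.prototype = Object.create(Binary.prototype);
ByteArray.prototype.constructor = ByteArray;

// The property lookup walks the prototype chain, so the method can be
// taken from ByteArray.prototype and applied to any suitable object.
const codeAt = ByteArray.prototype.codeAt;
const borrowed = codeAt.call({ bytes: [7, 8, 9] }, 1);
```

Callers do not need to know which prototype actually owns the method, as long as the chain leads to it.
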
[18:31] <ondras> afk, dinner :)

 

 
