data.table := an improved data.frame

I started using a package called data.table just yesterday, and I rewrote the whole of the MayaCalc package to use data.table instead of data.frame. I got it working in a few hours. The syntax is clearer and very concise. As a bonus, everything should execute much faster (10x to 30x). I will time MayaCalc today after checking that the results are OK.
The documentation for data.table is a bit terse; I will write some examples here in a few days’ time.

Be aware, though, that the semantics are really different when a data.table is used as an argument to a function. If you use the := operator inside the function, you achieve the equivalent of passing the data.table argument by REFERENCE rather than the normal R convention of passing all arguments by COPY: the caller's table is modified in place. This is much faster for large data tables because copying is avoided, but you should be careful even if you understand the difference between these two semantics. If you don’t, do not use data.table until you fully understand the difference and all its implications.
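
Here is a minimal sketch of the difference. The helper functions add_total_df and add_total_dt and the column names a, b and total are made up for illustration; they are not part of MayaCalc. The only data.table features used are the := operator and copy().

```r
library(data.table)

# Hypothetical helper: adds a column to a data.frame.
# The assignment only changes the function's local copy.
add_total_df <- function(df) {
  df$total <- df$a + df$b
  df
}

# Hypothetical helper: adds a column to a data.table with :=.
# The assignment changes the caller's object in place.
add_total_dt <- function(dt) {
  dt[, total := a + b]
  invisible(dt)
}

df <- data.frame(a = 1:3, b = 4:6)
dt <- data.table(a = 1:3, b = 4:6)

df_result <- add_total_df(df)  # df itself is left untouched
add_total_dt(dt)               # dt now has a new column, no reassignment needed

df_changed <- "total" %in% names(df)
dt_changed <- "total" %in% names(dt)

# To keep the caller's table untouched, pass an explicit copy.
dt2 <- data.table(a = 1:3, b = 4:6)
dt2_result <- add_total_dt(copy(dt2))
dt2_changed <- "total" %in% names(dt2)
```

After running this, df_changed and dt2_changed should be FALSE and dt_changed should be TRUE: only the table passed directly to the function using := was altered. Calling copy() gives back the usual R behaviour, at the price of the copy you were trying to avoid.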
