2007-05-01 Process Dictionaries vs. ETS
In this email, Ulf Wiger illustrates the differences between using process dictionaries and ETS tables:

Comparing the process dictionary and ets, there are
pros and cons with each:

- process dictionary is a linear hash table, like
  ets sets (but not exactly the same implementation)

- there's no copying when accessing the process
  dictionary, but then again, there is GC. With ets,
  it's the other way around. (We have code where the
  process heap is sized so that all garbage of
  short-lived processes fits on the heap. This means
  no copying - no GC).

- ets tables can be made to survive a process crash
  (by making some other process the owner). This can
  be a good thing, but forces you to use public ets
  tables (bad thing). I believe most ets tables in
  use are public, which in a sense makes them worse
  than the process dictionaries.

- On a few occasions, it is actually useful to let
  several processes write to the same ets table,
  but in most cases where it's contemplated, it's
  a very bad idea.

- The contents of the process dictionary are included
  in crash reports, while the contents of ets tables
  are automatically wiped out when the owner dies -
  bad for debugging, unless you make another process
  the owner, and make the table public, so you can
  write to it (which means everyone else can write
  to it too - bad)

- Of course, storing huge amounts of data in the
  proc dict is bad because (a) it's GC:d, and
  (b) all the data is dumped into the SASL crash
  reports, which are (by default) pretty-printed
  to the tty - potential disaster.

- ets has much better search and fold facilities
  (the process dictionary has practically none).

- Allowing many processes to read from an ets
  table _can_ be very useful. This could be done
  with the process dictionary as well, but at
  the moment, the only option available is to
  read the entire dictionary at once (through
  process_info(P, dictionary)). This involves
  a copy, btw, just like with ets.
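
To make a few of these points concrete, here is a minimal Erlang sketch, not taken from the email. The module name pd_vs_ets, the table name demo_tab, and the keys counter and state are all hypothetical. It shows the basic process dictionary and ets calls side by side, the fold and match facilities that only ets offers, and reading another process's whole dictionary through process_info/2.

```erlang
-module(pd_vs_ets).
-export([demo/0]).

%% Illustrative sketch only: module, table, and key names are hypothetical.
demo() ->
    %% Process dictionary: put/2 stores a term under a key in the calling
    %% process; get/1 reads it back without copying it to another heap.
    _ = put(counter, 1),
    PdValue = get(counter),

    %% ETS set: terms are copied into the table on insert and copied back
    %% out on lookup. A protected table can be written only by its owner.
    Tab = ets:new(demo_tab, [set, protected]),
    true = ets:insert(Tab, [{a, 1}, {b, 2}, {c, 3}]),
    [{b, 2}] = ets:lookup(Tab, b),

    %% Search and fold facilities that the process dictionary lacks.
    [{b, 2}] = ets:match_object(Tab, {'_', 2}),
    Sum = ets:foldl(fun({_Key, Val}, Acc) -> Val + Acc end, 0, Tab),

    %% Reading another process's dictionary: the only option is to fetch
    %% (and copy) the entire dictionary as a list of {Key, Value} pairs.
    Parent = self(),
    Pid = spawn(fun() ->
                    put(state, ready),
                    Parent ! {self(), stored},
                    receive stop -> ok end
                end),
    receive {Pid, stored} -> ok end,
    {dictionary, Dict} = process_info(Pid, dictionary),
    Remote = proplists:get_value(state, Dict),

    %% Clean up so demo/0 can be called repeatedly from the same process.
    Pid ! stop,
    true = ets:delete(Tab),
    _ = erase(counter),
    {PdValue, Sum, Remote}.
```

After compiling the module, calling pd_vs_ets:demo() from the shell should return a tuple holding the value read from the process dictionary, the folded sum of the ets values, and the value read out of the spawned process's dictionary. The table is created as protected here to sidestep the concern about public tables; surviving an owner crash would need a different owner or an heir, which this sketch does not cover.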
