Codawiki talk:Community Portal

Question: Size of coda data

The Coda documentation only ever deals with rather small amounts of data.

Coda seems quite interesting to me. I'm looking for a solution to safely manage the web content of a few thousand users. We run several Apache servers that deliver the web content, and the content itself does not change very often. Because of the way Coda caches data, it could be a good solution that combines speed and redundancy.

But the sizes that the Coda documentation deals with are far too small: "The cache size should be at least 10Meg, typically 60-200Meg is used. Do not go above 300Meg." (http://www.coda.cs.cmu.edu/doc/html/manual/x933.html). I need at least several GB of cache.

Are the sizes mentioned in the documentation still up to date? Is it possible and reasonable to use Coda to maintain 10 to 100 GB of data?


Coda doesn't scale to multi-GB caches. The problem is that all the metadata (attributes) and namespace (directory contents) are stored in memory, and over the years hard disk sizes have grown faster than available memory. Moving the metadata out of main memory is not trivial.

Some memory is used for mostly redundant information: we store file name information as part of the file system object, but it is also available in the contents of the parent directory. We have been moving towards a model where the tree is always fully connected, which should allow us to drop this redundancy. However, that is only a small improvement.

At the moment most of the recoverable memory segment is used by directory data. Offloading this to container files would pretty much halve the current RVM usage for a typical Coda client. Also, the existing directory format can't scale past 256KB (~4000 names), so it needs to be restructured anyway. However, we rely on the transactional properties of RVM to safely revert uncommitted operations when the client or server crashes, so any replacement would have to provide similar guarantees. That is, we might still need some sort of directory cache or work-in-progress store in RVM. --Jan Harkes 03:21, 18 Nov 2005 (UTC)
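
To make the scaling argument above a bit more concrete, here is a rough back-of-envelope sketch in Python. Only the 256KB directory limit and the ~4000 name figure come from the reply above. The average file size and the per-object metadata cost are purely hypothetical placeholders, not measured Coda or RVM numbers, so the output illustrates the shape of the problem rather than predicting real memory usage.

```python
# Back-of-envelope arithmetic for the figures quoted in the reply above.
# Function names and default inputs below are illustrative assumptions only.

DIR_LIMIT_BYTES = 256 * 1024  # directory size limit quoted in the reply (256KB)
DIR_LIMIT_NAMES = 4000        # approximate name limit quoted in the reply


def bytes_per_directory_entry(limit_bytes=DIR_LIMIT_BYTES,
                              limit_names=DIR_LIMIT_NAMES):
    """Average directory space per name implied by the quoted limits."""
    return limit_bytes / limit_names


def estimated_cached_objects(cache_bytes, avg_file_bytes):
    """Number of files that fit in a cache of the given size.

    avg_file_bytes is a hypothetical input; real workloads vary widely.
    """
    return cache_bytes // avg_file_bytes


def estimated_metadata_bytes(cache_bytes, avg_file_bytes,
                             metadata_bytes_per_object):
    """In-memory metadata needed if every cached object costs a fixed amount.

    metadata_bytes_per_object is a hypothetical placeholder, not a
    measured Coda/RVM figure.
    """
    objects = estimated_cached_objects(cache_bytes, avg_file_bytes)
    return objects * metadata_bytes_per_object


if __name__ == "__main__":
    print("bytes per directory entry: %.1f" % bytes_per_directory_entry())

    GIB = 2 ** 30
    for cache_gib in (10, 100):
        needed = estimated_metadata_bytes(
            cache_bytes=cache_gib * GIB,
            avg_file_bytes=16 * 1024,        # assumed average file size
            metadata_bytes_per_object=1024,  # assumed per-object cost
        )
        print("%d GiB cache -> about %d MiB of in-memory metadata"
              % (cache_gib, needed // 2 ** 20))
```

Whatever per-object cost is plugged in, the memory requirement grows linearly with the number of cached objects. That is why a cache that is fine at a few hundred MB becomes a problem at 10 to 100 GB while the metadata has to stay in memory.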