
One of the shockers I came across with MongoDB is that every occurrence of a key is stored in full. There is no form of a symbol table for the keys, so this means a huge amount of data overhead if each 'row' of data uses keys at all, which they almost certainly do. No one just uses arrays.
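To make the point concrete, here's a rough back-of-the-envelope sketch (field names and counts are made up for illustration): because each document serializes its own key strings, the bytes spent on key names grow with the document count and can easily exceed the bytes spent on the actual values.

```python
import json

# Hypothetical collection: 1000 small documents with verbose field names.
docs = [{"customer_name": "A", "transaction_amount": 1} for _ in range(1000)]

# Key strings are repeated in every serialized document...
key_bytes = sum(len(k) for k in docs[0]) * len(docs)

# ...while the values themselves are tiny by comparison.
value_bytes = len(json.dumps(list(docs[0].values()))) * len(docs)

print(key_bytes, value_bytes)  # key overhead dwarfs the data itself
```

This ignores BSON framing details, but the proportions are the point: with short values and long names, most of each document is field names.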


I'm not really sure what you mean by 'rows', given that the data is JSON, but anyway.

I think what you are referring to is the tokenization of field names: https://jira.mongodb.org/browse/SERVER-863


Right, what I really meant was a document. I was trying to relate it to SQL, but I may have made it more confusing than anything.

That issue is what I was referring to, and it would be a good step forward. Still, it's a shame you have the overhead of field names in the first place. I understand why it is the way it is, being schemaless, and in terms of scaling it isn't a huge issue, since the overhead grows linearly with the data and is therefore manageable. But in most cases it's still huge compared to the size of the data itself.
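Until something like SERVER-863 lands, a common workaround is to abbreviate field names at the application layer. A minimal sketch (the field names and mapping here are hypothetical, not anything MongoDB provides):

```python
# Application-level aliasing: store short keys, expose long ones.
FIELD_MAP = {"customer_name": "cn", "transaction_amount": "ta"}
REVERSE_MAP = {v: k for k, v in FIELD_MAP.items()}

def shrink(doc):
    """Rewrite keys to their short aliases before inserting."""
    return {FIELD_MAP.get(k, k): v for k, v in doc.items()}

def expand(doc):
    """Restore the readable names after reading back."""
    return {REVERSE_MAP.get(k, k): v for k, v in doc.items()}

original = {"customer_name": "Alice", "transaction_amount": 42}
stored = shrink(original)       # {"cn": "Alice", "ta": 42}
assert expand(stored) == original
```

The trade-off is readability: anything that inspects the collection directly (shell queries, indexes, ad-hoc scripts) now sees the cryptic short names.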

I'm not sure how they'll fix it, and I don't know much about other schemaless DBs, but perhaps some sort of pattern recognition over field names would be appropriate. Now that MongoDB has lots of funding for research, it will be interesting to see what they come up with.


As in an INFORMATION_SCHEMA collection?



