I was wondering, does anyone here have anything interesting to say about flat files for data storage? Recommended reading, tips, tricks and advanced concepts...
You see, I'm not a big fan of relational databases. I do understand them very well; in fact, I pay my bills by working on a fairly large database application. It's just that I like the simplicity and beauty of flat files. And real men don't need a DBMS anyway. OK, let's not talk about RDBMS.
Say, which markup language / data serialization format do you like? Have you used xattrs in any project?
Flat files are cool for data which is written out once and then read in again. 3D meshes, configuration files, high score tables, user preference data and so forth.
For things you'd like to modify a record at a time, where reading the file in and writing it out again slightly differently isn't acceptable, there's either something like GDBM or Berkeley DB, or the more heavy and serious RDBMS of your choice. I mean, it's not like most of us would like to spend a year or more putting together a storage engine when people, smart people, have written (and debugged!) such things for us. Plus, Berkeley DB has concurrency control without a server. How cool is that?
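To make the record-at-a-time point concrete, here is a minimal sketch using Python's standard `dbm` module, which wraps whichever DBM-style backend is installed (GDBM on many systems). It is not Berkeley DB itself and says nothing about concurrency; the `bump_score` helper and the player names are made up for illustration.

```python
import dbm
import os
import tempfile

def bump_score(db_path, player, points):
    """Add points to one player's record without rewriting the other records."""
    with dbm.open(db_path, "c") as db:  # "c": create the database if it is missing
        key = player.encode("utf-8")
        current = int(db[key]) if key in db else 0  # values are stored as bytes
        db[key] = str(current + points).encode("utf-8")
        return current + points

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "scores")
    bump_score(path, "alice", 10)
    bump_score(path, "bob", 5)
    print(bump_score(path, "alice", 7))
```

Each call touches a single key; the library decides how that maps onto the file, which is exactly the part nobody wants to write by hand.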
I like to write XML files and store them in an RDBMS.
JSON is a cool serialization format. (YAML is ok too but JSON is simpler and good enough)
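For the "write it out once, read it back in" kind of data, a rough sketch of a JSON flat file in Python might look like this. The high score table, the function names and the temp-file-then-rename trick are my own assumptions, not something anyone in the thread described.

```python
import json
import os
import tempfile

def save_scores(path, scores):
    """Write the whole table to a temp file, then rename it over the old one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(scores, f, indent=2, sort_keys=True)
    os.replace(tmp_path, path)  # swap the finished file into place in one step

def load_scores(path):
    """Read the whole table back; a missing file just means no scores yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}
```

Writing to a temporary file first means a crash halfway through should leave the old file intact rather than a half-written one.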
Did you know that this board stores everything in flat files?
I didn't have enough time to explore the source, but I believe they go into the res[1] directory. It's weird... could it be that Kareha stores posts in XHTML? Parsing through that shit every time... Also notice how metadata about the thread is stored: view source and look at the first line.
Very little is parsed - the full HTML is stored on a single line, and can just be copied over to the front page. The only processing is a regexp which replaces the body text with an abbreviated version when needed.
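Roughly what that could look like, as a sketch only: this is Python rather than the board's real code, and the `<blockquote>` markup, the `abbreviate` name and the character limit are all assumptions made up for the example.

```python
import re

# Hypothetical stored format: the post body is whatever sits inside the blockquote.
BODY_RE = re.compile(r"(<blockquote>)(.*?)(</blockquote>)", re.DOTALL)

def abbreviate(post_html, max_chars=40):
    """Return the stored line with an over-long body cut down to max_chars."""
    def shorten(match):
        body = match.group(2)
        if len(body) <= max_chars:
            return match.group(0)  # short enough, copy the line through untouched
        return match.group(1) + body[:max_chars].rstrip() + "..." + match.group(3)
    return BODY_RE.sub(shorten, post_html, count=1)
```

A naive cut like this can land in the middle of a tag if the body contains markup of its own, so a real version would need to be more careful about where it truncates.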
>>7
I see, that makes sense... still, no matter how you look at it, mixing data with the view seems to be a bad idea.
What if you want to change the view (beyond CSS)? What if you need more than one view (like RSS)? Where will you store the data that shouldn't be in the view (for example, IP addresses)?
All of that is hard, but: None of that is really necessary.
Hypothesis: Proper modularization and abstraction actually causes feature bloat, by making new features too easy to add.
Counterpoint: modularization only appears to lead to code bloat because adding features "because they might be needed in the future, although I know of no plan to do so" is so seductive in comparison to doing actual work.
Feature bloat: Who cares? If a feature sucks and nobody uses it, just take the damn thing out again. Life's too short to agonize over these things.
>>12
Do you mean to say that's not how you work, i.e., you can't bear to remove a useless feature, thinking it might be needed in the future?
If so, use version control. That way the old code is always in svn or darcs or whatever if you change your mind.
No, feature bloat is not about useless features. It's about adding features that somebody wants, to the point where the program becomes less usable because there are so many special-purpose features. If it was just removing things nobody uses, it wouldn't be a problem.
>>12
My point is that modularization leading to bloat is more a matter of lacking discipline on the part of the programmer(s) or rather the designer(s), and not of modularization itself.
With perfect discipline and insight, there'd never be a problem. That part is not really interesting. The point is, modularization and abstraction create the temptation for feature bloat.
>>9-17
Guns kill people.
No, people kill people.
Don't start that "Oh no! ARGUMENT on the INTERNET!" shit. This isn't an argument that has been done to death, and it is entirely civil and well-reasoned.
Don't make a mockery of human discourse, man.
>>19
That's just what Hitler would've said!
I can't believe you people are actually discussing this. It makes no sense! Do you seriously believe that it is better to make bad decisions, because good ones will allow you to add more features?
Look, I'm not saying that every design should be perfect. If you really understand what you're doing, then there's nothing wrong with breaking the rules in the name of simpler/faster software (which is the case here).
Maybe you are familiar with the gambit of playing Devil's Advocate?
But also, note how your own argument pre-supposes that modularization and abstraction is a good decision, and then argues that you should do this because it is a good decision. You are begging the question there.
This idea is very deeply ingrained in software developers. I am merely throwing out the suggestion here that maybe that isn't always the case.
I dunno. I tend to think that if you lack enough discipline, you'll find a useless feature you can add instead of real work, regardless of how modular your code is or isn't. My own experience bloating up really messy FSMs attests to this.
You code Flying Spaghetti Monsters? That's awesome.
Er, seriously though, you could argue that if it was hard enough to add those useless features, you would prefer to do the real work instead.
[FSM means Finite State Machine.]
Wouldn't the real work be harder too?
Delicious flat Guevara?
Did someone mention Godwin's law yet?
Saying modular design promotes feature bloat is like saying clean code and indentation promote feature bloat. The link is there if you squint hard enough, but it's sketchy at best.
Modular design may lead to a runtime performance hit, if it isn't automatically optimized out.
Database recoverability is much, much superior to filesystem recoverability. So at the very least, store your XML files in an RDB.
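One piece of that recoverability is the all-or-nothing transaction. A small sketch, using Python's built-in `sqlite3` as a stand-in for whatever RDB you prefer; the `docs` table and the helper names are invented for the example, and it only illustrates rollback, not crash recovery in general.

```python
import sqlite3

def make_store():
    """Open an in-memory database with one table of named XML documents."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (name TEXT PRIMARY KEY, xml TEXT NOT NULL)")
    return conn

def replace_docs(conn, docs):
    """Replace every stored document in one transaction: all or nothing."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("DELETE FROM docs")
        conn.executemany("INSERT INTO docs (name, xml) VALUES (?, ?)", docs)

if __name__ == "__main__":
    conn = make_store()
    replace_docs(conn, [("a", "<post>hello</post>")])
    try:
        # The duplicate name violates the primary key partway through the update.
        replace_docs(conn, [("b", "<post>one</post>"), ("b", "<post>dup</post>")])
    except sqlite3.IntegrityError:
        pass
    print(conn.execute("SELECT name, xml FROM docs").fetchall())
```

The failed update undoes its own DELETE as well, so the earlier document is still there afterwards. Getting the same guarantee from a directory of plain files takes noticeably more care.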
>>29
And access the RDB with CORBA. Your methods should be wrapped at least five deep, or it won't be Enterprise and Scalable.