Fixed-size buffers in C (10)

1 Name: #!usr/bin/anon 2005-11-17 04:47 ID:kotEQ2Sk

I found an interesting discussion on the use of the bounds-checking strlcpy(), strlcat(), and snprintf() functions, versus their counterparts strcpy(), strcat(), and sprintf(), which assume a destination buffer of infinite size:

http://sources.redhat.com/ml/libc-alpha/2002-01/msg00001.html

Surprisingly to me, the glibc developers seem to believe that the bounds-checking functions should be avoided, because fixed-size buffers shouldn't be used. They believe that buffers should be dynamically allocated as necessary. In other words, I guess they write fun code like:

char *str;
str = malloc(strlen("//.log") + strlen(a) + strlen(b) + 11 + 1);
sprintf(str, "%s/%s/%d.log", a, b, n);

It seems to me that not only is this rarely needed (come on, how often is str going to need more than, say, 200 characters?), this sort of calculation is prone to error, and snprintf() should be used anyway for an extra layer of safety. I can say this with some experience, as I frequently used to mess up calculations like that, until finally I got some sense and started doing:

char str[200];
if (snprintf(str, sizeof str, "%s/%s/%d.log", a, b, n) >= sizeof str)
{
/* string was too long */
...
}
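For anyone who wants to try the fixed-buffer approach, here is a minimal sketch of it wrapped in a function. The name build_log_path() and its return convention are made up for illustration. It also checks for a negative return from snprintf() explicitly, rather than relying on the int-to-size_t conversion in the comparison.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "a/b/n.log" into dst.
 * Returns 0 on success, -1 if the result was truncated or
 * snprintf() reported an error. dst is NUL-terminated on truncation. */
int build_log_path(char *dst, size_t dstsize,
                   const char *a, const char *b, int n)
{
    int len;

    assert(dst != NULL && dstsize > 0);

    len = snprintf(dst, dstsize, "%s/%s/%d.log", a, b, n);
    if (len < 0 || (size_t)len >= dstsize)
        return -1;  /* encoding error, or string was too long */
    return 0;
}
```

A caller would declare char str[200]; and treat a -1 from build_log_path(str, sizeof str, a, b, n) as "string was too long".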

Is it just me, or is creating strings the first way absolutely crazy? Furthermore, even if strings are being created that way, isn't it criminally stupid not to use a bounds-checking function anyway, since essentially you are getting a little performance gain for betting that programmers can program perfectly?

Discuss!

2 Name: dmpk2k!hinhT6kz2E 2005-11-17 05:04 ID:Heaven

They're being pedantic asses.

Once you've seen it a few times (and on slashdot, that's all the time), you can smell it from a mile away. Ulrich Drepper's post in particular reeks.

On the other hand, if they threw the entire standard library out and replaced it with something less archaic and inconsistent, that'd be nice too.

3 Name: #!usr/bin/anon 2005-11-17 18:54 ID:Heaven

>>2 Rewriting hundreds of millions of lines of code would be "nice"?

4 Name: dmpk2k!hinhT6kz2E 2005-11-17 19:59 ID:Heaven

Either do what C++ did, or make a backward-compat layer.

5 Name: #!usr/bin/anon 2005-11-18 02:38 ID:kotEQ2Sk

What I found really interesting was the aversion to fixed-size buffers (not the complaint about the function not being standard, which is obviously a non-issue since the way a function gets standardized is by being widely available first).

Any C programmers here: Do you use fixed-size buffers in your programs, or dynamically allocate things? When, if ever, is the former appropriate?

6 Name: dmpk2k!hinhT6kz2E 2005-11-18 02:47 ID:Heaven

I mostly use fixed, for a couple reasons:

It's more difficult to fuck up:
Usually you allocate fixed-size buffers on the stack, so you don't need to worry about freeing them. There's also less code to write, and it's simpler to read, which reduces the number of potential bugs. Just keep the end of the array in mind.

It's a whole lot faster:
Dynamic allocation is really slow. You want to do it the absolute minimum possible, and only when necessary.

Speed isn't an issue? Maybe you should use a higher level language instead.

7 Name: #!/usr/bin/anonymous : 2006-03-25 11:44 ID:JZAgZf8x

>>1
Then again, the GNU C library has things like asprintf() and other thoroughly nonstandard thingamabobs. No strlen() chains there. I agree 100% that the strlen() chain is the most error-prone part of your code.
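If you want dynamic allocation without GNU extensions or a strlen() chain, one portable option is to let C99's vsnprintf() measure the string first and allocate from that. This is only a sketch: format_alloc() is a made-up name, not a glibc function.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats into a freshly malloc'd buffer.
 * A first vsnprintf() call with a NULL buffer and size 0 only
 * measures the output length; the second call does the real work.
 * Returns NULL on formatting or allocation failure. Caller frees. */
char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap2;
    int len;
    char *buf;

    assert(fmt != NULL);

    va_start(ap, fmt);
    va_copy(ap2, ap);               /* the list is consumed twice */
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);

    if (len < 0) {
        va_end(ap2);
        return NULL;
    }

    buf = malloc((size_t)len + 1);  /* +1 for the terminating NUL */
    if (buf != NULL)
        vsnprintf(buf, (size_t)len + 1, fmt, ap2);
    va_end(ap2);
    return buf;
}
```

With that, the example from >>1 would become something like str = format_alloc("%s/%s/%d.log", a, b, n); followed by a NULL check and an eventual free(str).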

Much of the time, though, you can make a sane guesstimate of how much buffer space you need. E.g. something like snprintf(buf, sizeof buf, "cheese%04d.jpg", intvalue) where intvalue is between 0 and 9999; you hardly need more than 15 bytes there. Add to this that there may not be dynamic memory allocation on some targets (like an operating system kernel before it's got a good grip on "where the RAMs at, yo").

snprintf(), strlcpy() and so forth are pretty good at keeping your hands clean of the exploit-of-the-day though, turning cases where your unforeseen consequences would've become openings for stack smashing into just harmless string truncation. (Though that may in turn lead to stuff like symlink race condition attacks and so forth.)

8 Name: #!/usr/bin/anonymous : 2006-03-25 12:52 ID:Heaven

>>7

You should always assume that %d will require 11 characters (12 bytes with the terminating NUL) for a 32-bit int. Anything else is reckless.

9 Name: #!/usr/bin/anonymous : 2006-03-25 22:54 ID:Heaven

>>8
Better not assume anything about it. int will be 64 bits before you know it. I guess you could do

#define INT_MAX_STRLEN 11 /* fix this if `int' is larger than 32 bits */

but anything less would be sloppy.
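One way to avoid hard-coding the width is to derive a conservative bound from sizeof(int). The sketch below assumes that one decimal digit per 3 bits is enough (each decimal digit carries more than 3 bits, so this overestimates slightly). INT_STR_BUFSIZE and format_int() are made-up names, and unlike the 11 above, the bound includes both the sign and the terminating NUL.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Conservative buffer size for "%d": at most bits/3 + 1 digits,
 * plus one byte for the sign and one for the terminating NUL. */
#define INT_STR_BUFSIZE (sizeof(int) * CHAR_BIT / 3 + 3)

/* Hypothetical helper: returns the number of characters written
 * (excluding the NUL), or -1 if the value did not fit in dst. */
int format_int(char *dst, size_t dstsize, int value)
{
    int len;

    assert(dst != NULL);

    len = snprintf(dst, dstsize, "%d", value);
    if (len < 0 || (size_t)len >= dstsize)
        return -1;
    return len;
}
```

For a 32-bit int this gives 13 bytes, one more than the 12 that "-2147483648" actually needs, and it scales on its own if int ever grows.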

10 Name: #!/usr/bin/anonymous : 2006-03-29 16:19 ID:Heaven

>>8
I was using an integer as an example, since a string format with a maximum length would've made things too long for that point. Then again, sometimes you actually _know_ that a variable will always be between 0 and 9999. Stick an assert above it if you like; "the {guy,drooling idiot} who comes after me" can handle his own damn issues.
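A rough sketch of what "stick an assert above it" could look like for the cheese example. cheese_name() and CHEESE_NAME_SIZE are illustrative names only, and snprintf() stays in place as the second line of defence for builds where asserts are compiled out.

```c
#include <assert.h>
#include <stdio.h>

/* "cheese" (6) + 4 digits + ".jpg" (4) + terminating NUL = 15 */
#define CHEESE_NAME_SIZE 15

/* Hypothetical helper: returns 0 on success, -1 if the name
 * did not fit in dst. */
int cheese_name(char *dst, size_t dstsize, int n)
{
    int len;

    /* Documents and enforces the "always between 0 and 9999" claim. */
    assert(n >= 0 && n <= 9999);

    len = snprintf(dst, dstsize, "cheese%04d.jpg", n);
    return (len >= 0 && (size_t)len < dstsize) ? 0 : -1;
}
```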