Oh, the Pascal I'm using now is a practical Pascal, not the original teaching language I first learned, which is deficient in many ways.
While there is a very fine Wikipedia page comparing these two languages, this is my personal opinion of these two third-generation languages. I like third-generation languages. Because of my interest in performance and state, necessary for high-scale and fault-tolerant computing, their general lack of higher-level abstraction means that you can pretty much always tell what a program is doing, and how long and what resources it is going to need while doing it; the general flow of control is not hidden. (To me, high-abstraction languages are a lot like handing tasks off to graduate assistants. I may be thinking that he'll get right on it, whereas he might think that the waves are bitchin' gnarly right now, and he'll do my task maybe tomorrow. Sometime. Sure, to avoid this I can be very specific in my task assignments and specifications, but often it's easier to just do it myself than to fully specify all the necessary parameters. The same, I find, can be true of computer languages.)
I'm going to ignore both languages' Object modeling. (And thus C++ and its ilk.) Just the straight procedural languages, which is where I have always been, and in fact still am, working. (Use of object models can conflict with both high-scale and fault tolerance, making them more difficult or even impossible.)
I'm going to ignore most purely syntactic differences, such as ":=" vs "=" and "<>" vs "!=", or whether types or names come first in declarations, and Pascal's terminating period. Po-tay-toe, po-tah-toe. I'm also going to ignore attributes of specific compiler/IDE products, such as convenience, speed, or quality of generated code. These can all change, and often are extremely important, but have nothing to do with the qualities of the language itself.
NB: Both C and Pascal have learned from each other over the years, and some of what once might have been clear advantages of one language over the other are no more, in any significant way. These languages are now more alike than different.
The #include directive allows one to construct a semantic whole out of fragments placed in multiple files, controlled by #if constructs. The #define directive, besides being used for type definitions and primordial enumerations, can be 'abused' for textual aliasing. It can even be used for system configuration duties, by multiply #include-ing configuration files written in a configuration meta-language of sorts, using #define to determine what flavor of configuration is being pulled in at that moment. (This is commonly used when configuring RTOSs.)
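A minimal sketch of that multiple-inclusion trick, squeezed into one file: a list macro stands in for the configuration file that would normally be #include-d several times. The names TASK_LIST, TASK, task_cfg, and the three task entries are all invented for illustration; a real RTOS will have its own conventions.

```c
#include <stddef.h>

/* Hypothetical configuration "file": one line per task.
   In a real system this list would live in its own file and be
   #include-d several times; a list macro keeps the sketch in one file. */
#define TASK_LIST \
    TASK(idle,    0,  256) \
    TASK(logger,  3,  512) \
    TASK(control, 7, 1024)

/* First pass: expand each entry into an enumeration constant. */
#define TASK(name, prio, stack) TASK_##name,
enum task_id { TASK_LIST TASK_COUNT };
#undef TASK

/* Second pass: expand the same entries into a table of settings. */
struct task_cfg { const char *name; int priority; size_t stack_bytes; };

#define TASK(name, prio, stack) { #name, prio, stack },
static const struct task_cfg task_table[] = { TASK_LIST };
#undef TASK
```

Each redefinition of TASK decides what "flavor" of the same list is being pulled in at that moment, so the enumeration and the table cannot drift apart.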
And, of course, the pre-processor and the easily-abused ternary and comma operators make it the undisputed champion of obfuscated language contests between the two languages. (Maybe that's not a plus!)
"f()" (the parentheses) make it very clear that flow of control is going elsewhere, whether or not arguments are necessary. (In C "x = y;" is always a 'cheap' operation; in Pascal "x := y" could be cheap, or could be a hideously expensive function call with deep side-effects. There's no way to tell just by looking, you have to do deeper analysis, at every level due to nested functions, in order to know the true cost.)
"result := X" uses a magic name, which is less obvious than C's explicit keyword-based "return X;". If you use Pascal's older "f := X" form where you assign results to the function's name, you have to be aware of exactly what function you're inside of in order to recognize what is going on. Also, recursion can be a bit tricky if the function doesn't have any arguments. This is one place where the presence (or absence) of parentheses is crucial. "f := f()" is how you call for recursion. I find this all quite messy.
ELSE clause to a conditional meant that you had forgotten to go back and re-type the preceding card in order to remove the now-prohibited semicolon. Start Over. Really, quite infuriating.
    if X then
      Y;
    else      {plus these two}
      Z;      {new cards/lines}
Will not compile. It must be changed to:
    if X then
      Y       {<== Note absence of formerly-required semicolon!}
    else
      Z;
Feh!
Also note that when using conditional compilation (which Pascal now supports) that pesky semicolon can be a real problem. Consider:
    if X then
      Y;
    {$IFDEF ANAL}
    else
      Z;
    {$ENDIF}
Would not compile properly if you intended to include the optional error checking. (With C this structure would compile properly.) Instead you'd need something a bit obfuscatory like:
    if X then
      Y       {<== Not here,}
    {$IFDEF ANAL}
    else
      Z       {<== and not here,}
    {$ENDIF}
    ;         {<== but here. WTF, Pascal!}
    if X then begin
      Y;
    end else begin
      Z;
    end;
is just too damned wordy. (Oh look: there's an extra semicolon, and don't you forget it. Yay!) C's braces are much less... abrasive.
"if (x = y()) z();" is just what you wanted. This compact assign-and-test notation is especially attractive for programmers (like me) who learned assembly language first, where partial results ("x", above) are easily saved for later re-use, and where condition-code-based conditional execution is the norm. In C, "x = y = z;" is a clever side-effect of regular assignment, and can be used anywhere. (It's also another example of "Don't Repeat Yourself", on a small scale.)
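As a rough, self-contained illustration of assign-and-test, here is a sketch that drains a tiny linked list. The node type and the helpers push() and next_node() are hypothetical, made up only so the idiom has something to work on.

```c
#include <stdlib.h>

struct node { struct node *next; };

/* Hypothetical list head used only for this sketch. */
static struct node *head = NULL;

/* Add one node to the front of the list; returns 0 if malloc fails. */
static int push(void)
{
    struct node *n = malloc(sizeof *n);
    if (!n)
        return 0;
    n->next = head;
    head = n;
    return 1;
}

/* Detach and return the first node, or NULL when the list is empty. */
static struct node *next_node(void)
{
    struct node *n = head;
    if (n)
        head = n->next;
    return n;
}

/* Assign-and-test: p is assigned, tested, and then reused in the body. */
int drain(void)
{
    struct node *p;
    int freed = 0;

    while ((p = next_node()) != NULL) {
        free(p);
        freed++;
    }
    return freed;
}
```

The explicit "!= NULL" is optional; it is there mostly to tell both the reader and the compiler that the assignment inside the condition is deliberate.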
Consider a situation where we are going to be concatenating two strings, but the result cannot be overlong, so we're going to preferentially punish the longer of the two components. In C it's pretty short, and moderately efficient. (If we really cared about runtime efficiency we'd factor the string lengths out of the loop entirely, or even eliminate the loop altogether, but that would be more complex, which carries its own problems. We favor here the simplest form, and modest efficiency. As a general policy Good Enough, with a reasonable path for getting Better if it should later turn out to be necessary, is often the best design choice.)
In C:
    while (((len1 = strlen(s1)) + (len2 = strlen(s2))) > 26)
      if (len1 > len2) s1[len1-1] = 0; else s2[len2-1] = 0;
vs Pascal's similar-sized (but far less efficient):
    while length(s1) + length(s2) > 26 do
      if length(s1) > length(s2) then setLength(s1, length(s1) - 1)
      else setLength(s2, length(s2) - 1);
But! Five different string length calls per loop iteration, versus two. (And only one semicolon.) Any Pascal expression that tries to approach C's runtime efficiency here will be somewhat larger due to the syntactic requirements of the language, which will tend to obscure the intrinsic simplicity of what we are trying to do here. Perhaps:
    while True do begin
      len1 := length(s1);
      len2 := length(s2);
      if len1 + len2 <= 26 then break;
      if len1 > len2 then setLength(s1, len1 - 1)
      else setLength(s2, len2 - 1);
    end;
Bulky. Or, consider another, more pithy example. In C:
    while (p = next()) free(p);
vs Pascal's equally efficient but much more verbose:
    while True do begin
      p := next;
      if p <> nil then free(p) else break;
    end;
or:
    repeat
      p := next;
      if p <> nil then free(p)
    until p = nil;
Which language best embodies the simple concept: "While there is an X, do Y to it?"
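For the string-trimming example above, this is roughly what "factoring the string lengths out of the loop" might look like. It is only a sketch: the function name trim_pair and the MAX_TOTAL constant are mine, and it assumes both strings live in writable buffers.

```c
#include <string.h>

#define MAX_TOTAL 26  /* combined length limit, as in the example above */

/* Shorten whichever string is currently longer, one character at a
   time, until the two together fit within MAX_TOTAL characters.
   Both buffers must be writable; the lengths are measured only once. */
void trim_pair(char *s1, char *s2)
{
    size_t len1 = strlen(s1);
    size_t len2 = strlen(s2);

    while (len1 + len2 > MAX_TOTAL) {
        if (len1 > len2)
            s1[--len1] = '\0';
        else
            s2[--len2] = '\0';
    }
}
```

It is a little longer than the two-line version, which is the trade-off mentioned earlier: Good Enough first, with a clear path to Better.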
In C:
    if (p && *p) {...
vs Pascal's much bulkier:
    if p <> nil then if p^ <> 0 then ...
Pascal is quite a bit wordier. More to the point, you can't even put the Pascal conditional chain everywhere you might want to, like you can with the C idiom.
Pascal compilers now offer an option for short-circuit evaluation, where you could then express:
    if (p <> nil) and (p^ <> 0) then ...
without risking dereferencing a null pointer. But this behavior is optional, controlled by a compiler pragma, so you have to do more work to engage it. And then, just how certain are you that the correct mode has been selected? And that you haven't damaged nearby code that perhaps was relying on the opposite behavior? And that nobody's ever going to change compilers on you?
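In C the guard is part of the language rather than a compiler mode, so a sketch like the following behaves the same everywhere. (The function name has_text is just something I made up for the example.)

```c
#include <stddef.h>

/* Returns 1 when p points at a non-empty string, 0 otherwise.
   The && operator evaluates *p only if p is non-null, so a null
   pointer is never dereferenced. */
int has_text(const char *p)
{
    return p && *p;
}
```

Calling has_text(NULL) is safe precisely because the right-hand operand is skipped when the left-hand one is false.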
'&', '|', and '~' versus '&&', '||', and '!'. (Pascal has only and, or, and not.) This difference really only exists because short-circuiting of logical operators is supported. (Consider that there is no '^^' operator, because there is no short-circuit possible for exclusive-or.) Otherwise the default bitwise values for True/False (1/0) mean that using bitwise operators for logic would often work too, if short-circuit evaluation were not necessary for any reason.
Which is to say, this:
    if ((a == b) | (c == d)) { ...
is logically equivalent to:
    if ((a == b) || (c == d)) { ...
if short-circuiting were not necessary. However, because C treats any non-zero value as True, many otherwise valid logical expressions using bitwise (instead of logical) operators would actually fail. Consider:
    if (a & b) { ...
If a were 1 (True) and b were 2 (also True), this expression, if intended to express logical conjunction, would evaluate to 0 (False). Fail! This expression is perfectly legal C for detecting if any bits are set in common (intersection) between a and b, so it's not like the compiler should try to flag this usage, as it has no way of knowing which behavior you actually had in mind. (Bitwise intersection versus logical conjunction.)
It is usually best if logical operations are done exclusively with the logical operators. Consider that 'a' and '~a' are both True for almost any non-zero value (every one except the all-ones bit pattern)! Bugs of this sort are common for a novice C programmer. C, like any powerful tool, has sharp edges; keep your hands out of the blades! (Short-circuit evaluation is expressly prohibited in Standard Pascal, however it is provided as an extension in many newer Pascals.)
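The difference is easy to see with a few one-line helpers. This is only a demonstration sketch; the function names are invented for the purpose.

```c
/* Logical conjunction: true when both operands are non-zero. */
int both_true(int a, int b) { return a && b; }

/* Bitwise intersection: non-zero only when a and b share a set bit. */
int common_bits(int a, int b) { return a & b; }

/* Logical negation yields exactly 0 or 1 ... */
int logical_not(int a) { return !a; }

/* ... while bitwise complement flips every bit of its operand. */
int bitwise_not(int a) { return ~a; }
```

With a = 1 and b = 2, both_true() reports True while common_bits() reports False; and for a = 2, logical_not() is False while bitwise_not() is still non-zero, hence True.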
You also want to know what data connections there are in your code, and because Pascal's nested functions have implicit access to all their containing functions' variables, all the way back up the chain, this means that the data connection graph is fairly complex, and somewhat hidden. This, IMHO, is unnecessary freedom, and can ultimately result in the programmer not knowing what is truly going on. More bugs, and much more difficulty in refactoring, hampering restructuring that might make sense logically, but which in practice would result in too much work, and so does not get done.
Which is to say, Pascal offers you the near-unlimited opportunity to step on (shadow) function and variable names you don't even know are there. You think you're looking at one thing when examining your code, but what the compiler actually built (according to its rules) might be something else entirely. More bugs that must be found at runtime. C doesn't do this.
In this case, I argue that simpler is better. Complexity is one of the true sources of problems; unnecessary complexity is just stupid. Having more features is not necessarily better; sometimes it's just giving you more rope with which to hang yourself.
printf(). In Pascal, such functions are few (e.g. WriteLn()) and are built in to the compiler; you cannot write your own. (If using an Object Pascal you might be able to get around this, sometimes, by writing a lot of type-specific code.)
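For comparison, writing your own variable-argument function in C takes only the standard <stdarg.h> macros. A minimal sketch (sum_ints is a made-up name, and passing the count first is just one common convention):

```c
#include <stdarg.h>

/* Sum `count` int arguments. The callee cannot discover the number or
   types of its arguments by itself, so the caller passes the count. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);
    va_end(args);

    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) adds up the three trailing arguments. Nothing checks that the count matches what was actually passed, which is the usual sharp edge.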
volatile declaration, which allows you to cleanly write device drivers and inter-thread communications without having to delve into assembly language. (C's bitfields, though non-portable, are also of use when writing device drivers.)
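A small sketch of the device-driver case: polling a status register through a volatile pointer. The register layout and the STATUS_READY bit are hypothetical; real hardware defines its own, and here an ordinary variable can stand in for the register.

```c
#include <stdint.h>

/* Hypothetical status register layout; real hardware defines its own. */
#define STATUS_READY 0x01u

/* Poll until the READY bit is set or `max_polls` reads have been made.
   Because `reg` points to a volatile object, the compiler must perform
   a fresh read on every iteration instead of caching the value. */
int wait_ready(const volatile uint8_t *reg, unsigned max_polls)
{
    while (max_polls-- > 0) {
        if (*reg & STATUS_READY)
            return 1;
    }
    return 0;
}
```

Without the volatile qualifier an optimizer would be free to read the register once and spin on the stale copy.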
for-in loop for traversing strings, arrays, and sets, etc. Ditto. (Only in newer Pascals.)
'with X do' construct, which I have long thought of as a significant Pascal advantage. It saves a lot of typing. On the other hand, it's also a disadvantage, because it introduces yet another layer of implicit context that can cloud the issue of exactly which variables are being referred to. C seemed a lot cruder, but if 'X' is messy enough that repeated typing of it is repugnant, you can always introduce a short variable ("p->") to mitigate that, while not clouding the issue of to what exactly you're referring. And, if you need to be working with two of these structures, as in a copy or comparison situation, which is not exactly unheard-of, Pascal's shortcut is of no help whatsoever.
This point I'm still waffling about, but I'm a lot less impressed with 'with' than I used to be. It looks great in trivial examples, but falls down a lot when you start trying to do real work.
Also consider the following (stripped-down) scenario:
    type
      rec = record
        id : integer;
        timeStamp : TDateTime;
      end;

    var
      timeStamp : TDateTime; // Current timestamp for marking things.

    procedure stampRec(var thing : rec);
    begin
      with thing do
        timeStamp := timeStamp;
    end;
This is syntactically clean and will even compile, but won't do what is intended. The compiler could choose to do any of the following:
    timeStamp := timeStamp;
    timeStamp := thing.timeStamp;
    thing.timeStamp := timeStamp;
    thing.timeStamp := thing.timeStamp;
only two of which are potentially useful, and you don't know which it'll choose. (The third is likely what you had in mind, but the fourth is likely what it would choose to do.) If you're going to use 'with' you have to avoid using variables with the same names as fields, including fields you don't necessarily know/care about, even if that's what would otherwise make the most sense. Information hiding is definitely a double-edged sword.
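Here is roughly how the same scenario looks in C with an explicit pointer. It's a sketch only: time_t stands in for TDateTime, and copy_stamp is an extra helper I added to show the two-structure case.

```c
#include <time.h>

struct rec {
    int    id;
    time_t time_stamp;
};

/* File-scope "current timestamp", mirroring the Pascal variable. */
time_t time_stamp;

/* The field is always written as thing->time_stamp, so it cannot be
   confused with the file-scope variable of the same name. */
void stamp_rec(struct rec *thing)
{
    thing->time_stamp = time_stamp;
}

/* Two records at once: each side of the copy names its own structure. */
void copy_stamp(struct rec *dst, const struct rec *src)
{
    dst->time_stamp = src->time_stamp;
}
```

A few more characters per line, but there is never any doubt about which timestamp is on which side of the assignment.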
Likewise, while C's 'new' prototypes are a boon, how they were introduced was not. I loathe the fact that you must use the new-style function definitions when using prototypes. When rummaging through established code I rarely need to see the types of function arguments, and would much prefer the simpler argument lists of K&R C. If there's a typing problem I can always look at the type declarations, otherwise I can skip over them with ease. The compiler should be smart enough to do lint-style type checking, regardless of the form of the function definition. (In fact, the DIAB C compiler, pre-Standard, allowed mixed use of prototypes and K&R function definitions. There was no ambiguity, and no conflict. However, using the register keyword in prototypes to change the function ABI while not doing so in the function definition was a recipe for disaster.)
Only when you need to use the C preprocessor's magic, like for system configuration, are there advantages to separate .c and .h files. Also, for libraries, where you are interfacing to code whose source is not provided, you end up needing two files anyway: one for the library provider which has declaration and implementation together, and one for the library user that has only declaration. Now you have two files with the same (!) information in them both, which is bad. (Don't Repeat Yourself!) In this use case C has the advantage. (The .h declaration file is shared, the .c implementation file is kept private, and the compiler itself will check the declarations against the definitions when building the library. You maybe did have to Repeat Yourself, but at least an automated tool is checking your work.)
The sub-function naturally has access to all the parent's variables and arguments, so you don't have to craft a linkage; this is something Pascal does for you. Just extract the fragment of common code and refer to it multiple times. In this example we apply the same operation to multiple SQL tables, guaranteed, and we get a composite result status.
    function deleteTransaction(tranNumber : integer; workDate : TDateTime) : Boolean;

      function delFrom(dbt : string) : Boolean;
      begin
        result := False;
        // Give up (result stays False) if Query reports an error.
        if Query('DELETE FROM ' + dbt + ' WHERE register = ' +
                 QI(regNumber) + ' AND workDate = ' +
                 QD(workDate) + ' AND tranNumber = ' +
                 QI(tranNumber)) then exit;
        result := True;
      end;

    begin
      result := delFrom('head') and delFrom('detail') and delFrom('payment');
    end;
On the other hand, if you should have crafted a library routine you end up with a lot of duplicate code, and multiple maintenance issues. (Don't Repeat Yourself!) Also, these little sub-functions should be just that: little. If they get too big you're probably doing something wrong, and obfuscating the structure of your program rather than clarifying it.
Oh, and beware the presence of any compiler pragma that allows short-circuit AND/OR evaluation. If it were on for this example function it could destroy the guarantee that all database tables be modified together! (This is another win for C: having two sets of Boolean evaluators, making this behavior explicit.) The invocation of the sub-function might better be:
    begin
      {$PUSH}{$B+} // Complete evaluation: under no circumstances allow short-circuit evaluation here!
      result := delFrom('head') and delFrom('detail') and delFrom('payment');
      {$POP}
    end;
in order to make the intended behavior explicit. (This is the Free Pascal pragma. Others could vary.)
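In C the same intent can be spelled with the operator itself instead of a pragma. The following is a sketch, not a database layer: del_from() is a hypothetical stand-in that merely counts its calls and pretends the "detail" table fails.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the database call: records that it ran and
   reports failure only for the table named "detail". */
static int calls_made = 0;

static bool del_from(const char *table)
{
    calls_made++;
    return table[0] != 'd';
}

/* '&' evaluates every operand, so all three deletes are attempted
   even after one has failed (the order of evaluation is unspecified). */
bool delete_all(void)
{
    return del_from("head") & del_from("detail") & del_from("payment");
}

/* '&&' stops at the first failure: "payment" is never touched. */
bool delete_until_failure(void)
{
    return del_from("head") && del_from("detail") && del_from("payment");
}
```

Both versions report failure here, but only delete_all() guarantees that every table was visited. If the order of the deletes matters as well, separate statements are the safer choice.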
Where Pascal declares several pointers at once with:
    var p1, p2, p3 : ^Char;
in C the obvious (but naive) declaration:
    char* p1, p2, p3;
does not get you the same thing. You must use:
    char *p1, *p2, *p3;
to declare several pointers. Not really a big deal, and you won't get far trying to compile your code if you got it wrong, but a minor irritant nonetheless.
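A quick sketch of what the compiler actually makes of these declarations, plus one common way around the irritant. (The typedef name char_ptr is my own; nothing standard about it.)

```c
char *p1, *p2, *p3;   /* three pointers to char */
char *q1, q2;         /* q1 is a pointer, q2 is a plain char */

/* A typedef sidesteps the issue: the '*' becomes part of the type name. */
typedef char *char_ptr;
char_ptr r1, r2;      /* both are pointers to char */
```

The '*' binds to the name being declared, not to the type, which is why q2 ends up as a plain char.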
There's a reason that C is the language in which most (if not all, now) newer languages are implemented, at least at first. Its data and control structures align very well to how almost all machines actually work at the low level, so C lives up to one description of it as a "high-level assembly language." With C you are very close to the machine, and can extract the ultimate in performance. Yes, it comes at a cost, but if you're mapping an abstraction to machine level, for whatever reason, that cost must be paid, by somebody. C allows you to get very close to the machine, if that's what you want, yet at the same time it allows you to easily construct very high levels of abstraction, all without hiding from you what's going on. The choices are up to you, and that freedom is C's ultimate strength.