In an earlier project, I was looking at how it might be possible to get the C/C++ preprocessor to chow down on Pascal programs - that is, if the preprocessor would let us do things like temporarily redefining the semicolon or the equals symbol, and so on, with nested #ifdef's, #undef's and the like. Sort of like the following, which doesn't actually work for all of the macros, but does work for some of them, so that you could at least partially convert a Pascal program to C/C++ by creating some kind of "pascal.h" file, adding #include "pascal.h" to your Pascal code, and then grabbing the preprocessor output. Right? Well, no - but almost, very very almost like this:
#define { /*
#define } */
#define PROCEDURE void
#define BEGIN {
#define END }
#define := ASSIGN_EQ
#define = COMPARE_EQ
#define IF if (
#define ASSIGN_EQ =
#define COMPARE_EQ ==
#define THEN )
#define REPEAT do {
#define UNTIL } UNTIL_CAPTURE
#define UNTIL_CAPTURE (!\
#define ; );\
#undef UNTIL_CAPTURE
#define ; );
#define = [ = SET(\
#define ] )\
#define ) \
#undef = [\
#undef ]
// so far so good ....
#define WITH ????????
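Part of the reason this only "almost" works is that macro names have to be legal C identifiers, so the lines above that try to redefine punctuation - { , } , := , = , ; , [ and ] - are rejected by the preprocessor outright. The word-shaped keywords do get through, though, so a stripped-down, legal-subset "pascal.h" might look something like this (just a sketch of the idea, not a complete converter):
/* pascal.h - only macros whose names are valid C identifiers survive;
   punctuation such as ':=' and ';' simply cannot be #define'd, and the
   UNTIL / semicolon trick needs more state than the preprocessor offers. */
#define PROCEDURE void
#define BEGIN     {
#define END       }
#define IF        if (
#define THEN      )
#define REPEAT    do {
With that, a fragment such as
PROCEDURE greet;
BEGIN
    writeln (42)
END;
comes out of the preprocessor as roughly "void greet; { writeln (42) };" - still not C, but the keyword-level rewriting really does happen.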
I mean, if someone else once figured out how to get the GNU C/C++ preprocessor to play TETRIS, then it should be possible to make it do whatever else we want, even if some other powers claim that, strictly speaking, the preprocessor isn't fully Turing complete in and of itself, but is really only some kind of push-down automaton, because of issues like a 4096-byte limit on the length of string literals, and so on. Yeah, right - I think I can live with that one, if what they are saying is, in effect, that it is probably as Turing complete as anyone might actually need it to be.
Still, this gives me an idea that seems worth pursuing: what does ELIZA have in common with the preprocessor, or with a full-blown compiler for that matter? Well, the Eliza code from the previous log entry used the following static string tables - arrays, or whatever you want to call them - based on a C++ port of an old C version, which was in turn converted from an example written in BASIC that most likely appeared in some computer magazine, probably Creative Computing, back in the '70s.
char *wordin[] =
{
"ARE", "WERE", "YOUR", "I'VE", "I'M", "ME",
"AM", "WAS", "I", "MY", "YOU'VE", "YOU'RE", "YOU",NULL
};
char *wordout[] =
{
"AM", "WAS", "MY", "YOU'VE", "YOU'RE", "YOU",
"ARE", "WERE", "YOU", "YOUR", "I'VE", "I'M", "ME",NULL
};
This could probably be tidied up a bit, to be more consistent with the methods I am using in my port of the UCSD Pascal compiler to solve the problem of keyword and identifier recognition - as discussed earlier, and for which I had to write my own TREESEARCH and IDSEARCH functions.
struct subst
{
    const char *wordin;
    const char *wordout;
    subst () : wordin (NULL), wordout (NULL) { }
    subst (const char *str1, const char *str2)
    {
        wordin = str1;
        wordout = str2;
    }
};
Which should allow us to do something like this, even if it is, as of right now, untested.
subst conjugates [] =
{
subst("ARE","AM"),
subst("WERE","WAS"),
subst("YOUR","MY"),
subst("I'VE","YOU'VE"),
subst("I'M","YOU'RE"),
subst("ME","YOU"),
subst("AM","ARE"),
subst("WAS","WERE"),
subst("I","YOU"),
subst("MY","YOUR"),
subst("YOU'VE","I'VE"),
subst("YOU'RE","I'M"),
subst("YOU","I"),
subst(NULL,NULL),
};
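Given a table in that form, the conjugation step itself is little more than a linear scan. Something along these lines ought to do it - translate_word is just a name I am using here, and this is every bit as untested as the table above:
#include <cstring>

// Hypothetical helper: scan a subst table for an exact, case-sensitive match
// and return the replacement, or the original word when no entry matches.
const char *translate_word (const subst table[], const char *word)
{
    for (int i = 0; table[i].wordin != NULL; i++)
        if (strcmp (table[i].wordin, word) == 0)
            return table[i].wordout;
    return word;
}

// e.g. translate_word (conjugates, "YOUR") yields "MY", while
//      translate_word (conjugates, "PIZZA") just gives back "PIZZA".
The point of scanning word by word and translating each word exactly once is that ARE becomes AM without immediately being turned back into ARE by the reverse entry further down the table.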
So, I searched Google for Eliza source code, and among other things I found that variations of Weizenbaum's original paper on the subject are now available, as well as things like GNU SLIP, a C++ implementation of the symmetric list processing language that the original Eliza was actually written in - since it seems that Eliza wasn't written in pure Lisp at all, contrary to popular belief! Yet the documentation for the SLIP language looks impossibly bloated, and it even warns about having a steep learning curve. So I won't venture down that rabbit hole, at least not yet, and will instead continue along the path I am currently following:
Of course, it should by now be obvious that Pascal-to-C conversion might start to look like this:
subst pascal2c [] =
{
subst("{","/*"),
subst("}","*/"),
subst("PROCEDURE","void"),
subst("BEGIN","{"),
subst("END","}"),
subst(":=","ASSIGN_EQ"),
subst("=","COMPARE_EQ"),
subst("IF","if ("),
subst("ASSIGN_EQ","="),
subst("COMPARE_EQ","=="),
subst("THEN",")"),
subst("REPEAT","do {"),
subst("UNTIL","} UNTIL_CAPTURE"),
subst("UNTIL_CAPTURE","(!"),
subst("= [","= SET("),
subst("]",")"),
subst(NULL,NULL),
};
Some work remains to be done on the rules, so as to implement something along the lines of #ifdef and #undef, or some other means of providing context sensitivity, since for now a weird hack is still required that temporarily redefines the semicolon in order to properly close out UNTIL statements, along with rules for capturing the parameters of FOR statements - and the WITH statement, shown just after the sketch below, is still an interesting nightmare in and of itself.
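Just to make the UNTIL hack concrete: the substitution pass needs at least one bit of state, because once UNTIL has been rewritten, the very next semicolon has to be turned into ");" rather than passed through untouched. A rough, untested sketch of that idea - emit_token and until_pending are placeholder names of my own, not anything from the actual converter:
#include <cstdio>
#include <cstring>

static bool until_pending = false;   // armed once an UNTIL has been rewritten

// Hypothetical single-token rewriter: UNTIL opens a "} while (!" that closes
// the "do {" which REPEAT produced, and the next ';' closes it off with ");".
void emit_token (const char *token)
{
    if (strcmp (token, "UNTIL") == 0)
    {
        printf ("} while (!");
        until_pending = true;
    }
    else if (until_pending && strcmp (token, ";") == 0)
    {
        printf (");\n");
        until_pending = false;
    }
    else
        printf ("%s ", token);
}
So REPEAT ... UNTIL done; would come out as do { ... } while (!done );. Meanwhile, back to that WITH nightmare: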
WITH (provided ingredients)
BEGIN
make(deluxe pizza);
serve(deluxe pizza);
END;
Of course, that isn't proper Pascal, and neither is it proper C - but maybe it could be, if the variable ingredients were a member of some class which in turn had member functions called make and serve. Welcome to free-form, natural-language software development! Well, not quite yet. Still, is it too much to ask whether the preprocessor can somehow transmogrify the former into something like this:
void make_pizza (provided ingredients)
{
ingredients->make(deluxe,pizza);
ingredients->serve(deluxe,pizza);
}
Maybe provided is an object type, and ingredients is the variable name or specialization, so that we can, in the style of the Pascal language, tell the C/C++ preprocessor to find a way to call the make function, which is a member of the provided ingredients class hierarchy, and which in turn can find the appropriate specializations for making not just a pizza, but a deluxe pizza. This is just as we might call the Pascal WRITELN function with a mixture of integers, floats, and strings, leaving it to the preprocessor or compiler to resolve the object types, so that WRITELN knows which sub-specialization to invoke on a per-object basis. I figured out how to do that elsewhere in C++ by using an intermediate s_node constructor to capture the object type via polymorphism, thus allowing C-style var_args to carry type information - which, as far as I know, can't be done in pure C, but which, as I have shown elsewhere, can be done in C++, via a hack!
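I haven't reproduced the actual s_node code here, but the general trick can be sketched with plain constructor overloading and a type tag (everything below other than the s_node name is a placeholder of my own, and the real code may differ): each constructor records what kind of value it was handed, so the variadic callee can recover the types that C-style var_args would otherwise throw away.
#include <cstdarg>
#include <cstdio>

// Each constructor overload stamps the wrapper with a type tag.
struct s_node
{
    enum kind { T_INT, T_FLOAT, T_STRING } tag;
    int         i;
    double      f;
    const char *s;
    s_node (int v)         : tag (T_INT),    i (v) { }
    s_node (double v)      : tag (T_FLOAT),  f (v) { }
    s_node (const char *v) : tag (T_STRING), s (v) { }
};

// A WRITELN-ish variadic function: 'count' s_node pointers follow,
// and each one already knows what it is.
void writeln (int count, ...)
{
    va_list ap;
    va_start (ap, count);
    for (int n = 0; n < count; n++)
    {
        const s_node *p = va_arg (ap, const s_node *);
        switch (p->tag)
        {
            case s_node::T_INT:    printf ("%d", p->i); break;
            case s_node::T_FLOAT:  printf ("%g", p->f); break;
            case s_node::T_STRING: printf ("%s", p->s); break;
        }
    }
    printf ("\n");
    va_end (ap);
}

// e.g.:
//   s_node a (3), b (" slices of "), c ("deluxe pizza");
//   writeln (3, &a, &b, &c);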
And thus we inch ever so slowly toward figuring out how to accomplish free-form, natural-language programming. Obviously, if the meanings of words can be deduced, and intentions can be ascribed from them, then it should follow that the appropriate objects and methods can be associated with those intentions when proceeding algorithmically - or else there should be a way of programmatically generating a corresponding data-flow-based approach, which could be embodied in the form of some kind of neural network.
I think that the Pascal compiler source might be making an appointment to have a conversation with Eliza sometime in the near future.