Ann: Paper about Web Prolog (Discussion)

Nice …

Not sure about the “intelligent conversational agent” – it’s quite an (over)loaded term these days, covering everything from chatbots to Alexa to what-not.

Perhaps a forward-looking subtitle might be worthwhile … describing what we would see manifested in the future once it’s available everywhere …

Also, consider terms such as “next generation” or “Web 3.0 / 4.0 / 5.0”.

1 Like

Programmable Semantic Web?

Prolog-Web – I start to like it more and more – it’s succinct.

Prolog Semantic Web.

1 Like

I’m with Dan on this. You don’t need to be cute - this is a technical paper. Anyone who can get it doesn’t need mottos and such.

Ok, let me try to motivate the choice of subtitle. (Sorry about the length of this post. You can always ignore it if you want, but then you would miss what I consider an interesting aspect of Web Prolog and the Prolog Web.)

Developing Web Prolog into a language that allows people to program such agents is actually what I think we should be aiming for. But before I try to explain how this might be done for agents such as Siri or Alexa, I’d like to start with something simpler, namely pengines. I like to think of a pengine as a simple kind of intelligent and conversational software agent, and of the Prolog Web as the environment in which such agents are born, act and die. While they are alive, they talk to other agents that populate the Prolog Web, some of which are software agents, some of which are humans.

Pengines are agents

The notion of an agent is rather fuzzy, but there are at least three properties most theorists would agree a software agent must possess: it must be a process of sorts, only loosely coupled to other processes; it must be stateful, and thus have a kind of memory of its own; and it must be capable of interacting with the world external to it. Note that under this definition any stateful actor would qualify as an agent, and even Erlang might be seen as an agent programming language. A pengine is a kind of actor which, in addition to the properties listed above, has two other traits we intuitively tend to associate with agenthood: it is capable of reasoning and capable of giving answers to queries – answers that follow logically from what it believes.
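To make the actor analogy concrete, here is a minimal sketch of a stateful actor written in Web Prolog style. It uses the spawn and receive primitives that appear later in this post, plus self/1 and the !/2 send operator; take the exact names and the multi-clause receive syntax as illustrative rather than definitive:

% A counter process: a process of sorts, stateful (it remembers N),
% and capable of interacting with the outside world via messages.
counter(N) :-
    receive({
        incr ->
            N1 is N + 1,
            counter(N1);
        report(Sender) ->
            Sender ! count(N),
            counter(N)
    }).

?- spawn(counter(0), Pid),
   Pid ! incr,
   self(Self),
   Pid ! report(Self),
   receive({count(C) -> true}).
C = 1.

Add a query-answering capability on top of such a loop and you are already most of the way to a pengine.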

The intelligence of a pengine agent is of course very limited. It is capable of an elementary form of reasoning from knowledge in the form of Prolog source code, and that is about it. The conversational abilities of a pengine are also very limited: it is capable of answering simple questions based on conclusions it draws from the knowledge it has at its disposal.

The birth, life and death of a pengine

Note: This section is more or less an extended version of the example in my Erlang’19 paper. I included it here just to make the point that talking to Prolog is like talking to something that, at a certain level of abstraction, can be described as an intelligent conversational agent. If you already believe this point, you can skip to the section about “real” voice-based intelligent and conversational agents.

Below, we show how to create and interact with a pengine process which is running as a child of the current top-level process. Indeed, what we have here is a pengine running another pengine, a Prolog top-level running another Prolog top-level:

?- pengine_spawn(Pid, [
       node('http://ex.org'),
       src_text("p(a). p(b). p(c)."),
       monitor(true),
       exit(false)
   ]),
   pengine_ask(Pid, p(X), [
       template(X)
   ]).
Pid = 439752@'http://ex.org'.
?- flush.
Shell got success(439752@'http://ex.org',[a],true)
true.
?- pengine_next($Pid, [
       limit(2)
   ]),
   receive({Answer -> true}).
Answer = success(439752@'http://ex.org',[b,c],false).
?-

There is quite a lot going on here. The node option passed to pengine_spawn/2 allowed us to spawn the pengine on a remote node, the src_text option was used to send along three clauses to be injected into the process, and the monitor option allowed us to monitor it. These options are all inherited from spawn/3.

Given the pid returned by the pengine_spawn/2 call, we then called pengine_ask/2-3 with the query ?- p(X), and by passing the template option we determined the form of the answers. Answers were returned to the mailbox of the calling process (in this case the mailbox belonging to the pengine running our top-level). We inspected them by calling flush/0. By calling pengine_next/2 with the limit option set to 2, we then asked for the last two solutions, this time using receive/1 to view them.

Since we passed the option exit(false) to pengine_spawn/2, the pengine is not dead, and we can use it to demonstrate how I/O works:

?- pengine_ask($Pid, pengine_output(hello)),
   receive({Answer -> true}).
Answer = output(439752@'http://ex.org',hello).
?-

Input can be collected by calling pengine_input/2, which sends a prompt message to the client; the client can respond by calling pengine_respond/2:

?- pengine_ask($Pid, pengine_input('|:', Answer)),
   receive({Message -> true}).
Message = prompt(439752@'http://ex.org','|:').
?- pengine_respond($Pid, hi),
   receive({Message -> true}).
Message = success(439752@'http://ex.org',[pengine_input('|:',hi)],false).

The pengine is still not dead, so let us see what happens when a query such as ?- repeat, fail is asked:

?- pengine_ask($Pid, (repeat, fail)).
true.
?-

Although nothing is shown, we can assume that the remote pengine is just wasting CPU cycles. Fortunately, we can always abort a runaway process by calling pengine_abort/1:

?- pengine_abort($Pid),
   receive({Answer -> true}).
Answer = abort(439752@'http://ex.org').
?-

When we are done talking to the pengine we can kill it:

?- pengine_exit($Pid, goodbye),
   receive({Answer -> true}).
Answer = down(439752@'http://ex.org',goodbye).
?-

Note that messages sent to a pengine will always be handled in the right order, even if they arrive in the “wrong” order (e.g. next before ask). This is due to selective receive, which defers their handling until the PCP protocol permits it. This behaviour guarantees that pengines can be freely “mixed” with other pengines or actors. The messages abort and exit, however, will never be deferred.
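The effect of selective receive is easy to demonstrate with plain actors. In the following sketch (again assuming spawn/2, self/1 and !/2), a message that arrives “too early” simply stays in the mailbox until the process is ready for it:

?- self(Me),
   spawn((
       receive({first -> true}),    % matches only `first`; `second` is deferred
       receive({second -> true}),
       Me ! done
   ), Pid),
   Pid ! second,                    % arrives first, handled last
   Pid ! first,
   receive({done -> true}).
true.

A pengine does the same thing with PCP messages: a next that arrives before the corresponding ask has been handled is left in the mailbox until the protocol state allows it.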

Voice-based intelligent conversational agents

Pengines are kind of dumb, but agents can become both smarter and more conversational through programming. If the most optimistic AI researchers are right, there is in fact no limit to how smart a software agent might become. To be maximally useful to humans, however, agents need to be able to talk to us in natural language.

As a sign of where voice user interface technology may be heading, here is a call for participation in the Conversational Interaction Conference that was held in San Jose, California, on March 11th and 12th, 2019:

Talking to computers has long been a staple of science fiction. Today, talking or typing in human language to computers is becoming commonplace.

But this isn’t just a neat trick. Sure, you can ask Alexa to turn off the lights, play a song of your choice, or tell you a joke. But you can also ask her to connect with a company and talk to that company about products, services, issues, and even buy something.

Conversational interaction isn’t just a major technology trend in Artificial Intelligence. It’s also a breakthrough in user interface technology. It’s coming at a time it is needed, as the Graphical User Interface (GUI) that has served us so well is getting overburdened with too many features, icons, and long menus - with the small screen of mobile phones further limiting its effectiveness. The user manual for the Conversational User Interface is simply “Say or type what you want” or, even more simply, “How can I help you?”

Companies are hearing repeatedly about having an “AI Strategy,” but that vague admonition comes with significant hurdles, even to understand what it means. But every company can use the AI technology of conversational interaction. Companies can, for example, use it to improve customer service while lowering its cost or make employees more efficient.

And the good news is - there are many vendors providing tools that reduce the effort and risk in using this technology. This isn’t the future - companies are benefiting now!

That’s a lot of marketing lingo, but Vlad Sejnoha, former CTO of Nuance Communications, has written a brilliant, enthusiastic yet sobering article in Wired magazine about the future prospects for intelligent conversational agents, in which he asks, with reference to the film Her: “Can We Build ‘Her’?: What Samantha Tells Us About the Future of AI”.

Hardware devices for intelligent conversational agents

The figure below shows five different hardware devices, all (normally) connected to the Web, often over Wi-Fi, sometimes over 4G or 5G. They can be seen as representing the most recent additions to the infrastructure of the Web.

a) A mobile phone is used by more people than ever to access the Web. Most mobile phones are equipped with a virtual assistant (such as Siri or Google Now) that uses a voice interface to answer questions, make recommendations, and perform actions.

b) Amazon Echo is a smart speaker equipped with the virtual assistant Alexa. According to Amazon, 100 million devices with Alexa built in have been sold.

c) Mycroft Mark II is an open-source alternative to Amazon’s Echo device.

d) Furhat is a social robot that communicates with us humans as we do with each other - by speaking, listening, showing emotions and maintaining eye contact. According to Furhat Robotics, the company that makes it, it is “the world’s most advanced robot of its kind. In any scenario where communication is required, Furhat can potentially fill this gap. Ask questions, practice interviews, train your skills, play games or learn something new.” See here for a lot of video clips. (Note: I know the people who founded the company, and I admire what they’ve done, but I don’t work for them.)

e) Oculus Go is a Virtual Reality (VR) headset. It is equipped with a browser that allows the user to access the VR world. While inside (and maybe in the context of a game), the user may encounter virtual conversational agents (perhaps NPCs) that want to strike up a conversation. For inspiration, you may want to listen to a podcast episode about Human Interact’s Starship Commander, which uses AI-enabled, voice-activated commands to let players participate in an interactive story.

So, how to program these things?

There appear to be four main drivers behind the trend towards conversational voice user interfaces: 1) machine learning has improved a lot, 2) speech technologies have matured and error rates have gone down, 3) the Web (at least from a technological point of view) is in good shape and is only getting faster, bigger and better, and 4) as we saw above, new kinds of hardware devices that take advantage of improvements in those areas are now commercially available.

Technologies that in my opinion ought to be able to play an important role in these developments, but which seem to be under-utilised at this time, are 1) technologies for symbolic knowledge representation and reasoning, and 2) technologies for specifying interaction. I think Web Prolog and the Prolog Web have something to contribute here.

Interestingly, Prolog has already been used (by people in this group!) for programming Alexa. Sam Neaves (@sam.neaves) made a video clip in the “Playing with Prolog” series, where he described how to set things up. More recently, Falco Nogatz (@fnogatz) et al. published an interesting paper dealing with the subject.

With built-in logic-based knowledge representation and reasoning, and a built-in grammar formalism for parsing and generating natural language, Prolog appears to be ideal for this sort of thing, and with the even more powerful means for knowledge representation that tabling and the well-founded semantics bring to the language, it gets even better. However, conversational systems, at least sophisticated ones (e.g. Furhat robots and VR game NPCs), need concurrency as well as primitives for sending and receiving messages. I never felt that Prolog was a good language for programming interaction, at least not when fine-grained real-time interaction is called for; traditional Prolog just doesn’t provide us with the proper means. This is where the actor model and the Erlang-ish features of Web Prolog come in handy. Most likely, sophisticated conversational agents can be built from components that are themselves actors running concurrently, allowing agents, as it were, to think, listen, speak and act at the same time.
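As a small illustration of the grammar formalism, here is a toy DCG in plain Prolog (nothing Web Prolog-specific about it). The point is that one and the same grammar can be run in both directions, for parsing as well as for generation:

% A toy grammar relating word lists to simple query terms.
question(q(P))  --> [is, it, true, that], statement(P).
statement(p(X)) --> name(X), [is, a, p].
name(X)         --> [X], { member(X, [a, b, c]) }.

?- phrase(question(Q), [is, it, true, that, b, is, a, p]).
Q = q(p(b)).

?- phrase(question(q(p(a))), Words).
Words = [is, it, true, that, a, is, a, p].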

As far as I’m aware, event-driven state machines have not been used much in the Prolog world. Perhaps the reason is that primitives for sending and receiving messages did not appear in Prolog until (a few) platforms implemented the ISO Prolog Threads draft standard. As can be seen from an Erlang/OTP behaviour such as gen_statem, Erlang takes event-driven state machines very seriously. I think it may be time for Prolog to follow suit. I have therefore, in a chapter of my manuscript, provided a somewhat sketchy proposal for introducing Web Prolog as a scripting language for State Chart XML (SCXML), a W3C standard which provides an XML-based notation for statecharts.
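To give a flavour of what a gen_statem-like style might look like in Web Prolog, here is a sketch in which each state is a predicate that performs a selective receive and then calls the successor state. As before, the exact receive syntax is illustrative:

% A door with two states. Events arrive as messages; each state
% decides which events it handles and what the next state is.
door(locked) :-
    receive({
        code(1234) -> door(open);
        code(_)    -> door(locked)
    }).
door(open) :-
    receive({
        close -> door(locked)
    }).

Since each state ends with a (last) call to the next state, last-call optimisation turns the whole thing into a loop that runs in constant space.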

SCXML is based on the graphical statechart notation introduced by David Harel (Harel, 1987). Statecharts already have a solid reputation as a great tool for the design and implementation of user interfaces, and because of this, I believe that the combination of SCXML and Web Prolog might be a great choice for programming conversational systems such as digital assistants and various forms of multi-modal user interfaces.

A chapter on Web Prolog and SCXML is here: https://github.com/Web-Prolog/swi-web-prolog/raw/master/book/web-prolog-and-scxml.pdf .

So, to summarise my argument: I think Web Prolog can be “sold” not only as a language for building a Prolog Web, but also as an excellent choice of language for programming voice-based intelligent and conversational agents.

(Again, sorry about the length of this post.)

1 Like

Of course you don’t.

Of course you can. I have a whole chapter on that - chapter 8 in the manuscript. It is obvious that the SCXML + Web Prolog combo is a lot more expressive than what can easily be built in the way described in this chapter. Also, statecharts are a visual formalism, so statechart editors such as YAKINDU Statechart Tools can be used - under a license which is free if you’re in academia.

Commercial companies that develop conversational software agents are in agreement that state machines have an important role to play. But simple state machines won’t do. Furhat robots, for example, are controlled by means of hierarchical state machines (see here for evidence), implemented in Kotlin. SCXML is an XML-based W3C standard - the result of the work of specialists on voice-based and/or multi-modal UIs, taking Harel’s statecharts as a point of departure. The statechart formalism was chosen because it is well-known, fairly formal, and adds not only hierarchical but also parallel states to state machines.
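For readers who have not met hierarchical state machines before, the core idea is easy to sketch in plain Prolog: a state first tries its own transitions and then falls back on those of its ancestors. The state and event names below are illustrative, not taken from Furhat or SCXML:

% Parent relation between states.
parent(asking_name, dialogue).
parent(asking_age,  dialogue).

% State-specific transitions: transition(State, Event, NextState).
transition(asking_name, utterance(Name), asking_age) :- atom(Name).
transition(asking_age,  utterance(Age),  done)       :- integer(Age).

% Inherited transition: every state under dialogue handles quit.
transition(dialogue, quit, done).

% Dispatch tries the current state first, then its ancestors.
dispatch(State, Event, Next) :-
    (   transition(State, Event, Next)
    ->  true
    ;   parent(State, Super),
        dispatch(Super, Event, Next)
    ).

For example, ?- dispatch(asking_name, quit, Next) succeeds with Next = done through the transition inherited from dialogue.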

SCXML was off to a slow start, as it was used mostly by the organisations that were involved in the standardisation work. But recently I’ve noticed that JavaScript front-end developers have started to form a community around a library called XState, which implements statecharts and claims to aim for compatibility with SCXML. This community has grown fast, and I wouldn’t be surprised if the use of statecharts became really popular in the future. See here for a list of resources.

I like SCXML. But having said that, I was involved in the standardisation process, and I now think certain mistakes were made (for which I do of course accept a certain amount of responsibility), which should be corrected in SCXML version 2.0. The changes I propose would allow SCXML authors to pick Web Prolog as a scripting language rather than JavaScript (or ECMAScript, to be more exact) - more or less the default now. I think it would be worth it, because a combination of SCXML and Prolog is powerful and exactly what is needed for complex smart GUIs, VUIs or MMUIs (multi-modal UIs). The draft chapter I point to hints at how this combination might be designed. It’s a first draft, and I’d love to get comments from people who take the time to read it.

Yes, last-call optimisation is crucial. And Erlang has it, for sure.

Interesting discussion - it takes me back to a previous existence. Some of the issues, especially around supporting frame-based representations, touch on old debates in the DB/KR communities about separating the internal and conceptual levels. For Web Prolog, or whatever it ends up being called, the question is what is needed at which level.
See, for example: Steven Twine, “Mapping Between a NIAM Conceptual Schema and KEE Frames”, Data & Knowledge Engineering 4(2), pp. 125-155, 1989.
Then there is also Brachman’s paper on what IS-A is and isn’t.
Disclaimer: I did not contribute directly in any way to this paper; I was part of the DB research group at Uni. Queensland at the time.
In general, lest we re-tread old paths: I suspect much of the discussion from the DB community on separating levels of representation will be relevant again, and will probably point to issues that will need to be pinned down at some point if some kind of standard is to be agreed.
Regards, David Duke

2 Likes

Thanks for your comments. They are much appreciated!

Prof. David Duke of University of Leeds?! I’m doing a PhD supervised by Vania and Tony! I was also in the last cohort you taught Parallel and Concurrent Programming to. Glad to see you here!

2 Likes

Hello Paul, yes, one and the same. Pleased to hear you are carrying on. Unfortunately I’m heading towards an early retirement due to a stroke, but I’m using the opportunity to catch up on unfinished business. I have a long-standing interest in logic programming, and SWI was a great vehicle for what I was interested in doing, so the forum was an obvious place to look. Pleased you still recall the P&C course. I’m afraid I didn’t have time to cover some of the higher-level approaches or theory, but from what I can see you are finding your own way - enjoy working on the PhD. I won’t say good luck, as luck shouldn’t come into it…

regards
David

2 Likes