Tuesday, September 11, 2007

Mindmapping the papers, ideas for an article

I spent the last two days getting an overview of another ten or so papers on commonsense reasoning over natural language, natural language processing, and possible user interaction.
Quite a wide field.
I also talked to my colleague Tom, and we agreed on several points which we need to address in the next couple of weeks:
  • First of all, we found the perfect text which we would like to interpret. It's a strict set of rules, it makes sense, everybody knows it and can relate to it, and we will not have to bother too much with ambiguities and weird word usage.
    What am I talking about? Well, I am talking about the official rules of chess.
  • We will write an article on the state of the art in processing natural language and converting it into program code or something similar.
  • We will then add our own thoughts and the extensions which we mean to introduce in the next couple of months, to round out the picture.
So far, there are many approaches to dealing with natural language.
One uses the semantics of the English language and transfers them into programmatic semantics. Others rely only on controlled languages and specified domains. Some intertwine their concepts with MDA, and some have started to reason/infer on natural sentences.

All of these ideas bring us closer to what we want, and the complete picture is now clearer:
We want to give the program the chess specification and receive a UML model from which code can be generated that can actually "play chess".

Tom's approach with graphs (I will explain that in a later post) abstracts from many other solutions because it initially relies on thematic roles. From then on, it's all graph transformations, including reasoning. The latter will be my part. Once the initial prose has been annotated, no predefined objects or the like are needed.
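To make the idea of "thematic roles first, graphs afterwards" a bit more concrete, here is a minimal sketch of how one annotated sentence could become a small graph. This is not Tom's actual approach (that comes in a later post); the role labels (AGENT, ACTION, THEME, GOAL), the `Annotation` class, the `build_graph` function and the example sentence are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One annotated phrase: the text span and its thematic role."""
    text: str
    role: str  # hypothetical label set: "AGENT", "ACTION", "THEME", "GOAL"


def build_graph(annotations):
    """Turn one annotated sentence into labelled edges (action, role, phrase).

    Assumption for this sketch: every sentence has exactly one ACTION,
    and every other phrase hangs off that action via its role.
    """
    actions = [a for a in annotations if a.role == "ACTION"]
    if len(actions) != 1:
        raise ValueError("this sketch expects exactly one ACTION per sentence")
    action = actions[0].text
    return [(action, a.role, a.text) for a in annotations if a.role != "ACTION"]


# Made-up example sentence in the spirit of the chess rules.
sentence = [
    Annotation("the player", "AGENT"),
    Annotation("moves", "ACTION"),
    Annotation("the king", "THEME"),
    Annotation("to an adjoining square", "GOAL"),
]

edges = build_graph(sentence)
for edge in edges:
    print(edge)
```

The point of the sketch is only that, after annotation, nothing in the graph depends on English word order: the edges carry roles, not positions.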

The disadvantage of many approaches so far is that they mostly rely on the specifics of the English language. We understand that this whole concept has to work with any language out there, or at least a great many of them.

The steps to be carried out are therefore:

  1. Annotation
    (Semi-)automatically annotate the initial text with its thematic roles.
  2. Processing
    Process the annotated text and create an initial graph.
    Use graph transformation to create an initial UML model.
  3. Reasoning
    Use reasoning to get rid of ambiguities and of duplicate objects which belong together.
    Also use reasoning to split obvious "objects" into other objects with certain properties, e.g. "The cold bottle" could be one object. But what if a "warm bottle" comes around the corner later? Is this a new object, or do you just have an object "bottle" which has the property "temperature" with possible values such as "cold" and "warm"?
    Good question, huh? Well, we'll try to do the latter - it just makes more sense.
    Presumably, the reasoning itself will also be done purely through graph transformations.
  4. Processing
    Process the results of reasoning again and create the new UML model.
    Then transfer this model into code using any of the popular methods for generating code from UML.
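
The bottle question from step 3 can be sketched in a few lines. This is only a toy illustration, not our planned reasoning step: the `PROPERTY_OF` lookup table and the `merge_objects` function are hypothetical, and a real version would work on the graph rather than on raw strings.

```python
from collections import defaultdict

# Hypothetical lookup: which adjective is a value of which property.
PROPERTY_OF = {"cold": "temperature", "warm": "temperature", "hot": "temperature"}

ARTICLES = {"the", "a", "an"}


def merge_objects(noun_phrases):
    """Fold phrases like 'cold bottle' and 'warm bottle' into one object per noun.

    The last word of each phrase is taken as the noun; known adjectives
    become values of a property on that noun. Unknown modifiers are ignored.
    """
    classes = defaultdict(lambda: defaultdict(set))
    for phrase in noun_phrases:
        words = [w for w in phrase.lower().split() if w not in ARTICLES]
        if not words:
            continue
        *modifiers, noun = words
        properties = classes[noun]  # registers the noun even without modifiers
        for modifier in modifiers:
            prop = PROPERTY_OF.get(modifier)
            if prop is not None:
                properties[prop].add(modifier)
    return {
        noun: {prop: sorted(values) for prop, values in props.items()}
        for noun, props in classes.items()
    }


model = merge_objects(["The cold bottle", "a warm bottle"])
print(model)  # {'bottle': {'temperature': ['cold', 'warm']}}
```

So instead of two objects "cold bottle" and "warm bottle", the result is one object "bottle" with a property "temperature", which is exactly the second option described above.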
That's it for today - more at the end of the week. I still have loads of papers in front of me which I have to read ...
