Monday, October 29, 2007

Summing up my work in 19 slides

I just finished summing up the work of the last 12 months into 19 slides for my PhD-Meeting tomorrow.
I will upload the files as soon as the web server that is supposed to host them is back online.

Tomorrow will be another "Try to get used to working with (Research-)Cyc" day. That thing is so powerful and yet so awkward to use at times ... and the stubby Java interface that ships with it doesn't help much.
Well, I guess I'll have to deal with it.

I will give you a short summary of my slides later tonight, after I have refined them once more.

So long.

Wednesday, October 24, 2007

Preparing for PhD Meeting

I have a talk with my fellow PhD students and my professor on Tuesday evening.
There I will present my results: what I have done so far, what I have achieved and what the future possibly holds for me.
This also includes getting feedback on my work and directions for the next few weeks and months of work.

I will offer my presentation as a download here as soon as it's done. I should be done with it this Sunday. At least that's my plan.

See you guys soon.

Tuesday, October 9, 2007

Bordering aspects

The system we dream of is supposed to generate program code from natural language.
So far, so good.

My job is to put enough "common sense" into the processing task so that most of the mistakes machines still make can be avoided.
Or, to cite Voltaire here: the problem with common sense is that it is not so common.

But what if we succeed in this task? Somebody will still have to work with the piece of software we generated. Somebody will have to use it. So what are the interfaces? How can our software interact? How easy will it be to understand that piece of software?
Will it make up for the time we saved generating the code? What do we have to look out for?

To sum it up: What happens around our scenario? Do we have to address that or can we just ignore how to deal with what the machine is supposed to deliver? I don't know yet.

Question after question, which I will try to address in the coming weeks as well.
Thanks to my friend Georg for this valuable tip.

Monday, October 8, 2007

Annotating text - questions raised

Hey, it's been almost 2 weeks.
But I do have some results for you. They look like this:

"Leaving one’s own king under attack, exposing one’s own king to attack and also ’capturing’ the opponent’s king are not allowed."

This would be the original text from the specification.
In order to transform this into a graph, thematic (theta) roles have to be assigned to each necessary part of the sentence. The redundant/disposable words are simply "sharpened out" by marking them with a #.

The output of this sentence would be something like this:

[ { [ Leaving|ACT one`s1|POSS #own king|{HAB,STATII} under_attack|STAT, ] , [ exposing|ACT one`s2|POSS #own king|{HAB,STATII} to_attack|STAT ] , [ #and #also capturing|ACT #the opponent`s|POSS king|HAB ] }|MODII #are not_allowed|MOD. ]

one`s1 <= They
one`s2 <= They
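
To make the notation a bit more tangible, here is a minimal Python sketch that reads one flat clause of such an annotation and returns the words with their roles, skipping everything marked with a #. The function name parse_clause and the decision to handle only a single, non-nested clause are my assumptions for illustration; this is not the tooling actually used for the annotation.

```python
def parse_clause(clause):
    """Split one flat annotated clause into (word, roles) pairs.

    Assumed token format: word|ROLE or word|{ROLE1,ROLE2}.
    Tokens starting with '#' are treated as disposable and skipped.
    """
    pairs = []
    for token in clause.split():
        if token.startswith("#"):
            continue  # redundant word, marked with a #
        word, sep, role = token.partition("|")
        if not sep:
            continue  # no role attached (e.g. a bracket or comma)
        role = role.rstrip(",.")              # drop trailing punctuation
        roles = role.strip("{}").split(",")   # {HAB,STATII} -> two roles
        pairs.append((word, roles))
    return pairs


clause = "Leaving|ACT one`s1|POSS #own king|{HAB,STATII} under_attack|STAT"
for word, roles in parse_clause(clause):
    print(word, roles)
```

Nested groups such as the { ... }|MODII part would need a real parser; the sketch only covers the innermost brackets.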

This might look a little confusing at first, but it also shows quite impressively how easy it is for us humans to understand the relations and concepts in a sentence. Sitting there and annotating the text by hand, however, quickly shows that many things are processed implicitly by our brain and are actually quite hard to put on paper.

In short:
Reading the above sentence does not make you think of possessors, habitums and stati at once, does it? We recognize verbs and nouns, the rest just seems to come "naturally".

This is, in my opinion, the biggest obstacle when it comes to machine understanding.

Anyway, several questions were raised, especially concerning reasoning. Those were:
  • How will we be dealing with numerals after all?
  • When is a word a numeral, when an article?
    • e.g.: "one can find ..." or "you can move with only one player"
      • one == same?
      • one == 1?
      • one == one/you?
  • How can relations between numerals be detected?
    • e.g.: "The chessboard has 8x8 fields. Those 64 fields ..."
  • What happens to prepositions which seem unnecessary during annotation but actually do or can change the semantics of the sentence?
    • e.g.: "the near corner square to the right of the player is white"
    • "to the right of the player" (shows a location) is different from "the right of the player" (could also mean the right in a jurisdictional way)
  • Difference between verbs and their tense:
    • e.g.: "checkmate" vs. "checkmated", which mean different things
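
As a first idea for the numeral-relation question above, a very naive Python sketch could look for "AxB" expressions and check whether a later number in the text equals the product. The function name link_products and the whole approach are just my assumption of how one could start; it covers only this one pattern and says nothing about the harder "one" cases.

```python
import re


def link_products(text):
    """Pair each 'AxB' expression with later numbers equal to A*B."""
    links = []
    for m in re.finditer(r"\b(\d+)\s*x\s*(\d+)\b", text):
        product = int(m.group(1)) * int(m.group(2))
        rest = text[m.end():]  # only look at text after the expression
        for n in re.finditer(r"\b\d+\b", rest):
            if int(n.group()) == product:
                links.append((m.group(), n.group()))
    return links


text = "The chessboard has 8x8 fields. Those 64 fields ..."
print(link_products(text))
```

For the chessboard sentence this links "8x8" to "64", which is exactly the kind of relation a reader resolves without noticing.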
Well, a lot of new stuff to think about I guess ...