27 September 2010

Lecture 9: Dynamic programming and the CKY algorithm

We will begin by briefly attempting to recover from my confusing explanation of X-bar theory, and talk about specifiers, complements and adjuncts.  We'll then finish up the DP versus NP distinction and IP for S, and then movement.  We'll then get to the computational side...

Parsing as search

At a basic level, the parsing problem is to find a tree rooted at S (or IP...) that spans the entire sentence.  We can think of this as a (bottom-up) search problem over partial hypotheses (e.g., hypotheses that span an initial segment of the sentence).
  • At any given time, we've built a partial tree over the first n words
  • Want to extend it to include the (n+1)st word
  • A search step involves:
    • Choose a rule from the grammar
    • Choose existing constituents that match that rule
    • Create a new constituent
Note that there are other ways to search, e.g., top-down.

Dynamic Programming

Search can be expensive because we end up re-creating the same constituents over and over again.  This suggests dynamic programming.  For the following, we will assume our grammar has been binarized, so rules always look like "A -> B C".  Unary rules (e.g., "A -> B") can be handled too, as a simple extension.
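
The notes assume binarization has already happened and do not say how it is done.  As a rough illustration only, here is one common way to split a longer rule into binary ones.  The function name binarize, the (lhs, rhs) tuple representation, and the "VP|NP_PP" naming of the fresh non-terminals are all assumptions made for this sketch.

```python
def binarize(rules):
    """Split rules with more than two right-hand-side symbols into binary rules.

    `rules` is a list of (lhs, rhs) pairs, where rhs is a tuple of symbols.
    Fresh non-terminals are named "lhs|rest_of_rhs" (a convention assumed here).
    Rules with one or two right-hand-side symbols pass through unchanged.
    """
    binary = []
    for lhs, rhs in rules:
        current = lhs
        remaining = tuple(rhs)
        while len(remaining) > 2:
            # Peel off the first symbol; a new non-terminal stands for the rest.
            new_symbol = f"{lhs}|{'_'.join(remaining[1:])}"
            binary.append((current, (remaining[0], new_symbol)))
            current = new_symbol
            remaining = remaining[1:]
        binary.append((current, remaining))
    return binary


rules = [("VP", ("V", "NP", "PP")), ("S", ("NP", "VP"))]
binarized = binarize(rules)
for lhs, rhs in binarized:
    print(lhs, "->", " ".join(rhs))
```

Here "VP -> V NP PP" becomes "VP -> V VP|NP_PP" plus "VP|NP_PP -> NP PP", while the already-binary "S -> NP VP" is left alone.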

We will write sentences as "0 The 1 man 2 ate 3 a 4 sandwich 5" where the numbers indicate positions between words.  A partial hypothesis has a label (e.g., "NP") and a span (e.g., "from 0 to 2").  We will write this as NP[0,2].

The key insight in dynamic programming is to store a chart.  A chart is a two-dimensional array of, e.g., sets.  Chart[i,j] will store all possible constituents that can span the range from i to j.
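
To make the position numbering and the chart concrete, here is a minimal sketch.  It uses a dictionary keyed by (i, j) pairs in place of a two-dimensional array, and the tags "DT" and "NN" are assumed for illustration; neither choice comes from the lecture.

```python
from collections import defaultdict

words = "The man ate a sandwich".split()

# Positions sit between words: 0 The 1 man 2 ate 3 a 4 sandwich 5,
# so the word at Python index i occupies the span [i, i+1].
spans = {(i, i + 1): word for i, word in enumerate(words)}

# chart[(i, j)] is the set of labels that can cover positions i..j.
chart = defaultdict(set)
chart[(0, 1)].add("DT")  # "The"
chart[(1, 2)].add("NN")  # "man"
chart[(0, 2)].add("NP")  # NP[0,2] covers "The man"

print(spans[(1, 2)])   # the word between positions 1 and 2
print(chart[(0, 2)])   # labels spanning positions 0..2
```

A dictionary is convenient here because only cells with i < j are ever used, so half of a square array would sit empty.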

Initialization:
  • For each position i=1..|sentence|
    • For each possible part of speech P for word i
      • Insert P into Chart[i-1,i]
      • (This means that we've found a way to cover i-1 to i using non-terminal P)
Iteration:
  • For phrase lengths l=2..|sentence|
    • For each start position i=0..|sentence|-l
      • Set end position k=i+l
      • For each split point j=i+1..k-1
        • For each grammar rule X -> Y Z
          • If we can find a Y in Chart[i,j] and a Z in Chart[j,k]:
            • Add X to Chart[i,k] with backpointers à la Viterbi
            • (This means we've found a way to cover i to k using the non-terminal X)
Now, look in Chart[0,|sentence|] for an "S" and chase back-pointers.
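
The initialization and iteration steps can be sketched in Python as follows.  This is a minimal illustration, not a reference implementation: the function name cky, the lexicon dictionary, the (X, Y, Z) rule triples, and the toy grammar are all assumptions.  For simplicity it keeps only the first backpointer found for each label, so it recovers one tree rather than all of them.

```python
from collections import defaultdict


def cky(words, lexicon, rules, start="S"):
    """Return one parse tree as nested tuples, or None if there is no parse."""
    n = len(words)
    # chart[(i, k)] maps a label to a backpointer: either the word itself
    # (for a part of speech) or (j, Y, Z) for a rule X -> Y Z split at j.
    chart = defaultdict(dict)

    # Initialization: parts of speech for word i go into Chart[i-1, i].
    for i in range(1, n + 1):
        for pos in lexicon.get(words[i - 1], ()):
            chart[(i - 1, i)][pos] = words[i - 1]

    # Iteration: build longer spans out of two adjacent shorter ones.
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            k = i + length
            for j in range(i + 1, k):
                for x, y, z in rules:
                    if y in chart[(i, j)] and z in chart[(j, k)]:
                        # Keep the first derivation found for X over [i, k].
                        chart[(i, k)].setdefault(x, (j, y, z))

    def build(label, i, k):
        # Chase backpointers to reconstruct the tree for label[i, k].
        back = chart[(i, k)][label]
        if isinstance(back, str):
            return (label, back)
        j, y, z = back
        return (label, build(y, i, j), build(z, j, k))

    if start not in chart[(0, n)]:
        return None
    return build(start, 0, n)


# Toy grammar and lexicon, assumed for illustration.
lexicon = {"the": ["DT"], "a": ["DT"], "man": ["NN"],
           "sandwich": ["NN"], "ate": ["V"]}
rules = [("S", "NP", "VP"), ("NP", "DT", "NN"), ("VP", "V", "NP")]

tree = cky("the man ate a sandwich".split(), lexicon, rules)
print(tree)
```

To keep every parse of an ambiguous sentence, each label would need a list of backpointers instead of a single one.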

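Time complexity is O(|sentence|^3 * |Grammar|): there are O(|sentence|^2) spans [i,k], each has O(|sentence|) split points j, and every grammar rule is tried at each split point.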
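
As a sanity check on the cubic bound, the sketch below counts how many times the innermost rule test runs and compares it with a closed form.  The function name count_rule_checks is assumed for this example.  The count of (i, j, k) triples with 0 <= i < j < k <= n is "n+1 choose 3", which grows as n^3/6.

```python
from math import comb


def count_rule_checks(n, num_rules):
    """Count innermost-loop rule tests for a sentence of n words."""
    checks = 0
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            k = i + length
            for j in range(i + 1, k):
                # Every grammar rule is tried once per (i, j, k) triple.
                checks += num_rules
    return checks


n, num_rules = 5, 3
print(count_rule_checks(n, num_rules), comb(n + 1, 3) * num_rules)
```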

    2 comments:

    1. HW4 problem 1 asks to "Construct the complete chart". Does this mean a complete, CKY parse table?

      I ask because the text only uses "chart" in the context of Earley or chart parsing, but above the word appears to be used for CKY parse table.

    2. @Chris: yes, that's right. CKY parse table.
