|
||||
|
OK. After my first vague post, that was perhaps a bit more scatterbrained that I would have liked it to be due to being utterly frustrated, I decided to meet up with my professor to clarify some of the things I was confused about and it helped a lot. So, much, in fact, that I finished 2/3's of my computational linguistics assignment. I'd like to now elaborate about the assignment, what I was confused about, and what the solutions were. At the end of this clarification, I once again ask for help about the last bit that I am still stuck on! The entire assignment revolves around a module written by the professor meant to illustrate some examples of FSA (Finite State Automata), both deterministic (only one possible transition between states; no epsilon transitions allowed) and non-deterministic (could be one or more possible transitions between states; epsilon transitions allowed). These FSA reflect "Sheep Language," which can be represented by the regular expression [ba*!] (one 'b' followed by one or more 'a' followed by '!'). The first exercise was to modify the NFSA class's 'recognize' method (verifies whether or not a given string belongs to Sheep Language and returns True or False) so that the user could specify whether a FIFO (First in First Out) search or LIFO (Last in First out) search would be implemented by the method. The original program only was capable of implementing a LIFO search, so the solution here was to create a new keyword argument (I called this depthFirst, which is another name for the LIFO search) and added a conditional to the recognize module, for which if depthFirst is True, it'd go through the original LIFO search, which uses a stack via pop(). If depthFirst is False, it'd instead go through my added code, which uses a queue via pop(0) (e.g. FIFO). I must have read the documentation on using lists as queues and stacks about 50 times, but for some reason it just didn't click. OK, so the second exercise was to add a new method to the DFSA class that could return whether or not a given string was in the Sheep Language's complement language (e.g. strings that use the same alphabet as Sheep Language, but do not contain the actual strings in Sheep Language, such as 'abb!b') OK, so for this, I had to add a sink state (also known as a fail state) to the DFSA used. This means that all strings not in the Sheep Language are condemned to this sink state, never to leave. I then added a complement method to the DFSA class, which basically used the same code as recognize, except it accepts strings that end NOT IN a valid final state, and rejects those that do end in a valid final state. This way, strings not in the Sheep Language get accepted, but those strings still must have the same alphabet. NOW, what I am stuck on... The professor had us construct another NFSA that has epsilon transitions at every one of its states. This results in an infinite loop since an epsilon transition allows the search to go to the next state without moving the tape forward. Since it can do this at every state, the tape NEVER moves forward. The goal here is to modify the NFSA class so that these sort of detrimental NFSA return a ValueError. In theory, I THINK I know what I have to do, and I wanted to meet with the professor today, however, all of his courses and office hours were canceled for the day and so I did not get to meet with him. Here is what I think I must do, but still do not know exactly how to accomplish it: Keep track of what pops out of the agenda list (the list that contains the states to be looked at and the index number of the tape) and append that to a new 'closed' list. Before new items are added to the agenda, make sure that we aren't repeating ourselves. If we are, return ValueError. If not, add next thing to agenda. For anyone interested in helping, the original code is here and the FSA I am dealing with is in Exercise 3 of this document. |
||||
|
|
