Wednesday 5 December 2012

Proof by contradiction

In this post I will continue talking about question 1 of assignment 3, recording my proof.

I had a DFSA with exactly 16 states that accepts language L and noticed that it drives each binary string of length 4 to a different state. From there, though, I had no idea how to prove that any DFSA that accepts this language must have at least 16 states. I had not seen something similar to this before. Finite state machines did make sense to me during the lectures, but I had not had enough practice with them yet. Also, almost immediately I realized that this proof can not be done using induction - or at least I did not see a way of doing it. That meant that it was going to be different than most of the proofs that I got used to writing in this course.

I remembered my math teacher in high school. He would always remind me that when I am stuck on something and can not prove it, I should try proving instead that it does NOT hold. Indeed I was completely stuck this time and did not know how to start.  That is how I thought of proof by contradiction.
Also, my intuition kept telling me that the fact that the machine is deterministic would play an important role in this proof. It was definitely something I had to use.

I then looked at my DFSA. I noticed that if a string started with 0, for example and was of length 4, it would end up in one of the leaves of the binary tree. If I add one more bit to that, in some cases it would be still be accepting while in some others it would not. I kept adding random bits to it to see what happens. I noticed that it was always the case that when a string of length 4 had 0 as the third-last bit, then if I add 1 random bit to it, it will end up in an accepting state for sure, since the 3rd last bit has now become the 4th last bit. If I add another random bit to it, it will go to an accepting state if and only if the 2nd last bit of the original string was a 0. That observation turned out to be the key in proving this, for the following reason:
If we take two different binary strings, then they will surely have at least 1 different bit. Now if that 1 different bit is not the fourth-last, for all we know they could indeed end up in the same state (since they will either both be accepting or non accepting). However, by adding the right number of random bits to both, we can make it so that the fourth-last bit of the strings we are constructing is different. Then, if we further assume that we are adding the same bits to both strings every time, we have our contradiction. The reason that this is a contradiction is that the machine will eventually drive the two strings we constructed to two different states (since at the point where we have added enough bits and the fourth-last bits of the two strings are different, one will be accepted while the other will not be) under the assumption that they started off at the same state and that we kept adding the same bits to both. From this, it follows that the machine is not deterministic. But this contradicts our hypothesis.

So I now had the basic idea of my proof, and the only thing left to do was to properly structure it and write my thoughts down in a way that any reader could be convinced that I proved what needed to be proved.

I observed that the maximum number of times that one would have to add random bits to the two strings is 3. This is because, if the fourth-last bits are different, the contradiction follows immediately. If they are the same, then there is the case where the third-last bits are different. If they are, then adding 1 bit to both strings would be enough to show the contradiction. If, however, the third-last bits are the same, then we must consider the case of the second-last bits being different, in which case we would have to add 2 bits to each string to end up with a contradiction. Finally, if the second-last bits of the two strings are the same as well, then it has to be the case that the last-bits are different, since we started with two distinct strings. In this final case, we would have to add 3 bits to each strings to show the contradiction.

At this point, it was clear to me how my proof would be structured. At first, I would assume that there exists a DFSA with less than 16 states. From that, though, it follows that some binary strings of length 4 would have to end up in the same state, since there are 16 of those but at most 15 states in the machine. Then, considering I would consider 2 cases, in the first of which the contradiction would be obvious, since the fourth-last bits of the strings are different and in the second of which I would have to add some number of random bits to the strings in order to end up in the contradiction. That second case would have three subcases, depending on the number of time that I would have to add a random bit to the strings (first case: third-last bits are different, add a bit to each just once, second case: second-last bits are different, add a bit to each two times, third case: last bits are different, add a bit to each string 3 times).

Now reading it through again, it makes total sense to me why any DFSA that accepts L must have at least 16 states. I believe that every step of the proof follows from the previous ones and it all builds up to form a contradiction in each subcase of each case, which proves the desired result.

Tuesday 4 December 2012

Problem solving

The time has come to talk about problem solving. It is probably a weird time to talk about solving an exercise of the assignment after I have already submitted it, but I did not want to waste precious time writing this up while the clock was ticking before the deadline. Besides, I have a good memory and I can clearly remember the details of my thought process during those days. Actually, it would have been weird if I had forgotten those details, considering the time I spent on this question, the number of conversations I had with myself in my room, in the subway, while walking on campus etc.

I have chosen to talk about question 1, since it was the question of the assignment that I was stuck on for the most time probably and at the same time, the question I enjoyed the most.

This question is about proving that any DFSA that accepts L has at least 16 states, where L is the language of binary strings whose fourth-last bit is a 0.
I read the question many times and read the hint as well. I then wrote down the 16 binary string of length 4 and highlighted the ones that would be accepted by DFSA (the ones whose first bit is 0). In order for the DFSA to have at least 16 states, there must be some way to prove that each one of these 16 binary strings of length 4 would have to end up in a different state than the rest.  But I had no idea why that made sense and surely no idea how to prove it.

My next thought was to construct a DFSA that accepts this language, since maybe that would help me see why at least 16 states are needed. It took a surprisingly long time to come up with a working DFSA for this purpose. When I finally got it, I counted the states I had and they were exactly 16.
My DFSA looked like a binary tree. On the very top, was the starting state, with a self transition of 1s. This starting "node" had one child only, to which one could be lead to with a 0. That "node" of the DFSA (the child of the starting state) is the root of the binary tree. It has two children: one could get to the first one with a 0 and to the second one with a 1. Each of those children has another two children that can be reached in the same way and each one of those has another two children. Those final children are the "leaves" and each one of them is an accepting state of the DFSA. From the leaves, there are transitions back to the very top in some cases, or back to the root of the tree in other cases, or to the father of grandfather of the leaf in other cases, or even to other leaves sometimes.

I then tested my machine with the 16 binary strings of length 4 and indeed, each one ended up in a different state. Among those 16 binary strings of length 4, there are 8 strings that are accepted by L and 8 that are not accepted by L (since exactly half of them start with a 0 i.e. exactly have of them have 0 as their fourth-last bit). The ones that are accepted by L ended up in one of the leaves - there are exactly 8 leaves in the DFSA that I described above, one for each of those 8 accepted strings. The ones that are not accepted by L, on the other hand, never made it to the leaves, since the self transition in the starting state happened at least once and so there were at most 3 bits left after that, which are not enough to drive the DFSA to the leaves.

At this point, it made sense to me why each binary string would end up in a different state, but I still had no idea about how this can be proven.

In the following post, I will record my thought process that lead me to the proof.

Tuesday 27 November 2012

The universe is gonna collapse soon

The impossible has happened twice within the same week!!
Firstly, I might have actually gotten the last quiz right!
Secondly, I am pretty much done with assignment 3 and this time I had no frustrating "accidents" that lead to  spending a bajilion hours on proving unnecessary things!
Beware, the universe might collapse any minute now!

Of course, I did spend a lot of time trying to find an actual DFSA that accepts the language described by exercise 1, but hey! this time I actually knew it was not necessary, since one can prove that it must have at least 16 states without knowing anything about what it looks like. It makes me happy, however, that I now have a machine for this language that is working - or at least I still have not found a string for which it does not work.

Time flies by and the only thing that is really left to do for this course after this assignment is to study for the final. Even though I will be relieved after writing the final, I know that I will miss this class the moment the final is over, provided that our universe will still exist till December 12.

Thursday 15 November 2012

The quizzes hate me!

Can never seem to get these quizzed right!
I was so confident this time but of course I only got 1/2 again. I feel like those 10 minutes are never enough for me. I don't understand how I can be doing so well in the tests and assignments and messing up these quizzes! It is so frustrating! I solve the tutorial problems before coming to the tutorial and perfectly understand the solutions when my TA solves them on the board but during the quiz there is always some detail that I get wrong.
Not looking forward to next Monday!

Wednesday 31 October 2012

no "late at night" excuses this time ...

Another week has passed during which I found out that I got perfect on a1, yay!

However, it has been proven that I can never do a whole assignment without getting so mad at myself. This time was even worse than the previous one.

I spent 9 hours on Saturday on question 1 of the second assignment. At the end of the day (or rather the start of Sunday), I was convinced that I had the solution for it. I then stayed up another hour to type it in, which took something like 5 pages. I was really proud of my work! Little did I know!

Of course, it would have been too convenient for my solution that I spent 9 hours trying to figure out to be correct! While talking with a couple friends about this exercise, I realized that I was not including some strings in the set of OMCOFS that I should have been including.
I was only looking at consecutive 1s within the strings, so I was including strings like 11000 or 11110 but missing strings like 11011, where there is one or more 0s in between the 1s.

It is so funny how something like this happens every time. I have a folder on my computer that is called "probably useless" and in that folder I save stuff like the result of my 9 hour work on Saturday: stuff that ended up being not what the assignment was asking, but at the same time, too "cool" to just delete ( I also have a beautiful file in there with my own version of an exercise from my stats assignment and the number of those files only keeps increasing ). Some day I will look back on it and laugh.

Now however, in order to make that file in the weirdest possible way useless, I'm blogging about it (better than nothing!). So this is what I came up with on Saturday ( I can't blame anyone for not reading the whole thing ) ...


I notice the following:

Firstly, no matter what the length of the set of binary strings in question is, the string that consists only of 0s will always be one of the OMCOFS (note 1).

Secondly, in every OMCOFS s1, we want to have an even number of contiguous 1s. In other words, we want to have the substring '11' and the substring '1111' and so on, provided that the length of those substrings is less than or equal to the length of s1. Call the length of s1 n. Then, the expression floor(n/2) gives us the number of those substrings that will fit in s1. For example if floor(n/2) is 1, then only the substring '11' will fit, if it is 3, then the substrings '11' , '1111' and '111111' will fit (the 3 first substrings that contain only 1s and have even length) and so on (note 2).

Now lets see how we can construct the OMCOFS of length 5 from the OMCOFS of length 4.

First of all, in the OMCOFS of length 5, we will have the string '00000' for sure (from note 1).
Also, since floor(5/2)  is 2, the substrings '11' and '1111' will show up in the OMCOFS (from note 2). But the OMCOFS of 4 also have substrings '11' (since floor(4/2) is also 2). If we write the OMCOFS of length 4 that contain '11' in the following order: 1100, 0110, 0011 (the first one starts with the contiguous ones and then the 1s get shifted to the right), then we can get the OMCOFS of length 5 that contain '11' by adding a 0 to the end of each one and to the beginning of the last.
So 1100 gives us 11000,
and 0110 gives us 01100
and 0011 gives us 00110 and 00011.

Now, if we write the OMCOFS of length 4 that contain '1111' in the same order as before, we can get the OMCOFS of length 5 that contain '1111' in the same way.
There is only one of those in this case, namely the string 1111, so we will treat it as if it is the last one (i.e. we will add a 0 in the front and the end of that string). So we get the strings 11110 and 01111 of length 5.
So the set of OMCOFS of length 5 are { 00000, 11000, 01100, 00110, 00011, 11110, 01111 }.


Then, if we know the OMCOFS of length n (where n is a natural number greater than 0), we can construct the OMCOFS of length (n + 1) in the following way :
  • Add 0s to each OMCOFS of length n that contains the substring '11' in the way demonstrated above. Do the same for OMCOFS of length n that contain substring '1111' and then '11111' and keep going as long as the substring of 1s has length less than or equal to n.
  • In the case where there is a substring of 1s that fits in strings of length (n + 1) but not in strings of length n, add this substring to the set of OMCOFS of length (n + 1) without adding any 0s to it (call this " case * ").
    The reason we are adding it is that since the substring fits in length (n + 1) but not in length n, it must be the case that that substring has length greater than n but not greater than (n + 1), so it must have length exactly (n + 1). It also must be true that floor(n/2) < floor((n+1)/2)  (from note 2), which can only happen if that n is odd and (n + 1) is even.
    But if (n + 1) is an even substring of 1s and it has length (n + 1), then we want to include this to the set of OMCOFS.
    For example, the string '111111' is not in the OMCOFS of length 5, since its length is greater than 5, but it has to be in the OMCOFS of length 6.
  • Finally, add to the set of OMCOFS the binary string of length (n + 1) that contains only 0s (note 1).


Define H(n): number of OMCOFS in the set of binary strings of length n.

Case 1 : if n is even and (n + 1) is odd, then floor(n/2) = floor((n+1)/2) so the even substrings of 1s that fit in strings of length n are exactly the same as those that fit in length (n + 1).
The number of strings of length (n + 1) that contain '11' will be equal to the number of those of length n that contain '11' plus 1 (since to the last one we are adding a 0 both to the front and to the end). The same will hold for the number of strings that contain '1111' and so on (in total, it will hold for floor(n/2) such cases) (note 2).
So, H(n + 1) = H(n) +floor(n/2)
(since we are adding 1 floor(n/2) times).
This can also be written as H(n + 1) = H(n) + floor((n+1)/2) ,since floor(n/2) = floor((n+1)/2) in this case.

Case 2: If n is odd and (n + 1) is even, then we have case *.
In this case, as before, to get H(n + 1), we have to add floor(n/2) to H(n) for the substrings of 1s that fit in both and we have to add 1 more for the string that fits in length (n + 1) but not in n. So we will have H(n + 1) = H(n) +floor(n/2)+ 1.
Also, we know that floor((n+1)/2)= floor(n/2) + 1 since we know that floor(n/2) <  floor((n+1)/2) is true in case * and that n and n + 1 are consecutive natural numbers.
So H(n + 1) = H(n) +  floor((n+1)/2)in this case.


So in both cases, the recurrence is : H(n + 1) = H(n) +  floor((n+1)/2) or, equivalently, H(n) = H(n – 1) +  floor(n/2).


For your own good, I am going to skip the two pages of unwinding a sigma of floors. But the closed formula that I got is the following:

H(n) = (n^2+ 3) / 4,      if n is odd
or
H(n) = (n^2+ 4 ) / 4,     if n is even

The worst part  is that this time I do not even have the excuse of working too late at night. I spent practically the whole day doing this. All day today (especially while solving the exercise all over again and realizing that it comes down to something so close to the Fibonacci series, which we have already unwound in tutorial!!) I was trying so hard to convince myself that I got more practice by making that mistake and that this is a good thing. I failed miserably. 
I promise myself right now that next time I will double check my formula a bajilion times before I get too carried away with it. 
I really hope I do not end up writing a similar blog post to this while doing assignment 3! 




Monday 22 October 2012

the week my coffeemaker broke ..

It is safe to say that I have had many better weeks than this past week.

The excitement of getting perfect on test1 did not last long. As a matter of fact, it did not last at all.
If I had done a bad job, I would have been upset (like when I got only 1 out of 2 on the quiz this week!), but when I do well, I am not happy, just not unhappy. I am going to end up being miserable if I keep this up!

Also, I was forced to come extremely late to Friday's lecture this week due to an obligation that I had and could not miss. From that experience, what I have learned is that it is not fun walking in the lecture room 10 minutes before the lecture ends ( don't ask why I even bothered ; it was better than waiting outside ). It is also certainly not fun trying to understand the lecture slides the day after.

It is 11:30 pm so I think I have written enough for now. Yes, I usually end up doing most of my work at night, but ever since my coffee maker broke, that plan does not work so greatly anymore. I am living with the hope to buy a new one before the deadline for assignment 2!

Turning off the light. Good night blog!




Sunday 7 October 2012

Do not try proving stuff late at night


I have to be honest and say that I find blogging a strange way to earn marks for a mathematical – logic course. It is, however, an interesting twist and I might as well have fun while at it.

I always liked proof oriented courses and CSC236 is no exception. I find proving things fun (yes, people do say that I have a weird sense of fun), since by proving something you can persuade anyone that something that makes sense to you is actually true.
However, I have also had some "bad" moments in this course so far, such as the "embarrassing" story I am about to write down (I am laughing at myself just thinking about it, before I even start typing it).

Last weekend i was determined to finish assignment 1. I went through questions 1 and 2 quite easily and then i started reading question 3, which seemed like an interesting problem. In that question, we were asked to find the formula that works and then continue with the proof. When reading the question, I realized that the formula I was looking for reminded me of something I recently learned in statistics (I had a quiz on it the previous day). I thought, however, that it would be fun to play around with it and see how it works.
I must admit that I stayed up till 3 am that night, playing around with objects in my bedroom such as pencils and rubbers (the way that our statistics prof played with "fuzzy balls"), putting them into groups and taking combinations and observing the number of 3 – subsets each time.

After wasting so many pages of my poor notebook (I feel bad for the trees that were sacrificed for this), I came up with something like:

if p <= 2 : the number of 3-subsets in a set of n elements is p + 3p(n – 3)
if p > 2 and k = 0, that number is p + 3p(n – 3) + 3^k and
if p > 2 and k != 0, it is p + 3p(n – 3) + (3^p)k.

Where p and k are numbers such that n = 3 * p + k.

I was still not done, there were more cases I needed to look into.

I got too carried away trying to derive the formula myself that I did not realize that it is exactly the "(n + 3) choose 3" formula. I was determined to figure it out, even though it was 3 am, I had a bad headache and was getting quite frustrated.

You can imagine how annoyed I was when while lying in bed trying to sleep, I suddenly realized that the formula I needed is something known that I could use without deriving.
It was a fun night and I am hoping that there are more like that to come. If anything, they make cool blog stories!