Wednesday, February 29, 2012

Falling Behind

I have fallen behind on my blogging. I have been pretty busy since my last post. Have
learned a lot of things and a lot of stuff to talk about. This post is meant to briefly
outline my past activities and what is coming up.

First my job. That has been picking up. Finished my first script and bug fix. Going to be
the "Duty Officer" this week. Should be fun for all.

Next, there was a hackathon hosted by my school's Computer Science Undergraduate Association. In the week leading up to it I gave myself a crash course in Django to create a server as a backend to an iPhone app. Like past hackathons, inexperience and an over-ambitious project prevented us from finishing (although this time we actually had something to present, it just was not nearly finished enough). However, I (and I think my two partners) am excited for this project. We mostly finished and it's a pretty good idea (which is why I have been avoiding saying what exactly it is). So when I am not working, doing homework, or playing the occasional LoL game I have been becoming more familiar with Django and cleaning up the server's code. Hopefully we'll get this done and maybe even released eventually. Here's hoping.

And finally, I wrote my first bash script (wheeeee) to rename some files for my dad. Not
terribly interesting to anyone who has done serious (or even basic) scripting, but I
still think it deserves a quick post.

So these are things. These are things that I have been doing and things that I will
dedicate some time talking about. But not now. Now I have to add multiprogramming to
Nachos for my operating systems class.

Facebook Hacker Cup 2012 - Qualification

The Facebook Hackercup 2, electric boogaloo. Last year Facebook hosted their first
programming competition. Despite a few technical flaws early on, overall it was pretty
good. I think I made it to the third round last year. At the time I had just started
college and programming, so I felt pretty good with how far I got. Hopefully this time
around I'll be able to get at least a bit further. The qualification round ended on
Monday and I'm fairly confident that I got two out of the three problems correct (you
only need one correct answer to move on to the next round). I'll present and talk a bit
about my solutions here.

Alphabet Soup

This problem was probably the easiest of the three. Paraphrasing
the problem statement, essentially you need to determine how many times you can form the
word "HACKERCUP" from a given string. The example input file is given as

5
WELCOME TO FACEBOOK HACKERCUP
CUP WITH LABEL HACKERCUP BELONGS TO HACKER
QUICK CUTE BROWN FOX JUMPS OVER THE LAZY DOG
MOVE FAST BE BOLD
HACK THE HACKERCUP

and your output file should look like:

Case #1: 1
Case #2: 2
Case #3: 1
Case #4: 0
Case #5: 1

For this round I decided to use Python. It's what I've used to solve many Project Euler
questions, so I can comfortable solving problems with it. So first, processing the input
is fairly straightforward.

def process_input(file_path):
    with open(file_path, 'r') as input:
        # first line is number of cases
        num_cases = int(input.readline().rstrip())
        # each remaining line represents a case
        cases = list(case.rstrip() for case in input.readlines())
    return cases

Open the file and grab the cases, stripping off newline characters. Now to actually deal
with a case. Forewarning, my algorithm is terrible. The runtime is fine, but my
implementation is so roundabout that when I read it now I cringe. I'll mention a much
better way to do this after I present my solution. So here's the code:

def eval_case(case):
    word = list('HACKERCUP!')
    word_list = [list('HACKERCUP!')]
    for char in case:
        if char in word:
            for chance in word_list:
                if char in chance:
                    chance.remove(char)
                    flag = True
                    break
                if flag:
                    word_list.append(list('HACKERCUP!'))
    count = 0
    for word in word_list:
        if word == ['!']: 
            count += 1
    return count

word is just a list of the characters in the string
"HACKERCUP!". word_list is a running list of "HACKERCUP!" strings that may have
some characters removed. My algorithm is something like this:

Look at each character in the case.
If that character is in word then look
at word_list
If that character is in a word in word_list remove it
from that word and add a new "HACKERCUP!" string to the list
Once all the characters have been looked at, the number of
"!"'s left in word_list are the number of complete words
that were formed

Wow. Wow wow wow wow. This is why I'm glad I've started this blog. I knew that this
algorithm was stupid, but I did not realize just how stupid until I just wrote this
out. What I meant to implement was a list of strings that kept track of how many partial
"HACKERCUP" strings could be formed. When a character matched the string but was NOT
found in the list, that's when the list should grow. For some reason I wrote it as
WHENEVER the char is found in the list, the list grows. Luckily this didn't affect the
output of the program, but it did make it way more memory intensive. I really do not want
to talk about this piece of not good stuff anymore. But anyway, here's how I tie it
together:

# import was at top of file, of course
import sys

def solve(file):
    cases = map(lambda case: eval_case(case), process_input(file))
    with open('output.txt', 'w') as out:
        for idx, val in enumerate(cases):
            out.write("Case #%i: %i\n" % (idx+1, val))
        
solve('./' + sys.argv[1])

So to solve my case I just put the input file into the same folder as my script and
run python alphabet_soup.py input.txt and the output file output.txt is
generated for submission. The one thing I had to look up for this snippet was the
line for idx, val in enumerate(cases):, which gets me both the index and the
value for my for loop.

What was the good solution, you ask? Luckily I live with
two other EECS students who are much more experienced. They have and continue to teach me
a lot. Their solution was so much nicer. Essentially you make a dictionary mapping
characters that occur in "HACKERCUP" to the number of times they occur in the case. Scale
the value for 'c' by 1/2 (to account for the fact that there are two 'c's in "HACKERCUP")
and take the min of all of the values in the dictionary. Much nicer. Linear time,
constant memory. Ah well, we all managed to get it right in the end.

Conclusion

I am writing this conclusion on Febuary 29th. More than a month after I meant to post
this. I don't really have anything more to say, I just wanted to wrap it up and post
this. I'll post my upcoming plans for blogging soon.

Wednesday, January 11, 2012

Quicksort in Perl

I have been learning Perl over my winter break for a job that I will be starting this semester. My main source has been Impatient Perl, a free book targeting programmers that need to quickly pick up Perl. About halfway through reading it I realized that I was doing a minimal amount of coding and, consequently, stuff probably was not sticking. So I decided to write a quicksort subroutine. Sadly, I am writing this post the morning after doing this. Next time I will be sure to blog as I go.

About Quicksort

First, a little bit about Quicksort (all of this info and the picture can be found on Wikipedia's article on Quicksort). Say you are sorting a list of numbers by hand; if you wanted to use Quicksort to do this, here are the steps that you would follow:

If you only have one number in your list, you are done, so return it.
Othenrwise, choose one number from your list and call it the "pivot". There are a couple ways that you could choose the pivot, but it is simpliest to just pick the first number.
Go through the remaining numbers and put them into two groups: one group of numbers less than or equal to the pivot (call this group "smalls"), the other group ("larges") with the rest.
Quicksort each group.
Stick the now-sorted smalls and larges together with the pivot between them in whatever order you would like to sort the numbers. For ascending order you would return [smalls,pivot,larges]; descending [larges,pivot,smalls].

Quicksort is a divide and conquer sorting algorithm. This means that whatever is being sorted is divided into two groups with themselves are then sorted. This gives Quicksort and average running time of O(n*lg(n)) (the list is divided lg(n) times and each time you divide you go through all n of the elements in linear time). So now let us see about implementing this in Perl.

My Implementation

sub quick_sort {
  # quick_sort - given an array, return a copy of that array sorted 
  #              in ascending order

  # Step 1. base case - list is empty
  if (!@_) {
    return;
  }
  # Step 2. take first element as pivot
  my $pivot = pop(@_);
  # Step 3. divide the rest of the list into two groups
  my (@smalls, @larges);
  foreach (@_) {
    ($_ <= $pivot) ? push(@smalls, $_) : push(@larges, $_);
  }
  # Step 4. Quicksort the two groups
  @larges = quick_sort(@larges);
  @smalls = quick_sort(@smalls);
  # Step 5. Stick the pivot at the front of larges and stick that at
  #         the end of smalls, returning the list 
  #         [smalls, pivot, larges]
  unshift(@larges, $pivot);
  push(@smalls, @larges);
  return @smalls;
}
# test snippet
my @ex;
foreach (0 .. 10) {
  $ex[$_] = int(rand(100));
}
print "unsorted: @ex\n";
@ex = quick_sort(@ex);
print "sorted: @ex\n";

Running this produces the following output:

unsorted: 14 31 36 38 24 31 68 85 90 80 26
sorted: 14 24 26 31 31 36 38 68 80 85 90

It works. It could be changed such that it sorts the given array in place, but I prefer to keep my code functional whenever I can (and it would be more complicated to do it the the other way).

A Better Way

Considering this is the first "real" piece of code I have written in Perl, I am certain that there is a better, less verbose way of getting this done. I did a bit of looking around and I found the simpliest and cleanest solution at Rosetta Code:

sub quick_sort {
    my @a = @_;
    return @a if @a < 2;
    my $p = pop @a;
    quick_sort(grep $_ < $p, @a), $p, quick_sort(grep $_ >= $p, @a);
}

This is much shorter than my implementation. It uses the function grep, which is named for and behaves like Unix's grep. Given a code block or a regex and a list, it evaluates the block or the regex, returning the list of elements for which the expression evaluated true. Those familiar with lispy functions can think of this as a filter function.

Wrapping Up

Not much to say here. This was just a simple exercise and I have a way to go before I am
fluent in Perl. Feel free to comment if you have anything to add or discuss.