Looking Back upon CSC148

The course CSC148 was definitely the first one to present interesting programming techniques and problems in the undergraduate program of computer science at the University of Toronto. After many hours dedicated to labs and assignments, I can say that I have a good understanding of recursive algorithms, a few abstract data structures, and some basic command line user interfaces in Python. Although the course itself could have been much more challenging (and it should be), it was overall a good experience and preparation for my academic future.

Me and my friends have disagreed on the ways some of the course content have been presented to us, and how the course could have been better if it was done in a lower level programming language (such as C++). We have even dreamed about the possibility of the faculty offering an enriched version of CSC148 with more in-depth exploration of data types in lower level languages. However, we were stuck with this course and I believe we made the best out of it. It was fun to solve problems and build solid applications with Python, and it was also good practice for future courses, such as CSC207.

As for a comparison of this course to its precursor, CSC108, this course definitely was more challenging, and requires a much deeper understanding of both Python and programming in general. Assignments, for example, are much more complex in CSC148. Additionally, examinations are slightly more complex in this course as well. In general, it is a considerable, yet not insurmountable, jump from CSC108.

I know I haven’t even mentioned the strike within this posts, and a part of me still feels like I shouldn’t. I mean it is definitely true that it was one of the major events throughout the course, but I feel that it’s not the best idea for me to voice my opinions about it here. Thus, I won’t be doing that. What I can say is that I am glad the strike is over, and that the wonderful TAs are back to look through my slightly interesting posts!

To all my classmates (who probably won’t read this), good luck with your life after CSC148,
Juan Camilo Osorio

Why Geeks Should Write (Revisited)

A couple of weeks ago, at the beginning of one of the courses at the University of Toronto, I wrote about why geeks should write. Today, I am revisiting this topic by going through some of my favorite blogs from my classmates, and how each one helped me either understand course content better, or be inspired to build something amazing. In a sense, I am demonstrating why geeks should write by being a geek who reads, thus giving value to those who took the time to write in the first place. From shared frustrations and code debugging to inspiring and intriguing posts, here are my picks of 2015’s CSC148:

Sooham Rafiz’s post on Assignment 3:

(View this post). My fellow classmate and friend Sooham takes a critical approach to Assignment 3, and, in particular, the suggested algorithm implementation given to us by the faculty. Since I was his teammate for this assignment, we shared the pain and struggle of implementing the (almost) correct pruning optimizations for minimax, and specially had a lot of trouble trying to understand intuitively what the algorithms was doing, since it was so different from common implementations of minimax (including MIT’s one). I strongly support his views on the topic, and encourage the faculty to consider changing the algorithm for future iterations of the course.

Being able to read about other people’s frustrations and struggles is great. No, not in a sadistic way. It’s just good to see that you are not the only one having trouble accomplishing something. Some things aren’t as easy as they seem, and maybe reading about somebody’s approach to a challenge will help you approach a similar one. Knowledge acquired by solving issues is extremely valuable, and being able to grow from obstacles, even if they aren’t your own, is great! I said it once and I’ll say it again: the first reason why geeks should write is that other geeks are very likely to benefit from your geeky endeavours.

Chris Lessard’s post on efficient Binary Search Trees:

(View this post). Chris explores more efficient ways to insert a list of 5000 integers into a binary search tree than were ever mentioned in the course, including red-black search trees that could potentially be used to implement a self-balancing search tree (which is great!). I really enjoyed reading about someone else that thought of extending the basic requirements of the course, trying something different. I posted about my own solution to this problem, which turned out to be quite similar to one of the discussed ones on this post.

It is great to be able to read about different approaches that each person has towards the same problem. If I had not read this post, I wouldn’t have learned about read-black binary trees, and thus my learning would be just that one bit less exciting. I encourage you, the reader, to write about your geeky (or not so geeky) approaches to obstacles that you face in your field, and I’m confident that somebody will benefit from your posts!

Sophie Wong’s take on Why Geeks should Write:

(View this post). I decided to add this article as some sort of meta inception of my own post. I am reviewing a post about why geeks should write on a post that revisits this topic, which is pretty cool. In any case, Sophie mentions a reason why geeks should write that had completely escaped me: we might actually forget how to use normal languages if we don’t practice them. Of course, Sophie (and geeks in general) aren’t worried about not being able to communicate normally. It is more about how a lack of practice in oral and written communication (or anything for that matter) can lead to some deterioration of skills. I found her article both funny and thought provoking. Overall, a great read!

So, if the previous reasons that I have given as to why geeks should write, such as having other geeks benefit from whatever you are writing about, and being able to share your experiences with like-minded people, haven’t convinced you, the last reason why you should begin writing your own blog is that geeks must practice their “soft” skills (such as written communication) as much as they practice their programming/technical skills! The last thing you want is for your rusty writing skills to prevent you from getting that dream job of yours, so get on writing today!

I admit that this post is basically a long persuasive article on why you should start writing your own blog, but I think is worth it, especially since I will potentially end up reading whatever you write when I get stuck on a similar obstacle you have overcome or just reading something inspiring you have done. It is all for my own benefit.

In all seriousness though, I hope that you enjoyed reading about some of the blog posts from students at the University of Toronto, and that you have appreciated (just a little) the wonders of blogging. Whether or not you join this pursue is up to you (or maybe you already did?).

Keep giving me more reasons to think geeks should indeed keep being awesome,
Juan Camilo Osorio

Why Unit Testing Might Seem Like a Waste of Time

Most of us in first year don’t have much (or any) practice in production environments, and we usually work with small code bases and relatively simple software. To me, and many others, unit testing is a complete waste of time, given that we could just check things manually in less time, and our code seldom breaks randomly. If you are one of us who doesn’t see much benefit in testing, don’t worry, I’ve been there.

For example, for the second assignment for CSC148 at the University of Toronto, I spent a good two hours trying to figure out why my doctest wasn’t passing, and the reason turned out to have nothing to do with the actual code I had written. I had basically understood doctests wrong for the longest time. The problem was that I was trying to do two things that don’t go well together: check for newline characters as part of the result of a function call, and split the lines as to meet the “lines must contain 80 characters or less” style guidelines for the course. You can witness part of my struggle, and how I solved it, in this StackOverflow question.

However, consider a project that has thousands of lines of code and a few thousand methods or functions. Now check each one manually. Not fun, right? My point is that, although unit testing might seem pointless in the early days, it will be more than worth it once you join a company or start your own large code base. Writing documentation and making sure your code works with doctests or unit tests is something that we learn out of context. I think that a great idea that beginner courses such as CSC148 could do is have a class-wide project as part of the course, where unit testing and documenting code suddenly becomes crucial to the success of the project.

One of the things I have done to practice writing good test cases and documentation is go through dozens of API documentations, such as Facebook’s, Pinterest’s, or Spotify’s. Although these are mostly web-based, they can help get a general understanding of production-level documentation and testing.

Keep unit testing,
Juan Camilo Osorio

Real Life Abstract Data Types

When you are a computer scientist, you begin to see abstract data types everywhere. From the queues to get food at a cafeteria, to stacks of paper in your desk, and even complex objects such as people, which have different properties and methods. You see, this is the very reason why having abstract data types makes sense in the first place: they can be used to model or describe real-world environments or situations. Strings, numbers, and boolean values are seldom useful if you are trying to describe virtually what a person is, or how a line queue operates. Thus, having a type where you can .enqueue() and .dequeue() people is convenient.

For example, in class, we were asked to implement a minimax algorithm in order to play an arbitrary two-person game (such as tic-tac-toe or subtract square). There, we could have used strings to store each state of the game, and would have had to create a crazy function to parse them and apply the moves, generating new states in order to assess what the best move was. But that’s insane and counter intuitive. That’s why we used recursion and an abstract data type that we created for game states, which allowed us to easily compute new states by applying moves to them, and thus made recursion 100x easier. What would life be without some good ol’ object oriented programming and abstract data types?

Inserting a sorted list into a binary search tree.

In class, we have been looking at binary search trees, and how they can be used to efficiently perform certain actions. We have also learned that some operations, such as finding whether or not a certain binary search tree contains a value or not, have a time complexity proportional to their height, and therefore keeping the height of a tree to a minimum is desirable.

Specifically, in a lab, we were told to find a way to insert a list of values into a BST while keeping the height of such small. Although we were advised to use a random selection in order to achieve this, I was convinced something more efficient was possible. It turns out I was right. By pre-sorting the list of values to be inserted into the binary search tree, it is possible to construct a recursive algorithm that inserts each element in a specific order as to obtain a balanced search tree. Consider the following:

Comparison of Tree Heights.

Since the list is sorted in a non-decreasing order, it definitely doesn’t make sense to insert the elements in the order they are. This is because the naive tree would be produced. Alternatively, we could shuffle randomly the tree (which I believe is what the course instructors wanted us to do in the assignment). However, we can do better. In fact, we can get the smart tree every time.The code I came up with in order to guarantee this is the following:

class BST:
    # ... other methods here
   def insert(self, data):
       """Insert data, if necessary, into this tree.

       Precondition: if data is a list, it is pre-sorted in
       non-decreasing order.

       """
       # Insert list
       if(isinstance(data, list)):
           midata = len(data) // 2
           self.insert(data[midata])
           if(len(data) > 1):
               self.insert(data[:midata])
           if(len(data) > 2):
               self.insert(data[midata + 1:])
       else:
           self._root = _insert(self._root, data)

Here, _insert(...) is a function that adds the value data to self in the correct position. What’s important is that, by choosing the middle element of each array and sub-array it is possible to form a balanced and efficient binary search tree. It is important to note that the list being inserted should be sorted, otherwise the method would not produce a particularly efficient binary tree.

If you are interested in seeing the entire code (including the function _insert, and the other methods of BST), you can download my solution to the entire assignment here!

Object Oriented Programming [For Dummies]

As a Friday the 13th special, here is my guide to [spooky] object-oriented programming for dummies. Regardless of whether or not you are a dummy in this subject, I hope this post explains why it makes sense to code in an object-oriented fashion. I will talk about the main concepts in OOP using simple analogies to fictional characters. I won’t be including any code, mainly because this is meant to be an intuitive explanation, and should work with pretty much any OOP language.

Instances

Jason Vorhees This character is Jason Voorhees, from the franchise Friday the 13th. He is an example of a fictional character. Of course, there are many similar characters, such as Freddy Krueger and Michael Myers. In programming, we usually call each specific character an instance, or an object; you can think of them as concrete examples of a certain type of character. Each one of these characters has specific attributes, and we programmers specify them like this:

Properties

Jason Voorhees Example Properties

Properties are the various things that make each character, or instance, different from others. In the case of the characters we have already mentioned, each has several characteristics that make them different to each other, such as their name, origin, and immortality.

A good thing to keep in mind while understanding object-oriented programming is that programmers are lazy. We will usually avoid having to write similar code twice; therefore, although each one of these characters is slightly different from the others, it is possible to write code that describes all of them by simply storing their differences as variables, which we call properties. Similarly, all of the characters mentioned can do very similar things. Programmers handle actions like this:

Methods

Dance of Death Horror movie characters can, among other things, dance, jump-scare, and murder. In programming, each instance has certain actions that you can have them execute. We call these methods. Additionally, we call the process of having an instance execute one of its methods a method call. For instance (see what I did there?), if ghoul referred to the character at your left, an example of a method call that got it to dance is ghoul.dance(). The great thing about this is that jason.dance(), freddy.dance(), and similar for other characters should work seamlessly too. In order to accomplish this, programmers group all these different characters like this:

Classes

Horror Movie Characters

You guessed it, each one of these characters is also an instance, and has both properties and methods. However, when we talk about characters being instances, it is important to specify what they are instances of. In this case, all of these characters are instances, or examples, of horror movie characters. Each has different, unique, properties, as we mentioned earlier; however, in general, all of them murder people and scare us in films. Given these similarities, it is often useful to group them into classes. Then, it becomes possible to define generic methods and properties for an entire class instead of having to do this for each specific character.

(Super)classes and (Sub)classes

Superclass of Characters

Sometimes, programmers need to group several different classes. For example, we might want to have a general class that includes horror movie characters, along with any other fictional character. It is easy to imagine that this general class is going to be larger than any of the classes it contains. Therefore, programmers call these big, general classes super-classes, and the smaller, more specific classes inside them are analogously called sub-classes.

Why?

Imagine you had to create a program that could run several different two-player games, such as Chess and Tic-tac-toe. If the number of games was small (such as one specific game), it could make sense to write very specific code without the need for an object-oriented structure. However, imagine you had to do pretty much the same thing for two, ten, or even hundreds of games! It would definitely be tedious. It would be great if you could write code for things each game had in common only once. If only you had a super-class that was general enough to describe any two-player game, and then sub-classes that represented each specific game; if only each had the needed properties such as the rules, and methods such as a way to check if a player has won; and if only you could create instances of each sub-class each time a specific game was played, right? Well, now you can. There are a lot of OOP tutorials for each specific language out there.

Keep coding,
JC

Python Recurcepton: The making of a program that traces recursion

Lately, we have been talking a lot about recursion in class at U of T. Specifically, we have been asked to trace recursive functions in order to find their output. While this is fun on its own, I shall go to another level for the sake of keeping this blog interesting. I asked myself, why trace recursive functions when I can make a function that traces recursive functions for me? I call this recurception.

The code I came up with took me a fair amount of time, and makes use of the sys and inspect built-in modules. The code goes like this:

import sys
import inspect

def global_tracer(frame, event, arg):
    """ (frame, str, object) -> function
    
    Return a local tracer.

    """
    return local_tracer

def local_tracer(frame, event, arg):
    """ (frame, str, object) -> function
    
    Trace a local event and return itself.

    """
    if(event == "return"):
        # Get function name
        result = frame.f_code.co_name
        if(not result.startswith("<") and result != "recurception"):
            # Get function arguments and values
            args, _, _, values = inspect.getargvalues(frame)

            # Build Result String
            result += "("
            arglist = ['"' + values[i] + '"' if isinstance(values[i], str) else repr(values[i]) for i in args]
            result += ", ".join(arglist)
            result += ") -> "
            result += '"' + arg + '"' if isinstance(arg, str) else repr(arg)

            # Print it
            print(result)
    return local_tracer

def recurception(recursive, arguments):
    """ (function, list of arguments) -> NoneType

    Trace the recurssion of a recursive function.

    """
    sys.settrace(global_tracer)
    result = recursive(*arguments)

In order to use it, you must define a recursive function (or any other type of function really), such as the following, taken from a class assignment/lab:

def nested_concat(L):
    """(list or str) -> str

    Return L if it’s a str, if L is a (possibly-nested) str return
    concatenation of str elements of L and (possibly-nested) sublists of L.

    Assume: Elements of L are either str or lists with elements that
    satisfy this same assumption.

    """
    if isinstance(L, str):
        return L
    else: # L is a possibly-nested list of str
        return "".join([nested_concat(x) for x in L])

And run it through the recurception like this

>>> recurception(nested_concat, ["yes"])
nested_concat("yes") -> "yes"
>>> recurception(nested_concat, [[["yes", " it"], " works"]])
nested_concat("yes") -> "yes"
nested_concat(" it") -> " it"
nested_concat(['yes', ' it']) -> "yes it"
nested_concat(" works") -> " works"
nested_concat([['yes', ' it'], ' works']) -> "yes it works"

The first argument of recurception is a function, and the second is a list of arguments to pass to such function. It takes care of printing each call to a function and its return value, such that all recursion steps are logged. I thought it was fairly interesting, and learned a lot about Python tracing and the built-in data types such as frames and (precompiled) code.

Keep being recursive,
Juan Camilo Osorio

Event Handing in Python

Python can be considered a sequential programming language in the sense that each line of code leads to the next, with very few possibilities to create event based code natively. However, some pseudo-non-sequential code can be implemented in a way that makes sense and provides an additional abstraction level that helps organize a piece of code.

In the course CSC148 at the University of Toronto, our first assignment was implementing a two-player game system that could utilize a strategy in order to generate valid moves by the PC, and an interface to let the user interact with the game. To me and my partner, it made sense to have some sort of event-driven functions in each part of the code, such that certain functions would be executed when certain conditions were met. In other words, we wanted to have functions that would be executed when certain events where emitted.

For example, our abstract Interface class looked something like this:

class Interface:       
    def on_load(self):
        pass # Optional event handler
    
    def on_start(self):   
        pass # Optional event handler
    
    def on_move(self, move, player):
        pass # Optional event handler

    def on_end(self):
        pass # Optional event handler

In general, each of these functions is an event hander that can be implemented by a subclass that inherits Interface. The actual game system takes care of calling the event handlers once certain conditions have been met, such as when the game has been initialized (on_load), or when a game has ended (on_end). This way of organizing the code in event handlers makes sense mainly because a game is heavily event-based. Things do not necessarily happen in a particular order, given that events take place depending on the actions of the user. Therefore, it becomes extremely useful to think of particular user actions as events that trigger the execution of certain functions.

Keep event handling,
Juan Camilo Osorio

Why Geeks should Write

I recently started a course at the University of Toronto that requires us to keep a sLOG, and this happens to be the first required topic we were instructed to write about. The thing is, the topic somehow implies that there is some sort of cultural belief that they shouldn’t. I mean, I personally believe that everybody should write, regardless of where that writing takes place. However, most people imagine Computer Scientists (or programmers in general) as anti-social beings that live in basements and somehow are not interested in communicating with others. The truth is that our field is much more social than most people would think. Since projects can yield visible results in short periods of time, and both code and thoughts can be interchanged effortlessly through the internet, writing and reading blogs is one of the best investments a geek can make in order to gain knowledge and recognition in the field.

The first reason why geeks should write is that other geeks are very likely to benefit from your geeky endeavours. The more geeks that write about relevant topics, the more the field is going to advance in general. It doesn’t make sense to reinvent the wheel every time somebody wants to make a car, but in order to be able to google how to code a wheel, a geek must have written about it first.

The second reason why geeks should write is that a good blog will make a geek stand out from the crowd. Lets be honest, if you were going to hire someone and had two candidates with exactly the same background and qualifications, but one of them had been writing in a blog that was featured by a mayor corporation, and contained many clever ways to solve problems (of the same kind that they would be encountering in the job), who would you hire? Exactly.

There are many more reasons why a geek should write, but I still think assuming the contrary a priori shouldn’t be done. Being a geek is something people should be proud of, because it is awesome. Then again, blogging is just a by-product of being awesome and wanting to share it with the world.

Keep blogging,
Juan Camilo Osorio