OPW, Yes you should ladies

I am coming to the close of my three-month internship with the Fedora infrastructure team, thanks to the GNOME Outreach Program for Women, and I am happy to tell any woman interested in applying that yes, she should. I haven't been blogging about my experiences as much as I should have, but I have enjoyed the royal treatment of entering the open source world. The application process and the mentorship help you first learn about the opportunities available and how to research projects to join, and then develop your skills while contributing to something that affects a lot of people, working with people you might not have met so easily without the program. While I attended PyCon, I was asked a few questions about the program application process. So here are my answers.

Q: IRC? I don't use IRC, is that a problem? A: You will have to learn to use IRC a little if chosen for the program. It is super easy and not worth letting it deter you from the program. When you first contact a potential mentor, email is fine too. They understand that not all new people have used IRC. The program also accepts applicants with related skills, such as design and documentation, who tend to use IRC less than developers do.

Q: Was your first task hard? A: The first task is supposed to just get you into the code. See if you can get the code running on your computer and poke around to figure out how to solve something simple. My first task, making an output JSON serializable, sounded difficult because I didn't know what JSON was. It was actually just formatting commas, brackets, and parentheses correctly so that the output could be read properly. Having JSON outputs proved to be important in the datanommer project. Though it probably would have taken my mentor less than two minutes to do it himself, it took me an hour of reading through Google results to figure out what I was supposed to do. But I did learn something.
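As a rough illustration of what "JSON serializable" means (this is not the actual datanommer change), here is a minimal sketch using Python's standard json module. The message dictionary, its topic string, and the to_serializable helper are hypothetical names made up for the example.

```python
import json
from datetime import datetime

# Hypothetical message, invented for this example.
message = {
    "topic": "org.example.git.receive",
    "timestamp": datetime(2013, 1, 14, 11, 14, 4),
    "i": 1,
}


def to_serializable(value):
    """Fallback used by json.dumps for objects it cannot encode on its own."""
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError("Cannot serialize %r" % (value,))


# Without help, json.dumps refuses the datetime value.
try:
    json.dumps(message)
    failed = False
except TypeError:
    failed = True

# With a fallback function, every value ends up as plain JSON text.
encoded = json.dumps(message, default=to_serializable, sort_keys=True)
print(encoded)
```

Once the output is valid JSON, any other program can read it back with json.loads, which is the point of making it serializable.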

Q: Is the application competitive? A: Yeah. There are only a few positions per participating organization, and then multiple projects within each organization. You can make a contribution to more than one project to better your chances that the organizers and mentors will find a place for you. If you stay involved after making one contribution by helping others (without stealing their task or putting them down) and asking questions about the project, then they will want to find a spot for you. The number of positions they wanted to fill per organization was different from the number actually filled because they cared more about getting good contributing women involved in open source than having two people interning with Fedora (or Mozilla, or other groups). Be active and do more than the minimum asked of you, and you should have no problem being competitive.

Post PyCon 2013

I am very thankful to the Python Software Foundation and PyLadies for sponsoring my attendance at PyCon this year, my first time attending PyCon, though I have stuffed bags and socialized with attendees for the past two years. One of the great things about PyCon is that everyone leaves with a different experience. There are so many options of tracks and activities for different skill levels and interests, as well as different groups of attendees, sponsors/recruiters, and projects (e.g. PyPy) or membership groups (e.g. PyLadies). Although concurrency was a hot topic this year (again), a few other topics piqued my interest, influenced and strengthened by the tutorials and talks I attended.

The first day of tutorials, I attended Matt Harrison's Hands-on Intermediate Python. It flew through a lot of material quickly but provided a lot of activities and reading material that I am looking forward to as homework. It also got me interested in functional programming, wondering why some Pythonistas are very against using map and lambda. I believe it is just because list comprehensions are really awesome as replacements. The second day of tutorials, I attended Mike Bayer's Introduction to SQLAlchemy. I now have a better understanding of why I have seen so many different formats to do the same thing and understand a little about memory while defining variables in development (:name_1). Again, I feel like I need to study, read, and practice.
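To make the map and lambda comparison concrete, here is a small sketch (my own example, not material from the tutorial) that writes the same two operations both ways. The list() calls are only needed on Python 3, where map and filter return iterators.

```python
numbers = [1, 2, 3, 4, 5]

# map with a lambda, wrapped in list() so it also works on Python 3.
squares_map = list(map(lambda n: n * n, numbers))

# The equivalent list comprehension.
squares_comp = [n * n for n in numbers]

# filter with a lambda versus a comprehension with a condition.
evens_filter = list(filter(lambda n: n % 2 == 0, numbers))
evens_comp = [n for n in numbers if n % 2 == 0]

print(squares_map == squares_comp)
print(evens_filter == evens_comp)
```

Both styles give the same results; the comprehension simply reads closer to the sentence "n squared for each n in numbers".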

My first day of the conference was interrupted by signing the lease to our Bay Area peninsula apartment. I did get to attend a pep talk on writing about code and a talk on functional programming, though I spent more time networking with booths and other attendees. It is really cool to reconnect with people I have met in various cities and travels, such as Ian Ozsvald from Startup-Chile, Brandon Rhodes from Atlanta, and a couple of people I met while visiting my husband at PyCon last year. The second day, I spent all morning in back-to-back testing sessions. I enjoy testing in all ways that involve Python, unit or integration, selenium webdriver or just unittest. The final day was also moving day. I heard Guido's keynote on the controversial topic of including async in 3.4 (how he's doing it is the controversy), and talked to a few companies at the career fair.

Now that I have seen a few talks, I believe I could try to do one myself. It seems to me that the best way to do a thirty minute talk is to demo something to get people excited and motivated to learn how to use it themselves. I unfortunately did not get to attend the Sprints. Part of affordable babysitting for the conference was also giving my sister (the babysitter) a good time before she left. I was also being DD to my husband, who was receiving more free beers at the Pyramid sprints than most hot single women can get at a bar. One of my hopes of attending PyCon was to make friends with PyLadies, but I much preferred the less specific Women Who Code group. I'm still looking for daycare, a job, and furniture, but I would like to become involved with their meetups soon.

Alembic Tips

I've spent the past two weeks changing the database models for Fedora's datanommer, but we also wanted to keep the data that was in the previous models. Thus, we needed alembic. I do not claim to be an expert at all, but I do think I figured out some tips that could help others in their own first experience with alembic:

  1. You might not be able to use sqlite. If you are dropping columns, for example, sqlite just doesn't work, but postgres does.
  2. Develop your script in upgrade-downgrade pairs. If you have multiple columns or tables that need to be changed, write the upgrade and downgrade for each change and run the upgrade and downgrade while inspecting the database to make sure it worked before moving on to another alteration.
  3. Alter tables with CLI. If an upgrade or downgrade was interrupted, you may have a funky combination of the two versions. The problem is that trying to run another upgrade or downgrade will fail because the tables it's trying to delete aren't there or the columns it wants to create already exist. Altering the tables in the terminal to match the models your code expects can fix this.
  4. Contributors and users need different versions and documentation depending on their history. If a user starts at the initial None version, upgrades to Version A (version names are actually much longer), and then upgrades to Version B, a future upgrade script will go from B to C. But if a new contributor or user starts at Version B, then Version B is their None, and to upgrade to Version C they have to delete the version scripts None to A and A to B and write a new version from None to C.
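The upgrade-downgrade pairing from tip 2 can be sketched without Alembic at all. The example below is only an illustration using the standard-library sqlite3 module and raw SQL, not an Alembic migration; the messages table, the category column, and the column_names helper are hypothetical. It also shows the kind of workaround tip 1 hints at: SQLite releases that cannot drop a column need the table rebuilt instead.

```python
import sqlite3


def column_names(conn, table):
    """Return the column names of a table, for inspecting the schema."""
    return [row[1] for row in conn.execute("PRAGMA table_info(%s)" % table)]


def upgrade(conn):
    # Adding a column is supported by SQLite's ALTER TABLE.
    conn.execute("ALTER TABLE messages ADD COLUMN category TEXT")


def downgrade(conn):
    # Older SQLite releases cannot drop a column, so rebuild the table without it.
    conn.execute("CREATE TABLE messages_old (id INTEGER PRIMARY KEY, topic TEXT)")
    conn.execute("INSERT INTO messages_old (id, topic) SELECT id, topic FROM messages")
    conn.execute("DROP TABLE messages")
    conn.execute("ALTER TABLE messages_old RENAME TO messages")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, topic TEXT)")
conn.execute("INSERT INTO messages (topic) VALUES ('git')")

# Run one change forward and back, inspecting the schema after each step.
upgrade(conn)
after_upgrade = column_names(conn, "messages")
downgrade(conn)
after_downgrade = column_names(conn, "messages")
rows = conn.execute("SELECT id, topic FROM messages").fetchall()

print(after_upgrade)
print(after_downgrade)
print(rows)
```

Checking the schema and the surviving rows after each direction is the habit worth keeping, whichever migration tool runs the SQL.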

The Alembic documentation and tutorial are located at alembic.readthedocs.org. The tutorial covers how to add tables and columns, but a more realistic use of alembic could use SQL queries and python logic. Here's an example from Fedora's datanommer, where we had multiple tables to distribute among multiple servers for each type of data source (bodhi, git, busmon, etc.) but decided to put them all together.

    """one model

    Revision ID: 198447250956
    Revises: None
    Create Date: 2013-01-14 11:14:04.738115

    """

    # revision identifiers, used by Alembic.
    revision = '198447250956'
    down_revision = None

    from alembic import op
    import sqlalchemy as sa

    from sqlalchemy.schema import MetaData
    from sqlalchemy.sql import text

    tables = [
        "bodhi", "busmon", "compose", "fas", "git", "httpd", "koji",
        "logger", "meetbot", "tagger", "unclassified", "wiki"
    ]

    metadata = MetaData()

    def get_table_args(tname):
        return [
            tname,
            metadata,
            sa.Column('id', sa.Integer, primary_key=True),
            sa.Column('i', sa.Integer, nullable=False),
            sa.Column('timestamp', sa.DateTime, nullable=False),
            sa.Column('certificate', sa.UnicodeText),
            sa.Column('signature', sa.UnicodeText),
            sa.Column('topic', sa.UnicodeText),
            sa.Column('_msg', sa.UnicodeText, nullable=False)
        ]

    def map_values(row):
        return dict(
            i=row[0],
            timestamp=row[1], #datetime.strptime(row[1], '%Y-%m-%d %H:%M:%S.%f')
            certificate=row[2],
            signature=row[3],
            topic=row[4],
            _msg=row[5],
        )


    def upgrade():
        base_query = "SELECT i, timestamp, certificate, signature, topic, _msg FROM %s"

        # create table messages, messages_table is the python var for the sql table
        args = get_table_args('messages')
        engine = op.get_bind().engine
        messages_table = sa.Table(*args)
        metadata.create_all(engine)

        # query each topic table and insert into messages table
        for table in tables:
            tname = '%s_messages' % table
            query = base_query % tname

            results = engine.execute(text(query))
            # build a real list so the emptiness check below works on any Python version
            data = [map_values(row) for row in results.fetchall()]

            if data:
                op.bulk_insert(messages_table, data)


        # drop each topic table
        for table in tables:
            op.drop_table('%s_messages' % table)


    def downgrade():
        base_query = """
        SELECT i, timestamp, certificate, signature, topic, _msg FROM messages WHERE topic LIKE '%{0}%'
        """

        engine = op.get_bind().engine

        # create table for each topic
        db_tables = {}
        for table in tables:
            tname = '%s_messages' % table
            args = get_table_args(tname)
            db_table = sa.Table(*args)
            db_tables[tname] = db_table

        metadata.create_all(engine)

        # query message table with topic filter and insert in appropriate table
        for table in tables:
            tname = '%s_messages' % table
            results = engine.execute(text(base_query.format(table)))

            # build a real list so the emptiness check below works on any Python version
            data = [map_values(row) for row in results.fetchall()]

            if data:
                op.bulk_insert(db_tables[tname], data)

        op.drop_table('messages')

My Love of Daycare

I'm doing a lot right now. We are preparing our final reimbursement process for Start-Up Chile, our startup is among the finalists to present to a VC for $120,000, I have a three-month internship with Outreach Program for Women to contribute to Fedora's datanommer project, and I still try to do some QA work on the development of our startup's software. And I'm pregnant, and learning Spanish. Luckily, I love my son's daycare.

I know that daycare is a somewhat controversial topic, so first let me tell you how awesome my daycare is. There are fewer than eight kids under the age of two, with one teacher who attended college for early childhood education and two assistants. Every day, they do something with music: dancing to, making, or listening to music. They only speak Spanish to the babies but have an English teacher on staff for the older children. They have cultural celebrations and invite children's theatrical groups to perform plays. Even though they do painting and other crafts, I never see mess on his clothes or hands. They enforce house rules without physical punishment. I could just go on and on... And it's 1/3 the price of daycare in the States.

It irritates me when people criticize mothers who use daycare. I do not use daycare because I'm lazy or because I want to spend less time with my child. I know without a doubt that my son is learning things in daycare that I cannot teach him. I am absolutely positive that my son loves the Tias (instructors) because he gives them hugs and kisses every morning. When we encounter his classmates on the way to or from daycare, my son enjoys greeting them. I do miss him during the day, but I think the financial burden of his daycare is well worth the benefits.

I was a stay-at-home mother for the first year of my son's life, except for spending some time learning how to code. I think my son and I are both much happier now that my son has friends and a new learning environment while I can dedicate time to my skills, intellect, profession, and, don't forget, my husband.

In parenting, there are always strong opinions and criticisms of different decisions. I realize that finding an awesome and affordable daycare may not be possible for everyone so I don't think parents who keep their children home are necessarily overprotective. They may be just the right amount of protective. Our daycare just happened to be a perfect fit for our needs. I'm sad that we're moving at the end of February because we're going to lose his daycare (and our awesome view of the beach).

Virtualenv(wrapper)

A virtual environment was first described to me as a sandbox for your code, but I had no idea what that meant. Now I know that it is a way to isolate your projects from other projects' dependencies. Each project has its own environment and its own installations of dependencies, and the environment has to be activated each time you want to use them. It may seem irritating to install, for example, pyramid and selenium three times for three different projects, but the little extra time is worth the trouble.

One reason for using a virtualenv is to know which dependencies your project actually needs. Whether you are open sourcing and want others to be able to contribute, or you're deploying on a server that needs the same environment as the one you are developing on, you should know what your dependencies are. Without a virtualenv, if an error or bug is occurring in production but not in development, then the error could be from missing dependencies on the production machine. If you or someone else wants to contribute to your project from a different machine, all the python dependencies should be installed with setup.py. Another reason to use a virtualenv is when different projects are using different versions of dependencies. Using a virtual environment is one of the first steps to being able to work on multiple projects, including open-source. An alternative is Buildout, but a virtual environment is faster and simpler to get started.

Now that I've hopefully convinced you to use a virtual environment, I present two ways of using a virtual environment. The 'old' way is kind of like turning the light on with a light switch, compared to the newer virtualenvwrapper which is like a clap-on mechanism to do the same thing.

To install virtualenv for the first time,

$ pip install virtualenv

OR

$ easy_install virtualenv

Creating a virtual environment for your project 'testproject', and activating it to install your python packages such as pyramid or use your installed packages is as simple as:

$ virtualenv testproject
$ cd testproject
$ source bin/activate
(testproject)$ pip install pyramid

Or to go back to your virtualenv that has already been created:

$ cd testproject
$ source bin/activate

You can actually run the activate script from anywhere. Sometimes I forget to activate my virtualenv until I'm already inside my src/myapp directory, so I activate with: source ../../bin/activate. If you ever want to check which packages you have installed, run pip freeze; the bin directory only holds executables such as python, pip, and any command-line scripts your packages install. Pip is included with virtualenv, but easy_install is not. Depending on your terminal configuration, your prompt may look a little different once activated, but it should show the activated virtualenv in each prompt. When working in multiple terminals or tabs, the virtualenv has to be activated in each session. Some older documentation may include the argument --no-site-packages when creating a virtual environment, but that behavior is now the default and the argument is no longer required.
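If the prompt does not make it obvious whether an environment is active, Python itself can be asked. The sketch below is a best-effort check, assuming the usual behavior that classic virtualenv sets sys.real_prefix while newer tools leave sys.base_prefix pointing at the system installation; in_virtualenv is a hypothetical helper name.

```python
import sys


def in_virtualenv():
    """Best-effort check for an active virtual environment."""
    # Classic virtualenv releases set sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer tools keep the system location in sys.base_prefix instead.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print(sys.prefix)        # where this interpreter looks for its packages
print(in_virtualenv())
```

Running it once before and once after source bin/activate should show sys.prefix switching to the project directory.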

An alternative practice is to use virtualenvwrapper, which is a set of extensions for virtualenv. The syntax is a little different when using virtualenvwrapper. To install:

$ pip install virtualenvwrapper

Now same as before: creating a virtual environment for your project 'testproject', and activating it to install pyramid:

$ mkvirtualenv testproject
(testproject)$ pip install pyramid

The mkvirtualenv command creates the virtual environment and activates it. Now to work on your virtualenv that has already been created:

$ workon testproject

To move into the directories or apps within your project folder:

(testproject)$ cdvirtualenv test

The virtualenvwrapper does simplify a few of the actions needed to use a virtualenv. Personally, I think using virtualenv without virtualenvwrapper is better for beginners. I think it's better to learn those steps that the virtualenvwrapper hides. I have seen open source documentation with either virtualenv or virtualenvwrapper, but you can use either.

We Got In!

Today, about two months after the round four applications for StartUp Chile (www.startupchile.org) closed, the accepted 100 teams were announced. Our team was one of those chosen to receive 20.000.000 Chilean pesos ($40,000 US) over a six-month period to start our business in Chile. The money is not tied to any exchange of equity. However, the government does not simply hand over cash either. The program requires us to spend our own money, getting reimbursed up to 90% of each expense for a total of $40,000 over the six-month period. We also have to volunteer our time for various causes within the community that directly benefit the local population. Most people in the program earn credit by speaking about entrepreneurship. Our friend George Cadena, founder and CEO of StudioSnaps, taught an entrepreneurship course at local universities and has continued to teach and mentor beyond the duration of his program requirements. We will probably earn our credit by teaching programming and putting on events with our event planning software. Most of the StartUp Chile events are currently using Welcu. We believe we can replace them with our superior knowledge of the industry, our higher standards of quality, and our exceptional technology. Their goal is to be acquired, and it shows. We aim to manage the entire event planning process, from communicating with clients, contracting vendors, organizing speakers, and collecting registrations to analyzing reports. We LOVE the event planning business. Our passion, experience, and quality should make us a serious threat in Chile, South America, and the rest of the world.

Investment Aversion

We have many ideas for businesses that we could be launching as startups. One of our ideas is especially low risk, has a high earnings potential, is related to our CTO's work experience, and is in a fun and exciting industry. Naturally, we've decided to focus our energy on that product, so that we can earn enough money to be able to work on other projects later. That idea is our conference management system.

One of the reasons our conference management system is so well thought out is that we have not been shy to discuss our ideas with other entrepreneurs. We don't fear our idea being stolen because we are confident that no other person or team could complete our product better than ourselves. We've also analyzed the competition and learned from related industries. A couple times while we've discussed the software with someone who has achieved a high level of financial success, they have expressed interest in investing. Many young people with a startup would give up a large portion of their company for some cash, but not us.

Giving equity is messy and time consuming for a few reasons. First, you have to invest the time to know your potential investor's goals. If your investor wants the company to change or go in a different direction, it's kind of hard to tell them no after they've given you money. You need to have compatible goals and motivation to work well with an investor. Second, you have to determine if the investor truly adds value to your project. Sure, cash is value. But other than cash, can they offer connections? Do they have experience in your industry? Do they want to be on the board, involved in company decisions as a mentor? Third, you have to get everything in writing and pay for lawyers and accountants to organize ownership and vesting schedule contracts. You may spend two or three weeks ironing out details with a potential investor, only for them to change their minds the day before signing the contract.

Then once you go through the laborious process of securing your investor, the politics start. Be prepared to explain every business expense, ask permission for every new hire, and justify any changes to the business plan. Instead of just making changes to your product to attract more business, everything is dependent on metrics. If a change improved revenue, you had better be able to prove it. If you disagree with an idea or a hire suggested by your investor, you need to have spreadsheets, dollar signs, and a presentation prepared to be convincing.

Startups without investment do face a lot of risk and challenges. But as a reward for being truly bootstrapped, they are incredibly mobile and dynamic. Smaller, lighter objects can change directions faster than objects with greater mass. Small teams operating out of a shared apartment can adapt much more quickly than a large organization that requires board meetings and weekly email updates. An investor may just become an extra phone call or hoop to jump through before acting on a necessary innovation or pivot. A good investor is like love. It will happen when you aren't actively looking for it, it will be harmonious, and you would never want to be without them. I would not be averse to a truly beneficial investor, but we're not eager to find our angel just yet. I'd rather be cautious about involving other people in our plans and keep the equity to ourselves.

Intermediate Guess Number Python Game

In my first month of learning python, I wrote a simple guess the number game. It did not include exception handling, was not PEP 8 compliant, and was organized somewhat poorly. It also needed updating because it used what I like to call "C Print Formatting," the % style of string formatting, which used to be the standard in python but is now being replaced by str.format. http://www.python.org/dev/peps/pep-3101/ discusses why. I decided to include a little of my coding process in this post as a way to pre-empt debugging.
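For anyone who has not seen the two styles side by side, here is a short comparison. It is a sketch with made-up values (the name and guesses variables are just for illustration), not code from the original game.

```python
name = "Jessie"
guesses = 6

# Older, C-style formatting with the % operator.
old_style = "%s has %d guesses" % (name, guesses)

# str.format, the style described in PEP 3101.
new_style = "{:s} has {:d} guesses".format(name, guesses)

# Fields can also be named, which helps when a message has many values.
named = "{player} has {count:d} guesses".format(player=name, count=guesses)

print(old_style)
print(new_style)
print(named)
```

All three lines print the same sentence; only the way the values are slotted in differs.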

The following code example was my first iteration. I knew that the user would be making multiple guesses, so my prompt for guess had to be a separate function called multiple times. I also included exception handling with my prompt so that the guess had to be a number within range and not a letter. I didn't know yet if generating the random integer would be complicated, so I hard coded a value for the correct answer and had the program print when completed, so I knew if the program reached the final step. In this version, you get one guess and the program doesn't know if you are wrong or right.

minm = 1
maxm = 20
max_guesses = 6

def randomize():
    return 4

def prompt_for_guess():
    try:
        message = "Guess a number between {:d} and {:d} \n"
        guess = int(raw_input(message.format(minm, maxm)))
        if guess not in range(minm, maxm + 1):
            return prompt_for_guess()
        else:
            return guess
    except ValueError:
        print("Must be a number.")
        return prompt_for_guess()

def main():
    message = "I'm thinking of a number. I'll give you {:d} guesses"
    print(message.format(max_guesses))
    guess = prompt_for_guess()
    answer = randomize()
    print("The answer was {:d}".format(answer))

if __name__ == "__main__":
    main()

Running the program with inputs of a letter, a number out of range, and a number within range verified that my exception handling worked. At this time, I imported random in my terminal and read the directory and help to review the use of randint. Randint includes the last boundary, unlike the built-in python range. Because it was only one line of code and I only needed to use it one time, I decided it did not need its own function. My next attempt looked like this:

import random

# configs
minm = 1
maxm = 20
max_guesses = 6


guesses_taken = 1


def prompt_for_guess():
    """Asks for a guess and repeats if input is not a number in range"""
    try:
        message = "Guess a number between {:d} and {:d}. \n"
        guess = int(raw_input(message.format(minm, maxm)))
        if guess not in range(minm, maxm + 1):
            return prompt_for_guess()
        else:
            return guess
    except ValueError:
        print("Must be a number.")
        return prompt_for_guess()


def main():
    message = "I'm thinking of a number. I'll give you {:d} guesses"
    print(message.format(max_guesses))
    answer = random.randint(minm, maxm)
    if guesses_taken <= max_guesses:
        guess = prompt_for_guess()
        if guess == answer:
            print("That's it! You win!")
            break
        else:
            if answer > guess:
                reason = "low"
            else:
                reason = "high"
        print("Your guess is too {:s}".format(reason))
        guesses_taken += 1
    print("The answer was {:d}".format(answer))


if __name__ == "__main__":
    main()

But this had some errors. First, guesses_taken is assigned outside my main function, so main cannot change its value without declaring it a global variable. Unlike minm, maxm, and max_guesses, which I only read, I want to edit the value of guesses_taken. My second error was using an if statement instead of a while loop to check that the guesses taken are still within the maximum (and break is only allowed inside a loop). I want my loop to continue until the user exceeds the number of allowed guesses or gets the answer right. My third mistake was my final print statement. It was useful for my initial attempt to write the program, but now I only want to reveal the answer if the user loses.
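The first error is easier to see in isolation. The sketch below is a stripped-down illustration rather than the game itself: broken_increment and count_guesses are hypothetical function names, and the second one shows the fix of keeping the counter local to the function that changes it.

```python
guesses_taken = 1  # module-level name, like the one in the second attempt


def broken_increment():
    # Assigning to the name makes it local to this function, so reading it
    # before that assignment raises UnboundLocalError.
    guesses_taken += 1


def count_guesses(max_guesses):
    # Keeping the counter local to the function avoids the problem entirely.
    taken = 0
    while taken < max_guesses:
        taken += 1
    return taken


try:
    broken_increment()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)
print(count_guesses(6))
```

Declaring the name with the global statement would also work, but a local counter keeps the function self-contained.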

Now The Final Code

import random

# configs
minm = 1
maxm = 20
max_guesses = 6


def prompt_for_guess():
    """Asks for a guess and repeats if input is not a number in range"""
    try:
        message = "Guess a number between {:d} and {:d}. \n"
        guess = int(raw_input(message.format(minm, maxm)))
        if guess not in range(minm, maxm + 1):
            return prompt_for_guess()
        else:
            return guess
    except ValueError:
        print("Must be a number.")
        return prompt_for_guess()


def main():
    message = "I'm thinking of a number. I'll give you {:d} guesses"
    print(message.format(max_guesses))
    answer = random.randint(minm, maxm)
    guesses_taken = 1
    while guesses_taken <= max_guesses:
        guess = prompt_for_guess()
        if guess == answer:
            print("That's it! You win!")
            break
        else:
            if answer > guess:
                reason = "low"
            else:
                reason = "high"
            print("Your guess is too {:s}".format(reason))
        guesses_taken += 1
        if guesses_taken > max_guesses:
            print("The answer was {:d}".format(answer))


if __name__ == "__main__":
    main()
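One detail from the walkthrough that is easy to check in an interpreter is the boundary difference between randint and range. This is a small sketch reusing the minm and maxm names from the game; the fixed seed and the number of draws are arbitrary choices made only for the demonstration.

```python
import random

minm, maxm = 1, 20

# range stops before its second argument, so maxm + 1 is needed to include 20.
in_range = list(range(minm, maxm + 1))

# randint can return either endpoint; many draws make both show up.
random.seed(0)  # fixed seed only so repeated runs of this sketch match
draws = [random.randint(minm, maxm) for _ in range(10000)]

print(in_range[0], in_range[-1])
print(min(draws), max(draws))
```

This is why prompt_for_guess checks against range(minm, maxm + 1) while the answer comes from randint(minm, maxm) with no adjustment.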

Easy Anagram Dictionaries Practice

I do not use dictionaries very often. Friday, I was without internet all day, so I took the opportunity to play with dir() and help() to discover some dictionary properties. My short-lived obsession with Draw Something on the iPhone has gotten me interested in anagrams (kicked the habit by reading programming books). I believe using dictionaries is the fastest and most accurate way to determine if two words are anagrams of each other, at least without importing any other modules.

A dictionary is an unordered set of key: value pairs. Keys must be an immutable type. Values can be anything. Being unordered gives dictionaries some interesting properties, different from any other python data structure. Instead of being indexed by numbers, dictionaries are indexed by keys. Because they are indexed by keys, each key is unique within its dictionary. Two dictionaries cannot be added to each other directly, but the values stored under the same key in each can be combined when their data types allow it. This is convenient for our anagram activity. But first, some dictionary review.

>>> sample_dict = {}        # creates an empty dictionary
>>> type( sample_dict )
<type 'dict'>
>>> sample_dict['Name'] = 'Jessie'        # creating a key:value pair
>>> sample_dict['Age'] = 23        # another key:value pair
>>> sample_dict
{'Age': 23, 'Name': 'Jessie'}
>>> sample_dict2 = {'Name': 'Jessie', 'Age': 23}        # another way to create dict
>>> type (sample_dict2)
<type 'dict'>
>>> sample_dict2
{'Age': 23, 'Name': 'Jessie'}
>>> sample_dict + sample_dict2        # cannot add dictionaries, only values
Traceback (most recent call last):
  File "<console>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict' and 'dict'
>>> sample_dict['Age'] + sample_dict2['Age']        # adds values
46
>>> sample_dict.keys()
['Age', 'Name']
>>> sample_dict.values()
[23, 'Jessie']
>>> type( sample_dict.values() )        # keys and values are returned as lists
<type 'list'>
>>> sample_dict.get('Age')        # gets the value at a specific key
23
>>> type( sample_dict.get('Age'))         # value maintains data type in dictionary
<type 'int'>
>>> sample_dict.has_key('Age')        # D.has_key() returns boolean
True
>>> sample_dict3 = {'Children': 'Graham'}
>>> sample_dict3.update(sample_dict)        # update keys and values
>>> sample_dict
{'Age': 23, 'Name': 'Jessie'}
>>> sample_dict3        # Children field is added as a key:value pair
{'Age': 23, 'Children': 'Graham', 'Name': 'Jessie'}
>>> {'Age': 23, 'Name': 'Jessie'} == {'Name': 'Jessie', 'Age': 23}        # different order is equal
True

How do we know if two words are anagrams? Consider the anagrams odor and door. We could say that they are reshuffled strings. Each word uses the same letters, but in a different order: 2 o's, 1 r, and 1 d. My simple program creates empty dictionaries for the two words being compared, stores the letters as keys, and adds to the value for each occurrence of the same letter, then checks that the dictionaries are equivalent. I have not included exception handling and I made the design decision to count white space as part of the anagram such that 'abc def' is not an anagram of 'fdeabc,' but is an anagram of 'abc fed.'

def get_dict(word, count):
    for i in word.lower():
        if count.has_key(i):
            count[i] += 1
        else:
            count[i] = 1
    return count

def main():
    word1 = raw_input("What is the first word? \n")
    word2 = raw_input("What is the second? \n")
    count1 = {}
    count2 = {}
    count1 = get_dict(word1, count1)
    count2 = get_dict(word2, count2)
    if count1 == count2:
        print("Yes, those are anagrams!\n")
    else:
        print("No, you've failed \n")

if __name__ == "__main__":
    main()
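The same idea can be packaged as a function that returns True or False, which makes it easier to try on several pairs of words at once. This is a sketch of one possible variation, not a replacement for the program above; letter_counts and is_anagram are hypothetical names, and dict.get stands in for the has_key check so the code runs on both Python 2 and Python 3.

```python
def letter_counts(word):
    """Count each character of word, ignoring case but keeping spaces."""
    counts = {}
    for char in word.lower():
        # get() returns 0 for characters that have not been seen yet
        counts[char] = counts.get(char, 0) + 1
    return counts


def is_anagram(first, second):
    """Two strings are anagrams when their character counts match exactly."""
    return letter_counts(first) == letter_counts(second)


print(is_anagram("odor", "door"))
print(is_anagram("abc def", "fdeabc"))
print(is_anagram("abc def", "abc fed"))
```

Because white space is counted like any other character, the second pair is rejected while the third is accepted, matching the design decision described earlier.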