CH

August 18, 2014

nothing beats reading the source - boto and s3 and multiple-arity edition


I'm scripting the provisioning of client clusters today. This entails standing up Amazon Relational Database Service (RDS) instances for Datomic to stuff facts into. I'm using the Python[1] boto library[2] to stand up these RDS instances. When first I went to stand up an instance, I said, "dear boto: run you this command!"

conn.create_dbinstance(
    id=identifier,
    instance_class=instance_class,
    allocated_storage=allocated_storage,
    engine=engine,
    db_name=db_name,
    master_username=master_username,
    master_password=master_password,
    port=port,
    security_groups=security_groups)

conn, of course, being some wacky connection object with RDS manipulation methods hanging off of it[3].

AWS responds thusly:

DB Security Groups can only be associated with VPC DB Instances using API versions 2012-01-15 through 2012-09-17.

Fine. Okay. I'm accustomed to this - I brought this on myself by leaving the wood shop and the metal fabrication studio and the high-end semiconductor probe manufacturing company for a lifestyle and beer in the office. How then is one to query the AWS API for the upped-ness status of an instance[4]?

The answer is to call get_all_dbinstances[5] with an instance ID:

def get_all_dbinstances(self, instance_id=None, max_records=None,
                        marker=None):
    """
    Retrieve all the DBInstances in your account.

    :type instance_id: str
    :param instance_id: DB Instance identifier.  If supplied, only
                        information this instance will be returned.
                        Otherwise, info about all DB Instances will
                        be returned.

    :type max_records: int
    :param max_records: The maximum number of records to be returned.
                        If more results are available, a MoreToken will
                        be returned in the response that can be used to
                        retrieve additional records.  Default is 100.

    :type marker: str
    :param marker: The marker provided by a previous request.

    :rtype: list
    :return: A list of :class:`boto.rds.dbinstance.DBInstance`
    """
    params = {}
    if instance_id:
        params['DBInstanceIdentifier'] = instance_id
    if max_records:
        params['MaxRecords'] = max_records
    if marker:
        params['Marker'] = marker
    return self.get_list('DescribeDBInstances', params,
                         [('DBInstance', DBInstance)])
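Here is roughly how that call could be wrapped into a "wait until it's up" loop. This is a sketch and not anything boto ships: wait_for_dbinstance, its timeout and wait_cycle defaults, and the injectable sleep argument are all names I made up for illustration. It assumes the DBInstance objects that come back carry a status attribute that eventually reads 'available', and that the instance already exists (asking for an unknown ID makes the API raise an error instead of returning an empty list).

```python
import time


def wait_for_dbinstance(conn, instance_id, timeout=600.0, wait_cycle=15.0,
                        sleep=time.sleep):
    """Poll RDS until instance_id reports 'available'; return the DBInstance.

    conn is assumed to be a boto RDS connection (anything with a
    get_all_dbinstances method works). sleep is injectable so the loop
    can be exercised without actually waiting.
    """
    waited = 0.0
    while waited <= timeout:
        found = conn.get_all_dbinstances(instance_id=instance_id)
        # Assumption: each DBInstance exposes a `status` string.
        if found and found[0].status == 'available':
            return found[0]
        sleep(wait_cycle)
        waited += wait_cycle
    raise RuntimeError('Timed out waiting for RDS instance %s' % instance_id)
```

Called as wait_for_dbinstance(conn, identifier) right after create_dbinstance, it either hands back the instance or gives up after ten minutes. A plain loop, note, and no recursion.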

In the EC2 scripts, I explicitly iterate over all of my reservations, looking for the reservation ID returned from the run_instances command. Then I just loop over that reservation until its instance reports a public DNS name. I must be pretty dumb, because this is the best solution I've found to the bog-standard use case of "give me a new instance and return its instance ID":

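from time import sleep

def get_current_hostname(conn, resid):
    # Poll EC2 until the first instance in reservation `resid` has a public
    # DNS name. all_reservations is a helper defined elsewhere in my scripts.
    i = 0
    wait_cycle = 3.0
    while True:
        if i * wait_cycle > 60.0:
            raise Exception('Timed out waiting for EC2 to return a hostname')
        for res in all_reservations(conn):
            if res.id == resid:
                pdn = res.instances[0].public_dns_name
                if pdn:
                    return pdn
        # No hostname yet, or the reservation isn't listed yet: wait, retry.
        i += 1
        sleep(wait_cycle)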
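For comparison, here is a sketch that skips the scan over every reservation. It leans on two things I believe are true of boto 2 but which are not what my scripts do: the Reservation returned by run_instances already carries its Instance objects (so the instance ID is right there as instances[0].id), and each Instance has an update() method that re-fetches its own attributes. wait_for_public_dns and its parameters are hypothetical names.

```python
import time


def wait_for_public_dns(instance, timeout=60.0, wait_cycle=3.0,
                        sleep=time.sleep):
    """Refresh one EC2 instance until it has a public DNS name, then return it.

    instance is assumed to be a boto EC2 Instance (anything with an
    update() method and a public_dns_name attribute works).
    """
    waited = 0.0
    while waited <= timeout:
        instance.update()  # ask EC2 for this instance's current attributes
        if instance.public_dns_name:
            return instance.public_dns_name
        sleep(wait_cycle)
        waited += wait_cycle
    raise RuntimeError('Timed out waiting for EC2 to return a hostname')


# Hypothetical usage, given an EC2 connection `conn` and an AMI ID `ami`:
#   reservation = conn.run_instances(ami)
#   hostname = wait_for_public_dns(reservation.instances[0])
```

Same timeout arithmetic as above, one instance per API call instead of the whole account.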

Anyways, the lesson of all this is that[6] there is no substitute for a) an editor that browses source (thanks, Emacs! Thanks, Elpy! Thanks, Mr. Schafer!), and b) reading the source. Documentation, man. It's always damn misleading[7], at least in some regard.

I am derpy because all that I know of Java I learned from Clojure. Functions! That's all I know, man. I don't even write macros cuz I'm too dumb.

Footnotes:

1

2.7 in case you're curious, because that's what ships with OS X, and my lifespan is finite and I'm trying to actually ship some software over here.

2

Clearly the authors of this lib were thinking about working on one project for one company, or maintaining this file by hand, or something else inscrutable. For the hired gun randomly poking at the AWS API for random projects of all kinds, all of this needs to be explicitly declared in the provisioning scripts somehow.

This means that there's a) Python and b) local configuration to deal with if using the CLI tools. This rules the CLI tools clean out, and means that my options for scripting up the enclusterification of client projects are a) dig up and prod the relevant endpoints myself (the problem with which is that they spit back XML (not even JSON I can parse with jq, ffs)) or b) take a dependency on some other language that parses XML, in which case why not just use one of the AWS APIs from the get-go?

Which brings me to a point I've been harping about lately which is that my bash scripts shell out to python and not the other way around and as far as I'm concerned this is the correct approach. Direct arguments on the topic to benkay@gmail.com.

3

I'm just some derpy lisper, so for the record, this is how I've turned the OOP insanity inside-out with referentially-transparent functions so that things make at least a modicum of sense:

import boto

def new_bucket(conn, bucketname, region):
    # The connection comes in as an argument; the region is actually used.
    return conn.create_bucket(bucketname, location=region)

s3conn = boto.connect_s3(aws_access_key_id=creds['AWSAccessKeyId'],
                         aws_secret_access_key=creds['AWSSecretKey'])
new_bucket(s3conn, bucketname, region)

The critical difference is that my new bucket function takes in the connection as an argument and then calls the appropriate method dangling off the thing. I can't help but imagine passing this giant jangling set of keys around my scripts as this whole thing happens. But I digress.

4

For the love of Christ, don't use recursion in Python. This intuitive approach to programming (which has been technically solved since the eighties? seventies? Longer than I've been alive, certainly) will make Python blow up if it recurses more than a trivial number of times. Mr. van Rossum says this has something to do with stack traces, which doesn't make sense to my derpy lispy Java brain, as Clojure's stack traces are either entirely incomprehensible or totally legible depending on who you ask and how long they've been hacking Clojure. In any case, the problem is above my head and beyond my pay grade, so disregard my derping on the topic.
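A small sketch of the blow-up, for the curious. The function names are made up for illustration. The only assumption is a stock CPython, where the recursion limit (sys.getrecursionlimit(), usually 1000) caps how deep calls may nest and exceeding it raises RuntimeError (RecursionError, a subclass of it, on Python 3).

```python
import sys


def count_down_recursive(n):
    # Reads nicely, but every step adds a stack frame: no tail-call elimination.
    return 0 if n == 0 else count_down_recursive(n - 1)


def count_down_loop(n):
    # Same result with a plain loop and no stack growth.
    while n > 0:
        n -= 1
    return n


def recursion_blows_up(depth):
    """Return True if recursing `depth` levels deep hits the recursion limit."""
    try:
        count_down_recursive(depth)
        return False
    except RuntimeError:
        return True


# Well past the limit, the recursive version dies and the loop doesn't care.
too_deep = sys.getrecursionlimit() * 10
```

Hence the while loops in the polling code: a retry-by-recursion would be counting frames the whole time it waits.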

5

Yes, I know that subscripting is ruining the reading experience on my blog. Thank you.

6

I'm just kidding. There's no lesson. Programming is a hilarious exercise in mashing one's fingers with a hammer. I blog to entertain with stories from the trenches, not to educate anyone about anything. Perhaps all the AWS in the post will raise me in some search rankings, though. Nefarity!

7

One last footnote, I swear: I'm clearly the most poorly engineered developer you know, because it took me three days of derping to realize that all of the clj-time joda-time objects implement (.toDate %). This was mentioned nowhere in the clj-time documentation (that I found). Maybe I should just call random methods on things at the REPL - that's what I'll do!
