
Bitcoin Is Now Supported in Encode::Base58::GMP

Published on May 2, 2013 in open source, Perl

Base58 is a human-readable, protocol-safe encoding useful for case-sensitive encodings in cross-application situations such as URLs, databases, file systems, full-text indexes and others. It was popularized by Flickr and is also used by Bitcoin and Grokbase. Back in 2011, I open sourced Base58 libraries for Perl and Ruby that use the GNU MP Bignum Library (GMP) to create Base58 encodings. At the time, the Ruby library supported the Bitcoin, Flickr and GMP alphabets while the Perl library only supported the Flickr and GMP alphabets. With Encode::Base58::GMP version 1.00, the Perl library now also supports the Bitcoin alphabet for decoding and encoding as well as conversion from and to the Flickr and GMP alphabets. Grokbase uses the Flickr alphabet so there’s no need for a separate library in the code.

In addition to Bitcoin support, the Perl library has now added a base58_from_to() function to generically transcode Base58 strings and has updated the md5_base58() function to create strings padded with leading zero values. The change to md5_base58() is not backward compatible so this is something to consider for an upgrade.

For a bit of background, Encode::Base58::GMP did not support Bitcoin in earlier releases because I use the transliteration operator (tr) for the alphabet conversion, and Perl’s tr does not interpolate variables, unlike the equivalents I was using in the other language implementations at the time (Ruby, PHP, and Python). Back then, I didn’t find a solution other than hand-writing each conversion, which I wanted to avoid, so I left Bitcoin out. Coincident with the recent Bitcoin activity, a bug report and a blog comment were posted requesting this feature so I decided to look at it again. This time, I found that the Perl solution is to use eval, optionally pre-compiling the result. I implemented the eval version, which looked clean but, in the end, I decided to release the hardcoded tr version.
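To illustrate the eval approach mentioned above, here is a minimal sketch using only core Perl. It is not the code shipped in Encode::Base58::GMP (the release uses hardcoded tr statements); the helper name make_transcoder and the %ALPHABET table are assumptions made up for this example.

```perl
use strict;
use warnings;

# Flickr and Bitcoin Base58 alphabets, in digit order.
my %ALPHABET = (
    flickr  => '123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ',
    bitcoin => '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz',
);

# Hypothetical helper: tr/// does not interpolate variables, so the
# operator is generated as source text and compiled once with eval.
sub make_transcoder {
    my ($from, $to) = @_;
    for my $name ($from, $to) {
        die "unknown alphabet: $name" unless exists $ALPHABET{$name};
    }
    my $code = sprintf 'sub { (my $s = shift) =~ tr/%s/%s/; $s }',
        $ALPHABET{$from}, $ALPHABET{$to};
    my $sub = eval $code or die "could not compile transcoder: $@";
    return $sub;
}

# Compile each direction once, then reuse the code refs.
my $flickr_to_bitcoin = make_transcoder('flickr',  'bitcoin');
my $bitcoin_to_flickr = make_transcoder('bitcoin', 'flickr');

my $bitcoin = $flickr_to_bitcoin->('6hKMCS');
print "$bitcoin\n";
print $bitcoin_to_flickr->($bitcoin), "\n";
```

Because the string eval runs once per alphabet pair rather than once per call, the compiled sub should cost about the same as a hardcoded tr at run time. The trade-off is that the alphabets must be trusted strings, since they are pasted into source code.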

Thanks to the Perl community for MetaCPAN, RT, Dist::Zilla and CPAN Testers for making this update easy. Also thanks to Ingy for writing Test::Base which makes Data Driven Testing a pleasure. If you are interested in Test::Base, this distribution includes an example of using external files in “t/03_md5-base58.t”.

Version 1.00 is now on CPAN so have fun.

 

Open Source Conference Banners

I was recently surfing the net and came across an artistic and eye-catching conference banner for PyCon Argentina 2012. I thought it was good marketing so I wanted to think about ways these banners could get a wider distribution, sort of like the conference ribbons people put on their blogs. To get wider distribution, I think it would be useful if more conferences made banners in one or more standard sizes. This way websites can feature them regularly and rotate between upcoming conferences. Unfortunately, I didn’t find any more in my brief look but I liked the graphics for YAPC::Asia 2012 so I whipped up a banner from the website graphics. Both are shown below.

If these became more widespread, websites could start using them routinely. One website where I think these can be used is Grokbase, a mailing list archive covering a few open source projects. Grokbase already links to project websites and IRC channels but there’s still a good amount of open space on the discussion pages that I think could be used to promote open source projects and communities, with this being a possible way to both promote the community and add some color to the pages.

What do you think? Do you like it when conferences have custom graphics design work done? Should more open source conferences have display banners that websites can use for promotion? Would you place something like this on your websites? And is this a good use of the open space on Grokbase?

Update

I thought Gabor’s comment on creating a few more banners was a good one, so I added several more, provided below, to get a feel for what a larger collection could look like.

After a brief conversation with Gabor, two things struck me as essential to make this project successful:

  1. Ease of creating banners: While the banners are easy for me to create in a graphics editor, it may not be straightforward for someone who has never done it before. If there’s interest, I could create a quick tutorial to show how it can be done. There could also be a forum or mailing list where people can discuss banner creation.
  2. Ease of using banners: The second item that is needed is an easy way to embed these in websites, ideally in a way that can be automatically updated perhaps using a snippet of JavaScript to pull data from a remote server. This could be a new website or an existing service like the Perl Community AdServer.
 

Mailing List Analytics and Segregating Automated Posts

I recently added mailing list analytics to Grokbase in response to Gabor Szabo’s post on estimating the size of the Perl community. This provides some useful insight into community participation; however, both Gabor and I noticed there are a significant number of automated posts from computer programs and it would be useful to segregate these to get a better look at actual user participation. In light of this, I added some reports that break out highly automated email lists and users.

Description

This feature adds a select drop-down ['All post types', 'User posts', 'Automated posts'] when both user and automated posts are present. You can see an example here. JavaScript is currently required to use the select navigation feature; however, you can navigate directly to these pages by adding ‘?uf=users’ or ‘?uf=automated’ to the URL.

The posts are segregated by identifying ‘Automated posts’ using:

  1. a set of mailing lists
  2. a set of email sender addresses

Everything that is not currently flagged as an automated list or email address is classified as a ‘User post.’
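The two-set rule above can be sketched in a few lines of Perl. This is only an illustration of the idea, not Grokbase’s implementation: the list and sender addresses, the hash names and the classify_post function are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical flag sets; the real ones are maintained by Grokbase.
my %AUTOMATED_LIST   = map { $_ => 1 } qw(commits@example.org builds@example.org);
my %AUTOMATED_SENDER = map { $_ => 1 } qw(buildbot@example.org noreply@example.org);

# A post is automated if either its list or its sender is flagged;
# everything else is treated as a user post.
sub classify_post {
    my ($post) = @_;
    return 'automated' if $AUTOMATED_LIST{ lc $post->{list} };
    return 'automated' if $AUTOMATED_SENDER{ lc $post->{from} };
    return 'user';
}

my @posts = (
    { list => 'commits@example.org', from => 'alice@example.com'    },
    { list => 'dev@example.org',     from => 'buildbot@example.org' },
    { list => 'dev@example.org',     from => 'alice@example.com'    },
);

printf "%-22s %-22s %s\n", $_->{list}, $_->{from}, classify_post($_) for @posts;
```

Note how the third post is classified as a user post even though the same sender also appears on a flagged list; that per-sender, per-list granularity is exactly where the first approximation can misclassify individual emails.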

This is a first approximation and can still use some refinement because (a) some email lists have both automated and non-automated emails and (b) some sender email addresses are used to send both automated and non-automated emails. In the future, it would be nice to identify automated emails at the per-email level. For example, there are quite a few automated git emails on the perl5-porters list, which carries both user and automated emails. Further, the git emails can be sent from email addresses that are also used to send non-automated emails. For now, emails from these users to non-flagged lists are classified as User posts though some should be flagged as automated in the future.

Lists of the identified automated lists and users are available in text and JSON formats. Please let me know if there should be any additions or corrections.

An Example Comparison

To get a feel for this feature, check out the analytics for Perl and Python, both of which have a number of automated lists and users. The numbers show that both Perl.org and Python.org communities are making greater use of automated systems to communicate so segregating this out will give us better insight into user participation.

By mailing list domain:

By subject across domains:

One advantage of looking at reports by subject is that they can (a) aggregate lists across domains, e.g. perl.org and Catalyst lists, and (b) filter lists within a domain, e.g. Django and MongoDB, both hosted by Google Groups.

When looking at the Python subject stats, you will see a spike in activity around the May 2012 time frame. This is not due to actual new users but due to new archived users, because the Django archive starts around that time and is aggregated into the Python subject.

Development

There was very little actual development for this as the feature consisted of combining the analytics system with the automated user/list classification system. Specifically, more automated lists were identified and the analytics system was refactored to support more flexibility.

The automated user/list classification system was initially developed to identify automated users, aka agents, which are segregated in the user list and badge lists. After the automated user/list system was updated, some additional APIs were added and the analytics system was refactored to easily accommodate these filters. Generic interfaces are used to enable fast addition of new features, when applicable.

Please check it out and let me know what you think and how it could be of more use.

 

Community Analytics via Mailing Lists

Recently Gabor Szabo wrote an article to answer a question he’s often asked, “How many people are in the Perl community?” In his response, he mentions and analyzes a number of proxies including conference participation, commits, open source contributions, mailing list posts, etc. John Napiorkowski agreed with the importance of analyzing this data, “Understanding this I think is going to become more important as we try to do a better job of promoting and discussing Perl.” Given the importance of this topic and my interest in analytics, I wanted to see if I could help by developing some community analytics for the mailing lists archived by Grokbase.

The Analytics

Grokbase already generates some metadata for lists including posts per month and users per month on a per-list basis so the basic structure was there, albeit not yet aggregated into reports or charts for easy analysis. The first step was to create an aggregate report from the existing data. Then I added some additional data I thought would be useful to come up with this initial list:

  • posts per month
  • users per month
  • new users per month
  • user domains per month
  • TLD distribution

The new statistics were added because they might be useful for the community, specifically the new users per month figure provides an indication of community growth over time. I also added user domains per month which can be used as a proxy for the number of organizations using a technology. Finally, I added email address top-level domain (TLD) distribution to get an idea of geographic distribution for people using email addresses with country-code TLDs (ccTLDs). While many people use corporate and free email services with generic TLDs such as .com, a number of people continue to use country-code TLDs which might be a useful indicator for conference or workshop potential. Once the reports were done, I added some charts for easy trend analysis and visualization.
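As a rough sketch of how these five statistics can be derived from raw post data, the following core-Perl script aggregates a handful of made-up records. The record layout (month and from keys) and all variable names are assumptions for illustration; Grokbase’s actual reports are built from its own data store.

```perl
use strict;
use warnings;

# Hypothetical input: one record per post, with month (YYYY-MM) and sender.
my @posts = (
    { month => '2012-07', from => 'alice@example.com' },
    { month => '2012-07', from => 'bob@example.de'    },
    { month => '2012-08', from => 'alice@example.com' },
    { month => '2012-08', from => 'carol@example.jp'  },
);

my (%posts, %users, %new_users, %domains, %tld, %seen);

# Process posts in month order so "new" means first-ever appearance.
for my $p (sort { $a->{month} cmp $b->{month} } @posts) {
    my $month = $p->{month};
    my $from  = lc $p->{from};
    my ($domain) = $from =~ /\@([^\@]+)\z/ or next;   # skip malformed senders
    my ($suffix) = $domain =~ /\.([a-z]+)\z/;

    $posts{$month}++;
    $users{$month}{$from}     = 1;
    $domains{$month}{$domain} = 1;

    next if $seen{$from}++;            # already counted as a new user
    $new_users{$month}++;
    $tld{$suffix}++ if defined $suffix; # TLD counted once per distinct user
}

for my $month (sort keys %posts) {
    printf "%s posts=%d users=%d new=%d domains=%d\n",
        $month, $posts{$month}, scalar keys %{ $users{$month} },
        $new_users{$month} // 0, scalar keys %{ $domains{$month} };
}
printf "tld .%s: %d\n", $_, $tld{$_} for sort keys %tld;
```

Counting each TLD once per distinct sender, rather than once per post, keeps a few prolific posters from skewing the geographic picture; counting per post would be an equally valid choice for measuring activity.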

Check out the pages for perl5-porters, python-list, android-developers, and mongodb-user. To get to any of these pages, navigate to the group page and click the “Analytics” tab in the top right corner.

To get an idea of the larger community, I also aggregated the data across groups by list host. This is useful for seeing the activity for an overall community like perl.org, python.org, php.net or an Apache project like Cassandra or Hadoop. You can see these pages by navigating to the groups by host page, clicking a host and then clicking the Analytics tab.

There’s obviously more that can be done but it was quick to get this out and I hope it will prove useful. Please let me know what you think and what could make it more useful.

The Development

This project was developed and deployed in ~1.5 days, with ~55-60% of that time spent on refactoring the front-end to ensure good code maintainability. The analytics are built off of a normalized PostgreSQL data store and denormalized into a JSON-document store. From there, web pages were developed to access the denormalized data for groups and list hosts. This initially resulted in further code duplication (then 4 duplicates due to earlier technical debt), so some refactoring was done to address that. A number of services are already deployed this way so it was relatively fast and easy to do. The charts are provided by Google’s Chart API using a custom wrapper to automate rendering multiple charts on a page.

The Future

This is just the start and I would like to add additional analytics that are of use to the community. It’s only been a couple of days since I read Gabor’s blog post but I’ve already come up with some possible enhancements:

  • Extensions on existing analytics
    • Aggregate by subject (done, see below): default group navigation is done by subject (vs. host); however, aggregate analytics are currently only provided by host. Support for subject-based aggregations would help for Google-hosted groups like Android, MongoDB, Scala, etc.
    • Aggregate groups report (done, see below): the ability to list groups by subject or list host to enable comparison of group activity by posts, users or new users. For example, if we see a lot of new users for perl.org this month but do not know which lists they are participating in, a quick report would let us zero in on those lists and see what is happening.
    • Compare groups/hosts: compare arbitrary groups or sets of groups so we can more easily determine how one technology is faring against another.
    • API access: more generic data availability to support custom analytics
  • Additional data
    • New users list: a list of new users can be useful for the community to ensure those users are getting their issues addressed and to see how they are engaging with the community.

I’ve already found the analytics to be very useful for myself so my thanks go to Gabor and John for inspiring me to put this together. Please let me know what you think and how it could be made more useful for you and others.

Update

After receiving positive initial feedback, I decided to see if any of the additional features could be implemented quickly and ended up adding the following two features (online as of Aug 20):

  • Aggregate by subject: it’s now possible to view aggregate list reports by subject which is useful when the subject matter doesn’t have a dedicated subject-based list server, e.g. Android, MongoDB, and Scala. For these lists specifically, Grokbase may not yet have the full list archives so take that into account when viewing the charts. If you have data that does not appear to be archived, please consider sending it to us.
  • Aggregate groups report: in addition to viewing aggregate reports by month and TLD, it’s now possible to view the reports by group for both the life of the archived data as well as on a per-month basis. This enables comparison of group activity by posts, users and new users. This can be useful for seeing which groups have the most posts vs. the most users and the most new users.

These features took an additional 1 day with the primary difference being that they were built to consider and take advantage of the now-existing denormalized data.

I’d like to thank Gabor for mentioning the analytics feature in the latest Perl Weekly. As noted, automated emails are currently included in the aggregate reports. By segregating automated emails from the reports, visibility into real user activity could be increased. Work has begun on identifying automated email lists and users so stay tuned.

Update (Aug 92): segregating automated posts is now completed.

 

Base58: Fast Hashing with GMP

Base58 is an alternative to Base64 that is growing in popularity for case-sensitive encodings due to several characteristics including multi-protocol-safety and human-readability. It is used by Flickr, Bitcoin, and now Grokbase. The general characteristics of Base58 are that it uses the protocol-safe alpha-numeric alphabet (Base62) and excludes easy to confuse digits. Both Flickr and Grokbase use [0-9a-zA-Z] excluding [0O1l] while Bitcoin uses [0-9A-Za-z] excluding [0O1l].
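To make the alphabet difference concrete, here is a small pure-Perl sketch that encodes an integer (or a 0x-prefixed hex string such as an MD5 digest) with either ordering, using the core Math::BigInt module. It is meant only to show the encoding itself; it is not the GMP-based code in Encode::Base58::GMP, and the function name encode_base58_sketch is an assumption for this example. It also handles plain integers only, not the leading-zero-byte and checksum conventions of Bitcoin addresses.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Math::BigInt;

# Same 58 characters, different digit order.
my %ALPHABET = (
    flickr  => '123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ',
    bitcoin => '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz',
);

# Hypothetical helper: repeated division by 58, most significant digit first.
sub encode_base58_sketch {
    my ($number, $alphabet) = @_;
    my @digits = split //, $ALPHABET{$alphabet};
    my $n = Math::BigInt->new($number);   # accepts decimal or "0x..." hex
    die "not a non-negative integer: $number" if $n->is_nan || $n->is_neg;
    return $digits[0] if $n->is_zero;

    my $out = '';
    until ($n->is_zero) {
        # In list context bdiv returns (quotient, remainder) and
        # leaves the quotient in $n.
        my (undef, $rem) = $n->bdiv(58);
        $out = $digits[ $rem->numify ] . $out;
    }
    return $out;
}

my $hex = md5_hex('test@example.com');
print encode_base58_sketch("0x$hex", 'flickr'),  "\n";
print encode_base58_sketch("0x$hex", 'bitcoin'), "\n";
```

A 128-bit MD5 value needs at most 22 Base58 characters, compared with 32 hex characters.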

Grokbase uses MD5 hashes of email addresses for UIDs and when I did the update, there were over 500,000 user profiles in the system so I wanted a fast encoder. I ended up using GMP with tr and released the code as Encode::Base58::GMP for Perl and base58gmp for Ruby.

As for how much faster, I ran a few benchmarks comparing Encode::Base58::GMP to the pure Perl Encode::Base58 implementation by Miyagawa and found that the GMP version was about 700x (70,000%) faster encoding MD5 hex values using the Ubuntu 11.10 system Perl. However, for 32-bit integers, the pure Perl version was faster by about 50%.

Two notes about these numbers:

  1. Ubuntu 11.10 x86_64 system Perl isn’t compiled with 64-bit integer support and Encode::Base58 needs integers so I used bignum to enable hex() support. One possibility is to compile a Perl with 64-bit int support; however, Ubuntu system Perl is part of my stack and many others also rely on system Perl so I left it as that.
  2. Math::GMPz can accept hex values natively and since the goal was to generate MD5 values, I bypassed the integer conversion used with Encode::Base58. When adding integer conversion, GMP based encoding is still faster but drops by an order of magnitude.

70,000% Faster for MD5 (BigInts)

use strict;
use warnings;

use Benchmark qw/cmpthese/;
use bignum qw/hex/;
use Encode::Base58;
use Encode::Base58::GMP;
use Digest::MD5 qw/md5_hex/;

cmpthese(-5, {
  pp  => sub {
    Encode::Base58::encode_base58(hex(md5_hex(rand())))
  },
  gmp => sub {
    Encode::Base58::GMP::encode_base58('0x'.md5_hex(rand()))
  }
} );

Here is a sample result:

       Rate     pp    gmp
pp    111/s     --  -100%
gmp 79644/s 71484%     --


        Rate     pp    gmp
pp     209/s     --  -100%
gmp 168636/s 80644%     --

33% Slower on 32-bit Ints

Here are some results using 32-bit integers.

cmpthese(-5, {
  pp  => sub {
    Encode::Base58::encode_base58(int(rand 2**32))
  },
  gmp => sub {
    Encode::Base58::GMP::encode_base58(int(rand 2**32))
  }
} );

       Rate  gmp   pp
gmp 66733/s   -- -32%
pp  98803/s  48%   --


        Rate  gmp   pp
gmp 161754/s   -- -33%
pp  243206/s  50%   --

Conclusion

While GMP-based encoding is about 33% slower for 32-bit ints, it is significantly faster for values that need big-integer support, such as MD5 hashes, on Ubuntu 11.10 system Perl, which makes it very desirable for creating hashes. Numbers will vary with a Perl compiled with 64-bit int support.

Note: Grokbase User URLs

Currently, the Grokbase profile URLs use unsalted MD5 Base58 hashes of email addresses so you can find anyone’s profile page using the following:

use Encode::Base58::GMP qw/md5_base58/;

my $url = 'http://grokbase.com/user/-/'
        . md5_base58( 'test@example.com' );

The ‘-’ after ‘/user’ will get redirected to the user’s display name as shown in their emails.

Using an unsalted MD5 hash is similar to how Gravatar works and has the added benefit of simply requiring a hex to Base58 conversion to convert a Gravatar UID to a Grokbase UID.

Credits:

  • Marcus Bointon’s article on Base62 encoding using GMP provided the initial idea of using GMP for arbitrary-base transcoding.
  • Sam Rawlins provided an update for Ruby’s gmp to handle arbitrary-base decoding.
  • Case Van Horsen provided an update for Python’s gmpy to support Base37-62.
 

Confident Code Using Type Constraints

The other day on Hacker News, there was a good discussion of Avdi Grimm’s talk on how to write Confident Code given at this year’s Ruby Midwest. He covers four areas: gather input, perform work, deliver results and handle failure. Here are links to the code example showing the initial, timid and final, confident code. I’ll just cover the input gathering step because I think embedded type constraints are vital for improving coding efficiency and minimizing boilerplate.

Most of this can actually be done using something like Ruby’s Doodle. It does not appear that Doodle is very popular yet (21 all time messages in the mailing list) though I think it would be good if it were more so.

More popular is Stevan Little’s Moose for Perl which has 28 named contributors. While I like to read about many technologies, I generally prefer to use the more mature and popular ones, so I’ve been using Perl’s Moose and haven’t coded with Doodle yet.

To show how powerful Moose is, let’s write the Confident Code example in Perl.

Confident Code in Moose

I whipped this up for discussion so it can definitely be refined, however, it should get the ideas across. Here are some things that are covered:

  1. Input Validation: Inputs are typed and given default values in both attributes (e.g. width as an Integer with default 40) and methods (e.g. messages as an ArrayRef of Strings – ArrayRef[Str])
  2. Separation of Concerns: attributes and methods are conveniently separated for easier maintenance and testing
  3. Output: IO::All is used for output, defaulting to STDOUT (i.e. ‘-’)

use MooseX::Declare;

class CowsayTheMooseway {
  use Capture::Tiny qw(capture_merged);
  use IO::All;
  use Log::Log4perl qw(:easy);
  Log::Log4perl->easy_init($INFO);

  has width       => ( isa => 'Int', is => 'rw', default => 40      );
  has eyes        => ( isa => 'Str', is => 'rw', default => ''      );
  has cowfile     => ( isa => 'Str', is => 'rw', default => 'moose' );
  has destination => ( isa => 'Str', is => 'rw', default => '-'     );

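  method command {
    # Build the cowsay command line from the typed attributes.
    my $command = 'cowsay -W '           . $self->width;
    $command   .= ' -e '.$self->eyes    if $self->eyes;
    $command   .= ' -f '.$self->cowfile if $self->cowfile;
    return $command;  # explicit return; otherwise an empty cowfile would be returned instead
  }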

  method say (ArrayRef[Str] $messages) {
    my $command = $self->command;
    my $output  = capture_merged {
      system "$command $_" for @$messages;
    };
    $output > io $self->destination;
    INFO "wrote to ".$self->destination;
  }
}

Note how the input attributes and methods are all separated cleanly with their own input types.

Here’s how to run the class:

use CowsayTheMooseway;

CowsayTheMooseway->new(
  destination => 'moo.txt'
)->say([
  'Psst...Want More Confidence?',
  'Here is How',
  'Moose This Way'
]);

And the results:

 ______________________________
< Psst...Want More Confidence? >
 ------------------------------
  \
   \   \_\_    _/_/
    \      \__/
           (oo)\_______
           (__)\       )\/\
               ||----w |
               ||     ||
 _____________
< Here is How >
 -------------
  \
   \   \_\_    _/_/
    \      \__/
           (oo)\_______
           (__)\       )\/\
               ||----w |
               ||     ||
 ________________
< Moose This Way >
 ----------------
  \
   \   \_\_    _/_/
    \      \__/
           (oo)\_______
           (__)\       )\/\
               ||----w |
               ||     ||

Let me know what you think. Should this type of coding be done in Ruby? Is Doodle the way to go or is there something else? Can the Perl code be improved?

 

Explaining Why Moose is Awesome – The Direct Way

The Perl community knows that Moose is awesome, as evidenced by the number of talks at YAPC and the number of times it is mentioned in blog articles and on discussion boards. However, to me, what often seems to get lost is why Moose is awesome and why it is relevant, especially to those outside the Perl community. At the same time, I think it is vitally important to simply and directly convey why Moose is awesome to both increase Perl awareness and encourage other languages to adopt similar techniques.

This article covers how I explain Moose and why I think it not only addresses the key concerns with Perl but makes it more attractive when compared to some other languages as they are today.

The ideas here started when I was recently showing off a project of mine to a YC-funded Xoogler who asked me what it was written in. When I told him it was mostly Perl, he said, “Oh Perl, you’re old school!” I could have left it at that if it were true, but it wasn’t and I wanted to explain why it wasn’t. I had specifically chosen Perl for this project in 2011 over Python because it provided things that were not yet in (or popular in) other languages. This is how I presented it.

Why: Moose makes Perl more Maintainable than Other Languages

Nothing is more attention grabbing than saying something opposite to what people expect so this is exactly what I said. Of course, the gauntlet was already thrown so I wanted to respond in kind. I was able to say this honestly because I backed it up using my recent experiences using Perl and JavaScript with some Ruby, Python and C#. At the same time, it is important to tell it as it is and not sugar coat things. I’ve had my share of head scratching when looking at my own 6-month old pre-Moose Perl code so I’m painfully aware of Perl’s write once reputation.

The way I explain it is that while it can be argued that some languages may be cleaner than Perl on a line-by-line basis, my view is that Moose makes Perl cleaner and more maintainable on a class and method basis. By providing an expressive attribute getter/setter system, Moose helps you freeze APIs and limit the scope for maintainability to the attribute / method level. It does this by adding one key feature…

How: Moose adds Static Typing to Getters and Setters

What Moose really does is add type checking to getters and setters. While it supports some built-in types (integers, numbers, strings, arrayrefs, hashrefs, etc.), the real power is in composite types like an arrayref of objects e.g. ‘ArrayRef[Email::Simple]’. By freezing the API, Moose improves coding speed and reduces errors, enabling me to write 1000s of lines of code and have it all run with minimal typo correction. The great thing about this explanation is that static typing is something that people are generally looking at and interested in for many dynamic languages so once you get this point across, the benefits are easily understood and don’t have to be explained further.

Bonus: Moose Eliminates Write Once Code

Generally, I don’t have to go further than explaining static typing with Moose; however, it can be useful to note that Moose also eliminates write-once Perl code by providing a “template” for getter/setter creation. Creating classes this way ensures that your classes will follow well-ordered, well-understood convention. After having to maintain several old pre-Moose Perl apps, I was pleasantly surprised when I opened a 6 month old Moose class and had to double-check to see if it was written months ago or just the other day.

Summary

So there you have it. In one fell swoop, the key benefits of Moose are explained in a way that addresses long-running, key concerns with Perl with key features not yet readily available, but often desired, in other languages.

By succinctly explaining why Moose is awesome, hopefully we can raise more awareness for the great work being done by the Moose team and the Perl ecosystem.

What You Can Do

If you like this approach, one thing that can be done to more quickly explain the effectiveness of Moose programming is to create side-by-side code examples of code with Moose and without Moose, in other languages. This is often done to compare and contrast languages and I think it would be useful to more succinctly explain the benefits here. Expressive modern Perl code can impress those outside the community; there just needs to be more of it.

 

Perl and Python – Two Nights in Silicon Valley

One of the well-touted advantages of living in Silicon Valley is that it is a crucible of talent. Along with colleagues, conferences and unconferences, there are many regular user groups in the valley, of which I’ve attended several. This past week, I decided to check out the local SF Perl Mongers and BayPIGgies (Python) which both had meetings.

Night One – SF Perl Mongers

I’ve been to two YAPCs but this was my first SF Mongers meeting and I was pleasantly surprised. As it happens, the talk was being given by Ingy döt Net, the creator of YAML, who I met briefly several years ago. I also got to meet up with Larry Wall and Miyagawa, both of whom I last spoke with at YAPC::NA, as well as new acquaintances including Catalyst core team members Tom Doran (who I recognized from his Gravatar) and Florian Ragwitz (rafl). Tom and Florian were here from Europe courtesy of Google for their participation in the Google Summer of Code.

Ingy’s talk was on Acmeism, his movement to simultaneously write code in multiple object-oriented languages including JavaScript, Perl, PHP, Python and Ruby among many others. The genesis of this project seemed to be his experience with YAML, where each language would have its own, often slightly different, implementation. With Acmeism, changes could be pushed out to multiple languages with one code update. I can see the advantage of Acmeism for a cross-language library that requires interoperability. For example, it would be great if there was only one interoperable version of LZF compression; however, it seems that not only does each language have a non-interoperable implementation, there may be multiple non-interoperable implementations per language. More generally, I’ve been interested in the compilation of interpreted code to C/C++ for the performance boost, a la Facebook’s HipHop, and did mention this to Larry in the context of Moose and Perl 6.

Afterwards, we went over to 21st Amendment Brewery for some drinks and dinner.

Night Two – Python BayPIGgies

This meeting was run by a few folks from Google, including JJ Behrens, from the YouTube API team, and Wesley Chun, a developer advocate based in SF. Both JJ and Wesley were held up so we were first treated to a talk by Alex Martelli, the #2 ranked “Famous Python Programmer at Google” according to the slides Wesley presented. Alex, JJ and Wesley all provided great talks and insights.

Some good takeaways from the night included:

  1. The original spidering code for the crawler was done in Python by a part-timer who received a 5% equity stake.
  2. Back when Google Video and YouTube were competitors, the YouTube team was out-innovating the Google team because they were using Python while the Google team was using C++.
  3. While Python is one of the approved deployment languages, it is used mostly in the back-end sysadmin, deployment and QA code. The user apps are either in C++ or are moving that way which can provide up to a 20% speed up.
  4. PyPy was recommended by Wesley as an alternative Python implementation that is faster than CPython on many tests. Digging into PyPy a bit afterwards, I was pleasantly surprised to learn it’s written in RPython, a statically typed, restricted subset of Python.

From Google’s perspective, it seems like Python is great for rapid prototyping and competitive feature deployment as well as back-end work; however, once the feature set is relatively stable, there are performance gains to be had from porting the user-facing applications. While perhaps not the approach everyone would take, it’s interesting to hear their perspective.

Afterwards we broke up and went our own ways. There did not seem to be the same desire to hang out afterwards as in SF.

Both presentations were posted online:

  1. Wesley’s Python @ Google (IO 2011) presentation
  2. JJ’s YouTube for Business presentation

Summary

I was impressed by the people and knowledge gained at both meetings. They were intimate in size so it was easy to get to know people and engage in conversation. I’m looking forward to more meetings in the future.

 

Filtering Popular Mailing List Discussions with DateTime

Discovering Information via Better Browsing

A great way to get into the thick of a new software project you want to learn or need to use is to check out popular mailing list discussions. Typically these will have detailed answers and multiple points of view. Recognizing this, Grokbase was designed to sort many things by popularity, or activity volume, such as discussions, groups, users, etc. I found it useful enough that I made it the default monthly view and especially like getting a quick overview of a group by browsing these pages on my iPad using the monthly prev and next links.

I was recently chatting with Dave Rolsky, creator of the stalwart DateTime and newer Courriel email distributions, who mentioned it would be great if you could select popular discussions based on time periods longer than a month, such as 6 months or a year. It was easy to see where he was going with this. While sorting across all time is useful for curiosity once in a while, more recent popular discussions are what’s needed to keep up to date.

Needing a break from a larger project, I decided to see how quickly I could implement this. In the end, it took just a few hours from start to deployment, mostly helped by the fact that DateTime performs excellent date gymnastics. Finding the correct date to use was the easiest part using DateTime’s subtraction feature. DateTime’s functionality and excellent documentation made adding this feature a snap. In fact, the only thing it cost me was a little sleep ;)
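To make the date arithmetic concrete, here is a minimal sketch of finding the start of a "last N months" window. It is written in Python rather than Perl and is not the Grokbase code; the function name `subtract_months` and the choice to clamp to the last day of a shorter month are assumptions of this sketch.

```python
import calendar
from datetime import date


def subtract_months(day: date, months: int) -> date:
    """Return the date `months` calendar months before `day`.

    If the target month is shorter than `day.day`, the result is clamped
    to the last day of that month (an assumption of this sketch).
    """
    # Count months from year 0 so that borrowing from the year is plain arithmetic.
    total = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


if __name__ == "__main__":
    today = date(2013, 5, 2)
    for months in (3, 6, 12, 60):
        print(months, "months back:", subtract_months(today, months))
```

The month-end clamp is only one possible policy; a date library may handle cases such as March 31 minus one month differently, so it is worth checking the behavior of whichever library is actually used.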

You can see the results on any of the more popular mailing lists such as Perl5 Porters (P5P) as shown here. Under the popular tab, there are now filters for time periods from 3 months to 5 years and for all time. I have to say I find it fun to dig into mailing lists this way. More than blogs or articles, diving into P5P really makes you feel like you’re at a YAPC.
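As a rough illustration of the filter itself, the sketch below keeps only discussions active on or after a cutoff date and ranks them by message count. It is a Python sketch, not the site's implementation; the `Discussion` record, its fields, and the `popular_since` function are hypothetical names, and using message count as the popularity measure is an assumption.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class Discussion(NamedTuple):
    # Hypothetical record; the field names are assumptions for this sketch.
    title: str
    last_post: date
    message_count: int


def popular_since(
    discussions: List[Discussion],
    cutoff: Optional[date],
    limit: int = 10,
) -> List[Discussion]:
    """Return the most active discussions on or after `cutoff`.

    Passing `cutoff=None` means "all time".
    """
    recent = [d for d in discussions if cutoff is None or d.last_post >= cutoff]
    # Most messages first; sorted() is stable, so ties keep their input order.
    return sorted(recent, key=lambda d: d.message_count, reverse=True)[:limit]


if __name__ == "__main__":
    threads = [
        Discussion("Old but huge thread", date(2009, 1, 10), 250),
        Discussion("Recent release discussion", date(2013, 3, 1), 80),
        Discussion("Recent bug report", date(2013, 4, 20), 12),
    ]
    for d in popular_since(threads, cutoff=date(2012, 11, 2)):
        print(d.message_count, d.title)
```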

For more on DateTime, check out the MetaCPAN page and the Grokbase list archive.

Right now, this is only accessible on the top popular tab. It is also enabled only for more popular groups, where it is more likely to be useful. Try it out and let me know what you think.

 

A Tribute to Steve Jobs (1955-2011)

Thinking Differently

Steve Jobs has passed away and the world has lost someone who showed what could be accomplished with a singular and distinctive vision. His was an approach that fused usability, appearance and functionality into a cohesive whole. I’ve read many of the tributes and articles, and two that stood out to me were ones that showed him on a more personal level, as told by Stephen Wolfram and Al Alcorn.

My condolences to his family, friends, and colleagues during this difficult time.

For my tribute, I’ll simply list a few of his quotations that show how he focused his own efforts and those of his team.

  1. Inspiration
  2. Design
  3. Marketing
  4. Recruiting
  5. Managing
  6. Revenues
  7. Discipline
  8. Follow Your Heart


1. Inspiration

  • We think the Mac will sell zillions, but we didn’t build the Mac for anybody else. We built it for ourselves. We were the group of people who were going to judge whether it was great or not. We weren’t going to go out and do market research. We just wanted to build the best thing we could build.
  • Innovation has nothing to do with how many R&D dollars you have. When Apple came up with the Mac, IBM was spending at least 100 times more on R&D. It’s not about money. It’s about the people you have, how you’re led, and how much you get it.


2. Design

  • Unfortunately, that’s too rare a commodity. A lot of people in our industry haven’t had very diverse experiences. So they don’t have enough dots to connect, and they end up with very linear solutions without a broad perspective on the problem. The broader one’s understanding of the human experience, the better design we will have.
  • That’s been one of my mantras — focus and simplicity. Simple can be harder than complex: You have to work hard to get your thinking clean to make it simple. But it’s worth it in the end because once you get there, you can move mountains.
  • Design is a funny word. Some people think design means how it looks. But of course, if you dig deeper, it’s really how it works. The design of the Mac wasn’t what it looked like, although that was part of it. Primarily, it was how it worked. To design something really well, you have to get it. You have to really grok what it’s all about. It takes a passionate commitment to really thoroughly understand something, chew it up, not just quickly swallow it. Most people don’t take the time to do that.


3. Marketing

  • We’ve never worried about numbers. In the market place, Apple is trying to focus the spotlight on products, because products really make a difference. [...] Ad campaigns are necessary for competition; IBM’s ads are everywhere. But good PR educates people; that’s all it is. You can’t con people in this business. The products speak for themselves.
  • It’s not about pop culture, and it’s not about fooling people, and it’s not about convincing people that they want something they don’t. We figure out what we want. And I think we’re pretty good at having the right discipline to think through whether a lot of other people are going to want it, too. That’s what we get paid to do. We just want to make great products.
  • Our DNA is as a consumer company – for that individual customer who’s voting thumbs up or thumbs down. That’s who we think about. And we think that our job is to take responsibility for the complete user experience. And if it’s not up to par, it’s our fault, plain and simply.


4. Recruiting

  • Recruiting is hard. It’s just finding the needles in the haystack. You can’t know enough in a one-hour interview. So, in the end, it’s ultimately based on your gut. How do I feel about this person? What are they like when they’re challenged? I ask everybody that: ‘Why are you here?’ The answers themselves are not what you’re looking for. It’s the meta-data.
  • When I hire somebody really senior, competence is the ante. They have to be really smart. But the real issue for me is, Are they going to fall in love with Apple? Because if they fall in love with Apple, everything else will take care of itself. They’ll want to do what’s best for Apple, not what’s best for them, what’s best for Steve, or anybody else.
  • We’ve had one of these before, when the dot-com bubble burst. What I told our company was that we were just going to invest our way through the downturn, that we weren’t going to lay off people, that we’d taken a tremendous amount of effort to get them into Apple in the first place – the last thing we were going to do is lay them off.


5. Managing

  • So when a good idea comes, you know, part of my job is to move it around, just see what different people think, get people talking about it, argue with people about it, get ideas moving among that group of 100 people, get different people together to explore different aspects of it quietly, and, you know – just explore things.
  • My job is to not be easy on people. My job is to make them better.


6. Revenues

  • A lot of companies have chosen to downsize, and maybe that was the right thing for them. We chose a different path. Our belief was that if we kept putting great products in front of customers, they would continue to open their wallets.


7. Discipline

  • People think focus means saying yes to the thing you’ve got to focus on. But that’s not what it means at all. It means saying no to the hundred other good ideas that there are. You have to pick carefully.


8. Follow Your Heart

  • Almost everything – all external expectations, all pride, all fear of embarrassment or failure – these things just fall away in the face of death, leaving only what is truly important.
  • Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose.
  • You are already naked. There is no reason not to follow your heart.

Summary

Thank you, Steve.

May you rest in peace.

 