PWC 312 Time to be random

PWC 312 Time to be random


Weekly Challenge 312 is here. 312 is the telephone area code in and around Chicago. Once upon a time, when I dressed so fine, threw the bums a dime in my prime, it was my area code, but then we ran out of phone numbers and, like a soccer team falling on hard times, the suburbs were relegated to the far less prestigious 708 (12×59 — what kind of useless factoring is that?), and then the only slightly better 630 (2×3×5×7 × 3).

Be that as it may, here we are, ticking away the moments that make up a dull day, and so we are going to fritter and waste the hours in an off-hand way solving a couple of programming problems.

Circular typewriter

You are given a typewriter with lowercase
English letters a to z arranged in a circle.
Typing a character takes 1 sec. You can
move the pointer one character at a time
clockwise or counter-clockwise. The pointer
initially points at 'a'.

Write a script to return the minimum time it
takes to print the given string.
Enter fullscreen mode

Exit fullscreen mode

  • Example 1: Input: $str = "abc" Output: 5

    • The pointer is at ‘a’ initially.
    • 1 sec – type the letter ‘a’
    • 1 sec – move pointer clockwise to ‘b’
    • 1 sec – type the letter ‘b’
    • 1 sec – move pointer clockwise to ‘c’
    • 1 sec – type the letter ‘c’
  • Example 2: Input: $str = "bza" Output: 7

    • The pointer is at ‘a’ initially.
    • 1 sec – move pointer clockwise to ‘b’
    • 1 sec – type the letter ‘b’
    • 2 sec – move pointer counter-clockwise to ‘z’
    • 1 sec – type the letter ‘z’
    • 1 sec – move pointer clockwise to ‘a’
    • 1 sec – type the letter ‘a’
  • Example 3: Input: $str = "zjpc" Output: 34



The plan

Each time we type a letter, that takes one second, so the time will be at least the length of the string.

To that, we need to add the time to move from one letter to the next, starting with the finger poised over ‘a’. Going around the circle is modulo arithmetic, but we need to check both directions and pick the shorter.



The code

sub minTime($str)
{
    # List of pointer positions, beginning at A
    my @ptr = (0, map { ord($_) - ord("a") } split(//, $str));

    # Array of closest differences between positions
    my @d = slide { min(($a - $b) % 26, ($b-$a) % 26 ) } @ptr;

    # One second for each character in str, plus time to move
    my $t = length($str) + sum0(@d);
    return $t;
}
Enter fullscreen mode

Exit fullscreen mode

Notes:

  • split(//, $str) Turn the string into a list of characters
  • map ord($_) - ord("a") } Turn the characters into indices between 0 and 25
  • my @ptr = (0, ...) The hovering finger is initially over ‘a’, so insert a 0 into the front of the positions.

    • slide { ... } From List::MoreUtils, move across the list a pair at a time. Make a list of differences between character positions.

No loops, no index variables, no off-by-one errors.

There are $n balls of mixed colors: red, blue or
green. They are all distributed in 10 boxes labeled 0-9.
You are given a string describing the location of balls.
Write a script to find the number of boxes containing
all three colors. Return 0 if none found.
Enter fullscreen mode

Exit fullscreen mode

  • Example 1 Input: $str = "G0B1R2R0B0"

    • Output: 1
    • The given string describes there are 5 balls as below:
      • Box 0: Green(G0), Red(R0), Blue(B0) => 3 balls
      • Box 1: Blue(B1) => 1 ball
      • Box 2: Red(R2) => 1 ball
  • Example 2 Input: $str = "G1R3R6B3G6B1B6R1G3"

    • Output: 3
    • The given string describes there are 9 balls as below:
      • Box 1: Red(R1), Blue(B1), Green(G1) => 3 balls
      • Box 3: Red(R3), Blue(B3), Green(G3) => 3 balls
      • Box 6: Red(R6), Blue(B6), Green(G6) => 3 balls
  • Example 3 Input: $str = "B3B2G1B3"

    • Output: 0
    • Box 1: Green(G1) => 1 ball
    • Box 2: Blue(B2) => 1 ball
    • Box 3: Blue(B3) => 2 balls



The plan

We’ll need to split up the input string and create a representation of balls in boxes. Two ways come to me quickly

  • an array of 10 strings, one for each box. The G, B, or R from the input is appended onto its box.
  • an array of 10 lookup tables. Each box has a little hash that counts the number of G, B, or R balls from the input.

Once the balls are categorized by box, then we can check how many boxes contain at least one of G, B, and R. In a string, that can be a regular expression match, or a simple index function. In the hash, it would be a search for counts greater than zero.



The code, with regular expressions

sub ballBox($str)
{
    my @box = ("") x 10;
    for my ($color, $b) (split(//, $str)) # As of 5.36, multiple values allowed
    {
        $box[$b] .= $color;
    }
    return scalar grep { /B/ && /G/ && /R/ } @box;
}
Enter fullscreen mode

Exit fullscreen mode

Notes:

  • my @box = ("") x 10 An array of 10 boxes (0 to 9), each initialized with an empty string.
  • for my ($color, $b) ... My favorite recent perl addition: processing tuples out of an array.
  • $box[$b] .= $color Distribute balls into their boxes.
  • grep { /B/ && /G/ && /R/ } Easiest way to use regular expressions to mean all three must be present.
  • return scalar ... I only want a zero/non-zero return code. If the function is used in an array context (which it is in a unit test is() statement), it returns a list instead of a number. No want.



The code, without regular expressions

sub ballBox_idx($str)
{
    my @box = ("") x 10;
    for my ($color, $b) (split(//, $str)) # As of 5.36, multiple values allowed
    {
        $box[$b] .= $color;
    }
    return scalar grep { index($_, "G") >= 0 && index($_, "B") >= 0 && index($_, "R") >= 0 } @box;
}
Enter fullscreen mode

Exit fullscreen mode

This is almost the same, except that the regular expression matches have been replaced by index() checks. Even though regular expressions are efficient, there’s a lot going on under the hood, and I suspect (but need to check) that a simple string scan might be faster.



The code, with a hash

sub ballBox_hash($str)
{
    my @box;
    for my ($color, $b) (split(//, $str))
    {
        $box[$b]{$color}++;
    }
    return scalar grep { $_->{G} && $_->{B} && $_->{R} } @box;
}
Enter fullscreen mode

Exit fullscreen mode

Once again, we split the string in a similar way, but this time we’re going to count occurrences.



The performance

My intuition for performance is notoriously bad. Let’s set up a benchmark test. I’ll create a random string that distributes balls randomly, and run 50,000 or so iterations of each variation.

sub runBenchmark($repeat)
{
    use Benchmark qw/cmpthese/;

    my $str;
    foreach my $color ( qw(R G B ) )
    {
        $str .= $color . int(rand(10)) for 1 .. 25;
    }

    cmpthese($repeat, {
            grep  => sub { ballBox($str) },
            index => sub { ballBox_idx($str) },
            hash  => sub { ballBox_hash($str) },
        });
}
Enter fullscreen mode

Exit fullscreen mode

bash-3.2$ perl ch-2.pl -b 50000
         Rate  hash  grep index
hash  44248/s    --   -3%  -12%
grep  45455/s    3%    --   -9%
index 50000/s   13%   10%    --
Enter fullscreen mode

Exit fullscreen mode

This time, I’m at least partially right. Using simple string functions slightly beats hash tables and regular expressions, at least for randomly distributed data of a moderate number of balls.
I should also check for other distributions that are most likely in my actual application, like mostly empty boxes, or very large numbers of balls, or combinations like thousands of green balls but only a few red.

However, I’m going to randomly reclaim my time and call it a win.



Source link

Leave a Reply

Your email address will not be published. Required fields are marked *