<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
   <channel>
      <title>emaki</title>
      <link>http://emaki.uc.org/blog/</link>
      <description></description>
      <language>en</language>
      <copyright>Copyright 2007</copyright>
      <lastBuildDate>Mon, 15 Jan 2007 22:42:10 -0500</lastBuildDate>
      <generator>http://www.sixapart.com/movabletype/?v=3.2</generator>
      <docs>http://blogs.law.harvard.edu/tech/rss</docs> 

            <item>
         <title>Programming Challenge Three</title>
         <description><![CDATA[<p>So, I haven't posted my solution to last week's puzzle, for a couple of reasons.  The big one is that my solution doesn't work yet.  I successfully solved a problem, but on closer inspection, it was quite a different one from the one that I posed.  And in fact, I can't solve the problem I posed the way I approached it.  Thus I need to start over.  And I will.  But I can't miss this week's challenge, or I'm hosed.</p>

<p>This week, I'm going for something slightly simpler, though it has virtually infinite room for expansion and optimization.</p>

<h3>Challenge Three</h3>

<p>Write a pattern matching engine.  Given a pattern and a string, return a boolean match or no match response:</p>

<pre>
matches = pmatch( pattern, inputstr );
</pre>

<p>The pattern syntax can be anything other than straight substrings, though you must start by selecting a subset of the PCRE syntax.  Start with the '*', '+' and '?' quantifiers, together with character classes and alternations.  Ignore capturing, but include grouping.  That set of features is probably the minimum set to provide useful matching facilities, but it's not so overwhelming as to require enormous amounts of code.  Add to those features as long as it remains interesting.</p>

<p>Some random notes:</p>

<ul>
<li>If you primarily work in a 'high level language' like Python or Perl, I suggest trying this in something lower level.</li> 
<li>I also suggest separating the pattern compilation from the matching, either internally, or explicitly in the API.</li>
<li>Note that classes can be represented as alternations.  '+' can be expanded to a representation using only '*'.  '?' can be expressed as an alternation.</li>
</ul>
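
<p>To make the last note concrete, here is a minimal sketch of a backtracking matcher.  It is written in Python purely for brevity (the notes above suggest something lower level), and it is one possible approach rather than a reference solution.  The helper names parse and ends are hypothetical, and it assumes that a match must cover the whole input string, which the challenge does not specify.  The parser rewrites '+', '?' and classes into sequences, alternations and '*' as described above, so the matcher only has to know four node types.</p>

<pre>
# Assumed syntax: literals, (...) grouping, | alternation, [abc] classes, * + ?

def parse(pattern):
    """Compile a pattern string into a nested-tuple tree."""
    pos = 0

    def peek():
        return pattern[pos] if pos != len(pattern) else None

    def alternation():
        nonlocal pos
        branches = [sequence()]
        while peek() == '|':
            pos += 1
            branches.append(sequence())
        return branches[0] if len(branches) == 1 else ('alt', branches)

    def sequence():
        items = []
        while peek() not in (None, '|', ')'):
            items.append(quantified())
        return ('seq', items)

    def quantified():
        nonlocal pos
        node = atom()
        while peek() in ('*', '+', '?'):
            op = pattern[pos]
            pos += 1
            if op == '*':
                node = ('star', node)
            elif op == '+':                      # x+ becomes x x*
                node = ('seq', [node, ('star', node)])
            else:                                # x? becomes (x|)
                node = ('alt', [node, ('seq', [])])
        return node

    def atom():
        nonlocal pos
        ch = pattern[pos]
        pos += 1
        if ch == '(':
            node = alternation()
            if peek() != ')':
                raise ValueError('missing )')
            pos += 1
            return node
        if ch == '[':                            # [abc] becomes (a|b|c)
            end = pattern.index(']', pos)
            chars = pattern[pos:end]
            pos = end + 1
            return ('alt', [('lit', c) for c in chars])
        return ('lit', ch)

    tree = alternation()
    if pos != len(pattern):
        raise ValueError('unexpected ' + pattern[pos])
    return tree


def ends(node, text, i):
    """Yield every index where node can stop matching, starting at text[i]."""
    kind = node[0]
    if kind == 'lit':
        if text[i:i + 1] == node[1]:
            yield i + 1
    elif kind == 'seq':
        if not node[1]:
            yield i
        else:
            rest = ('seq', node[1][1:])
            for j in ends(node[1][0], text, i):
                yield from ends(rest, text, j)
    elif kind == 'alt':
        for branch in node[1]:
            yield from ends(branch, text, i)
    else:  # 'star': zero repetitions, or one non-empty repetition, then more
        yield i
        for j in ends(node[1], text, i):
            if j != i:
                yield from ends(node, text, j)


def pmatch(pattern, inputstr):
    """True if the whole of inputstr matches pattern."""
    tree = parse(pattern)
    return any(j == len(inputstr) for j in ends(tree, inputstr, 0))
</pre>

<p>Calling parse once and reusing the tree keeps compilation separate from matching.  Plain backtracking like this can take exponential time on awkward patterns, which is part of the room for optimization.</p>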

<p>It may seem foolish to work at solving a problem so widely and commonly solved, but this is in fact a challenging problem with lots to think about.  If you work with regular expressions every day, take this opportunity to look at them from the bottom.</p>]]></description>
         <link>http://emaki.uc.org/blog/2007/01/programming_challenge_three.html</link>
         <guid>http://emaki.uc.org/blog/2007/01/programming_challenge_three.html</guid>
         <category></category>
         <pubDate>Mon, 15 Jan 2007 22:42:10 -0500</pubDate>
      </item>
            <item>
         <title>Programming Challenge Two</title>
         <description><![CDATA[<p>I had fun doing last week's challenge.  No other solutions have been sent or linked yet, but there's no time limit; I'll accept entries indefinitely.  Other program notes: I think that I'm going to try to post challenges on Mondays, and my solutions on Thursdays.</p>

<h3>Challenge Two</h3>

<p>The <a href="http://en.wikipedia.org/wiki/Knight's_tour">Knight's Tour</a> is an old math problem: given a standard chessboard and a single knight, can you move the knight such that you land on every square exactly once?  The knight must move as per normal chess rules, but you can start and end on any square.  A slight variation is a closed tour, where the knight ends up one legal move away from the starting point.</p>

<p>The challenge is to produce a program (in any language) that returns a solution to the puzzle.</p>

<p>None of the following are problem requirements, but worth considering:</p>

<ul>
<li>generalize the problem, and allow the size of the chess board to be user specified.  Schwenk's Theorem provides some board constraints that are known to affect problem satisfiability.  (Short version is that closed tours exist if at least one dimension is even, and the board is reasonably large.)</li>
<li>produce only closed path solutions, or allow the user to select</li>
<li>produce random solutions.  There are known to be billions of solutions for the standard board - program so as to find a solution non-deterministically.</li>
<li>Wikipedia notes that it is possible to solve the general problem in linear time.  It's not required that the solution be optimal, but consideration should likely be given to efficiency.</li>
</ul>
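
<p>As a starting point for the efficiency question, here is a minimal Python sketch.  It assumes Warnsdorff's rule (always try the square with the fewest onward moves first) as the heuristic, with plain backtracking as a safety net; that choice is mine, and it is not necessarily the linear-time method Wikipedia refers to.  The names knight_moves and knights_tour are made up for illustration, and the moves are generated by reflecting a single vector.</p>

<pre>
def knight_moves():
    """The eight moves, built by reflecting the single vector (1, 2)."""
    moves = set()
    for dx, dy in ((1, 2), (2, 1)):          # diagonal reflection swaps the axes
        for sx in (1, -1):                   # horizontal reflection
            for sy in (1, -1):               # vertical reflection
                moves.add((dx * sx, dy * sy))
    return sorted(moves)


def knights_tour(n=8, start=(0, 0)):
    """Return a list of n*n squares visiting each exactly once, or None."""
    moves = knight_moves()
    visited = {start}
    path = [start]

    def onward(square):
        """Unvisited squares on the board that are one move from square."""
        x, y = square
        return [(x + dx, y + dy) for dx, dy in moves
                if x + dx in range(n) and y + dy in range(n)
                and (x + dx, y + dy) not in visited]

    def extend():
        if len(path) == n * n:
            return True
        # Warnsdorff: try the square with the fewest onward moves first.
        for nxt in sorted(onward(path[-1]), key=lambda s: len(onward(s))):
            visited.add(nxt)
            path.append(nxt)
            if extend():
                return True
            visited.remove(nxt)              # dead end: undo and try the next
            path.pop()
        return False

    return path if extend() else None
</pre>

<p>This produces an open tour and is deterministic.  The recursion goes one level deep per square, so it suits small boards; a large board would want an explicit stack.</p>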

<p>Some notes, none of which are promised to be helpful, but have occurred to me as I posed the problem:</p>

<ul>
<li>The solution set can be represented as a sequence of transition pairs - once all the transitions are selected, then the ordering is unambiguous.</li>
<li>All of the knight's moves for a given square are represented by a single vector, reflected horizontally, vertically or diagonally.</li>
</ul>]]></description>
         <link>http://emaki.uc.org/blog/2007/01/programming_challenge_two.html</link>
         <guid>http://emaki.uc.org/blog/2007/01/programming_challenge_two.html</guid>
         <category></category>
         <pubDate>Sun, 07 Jan 2007 10:22:42 -0500</pubDate>
      </item>
            <item>
         <title>programming challenge one</title>
         <description><![CDATA[<p>It's the first week of a new year, and it's the sort of day that one feels like saying: I'm going to do <em>X</em> every week for the rest of the year.  Then you realize that you likely won't, really.  But what the hell, let's give it a shot:</p>

<h3>Weekly Programming Challenge</h3>

<p>I'm going to gather ideas from a bunch of places, and then write up a challenge.  The problems are language neutral, so maybe I'll try some in a few different languages, or different problems in different languages.</p>

<p>There are a few versions of "99 Lisp/Prolog/Whatever Programming Problems" floating around, and I'll take my inspiration for the first few from them.  </p>

<h3>Challenge One</h3>

<p>Goldbach's conjecture states:</p>

<blockquote>
Every even integer greater than 2 can be written as the sum of two primes.
</blockquote>

<p>Write a function that, given an even integer greater than 2, returns a pair of primes that sums to the given integer.</p>

<ul>
<li>Note that the prime sums are not necessarily unique.  You can pick a rule for which pair to return, but only return one pair.  (e.g. first found, or the pair with the smallest absolute difference)</li>
<li>The function should raise an exception (or equivalent) if the input is not even or not larger than 2.</li>
</ul>

<h4>Example: (in Perl)</h4>

<blockquote>
<pre>
my @result = goldbach( 6 );  # result 3, 3
@result = goldbach( 10 );    # result 3, 7 -or- 5, 5
</pre>
</blockquote>
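
<p>For comparison, here is a minimal Python sketch of the same interface.  It assumes the "first found" rule with the smaller prime first, uses simple trial division for the primality test, and is not the algorithm used in any of the linked solutions.  The helper name is_prime is hypothetical.</p>

<pre>
import math


def is_prime(n):
    """Trial division up to the integer square root of n."""
    if not n > 1:
        return False
    return all(n % d != 0 for d in range(2, math.isqrt(n) + 1))


def goldbach(n):
    """Return the first pair of primes (p, q), smallest p first, with p + q == n."""
    if n % 2 != 0 or not n > 2:
        raise ValueError('expected an even integer greater than 2')
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return (p, n - p)
    # Reaching this line would mean a counterexample to the conjecture.
    raise RuntimeError('no pair found')
</pre>

<p>With this rule, goldbach(6) gives (3, 3) and goldbach(10) gives (3, 7), matching the Perl example.</p>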

<h3>Solutions:</h3>

<p>First week, so I need to work out the plan.  I'll post my solutions after at least 24 hours, but I'll just link to them without notes or solution details on this page.  I'll link to anyone else's solution that requests it, you can also link to your solutions/etc. in comments.  I'll defer posting comments that contain too much information until at least after the first 24 hours.</p>

<p>I'll always provide my solutions, but I'll stress that I make no suggestion at all that mine are good or completely correct.  In particular, I'm planning on trying these problems in a variety of languages, some of which I am likely to suck at.  (The only language I can reasonably claim to not suck in is Perl.)</p>

<h4>My Solutions:</h4>

<ul>
<li><a href="http://emaki.uc.org/challenges/week1/goldbach_pl.txt">Perl solution</a>:  All the program and solution notes are with this code.  Solution is only about 9 logical lines of code, perhaps a little too golfed in places.</li>
<li><a href="http://emaki.uc.org/challenges/week1/goldbach_c.txt">C solution</a>: Virtually no documentation, follows the exact same algorithm as the Perl solution.</li>
</ul>

<h4>Other Solutions:</h4>

<ul>
<li><a href="http://hapoteh.net/emaki/emaki1.pl">hapoteh</a> (Perl) - notes inline.  Slightly different approach, precomputes primes in range first.</li>
</ul>]]></description>
         <link>http://emaki.uc.org/blog/2007/01/programming_challenge_one.html</link>
         <guid>http://emaki.uc.org/blog/2007/01/programming_challenge_one.html</guid>
         <category>weekly challenge</category>
         <pubDate>Thu, 04 Jan 2007 08:54:58 -0500</pubDate>
      </item>
            <item>
         <title>Long Overdue</title>
         <description><![CDATA[<p><img src="http://emaki.uc.org/images/TPR-v2i3-cover.png" align="right" alt="cover of The Perl Review volume 2, issue 3" hspace="5" vspace="10">Long overdue for an update.  This will thus be scattershot.  Our mosaic/collage article was published last month in <a href="http://theperlreview.com">The Perl Review</a>, and <a href="http://college.livetext.com/college/index.html">LiveText++</a> paid for everyone at YAPC::NA to get a free copy with their registration bag-thing.  Got a bit of author email, and at the conference a gentleman from Perl Seminar New York found me and gave me a photo collage he found in a travel magazine.  (It used non-square photos, something I would like to support, but allowed adjacent duplicates.)</p>

<p>YAPC was good this year.  I didn't find it as intellectually rewarding as the previous two, but I don't think that's necessarily a reflection on the conference.  I'm just not that interested in AJAX, try as I might, and there were a number of AJAX talks.  Also, a lot of the cooler things I had already seen in previous years.  MJD's talk would have been awesome if I hadn't read his book (I sat in for a while - interesting, but familiar.)  As with last year, the best talks were pmichaud's Rule and Parser talks.  I missed out on the APL talk, though, because I misunderstood the abstract.  Ahh well.  Overall, still an awesome week, well worth it.  A great break, and a change of scene.</p>

<p>I started working on two projects at YAPC, which I've barely touched since coming back ($work has been quite consuming).  The first is converting the guts of the mosaic code to use C via XS.  I figure I can likely get a hundredfold performance boost that way.  The learning curve is quite interesting.  Like everything over the last week, I've just been reading lightly during downtime.</p>

<p>The other project is to write a Huffman compressor in C++ -- part of my longer term goal to write one in ten different languages.  I've done it in Perl5 and Perl6 (as much as possible).  Would like to do it in PIR, then perhaps some of: OCaml, Haskell, Ruby, Python, Eiffel, Scheme.   Part of me is tempted to toss Fortran in there.  It's long past due for me to take a world tour of languages.  Huffman compression is interesting, yet simple enough.  The Perl5 implementation took an hour or two - so add in the learning curve of a new language, and you've got a good sized project.  Solving the same problem multiple times has the upside of allowing direct comparison of approaches.  I'll be careful to try to approach the problem fresh each time, so that I'm not writing Perl in Haskell, etc.</p>]]></description>
         <link>http://emaki.uc.org/blog/2006/07/post.html</link>
         <guid>http://emaki.uc.org/blog/2006/07/post.html</guid>
         <category>perl</category>
         <pubDate>Mon, 17 Jul 2006 08:52:20 -0500</pubDate>
      </item>
            <item>
         <title>YCbCr</title>
         <description><![CDATA[<p>I have refactored quite a lot, and allowed the colourspace to be selected at mosaic build time.  This lets me use the same tile index file and experiment with various colourspaces and weightings without rebuilding the index.  I've built up the corpus to about 80&nbsp;000 tiles, and since the vectors are held in RAM, I needed to move off my laptop and onto a machine with more RAM.  </p>

<p>Here is a mosaic from a photo I took on Georgian Bay:</p>

<p><img src="http://emaki.uc.org/mosaic/2102_large_80k_ycbcr_w111.thumb.jpg" alt="50 tiles wide from an 80k+ tile corpus.  YCbCr colourspace matching with an even weighting." title="50 tiles wide from an 80k+ tile corpus.  YCbCr colourspace matching with an even weighting." border="0" /></p>

<p>Here is a close up of a transition area (sky -> trees -> water):</p>

<p><img src="http://emaki.uc.org/mosaic/2102_large_80k_ycbcr_w111.zoomed.jpg" alt="zoomed view"></p>

<p>The repeat limit was a Manhattan distance of 40, and the build order was shuffled.  I used the YCbCr colourspace, which I think performs way, way better than HSV/HSL.  YCbCr is, I think, the 8-bit version of YUV, and I think that it does a good job of mapping colour perceptually.  It gives two components to chroma, rather than HSV's one, which makes for a clean colour match.  I found that RGB tended to favour blue disproportionately... in fact primaries seemed to match more strongly.  The YUV/YCbCr matching gives a softer, more 'real' and slightly more impressionistic feel, I think.  I intend to run this again with just RGB, for a straight comparison of the two.  </p>

<p>HSL/HSV also represent hue as a loop... which means that at the 360-0 boundary, the maximum numeric span, the colours are in fact very similar.  This resulted in poor matches near the boundary.  The two dimensional chroma space of YUV/YCbCr places perceptual extremes at the edges, with no wrapping.  This means that matches appear good throughout the perceived colour gamut.  I'm not a graphics expert in the least, so you'll excuse some heavy misuse of jargon and some similarly heavy glossing.</p>
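
<p>The boundary problem can be illustrated with a small Python sketch.  It assumes hue is compared as a plain number, which is a guess at how a straightforward vector distance would treat it, and it borrows the YUV weights quoted in the earlier mosaic post rather than true YCbCr.  The two sample colours and all function names are made up for illustration.</p>

<pre>
import colorsys
import math


def hue_degrees(r, g, b):
    """Hue of an 8-bit RGB colour, in degrees around the colour wheel."""
    h, _s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h * 360


def circular_hue_gap(h1, h2):
    """Hue difference measured the short way round the loop."""
    gap = abs(h1 - h2) % 360
    return min(gap, 360 - gap)


def yuv(r, g, b):
    """YUV from 8-bit RGB, using the weights from the earlier mosaic post."""
    return (0.299 * r + 0.587 * g + 0.114 * b,
            -0.147 * r - 0.289 * g + 0.436 * b,
            0.615 * r - 0.515 * g - 0.100 * b)


red_a = (255, 0, 10)   # red leaning slightly towards magenta
red_b = (255, 10, 0)   # red leaning slightly towards orange

# Treating hue as a straight line puts these two reds at opposite ends.
naive_gap = abs(hue_degrees(*red_a) - hue_degrees(*red_b))
# Measuring around the loop shows how close they really are.
wrapped_gap = circular_hue_gap(hue_degrees(*red_a), hue_degrees(*red_b))
# YUV has no wrap-around, so an ordinary distance already behaves.
yuv_distance = math.dist(yuv(*red_a), yuv(*red_b))

print(naive_gap, wrapped_gap, yuv_distance)
</pre>

<p>The naive gap comes out at well over 300 degrees for two nearly identical reds, while the wrapped gap and the YUV distance are both small.  A circular distance would be one way to keep using HSV; switching to a space without a loop avoids the special case altogether.</p>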

<p>For those who want the full 4.3MB bomb, <a href="http://emaki.uc.org/mosaic/2102_large_80k_ycbcr_w111.jpg">here is the full-size version (3750x6450 pixels)</a>.  Be forewarned, I just selected the images based on license characteristics, I didn't review all 80k tiles for appropriateness, etc.</p>

<p>I think that the fine detail improves considerably as I add to the corpus, but obviously that is RAM limited.  Each tile requires an in-RAM 75 element array to represent the vector.  It also extends the runtime of the match linearly... this mosaic ran overnight, for about 9 hours.</p>]]></description>
         <link>http://emaki.uc.org/blog/2006/04/ycbcr.html</link>
         <guid>http://emaki.uc.org/blog/2006/04/ycbcr.html</guid>
         <category>mosaic</category>
         <pubDate>Sun, 09 Apr 2006 10:30:06 -0500</pubDate>
      </item>
            <item>
         <title>Sudoku Article Published</title>
         <description><![CDATA[<p><img src="http://emaki.uc.org/images/TPR-v2i2-cover.png" align="right" vspace="5" hspace="5" /> I didn't mention this earlier, but my article on Sudoku Generation was printed in the Spring edition of <a href="http://theperlreview.com/">The Perl Review</a>.  <a href="http://theperlreview.com/SamplePages/ThePerlReview-v2i2.p10.pdf">The first page of the article is available in PDF</a>.  And of course the whole thing is available to download for subscribers.  For subscribers only, <a href="http://theperlreview.com/Subscribers/CodeArchive/v2i2/sudoku-generator.txt">the actual code from my generator</a> is posted on the TPR website <em>(password required)</em>.</p>

<p>I haven't gotten my physical copy in the mail yet, though.  I also haven't gotten any author email telling me I'm brilliant or that I missed something obvious.  So that is half good news, half disappointing.  </p>

<p>I've sort of stalled on my Sudoku Grader project... I did complete the Locked Candidate detector, and flesh out the outer framework of the Grader, but I haven't gotten much farther.  I might pick it up again in a month or so.  I'm trying to follow AudreyT's approach, and be productive in hobby-coding by optimizing for fun.  The big downside of that is that when you hit a patch of boring refactoring, you tend to stall.</p>

<p><em>Update: When I got home, it had arrived.  It looks pretty good, I think.</em></p>]]></description>
         <link>http://emaki.uc.org/blog/2006/03/sudoku_article_published.html</link>
         <guid>http://emaki.uc.org/blog/2006/03/sudoku_article_published.html</guid>
         <category>tpr</category>
         <pubDate>Wed, 29 Mar 2006 09:44:26 -0500</pubDate>
      </item>
            <item>
         <title>prosaic mosaic</title>
         <description><![CDATA[<p>I need to (and plan to) launch this blog in earnest.  I thought that I would quickly mention that Daniel and I have been working on a photo mosaic generator in Perl with GD, using the Flickr API to harvest thumbnails for the corpus.  I'll try to doc things out here, but Daniel has <a href="http://da-lj.livejournal.com/68865.html">posted examples and some explanation on his LJ already</a>.</p>

<p>Here's my favourite example so far:</p>

<p><img src="http://emaki.uc.org/mosaic/best_flower_resized.jpg" /></p>

<p>I am most of the way through the HSV chunk that he mentioned... I am including the ability to weight the HSV so that you can choose which component is more important.  </p>

<p><em>more added later:</em></p>

<p>Someone on Daniel's blog suggested YUV instead of HSV:</p>

<pre>
my $y =  0.299 * $r + 0.587 * $g + 0.114 * $b;
my $u = -0.147 * $r - 0.289 * $g + 0.436 * $b;
my $v =  0.615 * $r - 0.515 * $g - 0.100 * $b;
</pre>

<p>That actually has the additional benefit of being easier to calculate.</p>

<p>I'm pretty torn about my system of weighting dimensions of the colour space.  I'm currently scaling my vectors in those dimensions when I create the vectors, rather than weighting differently in the vector distance calculation phase.  That makes loads of sense, since it avoids millions of calculations... but the way I have things set up, it applies the scaling at tile index time, rather than index load time.  I'm thinking that I should just always index in RGB, then transform the space and scale the vectors during the pre-matching phase (what Daniel and I have been calling phase 2a).</p>
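
<p>The reason scaling up front works is that, for a Euclidean distance, multiplying each dimension by its weight before comparing gives the same answer as applying the weights inside the distance calculation.  A small Python sketch, assuming Euclidean distance and using made-up weights, vectors and function names:</p>

<pre>
import math


def scale(vector, weights):
    """Apply per-dimension weights once, e.g. when the index is loaded."""
    return [w * x for w, x in zip(weights, vector)]


def weighted_distance(a, b, weights):
    """Apply the weights inside the distance instead, once per comparison."""
    return math.sqrt(sum((w * (x - y)) ** 2 for w, x, y in zip(weights, a, b)))


weights = [2.0, 1.0, 1.0]          # hypothetical: the first dimension counts double
tile = [120.0, -10.0, 35.0]        # made-up Y, U, V values
target = [100.0, 5.0, 30.0]

prescaled = math.dist(scale(tile, weights), scale(target, weights))
per_comparison = weighted_distance(tile, target, weights)

print(prescaled, per_comparison)   # the two values agree
</pre>

<p>Scaling at load time costs one multiplication per stored component, while weighting inside the distance repeats that work for every tile comparison.</p>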

<p>Okay, I'm not torn anymore.  That last paragraph untore me.</p>]]></description>
         <link>http://emaki.uc.org/blog/2006/03/prosaic_mosaic.html</link>
         <guid>http://emaki.uc.org/blog/2006/03/prosaic_mosaic.html</guid>
         <category>mosaic</category>
         <pubDate>Tue, 28 Mar 2006 10:26:57 -0500</pubDate>
      </item>
            <item>
         <title>grading pt1</title>
         <description><![CDATA[<p>I recently finished my Perl sudoku generator.  Now I can generate a minimal puzzle fairly quickly.  It's in Perl, and could probably be faster, but there are two projects that I want to start on now:</p>

<ol>
<li>grading and classifying puzzles</li>
<li>generalize the exact cover to play around with Kataminoe covers</li>
</ol>

<p>The first item is first.  After sampling just a few of the thousands of puzzles I generated, I noticed a few things:</p>

<ul>
<li>A good percentage don't appear to be human-solvable</li>
<li>I find puzzles most interesting if they are tricky, and contain a known pattern... like a multicolour or an X-Wing.</li>
</ul>

<p>Both grading and categorizing require something that recognizes patterns, and the last few days, I've been figuring out all of these patterns, and coming up with napkin pseudocode for detecting them.  I'm still struggling with XY-Wings.  Swordfish and X-Wing patterns actually turn out to be easier to find, I think, than hidden pairs.  Things like remote pairs and inference chains are extremely difficult for people like me, and programs, to find.  You need creepy FPGA brains for that sort of thing.  </p>

<p>I already have detectors for Naked Singles, Hidden Singles, and Naked N-Tuples.  Next on my agenda is locked candidates.  I'm not sure how I am going to implement Colours, yet, though detecting complements is fairly easy.  </p>
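
<p>To give a flavour of what the simplest detector involves, here is a minimal Python sketch of a Naked Single finder: an empty cell whose row, column and box leave exactly one possible digit.  It is not the code from my Perl grader; the grid layout (a 9x9 list of lists with 0 for empty cells) and the function names are assumptions made for the example.</p>

<pre>
def candidates(grid, row, col):
    """Digits not yet used in the cell's row, column or 3x3 box."""
    used = set(grid[row]) | {grid[r][col] for r in range(9)}
    top, left = 3 * (row // 3), 3 * (col // 3)
    used |= {grid[r][c]
             for r in range(top, top + 3)
             for c in range(left, left + 3)}
    return set(range(1, 10)) - used


def naked_singles(grid):
    """Map (row, col) to its digit for every empty cell with one candidate."""
    found = {}
    for row in range(9):
        for col in range(9):
            if grid[row][col] == 0:
                options = candidates(grid, row, col)
                if len(options) == 1:
                    found[(row, col)] = options.pop()
    return found


# 0 marks an empty cell; only the top row is partly filled in.
grid = [[1, 2, 3, 4, 5, 6, 7, 8, 0]] + [[0] * 9 for _ in range(8)]
singles = naked_singles(grid)
print(singles)
</pre>

<p>Here the only naked single is the last cell of the top row, which must be 9.  The same candidates function would be the natural building block for the hidden single and locked candidate detectors.</p>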

<p>I'm going to keep posting here as I make progress.  Ideally with my basic design next, then progress on each detector.</p>]]></description>
         <link>http://emaki.uc.org/blog/2006/02/grading_sudoku.html</link>
         <guid>http://emaki.uc.org/blog/2006/02/grading_sudoku.html</guid>
         <category>sudoku</category>
         <pubDate>Sat, 18 Feb 2006 23:18:27 -0500</pubDate>
      </item>
            <item>
         <title>geekblogging</title>
         <description><![CDATA[<p>I've had a personal blog for many years, and like everyone else, I got pretty sick of writing about myself in the late 90s.  I still have a personal blog, but it is now reserved for drunken poetry.</p>

<p>I'm going to start posting here about programming whatsit that interests me.  My degree was in math/computer science, but I wasn't terribly serious about it at the time, to be honest.  Lately, certain aspects of computer science have interested me more and more.  I've come to a couple of conclusions:</p>

<ul>
<li>I like language and symbol aspects, like data and semantic encodings</li>
<li>I like puzzles, and problem solving with code</li>
<li>Hardware continues to be boring to me</li>
<li>I need to learn more languages, try more problems</li>
</ul>

<p>I'm okay with my preferences and interests.  I'll post about them here when I feel like it is interesting enough.  I'm going to take on projects, and discuss them with myself here.</p>]]></description>
         <link>http://emaki.uc.org/blog/2006/02/geekblogging.html</link>
         <guid>http://emaki.uc.org/blog/2006/02/geekblogging.html</guid>
         <category>undef</category>
         <pubDate>Mon, 13 Feb 2006 19:54:24 -0500</pubDate>
      </item>
      
   </channel>
</rss>
