
Friday, October 16, 2015

Esoteric Language Resources

The internet is an amazing resource for programmers. This is particularly true for programming languages in widespread use. The following graph shows the number of questions on Stack Overflow with a tag for some well-known languages.



There's a huge disparity between mainstream programming languages such as Java and C# and languages associated with artificial intelligence such as LISP and Prolog.

An even greater disparity exists between mainstream languages and expert system languages (that aren't even perceptible on the prior graph unless one zooms in on the far right):



I think this is one of the tallest hurdles to jump for users of esoteric languages. The presence of resources on the internet (discussion forums, documentation, tutorials, examples, etc.) has not reached the critical mass needed for widespread use.

Monday, June 2, 2014

The Long Slow Decline of App Store Income

I didn’t expect to get rich creating an iOS app. If that were my goal, I certainly wouldn’t have created a ‘to do’ list app to compete with the gazillion other ‘to do’ list apps in the App Store.

But for a very long time I'd been making lists on scraps of paper, so when I first got my iPod Touch in September 2008 and couldn’t find a list app that I liked, I decided to make my own. If I could make a bit of extra money in the process, so much the better.

I released List! Lite in February 2010 and List!, a paid version of the app, in July 2010. Since that time, I’ve released 25 updates including new features and bug fixes. List! has 210 ratings with an average of 4.5 out of 5 stars.

There are many excellent articles worth reading on the challenges of making money on paid apps, but I can nicely summarize them with a single graph showing the monthly income I’ve made on List! between October 2010 and April 2014:



I got a large bump in income shortly after adding support for the larger screen of the iPad in June 2011, but nothing I've done since that time—new features, sales, promotions, and localization—has had any lasting effect on generating sustainable income. At best, all I've done is just slow the decline.

As a hobbyist, I have no regrets about the time I spent developing the app. I learned a number of new things, made enough money to buy some toys, and have an app that I use on a daily basis.

As a developer, I’m glad the only investment I lost was my spare time. There are all kinds of speculations I could make, but at the end of the day I wrote an app I loved creating that just wasn’t commercially successful. I haven't written my last app, but I now have a more enlightened view of the economic realities of app development.

So if you approach a developer about an app idea that’s sure to make money and they look at you skeptically, now you know why.

Thursday, May 8, 2014

River Benchmark

I came across another river crossing problem that’s similar to the farmer’s dilemma example problem for CLIPS. It’s a little bit more complex by virtue of the number of things that have to be moved, but it’s essentially the same type of problem.

One of the issues with existing benchmarks such as waltz and manners is validation. The manners benchmark only runs in CLIPS with the depth conflict resolution strategy and the waltz benchmark executes different numbers of rules depending upon the conflict resolution strategy chosen.



Clearly this is an issue. I'm a proponent of having lots of benchmarks, rather than one or two, but in order to have lots of benchmarks, they need to be dead simple to translate and verify. In the case of the existing benchmarks, they're not.

What this means is that you have to design the benchmarks with this in mind. I thought it would be an interesting exercise to demonstrate how this can be done. So I wrote a CLIPS program to solve a variation of the river crossing problem and once I had it working, set the conflict resolution strategy to random to see if the same number of rules was executed with each run. It wasn’t.

It took several iterations before I had a version that produced the same number of rule executions regardless of the order in which rules of the same priority/salience were placed on the agenda. The primary mechanism used to get an exact number of rules executed was to assign weights to each of the possible moves that could be made so that the search was always made in the same order. In total, there were 19 rules and 5 salience values for the different groups of rules. If I’d used modules (which would have made translation to other languages more difficult), there wouldn’t have been any need for salience values at all.
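The weighting idea can be sketched outside of CLIPS. The following is my own minimal Python illustration, not the actual river program (it uses the classic farmer/fox/goose/grain variant rather than the benchmark's larger set of items, and the weight values are made up): each candidate move carries an explicit weight, so candidates are always tried in the same order and every run explores the same states in the same sequence.

```python
from collections import deque

ITEMS = ("farmer", "fox", "goose", "grain")
UNSAFE = [{"fox", "goose"}, {"goose", "grain"}]  # pairs that can't be left alone
WEIGHTS = {"fox": 1, "goose": 2, "grain": 3}     # hypothetical move weights

def safe(side):
    """A bank is safe if the farmer is present or no unsafe pair is together."""
    side = set(side)
    return "farmer" in side or not any(pair <= side for pair in UNSAFE)

def solve():
    start = frozenset(ITEMS)            # everything begins on the left bank
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        left, path = queue.popleft()
        if not left:                    # left bank empty: everything crossed
            return path
        here = left if "farmer" in left else set(ITEMS) - left
        # The farmer crosses alone or with one item; sorting candidates by
        # weight fixes the search order, leaving no ties to break.
        cargo = sorted((x for x in here if x != "farmer"), key=WEIGHTS.get)
        for move in [{"farmer"}] + [{"farmer", c} for c in cargo]:
            new_left = left - move if "farmer" in left else left | move
            if safe(new_left) and safe(set(ITEMS) - new_left):
                key = frozenset(new_left)
                if key not in seen:
                    seen.add(key)
                    queue.append((key, path + [tuple(sorted(move))]))

solution = solve()
print(len(solution))   # shortest plan: 7 crossings
```

Because the breadth-first order plus the weight-sorted move list is fully deterministic, the number of states visited (and, in the CLIPS analogue, the number of rules fired) is the same on every run.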

Like the manners and waltz programs, the river program runs considerably faster in version 6.3 of CLIPS than in version 6.24. On a 2.66 GHz Intel Core 2 Duo it completes in 0.7 seconds in the newer version as opposed to 29 seconds in the older version.

The river program is available here.

Wednesday, April 30, 2014

Affordable Health Care

When I left my job working for a company to become a software consultant working for myself, one of the first things I did was get an individual health insurance policy. Since my premiums aren’t subsidized, I’m fully aware of the true cost I’m paying and in the last eight years that cost has risen 333%:


With most of the rate increases I’ve gotten in the past, I’ve also received this bit of ominous advice from my insurance carrier:
You may qualify for other plans at different rates. Before you make a decision, you may want to discuss these alternate plans with us or your broker. If you choose a new plan, we might review your health information again. This might mean a higher rate. And you may not be able to return to your original plan.
So if I understand what they’re telling me, if I try to switch to another health plan to save money, I may end up paying more money for less coverage than what I have now. Thanks, let me think a bit and decide if I want to bend over now or bend over later.

Obtaining health insurance as an individual sucks; you lack the power that a large group has to negotiate reasonable rates for all of its members. I don’t see how any free market advocate can claim that this kind of lock-in—either to a company subsidizing/negotiating your costs or to a plan that can’t be changed without significant risk—is a sign of a competitive environment that will work things out if we just give it a chance. In fact, the opposite is true. In an unregulated free market, insurance companies will naturally exclude those most in need of health insurance in order to maximize their profits.

Long term, I honestly don’t know whether Obamacare is going to ruin the American way of life as we know it; I’ll leave those pronouncements to the politicians and pundits who claim to know everything, but can’t offer a better solution to the current broken system.

Short term, however, it’s going to cut my insurance premiums almost in half. Because I can’t be charged more or denied coverage for preexisting conditions (such as taking statin medication to help keep my cholesterol in check), and the information on the healthcare exchanges makes it much easier to compare plans from different carriers, I was able to find a better deal with a different carrier (keeping the same doctors I was using in my old network).

Only time will tell if there are any gotchas with my new health care plan and perhaps the government’s health care mandates are unsustainable, but from my perspective the Affordable Care Act really has made my health care more affordable.

Full disclosure: I did not vote for Obama in the 2008 and 2012 presidential elections, but I, for one, welcome our new socialist overlords.

Monday, March 17, 2014

Manners and Waltz Benchmarks

The two most widely used benchmarks for rule-based systems are Manners and Waltz. Daniel Selman nicely summarizes the major issues with these benchmarks in The Good, The Bad, and the Ugly - Rule Engine Benchmarks.

CLIPS didn’t implement the optimizations needed to run these benchmarks efficiently until version 6.30, so it compared unfavorably to other engines when these were the only metrics used for comparing performance.

As the following graphs show, the performance of CLIPS 6.30 is orders of magnitude faster than CLIPS 6.24 for the larger data sets used by these benchmarks.

In particular, hashing the memory nodes and optimizations for handling large numbers of activations caused the dramatic improvement in the benchmark results.

How did these optimizations improve the performance of a real world application? I benchmarked some of the larger sample data sets for a production system I developed that’s run hundreds of thousands of times a month. The system consists of hundreds of rules and the amount of data processed can range up to ten thousand or more facts.

Each of the samples showed improvement, but not nearly as dramatic as Manners or Waltz:

A process is created and the rules loaded each time the system is run, so a more accurate picture of the total processing time would include the time to load the rules:

There’s nothing wrong with modest improvements, but if your expectations of performance were based on Manners and Waltz, you’d surely be disappointed.

That’s not to say there weren’t performance benefits of the 6.30 optimizations in real world situations. Occasionally, I’d write one or more rules that were efficient for a small data set, but acceptance testing did not include a large data set. The system would run fine until a sufficiently large data set with the appropriate types of facts was submitted, at which point the process would display non-optimized Manners/Waltz behavior (i.e. it would appear to hang).

When using CLIPS 6.24, I rewrote the offending rules to be more efficient for large data sets. With 6.30, since the system is more tolerant of inefficient rules, these situations occur less frequently and are easier to correct.
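The hang behavior is easy to reproduce in miniature. As a rough sketch (plain Python of my own devising, not CLIPS internals; the fact counts are arbitrary), consider the work a join between two patterns must do: keyed on a shared variable, the work stays roughly proportional to the data, but an unconstrained join forms a cross product that explodes as fact counts grow.

```python
# Why a rule can look fine on a small data set yet appear to hang on a
# large one: unconstrained joins do work proportional to the product of
# the fact counts, not the sum.

def partial_matches(as_, bs, constrained):
    """Count the candidate pairs a two-pattern join must consider.

    constrained=True models a join keyed on a shared variable;
    constrained=False models an unconstrained cross-product join.
    """
    if constrained:
        # Index one side by the shared key; only matching pairs count.
        index = {}
        for b in bs:
            index.setdefault(b["key"], []).append(b)
        return sum(len(index.get(a["key"], [])) for a in as_)
    # Every fact on one side pairs with every fact on the other.
    return len(as_) * len(bs)

small = [{"key": i} for i in range(10)]
large = [{"key": i} for i in range(10_000)]

print(partial_matches(small, small, False))  # 100 pairs: unnoticeable
print(partial_matches(large, large, False))  # 100,000,000 pairs: "hangs"
print(partial_matches(large, large, True))   # 10,000 pairs: keyed join scales
```

At ten facts nobody notices the cross product; at ten thousand facts it's a hundred million partial matches, which is exactly the non-optimized Manners/Waltz behavior described above.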

CLIPS versions of the Manners and Waltz benchmarks are available here and here.

Tuesday, January 7, 2014

Pattern Matching Constants in CLIPS

In CLIPS, there are three basic templates for matching a constant in a rule pattern. The constant, in this case 387, can be placed directly in the pattern as a literal constraint:
(defrule rule-387
   (data (value 387))
   =>)
A predicate constraint can be used to evaluate an expression testing for equality:
(defrule rule-387
   (data (value ?x&:(eq ?x 387)))
   =>)
A test conditional element can be used to evaluate an expression testing for equality:
(defrule rule-387
   (data (value ?x))
   (test (eq ?x 387))
   =>)
An empirical analysis can be performed to determine which template is most efficient as the number of rules and distinct constants increase. A common set of constructs will be used with each group of rules to trigger pattern matching as a single fact is repeatedly modified:
(deftemplate data
   (slot index)
   (slot value))

(deffacts start
   (data (index 1000000) (value 100000)))

(defrule loop
   (declare (salience 1))
   ?f <- (data (index ?i&~0))
   =>
   (modify ?f (index (- ?i 1)) (value (- ?i 1))))
Using CLIPS 6.24 to run the common rules in conjunction with groups of 1, 100, 400, 700, and 1000 rules using each of the three templates produces the following results.



The literal constraint is the most efficient, which is what you’d intuitively expect after an examination of the code. It directly compares a value in a fact to a value stored in the pattern network.

The predicate constraint and test conditional element are less efficient as they require a function evaluation. The predicate constraint is slightly more efficient because it’s evaluated at an earlier stage of pattern matching than the test conditional element. If all of the referenced variables in a predicate constraint are bound within the pattern, then the evaluation can be performed in the pattern network. Test conditional elements and expressions containing variables bound in other patterns are evaluated in the join network, which is primarily used to unify variable bindings across multiple patterns.

CLIPS 6.30 provides significantly better performance for literal constraints:



In situations where multiple patterns satisfy the criteria for sharing, literal constraints are not evaluated one at a time. Instead the next node in the pattern network is determined using a hashing algorithm which requires just a single computation regardless of the number of literal constraints. So for this particular benchmark, the execution time does not increase for literal constraints as the number of rules and distinct literals is increased.
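As a rough analogy (my own Python sketch, not the actual CLIPS source; the literal count of 1000 is arbitrary), here is the difference between testing literal constraints one at a time and finding the matching branch with a single hashed lookup:

```python
# One literal per rule, in the spirit of the rule-387 example above.
literals = list(range(1000))

def match_linear(value):
    """6.24-style: compare literal constraints one at a time."""
    comparisons = 0
    for lit in literals:
        comparisons += 1
        if lit == value:
            return lit, comparisons
    return None, comparisons

# 6.30-style: a hash table maps each literal to its pattern-network branch.
hashed = {lit: lit for lit in literals}

def match_hashed(value):
    """One hash computation regardless of how many literals exist."""
    return hashed.get(value), 1

print(match_linear(387))   # (387, 388): comparisons grow with rule count
print(match_hashed(387))   # (387, 1): constant work
```

The linear version does work proportional to the number of distinct literals, while the hashed version does the same single computation whether there are ten rules or a thousand, which matches the flat execution times in the 6.30 graph.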

Bottom line: regardless of which version of CLIPS you’re using, use literal constraints rather than predicate constraints or test conditional elements.

The code for these benchmarks is available here.

Friday, October 11, 2013

Movies, Pacing, and Books

For economic and practical reasons, most movies have a run time of 1½ to 2½ hours. But what if time constraints could be removed and audiences would sit through an entertaining movie of any length? How much longer would films run and would their content have to change to keep the audience in their seats?

There are many examples of movies that are both long and successful. Avatar, the top grossing film of all time, had a run time of 2 hours and 42 minutes. Titanic at 3 hours and 14 minutes is the second top grossing film. The three movies of The Lord of the Rings trilogy were all around three hours in length and individually are the 33rd, 24th, and 7th highest grossing films.

But The Lord of the Rings, both the movie and the books on which it is based, is really one story told in three parts. A single “super movie” would be over 9 hours in length with combined grosses placing it ahead of Avatar.

Because the story is a sweeping epic, it’s hard to envision these movies being shorter and still doing justice to the source material. You can even argue that the extended versions of the films, which bring the total length to over 11 hours, fill in missing pieces making the story more enjoyable.

So if a 9 hour movie can be good and an 11 hour movie (possibly) as good or better, would making movies longer make them better?

The Avengers, perhaps the best (or at the very least one of the best) superhero movies ever made, clocks in at 2 hours and 23 minutes. It’s the third top grossing film. Would this movie have been twice as good if it were twice as long? I don’t think so. You could add additional subplots and lengthen the action sequences, but I doubt this would have made the movie better.

On the other hand, every M. Night Shyamalan movie feels like it’s thirty minutes too long. Even The Sixth Sense, which I liked quite a bit, drags at points. Cutting at least ten minutes out of the running time of 1 hour and 47 minutes would have made it a better film.

Why is this?

Pacing.

Stories have rhythm. If it’s too fast, then we wonder why things happen. If it’s too slow, then we wait for things to happen. If it’s just right, then we get caught up in the story and lose track of time.

The Sixth Sense presents too little story in too much time. A point is reached where we understand that Cole is a troubled child who has disturbing supernatural events occurring around him. We also understand that Malcolm, the child psychiatrist who is trying to help him, is troubled in his own way. We’re ready to move on to the next thing—for the characters to begin working on a resolution for their problems—but instead the film keeps focusing on the problem for far too long before finally coming to a resolution and satisfying conclusion.

The editing process, driven by the practicality of run times, generally makes movies better. No amount of editing is going to turn hours and hours of bad footage into an Oscar winner, but it stands to reason that if you take the best footage, sequence it properly, and leave the rest on the cutting room floor, you’ll probably have the best film you can make.

For good films this often means the hard choice of not including some footage. In Alien, a cut scene reveals the fate of Dallas and Brett who had been earlier captured by the xenomorph. It revealed some intriguing information about the alien’s life cycle, but placing it in the middle of Ripley’s frantic escape from the Nostromo before it self-destructed would have disrupted the pacing of that sequence. Similarly in Aliens, there is a cut scene in which Ripley learns that her young daughter lived out her life and died during the 57 years she was adrift in space suspended in hypersleep. This short scene makes it easy to see how Ripley would view Newt as a surrogate daughter, risking her own life to rescue her from the depths of the alien hive. But is it an essential scene to understand Ripley’s motivation? I would say the answer is no. From the interaction shown in the movie, it’s believable that Ripley would risk her life to save this child and thus the scene can be cut without diminishing the film.

What these cuts illustrate is that in well-made movies, every scene should be questioned before being included in the final cut. Does this scene serve a purpose? Is this scene needed to advance the plot? Is the movie better with this scene in it and worse without it?

I wish every book author went through this process, especially once they became successful.

When a film franchise becomes successful, the films don’t typically become longer and longer. If anything increases, it’s usually the budget and production values—once you’ve proven something is successful, spending more money on it is less of a risk.

When a book franchise becomes successful, the same is not always true. I’ve graphed the page counts of a number of fantasy book series I’ve enjoyed reading: Discworld, Xanth, Harry Potter, and Anita Blake: Vampire Hunter. I’ve only graphed the first ten books in these series with the exception of Harry Potter which consists of just seven books.



First, I love the Discworld books written by Terry Pratchett. Every book he writes is just as long as it needs to be. And funny never gets old—his dialogue, descriptions, observations, and plots always ooze with his unique sense of humor. I’ve read 33 out of the 40 books currently in the series and haven’t grown tired of them yet. His books show a modest increase in size as the series became successful with the 33rd being only 50% longer than the first.

The Xanth series by Piers Anthony is one I started reading when I was in high school and I still have very fond memories of them. I started reading them again as an adult and finished 25 out of the 38 currently in the series. The books have always had pun-derived humor, but the latter books began to focus less and less on plot and characters and more and more on puns (which eventually became too pun-ishing for me), so I lost interest in the series. As the graph shows, the length of his first ten books was the most consistent of the four series. This trend continued in his latter books with the 25th being only 9% longer than the first. He clearly has a feel for the length of the stories he wants to tell and crafts his plots to fall within the desired range.

Next is the Harry Potter series by J.K. Rowling. It’s insane to argue about the formula for books as wildly successful as these, but I’m going to remove my tin foil hat and make an attempt.

I can’t think of better examples of how to craft plots than the first three books in this series. Everything in the stories serves a purpose. The pacing is just right and you’re never left waiting for things to happen. The latter books, however, feel bloated in comparison. Subplots abound that serve little purpose in the greater arc of the story and it takes far too long to resolve many issues.

I’m not arguing that the last four books are bad; I enjoyed them. I just think they could have been better. Look at the modest increase in size for books 2 and 3 in the series. Now look at how the page count skyrockets for books 4 and 5. The fifth book in the series is 278% longer than the first. It’s certainly not 278% better.

That’s because more of the same thing is not always the same thing. The experience of eating one piece of candy is not the same as eating one hundred pieces of candy—in one case, you wish you could eat more and in the other you wish you had eaten less.

Making your books twice as long is not twice as much of the same thing—it’s something different. It’s a different formula. It’s New Coke vs. Classic Coke. Even if you like New Coke better, you can’t reasonably argue that it’s the same thing as Classic Coke.

The same is true for the Harry Potter series. The formula for the first three books is clearly different than that for the last four. The expanding plot changed the pacing of the stories for the worse, not the better. It wasn’t enough of a change that I stopped reading the series, but it was enough that I enjoyed the latter stories to a lesser extent.

Finally, there’s the Anita Blake: Vampire Hunter novels by Laurell K. Hamilton, which started out strong, but slowly bloated over time. The tenth was the last I read before giving up on the series. It was 241% longer than the original novel—it was also 241% worse.

Here’s the deal. If you have common scenes in every book you write, you have to be really careful with pacing. Once your reader gets through one elaborate description of someone getting dressed, eating a meal, channeling supernatural power, or pulling out a gun during a tense confrontation, reading similar passages becomes less and less interesting. If your books remain the same length, no worries. But if they become longer and longer by including more and more of the same old thing, then you really need to find an editor who’s willing to challenge everything you write, regardless of how well your books are selling.

As a reader, this is my plea to authors. Edit your books like movies. Set a reasonable limit, say 400 pages, and then trim your story, keeping only the best parts, until it fits within that self-imposed limit. Then add pieces back only if you can justify their inclusion.

I’m not saying you can’t write a masterpiece that would be diminished by the omission of a single word—I’m saying that if you think you can, you probably can’t.

Sunday, September 8, 2013

App of the Day Promotion

Back in February I was approached about offering List! on a “free app of the day” promotion app. The idea is that you make your paid app free for a day during which you have mass exposure to the millions of users of the promotion app. The benefit pitched to developers is that once the app becomes paid again, there will be a significant increase in the number of downloads (4 to 50 times more). In my case, there was no cost for the app promotion other than the lost sales caused by reducing the cost of the app from $1.99 to free. The results of the promotion during the week ending February 10th, 2013 are shown in the following chart.



I’ve graphed the week of the promotion as having no paid downloads—there were over 446,000 app downloads during the week of the promotion, but only a minuscule portion of these would be paid rather than free on the days before and after the promotion. There was a slight bump of 2 to 3 times the usual number of downloads in the following week, but this wasn’t a sustained trend. So in my case, there was no upside to the promotion.

There was a downside to the promotion that I hadn’t fully considered beforehand. The promotion app was not available in the US App Store, so most of the additional exposure from the promotion was in Europe, Central and South America, Russia, and Japan. List! is not localized for languages other than English and the promotion app apparently did not contain any information about the languages supported by the promoted app. So rather than getting downloads from people who were fluent enough with English to consider using the app, there were a number of people who weren't comfortable with English, downloaded the app without knowing it didn't support their native language, and then left one star reviews because of the lack of localization. It wasn't a huge number, but a half dozen reviews is all it takes to get a one star average.

Bottom line: I might consider another promotion like this even though the results were disappointing, but I wouldn't do it unless my app was localized.