Bookmarks for December 31st through January 17th

These are my links for December 31st through January 17th:

  • Khan Academy – The Khan Academy is a not-for-profit organization with the mission of providing a high quality education to anyone, anywhere.

    We have 1000+ videos on YouTube covering everything from basic arithmetic and algebra to differential equations, physics, chemistry, biology and finance which have been recorded by Salman Khan.

  • StarCraft AI Competition | Expressive Intelligence Studio – An AI bot warfare competition that uses a hacked API to run StarCraft; it will be held at AIIDE2010 in October 2010.
    The competition will use StarCraft Brood War 1.16.1. Bots for StarCraft can be developed using the Broodwar API, which provides hooks into StarCraft and enables the development of custom AI for StarCraft. A C++ interface enables developers to query the current state of the game and issue orders to units. An introduction to the Broodwar API is available here. Instructions for building a bot that communicates with a remote process are available here. There is also a Forum. We encourage submission of bots that make use of advanced AI techniques. Some ideas are:
    * Planning
    * Data Mining
    * Machine Learning
    * Case-Based Reasoning
  • Measuring Measures: Learning About Statistical Learning – A "quick start guide" for statistical and machine learning systems, with a good collection of references.
  • Berkowitz et al : The use of formal methods to map, analyze and interpret hawala and terrorist-related alternative remittance systems (2006) – Berkowitz, Steven D., Woodward, Lloyd H., & Woodward, Caitlin. (2006). Use of formal methods to map, analyze and interpret hawala and terrorist-related alternative remittance systems. Originally intended for publication in updating the 1988 volume, eds., Wellman and Berkowitz, Social Structures: A Network Approach (Cambridge University Press). Steve died in November, 2003. See Barry Wellman’s “Steve Berkowitz: A Network Pioneer has passed away,” in Connections 25(2), 2003. It has not been possible to add the updating of references or of the quality of graphics that might have been possible if Berkowitz were alive. An early version of the article appeared in the Proceedings of the Session on Combating Terrorist Networks: Current Research in Social Network Analysis for the New War Fighting Environment. 8th International Command and Control Research and Technology Symposium. National Defense University, Washington, D.C., June 17-19, 2003.
  • SSH Tunneling through web filters – Step by step tutorial on using Putty and an EC2 instance to set up a private web proxy on demand.
  • PyDroid GUI automation toolkit – GitHub – What is Pydroid?

    Pydroid is a simple toolkit for automating and scripting repetitive tasks, especially those involving a GUI, with Python. It includes functions for controlling the mouse and keyboard, finding colors and bitmaps on-screen, as well as displaying cross-platform alerts.
    Why use Pydroid?

    * Testing a GUI application for bugs and edge cases
      o You might think your app is stable, but what happens if you press that button 5000 times?
    * Automating games
      o Writing a script to beat that crappy flash game can be so much more gratifying than spending hours playing it yourself.
    * Freaking out friends and family
      o Well maybe this isn't really a practical use, but…

  • Time Series Data Library – More data sets – "This is a collection of about 800 time series drawn from many different fields: Agriculture, Chemistry, Crime, Demography, Ecology, Finance, Health, Hydrology, Industry, Labour Market, Macro-Economics, Meteorology, Micro-Economics, Miscellaneous, Physics, Production, Sales, Simulated series, Sport, Transport & Tourism, Tree-rings, Utilities."
  • How informative is Twitter? » SemanticHacker Blog – "We undertook a small study to characterize the different types of messages that can be found on Twitter. We downloaded a sample of tweets over a two-week period using the Twitter streaming API. This resulted in a corpus of 8.9 million messages (“tweets”) posted by 2.6 million unique users. About 2.7 million of these tweets, or 31%, were replies to a tweet posted by another user, while half a million (6%) were retweets. Almost 2 million (22%) of the messages contained a URL."
  • Gremlin – a Turing-complete, graph-based programming language – GitHub – Gremlin is a Turing-complete, graph-based programming language developed in Java 1.6+ for key/value-pair multi-relational graphs known as property graphs. Gremlin makes extensive use of the XPath 1.0 language to support complex graph traversals. This language has applications in the areas of graph query, analysis, and manipulation. Connectors exist for the following data management systems:

    * TinkerGraph in-memory graph
    * Neo4j graph database
    * Sesame 2.0 compliant RDF stores
    * MongoDB document database

    The documentation for Gremlin can be found at this location. Finally, please visit TinkerPop for other software products.

  • The C Programming Language: 4.10 – by Kernighan & Ritchie & Lovecraft –

    void Rlyeh (int mene[], int wgah, int nagl) {
        int Ia, fhtagn;
        if (wgah>=nagl) return;
        swap (mene,wgah,(wgah+nagl)/2);
        fhtagn = wgah;
        for (Ia=wgah+1; Ia<=nagl; Ia++)
            if (mene[Ia]<mene[wgah])
                swap (mene,++fhtagn,Ia);
        swap (mene,wgah,fhtagn);
        Rlyeh (mene,wgah,fhtagn-1);
        Rlyeh (mene,fhtagn+1,nagl);
    }
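
For anyone who wants to run the excerpt above: it calls swap without defining it, so here is a minimal sketch of a compilable version. The swap helper, the names rlyeh_sort and is_sorted, and the mapping back to conventional names (mene → v, wgah → left, nagl → right, Ia → i, fhtagn → last) are my assumptions, not part of the quoted text.

```c
/* swap: exchange v[i] and v[j].
   Assumed helper; the excerpt calls swap but does not define it. */
static void swap(int v[], int i, int j)
{
    int tmp = v[i];
    v[i] = v[j];
    v[j] = tmp;
}

/* rlyeh_sort: sort v[left]..v[right] into increasing order.
   Same logic as the excerpt, with the names mapped as
   mene -> v, wgah -> left, nagl -> right, Ia -> i, fhtagn -> last. */
void rlyeh_sort(int v[], int left, int right)
{
    int i, last;

    if (left >= right)                  /* fewer than two elements */
        return;
    swap(v, left, (left + right) / 2);  /* move the middle element to v[left] as pivot */
    last = left;
    for (i = left + 1; i <= right; i++) /* partition: smaller elements go to the front */
        if (v[i] < v[left])
            swap(v, ++last, i);
    swap(v, left, last);                /* put the pivot in its final position */
    rlyeh_sort(v, left, last - 1);
    rlyeh_sort(v, last + 1, right);
}

/* is_sorted: return 1 if v[0..n-1] is in nondecreasing order, else 0.
   Hypothetical helper, only here to check the result. */
int is_sorted(const int v[], int n)
{
    int i;

    for (i = 1; i < n; i++)
        if (v[i - 1] > v[i])
            return 0;
    return 1;
}
```

To sort an n-element array in place, call rlyeh_sort(a, 0, n - 1). The last argument is the index of the final element, not the length.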


  • How to convert email addresses into name, age, ethnicity, sexual orientation – This is so Meta – "Save your email list as a CSV file (just comma separate those email addresses). Upload this file to your facebook account as if you wanted to add them as friends. Voila, facebook will give you all the profiles of all those users (in my test, about 80% of my email lists have facebook profiles). Now, click through each profile, and because of the new default facebook settings, which makes all information public, about 95% of the user info is available for you to harvest."
  • Microsoft Security Development Lifecycle (SDL): Tools Repository – A collection of previously internal-only security tools from Microsoft, including Anti-XSS, fuzz testing, FxCop, threat modeling, and BinScope, now available for free download.
  • Analytics X Prize – Home – Forecast the murder rate in Philadelphia – The Analytics X Prize is an ongoing contest to apply analytics, modeling, and statistics to solve the social problems that affect our cities. It combines the fields of statistics, mathematics, and social science to understand the root causes of dysfunction in our neighborhoods. Understanding these relationships and discovering the most highly correlated variables allows us to deploy our limited resources more effectively and target the variables that will have the greatest positive impact on improvement.
  • PeteSearch: How to find user information from an email address – FindByEmail code released as open-source. You pass it an email address, and it queries 11 different public APIs to discover what information those services have on the user with that email address.
  • Measuring Measures: Beyond PageRank: Learning with Content and Networks – Conclusion: learning based on content and network data is the current state of the art. There is a great paper and talk about personalization in Google News: they use content for this purpose, and then user click streams to provide personalization, i.e. recommend specific articles within each topical cluster. The issue is content filtering is typically (as we say in research) "way harder." Suppose you have a social graph, a bunch of documents, and you know that some users in the social graph like some documents, and you want to recommend other documents that you think they will like. Using approaches based on Networks, you might consider clustering users based on co-visitation (they have co-liked some of the documents). This scales great, and it internationalizes great. If you start extracting features from the documents themselves, then what you build for English may not work as well for the Chinese market. In addition, there is far more data in the text than there is in the social graph.
  • mikemaccana’s python-docx at master – GitHub – MIT-licensed Python library to read/write Microsoft Word docx format files. "The docx module reads and writes Microsoft Office Word 2007 docx files. These are referred to as 'WordML', 'Office Open XML' and 'Open XML' by Microsoft. They can be opened in Microsoft Office 2007, Microsoft Mac Office 2008, 2.2, and Apple iWork 08. The module was created when I was looking for a Python support for MS Word .doc files, but could only find various hacks involving COM automation, calling .net or Java, or automating OpenOffice or MS Office."

Bookmarks for April 24th through April 27th

These are my links for April 24th through April 27th:

Bookmarks for April 13th through April 15th

These are my links for April 13th through April 15th:

Bookmarks for April 11th through April 12th

These are my links for April 11th through April 12th:

  • Wordle – Beautiful Word Clouds – Wordle is a toy for generating “word clouds” from text that you provide. The clouds give greater prominence to words that appear more frequently in the source text. You can tweak your clouds with different fonts, layouts, and color schemes.
  • The dark side of Dubai – Johann Hari, Commentators – The Independent – "Dubai was meant to be a Middle-Eastern Shangri-La, a glittering monument to Arab enterprise and western capitalism. But as hard times arrive in the city state that rose from the desert sands, an uglier story is emerging."
  • Topless Robot – Hot Girls Have Lightsaber Strip-Fight for Your Viewing Pleasure – Star Wars CGI meets fake body spray ad
  • Poll Result: Best VPN to leap China’s Great Firewall? – Thomas Crampton –
    * Witopia – Undisputed winner. Quality of service, speed of surfing, though it is said to be relatively expensive at US$50 to US$60 per year.
    * Hotspot Shield – Bandwidth limits can be painful. Forces you to wait until the next month if you use it too much.
    * Ultrasurf
    * StrongVPN
  • InfoQ: Facebook: Science and the Social Graph – In this presentation filmed during QCon SF 2008 (November 2008), Aditya Agarwal discusses Facebook’s architecture, more exactly the software stack used, presenting the advantages and disadvantages of its major components: LAMP (PHP, MySQL), Memcache, Thrift, Scribe.
  • The Running Man, Revisited § SEEDMAGAZINE.COM – a handful of scientists think that these ultra-marathoners are using their bodies just as our hominid forbears once did, a theory known as the endurance running hypothesis (ER). ER proponents believe that being able to run for extended lengths of time is an adapted trait, most likely for obtaining food, and was the catalyst that forced Homo erectus to evolve from its apelike ancestors.

Bookmarks for March 3rd from 05:48 to 12:10

These are my links for March 3rd from 05:48 to 12:10:

Bookmarks for March 2nd from 10:48 to 21:40

These are my links for March 2nd from 10:48 to 21:40:

Bookmarks for February 21st from 13:59 to 21:55

These are my links for February 21st from 13:59 to 21:55:

Bookmarks for February 16th through February 17th

These are my links for February 16th through February 17th:

  • Top 100 Network Security Tools – Many, many security testing and hacking tools.
  • FRONTLINE: inside the meltdown: watch the full program – "On Thursday, Sept. 18, 2008, the astonished leadership of the U.S. Congress was told in a private session by the chairman of the Federal Reserve that the American economy was in grave danger of a complete meltdown within a matter of days. "There was literally a pause in that room where the oxygen left," says Sen. Christopher Dodd"
  • The Dark Matter of a Startup – "Every successful startup that I have seen has someone within their ranks that just kinda “does stuff.” No one really knows specifically what they do, but its vital to the success of the startup."
  • Why I Hate Frameworks – "A hammer?" he asks. "Nobody really buys hammers anymore. They're kind of old fashioned…we started selling schematic diagrams for hammer factories, enabling our clients to build their own hammer factories, custom engineered to manufacture only the kinds of hammers that they would actually need."
  • Mining The Thought Stream – Lots of comments about what Twitter is good for and how it will make money, revolving around real/near-time search, analytics, marketing, etc.
  • Understanding Web Operations Culture – the Graph & Data Obsession … – Comparison of traffic at Flickr, Google, and Twitter during the Obama inauguration. "One of the most interesting parts of running a large website is watching the effects of unrelated events affecting user traffic in aggregate."

Women in film

On the occasion of the Academy Awards, an interesting video montage featuring faces of famous actresses. via Instapundit 

I’ve noticed that over the past several years, I basically don’t see movies in theaters, except for kids’ movies. Everything else is either on airplanes, cable, DVD, or online.

Moebius Transformations Revealed

Videos like these would have been handy in my student days. I remember wrestling with hand drawings of transformations and mappings at one point. I should look for some on vector curl and tensors; there are probably some great animations around now.

May as well put this guy in charge of the banks

Another day, another subprime-related fiasco. Today GE Asset Management announced that one of its not-quite-money-market short bond funds, the Enhanced Cash Trust, took a loss from subprime holdings, and is offering customer redemptions at 96 cents on the dollar. Normally these funds are considered to be a higher-yielding version of a money market fund. This would make you pretty unhappy if you were looking for 5%-ish stable returns while waiting for the stock market to settle down.

Along these lines, here are British comedians John Fortune & John Bird chatting about the state of the banking system, Northern Rock, and subprime in another interview of “George Parr, investment banker” from last month.

See also: Subprime crisis explained, by British comedians

Ms. Dewey – Stylish search, with whips, guns, and dating tips

It’s been a while since I’ve come across something I haven’t seen before online. Ms. Dewey fits the bill. It is a Flash-based application combining video clips of actress Janina Gavankar with Windows Live search.

As a search application, it’s fat, slow, and the query results aren’t great. However, as John Battelle observes, “clearly, search ain’t the point.” This is search with a flirty attitude, where the speed and quality of the results aren’t at the top of the priority list.

As short-attention-span theater goes, it’s quite entertaining.

If you can’t think of anything to search for, Ms. Dewey will fidget for a while and eventually reach out and tap on the screen. “Helloooo…type something here…”

It’s far more interesting to try some queries and check out the responses. I spent over half an hour typing in keywords to see what would come up, starting with some of the suggestions from Digg and Channel9. The application provides a semi-random set of video responses based on the search keywords, so you won’t always get the same reaction each time.

The whip and riding crop don’t always appear when you’d think, the lab coat seems to be keyed to science and math (try “partial differential equation”), and I’m not sure what brings on the automatic weapons.

“Ms. Dewey” also has a MySpace page with more video clips. The way the application is constructed, they can probably keep updating and adding responses as long as they want to.

I briefly tried using Ms. Dewey in place of Google, as a working search engine, but it takes too long to respond to a series of queries (have to wait for the video to play) and the search results aren’t great (Live is continuing to improve, though). At the moment this is a fun conceptual experiment.

I wonder if we’ll see a new category of search emphasizing style (entertainment, attitude, sex) over substance (relevance, speed, scope). Today’s version might already work for the occasional search user, but imagine Ms. Dewey with faster, non-blocking search results, a better search UI, and Google’s results. It all vaguely reminds me of a William Gibson novel.

Hey Comcast – I’d take the internet service if you could keep the video running

This week there was a guy from Comcast going door-to-door in our neighborhood, offering promotional rates on their triple play bundle (video, data, voice), and internet service in particular. In general, I’m enthusiastic about the future prospects for combined services from either the cable companies or the telcos, and the Comcast internet service is attractively priced at $19.99 for 6 Mbps down/384 Kbps up, so in theory we are a good prospect for this service.

Unfortunately, I’ve been on the verge of cancelling our Comcast service for months because of sporadic outages. I’m not totally thrilled with my relatively slow PacBell/SBC DSL service (1.5 Mbps down/384 Kbps up), but other than widespread outages due to flooding or power interruptions, it has been quite stable. In contrast, our cable TV service went out for a week last year, and I have observed outages lasting anywhere from a few minutes to an hour or more every month or so since then. I can live without CNBC or Disney Channel, but things can rapidly grind to a halt here without internet service.

The Palo Alto fiber loop passes just a block from here. I should see if it’s gotten any easier and cheaper to set up a connection. The Palo Alto Fiber-to-the-home project seems to be perpetually stalled, but the bandwidth business has been coming back over the past few years. There are enough wireless LANs visible from here that I could probably set up a mini-ISP or bandwidth co-op for the whole neighborhood.

At the end of the day, the main thing I want from an internet service provider is fast, stable performance at a reasonable price. $19.99 is a pretty good price, but Comcast hasn’t shown that it can keep basic video service running yet. Maybe later.

Update 01-21-2007: Ended up installing Comcast internet, but we’re still keeping the DSL service in place to run the office network. Internet video is a lot faster on the cable network, but it’s already been offline once.

Star Trek and the Knights of the Round Table

“Knights, I bid you welcome to your new home…Camelot!”

An amazing mashup of Monty Python’s “Knights of the Round Table” song, with singing and dancing by the cast of the original Star Trek.

If this doesn’t leave you rolling around on the floor, it probably means you’re completely baffled and/or younger than 30 or so.

Try naming the episodes for extra credit…

via The Big Picture

The Matrix, with Muppets

Featuring Kermit, Miss Piggy, and friends. YouTube, via 500 Hats

The Singing Economist – Every Breath Bernanke Takes

Glenn Hubbard, Dean of the Columbia Business School, was recently considered as a potential replacement for Alan Greenspan as head of the US Federal Reserve. Watch this, and think of how much more entertaining the Humphrey-Hawkins reports could be if Hubbard had gotten the job.

You might also be amused by “Dean, Dean, Baby!”
(not Vanilla Ice)

More productions from the Columbia Business School follies

These clearly demonstrate that B-school students have too much spare time on their hands.

Postscript: I see from this post that it’s a close look-alike (a student), not actually Glenn Hubbard.

Update 04-28-2006 9:49 PDT – This clip has been making the rounds pretty quickly. They just showed it on CNBC, where they’ve also got the Columbia Business School students who made the video. 15 minutes of fame…

Bangalore boom, traffic congestion

Today’s (Sunday) San Jose Mercury News features a cover story on Bangalore, India, and draws some parallels with the Bay Area. The headline reads “The tech boom didn’t die. It just moved to India.” I find that I unexpectedly run into people from the Bay Area quite often during trips out there, and there has been amazing growth in salaries and real estate prices which reminds me of late ’99 here. At the same time they seem to be hitting resource limits of various sorts. The water and power supplies can be spotty, the storm drains routinely flood the streets during monsoon season, the roads are overloaded, there’s often a shortage of hotel rooms, and the airport is remarkably bad, considering that so much of the local economy depends on foreign business travel.

Bangalore, the tech center of India, is booming as the Bay Area once did, becoming a world-class hub for tech jobs, economic activity and, increasingly, innovation. While Silicon Valley still retains a hold on high-end tech jobs, countless lower-level positions, particularly in software — and now some sophisticated research and development work — are shifting to this city of 6.5 million in southern India. The emergence of Bangalore — and of India — as a tech power signals a new world economic order that is both opportunity and threat to Silicon Valley.

The article also mentions the traffic (and the fact that it can take an hour to go a few miles). It reminded me to go dig up some video clips I’ve been meaning to do something with. Nothing spectacular, but as I travel, I find the differentness of the mundane aspects of daily life interesting, and there are lots of little things to see in these. (WM9 only, no QuickTime; I don’t have an encoder handy at the moment.)

I wonder if the Mercury News found the same cow that hangs out on Hosur Road. There are a few that are always wandering around along the side of the road; they must live nearby somewhere.

To unplug the cable, or not to unplug the cable, that is the question

Comcast just announced that they’re raising their monthly fee by around 7% starting in January:

The package price will rise by an average of $3.13 per month, from about $44.80 to $47.93. Prices vary depending on the community.

I already pay $49.61 per month (with tax) here in Palo Alto, so the new rate will be around $53 per month. The old rate seems too high for what little we watch in our household, and the new rate is worse.

“Comcast’s Bay Area market prices reflect increasing operating expenses,” said spokesman Andrew Johnson, “as well as investments that Comcast is making to improve the value of the service.” He cited improvements in customer service as well as more programming choices that have come through advances in technology and partnerships with new programming providers.

We haven’t noticed any service improvements, and had already been thinking about getting rid of the subscription. Last month our cable service went out for most of a week, and it didn’t really change our daily routine at all. Over the long weekend I also made some good progress on moving our DVDs and VHS videos onto the house server, so I had pretty much decided to reallocate something less than $600 per year to buying video content and unplug the cable after December.

Another way to think of it is that for the same price, I can subscribe to Netflix, and also purchase 2 or 3 DVDs a month, and still end up ahead.

One sticking point is likely to be Emily’s cartoons on the weekends. Another is that nobody else in our household can get videos to play over the network reliably, which puts a big dent in the convenience factor.

In the meantime, the channel unbundling discussion seems to have come back to life at the FCC, although the a la carte services would probably be even more expensive.

Monty Python – And Now for Something Completely Different

Our family has enjoyed Rowan Atkinson as Mr. Bean in the past, so this weekend I thought I’d see how Monty Python went over with our daughter. I think British humor is partially an acquired taste, but the 4th graders around here seem to have a keen appreciation for the absurd, especially if it involves naked people and/or underwear. A bit of animation doesn’t hurt, either.

And Now For Something Completely Different isn’t really a movie so much as a collection of skits that can be watched (or skipped) separately without missing anything.

A few notes:

  • The Hungarian Tourist was a big hit. Hopefully we won’t end up with all the kids at school saying “My hovercraft is full of eels”.
  • The Man with a Tape Recorder Up His Nose was completely baffling to my daughter, who has never actually seen a tape recorder. We had to pause the DVD for a sidebar discussion.
  • How Not to Be Seen: “Why is everyone getting blown up?” People randomly getting shot, blown up, or having 16 ton weights dropped on them was vaguely confusing to her. We don’t generally watch a lot of PG-13 movies with her, although we will probably make an exception for the new Harry Potter movie.
  • The Dead Parrot, the Biker Grannies, and the Marriage Counselor all went over well. We had another sidebar discussion on what marriage counselors were and the various sorts of “inappropriateness” that were going on…
  • Cartoon naked people, the dancing Venus on the Half Shell, and men in bikinis all got the kid-stamp-of-approval

This is probably not a movie for kids of all ages, but might be entertaining for some. I think we’ll try Monty Python and the Holy Grail before too long.

Star Wreck – In the Pirkinning

The cable guy actually did turn up last week, so we still have cable TV. In the meantime, there are many interesting, non-mass-media video projects online.
A few days ago I got around to fixing Azureus on the house server so I could download Star Wreck – In the Pirkinning using BitTorrent. This is a Finnish-made take-off on Star Trek and Babylon 5, created by a group of motivated fans over a period of seven(!) years. (Wikipedia entry)

Digital video tools became drastically cheaper and better during the project, and the quality of the composited sets and special effects is impressive.

The movie is available (with English subtitles, too) free, under a Creative Commons license.

See also: Rocketboom
