Bookmarks for December 31st through January 17th

These are my links for December 31st through January 17th:

  • Khan Academy – The Khan Academy is a not-for-profit organization with the mission of providing a high quality education to anyone, anywhere.

    We have 1000+ videos on YouTube covering everything from basic arithmetic and algebra to differential equations, physics, chemistry, biology and finance which have been recorded by Salman Khan.

  • StarCraft AI Competition | Expressive Intelligence Studio – AI bot warfare competition using a hacked API to run StarCraft, will be held at AIIDE2010 in October 2010.
    The competition will use StarCraft Brood War 1.16.1. Bots for StarCraft can be developed using the Broodwar API, which provides hooks into StarCraft and enables the development of custom AI for StarCraft. A C++ interface enables developers to query the current state of the game and issue orders to units. An introduction to the Broodwar API is available here. Instructions for building a bot that communicates with a remote process are available here. There is also a Forum. We encourage submission of bots that make use of advanced AI techniques. Some ideas are:
    * Planning
    * Data Mining
    * Machine Learning
    * Case-Based Reasoning
  • Measuring Measures: Learning About Statistical Learning – A "quick start guide" for statistical and machine learning systems, good collection of references.
  • Berkowitz et al : The use of formal methods to map, analyze and interpret hawala and terrorist-related alternative remittance systems (2006) – Berkowitz, Steven D., Woodward, Lloyd H., & Woodward, Caitlin. (2006). Use of formal methods to map, analyze and interpret hawala and terrorist-related alternative remittance systems. Originally intended for publication in updating the 1988 volume, eds., Wellman and Berkowitz, Social Structures: A Network Approach (Cambridge University Press). Steve died in November, 2003. See Barry Wellman’s “Steve Berkowitz: A Network Pioneer has passed away,” in Connections 25(2), 2003. It has not been possible to add the updating of references or of the quality of graphics that might have been possible if Berkowitz were alive. An early version of the article appeared in the Proceedings of the Session on Combating Terrorist Networks: Current Research in Social Network Analysis for the New War Fighting Environment. 8th International Command and Control Research and Technology Symposium. National Defense University, Washington, D.C., June 17-19, 2003.
  • SSH Tunneling through web filters | – Step by step tutorial on using Putty and an EC2 instance to set up a private web proxy on demand.
  • PyDroid GUI automation toolkit – GitHub – What is Pydroid?

    Pydroid is a simple toolkit for automating and scripting repetitive tasks, especially those involving a GUI, with Python. It includes functions for controlling the mouse and keyboard, finding colors and bitmaps on-screen, as well as displaying cross-platform alerts.
    Why use Pydroid?

    * Testing a GUI application for bugs and edge cases
      o You might think your app is stable, but what happens if you press that button 5000 times?
    * Automating games
      o Writing a script to beat that crappy flash game can be so much more gratifying than spending hours playing it yourself.
    * Freaking out friends and family
      o Well maybe this isn't really a practical use, but…

  • Time Series Data Library – More data sets – "This is a collection of about 800 time series drawn from many different fields. Agriculture, Chemistry, Crime, Demography, Ecology, Finance, Health, Hydrology, Industry, Labour Market, Macro-Economics, Meteorology, Micro-Economics, Miscellaneous, Physics, Production, Sales, Simulated series, Sport, Transport & Tourism, Tree-rings, Utilities"
  • How informative is Twitter? » SemanticHacker Blog – "We undertook a small study to characterize the different types of messages that can be found on Twitter. We downloaded a sample of tweets over a two-week period using the Twitter streaming API. This resulted in a corpus of 8.9 million messages (“tweets”) posted by 2.6 million unique users. About 2.7 million of these tweets, or 31%, were replies to a tweet posted by another user, while half a million (6%) were retweets. Almost 2 million (22%) of the messages contained a URL."
  • Gremlin – a Turing-complete, graph-based programming language – GitHub – Gremlin is a Turing-complete, graph-based programming language developed in Java 1.6+ for key/value-pair multi-relational graphs known as property graphs. Gremlin makes extensive use of the XPath 1.0 language to support complex graph traversals. This language has applications in the areas of graph query, analysis, and manipulation. Connectors exist for the following data management systems:

    * TinkerGraph in-memory graph
    * Neo4j graph database
    * Sesame 2.0 compliant RDF stores
    * MongoDB document database

    The documentation for Gremlin can be found at this location. Finally, please visit TinkerPop for other software products.

  • The C Programming Language: 4.10 – by Kernighan & Ritchie & Lovecraft –
    void Rlyeh (int mene[], int wgah, int nagl) {
        int Ia, fhtagn;
        if (wgah >= nagl) return;
        swap (mene, wgah, (wgah+nagl)/2);
        fhtagn = wgah;
        for (Ia = wgah+1; Ia <= nagl; Ia++)
            if (mene[Ia] < mene[wgah])
                swap (mene, ++fhtagn, Ia);
        swap (mene, wgah, fhtagn);
        Rlyeh (mene, wgah, fhtagn-1);
        Rlyeh (mene, fhtagn+1, nagl);
    }


  • How to convert email addresses into name, age, ethnicity, sexual orientation – This is so Meta – "Save your email list as a CSV file (just comma separate those email addresses). Upload this file to your facebook account as if you wanted to add them as friends. Voila, facebook will give you all the profiles of all those users (in my test, about 80% of my email lists have facebook profiles). Now, click through each profile, and because of the new default facebook settings, which makes all information public, about 95% of the user info is available for you to harvest."
  • Microsoft Security Development Lifecycle (SDL): Tools Repository – A collection of previously internal-only security tools from Microsoft, including anti-xss, fuzz test, fxcop, threat modeling, binscope, now available for free download.
  • Analytics X Prize – Home – Forecast the murder rate in Philadelphia – The Analytics X Prize is an ongoing contest to apply analytics, modeling, and statistics to solve the social problems that affect our cities. It combines the fields of statistics, mathematics, and social science to understand the root causes of dysfunction in our neighborhoods. Understanding these relationships and discovering the most highly correlated variables allows us to deploy our limited resources more effectively and target the variables that will have the greatest positive impact on improvement.
  • PeteSearch: How to find user information from an email address – FindByEmail code released as open-source. You pass it an email address, and it queries 11 different public APIs to discover what information those services have on the user with that email address.
  • Measuring Measures: Beyond PageRank: Learning with Content and Networks – Conclusion: learning based on content and network data is the current state of the art. There is a great paper and talk about personalization in Google News: they use content for this purpose, and then user click streams to provide personalization, i.e. recommend specific articles within each topical cluster. The issue is content filtering is typically (as we say in research) "way harder." Suppose you have a social graph, a bunch of documents, and you know that some users in the social graph like some documents, and you want to recommend other documents that you think they will like. Using approaches based on networks, you might consider clustering users based on co-visitation (they have co-liked some of the documents). This scales great, and it internationalizes great. If you start extracting features from the documents themselves, then what you build for English may not work as well for the Chinese market. In addition, there is far more data in the text than there is in the social graph.
  • mikemaccana’s python-docx at master – GitHub – MIT-licensed Python library to read/write Microsoft Word docx format files. "The docx module reads and writes Microsoft Office Word 2007 docx files. These are referred to as 'WordML', 'Office Open XML' and 'Open XML' by Microsoft. They can be opened in Microsoft Office 2007, Microsoft Mac Office 2008, 2.2, and Apple iWork 08. The module was created when I was looking for a Python support for MS Word .doc files, but could only find various hacks involving COM automation, calling .net or Java, or automating OpenOffice or MS Office."
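
The co-visitation idea in the Measuring Measures excerpt above (users who have co-liked some of the same documents) can be sketched in a few lines. This is only a toy illustration under stated assumptions: the `likes` data, the `jaccard` and `recommend` helpers, and the `min_sim` threshold are all made up here, and it uses a simple neighbor-based score rather than the clustering the excerpt mentions.

```python
# Toy co-visitation sketch. All names and data are hypothetical.
# Each user maps to the set of documents they have liked.
likes = {
    "ann": {"d1", "d2", "d3"},
    "bob": {"d2", "d3", "d4"},
    "cat": {"d7", "d8"},
    "dan": {"d3", "d4", "d5"},
}


def jaccard(a, b):
    """Overlap of two like-sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, likes, min_sim=0.1):
    """Rank documents the user has not liked, weighted by how similar
    the users who liked them are to this user."""
    scores = {}
    for other, docs in likes.items():
        if other == user:
            continue
        sim = jaccard(likes[user], docs)
        if sim < min_sim:
            continue  # no meaningful co-liking with this user
        for doc in docs - likes[user]:
            scores[doc] = scores.get(doc, 0.0) + sim
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


print(recommend("ann", likes))
```

Nothing here looks at document text, which is why this kind of signal carries across languages, as the excerpt notes.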

Bookmarks for May 5th through May 6th

These are my links for May 5th through May 6th:

Bookmarks for April 30th through May 2nd

These are my links for April 30th through May 2nd:

  • FusionCharts Free – Animated Flash Charts and Graphs for ASP, PHP, ASP.NET, JSP, RoR and other web applications – Flash charting component that can be used to render data-driven & animated charts for your web applications and presentations. It is a cross-browser and cross-platform solution that can be used with PHP, Python, Ruby on Rails, ASP, ASP.NET, JSP, ColdFusion, simple HTML pages or even PowerPoint Presentations to deliver interactive and powerful flash charts. You do NOT need to know anything about Flash to use FusionCharts. All you need to know is the language you're programming in.
  • Raphaël—JavaScript Library – Raphaël is a small JavaScript library that should simplify your work with vector graphics on the web. If you want to create your own specific chart or image crop and rotate widget, for example, you can achieve it simply and easily with this library. Raphaël uses the SVG W3C Recommendation and VML as a base for creating graphics. This means every graphical object you create is also a DOM object, so you can attach JavaScript event handlers or modify them later. Raphaël’s goal is to provide an adapter that will make drawing vector art compatible cross-browser and easy.
  • A Really Gentle Introduction to Data Mining | Regular Geek – List of data mining blogs and related resources.
  • BlackBerry SSH Tutorial: Connect to Unix Server using MidpSSH for Mobile Devices – Notes on using MidpSSH on Blackberry for remote access to servers. Seems to work, although big network lag on my BlackBerry Bold / AT&T.
  • Country Reports on Terrorism 2008 – U.S. law requires the Secretary of State to provide Congress, by April 30 of each year, a full and complete report on terrorism with regard to those countries and groups meeting criteria set forth in the legislation. This annual report is entitled Country Reports on Terrorism. Beginning with the report for 2004, it replaced the previously published Patterns of Global Terrorism.
  • DIY: How To Find Authoritative Twitter Users Plus 100 To Get You Started | Ignite Social Media – Some comments on recommendation metrics for Twitter, trying to use "favorites" mark as an indicator.
  • SIGUSR2 > The Power That is GNU Emacs – "If you've never been convinced before that Emacs is the text editor in which dreams are made from, or that inside Emacs there are unicorns manipulating your text, don't expect me to convince you."

Bookmarks for April 28th through April 29th

These are my links for April 28th through April 29th:

Bookmarks for April 12th from 17:02 to 19:13

These are my links for April 12th from 17:02 to 19:13:

Bookmarks for April 11th through April 12th

These are my links for April 11th through April 12th:

  • Wordle – Beautiful Word Clouds – Wordle is a toy for generating “word clouds” from text that you provide. The clouds give greater prominence to words that appear more frequently in the source text. You can tweak your clouds with different fonts, layouts, and color schemes.
  • The dark side of Dubai – Johann Hari, Commentators – The Independent – "Dubai was meant to be a Middle-Eastern Shangri-La, a glittering monument to Arab enterprise and western capitalism. But as hard times arrive in the city state that rose from the desert sands, an uglier story is emerging."
  • Topless Robot – Hot Girls Have Lightsaber Strip-Fight for Your Viewing Pleasure – Star Wars CGI meets fake body spray ad
  • Poll Result: Best VPN to leap China’s Great Firewall? – Thomas Crampton –
    * Witopia – Undisputed winner. Quality of service, speed of surfing, though it is said to be relatively expensive at US$50 to US$60 per year.
    * Hotspot Shield – Bandwidth limits can be painful. Force you to wait until the next month if you use it too much.
    * Ultrasurf
    * StrongVPN
  • InfoQ: Facebook: Science and the Social Graph – In this presentation filmed during QCon SF 2008 (November 2008), Aditya Agarwal discusses Facebook’s architecture, more exactly the software stack used, presenting the advantages and disadvantages of its major components: LAMP (PHP, MySQL), Memcache, Thrift, Scribe.
  • The Running Man, Revisited § SEEDMAGAZINE.COM – a handful of scientists think that these ultra-marathoners are using their bodies just as our hominid forbears once did, a theory known as the endurance running hypothesis (ER). ER proponents believe that being able to run for extended lengths of time is an adapted trait, most likely for obtaining food, and was the catalyst that forced Homo erectus to evolve from its apelike ancestors.

Bookmarks for March 12th through March 16th

These are my links for March 12th through March 16th:

Bookmarks for March 3rd from 05:48 to 12:10

These are my links for March 3rd from 05:48 to 12:10:

Global Markets Daily Trading Schedule


Global financial markets are linked more closely than ever. Here’s a crib sheet of a few markets of interest and their opening/closing times in Pacific Time (US West Coast).

PST      EST      Market
12:00am  3:00am   London stock exchange open (8:00am local)
                  Frankfurt stock exchange open (9:00am local)
                  Hong Kong stock exchange afternoon session close (4:00pm local)
1:00am   4:00am   Singapore stock exchange afternoon session close (5:00pm local)
3:30am   6:30am   Bombay stock exchange close (3:30pm local)
5:00am   8:00am   US ECN premarket open
6:30am   9:30am   NYSE, NASDAQ, AMEX, TSE markets open
8:30am   11:30am  London stock exchange close (4:30pm local)
                  Frankfurt stock exchange close (5:30pm local)
1:00pm   4:00pm   NYSE, NASDAQ, AMEX, TSE markets close; US afterhours ECN trading open
1:15pm   4:15pm   CME close (US Globex electronic futures daily close)
1:30pm   4:30pm   CME open (US Globex electronic futures open)
3:00pm   6:00pm   CME Sunday/Holiday open (US Globex electronic futures weekly open)
4:00pm   7:00pm   Tokyo stock exchange morning session open (9:00am local)
                  Korean stock exchange open (9:00am local)
                  Australian stock exchange open (10:00am local)
5:00pm   8:00pm   Singapore stock exchange morning session open (9:00am local)
                  Taiwan stock exchange open (9:00am local)
                  US afterhours ECN trading close
5:30pm   8:30pm   Shanghai stock exchange morning session open (9:30am local)
6:00pm   9:00pm   Hong Kong stock exchange morning session open (10:00am local)
                  Tokyo stock exchange morning session close (11:00am local)
7:30pm   10:30pm  Tokyo stock exchange afternoon session open (12:30pm local)
                  Shanghai stock exchange morning session close (11:30am local)
8:30pm   11:30pm  Hong Kong stock exchange morning session close (12:30pm local)
                  Singapore stock exchange morning session close (12:30pm local)
9:00pm   12:00am  Shanghai stock exchange afternoon session open (1:00pm local)
9:25pm   12:25am  Bombay stock exchange open (9:55am local)
10:00pm  1:00am   Tokyo stock exchange afternoon session close (3:00pm local)
                  Australian stock exchange close (4:00pm local)
                  Singapore stock exchange afternoon session open (2:00pm local)
10:15pm  1:15am   Korean stock exchange close (3:15pm local)
10:30pm  1:30am   Hong Kong stock exchange afternoon session open (2:30pm local)
                  Taiwan stock exchange close (1:30pm local)
11:00pm  2:00am   Shanghai stock exchange afternoon session close (3:00pm local)


There are often interesting interactions at major market opens and closes, especially during the overlap between the US market open and the European market close. In addition, US index futures, particularly the ES (S&P 500 e-mini futures), trade nearly around the clock, closing only for daily settlement between 4:15 and 4:30pm US East Coast time (plus weekends and holidays).

Note that many Asian markets have morning and afternoon sessions, and close for lunch. Different countries also observe differing practices with respect to Daylight Saving Time, so the relative timing may change seasonally. You may find it useful to check with a World Clock for the current times. Also remember that Asia begins its week on Sunday evening in the US, and is closed for the week by the time it’s Friday in the US.
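
As an alternative to a World Clock, the seasonal shift can be checked with a short script. This is a rough sketch, assuming Python 3.9+ with the standard `zoneinfo` module and a system time zone database. The `to_us_times` helper and the sample dates are made up for illustration, and the local session times are simply the ones listed in the table above.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+; needs the system tz database


def to_us_times(year, month, day, hour, minute, exchange_tz):
    """Convert a local exchange time to US Pacific and US Eastern time.

    Hypothetical helper: exchange_tz is an IANA zone name such as
    "Asia/Tokyo" or "Europe/London".
    """
    local = datetime(year, month, day, hour, minute, tzinfo=ZoneInfo(exchange_tz))
    pacific = local.astimezone(ZoneInfo("America/Los_Angeles"))
    eastern = local.astimezone(ZoneInfo("America/New_York"))
    return pacific, eastern


def show(label, year, month, day, hour, minute, tz):
    pacific, eastern = to_us_times(year, month, day, hour, minute, tz)
    fmt = "%a %I:%M%p %Z"
    print(f"{label}: {pacific.strftime(fmt)} / {eastern.strftime(fmt)}")


# Tokyo opens at 9:00am local on a Monday. Japan does not observe
# daylight saving time, so the US-side time moves by an hour in summer.
show("Tokyo open, January", 2006, 1, 9, 9, 0, "Asia/Tokyo")
show("Tokyo open, July   ", 2006, 7, 10, 9, 0, "Asia/Tokyo")

# London opens at 8:00am local. The UK and US both shift clocks, so the
# gap is the same on these two dates (it differs briefly around the changeover).
show("London open, January", 2006, 1, 9, 8, 0, "Europe/London")
show("London open, July   ", 2006, 7, 10, 8, 0, "Europe/London")
```

The January Tokyo result lands on Sunday afternoon Pacific time, which is the "Asia begins its week on Sunday evening in the US" effect.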

On being thankful for clean water and food (and alpacas)

Around this time of year I usually review our charitable contributions. This year I’ve enlisted my 10-year-old daughter in part of the review process. We recently received a donation “catalog” from World Vision, which lists a range of targeted donations for livestock, medicine, education, water, and other basic needs. I’ve given her the responsibility to read through the catalog, learn about the various needs, and choose something that we will fund. (Along similar lines, last year some of the kids at her school had a project to buy a cow at Heifer International.)

It can be hard for kids (and grownups) living here to relate to the idea of scarce and/or unsafe water, subsistence farming, or a general absence of health, education, and basic physical and economic security. Spending time travelling in and around the developing world has given me a greater appreciation for the mundane efficiency of everyday life when I return home. (Drinkable tap water, stable electricity, Whole Foods, etc.)

I’m inclined toward systemic solutions and assistance rather than one-time fixes, so I’m biased toward providing aid that enables people to help themselves. This obviously doesn’t work for disaster relief, but I’m starting to think of that as a separate recurring category of its own.

Overall, here’s where we allocate most of our donations, from local to global:

  • Our local public elementary school (California state funding is lame)
  • Various community funds (helps agencies here in Palo Alto)
  • Our church (keeps it running, and funds external national and global programs)
  • My high school (boarding school, which I attended on financial aid)
  • My college (which I attended on financial aid)
  • Local United Way (designated to regional agencies)
  • Various global charities (different ones from year to year)

Emily is apparently leaning towards an alpaca, because they’re soft and furry. I’m trying to make a case for one of the water-related items, although at this point I figure I’m happy simply having a framework for a discussion with her about life in the rest of the world.

Charity Navigator is a good resource for checking out charitable organizations. At the moment I like the Grameen Foundation and World Vision for global programs.

See also: Merry-go-round and see-saw powered water pumps, DIY UV Water Treatment System, Voltage Stabilizers and Hidden Costs of Rural ICT

North Korea tests a nuclear bomb?

North Korea has been threatening to test a nuclear weapon recently, and may have done so a couple of hours ago.

The test is “unconfirmed” at the moment, but South Korea says it detected seismic activity measuring 3.5 on the Richter scale at 0136GMT, or 10:36AM Korea local time. The presumed test site is underground, in a coal mine in Gilju.

It’s surprising to me that, given the advance warning, there isn’t an official confirmation of whether or not there was a nuclear test. There’s probably no shortage of equipment set up to monitor the situation, and I would expect a different signature from a nuclear explosion than from setting off a huge pile of RDX at the bottom of a mine.

There’s no shortage of countries that could build nuclear weapons if they wanted. South Korea and Japan in particular come to mind at the moment. More problematic would be Kim Jong-Il making a deal with Iran’s Mahmoud Ahmadinejad (or someone similar) to trade oil and hard currency for nuclear weapons technology.

Harmony and Disharmony – Organizational issues in Al-Qaida and startups

There’s an interesting new report out today from the Combating Terrorism Center at West Point (the US Military Academy), titled “Harmony and Disharmony: Exploiting Al-Qa’ida’s Organizational Vulnerabilities”, which has some useful insights for entrepreneurs and corporate managers as well as for those dealing with global jihadist movements or with a general interest in global security issues.

The report is based on a collection of captured documents which have been recently declassified, and examines some of the strengths and weaknesses of the Al-Qa’ida organizational structure. The merits of a 21st-century, networked, mobile, internet-enabled insurgency have been observed elsewhere at length, as summarized by James Na at Korea Liberator:

Martin van Creveld of Hebrew University, the author of the highly influential Transformation of War who has been lauded (including by me) as a leading prophet of military transformation, even went on to suggest that the small/weak would always beat the big/strong in a long war. (The stronger side is more constrained in methods; it also loses morale more rapidly from inability to defeat the weak completely over a long period of time; on the other hand, the weaker side often enjoys a more flexible, networked organization, and has a faster decision making cycle, i.e. the OODA loop).

The captured documents (available online in both original Arabic and translated English) have a remarkably familiar feel to them. Take out the parts about politics, religion, and carrying out jihad, and it looks kind of like an odd startup, with position descriptions (“must have work experience of no less than 5 years and have complete military operational experience in the battlefront and bases”), employment contracts (“vacation requests must be submitted two and a half months before the travel date”), and bylaws (“Goals – To spread the feeling of Jihad throughout the Muslim nation”).

Part of what makes the report interesting is that it’s based on Al-Qa’ida’s own self-assessment of what’s working and what isn’t working. Here are some sample items from a post-mortem summary of Al-Qa’ida’s experience in Syria:

  1. Absence of an advanced comprehensive plan and strategy
  2. The faithful mujahideen were spread among numerous organizations
  3. Failure to explain the mujahid revolutionary theory and clarify it’s objectives on an ideological level
  4. Low level of religious instruction and scarcity of revolutionary and political awareness
  5. Dependence on quantity after the 1st blow did away with the quality
  6. Weak public relations campaign both inside and out
  7. Dependence of the mujahideen on outside sources for support instead of being self-sufficient
  8. Getting bogged down in long term gang warfare unsuitable for the country
  9. Moving out of the country for an extended period of time, losing touch with the masses, and the decline of the religious and revolutionary level among the members
  10. Not benefiting from the Islamic and international gang warfare experiences
  11. Dealing with the neighboring regimes as if they were permanent supporters of jihad
  12. Operating publicly was a grave error
  13. Deficiency of military operations on the outside and failure to deter the enemy and their friends
  14. No planning for the aftermath of the regime
  15. Not rallying around the religious scholars and benefiting from them

A lot of this looks like the “before” part of a management consulting project.

Some items here remind me of Noel Tichy’s views on management, on the need for aligning ideas and values to achieve effective action within the organization. At the same time, many of their operational problems are linked to “agency” problems, where individuals or affiliates have an incentive to act in their own interest rather than the organization’s. These problems get worse in the presence of personal risk and operational secrecy. This tends not to happen as much in companies, but there are still spectacular failures from time to time (think Enron’s SPEs).

If you’re interested in thinking about startup organizations and competition from a very different perspective, check it out.

Update 02-14-2006 23:37 PST: You may also be interested in “Unrestricted Warfare“, on asymmetric warfare, a 1999 paper by senior Chinese PLA officers, and Scott Maxwell’s recent series of posts, “How David can beat Goliath“.

Update 03-08-2006 10:58 PST: You may be interested in “Stealing Al-Qaida’s Playbook” which reviews other writings from active jihadists, also from the Combating Terrorism Center, although it’s probably less useful in a business context than the ideas on asymmetric warfare.

Some disconnects in China GDP reporting

SimonWorld notes some analysis in the South China Morning Post on last week’s official Chinese GDP numbers.

Last week China reported another stunning GDP growth number of 9.4%. But as we’ve found numerous times before, the numbers underlying the GDP calculation don’t add up. Either China’s consumers went on strike or fixed asset investment has been over-estimated. Jake van der Kamp is on the case and reaches an unsurprisingly but important conclusion:

…It seems from this that in the year to September the man on the street spent 17 per cent less on daily necessities and toys than he did the previous year. But this is not what other official statistics say. They say that retail spending for the year to September was 13.6 per cent greater than it was the previous year (the blue line) and that this retail spending alone was almost twice as great as the remainder number we calculated for all personal consumption spending.

How is it possible?

It is not. The latest GDP figures from the mainland simply do not add up. I hesitate to use the word “rubbish” to describe them but I am starved of a better one.

I think the enormous discrepancy most likely results from an overstatement of fixed asset investment. Capital spending probably is much less than the National Bureau of Statistics says it is. This would imply something else again, however. It would suggest that a vast amount of money earmarked for capital projects was embezzled by corrupt officials and used instead for personal spending on luxury services and toys.

I shall not suggest that this surprises you.

Every second anecdote from the mainland tells you it happens every day. All I have done is put some possible numbers to the scale of it, a very big scale indeed. But I do suggest to the National Bureau of Statistics that it adopt a brand new approach for checking statistics, a new one to the bureau that is. The next time it publishes data it might want to check that the sum of the parts adds up to a given total.

It’s always worth taking any government report with a grain of salt. In particular, no one actually believes the numbers published by the Chinese government. But this offers a glimpse at some of the specific disconnects in the official reporting system. Plus, “rubbish” as financial commentary is pretty entertaining.

For reference, 15 trillion yuan is roughly US$1.85 trillion. The US GDP in 2004 was roughly US$11.75 trillion.
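
The arithmetic behind those reference figures can be checked quickly. This is a small sketch using only the rounded numbers above. The exchange rate is not stated in the post, so it is back-calculated here and should be read as an implied figure, not an official rate.

```python
# Rounded figures from the text above.
china_yuan = 15e12      # roughly 15 trillion yuan
china_usd = 1.85e12     # roughly US$1.85 trillion
us_gdp_usd = 11.75e12   # US GDP in 2004, roughly US$11.75 trillion

# Assumption: the rate is derived from the two China figures, not quoted.
implied_rate = china_yuan / china_usd   # yuan per US dollar
share_of_us = china_usd / us_gdp_usd    # China figure relative to US GDP

print(f"Implied exchange rate: {implied_rate:.2f} yuan per dollar")
print(f"China figure as a share of US GDP: {share_of_us:.1%}")
```

With these inputs the implied rate comes out at about 8.11 yuan per dollar, and the China figure is a little under 16% of the US number.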

Online Currency Trading at

Although there are a lot of online brokerages, there aren’t a lot of easy ways to trade foreign currency. I came across this last summer in an article in Barron’s.

Update 09-22-2005 21:16 PDT: profile of at alarm:clock