Bookmarks for December 31st through January 17th

These are my links for December 31st through January 17th:

  • Khan Academy – The Khan Academy is a not-for-profit organization with the mission of providing a high quality education to anyone, anywhere.

    We have 1000+ videos on YouTube covering everything from basic arithmetic and algebra to differential equations, physics, chemistry, biology and finance which have been recorded by Salman Khan.

  • StarCraft AI Competition | Expressive Intelligence Studio – AI bot warfare competition using a hacked API to run StarCraft, will be held at AIIDE2010 in October 2010.
    The competition will use StarCraft Brood War 1.16.1. Bots for StarCraft can be developed using the Broodwar API, which provides hooks into StarCraft and enables the development of custom AI for StarCraft. A C++ interface enables developers to query the current state of the game and issue orders to units. An introduction to the Broodwar API is available here. Instructions for building a bot that communicates with a remote process are available here. There is also a Forum. We encourage submission of bots that make use of advanced AI techniques. Some ideas are:
    * Planning
    * Data Mining
    * Machine Learning
    * Case-Based Reasoning
  • Measuring Measures: Learning About Statistical Learning – A "quick start guide" for statistical and machine learning systems, good collection of references.
  • Berkowitz et al : The use of formal methods to map, analyze and interpret hawala and terrorist-related alternative remittance systems (2006) – Berkowitz, Steven D., Woodward, Lloyd H., & Woodward, Caitlin. (2006). Use of formal methods to map, analyze and interpret hawala and terrorist-related alternative remittance systems. Originally intended for publication in updating the 1988 volume, eds., Wellman and Berkowitz, Social Structures: A Network Approach (Cambridge University Press). Steve died in November, 2003. See Barry Wellman’s “Steve Berkowitz: A Network Pioneer has passed away,” in Connections 25(2), 2003. It has not been possible to add the updating of references or of the quality of graphics that might have been possible if Berkowitz were alive. An early version of the article appeared in the Proceedings of the Session on Combating Terrorist Networks: Current Research in Social Network Analysis for the New War Fighting Environment. 8th International Command and Control Research and Technology Symposium. National Defense University, Washington, D.C., June 17-19, 2003.
  • SSH Tunneling through web filters | s-anand.net – Step by step tutorial on using Putty and an EC2 instance to set up a private web proxy on demand.
  • PyDroid GUI automation toolkit – GitHub – What is Pydroid?

    Pydroid is a simple toolkit for automating and scripting repetitive tasks, especially those involving a GUI, with Python. It includes functions for controlling the mouse and keyboard, finding colors and bitmaps on-screen, as well as displaying cross-platform alerts.

    Why use Pydroid?

    * Testing a GUI application for bugs and edge cases
        o You might think your app is stable, but what happens if you press that button 5000 times?
    * Automating games
        o Writing a script to beat that crappy flash game can be so much more gratifying than spending hours playing it yourself.
    * Freaking out friends and family
        o Well maybe this isn't really a practical use, but…

  • Time Series Data Library – More data sets – "This is a collection of about 800 time series drawn from many different fields. Agriculture, Chemistry, Crime, Demography, Ecology, Finance, Health, Hydrology, Industry, Labour Market, Macro-Economics, Meteorology, Micro-Economics, Miscellaneous, Physics, Production, Sales, Simulated series, Sport, Transport & Tourism, Tree-rings, Utilities"
  • How informative is Twitter? » SemanticHacker Blog – "We undertook a small study to characterize the different types of messages that can be found on Twitter. We downloaded a sample of tweets over a two-week period using the Twitter streaming API. This resulted in a corpus of 8.9 million messages (“tweets”) posted by 2.6 million unique users. About 2.7 million of these tweets, or 31%, were replies to a tweet posted by another user, while half a million (6%) were retweets. Almost 2 million (22%) of the messages contained a URL."
  • Gremlin – a Turing-complete, graph-based programming language – GitHub – Gremlin is a Turing-complete, graph-based programming language developed in Java 1.6+ for key/value-pair multi-relational graphs known as property graphs. Gremlin makes extensive use of the XPath 1.0 language to support complex graph traversals. This language has applications in the areas of graph query, analysis, and manipulation. Connectors exist for the following data management systems:

    * TinkerGraph in-memory graph
    * Neo4j graph database
    * Sesame 2.0 compliant RDF stores
    * MongoDB document database

    The documentation for Gremlin can be found at this location. Finally, please visit TinkerPop for other software products.

  • The C Programming Language: 4.10 – by Kernighan & Ritchie & Lovecraft –

    void Rlyeh (int mene[], int wgah, int nagl) {
        int Ia, fhtagn;
        if (wgah>=nagl) return;
        swap (mene,wgah,(wgah+nagl)/2);
        fhtagn = wgah;
        for (Ia=wgah+1; Ia<=nagl; Ia++)
            if (mene[Ia]<mene[wgah])
                swap (mene,++fhtagn,Ia);
        swap (mene,wgah,fhtagn);
        Rlyeh (mene,wgah,fhtagn-1);
        Rlyeh (mene,fhtagn+1,nagl);
    } // PH'NGLUI MGLW'NAFH CTHULHU!

  • How to convert email addresses into name, age, ethnicity, sexual orientation – This is so Meta – "Save your email list as a CSV file (just comma separate those email addresses). Upload this file to your facebook account as if you wanted to add them as friends. Voila, facebook will give you all the profiles of all those users (in my test, about 80% of my email lists have facebook profiles). Now, click through each profile, and because of the new default facebook settings, which makes all information public, about 95% of the user info is available for you to harvest."
  • Microsoft Security Development Lifecycle (SDL): Tools Repository – A collection of previously internal-only security tools from Microsoft, including anti-xss, fuzz test, fxcop, threat modeling, binscope, now available for free download.
  • Analytics X Prize – Home – Forecast the murder rate in Philadelphia – The Analytics X Prize is an ongoing contest to apply analytics, modeling, and statistics to solve the social problems that affect our cities. It combines the fields of statistics, mathematics, and social science to understand the root causes of dysfunction in our neighborhoods. Understanding these relationships and discovering the most highly correlated variables allows us to deploy our limited resources more effectively and target the variables that will have the greatest positive impact on improvement.
  • PeteSearch: How to find user information from an email address – FindByEmail code released as open-source. You pass it an email address, and it queries 11 different public APIs to discover what information those services have on the user with that email address.
  • Measuring Measures: Beyond PageRank: Learning with Content and Networks – Conclusion: learning based on content and network data is the current state of the art. There is a great paper and talk about personalization in Google News: they use content for this purpose, and then user click streams to provide personalization, i.e. recommend specific articles within each topical cluster. The issue is content filtering is typically (as we say in research) "way harder." Suppose you have a social graph, a bunch of documents, and you know that some users in the social graph like some documents, and you want to recommend other documents that you think they will like. Using approaches based on Networks, you might consider clustering users based on co-visitation (they have co-liked some of the documents). This scales great, and it internationalizes great. If you start extracting features from the documents themselves, then what you build for English may not work as well for the Chinese market. In addition, there is far more data in the text than there is in the social graph.
  • mikemaccana’s python-docx at master – GitHub – MIT-licensed Python library to read/write Microsoft Word docx format files. "The docx module reads and writes Microsoft Office Word 2007 docx files. These are referred to as 'WordML', 'Office Open XML' and 'Open XML' by Microsoft. They can be opened in Microsoft Office 2007, Microsoft Mac Office 2008, OpenOffice.org 2.2, and Apple iWork 08. The module was created when I was looking for a Python support for MS Word .doc files, but could only find various hacks involving COM automation, calling .net or Java, or automating OpenOffice or MS Office."
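
A quick aside on the "Beyond PageRank" link above: the co-visitation idea is easy to play with. Here is a minimal sketch, assuming likes are stored as a plain dict of user to set of document ids. The names (co_like_counts, recommend, the toy users and documents) are mine, not from the post, and this is not real clustering: it just uses co-like counts as similarity weights to rank documents a user has not seen.

```python
from collections import defaultdict
from itertools import combinations


def co_like_counts(likes):
    """Count, for each pair of users, how many documents both of them liked."""
    users_by_doc = defaultdict(set)
    for user, docs in likes.items():
        for doc in docs:
            users_by_doc[doc].add(user)

    counts = defaultdict(int)
    for users in users_by_doc.values():
        # Every pair of users who share this document gets one co-like.
        for a, b in combinations(sorted(users), 2):
            counts[(a, b)] += 1
    return dict(counts)


def recommend(likes, user):
    """Rank documents the user has not liked by the co-like weight of their fans."""
    scores = defaultdict(int)
    for (a, b), weight in co_like_counts(likes).items():
        if user not in (a, b):
            continue
        other = b if a == user else a
        for doc in likes[other] - likes[user]:
            scores[doc] += weight
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical toy data, purely for illustration.
likes = {
    "ann": {"d1", "d2", "d3"},
    "bob": {"d1", "d2", "d4"},
    "cat": {"d3", "d5"},
}
```

Calling recommend(likes, "ann") ranks bob's extra document ahead of cat's, because ann shares two likes with bob and only one with cat. Note that nothing here looks at the text of the documents, which is why this kind of approach carries over between languages so easily.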

Bookmarks for June 3rd through June 4th

These are my links for June 3rd through June 4th:

Bookmarks for May 29th from 05:17 to 12:45

These are my links for May 29th from 05:17 to 12:45:

Bookmarks for May 12th from 10:52 to 21:56

These are my links for May 12th from 10:52 to 21:56:

Bookmarks for May 8th through May 12th

These are my links for May 8th through May 12th:

Bookmarks for May 3rd through May 4th

These are my links for May 3rd through May 4th:

  • Dilbert comic strip for 05/04/2009 from the official Dilbert comic strips archive. – Secretary to Pointy Haired Boss: "I live in a rented trailer and all of my money is in my checking account. Your investments are worthless and your mortgage is underwater. My net worth is higher than yours now. I guess promiscuity and a G.E.D. was a pretty good strategy after all." Reminded me of a thought I had earlier this year, that much of Western Civilization is built on valuing delayed gratification, which hasn't worked out so well recently as opposed to immediate consumption in many cases.
  • Without Warning, Twitter Kills StatTweets (Businesses Beware) – StatSheet.com ChangeLog – Owner of StatTweets post regarding his network of sports-related Twitter handles being banned. They had several hundred accounts, one for stats for each team. This makes sense for users, given the way Twitter works, but they don't like mass account creation. Interested to see how this sorts out, there seem to be at least a few similar Twitter networks with team/region/topic-specific handles.
  • Dooley Online: What URL Shortener Should I Use? – Comparison of features and some usage data for URL shorteners such as tinyurl and bit.ly used on twitter and other services.
  • Obesity and Overweight: Trends: U.S. Obesity Trends 1985-2007 | DNPAO | CDC – During the past 20 years there has been a dramatic increase in obesity in the United States. This slide set illustrates this trend by mapping the increased prevalence of obesity across each of the states. In 2007, only one state (Colorado) had a prevalence of obesity less than 20%. Thirty states had a prevalence equal to or greater than 25%; three of these states (Alabama, Mississippi and Tennessee) had a prevalence of obesity equal to or greater than 30%. The animated map below shows the United States obesity prevalence from 1985 through 2007.
  • Why text messages are limited to 160 characters | Technology | Los Angeles Times – A look back to the beginnings of SMS in 1985 – Would the 160-character maximum be enough space to prove a useful form of communication? Having zero market research, they based their initial assumptions on two "convincing arguments," Hillebrand said. For one, they found that postcards often contained fewer than 150 characters. Second, they analyzed a set of messages sent through Telex, a then-prevalent telegraphy network for business professionals. Despite not having a technical limitation, Hillebrand said, Telex transmissions were usually about the same length as postcards.

Bookmarks for April 30th from 05:57 to 07:10

These are my links for April 30th from 05:57 to 07:10:

Bookmarks for April 12th through April 13th

These are my links for April 12th through April 13th:

Bookmarks for April 11th through April 12th

These are my links for April 11th through April 12th:

  • Wordle – Beautiful Word Clouds – Wordle is a toy for generating “word clouds” from text that you provide. The clouds give greater prominence to words that appear more frequently in the source text. You can tweak your clouds with different fonts, layouts, and color schemes.
  • The dark side of Dubai – Johann Hari, Commentators – The Independent – "Dubai was meant to be a Middle-Eastern Shangri-La, a glittering monument to Arab enterprise and western capitalism. But as hard times arrive in the city state that rose from the desert sands, an uglier story is emerging."
  • Topless Robot – Hot Girls Have Lightsaber Strip-Fight for Your Viewing Pleasure – Star Wars CGI meets fake body spray ad
  • Poll Result: Best VPN to leap China’s Great Firewall? – Thomas Crampton –
    * Witopia – Undisputed winner. Quality of service, speed of surfing, though it is said to be relatively expensive at US$50 to US$60 per year.
    * Hotspot Shield – Bandwidth limits can be painful. Force you to wait until the next month if you use it too much.
    * Ultrasurf
    * StrongVPN
  • InfoQ: Facebook: Science and the Social Graph – In this presentation filmed during QCon SF 2008 (November 2008), Aditya Agarwal discusses Facebook’s architecture, more exactly the software stack used, presenting the advantages and disadvantages of its major components: LAMP (PHP, MySQL), Memcache, Thrift, Scribe.
  • The Running Man, Revisited § SEEDMAGAZINE.COM – A handful of scientists think that these ultra-marathoners are using their bodies just as our hominid forebears once did, a theory known as the endurance running hypothesis (ER). ER proponents believe that being able to run for extended lengths of time is an adapted trait, most likely for obtaining food, and was the catalyst that forced Homo erectus to evolve from its apelike ancestors.
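
Going back to the Wordle link at the top of this list: the core idea, more prominence for more frequent words, fits in a few lines. This is only a sketch under my own assumptions. The linear scaling and the min_size/max_size parameters are invented for illustration, and I have no idea how Wordle actually sizes or lays out its words.

```python
import re
from collections import Counter


def word_weights(text, min_size=10, max_size=60):
    """Map each word to a font size that grows linearly with its frequency."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    # The most frequent word gets max_size; rarer words shrink toward min_size.
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


# Hypothetical sample text, purely for illustration.
sample = "the cat and the hat and the bat"
weights = word_weights(sample)
```

A real word cloud would also drop common stop words like "the" and "and", which otherwise dominate almost any English text.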

Bookmarks for April 9th from 08:07 to 17:53

These are my links for April 9th from 08:07 to 17:53:

Bookmarks for March 16th through April 2nd

These are my links for March 16th through April 2nd:

Bookmarks for March 9th through March 12th

These are my links for March 9th through March 12th:

Bookmarks for March 6th through March 8th

These are my links for March 6th through March 8th:

Bookmarks for March 4th through March 6th

These are my links for March 4th through March 6th:

  • Welcome to VIPERdb – Scripps – VIPERdb is a database for icosahedral virus capsid structures. The emphasis of the resource is on providing data from structural and computational analyses on these systems, as well as high quality renderings for visual exploration.
  • Virus images at VIPERdb – If you have ever wanted to make beautiful images of viruses, in colors of your choice, then go to VIPERdb, the virus particle explorer.
  • Reverse HTTP – IETF draft-lentczner-rhttp-00.txt – Formal description of the reverse HTTP proposal for initiating connections through firewalls then reversing server and client roles.
  • Reverse HTTP – Second Life Wiki – Experimental protocol which takes advantage of the HTTP/1.1 Upgrade: header to turn one HTTP socket around. When a client makes a request to a server with the Upgrade: PTTH/0.9 header, the server may respond with an Upgrade: PTTH/1.0 header, after which point the server starts using the socket as a client, and the client starts using the socket as a server.
  • WTFs/m – The only valid measurement of code quality, WTFs/min
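
To make the Reverse HTTP description above a bit more concrete, here is a rough sketch of the role swap on a single socket. It uses socket.socketpair() as a stand-in for a real connection through a firewall. The Upgrade header values follow the wiki text quoted above, and 101 Switching Protocols is the usual HTTP/1.1 answer to an accepted Upgrade, but the paths (/register, /status) and host names are made up, and I have not checked this against the draft itself.

```python
import socket


def read_headers(sock):
    """Read from the socket until the blank line that ends an HTTP header block."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(1024)
        if not chunk:
            break
        data += chunk
    return data.decode("ascii")


def demo():
    """Walk through the handshake and return the first line of each message."""
    # socketpair() gives two connected sockets in one process (assumption:
    # this stands in for a TCP connection the client opened outbound).
    client, server = socket.socketpair()
    log = []

    # 1. The client opens the connection and asks to upgrade.
    client.sendall(b"POST /register HTTP/1.1\r\nHost: server.invalid\r\n"
                   b"Connection: Upgrade\r\nUpgrade: PTTH/0.9\r\n\r\n")
    log.append(read_headers(server).splitlines()[0])

    # 2. The server accepts the upgrade.
    server.sendall(b"HTTP/1.1 101 Switching Protocols\r\n"
                   b"Connection: Upgrade\r\nUpgrade: PTTH/1.0\r\n\r\n")
    log.append(read_headers(client).splitlines()[0])

    # 3. Roles reverse: the original server now sends a request on the same socket.
    server.sendall(b"GET /status HTTP/1.1\r\nHost: client.invalid\r\n\r\n")
    log.append(read_headers(client).splitlines()[0])

    # 4. The original client answers it, acting as the server.
    client.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
    log.append(read_headers(server).splitlines()[0])

    client.close()
    server.close()
    return log


if __name__ == "__main__":
    for line in demo():
        print(line)
```

The point of the trick is step 3: the machine behind the firewall made the only outbound connection, yet it ends up serving requests on it.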

Bookmarks for February 16th through February 17th

These are my links for February 16th through February 17th:

  • Top 100 Network Security Tools – Many many security testing and hacking tools.
  • FRONTLINE: inside the meltdown: watch the full program – "On Thursday, Sept. 18, 2008, the astonished leadership of the U.S. Congress was told in a private session by the chairman of the Federal Reserve that the American economy was in grave danger of a complete meltdown within a matter of days. "There was literally a pause in that room where the oxygen left," says Sen. Christopher Dodd"
  • The Dark Matter of a Startup – "Every successful startup that I have seen has someone within their ranks that just kinda “does stuff.” No one really knows specifically what they do, but it's vital to the success of the startup."
  • Why I Hate Frameworks – "A hammer?" he asks. "Nobody really buys hammers anymore. They're kind of old fashioned…we started selling schematic diagrams for hammer factories, enabling our clients to build their own hammer factories, custom engineered to manufacture only the kinds of hammers that they would actually need."
  • Mining The Thought Stream – Lots of comments around what is Twitter good for and how will it make money, revolving around real/near-time search, analytics, marketing, etc.
  • Understanding Web Operations Culture – the Graph & Data Obsession … – Comparison of traffic at Flickr, Google, Twitter, last.fm during the Obama inauguration. "One of the most interesting parts of running a large website is watching the effects of unrelated events affecting user traffic in aggregate."

BarCamp returns to Palo Alto


BarCamp returns to Palo Alto next weekend, this time as BarCampBlock.

Almost two years ago, a group of 6 San Francisco geeks in 7 days, using blogs, wikis and IRC slapped together a weekend conference with wifi, food and amazing presentations in Palo Alto, California. This was a different kind of conference, though. There were no superstar keynote speakers. There were no pre-programmed agendas. There was a brilliant agenda filled with content by and for the attendees. Everyone, including the sponsors, the organizers, the speakers and the audience were involved in making the event happen equally and were often one and the same. Over the weekend, more than 200 people showed up and people watched remotely from all over the world. This event was BarCamp.

Who should be there? Anyone working on a new startup that wants to get some great feedback. Anyone looking for talent. Anyone talented looking for work. Anyone looking to invest in brilliant new ideas. Anyone looking to find partners for their brilliant new ideas. Anyone who wants to practice a presentation s/he is working on. Anyone who has a passion for blogging, wikis, design, coding and the web in general. Everyone is welcome. Everyone is encouraged to present. It’s totally free and an excellent source of what is hot, new and upcoming.

Highly recommended if you’re in the area and have any interest whatsoever. The price is right, too. (Free, donations welcome.)

See also: Notes from BarCamp (the original one in 2005)

Point spread function, before and after LASIK



I recently went for my two-year followup to see how my eyes are doing after wavefront LASIK. At the initial exam and each followup visit, they measure the point spread function of your eye. Here’s a before-and-after.

The scales of the two graphs are different, so the improvement is even better than it appears at first glance. The upper plot corresponds to roughly 20/80 vision. The lower plot, two years later, is at 20/15.

Ms. Dewey – Stylish search, with whips, guns, and dating tips


It’s been a while since I’ve come across something I haven’t seen before online. Ms. Dewey fits the bill. It is a Flash-based application combining video clips of actress Janina Gavankar with Windows Live search.

As a search application, it’s fat, slow, and the query results aren’t great. However, as John Battelle observes, “clearly, search ain’t the point.” This is search with a flirty attitude, where the speed and quality of the results aren’t at the top of the priority list.

As short-attention-span theater goes, it’s quite entertaining.

If you can’t think of anything to search for, Ms. Dewey will fidget for a while and eventually reach out and tap on the screen. “Helloooo…type something here…”

It’s far more interesting to try some queries and check out the responses. I spent over half an hour typing in keywords to see what would come up, starting with some of the suggestions from Digg and Channel9. The application provides a semi-random set of video responses based on the search keywords, so you won’t always get the same reaction each time.

The whip and riding crop don’t always appear when you’d think, the lab coat seems to be keyed to science and math (try “partial differential equation”), and I’m not sure what brings on the automatic weapons.

“Ms. Dewey” also has a MySpace page with more video clips. The way the application is constructed, they can probably keep updating and adding responses as long as they want to.

I briefly tried using Ms. Dewey in place of Google, as a working search engine, but it takes too long to respond to a series of queries (have to wait for the video to play) and the search results aren’t great (Live is continuing to improve, though). At the moment this is a fun conceptual experiment.

I wonder if we’ll see a new category of search emphasizing style (entertainment, attitude, sex) over substance (relevance, speed, scope). Today’s version might already work for the occasional search user, but imagine Ms. Dewey with faster, non-blocking search results, a better search UI, and Google’s results. It all vaguely reminds me of a William Gibson novel.

Star Trek and the Knights of the Round Table


“Knights, I bid you welcome to your new home…Camelot!”

An amazing mashup of Monty Python’s “Knights of the Round Table” song, with singing and dancing by the cast of the original Star Trek.

If this doesn’t leave you rolling around on the floor, it probably means you’re completely baffled and/or younger than 30 or so.

Try naming the episodes for extra credit…

via The Big Picture

Who carries three cell phones?

I was out for dinner at Fukisushi in Palo Alto this evening, enjoying some excellent spider rolls and giant clam sushi. A few minutes after we were served, a young couple came in, perhaps meeting for a date after work. At first I noticed that the man had the same cell phone (a Nokia 6682) as my wife, as he took it out and set it on the table next to him. Then he took out a Motorola Razr, flipped it open, and set it on the table next to the Nokia. I’m thinking that this is somewhat geeky and he should be paying more attention to his attractive blonde companion, but he looks like an engineering or tech operations kind of guy, and this is Silicon Valley, so maybe he has a work phone for being on call and a personal phone. But then he pulls out yet another phone, flips it open and sets it down next to the other two, creating a sort of mini-console of cell phones on the table next to the sushi plates.

Now I’m confused. I can think of lots of reasons why someone might have two cell phones. I can’t think of any good reason to park three cell phones on the table while on a date, though.

I don’t think he actually used any of them, except to take a photo of his companion with the Nokia.

Personally, I’ve been cutting down on the hardware I carry for some time now. At one point a few years ago, I often carried two PDAs, two cell phones, and a pager. That didn’t last long. These days I try to stick with one phone, as small as practical.

This episode makes me laugh, because I’m more puzzled by this guy carting three phones around than by his parking them on the table in the middle of his dinner date.
