Bookmarks for February 4th through February 11th

These are my links for February 4th through February 11th:

  • Schneier on Security: Interview with a Nigerian Internet Scammer – "We had something called the recovery approach. A few months after the original scam, we would approach the victim again, this time pretending to be from the FBI, or the Nigerian Authorities. The email would tell the victim that we had caught a scammer and had found all of the details of the original scam, and that the money could be recovered. Of course there would be fees involved as well. Victims would often pay up again to try and get their money back."
  • xkcd – Frequency of Strip Versions of Various Games – n = Google hits for "strip <game name>" / Google hits for "<game name>"
  • PeteSearch: How to split up the US – Visualization of social network clusters in the US. "information by location, with connections drawn between places that share friends. For example, a lot of people in LA have friends in San Francisco, so there's a line between them.

    Looking at the network of US cities, it's been remarkable to see how groups of them form clusters, with strong connections locally but few contacts outside the cluster. For example Columbus, OH and Charleston WV are nearby as the crow flies, but share few connections, with Columbus clearly part of the North, and Charleston tied to the South."

  • Redis: Lightweight key/value Store That Goes the Extra Mile | Linux Magazine – Sort of like memcache. "Calling redis a key/value store doesn’t quite due it justice. It’s better thought of as a “data structures” server that supports several native data types and operations on them. That’s pretty much how creator Salvatore Sanfilippo (known as antirez) describes it in the documentation. Let’s dig in and see how it works."
  • Op-Ed Contributor – Microsoft’s Creative Destruction – NYTimes.com – Unlike other companies, Microsoft never developed a true system for innovation. Some of my former colleagues argue that it actually developed a system to thwart innovation. Despite having one of the largest and best corporate laboratories in the world, and the luxury of not one but three chief technology officers, the company routinely manages to frustrate the efforts of its visionary thinkers.

Bookmarks for January 20th through January 23rd

These are my links for January 20th through January 23rd:

  • Data.gov – Featured Datasets: Open Government Directive Agency – Datasets required under the Open Government Directive through the end of the day, January 22, 2010. Freedom of Information Act request logs, Treasury TARP and derivative activity logs, crime, income, agriculture datasets.
  • All Your Twitter Bot Needs Is Love – The bot’s name? Jason Thorton. He’s been humming along for months now, sending out over 1250 tweets to some 174 followers. His tweets, while not particularly creative, manage to be both believable and timely. And he’s powered by a single word: Love.

    Thorton is the creation of developer Ryan Merket, who built him as a side project in around three hours. Merket has just posted the code that powers him, and has also divulged how he made Thorton seem somewhat realistic: the bot looks for tweets with the word “love” in them and tweets them as its own.

  • Building a Twitter Bot – "Meet Jason Thorton. To people who know Jason, he is a successful entrepreneur in San Francisco who tweets 4-5 times a day. But Jason has a secret, he’s not really a human, he’s the product of my simple algorithm in PHP

    Jason tweets A LOT about the word “love” – that’s because Jason actually steals tweets from the public timeline that contain the word “love” and posts them as his own

    Jason also @replies to people who use the word “love” in their tweets, and asks them random questions or says something arbitrary

    It took me about 3 hours to code Jason, imagine what a real engineer could do with real AI algorithms? Now realize that it’s already a reality. Sites like Twitter are full of side projects, company initiatives, spambots and AI robots. When the free flow of information becomes open, the amount of disinformation increases. Theres a real need for someone to vet the people we ‘meet’ on social sites – will be interesting to see how this market grows in the next year

  • Website monitoring status – Public API Status – Health monitor for 26 APIs from popular Web services, including Google Search, Google Maps, Bing, Facebook, Twitter, SalesForce, YouTube, Amazon, eBay and others
  • PG&E Electrical System Outage Map – This map shows the current outages in our 70,000-square-mile service area. To see more details about an outage, including the cause and estimated time of restoration, click on the color-coded icon associated with that outage.

Bookmarks for June 9th through June 10th

These are my links for June 9th through June 10th:

Bookmarks for June 3rd through June 4th

These are my links for June 3rd through June 4th:

Bookmarks for May 30th through May 31st

These are my links for May 30th through May 31st:

Bookmarks for May 24th through May 27th

These are my links for May 24th through May 27th:

  • Formulas and game mechanics – WoWWiki – Your guide to the World of Warcraft – Formulas and game mechanics rules and guidelines for developing role playing games
  • Manchester United’s Park Has the Endurance to Persevere – NYTimes.com – Korean soccer player Park Ji-Sung – On Wednesday night in Rome, Park is expected to become the first Asian player to participate in the European Champions League final when Manchester United faces Barcelona.
  • mloss.org – Machine Learning Open Source Software – Big collection of open source packages for machine learning, data mining, statistical analysis
  • The Datacenter as Computer – Luiz André Barroso and Urs Hölzle 2009 (PDF) – 120 pages on large scale computing lessons from Google. "These new large datacenters are quite different from traditional hosting facilities of earlier times and cannot be viewed simply as a collection of co-located servers. Large portions of the hardware and software resources in these facilities must work in concert to efficiently deliver good levels of Internet service performance, something that can only be achieved by a holistic approach to their design and deployment. In other words, we must treat the datacenter itself as one massive warehouse-scale computer (WSC). We describe the architecture of WSCs, the main factors influencing their design, operation, and cost structure, and the characteristics of their software base."
  • Geeking with Greg: The datacenter is the new mainframe – Pointer to a paper by Googlers Luiz Andre Barroso and Urs Holzle on the evolution of warehouse scale computing and the management and use of computing resources in a contemporary datacenter.

Bookmarks for May 22nd through May 23rd

These are my links for May 22nd through May 23rd:

  • Improve MySQL Insert Performance – Summary – use LOAD DATA INFILE
  • Scratch | Home | imagine, program, share – Scratch is designed to help young people (ages 8 and up) develop 21st century learning skills. As they create and share Scratch projects, young people learn important mathematical and computational ideas, while also learning to think creatively, reason systematically, and work collaboratively
  • Alice.org – Programming language environment for teaching kids, built on Java, geared toward a story telling approach.
  • Jason R Briggs | Snake Wrangling for Kids – “Snake Wrangling for Kids” is a printable electronic book, for children 8 years and older, who would like to learn computer programming. It covers the very basics of programming, and uses the Python 3 programming language to teach the concepts.
  • Benchmarking BDB, CDB and Tokyo Cabinet on large datasets – CDB comes out significantly faster. (It's for unchanging data though, so not totally surprising) Benchmark data for 11M key-value pair dataset stored in Berkeley DB, CDB, and Tokyo Cabinet.

Bookmarks for April 28th from 05:35 to 14:24

These are my links for April 28th from 05:35 to 14:24:

  • Official Google Blog: Adding search power to public data – Interesting. Wonder if the underlying public data sets will eventually become available on Google App Engine as well, sort of like the public data sets available for use with Amazon EC2 applications.
  • MySQL And Search At Craigslist – Jeremy Zawodny's slides on MySQL, Sphinx, and free text search implementation at Craigslist, from last week's MySQL conference.
  • Skew, The Frontend Engineer’s Misery @ Irrational Exuberance – For mashups and the like, the distinction between a FE engineer and web dev is rather small in terms of technical skills; they are both using the same skillset, they are both interacting with APIs, and so on. However, there are important distinctions between the two: 1. web developers tend to move in small groups or as individuals, whereas fe engineers work in larger groups, 2. web developers tend to design a product on top of an existing backend service (api, etc), while fe engineers are usually working in parallel with the backend being developed.
  • Study: Twitter Audience Does Not Have A Return Policy – Over 60 percent of people who sign up to use the popular (and tremendously discussed) micro-blogging platform do not return to using it the following month, according to new data released by Nielsen Online. In other words, Twitter currently has just a 40 percent retention rate, up from just 30 percent in previous months–indicating an “I don’t get it factor” among new users that is reminiscent of the similarly-over hyped Second Life from a few years ago.
  • Hey Americans, Appreciate Your Freedom Of Speech : NPR – Firoozeh Dumas on the underappreciated freedoms of speech and expression we have in the US vs journalists and bloggers in Iran.

Bookmarks for April 20th through April 23rd

These are my links for April 20th through April 23rd:

Bookmarks for April 15th through April 17th

These are my links for April 15th through April 17th:

Bookmarks for April 13th through April 15th

These are my links for April 13th through April 15th:

Bookmarks for April 12th through April 13th

These are my links for April 12th through April 13th:

Bookmarks for April 12th from 17:02 to 19:13

These are my links for April 12th from 17:02 to 19:13:

Bookmarks for April 11th through April 12th

These are my links for April 11th through April 12th:

  • Wordle – Beautiful Word Clouds – Wordle is a toy for generating “word clouds” from text that you provide. The clouds give greater prominence to words that appear more frequently in the source text. You can tweak your clouds with different fonts, layouts, and color schemes.
  • The dark side of Dubai – Johann Hari, Commentators – The Independent – "Dubai was meant to be a Middle-Eastern Shangri-La, a glittering monument to Arab enterprise and western capitalism. But as hard times arrive in the city state that rose from the desert sands, an uglier story is emerging."
  • Topless Robot – Hot Girls Have Lightsaber Strip-Fight for Your Viewing Pleasure – Star Wars CGI meets fake body spray ad
  • Poll Result: Best VPN to leap China’s Great Firewall? – Thomas Crampton – - Witopia – Undisputed winner. Quality of service, speed of surfing, though it is said to be relatively expensive at US$50 to US$60 per year. Hotspot Shield – Bandwidth limits can be painful. Force you to wait until the next month if you use it too much. – Ultrasurf – StrongVPN
  • InfoQ: Facebook: Science and the Social Graph – In this presentation filmed during QCon SF 2008 (November 2008), Aditya Agarwal discusses Facebook’s architecture, more exactly the software stack used, presenting the advantages and disadvantages of its major components: LAMP (PHP, MySQL), Memcache, Thrift, Scribe.
  • The Running Man, Revisited § SEEDMAGAZINE.COM – a handful of scientists think that these ultra-marathoners are using their bodies just as our hominid forbears once did, a theory known as the endurance running hypothesis (ER). ER proponents believe that being able to run for extended lengths of time is an adapted trait, most likely for obtaining food, and was the catalyst that forced Homo erectus to evolve from its apelike ancestors.

Bookmarks for February 25th through February 26th

These are my links for February 25th through February 26th:

Bookmarks for February 24th through February 25th

These are my links for February 24th through February 25th:

Bookmarks for February 18th through February 19th

These are my links for February 18th through February 19th:

Bookmarks for February 16th through February 17th

These are my links for February 16th through February 17th:

  • Top 100 Network Security Tools – Many many security testing and hacking tools.
  • FRONTLINE: inside the meltdown: watch the full program – "On Thursday, Sept. 18, 2008, the astonished leadership of the U.S. Congress was told in a private session by the chairman of the Federal Reserve that the American economy was in grave danger of a complete meltdown within a matter of days. "There was literally a pause in that room where the oxygen left," says Sen. Christopher Dodd"
  • The Dark Matter of a Startup – "Every successful startup that I have seen has someone within their ranks that just kinda “does stuff.” No one really knows specifically what they do, but its vital to the success of the startup."
  • Why I Hate Frameworks – "A hammer?" he asks. "Nobody really buys hammers anymore. They're kind of old fashioned…we started selling schematic diagrams for hammer factories, enabling our clients to build their own hammer factories, custom engineered to manufacture only the kinds of hammers that they would actually need."
  • Mining The Thought Stream – Lots of comments around what is Twitter good for and how will it make money, revolving around real/near-time search, analytics, marketing, etc.
  • Understanding Web Operations Culture – the Graph & Data Obsession … – Comparison of traffic at Flickr, Google, Twitter, last.fm during the Obama inauguration. "One of the most interesting parts of running a large website is watching the effects of unrelated events affecting user traffic in aggregate."