These are my links for February 4th through February 11th:
- Schneier on Security: Interview with a Nigerian Internet Scammer – "We had something called the recovery approach. A few months after the original scam, we would approach the victim again, this time pretending to be from the FBI, or the Nigerian Authorities. The email would tell the victim that we had caught a scammer and had found all of the details of the original scam, and that the money could be recovered. Of course there would be fees involved as well. Victims would often pay up again to try and get their money back."
- xkcd – Frequency of Strip Versions of Various Games – n = Google hits for "strip <game name>" / Google hits for "<game name>"
- PeteSearch: How to split up the US – Visualization of social network clusters in the US. "information by location, with connections drawn between places that share friends. For example, a lot of people in LA have friends in San Francisco, so there's a line between them.
Looking at the network of US cities, it's been remarkable to see how groups of them form clusters, with strong connections locally but few contacts outside the cluster. For example Columbus, OH and Charleston WV are nearby as the crow flies, but share few connections, with Columbus clearly part of the North, and Charleston tied to the South."
- Redis: Lightweight key/value Store That Goes the Extra Mile | Linux Magazine – Sort of like memcache. "Calling redis a key/value store doesn’t quite do it justice. It’s better thought of as a “data structures” server that supports several native data types and operations on them. That’s pretty much how creator Salvatore Sanfilippo (known as antirez) describes it in the documentation. Let’s dig in and see how it works."
- Op-Ed Contributor – Microsoft’s Creative Destruction – NYTimes.com – Unlike other companies, Microsoft never developed a true system for innovation. Some of my former colleagues argue that it actually developed a system to thwart innovation. Despite having one of the largest and best corporate laboratories in the world, and the luxury of not one but three chief technology officers, the company routinely manages to frustrate the efforts of its visionary thinkers.
These are my links for June 9th through June 10th:
- Announcing the Yahoo! Distribution of Hadoop (Hadoop and Distributed Computing at Yahoo!) – Yahoo releases its internal version of Hadoop, a source-only distribution of Apache Hadoop tested and used in production at Yahoo.
- Google Fusion Tables FAQ – Sort of like extra-large Google Docs spreadsheets, up to 100MB per table, 250MB per user. One interesting wrinkle is that it doesn't actually delete your dataset when you "delete" it, so the data is still available for derived tables that other users have built.
- Filesystem Performance from a Database Perspective – Presentation of performance benchmarks for Linux filesystems (ext2, ext3, reiserfs, xfs, etc.)
- What Assumptions Make: Filesystem I/O from a database perspective – Slide presentation comparing Linux filesystem performance across various formats (ext2, ext3, etc.), RAID configurations, and readahead buffer sizes
- MySQL – Common Queries Tree – A collection of common queries implemented in MySQL
These are my links for May 30th through May 31st:
- Scaling Twitter: Making Twitter 10000 Percent Faster | High Scalability – Collection of links to presentations and interviews regarding Twitter's architecture, implementation plans, and performance issues, from spring 2009.
- The Last Psychiatrist: The Difference Between An Amateur, A Scientist, And A Genius – An amateur is full of wonder and speculation, tinkering towards the truth but suffering from a lack of knowledge and idleness; he's not even sure if someone else has already made these discoveries. "Is this a worthwhile pursuit?"
A scientist performs experiments to confirm or disprove a hypothesis, and in that way he grinds out the truth.
A genius has three abilities, which are actually the union of amateur and scientist: 1. to know the state of the art, what is known and what is not known. 2. To be able to think "out of the box". 3. To be disciplined enough to concentrate on the tedium of a formal investigation of his wondrous speculations.
- PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing – Research paper on a sort of "super healing brush" for manipulating digital images. It allows splicing together different sections of an image and automatically selects similar textures to make the seam transitions work better.
- Light Blue Touchpaper » Blog Archive » Attack of the Zombie Photos – Social networking and sharing sites have challenges implementing and managing access control policies at large scale, and content delivery networks add another wrinkle.
- Map of all Google data center locations | Royal Pingdom – Where in the world is your search being served from? An attempt to assemble a list of known Google data centers worldwide.
These are my links for May 22nd through May 23rd:
- Improve MySQL Insert Performance – Summary – use LOAD DATA INFILE
- Scratch | Home | imagine, program, share – Scratch is designed to help young people (ages 8 and up) develop 21st century learning skills. As they create and share Scratch projects, young people learn important mathematical and computational ideas, while also learning to think creatively, reason systematically, and work collaboratively
- Alice.org – Programming language environment for teaching kids, built on Java, geared toward a storytelling approach.
- Jason R Briggs | Snake Wrangling for Kids – “Snake Wrangling for Kids” is a printable electronic book, for children 8 years and older, who would like to learn computer programming. It covers the very basics of programming, and uses the Python 3 programming language to teach the concepts.
- Benchmarking BDB, CDB and Tokyo Cabinet on large datasets – Benchmark data for an 11M key-value pair dataset stored in Berkeley DB, CDB, and Tokyo Cabinet. CDB comes out significantly faster. (It's for unchanging data though, so not totally surprising.)
These are my links for May 22nd from 06:31 to 07:14:
- MySQL: Forked beyond repair? | Developer World – InfoWorld – Now that MySQL is part of Oracle, will the forks take over? "if MySQL's approval ratings are slumping, all the more reason for Oracle to move decisively. Oracle must work to regain the trust and support of the MySQL community or risk losing mindshare to a fork, such as Drizzle or MariaDB. To do that, it has to avoid making the mistakes that Sun made when it acquired MySQL. In a sense, to succeed with MySQL, Oracle will have to stop acting like Oracle."
- Scott Hanselman’s Computer Zen – Less Virtual, More Machine – Windows 7 and the magic of Boot to VHD – Notes on using Windows virtual hard drives to manage instances of multiple versions of Windows in parallel, e.g. Windows 7 beta, WinXP, etc.
- How Opera’s business model works – Communication Breakdown – David Meyer’s Blog at ZDNet.co.uk Community – Around 40M users, "Most of our revenue — 75-80 percent — comes from mobile devices, from a free browser. We provide the browser for free, like Opera desktop and Mini, and then we generate revenue through our content partners. We provide the search in the right corner and things like that, and that generates revenues in the free distributions. Then you get paid by OEMs [original equipment manufacturers] for distribution — companies like Nokia and Motorola. Most of the mobile OEMs and a fair amount of the other OEMs. We signed up Ford recently and we're now in Ford trucks."
- Digicorp » Blog Archive » Prevention of Sql Injection with PHP – Notes on good coding hygiene for avoiding SQL injection attacks while processing web form input such as passwords and other text fields.
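A quick sketch of the core idea behind the SQL injection link above. The linked post is about PHP; this is a Python analogue using the standard-library `sqlite3` module, so the table layout and the helper names (`make_demo_db`, `find_user`) are my own assumptions, not anything from the post. The point is only that bound parameters keep form input as data instead of letting it become part of the SQL text.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database (hypothetical schema for this sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, password TEXT)"
    )
    # Plaintext password only to keep the demo short; real code would store a hash.
    conn.execute(
        "INSERT INTO users (username, password) VALUES (?, ?)",
        ("alice", "s3cret"),
    )
    return conn


def find_user(conn, username, password):
    """Look up a user with bound parameters instead of string concatenation."""
    # The ? placeholders are filled in by the driver, so quotes in the input
    # cannot terminate the string literal and change the query.
    cur = conn.execute(
        "SELECT id FROM users WHERE username = ? AND password = ?",
        (username, password),
    )
    return cur.fetchone()
```

With this, a classic payload such as `' OR '1'='1` in the password field is just compared as a literal string and matches nothing.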
These are my links for April 28th from 05:35 to 14:24:
- Official Google Blog: Adding search power to public data – Interesting. Wonder if the underlying public data sets will eventually become available on Google App Engine as well, sort of like the public data sets available for use with Amazon EC2 applications.
- MySQL And Search At Craigslist – Jeremy Zawodny's slides on MySQL, Sphinx, and free text search implementation at Craigslist, from last week's MySQL conference.
- Skew, The Frontend Engineer’s Misery @ Irrational Exuberance – For mashups and the like, the distinction between a FE engineer and web dev is rather small in terms of technical skills; they are both using the same skillset, they are both interacting with APIs, and so on. However, there are important distinctions between the two: 1. web developers tend to move in small groups or as individuals, whereas fe engineers work in larger groups, 2. web developers tend to design a product on top of an existing backend service (api, etc), while fe engineers are usually working in parallel with the backend being developed.
- Study: Twitter Audience Does Not Have A Return Policy – Over 60 percent of people who sign up to use the popular (and tremendously discussed) micro-blogging platform do not return to using it the following month, according to new data released by Nielsen Online. In other words, Twitter currently has just a 40 percent retention rate, up from just 30 percent in previous months–indicating an “I don’t get it factor” among new users that is reminiscent of the similarly-over hyped Second Life from a few years ago.
- Hey Americans, Appreciate Your Freedom Of Speech : NPR – Firoozeh Dumas on the underappreciated freedoms of speech and expression we have in the US vs journalists and bloggers in Iran.
These are my links for April 20th through April 23rd:
- What I’ve Learned from Hacker News – Paul Graham on social dynamics and managing Hacker News: user-submitted comments and ranking (voting up/down), editorial intervention and moderators, project goals.
- SEOmoz | Reddit, Stumbleupon, Del.icio.us and Hacker News Algorithms Exposed! – Looking at variations on algorithms for ranking items on social news aggregators
- NGINX + PHP-FPM + APC = Awesome – Walkthrough on setting up a cached PHP web server on nginx with APC.
- Particletree » PHP Quick Profiler – Lightweight tool for profiling PHP code.
- MySQL’s Full-Text Formulas – Database Journal
- http://www.acapela-group.com/text-to-speech-interactive-demo.html – Online text-to-speech demo, with various male and female speakers, plus a few translations.
- Dealing with Duplicate Person Data – Proud to Use Perl – Classifying likely duplicate entries in name/address contact data using Levenshtein distance and tables of nickname synonyms with assigned distance weights.
- Web Security Horror Stories: The Director’s Cut at <head> – Presentation slides from a talk by Simon Willison on cross site scripting, SQL injection, referer forgery, and clickjacking attacks on web applications.
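For the duplicate person data link above, here is a minimal sketch of the general approach in Python (the linked post uses Perl). The nickname table, the `likely_duplicate` helper, and the threshold are all hypothetical, and I've simplified the post's weighted nickname distances to "a known nickname counts as an exact match".

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute, each cost 1)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete from a
                    current[j - 1] + 1,      # insert into a
                    previous[j - 1] + cost,  # substitute (or match)
                )
            )
        previous = current
    return previous[-1]


# Hypothetical nickname table; a real one would be much larger.
NICKNAMES = {"bob": "robert", "bill": "william", "liz": "elizabeth"}


def canonical(name):
    """Lowercase the name and map known nicknames to their full form."""
    name = name.strip().lower()
    return NICKNAMES.get(name, name)


def likely_duplicate(first_a, first_b, max_distance=1):
    """Flag two first names as probable duplicates if they are within max_distance edits."""
    return levenshtein(canonical(first_a), canonical(first_b)) <= max_distance
```

So "Bob" and "Robert" are flagged via the nickname table, and "Jon" and "John" via a single-character edit. A real classifier would combine scores across name and address fields rather than look at first names alone.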
These are my links for April 12th from 17:02 to 19:13:
These are my links for April 9th from 08:07 to 17:53:
- IP address geolocation SQL database – IP address geolocation with MySQL by Marc-Andre Caron. He's done all the necessary legwork to solve this problem, putting together a free, monthly-updated MySQL dataset that will allow you to derive country, region, city, zip, latitude, and longitude from an IP address.
- Del.icio.us Finally Gets Some Respect from Yahoo – Probably Too Late – ReadWriteWeb
- In the Event That You Have Accidentally Swallowed the Higgs Boson by Michael Rottman – The Morning News – "7. Do you feel protons decaying? Grand Unification may be occurring near your vital organs."
- FT.com / Companies / UK companies – Dotcom veterans in Twitter ‘brains trust’ – "Mr Read has brought together a “brains trust” of advisers to Twitter Partners, including Brent Hoberman and Martha Lane Fox, founders of Lastminute.com; Saul Klein, a partner at Index Ventures, the London venture capitalists; and Toby Coppel, the former European vice-president at Yahoo."
- byteonic.com » What you cannot do using Java in Google App Engine – List of some restrictions on Java code running on GAE
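On the IP geolocation link above: I haven't reproduced the actual dataset schema, but lookups of this kind usually come down to converting the address to an integer and finding the range that contains it. A rough Python sketch under that assumption, with made-up rows built from the reserved documentation address blocks and placeholder location names:

```python
import bisect
import ipaddress


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to its integer value."""
    return int(ipaddress.IPv4Address(ip))


# Hypothetical (start, end, location) rows, sorted by start address.
# The ranges are reserved documentation blocks, not real geolocation data.
RANGES = [
    (ip_to_int("192.0.2.0"), ip_to_int("192.0.2.255"), "Example City A"),
    (ip_to_int("198.51.100.0"), ip_to_int("198.51.100.255"), "Example City B"),
    (ip_to_int("203.0.113.0"), ip_to_int("203.0.113.255"), "Example City C"),
]
STARTS = [row[0] for row in RANGES]


def locate(ip):
    """Return the location whose range contains ip, or None if no range matches."""
    n = ip_to_int(ip)
    # Find the last range that starts at or before n, then check its end.
    i = bisect.bisect_right(STARTS, n) - 1
    if i >= 0 and RANGES[i][0] <= n <= RANGES[i][1]:
        return RANGES[i][2]
    return None
```

In a database the same lookup would be a range query on indexed integer columns; the binary search here plays the role of the index.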
These are my links for April 7th through April 9th:
These are my links for February 27th through February 28th:
These are my links for February 26th through February 27th: