site admin | June 10th, 2009 | Comments are closed
These are my links for June 9th through June 10th:
- Announcing the Yahoo! Distribution of Hadoop (Hadoop and Distributed Computing at Yahoo!) – Yahoo releases its internal version of Hadoop, a source-only distribution of Apache Hadoop tested and used in production at Yahoo.
- Google Fusion Tables FAQ – Sort of like extra-large Google Docs spreadsheets, up to 100MB per table, 250MB per user. One interesting wrinkle is that it doesn't actually delete your dataset when you "delete" it, so the data is still available for derived tables that other users have built.
- Filesystem Performance from a Database Perspective – Presentation on performance benchmarks for Linux filesystems (ext2, ext3, reiserfs, xfs, etc.)
- What Assumptions Make: Filesystem I/O from a database perspective – Slide presentation comparing Linux filesystem performance across various formats (ext2, ext3, etc.), RAID configurations, and readahead buffer sizes
- MySQL – Common Queries Tree – A collection of common queries implemented in MySQL
site admin | May 29th, 2009 | Comments are closed
These are my links for May 29th from 05:17 to 12:45:
- Some stats from Twitter conference compared to… – Robert Scoble – FriendFeed – Anecdotal data from 140tc this week. 200 tweets/second at peak. Didn't see an estimate of the current user account population, though; I keep seeing site unique visitor estimates, which aren't useful.
- Microsoft Silverlight vs Google Wave: Why Karma Matters | Zoho Blogs – "The real interesting contrast to us, as independent software developers, is the way developers responded to Silverlight as opposed to the reaction yesterday to Google Wave. Both Silverlight and Wave are aimed at taking the internet experience to the next level. To be perfectly honest, Silverlight is a great piece of technology. Google Wave, as yet, is not much more than a concept and an announcement. It is easy to dismiss all this with "Oh, the press just loves to hype everything Google, and loves to hate Microsoft," but that cannot explain why even competitors like us are willing to embrace Google's innovations, but stay away from perfectly good innovations from Microsoft, such as Silverlight? It comes down to one word: karma."
- makerfaire.com: Maker Faire – This weekend at San Mateo Expo Center
- Google Wave Federation Protocol
- Google Wave API Overview – Google Wave API – Google Code – APIs for Google Wave email / bbs / wiki / chat / collaboration / communications mashup platform introduced yesterday.
- What Emacs Commands Do You Use Most and Find Most Useful? : programming – Reddit thread discussing favorite emacs commands
site admin | May 22nd, 2009 | Comments are closed
These are my links for May 22nd from 06:31 to 07:14:
- Javascript Malware Analysis: A Case Study – "This particular beast was found in the wild in May 2009 on a site phishing for Facebook user credentials, and is a particularly-nasty bugger. Note the number of strangely-named variables created up front, many of which are not even referenced in the code blocks that follow. Additionally notice the odd ternary statements which have no impact on the operation of the code, and presumably must exist to trip up scanners (unless there is a fancy form of string replacement on the body of some functions, in which case the functions could be mutated before execution – and that would be scary. A cipher based on the body of the function has also been seen.)"
- MySQL: Forked beyond repair? | Developer World – InfoWorld – Now that MySQL is part of Oracle, will the forks take over? "if MySQL's approval ratings are slumping, all the more reason for Oracle to move decisively. Oracle must work to regain the trust and support of the MySQL community or risk losing mindshare to a fork, such as Drizzle or MariaDB. To do that, it has to avoid making the mistakes that Sun made when it acquired MySQL. In a sense, to succeed with MySQL, Oracle will have to stop acting like Oracle."
- Scott Hanselman’s Computer Zen – Less Virtual, More Machine – Windows 7 and the magic of Boot to VHD – Notes on using Windows virtual hard drives to manage instances of multiple versions of Windows in parallel, e.g. Windows 7 beta, WinXP, etc.
- How Opera’s business model works – Communication Breakdown – David Meyer’s Blog at ZDNet.co.uk Community – Around 40M users, "Most of our revenue — 75-80 percent — comes from mobile devices, from a free browser. We provide the browser for free, like Opera desktop and Mini, and then we generate revenue through our content partners. We provide the search in the right corner and things like that, and that generates revenues in the free distributions. Then you get paid by OEMs [original equipment manufacturers] for distribution — companies like Nokia and Motorola. Most of the mobile OEMs and a fair amount of the other OEMs. We signed up Ford recently and we're now in Ford trucks."
- Digicorp » Blog Archive » Prevention of Sql Injection with PHP – Notes on good coding hygiene for avoiding SQL injection attacks while processing web form input such as passwords and other text fields.
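A quick sketch of the core idea behind the SQL injection notes above: pass form input as bound parameters instead of pasting it into the query string. The linked post is about PHP; this version uses Python's standard-library sqlite3 module purely as an illustration, and the users table, its columns, and the find_user helper are made up for the example.

```python
import sqlite3


def find_user(conn, username, password_hash):
    """Look up a user with bound parameters; input never becomes SQL text."""
    cur = conn.execute(
        "SELECT id FROM users WHERE username = ? AND password_hash = ?",
        (username, password_hash),
    )
    return cur.fetchone()


def demo():
    # Hypothetical in-memory table, only for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, username TEXT, password_hash TEXT)"
    )
    conn.execute(
        "INSERT INTO users (username, password_hash) VALUES (?, ?)",
        ("alice", "abc123"),
    )
    legit = find_user(conn, "alice", "abc123")
    # A classic injection attempt is treated as a literal (non-matching) name.
    attack = find_user(conn, "alice' --", "anything")
    conn.close()
    return legit, attack
```

With string concatenation the second lookup would comment out the password check; with placeholders it simply finds no such user.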
site admin | May 21st, 2009 | Comments are closed
These are my links for May 21st from 06:07 to 22:34:
site admin | May 20th, 2009 | Comments are closed
These are my links for May 20th from 19:50 to 22:03:
- PicFog Displays the Strength of Real-Time Image Search – More real time social search prototypes, this one for images shared on twitter. Fun to play with.
- bits done properly – 7 TwitPic alternatives – A list of alternative photo sharing sites suitable for use with Twitter.
- Twitter Data – A simple, open proposal for embedding data in Twitter messages – Home – "Twitter Data is a simple, open, semi-structured data representation format for embedding machine-readable, yet human-friendly, data in Twitter messages. This data can then be transmitted, received, and interpreted in real time to enable powerful new kinds of applications to be built on the Twitter platform."
- Announcing TweetMotif for summarizing twitter topics with a dash of NLP – Brendan O’Connor’s Blog – TweetMotif is an experiment in using natural language processing to identify trending topics.
- OneRiot Announces API & Real-Time Search Partnerships – "Real-time social search outfit OneRiot today announced their API and partnership program for adding real-time search capabilities to browser add-ons, desktop applications, social websites and other services" Screenshots from initial app TwitterBar (browser extension)
- Mozilla Labs » Blog Archive » Introducing Jetpack, Call for Participation – API for Firefox extension development
site admin | May 12th, 2009 | Comments are closed
These are my links for May 8th through May 12th:
site admin | May 8th, 2009 | Comments are closed
These are my links for May 6th through May 7th:
- Content Syndication with Case-Hardened JavaScript – kentbrewster.com – Handy code for building Javascript widgets with content from various sources such as Twitter, Digg, Yahoo Pipes, etc.
- Mathematical Atlas: A gateway to Mathematics – "The Mathematical Atlas is a collection of articles about aspects of mathematics at and above the university level, but (usually) not at the level of current research. The goal of this collection is to introduce the subject areas of modern mathematics, to describe a few of the milestone results and topics, and to give pointers to some of the key resources where further information is to be found. Like any good atlas, we try to present several ways to look at each area and to show its relationship with neighboring areas and sub-areas. "
- Three Reasons Why Twitter Will NOT Index the Links You Share – ReadWriteWeb – Argues that Twitter will rely on bit.ly through partnership or acquisition to handle sentiment and semantic analysis of twitter search and link contents.
- Tough Love For Microsoft Search – December 2008 post from Danny Sullivan on Microsoft and the search landscape.
- Annals of Innovation: How David Beats Goliath: Reporting & Essays: The New Yorker – Malcolm Gladwell, with a reporter at large on Vivek Ranadivé and his NJB girls basketball team, employing asymmetric strategies to overcome conventionally stronger teams, and a broader look at the history of insurgent strategies from David and Goliath, T.E. Lawrence, George Washington, etc.
site admin | May 6th, 2009 | Comments are closed
These are my links for May 5th through May 6th:
- Coding Horror: I Just Logged In As You: How It Happened – On good password management, why forums should mostly not be storing user passwords in general, and how re-use of passwords on multiple sites can lead to vulnerability on other sites.
- Arc Forum | Arc – Arc is a version of Lisp. Among other things it is used to implement Hacker News.
- John Graham-Cumming: Can you trust Paul Graham with your password? – On best practices for storing password hashes to resist attacks on compromised password files, including rainbow table lookups, with a look at how Hacker News stores passwords
- Deliberate Ambiguity: How *not* to rate a search engine – Search engines have very simple user interfaces, but are used in many different contexts, most of which don't resemble the way people often try out a new search engine.
- The Slow Erosion of Google Search – Bokardo – On changes in internet user behaviors over time, more social media (ask your Twitter friends) vs directed search (send a keyword query) etc.
- Brynn Marie Evans » Why social search won’t topple Google (anytime soon) – On differences between searching through social media such as Twitter, Facebook etc, vs Google etc.
- The Financial Services Club’s Blog: Stock picking with real-time news – Looking at real time social media trends for trading ideas.
- Lisp’s reputation is so bad that many people don’t even take a look at Lisp | International Lisp Conference 2009 – I haven't touched Lisp in years, except maybe for configuring emacs. A list of possible reasons why Lisp is not more widely used, e.g. "Lisp is old and moldy. It must be primitive by today's standards.", "The exciting languages to learn now are Python, Ruby, Groovy, etc."
- Peering into North Korea – The Big Picture – Boston.com – A collection of recent photos of scenes from North Korea.
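For the password storage links above (Coding Horror, John Graham-Cumming), here is a minimal sketch of salted, deliberately slow hashing, which is the usual defense against rainbow tables. It is not how Hacker News or any other site mentioned above actually does it; the iteration count and function names are my own assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for your own hardware


def hash_password(password, salt=None):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 with a per-user salt."""
    if salt is None:
        salt = os.urandom(16)  # random salt defeats precomputed tables
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Store the salt and digest per user, never the password itself. Two users with the same password end up with different digests because their salts differ.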
site admin | May 5th, 2009 | Comments are closed
These are my links for May 4th through May 5th:
- Influential Nodes in a Diffusion Model for Social Networks (icalp05-inf.pdf) – Kempe, Kleinberg, Tardos. Algorithm for greedy approximation of most influential nodes in social network (63% of optimal) under various conditions.
- Maximizing the Spread of Influence through a Social Network (kdd03-inf.pdf) – Kempe, Kleinberg, Tardos. Maximizing propagation by selecting most influential nodes is NP-hard, but a greedy approximation can work well (63% of optimal) under various conditions.
- Notification Strategies for Social Networks – Discussion on approaches to maximizing use of a limited number of notifications within social networks e.g. Facebook
- James Smith • loopj.com » Blog Archive » jQuery Plugin: Tokenizing Autocomplete Text Entry – Looks handy – "This is a jQuery plugin to allow users to select multiple items from a predefined list, using autocompletion as they type to find each item. You may have seen a similar type of text entry when filling in the recipients field sending messages on facebook."
- Google Code FAQ – Using cURL to interact with Google data services – Step by step tutorial on using curl with Google data APIs.
- Behind The Business Plan Of Pirates Inc. : NPR – It takes around $250K to fund a Somali pirate operation. About 20 percent goes to pay off officials who look the other way. About 50 percent is for expenses and payroll. The leader of an attack makes $10,000 to $20,000 (the average Somali family lives on $500 a year). The initial investor — who put in $250,000 of seed capital — gets 30 percent, sometimes up to $500,000.
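The greedy approach in the two Kempe, Kleinberg, Tardos papers above can be sketched roughly as follows: repeatedly add the node that gives the largest estimated spread. This is my own simplified sketch, not the authors' code. It assumes an independent cascade model with a single activation probability for every edge and estimates spread by Monte Carlo simulation, so the 63% (1 - 1/e) guarantee only holds approximately. The function names and defaults are hypothetical.

```python
import random


def simulate_cascade(graph, seeds, prob, rng):
    """Run one independent-cascade simulation; return how many nodes activate."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            # Each newly active node gets one chance per inactive neighbor.
            for neighbor in graph.get(node, ()):
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return len(active)


def expected_spread(graph, seeds, prob, trials, rng):
    """Average cascade size over several simulated runs."""
    total = sum(simulate_cascade(graph, seeds, prob, rng) for _ in range(trials))
    return total / trials


def greedy_seeds(graph, k, prob=0.1, trials=200, seed=0):
    """Pick k seed nodes, each time adding the node with the best estimated spread."""
    rng = random.Random(seed)
    nodes = set(graph) | {n for nbrs in graph.values() for n in nbrs}
    chosen = []
    for _ in range(min(k, len(nodes))):
        best_node, best_spread = None, -1.0
        for node in sorted(nodes - set(chosen)):
            spread = expected_spread(graph, chosen + [node], prob, trials, rng)
            if spread > best_spread:
                best_node, best_spread = node, spread
        chosen.append(best_node)
    return chosen
```

The graph is a plain dict mapping each node to the list of nodes it can influence. Each greedy step re-simulates every candidate, so this is only practical for small graphs.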
site admin | May 2nd, 2009 | Comments are closed
These are my links for April 30th through May 2nd:
- FusionCharts Free – Animated Flash Charts and Graphs for ASP, PHP, ASP.NET, JSP, RoR and other web applications – Flash charting component that can be used to render data-driven & animated charts for your web applications and presentations. It is a cross-browser and cross-platform solution that can be used with PHP, Python, Ruby on Rails, ASP, ASP.NET, JSP, ColdFusion, simple HTML pages or even PowerPoint Presentations to deliver interactive and powerful flash charts. You do NOT need to know anything about Flash to use FusionCharts. All you need to know is the language you're programming in.
- Raphaël—JavaScript Library – Raphaël is a small JavaScript library that should simplify your work with vector graphics on the web. If you want to create your own specific chart or image crop and rotate widget, for example, you can achieve it simply and easily with this library. Raphaël uses the SVG W3C Recommendation and VML as a base for creating graphics. This means every graphical object you create is also a DOM object, so you can attach JavaScript event handlers or modify them later. Raphaël’s goal is to provide an adapter that will make drawing vector art compatible cross-browser and easy.
- A Really Gentle Introduction to Data Mining | Regular Geek – List of data mining blogs and related resources.
- BlackBerry SSH Tutorial: Connect to Unix Server using MidpSSH for Mobile Devices – Notes on using MidpSSH on Blackberry for remote access to servers. Seems to work, although big network lag on my BlackBerry Bold / AT&T.
- Country Reports on Terrorism 2008 – U.S. law requires the Secretary of State to provide Congress, by April 30 of each year, a full and complete report on terrorism with regard to those countries and groups meeting criteria set forth in the legislation. This annual report is entitled Country Reports on Terrorism. Beginning with the report for 2004, it replaced the previously published Patterns of Global Terrorism.
- DIY: How To Find Authoritative Twitter Users Plus 100 To Get You Started | Ignite Social Media – Some comments on recommendation metrics for Twitter, trying to use "favorites" mark as an indicator.
- SIGUSR2 > The Power That is GNU Emacs – "If you've never been convinced before that Emacs is the text editor in which dreams are made from, or that inside Emacs there are unicorns manipulating your text, don't expect me to convince you."
site admin | April 28th, 2009 | Comments are closed
These are my links for April 28th from 05:35 to 14:24:
- Official Google Blog: Adding search power to public data – Interesting. Wonder if the underlying public data sets will eventually become available on Google App Engine as well, sort of like the public data sets available for use with Amazon EC2 applications.
- MySQL And Search At Craigslist – Jeremy Zawodny's slides on MySQL, Sphinx, and free text search implementation at Craigslist, from last week's MySQL conference.
- Skew, The Frontend Engineer’s Misery @ Irrational Exuberance – For mashups and the like, the distinction between a FE engineer and web dev is rather small in terms of technical skills; they are both using the same skillset, they are both interacting with APIs, and so on. However, there are important distinctions between the two: 1. web developers tend to move in small groups or as individuals, whereas fe engineers work in larger groups, 2. web developers tend to design a product on top of an existing backend service (api, etc), while fe engineers are usually working in parallel with the backend being developed.
- Study: Twitter Audience Does Not Have A Return Policy – Over 60 percent of people who sign up to use the popular (and tremendously discussed) micro-blogging platform do not return to using it the following month, according to new data released by Nielsen Online. In other words, Twitter currently has just a 40 percent retention rate, up from just 30 percent in previous months–indicating an “I don’t get it factor” among new users that is reminiscent of the similarly-over hyped Second Life from a few years ago.
- Hey Americans, Appreciate Your Freedom Of Speech : NPR – Firoozeh Dumas on the underappreciated freedoms of speech and expression we have in the US vs journalists and bloggers in Iran.
site admin | April 19th, 2009 | Comments are closed
These are my links for April 18th through April 19th:
- Why Programmers Suck at CSS Design – Stefano’s Linotype – A practical approach to CSS for non-designers (programmers).
- The Art & Science of Seductive Interactions – Presentation slides on improving application user experience by making them more game like (points, levels, scarcity), social interaction, and other ideas.
- Stephen Marsland – Python code from "Machine Learning: An Algorithmic Perspective", assorted clustering and estimation algorithms.
- Firediff – In Case of Stairs – Firediff implements a change monitor that records all of the changes made by firebug and the application itself to CSS and the DOM. This provides insight into the functionality of the application as well as providing a record of the changes that were required to debug and tweak the page’s display.
- Crowdsourcing the semantic web | lexanderA – "Currently, all attempts at providing semantic metadata require server-side changes which means that we need to rely on page authors to implement them. This, of course, is a major obstacle. But what if we could change that? What if we could bypass page authors and have the crowd add semantic metadata to existing pages?"
- Just How Important is the Valley? Let’s Look at some Data. – Tony Wright dot com – Is the silicon valley entrepreneurship model specific to SV? List of acquisitions in 2007 and 2008.
site admin | April 17th, 2009 | Comments are closed
These are my links for April 15th through April 17th:
- Paul Buchheit: Make your site faster and cheaper to operate in one easy step – Compress text files with gzip to reduce file size/bandwidth, the incremental cpu cost is usually low relative to the performance gain from lower network cost. Friendfeed uses nginx in front of main web servers for this.
- Jabbify – Free Comet web service and browser client for simple chat and streaming status applications.
- TinEye Image Search Engine – Idée Inc. – The Visual Search Company – Finds references to images online, starting with an original image. Attempts to use image analysis to be independent of scaling, cropping, and other common manipulations.
- All That Twitters Isn’t Gold: A Popular Web Application in Search of a Business Plan – Knowledge@Wharton – Business school take on Twitter and high growth, non-revenue consumer web startups.
- Almost Viral: A Hybrid Acquisition Strategy – "By being almost viral you can grow very cheaply, control your rate of growth and demographics, and get enough traffic to conduct meaningful experiments. Need to grow more slowly? Just decrease your daily ad spend. Need statistically significant results more quickly? Increase your daily ad spend. With a viral coefficient of 0.9 you’ve dealt with your acquisition risk. Rather than going fully viral and dealing with the operational difficulties, it might be worth your time to deal with other market risks: retention, engagement, and monetization. "
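The arithmetic behind the "almost viral" quote above is worth spelling out. Assuming the viral coefficient k means the average number of new users each user brings in (my reading; the quote doesn't define it), one paid user eventually yields 1 + k + k^2 + ... = 1 / (1 - k) users when k < 1. The helper names below are hypothetical.

```python
def users_per_paid_user(k):
    """Total users eventually generated by one paid acquisition (geometric series)."""
    if not 0 <= k < 1:
        raise ValueError("series only converges for 0 <= k < 1")
    return 1.0 / (1.0 - k)


def effective_cost_per_user(paid_cost, k):
    """Ad spend per user once the viral follow-on signups are counted."""
    return paid_cost / users_per_paid_user(k)
```

At k = 0.9 each paid signup turns into about 10 users, so a $2.00 ad cost per signup works out to roughly $0.20 per user. At k = 1 or above the series no longer converges, which is the "fully viral" case the post suggests avoiding at first.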
site admin | April 15th, 2009 | Comments are closed
These are my links for April 13th through April 15th:
site admin | April 10th, 2009 | Comments are closed
These are my links for April 9th through April 10th:
site admin | April 9th, 2009 | Comments are closed
These are my links for April 9th from 08:07 to 17:53:
- IP address geolocation SQL database – IP address geolocation with MySQL by Marc-Andre Caron. He's done all the necessary legwork to solve this problem, putting together a free, monthly-updated MySQL dataset that will allow you to derive country, region, city, zip, latitude, and longitude from an IP address.
- Del.icio.us Finally Gets Some Respect from Yahoo – Probably Too Late – ReadWriteWeb
- In the Event That You Have Accidentally Swallowed the Higgs Boson by Michael Rottman – The Morning News – "7. Do you feel protons decaying? Grand Unification may be occurring near your vital organs."
- FT.com / Companies / UK companies – Dotcom veterans in Twitter ‘brains trust’ – "Mr Read has brought together a “brains trust” of advisers to Twitter Partners, including Brent Hoberman and Martha Lane Fox, founders of Lastminute.com; Saul Klein, a partner at Index Ventures, the London venture capitalists; and Toby Coppel, the former European vice-president at Yahoo."
- byteonic.com » What you cannot do using Java in Google App Engine – List of some restrictions on Java code running on GAE
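For the IP geolocation dataset above, lookups of this kind usually come down to converting the address to an integer and finding the range that contains it. A rough sketch follows. The rows are made up (documentation-only address blocks with placeholder locations) and do not reflect the schema or contents of the linked MySQL dataset.

```python
import bisect
import ipaddress


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to its integer value."""
    return int(ipaddress.IPv4Address(ip))


# Hypothetical (start, end, location) rows, sorted by start address.
RANGES = [
    (ip_to_int("192.0.2.0"), ip_to_int("192.0.2.255"), "Example City A"),
    (ip_to_int("198.51.100.0"), ip_to_int("198.51.100.255"), "Example City B"),
]
STARTS = [row[0] for row in RANGES]


def locate(ip):
    """Binary-search for the range containing ip; return its location or None."""
    value = ip_to_int(ip)
    i = bisect.bisect_right(STARTS, value) - 1
    if i >= 0 and value <= RANGES[i][1]:
        return RANGES[i][2]
    return None
```

In SQL the same idea is typically a range query on integer start/end columns, with an index on the start column.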
site admin | April 9th, 2009 | Comments are closed
These are my links for April 7th through April 9th:
site admin | April 7th, 2009 | Comments are closed
These are my links for April 3rd through April 7th:
- Agile Testing: Experiences deploying a large-scale infrastructure in Amazon EC2 – Practical guidance on using cloud computing at EC2. Expect failures, automate deployment, more.
- joshua’s blog: on url shorteners – Joshua Schachter (founder of del.icio.us) summary on the state of URL shorteners (tinyurl, bit.ly, etc), and issues with 3rd party redirects, link sharing through twitter, etc.
- Control Yourself » status.net coming soon – On status.net, plans for hosting laconi.ca sites, and federating microblogging status networks
- There must be some way out of here (Scripting News) – Comments on the rise of celebrity accounts on Twitter, increasing spam/noise, and alternative models for laconi.ca and status.net
- Stochastic Models of User-Contributory Web Sites – Tad Hogg, Kristina Lerman 31 Mar 2009 Abstract: We describe a general stochastic processes-based approach to modeling user-contributory web sites, where users create, rate and share content. These models describe aggregate measures of activity and how they arise from simple models of individual users. This approach provides a tractable method to understand user activity on the web site and how this activity depends on web site design choices, especially the choice of what information about other users' behaviors is shown to each user. We illustrate this modeling approach in the context of user-created content on the news rating site Digg.
Ho John Lee | March 12th, 2009 | 1 comment

Didn’t attend ETech this week, but thanks to a Twitter pointer from Gene Becker, I did take a few breaks to participate in a collaborative future forecasting experiment at the event, organized by Institute For the Future / Signtific Labs. The general idea is to enlist game players to offer Twitter-like short notes with outlier ideas regarding a scenario under discussion, in this case the consequences of inexpensive ($100) 1kg microsatellites (“CubeSats”) capable of high speed networking and remote sensing. The same game framework could be used for any scenario, though. Bonus points are awarded to “Super-Interesting” ideas and ideas that result in additional discussion, which helped me out on the scoreboard.
Gene (“ubik“) won a “Feynman” award on the first day, and I managed to end up with a high score at ETech, thus winning a lab coat to go with my “Genius” label.
Some of my favorite future forecast contributions from “What will you do when space is as cheap and accessible as the Web is today?” (slide summary here):
Jurisdiction-free data haven built with csats full of rad-hard flash memory, hbase-style distributed replication across multiple nodes. Subpoena-proof anonymizers, for better or worse. Alternative, universal internet currency evolves, outside any government’s central bank control. Following forced disclosure of banking client list, Swiss government recognizes anonymous cSat net IDs, followed by Cayman, Bermuda etc.
CSats deorbited in vacant areas of oceans as impulse input to passive sonar imaging. Oceanographers get great maps, submarines lose stealth. Depending on how accurately you can drop a CSat, you can effectively “ping” a region and listen to the return signal through existing arrays. This really messes with strategic deterrence since now subs are vulnerable to first strike. But CSat deorbit is cheap WMD for all. On the positive side, detailed acoustic propagation data leads to new insights on ocean dynamics – bathymetrics, thermoclines, currents, etc. A similar version of dropping CSats on land might yield useful seismic imaging. But these would all be surface impulse, not at depth.
Csat data networks circumvent the Great Firewall of China and other govt access controls, leading to broader/safer citizen engagement online
CSat operating interface is marketed as a toy, like Tamagotchi. Recharge, collect interesting data, avoid mean csats, team with friends. Organizations might post cash prize/rewards for things like locating missing ships, oil/trash dumping at sea, smokestack emissions, etc
Commodity traders are early adopters of CSat operator networks. Looking for crop yield data, mine production volumes, freight shipments etc. Among other things, CSat observations could give a more accurate estimate of “floating” oil parked in tankers as well as ongoing demand. Similarly, you’d get a decent idea of iron ore production by watching BHP’s railway in Australia, and the demand side in China, Korea etc. CSat data could improve the market visibility into supply/demand. But one might start creating Potemkin mining/farming operations etc… Sadly, credit derivative risk is not observable via CSat.
Ubiquitous, near real time satellite surveillance. No more privacy outdoors. But really good Google Maps. Ultra high resolution terrain maps of the world synthesized from multiple satellite passes/viewing aspects. Long term studies of effects of erosion, farming, development, earthquakes, flooding, drought, etc. Insurgents, militias, and terrorists get real time tactical data feeds, make use of homebrew UAVs, sensors, and in-field dispatch from afar. Turf wars among poppy and marijuana growers who now know where each other’s fields are. All vehicles – car, truck, rail, container, airplanes, etc – get a sky-facing ID plate. Maybe these should just be really big QR codes with an authoritative registry to foil car thieves from painting on bogus “plates”.
Now I need to figure out how to collect that lab coat.
site admin | March 3rd, 2009 | Comments are closed
These are my links for March 3rd from 05:48 to 12:10: