These are my links for May 30th through May 31st:
- Scaling Twitter: Making Twitter 10000 Percent Faster | High Scalability – Collection of links to presentations and interviews regarding Twitter's architecture, implementation plans, and performance issues, from spring 2009.
- The Last Psychiatrist: The Difference Between An Amateur, A Scientist, And A Genius – An amateur is full of wonder and speculation, tinkering towards the truth but suffering from a lack of knowledge and idleness; he's not even sure if someone else has already made these discoveries. "Is this a worthwhile pursuit?"
A scientist performs experiments to confirm or disprove a hypothesis, and in that way he grinds out the truth.
A genius has three abilities, which are actually the union of amateur and scientist: 1. to know the state of the art, what is known and what is not known. 2. To be able to think "out of the box". 3. To be disciplined enough to concentrate on the tedium of a formal investigation of his wondrous speculations.
- PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing – Research paper on a sort of "super healing brush" for manipulating digital images. It allows splicing together different sections of an image and automatically selects similar textures to make the seam transitions work better.
- Light Blue Touchpaper » Blog Archive » Attack of the Zombie Photos – Social networking and sharing sites have challenges implementing and managing access control policies at large scale, and content delivery networks add another wrinkle.
- Map of all Google data center locations | Royal Pingdom – Where in the world is your search being served from? An attempt to assemble a list of known Google data centers worldwide.
These are my links for May 14th through May 15th:
- Congratulations, Google staff: $210k in profit per head in 2008 | Royal Pingdom – Google had $209,624 in profit per employee in 2008, which beats all the other large tech companies we looked at, including big hitters like Microsoft ($194K), Apple ($151K), Intel ($64K) and IBM ($30K).
- Statistical Data Mining Tutorials – A nice collection of presentations reviewing topics in data mining and machine learning, e.g. "HillClimbing, Simulated Annealing and Genetic Algorithms. Some very useful algorithms, to be used only in case of emergency." These include classification algorithms such as decision trees, neural nets, Bayesian classifiers, Support Vector Machines and case-based (aka non-parametric) learning. They include regression algorithms such as multivariate polynomial regression, MARS, Locally Weighted Regression, GMDH and neural nets. And they include other data mining operations such as clustering (mixture models, k-means and hierarchical), Bayesian networks and Reinforcement Learning.
- Dare Obasanjo aka Carnage4Life – Why Twitter’s Engineers Hate the @replies feature – Looking at the infrastructure overhead required for Twitter's attempted change to @reply behavior.
- Scratch Helps Kids Get With the Program – Gadgetwise Blog – NYTimes.com – On my candidate list for 7th grade introductory programming and analysis. "Scratch, an M.I.T.-developed computer-programming language for children, is the focus of worldwide show-and-tell sessions this Saturday."
These are my links for April 13th through April 15th:
These are my links for April 12th from 17:02 to 19:13:
These are my links for April 11th through April 12th:
- Wordle – Beautiful Word Clouds – Wordle is a toy for generating “word clouds” from text that you provide. The clouds give greater prominence to words that appear more frequently in the source text. You can tweak your clouds with different fonts, layouts, and color schemes.
- The dark side of Dubai – Johann Hari, Commentators – The Independent – "Dubai was meant to be a Middle-Eastern Shangri-La, a glittering monument to Arab enterprise and western capitalism. But as hard times arrive in the city state that rose from the desert sands, an uglier story is emerging."
- Topless Robot – Hot Girls Have Lightsaber Strip-Fight for Your Viewing Pleasure – Star Wars CGI meets fake body spray ad
- Poll Result: Best VPN to leap China’s Great Firewall? – Thomas Crampton – Witopia: undisputed winner on quality of service and speed of surfing, though it is said to be relatively expensive at US$50 to US$60 per year. Hotspot Shield: bandwidth limits can be painful, forcing you to wait until the next month if you use it too much. Also listed: Ultrasurf and StrongVPN.
- InfoQ: Facebook: Science and the Social Graph – In this presentation filmed during QCon SF 2008 (November 2008), Aditya Agarwal discusses Facebook’s architecture, more exactly the software stack used, presenting the advantages and disadvantages of its major components: LAMP (PHP, MySQL), Memcache, Thrift, Scribe.
- The Running Man, Revisited § SEEDMAGAZINE.COM – a handful of scientists think that these ultra-marathoners are using their bodies just as our hominid forbears once did, a theory known as the endurance running hypothesis (ER). ER proponents believe that being able to run for extended lengths of time is an adapted trait, most likely for obtaining food, and was the catalyst that forced Homo erectus to evolve from its apelike ancestors.
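The Wordle entry above says the clouds give greater prominence to words that appear more frequently in the source text. As a rough illustration of that idea only, here is a minimal sketch that counts words and maps each count to a font size. This is not Wordle's actual algorithm: the function name `word_sizes`, the linear scaling, and the default size range are all assumptions made for the example.

```python
import re
from collections import Counter


def word_sizes(text, min_size=10, max_size=48):
    """Map each word in text to a font size that grows with its frequency.

    Hypothetical helper: linear scaling between min_size and max_size is an
    assumption for illustration, not how Wordle is known to work.
    """
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}

    lowest = min(counts.values())
    highest = max(counts.values())
    span = highest - lowest

    sizes = {}
    for word, n in counts.items():
        # Interpolate linearly; if every word is equally frequent,
        # everything gets min_size.
        scale = (n - lowest) / span if span else 0.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes


if __name__ == "__main__":
    sample = "cloud cloud cloud word word text"
    for word, size in sorted(word_sizes(sample).items(), key=lambda kv: -kv[1]):
        print(f"{word}: {size}pt")
```

A real word cloud would also drop common stop words and handle layout, fonts, and colors, which this sketch leaves out.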
These are my links for April 3rd through April 7th:
- Agile Testing: Experiences deploying a large-scale infrastructure in Amazon EC2 – Practical guidance on using cloud computing at EC2. Expect failures, automate deployment, more.
- joshua’s blog: on url shorteners – Summary by Joshua Schachter (founder of del.icio.us) of the state of URL shorteners (tinyurl, bit.ly, etc.) and issues with 3rd party redirects, link sharing through twitter, etc.
- Control Yourself » status.net coming soon – On status.net, plans for hosting laconi.ca sites, and federating microblogging status networks
- There must be some way out of here (Scripting News) – Comments on the rise of celebrity accounts on Twitter, increasing spam/noise, and alternative models for laconi.ca and status.net
- Stochastic Models of User-Contributory Web Sites – Tad Hogg, Kristina Lerman 31 Mar 2009 Abstract: We describe a general stochastic processes-based approach to modeling user-contributory web sites, where users create, rate and share content. These models describe aggregate measures of activity and how they arise from simple models of individual users. This approach provides a tractable method to understand user activity on the web site and how this activity depends on web site design choices, especially the choice of what information about other users' behaviors is shown to each user. We illustrate this modeling approach in the context of user-created content on the news rating site Digg.
These are my links for February 18th through February 19th:
- Single Google Query uses 1000 Machines in 0.2 seconds – Google Fellow Jeff Dean says from 1999-2009, while both search queries and processing power have gone up by a factor of 1000, latency has gone down from around 1000ms to 200ms. Crawler updates now take minutes compared to months in 1999. 1000 machines handle a single query, all in memory.
- Government 2.0: Tweeting the Talk, Walking the Walk « Adriel Hampton – List of twitter users in various government organizations.
- The Absurdly Artificial Divide Between Pure and Applied Research – Olivia Judson – NYTimes.com – I used to explain myself as an "applied research" guy, small "r", not big "R" pure research. Love theory and analysis but want to see it get used for something eventually.
- Amazon Web Services Developer Community : Load data into S3 via hard drives? – Amazon asks for feedback regarding the FedEx option for bulk data transfer. "We have heard a number of requests about sending hard drives to AWS to load into S3. If such a service would benefit your business, we’d like to learn more about your use case."
- Local Media in a Postmodern World, Part XCI, Advertising Loses Its Balance – On the shifts in supply and demand, buyers and sellers in advertising markets as media moves from 1-to-many to niche-oriented, many-to-many and sellers take control of their own online media and advertising campaigns
These are my links for February 16th through February 17th:
- Top 100 Network Security Tools – A large collection of security testing and hacking tools.
- FRONTLINE: inside the meltdown: watch the full program – "On Thursday, Sept. 18, 2008, the astonished leadership of the U.S. Congress was told in a private session by the chairman of the Federal Reserve that the American economy was in grave danger of a complete meltdown within a matter of days. "There was literally a pause in that room where the oxygen left," says Sen. Christopher Dodd"
- The Dark Matter of a Startup – "Every successful startup that I have seen has someone within their ranks that just kinda “does stuff.” No one really knows specifically what they do, but its vital to the success of the startup."
- Why I Hate Frameworks – "A hammer?" he asks. "Nobody really buys hammers anymore. They're kind of old fashioned…we started selling schematic diagrams for hammer factories, enabling our clients to build their own hammer factories, custom engineered to manufacture only the kinds of hammers that they would actually need."
- Mining The Thought Stream – Lots of comments around what Twitter is good for and how it will make money, revolving around real/near-time search, analytics, marketing, etc.
- Understanding Web Operations Culture – the Graph & Data Obsession … – Comparison of traffic at Flickr, Google, Twitter, last.fm during the Obama inauguration. "One of the most interesting parts of running a large website is watching the effects of unrelated events affecting user traffic in aggregate."
These are my links for February 15th through February 16th:
- Berkeley cloud report gets mixed reviews | The Wisdom of Clouds – CNET News – James Urquhart's commentary on the UCB paper: "The paper begins by setting a definition of Cloud Computing that will be considered controversial by many, as it is firmly in the "there is no cloud computing inside enterprise data centers" camp."
- Above the Clouds: Above the Clouds Released – UC Berkeley RAD Lab starts a new blog and publishes their take on the state of cloud computing.
- Forget Dunbar’s Number, Our Future Is in Scoble’s Number « I’m Not Actually a Geek – A look at changing interaction styles enabled by growing use of online social networks and applications. "If Dunbar’s Number is defined at 150 connections, perhaps we can term the looser connection of thousands as Scoble’s Number."
- What really happened at Ma.gnolia and lessons learned – Video podcast with Larry Halff describing how Ma.gnolia was implemented (Ruby on Rails) and its ongoing operation leading up to the failure of the (1/2 TB) MySQL database a few weeks ago.
- Infrastructure for Modern Web Sites « random($foo) – An overview of packages, services, and approaches for building web systems, circa January 2009. With assorted comments.
- Online Mind Mapping – MindMeister – Web-based, embeddable mind mapping software, sort of like MindJet, with wiki-style collaborative editing.
- Jean-Lou Dupont’s WEBlog: Cloud Computing Mind Map – A mind map of companies and projects in the cloud computing space.
An excellent guest lecture at Stanford’s EE380 sometime around February 2004 by Bob Colwell, chief architect of Intel’s IA32 microprocessors from 1992-2000. (90 minutes, Windows Media).
Topics include the history of CPUs, chip processes, power and heat dissipation, Itanium IA64 versus IA32, target markets and economies of scale, FDIV, CPUID, lifetimes of architectures, organizational politics, and learning to deal with a branded consumer market rather than pure technology customers.
Architects must take the long view
Architect’s job is to make valuable products
- not clever microarchitectures or instruction sets
- not “blue crystals” – useless differentiating features
- look for intersection between what technology will be able to do and what buyers will want, then sell that vision to rest of company
This presentation was made a couple of years ago, in the middle of the Pentium 4 era and the early days of Centrino, when Itanium was the path forward, Opteron was under the radar, and power dissipation and mobility were rising in perceived importance compared with higher clock speeds and CPU benchmarks alone.
via The Inquirer
Update 03-08-2006 23:03 PST: Here’s the abstract and speaker bio from Stanford EE380
I caught a couple of the sessions at the SD Forum Web Based Architecture event yesterday. Adam Denning (Senior Director of the Architecture Strategy Team, Microsoft) prefaced his talk by noting the grand titles that software architects often end up with, and the often fuzzy and open-ended nature of the territory.
I liked this take on the role of the software architect, from someone in the audience, who I think was Pat Helland from Amazon:
Q: What’s the job of a software architect?
A: “Make stuff up and sucker people into building it!”
Someone else in the audience observed that unlike physical world architects, software architects are often involved in actually implementing their designs.
Andre Stechert, Kevin Burton, Alok Bhanot, and Colin Johnson had a panel session after lunch. Andre thinks of software architecture as “the parts of the product that are hard to change”.
On rolling out new software: Kevin observes that apps with large user bases (looks at Alok at eBay) are generally penalized for deploying too early, because of a higher premium on stability and security, while startups are penalized for deploying too slowly, because their main issue is establishing a competitive position in the marketplace before burning through their startup resources. Alok says simplicity is good and overengineering is a risk; eBay has 7 levels of processes, and sometimes you don’t anticipate the success of your product. (He also avoids citing any specific war stories.)