Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-arno Jacobsen, Nick Puz, Daniel Weaver, Ramana Yerneni. PNUTS: Yahoo!'s Hosted Data Serving Platform. In PVLDB 2008.
PNUTS is a parallel, geographically distributed database that supports Yahoo!'s web applications. It chooses high-availability, low response times and scalability over strict consistency. The authors claim that since web applications typically manipulate one record-at-a-time, while different records may have activity in different geographic localities. To this end, they provide a per-record timeline consistency model, where all replicas of a given record apply all the updates made to it in the same order. This is achieved by having a per-record master to make sure that all updates are in order. Another key component of the design is the Pub-Sub Message System used for record-level asynchronous geographical replication which uses a (message broker based) guaranteed message delivery service rather than a persistent log. Lastly, PNUTS trusts the application designers to do the right thing in terms for their consistency needs by providing support for hashed and ordered table organizations, and more importantly API calls like read-any, read-latest, read-critical (version specific) etc.
Comments/Critiques:
PNUTS is an interesting design point in this space due to its simplicity by trusting the application designers to do the right thing. This somehow feels like an anti-Google system design philosophy in some sense (which always puts 'making the job of the application designer easy' as their top priority). However, it remains to be seen if application designers can write their applications by properly leveraging the trade-offs (e.g. where should they use read-latest vs. read-critical?). Secondly, it is unclear to me if their lack of support for referential integrity and complex queries (with joins etc.) is a major drawback or not for continuously evolving web applications. However, in all, I think PNUTS is definitely an interesting design and things like per-record timeline consistency would find its way in many future systems.
No comments:
Post a Comment