#+TITLE: Yet Another Page on Readings in Distributed Systems
#+DESCRIPTION: My own list of links, articles, papers, etc. that I enjoyed reading about distributed systems
#+TAGS: Distributed Systems
#+TAGS: Readings
#+DATE: 2015-05-08
#+UPDATED: 2015-05-12
#+SLUG: readings-in-distributed-systems
#+LINK: aphyr-post-network-reliable http://aphyr.com/posts/288-the-network-is-reliable
#+LINK: wiki-fallacies-of-distributed-computing http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing
#+LINK: wiki-cap-theorem http://en.wikipedia.org/wiki/CAP_theorem
#+LINK: lysefgg-cap http://learnyousomeerlang.com/distribunomicon#my-other-cap-is-a-theorem
#+LINK: cap-paper http://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdf
#+LINK: codehale-cant-partition-tolerance http://codahale.com/you-cant-sacrifice-partition-tolerance/
#+LINK: wiki-consistency-model http://en.wikipedia.org/wiki/Consistency_model
#+LINK: wiki-list-consistency-models http://en.wikipedia.org/wiki/Category:Consistency_models
#+LINK: wiki-linearizability http://en.wikipedia.org/wiki/Linearizability
#+LINK: bailis-linear-vs-serial http://www.bailis.org/blog/linearizability-versus-serializability/
#+LINK: wiki-eventual-consistency http://en.wikipedia.org/wiki/Eventual_consistency
#+LINK: wiki-paxos http://en.wikipedia.org/wiki/Paxos_(computer_science)
#+LINK: distributed-thoughts-understanding-paxos http://distributedthoughts.wordpress.com/2013/09/22/understanding-paxos-part-1/
#+LINK: willportnoy-lessons-paxos http://blog.willportnoy.com/2012/06/lessons-learned-from-paxos.html
#+LINK: wiki-vector-clock http://en.wikipedia.org/wiki/Vector_clock
#+LINK: wiki-split-brain http://en.wikipedia.org/wiki/Split-brain_(computing)
#+LINK: wiki-network-partitions http://en.wikipedia.org/wiki/Network_partitioning
#+LINK: cemerick-ds-end-api https://speakerdeck.com/cemerick/distributed-systems-and-the-end-of-the-api
#+LINK: linkedin-blog-the-log http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
#+LINK: aphyr-jepsen-tag http://aphyr.com/tags/jepsen
#+LINK: aphyr-jepsen-call-me-maybe http://aphyr.com/posts/281-call-me-maybe
#+LINK: snookles-tcp-incast http://www.snookles.com/slf-blog/2012/01/05/tcp-incast-what-is-it/
#+LINK: growse-hdfs-partition-tolerance https://www.growse.com/2014/07/18/partition-tolerance-and-hadoop-part-1-hdfs/
#+LINK: aphyr-jepsen-zookeeper http://aphyr.com/posts/291-call-me-maybe-zookeeper
#+LINK: aphyr-jepsen-kafka http://aphyr.com/posts/293-call-me-maybe-kafka
#+LINK: aphyr-jepsen-cassandra http://aphyr.com/posts/294-call-me-maybe-cassandra
#+LINK: wiki-acid http://en.wikipedia.org/wiki/ACID
#+LINK: aphyr-jepsen-postgres http://aphyr.com/posts/282-call-me-maybe-postgres
#+LINK: ferd-lessons-large-scale http://ferd.ca/lessons-learned-while-working-on-large-scale-server-software.html
#+LINK: ferd-about-guarantees http://ferd.ca/it-s-about-the-guarantees.html
#+LINK: ferd-queues-overload http://ferd.ca/queues-don-t-fix-overload.html
#+LINK: ferd-erlang-anger http://www.erlang-in-anger.com/
#+LINK: aphyr-async-replication http://aphyr.com/posts/287-asynchronous-replication-with-failover
#+LINK: aphyr-strong-consistency-models http://aphyr.com/posts/313-strong-consistency-models

#+BEGIN_QUOTE
"Distributed systems are hard." -Everyone.
#+END_QUOTE

#+BEGIN_PREVIEW
This page collects general discussion of distributed systems, references to general overviews, and the like.

Distributed systems are difficult, and even the well-established ones aren't [[aphyr-post-network-reliable][bulletproof]]. How can we make this better? As sysadmins? As developers?

First, we can try to understand some of the issues involved in designing and implementing distributed systems. Then we can throw all that out and figure out what /really/ happens to distributed systems.
#+END_PREVIEW

** Recommended Reading

*** General

- [[wiki-fallacies-of-distributed-computing][Fallacies of Distributed Computing]]
- [[wiki-cap-theorem][CAP Theorem]]
- [[lysefgg-cap][LYSEFGG: Distribunomicon: My other cap is a theorem]]
  - For a more entertaining introduction to CAP, Hebert's /Learn You Some Erlang for Great Good/ has a really good subsection on the topic that includes the zombie apocalypse and an introduction to how a blend of AP and CP systems can be achieved.
- [[cap-paper][CAP Theorem Proof]]
- [[codehale-cant-partition-tolerance][You can't sacrifice partition tolerance]]
- [[wiki-consistency-model][Consistency Model]]
- [[wiki-list-consistency-models][List of Consistency Models]]
- [[wiki-linearizability][Linearizability]]
- [[bailis-linear-vs-serial][Linearizability versus Serializability]]
- [[wiki-eventual-consistency][Eventual Consistency]]
- [[wiki-paxos][Paxos]]
- [[distributed-thoughts-understanding-paxos][Understanding Paxos (Part 1)]]
- [[willportnoy-lessons-paxos][Lessons learned from implementing Paxos (2012)]]
- [[wiki-vector-clock][Vector Clock]]
- [[wiki-split-brain][Split-Brain]]
- [[wiki-network-partitions][Network Partitions]]
- [[cemerick-ds-end-api][Distributed Systems and the End of the API]]
- [[linkedin-blog-the-log][The Log]]: What every software engineer should know about real-time data's unifying abstraction

The [[aphyr-jepsen-tag][Jepsen]] "Call me maybe" articles are really good, well-written essays on topics and technologies related to distributed systems.

Introductory post to the "Call me maybe" series:

- [[aphyr-jepsen-call-me-maybe][Call me maybe]]

Here are some personal recommendations:

- [[aphyr-post-network-reliable][The Network is Reliable]]
- [[aphyr-strong-consistency-models][Strong Consistency Models]]
- [[aphyr-async-replication][Asynchronous Replication with Failover]]

Really, anything from Fred Hebert is good. In particular, the first and last chapters of [[ferd-erlang-anger][Erlang in Anger]] include longer essays based on his blog posts.

- [[ferd-queues-overload][Queues Don't Fix Overload]]
- [[ferd-about-guarantees][It's About the Guarantees]]
- [[ferd-lessons-large-scale][Lessons Learned while Working on Large-Scale Server Software]]

*** General Networking

- [[snookles-tcp-incast][TCP incast]]

*** Hadoop ecosystem

This link is specific to HDFS and describes a rather limited experiment, but it is nonetheless a good read for understanding the partition issues that can arise in Hadoop systems:

- [[growse-hdfs-partition-tolerance][Partition Tolerance in HDFS]]

More links from the [[aphyr-jepsen-tag][Jepsen essays]]:

- [[aphyr-jepsen-zookeeper][Call me maybe: Zookeeper]]
- [[aphyr-jepsen-kafka][Call me maybe: Kafka]]
- [[aphyr-jepsen-cassandra][Call me maybe: Cassandra]]

*** Databases

- [[wiki-acid][Wikipedia ACID]]
- [[aphyr-jepsen-postgres][Call me maybe: Postgres]]