<h1 id="replication">Replication<a aria-hidden="true" class="anchor-heading icon-link" href="#replication"></a></h1>
<h2 id="what-is-it">What is it?<a aria-hidden="true" class="anchor-heading icon-link" href="#what-is-it"></a></h2>
<p>Replication means keeping a copy of the same data on multiple machines that are connected via a network.</p>
<p>Each node that stores a copy of the database is called a replica. </p>
<p>If you look at two database nodes at the same moment in time, you’re likely to see different data on them, because write requests arrive at different nodes at different times. The delay before a write becomes visible on a given replica is called replication lag.</p>
<ul>
<li>These inconsistencies occur no matter what <a href="/notes/wbs6c9snnonnyzm8vk2t4b9">replication strategy</a> the database uses (single-leader, multi-leader, or leaderless replication).</li>
<li>Most replicated databases provide at least eventual consistency
<ul>
<li>In other words, the inconsistency among replicas is temporary and eventually resolves itself.</li>
</ul>
</li>
</ul>
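<p>The temporary disagreement between replicas, and its eventual resolution, can be sketched with a toy in-memory model. This is only an illustration: <code>Replica</code>, <code>write</code>, and <code>drain</code> are hypothetical names, not the API of any real database.</p>

```python
from collections import deque


class Replica:
    """A hypothetical in-memory replica that applies writes from a queue."""

    def __init__(self):
        self.data = {}
        self.pending = deque()  # writes received but not yet applied

    def apply_one(self):
        """Apply the oldest pending write, if there is one."""
        if self.pending:
            key, value = self.pending.popleft()
            self.data[key] = value


def write(replicas, key, value):
    """Send a write to every replica; each one applies it at its own pace."""
    for replica in replicas:
        replica.pending.append((key, value))


def drain(replicas):
    """Let every replica catch up on all of its pending writes."""
    for replica in replicas:
        while replica.pending:
            replica.apply_one()


a, b = Replica(), Replica()
write([a, b], "x", 1)
a.apply_one()                 # replica a has applied the write, b has not yet
stale = a.data != b.data      # the two replicas disagree at this moment
drain([a, b])
converged = a.data == b.data  # with no new writes, the replicas converge
```

<p>Between <code>a.apply_one()</code> and <code>drain()</code>, a reader hitting replica <code>b</code> would not see the write. That window is the replication lag in this model.</p>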
<h2 id="why-do-we-do-it">Why do we do it?<a aria-hidden="true" class="anchor-heading icon-link" href="#why-do-we-do-it"></a></h2>
<ul>
<li>Latency: to keep data geographically close to your users (and thus reduce latency)</li>
<li>Availability: to allow the system to continue working even if some nodes have failed</li>
<li>Scalability: to <a href="/notes/HZSth7yP1s7aPaPPZRJPm#horizontal-scaling-aka-scaling-out">scale out</a> the number of machines that can serve read queries (and thus increase read throughput)</li>
<li>Network fault-tolerance: to keep the application working when there is a network interruption</li>
</ul>
<p>In a world where our data didn't change over time, replication would be simple: we would just copy the data to each node and be done.</p>
<ul>
<li>What makes replication hard is figuring out how to handle changes to replicated data.</li>
</ul>
<p>Normally, replication is quite fast: most database systems apply changes to followers in less than a second.</p>
<ul>
<li>However, there are circumstances when followers might fall behind the leader by several minutes or more; for example:
<ul>
<li>if a follower is recovering from a failure</li>
<li>if the system is operating near maximum capacity</li>
<li>if there are network problems between the nodes</li>
</ul>
</li>
</ul>
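<p>One way to picture a follower falling behind is to count how many entries of the leader's log the follower has not applied yet. The sketch below assumes a simple append-only log held in memory; <code>Node</code>, <code>replication_lag</code>, and <code>catch_up</code> are hypothetical names for illustration, and real systems usually report lag in bytes or seconds rather than entry counts.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical node holding an ordered log of applied writes."""
    log: list = field(default_factory=list)


def replication_lag(leader: Node, follower: Node) -> int:
    """Number of leader log entries the follower has not applied yet."""
    return len(leader.log) - len(follower.log)


def catch_up(leader: Node, follower: Node, batch_size: int) -> None:
    """Copy up to batch_size missing entries from the leader's log."""
    start = len(follower.log)
    follower.log.extend(leader.log[start:start + batch_size])


leader, follower = Node(), Node()
leader.log.extend(["w1", "w2", "w3", "w4", "w5"])

# A slow or recovering follower only manages a small batch at first.
catch_up(leader, follower, batch_size=2)
lag_after_first_batch = replication_lag(leader, follower)

# Once it has spare capacity again, it applies the rest.
catch_up(leader, follower, batch_size=10)
lag_after_second_batch = replication_lag(leader, follower)
```

<p>In this model the lag is 3 entries after the first batch and 0 after the second. A follower that is recovering, overloaded, or cut off by the network corresponds to a small or zero <code>batch_size</code> while the leader keeps appending.</p>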
<p>There are tradeoffs to consider when implementing replication (both are often configuration options in databases):</p>
<ul>
<li>synchronous or asynchronous replication</li>
<li>how to handle failed replicas</li>
</ul>
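<p>The difference between synchronous and asynchronous replication comes down to when the leader acknowledges a write. A minimal sketch, assuming a single leader and in-memory followers (the <code>Leader</code> and <code>Follower</code> classes and the <code>flush</code> method are hypothetical, not taken from any particular database):</p>

```python
class Follower:
    """A hypothetical follower that stores whatever the leader sends it."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Leader:
    """A hypothetical leader that replicates synchronously or asynchronously."""

    def __init__(self, followers, synchronous):
        self.data = {}
        self.followers = followers
        self.synchronous = synchronous
        self.backlog = []  # writes not yet shipped (asynchronous mode only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Acknowledge only after every follower has applied the write.
            for follower in self.followers:
                follower.apply(key, value)
        else:
            # Acknowledge immediately; followers receive the write later.
            self.backlog.append((key, value))
        return "ok"

    def flush(self):
        """Ship all backlogged writes to the followers."""
        for key, value in self.backlog:
            for follower in self.followers:
                follower.apply(key, value)
        self.backlog.clear()


sync_follower = Follower()
Leader([sync_follower], synchronous=True).write("x", 1)
sync_view = dict(sync_follower.data)            # follower already has the write

async_follower = Follower()
async_leader = Leader([async_follower], synchronous=False)
async_leader.write("x", 1)
async_view_before = dict(async_follower.data)   # follower is still empty
async_leader.flush()
async_view_after = dict(async_follower.data)    # follower has caught up
```

<p>The synchronous path guarantees the follower is up to date when the write returns, at the cost of blocking on every follower. The asynchronous path returns immediately, but any write still in the backlog would be lost if the leader failed before <code>flush()</code> ran.</p>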
<h2 id="replication-strategies">Replication strategies<a aria-hidden="true" class="anchor-heading icon-link" href="#replication-strategies"></a></h2>
<p><a href="/notes/wbs6c9snnonnyzm8vk2t4b9">Strategies</a></p>
