313. Super Ugly Number
313. Super Ugly Number
Description
Solution
Use priority_queue to find the smallest number
Be careful about integer overflow.
Code
12345678910111213141516class Solution {public: int nthSuperUglyNumber(int n, vector<int>& primes) { priority_queue<long, vector<long>, greater<>> pq; long top; pq.push(1); while(n--){ top = pq.top(); while(!pq.empty() && pq.top() == top) pq.pop( ...
727. Minimum Window Subsequence
727. Minimum Window Subsequence
Description
Solution
There are two solutions, one is original DP(very common solution that must think of it when encounter a string problem), another is Finite State Machine solution.
DP
123456789DP[i][j] = the minimum subsequence length k//e.g, s2[0 : j] is a subsequence of s1[i - k + 1: i]if s1[i] == s2[j] dp[i][j] = dp[i-1][j-1] + 1else dp[i][j] = dp[i-1][j] + 1return min{dp[i][N]} for i= 1,2,3..M
Finite State Machine
1234567Define a array ne ...
6.824 Lab4
Lab4
Overview
Finally, we make it to the final lab, no doubt that it’s the most challenging work among 4 labs. In this part, we gonna implement two main K-V storage system, one is shardmaster and one is shardkv. The specific service of them will be discussed in FRAMEWORK part, let’s talk how much workload this lab may take first.
Compared with former labs, we have no paper to refer to and no framework to follow. This lab emphasizes more on real producing environment that we need balance loa ...
6.824 CT lecture note
Certificate Transparency
Introduce
First review the vulnerability of HTTP and an example of Man in the Middle Attack.
HTTP
HTTP is a request response protocol to communicate asynchronously between client and server.
For websites and pages the browser acts as a client and a web-server like Apache or IIS acts as server. The server hosts the files (like html , audio , video files etc) and returns are responds to client requests with the data. Depending on the request a response contains the ...
6.824 CRAQ lecture note(Pending for updated)
FaRM (To be updated...)
Introduce
Microsoft’s main memory distributed computing platform, FaRM can provide distributed transactions with serializability, high performance, durability, and high availability using RDMA and a new, inexpensive approach to providing non-volatile DRAM.
NVRAM
Strategy to become non-volatile:
Using battery as back-up power, once the electric power fails, the system goes with battery and do all context saving work then shut down
Note: This scheme is just helpful when ...
6.824 ZooKeeper lecture note
ZooKeeper(To be continuely updated)
Introduction
ZooKeeper is a very popular service for coordinating processes of distributed applications, it provides a simple and high performance kernel for building more complex coordination primitives. We users can deploy ZK in many distributed applications like, services registration and discovery, cluster management as a coordinator.
ZooKeeper has the following features:
Sequential Consistence
All client see the same data that one client’s transaction ...
6.824 Lab4 Conclusion
Conclusion
In this part, I will illustrate more detailed implementation of ShardKV based on the expectation that you have read the lab4 and know the basic idea of the framework. And I will describe following the sequence of a message sent by Clerk and to each part of Raft Group.
Implementation
ShardKV
As we described previously, we should maintain a Shard Map, config & lastConfig information in each Raft Group to complete 3 main functions.
12345678910111213type ShardKV struct{ ...
6.824 Conclusion
Conclusion
Lab3 is a tough journey, especially when you finish the whole framework, then problem just starts. I took roughly two days to complete my draft and spent 2-3 times time to debug.
In order to remind me of this hard time, spending hours reading 20000 lines log, suspecting each line I wrote and make you who are reading this avoid bugs or debug smoothly, I will write problems as explicit as I can.
To make my design easy to understand, I will start with design graph.
Structural Design
...
6.824 Lab3 KVRaft
Lab3
Overview
In this Lab, we are gonna build a Key/Value storage system based on raft, AKA KVRaft. In part A, just maintain a in memory Hashmap of Key -> Value and In part B, we need to deal with the growing Hashmap and use Snapshot to discard old log entries. Some details may be discussed in each section.
Lab3A
To implement a KVRaft, we use clerk to denote a interface the client calls to use K/V storage, server to denote a peer among K/V storage system which is based on Raft protocol ...
6.824 Lab2 Raft
Raft
Paper Topics
Start with a Raft Protocol paper, In Search of an Understandable Consensus Alogorithm , Also you can find Chinese version here
Raft Basics
Contain 3 kinds of servers : Follower, Candidate, Leader
The Leader will be elected for each <**term**>, and term is the basic unit served as a Timer to denote the up-to-date log’s time stamp
Leader Election
Using the election timeout mechanism to make Follower begin a new round elction
To avoid split votes, Raft set ...
6.824 Lab1 MapReduce
MapReduce
Summary
Topics about Distributed Systems
Infrastructure
Storage
Communication
Computation
Performance
Scalability
Fault Tolerance
Consistency
strong consistency
weak consistency
MapReduce
About MapReduce
TL;DR – Google’s engineers build a model that implements the split and dispatch process to make large scale computation tasks easy programmed. (Just need to write map() and reduce() function)
You can find the detailed model description in MapReduce: Simplifi ...
6.824 Bitcoin lecture note
Bitcoin
As mentioned in Satoshi Nakamoto’s paper that bitcoin is aimed to prevent double-spending as well as reduce the cost of third party involvement.
Bitcoin has three features that makes it Epoch-making
Decentralization
Using Peer-to-Peer Technology
Low-cost transaction
Brief Intro
Bitcoin is a distributed ledger which implement decentralization. Bitcoin ‘s design should solve Byzantine Generals Problem because it is running through public Internet. Bitcoin systems reply on Proof of Wo ...
6.824 Spark lecture note
Spark
Introduce
Spark is a successor of MapReduce to process distributed big-data computing tasks. Frameworks like MapReduce, Dryad, and Spark help data scientists to focus on their business rather than wasting time on designing the distributed tasks and fault-tolerance.
There are some constraints in previous frameworks that MapReduce lacks abstractions for leveraging distributed memory so makes it inefficient for those that reuse intermediate results across multiple computations and lacks h ...
6.824 COPS lecture note
COPS
Introduce
Still the website Front-End read from DB storage cluster model. We are gonna explore another possibility as local read and local write to speed up transactions. How to achieve stronger Consistency in terms of CAP Theorem.
Consider [Spanner](./13.Spanner(Strong Consistency RW).md) and [Memcache](./16.Memcache(Cache Consistency).md)’s scheme
Spanner
Linearizability
For R/W transaction, Use PAXOS and 2PC to write to remote replication. (Wait for quorum’s acknowledge)
For R/O tr ...
6.824 Memcache lecture note
Memcahce at Facebook
Intro
In a popular webserver scenario, we have web application that clients send request (especially most reads and a few writes) to Data Base, and as we both know things get worse when one peer in the system suffer a throughput bottleneck. To achieve better performance and also get stronger consistency.
Single Web Server(eg, running on Apache Tomcat) + Single DataBase(eg, MySQL/ Oracle)
↓
Muti-Stateless Web Server + Single DB
↓
Mutl-Stateless W ...