Meeting notes for Isabella, Oliver, Joy and me
Meeting Date/Time: 07/25/2021
Speaker: XiuYi
Attendees: Oliver, Isabella, Joy
Meeting Notes:
- Scalability
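- Load-balancing algorithms for servers
- Distributed cache scenario: 30,000 images, 3 cache servers
- Traditional hash-modulo algorithm: when the server count changes, the hash slot of the cached data changes, so nearly every lookup is a cache miss
- Consistent hashing: hash the server IDs first, then hash the file/request name; only a small part of the cache is invalidated
- Data skew / a server going down: requests get spread out
- Code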
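The code from the session is not recorded in these notes. As a stand-in, here is a minimal sketch of a consistent-hash ring next to plain hash-modulo, using the 30,000-image / 3-server scenario above. The class name, the server names, the use of MD5, and the 100 virtual nodes per server are all assumptions for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (MD5 is used only for its even spread)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Hash ring with virtual nodes; names and defaults are illustrative."""

    def __init__(self, servers, vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.vnodes):
            point = _hash(f"{server}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove(self, server):
        for i in range(self.vnodes):
            point = _hash(f"{server}#{i}")
            self._points.remove(point)
            del self._owner[point]

    def lookup(self, key):
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


# Scale from 3 to 4 cache servers and count how many keys change owner.
keys = [f"img{i}.jpg" for i in range(30_000)]
old = ConsistentHashRing(["cache-0", "cache-1", "cache-2"])
new = ConsistentHashRing(["cache-0", "cache-1", "cache-2", "cache-3"])
ring_moved = sum(old.lookup(k) != new.lookup(k) for k in keys) / len(keys)
mod_moved = sum(_hash(k) % 3 != _hash(k) % 4 for k in keys) / len(keys)
print(f"modulo: {mod_moved:.0%} of keys moved, ring: {ring_moved:.0%}")
```

With the ring, every key that moves lands on the newly added server. The virtual nodes are what keeps the load even and spreads a failed server's keys across the remaining ones (the data-skew point above).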
Meeting Date/Time: 07/27/2021
Speaker: Oliver, Joy
Attendees: XiuYi, Isabella
Meeting Notes:
Oliver: Introduction to caching, cache update strategies (part 1)
Joy: DDIA Chapter 1
- Reliability
- Scalability
- Maintainability
Meeting Date/Time: 07/30/2021
Speaker: Isabella, XiuYi
Attendees: Joy, Oliver
Meeting Notes:
push/pull & Polling
- short polling/long polling/websocket
- When to use push vs. pull
Inverted Index
- The keyword is the key; the places where it appears are the value
- Lucene, Solr/Elasticsearch
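A small sketch of the inverted-index idea above: each keyword maps to the set of documents it appears in. The function names and the sample documents are made up for illustration, and the tokenizer is a plain regex split, far simpler than what Lucene-based engines do.

```python
import re
from collections import defaultdict


def build_inverted_index(docs):
    """docs: {doc_id: text}. Returns {word: set of doc_ids containing it}."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every word of the query (AND)."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


docs = {
    1: "Consistent hashing for caches",
    2: "Hashing and Bloom filters",
    3: "Cache update strategies",
}
index = build_inverted_index(docs)
hits = search(index, "consistent hashing")
```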
Meeting Date/Time: 07/31/2021
Speaker: XiuYi, Oliver
Attendees: Isabella, Joy
Meeting Notes:
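How to guarantee message ordering
- Distributed, clustered deployment
- Global ID generator
- Offline push
Introduction to caching, cache update strategies (part 2)
- Weak/Eventual/Strong Consistency
- data partition - tweet id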
- fail over / replication
- cache aside(write around) / write back / write through / refresh ahead
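For the global ID generator item under message ordering above, one common approach is a Snowflake-style 64-bit ID (timestamp, worker id, per-millisecond sequence), which sorts roughly by creation time. The sketch below assumes the usual 41/10/12 bit split; the class name and the custom epoch are arbitrary choices for illustration.

```python
import threading
import time


class SnowflakeLikeGenerator:
    """Time-ordered 64-bit IDs: 41-bit ms timestamp | 10-bit worker | 12-bit sequence."""

    EPOCH_MS = 1_600_000_000_000  # arbitrary custom epoch (assumption)
    WORKER_BITS = 10
    SEQ_BITS = 12

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock
        self._last_ms = -1
        self._seq = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._seq = (self._seq + 1) & ((1 << self.SEQ_BITS) - 1)
                if self._seq == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._seq = 0
            self._last_ms = now
            timestamp = now - self.EPOCH_MS
            return (
                (timestamp << (self.WORKER_BITS + self.SEQ_BITS))
                | (self.worker_id << self.SEQ_BITS)
                | self._seq
            )


gen = SnowflakeLikeGenerator(worker_id=7)
ids = [gen.next_id() for _ in range(5000)]
```

IDs from one worker are strictly increasing; across workers they are only ordered to within clock skew, so this gives approximate rather than strict global ordering.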
Meeting Date/Time: 08/05/2021
Speaker: Isabella, Xiuyi
Attendees: Oliver, Joy
Meeting Notes:
Key points of API design
- RPC
- SOAP
- REST
- GraphQL
BQ prep, Amazon Leadership Principles
- Self-introduction
- 1) Customer Obsession; 2) Ownership; 3) Invent and Simplify; 4) Are Right, A Lot
Meeting Date/Time: 08/08/2021
Speaker: Oliver, Joy
Attendees: Isabella, Xiuyi
Meeting Notes: 1)Oli - SDP - Database
- #1: Relational database management system (RDBMS)
- master-master, master-slave, federation, sharding
- #2: NoSQL
- Key-Value Store, Document Store, Wide-Column Store, Graph Database
- #3: SQL or NoSQL
- SDP
2)Joy - DDIA Chapter II
- SQL/NoSQL
- Relational model vs. document model
Meeting Date/Time: 08/12/2021
Speaker: Joy, Isabella
Attendees: Oliver, Xiuyi
Meeting Notes:
- Design Chat Service (Grokking)
- polling - WS
- consistency
- store - column based
- partitioning
2)MicroService
- Introduction
Meeting Date/Time: 08/15/2021
Speaker: Oliver, Xiuyi
Attendees: Joy, Isabella
Meeting Notes:
- Chat follow-up
- polling and websocket
- push vs pull
- Flow diagram
- Database and table sharding
- Strategies
- JOIN operations
Meeting Date/Time: 08/19/2021
Speaker: Joy, Xiuyi
Attendees: Isabella, Oliver
Meeting Notes: 1) * 2) *
Meeting Date/Time: 08/22/2021
Speaker: Oliver, Isabella
Attendees: Joy, Xiuyi
Meeting Notes:
- Algorithm templates
- Binary Search
- Two Pointer
- BFS
- BT/BST
- DP
- Heap
- Union Find
- Trie
Meeting Date/Time: 08/26/2021
Speaker: Joy, Xiuyi
Attendees: Oliver, Isabella
Meeting Notes:
- From grokking
- Load Balancer
- caching
- Data Partitioning
- SQL/NoSQL
- CAP
- Long Polling, WebSocket, Server-Sent Events (SSE)
- Load balancer essentials
Meeting Date/Time: 08/29/2021
Speaker: Isabella, Oliver
Attendees: Joy, Xiuyi
Meeting Notes: 1)
- Can a session ID be produced by a global ID generator such as Redis?
Meeting Date/Time: 09/02/2021
Speaker: Joy, Oliver
Attendees: Isabella, Xiuyi
Meeting Notes: 1)
- cache storage
- tech blog
Meeting Date/Time: 09/05/2021
Speaker: Xiuyi, Isabella
Attendees: Joy, Oliver
Meeting Notes: 1) * 2) *
Meeting Date/Time: 09/10/2021
Speaker: Joy, Xiuyi
Attendees: Oliver, Isabella
Meeting Notes:
- Review topics
- Consistent hashing
- Inverted Index
- Cookie, Session, Token
- Load Balancer
Meeting Date/Time: 09/12/2021
Speaker: Isabella
Attendees: Joy, Xiuyi
Meeting Notes:
- Review topics
- Learn and Be Curious
- Earn Trust
Meeting Date/Time: 09/17/2021
Speaker: Joy, Oliver
Attendees: Xiuyi, Isabella
Meeting Notes:
- Tiny URL
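- Different short URLs placed on different websites, to track traffic per site
- One long URL maps to multiple short URLs (column-based stores such as Cassandra)
- Both the server and the user can set an expiration time; short URLs must not be guessable
- hash function
- Key Generation Service produces the letters of the short URL (6 characters, 8 characters, etc.)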
- Typeahead
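For the Tiny URL notes above, a hedged sketch of how short-URL letters might be produced: base62-encoding a numeric ID, or drawing a random key when the URL must not be guessable. The alphabet order and the function names are assumptions. With 62 symbols, 6 characters give 62^6 (about 56.8 billion) possible keys.

```python
import secrets
import string

# 0-9, a-z, A-Z: 62 URL-safe symbols (the ordering is an arbitrary choice).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n: int) -> str:
    """Encode a non-negative integer (e.g. a database ID) as base62 text."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))


def decode_base62(s: str) -> int:
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n


def random_key(length: int = 6) -> str:
    """Unguessable key; a key generation service would also check for collisions."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


short = encode_base62(123_456_789)
```

Sequential IDs encoded this way are compact but predictable, so the random variant (or a pre-generated key pool) fits the "must not be guessable" requirement better.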
Meeting Date/Time: 09/19/2021
Speaker: Xiuyi, Isabella
Attendees: Oliver, Joy
Meeting Notes: 1) * 2) *
Meeting Date/Time: 09/28/2021
Speaker: Joy, Isabella
Attendees: Oliver, Xiuyi
Meeting Notes:
- Introduction to graph theory
- Queue
- Priority Queue
- Union Find
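As a reference for the Union Find item above, a minimal template with path halving and union by size. This is a generic textbook version, not necessarily the one presented in the session.

```python
class UnionFind:
    """Disjoint-set structure over the integers 0..n-1."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.count = n  # number of connected components

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the sets of a and b; returns False if they were already joined."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller tree under the larger one
        self.size[ra] += self.size[rb]
        self.count -= 1
        return True


uf = UnionFind(5)
uf.union(0, 1)
uf.union(3, 4)
```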
Meeting Date/Time: 09/30/2021
Speaker: Xiuyi
Attendees: Oliver,Joy, Isabella and many others
Meeting Notes:
- Design a KV store
Meeting Date/Time: 08/22/2021
Speaker: Joy, Isabella, Oliver
Attendees: Xiu Yi
Meeting Notes: 1)
- Redundancy (server) and Replication(data)
- Bloom Filters - if any of the bits is not 1, the item is definitely absent; if all are 1, it is not guaranteed to be present
- Quorum
- Leader and Follower
- Heartbeat
- Checksum - verifies TCP/UDP segments / checks whether data replicated between nodes is complete
- Design Rate Limiter
- Token bucket
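A sketch of the token bucket mentioned for the rate limiter: tokens refill at a fixed rate up to a capacity, and each request spends one. The class name, parameters, and injectable clock are illustrative assumptions; a production limiter would usually keep this state in a shared store.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, sustained rate of `refill_rate` per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self, cost=1.0):
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill lazily, based on the time elapsed since the previous call.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# Deterministic demo with a fake clock: capacity 2, one token per second.
fake_now = [0.0]
bucket = TokenBucket(capacity=2, refill_rate=1.0, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(3)]
fake_now[0] = 1.0
after_one_second = [bucket.allow(), bucket.allow()]
```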
Meeting Date/Time: 10/07/2021
Speaker: Isabella, Oliver
Attendees: Joy, Xiuyi
Meeting Notes: 1) * * 2) *
Meeting Date/Time: 10/12/2021
Speaker: Xiuyi
Attendees: Isabella, Oliver
Meeting Notes: 1)
- Learn and Be Curious
- Hire and Develop The Best
- Insist on the Highest Standards
- Think Big
Meeting Date/Time: 10/14/2021
Speaker: Joy,Isabella
Attendees: Xiuyi
Meeting Notes: 1)
- paste table + user table
- clean service
- peer to peer / master slave
- Cache (eviction policies; cache/DB consistency: update the DB first and then delete the cache entry, which leaves the least dirty data)
Meeting Date/Time: 10/19/2021
Speaker: Xiuyi, Oliver
Attendees: Isabella
Meeting Notes: 1) * * 2) *
Meeting Date/Time: 10/21/2021
Speaker: Joy, Xiuyi
Attendees: Oliver, Isabella
Meeting Notes: 1) * * 2) *
Meeting Date/Time: 10/24/2021
Speaker: Isabella, Oliver
Attendees: Joy, Xiuyi
Meeting Notes: 1) * * 2) *
Meeting Date/Time: 10/31/2021
Speaker: Isabella, Xiuyi
Attendees:
Meeting Notes: 1) * * 2) *
Meeting Date/Time: 11/07/2021
Speaker: Joy, Isabella, Xiuyi
Attendees:
Meeting Notes: 1)
- Sharding of tweets - Snowflake IDs
- Cache
- Twitter Search
- MapReduce
- Count-Min Sketch
- Reentrancy issues
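For the Count-Min Sketch item above, a compact sketch of the structure: several hash rows of counters, where an estimate is the minimum across rows, so it can overcount but never undercount. The width, depth, and the SHA-256-based row hashes are assumptions chosen for readability rather than speed.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter using `depth` rows of `width` counters."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _columns(self, item):
        # One independent-looking hash per row, derived by prefixing the row number.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row, col in self._columns(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Collisions only ever inflate a counter, so the minimum is the best guess.
        return min(self.table[row][col] for row, col in self._columns(item))


cms = CountMinSketch()
for word in ["cat"] * 5 + ["dog"] * 2:
    cms.add(word)
```

This fits heavy-hitter style questions (for example trending terms in the Twitter search discussion) where a small overestimate is acceptable in exchange for fixed memory.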
Meeting Date/Time: 11/14/2021
Speaker: Xiuyi
Attendees: Isabella, Joy, Oliver
Meeting Notes: 1) Amazon price tracker
1k products
requirements:
- grab the prices and store them somewhere
- show this data to customers
Amazon prices -> services to grab the prices -> our own stores -> services to fetch data from our stores -> shown to customers as close to real time as they need
- pay Amazon to use its APIs to get data; as an alternative, we may be able to use crawlers
- REST/GraphQL APIs to get data in our gateway layer, as JSON objects
- middleware service to clean/aggregate/cluster data (pre-compute)
- store
schema:
Product: productID, categoryID, price, timestamp;
Price: ID, productID, Array{{price1, time1},{price2, time2},{price3, time3}...}
Category: ID, categoryNames....
SELECT price FROM Product WHERE productID = "123"
- services to grab data and do some computing: pick some products/categories and save them in the cache to speed up reads for new customers
- show to clients
authorization -> free users can access limited resources (e.g., overall category price changes in the past three months) vs. VIP users (more access to all tracking, such as all historical price changes for all products)
Meeting Date/Time: 11/18/2021
Speaker: Isabella, Xiuyi
Attendees: Joy, Oliver
Meeting Notes: 1) * * 2)
- How do Kafka consumers that are not in the same consumer group consume the same message?
Meeting Date/Time: 12/02/2021
Speaker: Xiuyi
Attendees: Isabella, Oliver
Meeting Notes: 1) * * 2) *
Meeting Date/Time: 12/05/2021
Speaker: Joy, Isabella, Xiuyi
Attendees: Oliver
Meeting Notes: 1)
- Quad Tree
- 1) [Counting system](https://medium.com/pinterest-engineering/building-a-real-time-user-action-counting-system-for-ads-88a60d9c9a); 2) GeoHash
Meeting Date/Time: 12/05/2021
Speaker: Isabella, Xiuyi
Attendees: Joy, Oliver
Meeting Notes: 1) * * 2) * 3) *
Meeting Date/Time: 12/05/2021
Speaker: Joy, Isabella, Xiuyi
Attendees: Jasmine
Meeting Notes: 1) * * 2) * 3) *
Meeting Date/Time: 01/02/2022
Speaker: Isabella, Xiuyi
Attendees:
Meeting Notes: 1) * * 2) *
Meeting Date/Time: 01/09/2022
Speaker: Isabella, Xiuyi
Attendees: Joy, Oliver, Jasmine
Meeting Notes: 1) * * 2) *
Meeting Date/Time: 01/16/2022
Speaker: Xiuyi, Nick, Isabella
Attendees: Joy
Meeting Notes: 1) * * 2) * 3)
Meeting Date/Time: 01/23/2022
Speaker: Xiuyi
Attendees: Jasmine, Nick, Isabella, Oliver
Meeting Notes: 1) * * 2) * 3)
Meeting Date/Time: 01/30/2022
Speaker: Xiuyi, Isabella, Oli
Attendees: Jasmine, Nick, Joy
Meeting Notes: 1) * * 2)
Meeting Date/Time: 02/06/2022
Speaker: Nick, Joy
Attendees: Jasmine, Vickie, Isabella, Oli, Xiuyi
Meeting Notes: 1) * * 2) *
Meeting Date/Time: 02/10/2022
Speaker: Xiuyi
Attendees: Jasmine, Vickie, Isabella, Oli, Nick, Joy, wenruo
Meeting Notes: 1) * * 2) *
Meeting Date/Time: 02/20/2022
Speaker: Xiuyi
Attendees: Jasmine, Isabella, Oli, Nick, Joy, wenruo
Meeting Notes: 1)
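- The '|' question in the Python version: it should be syntactic sugar for communication between two processes, not the bitwise OR operation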
Meeting Date/Time: 02/27/2022
Speaker: Wenruo, Isabella
Attendees: Xiuyi, Nick, Jasmine, Vickie
Meeting Notes: 1) * * 2)
Meeting Date/Time: 03/06/2022
Speaker: Nick, Joy, Xiuyi
Attendees: Wenruo, Isabella, Xiuyi, Jasmine, Vickie
Meeting Notes: 1)
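- Starting with version 2.4, Kafka can read data from followers; earlier versions could read only from the leader. In ordinary databases the leader usually handles only writes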
- WAL vs binlog
Meeting Date/Time: 03/13/2022
Speaker: Ning, Xiuyi
Attendees: Yu, Xiuyi, Wenruo, Isabella, Jasmine, Vickie
Meeting Notes: 1) * * 2) * 3)
Meeting Date/Time: 03/20/2022
Speaker: Xiuyi, Isabella
Attendees: Ning, Yu, Xiuyi, Wenruo, Jasmine, Vickie
Meeting Notes: 1) * * 2)
- In a pre-RPC, RPC, post-RPC transaction, what happens if the last step fails?
- Ordering of operations within a transaction
- Synchronization between master and replicas in payments
Meeting Date/Time: 03/27/2022
Speaker: Wenruo, Oli, Xiuyi
Attendees: Ning, Yu, Joy, Isa, Jasmine
Meeting Notes: 1) * * 2) * 3)
Meeting Date/Time: 04/03/2022
Speaker: Xiuyi
Attendees: Yu, Joy, Isa, Jasmine, Wenruo, Oli
Meeting Notes: 1)
- Can a specific Kafka partition be assigned to a consumer?
Meeting Date/Time: 04/10/2022
Speaker: Isabela, Xiuyi
Attendees: Yu, Jasmine, Wenruo, Yibo, Vickie
BQ topics - customer obsession
- Who was your most difficult customer?
- Give me an example of a time when you did not meet a client’s expectation. What happened, and how did you attempt to rectify the situation?
- When you’re working with a large number of customers, it’s tricky to deliver excellent service to them all. How do you go about prioritizing your customers’ needs?
- Tell the story of the last time you had to apologize to someone.
Meeting Notes: 1) * *
Meeting Date/Time: 04/17/2022
Speaker:
Attendees: Isabela, Xiuyi, Yu, Jasmine, Wei, Yibo, Vickie
Meeting Notes: 1)
- RST ARK
Meeting Date/Time: 04/24/2022
Speaker: Yu, Xiuyi
Attendees: Isabela, Jasmine, Wei, Wenruo, Nick, Oli
Meeting Notes: 1) * 2)
- resources
Meeting Date/Time: 04/24/2022
Speaker: Wei, Nick
Attendees: Yu, Xiuyi, Isabela, Jasmine, Yibo, Wenruo, Oli
Meeting Notes: 1) * 2)
- resources
Meeting Date/Time: 08/28/2022
Speaker: Joy
Attendees: Yu, Xiuyi, Isabela, Jasmine, Yibo, Wenruo, Oli
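Meeting Notes:
- Types of connections: 1. short polling (closes right after the response); 2. long polling (held open until it times out); 3. WebSocket (bi-directional and persistent)
- High-level design:
- Stateless: User > LB > services (AMI, Service discovery, User service, Group service)
- Stateful: via WS
- Storage: KV stores, for horizontal scaling; KV stores have low latency. Relational databases cannot cope with the long-tail problem, and indexes are a major pitfall too.
- Table design: dbo.[message], dbo.[group_message]; as long as IDs are unique within a group, that is enough.
- 1. Message: 1-to-1 chat workflows. The key point is that there is a message queue; program against the message queue.
- How to sync across devices for unified communication? The phone and the PC each record the last message they received, then both catch up through the session on Chat Server 1.
- Small group chat workflow: the key point is the message queue; sends go to the queue, and receives come from that same queue.
- Online presence (how to go invisible): one heartbeat every 5s; after 30s without one, treat the user as disconnected. PS: WebSocket connections are expensive; set them up only for active chats, and reclaim the resources on timeout.
- Meeting notes, part 3: 1. Online status fanout, pub-sub model. Updating online status is very expensive; let clients pull manually?
- Wrap-up bonus topics: 1. Support multimedia beyond text. 2. End-to-end encryption; watch out for performance on older devices. 3. Caching: keep historical data locally?
- Relational databases can hold users, settings, and other fairly stable base data (reliable and easy to maintain), but individual instant messages must go to a KV store. 4. Service discovery. PS: in an interview, keep replicas in your back pocket; that level of detail does not need to be on the table. PS: Byzantine consensus???
- resources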
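The online-presence rule in these notes (a heartbeat every 5s, offline after 30s of silence) can be sketched as a last-seen table with a timeout. The class and method names are hypothetical, and the clock is injectable so the behavior can be checked without waiting.

```python
import time

OFFLINE_AFTER_S = 30  # from the notes: 30s without a heartbeat means offline


class PresenceTracker:
    """Tracks the last heartbeat per user; clients are expected to ping every 5s."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_seen = {}  # user_id -> time of the last heartbeat

    def heartbeat(self, user_id):
        self._last_seen[user_id] = self._clock()

    def is_online(self, user_id):
        last = self._last_seen.get(user_id)
        return last is not None and self._clock() - last < OFFLINE_AFTER_S

    def sweep(self):
        """Drop users whose heartbeat timed out; returns the ids that went offline."""
        now = self._clock()
        gone = [u for u, t in self._last_seen.items() if now - t >= OFFLINE_AFTER_S]
        for user_id in gone:
            del self._last_seen[user_id]
        return gone


fake_now = [0.0]
presence = PresenceTracker(clock=lambda: fake_now[0])
presence.heartbeat("alice")
fake_now[0] = 29.0
online_at_29s = presence.is_online("alice")
fake_now[0] = 30.0
online_at_30s = presence.is_online("alice")
```

The ids returned by `sweep` are the natural place to trigger the status fanout discussed above, and also to release the user's WebSocket resources.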
Meeting Date/Time: 09/25/2022
Speaker: XiuYi
Attendees: Yu, Wei, Isabela, Vickie, Joy, Yibo, Oli, O.O, Wen, Ju, zhengda
Meeting Notes: 1)
- Num! Isabella Ju
- Top1 Joy Zhengda
- Job hunting: Wei Wen Yuan
- YYV Yu Yibo Vickie
Meeting Date/Time: 10/02/2022
Speaker: Zhengda Wu
Attendees: Yu, Wei, Isabela, Guilin, Yibo, Oli, Ning, Joy, Yuan
Meeting Notes:
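- Design a web crawler (search engine)
- How to prevent infinite loops
- MapReduce
- NoSQL storage
- How to handle crawled data that differs after some time? - timestamp
- PageRank
- Load balancer algorithms
- Where to cache (browser, Redis, database layer, CDN, application)
- Web Server/API Server
- Distributed crawler? Hundreds of millions of websites
- The crawler keeps running in the background; where does it start? - from some seed links that the company picks itself
- How to decide which sites are popular - popular sites are placed in queues (deep dive)
- Anti-crawling mechanisms - file/protocol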
Meeting Date/Time: 10/09/2022
Speaker: Joy, Ding
Attendees: Yu, Wei, Isabela, Guilin, Ning, Yuan, Zhengda,Oli
Meeting Notes:
- Design a web crawler
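- Seed URLs - chosen according to the scenario
- URLs waiting to be downloaded - URL frontier
- Content comparison to avoid re-crawling - comparing text directly is too inefficient, so compare hash results instead; MD5 is relatively slow, MurmurHash or Jenkins hash can be used; Lucene uses Lookup3Signature (based on Jenkins hash)
- When hash results are very close, store the page twice or treat them as the same? Store only keywords and links - fingerprints
- Inverted indices - index the keywords; some pages are kept in cache, but for most pages the content itself is not stored (no content storage)
- Ding Q&A
- Biggest difference between working in China and abroad - at tech companies, work intensity in China is generally much higher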
Meeting Date/Time: 10/16/2022
Speaker: Xiuyi
Attendees: Isabela, Oli, Ning, Ding, Zhengda
Meeting Notes:
- Prometheus + Grafana + SpringBoot
- Federated monitoring
- tombstones files
- Advanced Prometheus features
Meeting Date/Time: 10/23/2022
Speaker: Joy
Attendees: Isabela, Xiuyi, Oli, Ning, Ding, Zhengda
Meeting Notes:
- Design Web Crawler - part 2
- DFS vs BFS
- Queue Selector - sets priorities
- Anti-scraping mechanisms - large websites
- Different domains but identical content - filter
- Spider traps - blacklist
- Where to keep the two queues
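Several of the crawler points above (BFS over a URL frontier, filtering identical content on different domains, blacklisting spider traps) fit in one short loop. The sketch below runs against a fake in-memory "web" instead of the network; the function name, the depth limit, and the use of SHA-1 as the content fingerprint are assumptions (the earlier session mentions MurmurHash and Jenkins hash, which are not in the standard library).

```python
import hashlib
from collections import deque
from urllib.parse import urlsplit


def crawl(seeds, fetch, blacklist=frozenset(), max_depth=3):
    """BFS crawl. `fetch(url)` returns (body, links) or None; returns the stored URLs."""
    frontier = deque((url, 0) for url in seeds)  # the URL frontier, FIFO = BFS
    seen_urls = set(seeds)
    seen_fingerprints = set()
    stored = []
    while frontier:
        url, depth = frontier.popleft()
        if urlsplit(url).hostname in blacklist:
            continue  # known spider trap
        page = fetch(url)
        if page is None:
            continue
        body, links = page
        fingerprint = hashlib.sha1(body.encode("utf-8")).hexdigest()
        if fingerprint in seen_fingerprints:
            continue  # same content already stored under another URL/domain
        seen_fingerprints.add(fingerprint)
        stored.append(url)
        if depth < max_depth:  # the depth limit also bounds endless link chains
            for link in links:
                if link not in seen_urls:
                    seen_urls.add(link)
                    frontier.append((link, depth + 1))
    return stored


# Fake web for the demo: url -> (body, outgoing links).
PAGES = {
    "https://a.example/": ("home", ["https://a.example/about",
                                    "https://mirror.example/",
                                    "https://trap.example/1"]),
    "https://a.example/about": ("about us", ["https://a.example/"]),
    "https://mirror.example/": ("home", []),  # duplicate content, other domain
    "https://trap.example/1": ("trap 1", ["https://trap.example/2"]),
}
result = crawl(["https://a.example/"], PAGES.get, blacklist={"trap.example"})
```

Swapping the `deque` for a priority queue is where a queue selector with per-site priorities would plug in.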
Meeting Date/Time: 10/30/2022
Speaker: Joy
Attendees: Isabela, Xiuyi, Ning, Yu, Wei, Zhengda, Oli
Meeting Notes: 1) *
Meeting Date/Time: 11/13/2022
Speaker: Xiuyi
Attendees: Isabela, Oli, Ning, Yu, Zhengda, Joy
Meeting Notes:
- Prometheus + Grafana + Kafka
- SlackBot token
Meeting Date/Time: 11/20/2022
Speaker: Xiuyi
Attendees: Isabela, Wei, Yuan, Yu, Joy, Oli
Meeting Notes: 1)
- We usually go down to two levels: container (the starting point) and component (expanded for selected parts)
Meeting Date/Time: 12/04/2022
Speaker: Xiuyi
Attendees: Isabela, Wei, Ning, Yu, Joy, Oli, Zhengda
Meeting Notes: 1)
Meeting Date/Time: 12/11/2022
Speaker: Wei
Attendees: Isabela, XiuYi, Ning, Yuan, Joy, Oli, Zhengda
Meeting Notes:
Meeting Date/Time: 12/18/2022
Speaker: XiuYi
Attendees: Isabela, Wei, Ning, Yu
Meeting Notes:
- Creating and using indexes
-- Create the person table
SHOW DATABASES;
USE testdb;
drop table IF EXISTS `person`;
create TABLE `person`
(
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`score` int(11) NOT NULL,
`create_time` timestamp NOT NULL,
PRIMARY KEY (`id`),
KEY `name_score` (`name`, `score`) USING BTREE,
KEY `create_time` (`create_time`) USING BTREE
) ENGINE = InnoDB
DEFAULT CHARSET = utf8mb4;
-- On a local machine, a stored procedure loops to create 100K rows of test data, with MySQL 8.0.31
-- maintain cost
create
DEFINER = `root`@`localhost` PROCEDURE `insert_person`()
begin
declare c_id integer default 1;
while c_id <= 100000
do
insert into person values (c_id, concat('name', c_id), c_id + 100, date_sub(NOW(), interval c_id second));
set c_id = c_id + 1;
end while;
end;
-- space cost
select round(((DATA_LENGTH) / 1024 / 1024), 3) `Data size in MB`,
round(((INDEX_LENGTH) / 1024 / 1024), 3) `Index size in MB`
from information_schema.TABLES
where TABLE_NAME = 'person';
-- retrieves rows from a table using an index cost
explain select * from person where NAME = 'name1';
explain select NAME, SCORE from person where NAME = 'name1';
-- Index matching on a column prefix
EXPLAIN SELECT * FROM person WHERE NAME LIKE '%name123' LIMIT 100; -- does not use the index (leading wildcard)
EXPLAIN SELECT * FROM person WHERE NAME LIKE 'name123%' LIMIT 100; -- uses the index
-- Conditions in the WHERE clause
EXPLAIN SELECT * FROM person WHERE LENGTH(NAME) = 7;
-- Composite index: using only one of its columns
EXPLAIN SELECT * FROM person WHERE SCORE > 45678; -- does not use the index
EXPLAIN SELECT * FROM person WHERE NAME LIKE 'NAME45%'; -- uses the index
EXPLAIN SELECT * FROM person WHERE SCORE > 45678 AND NAME LIKE 'NAME45%'; -- uses the index
SHOW TABLE STATUS LIKE 'person';
-- check places for optimization
SET optimizer_trace="enabled=on";
SELECT * FROM person WHERE NAME >'name84059' AND create_time>'2022-12-17 15:00:00';
SELECT * FROM information_schema.OPTIMIZER_TRACE;
SET optimizer_trace="enabled=off";
Meeting Date/Time: 01/08/2023
Speaker: Xiuyi
Attendees: Isabela, Wei, Ning, Yu, Joy, Oli, Zhengda, Jasmine
Meeting Notes:
- Geohash vs Google S2 vs quadtree
Meeting Date/Time: 01/22/2023
Speaker: Joy
Attendees: Isabela, Wei, Ning, Yu, Xiuyi, Oli
Meeting Notes:
- Proximity Service (Yelp)
Meeting Date/Time: 01/29/2023
Speaker: Joy
Attendees: Isabela, Wei, Ning, Yu, Xiuyi, Zhengda, Oli
Meeting Notes:
- peer to peer vs shared backend
- service discovery
Meeting Date/Time: 02/05/2023
Speaker: Wei
Attendees: Joy, Isabela, Ning, Yu, Xiuyi, Zhengda, Oli
Meeting Notes:
- Leaders and Followers replication
- Single Leader replication Logs: Statement based logging; Write ahead logging(WAL); logical (row based) logging; Trigger based logging
- Multi-leader replication: Performance; Conflict resolution; Network;
- Conflict-free replicated data types (CRDTs);
- Mergeable persistent data structures (MPDSs);
- Operational transformation (OT);
- Leaderless replication:
Meeting Date/Time: 02/12/2023
Speaker: Isabela
Attendees: Joy, Yu, Xiuyi, Zhengda, Oli
Meeting Notes:
- Load Balancer - Least connection/Least response time/Least Bandwidth/Least Packets/Round Robin/Weighted Round Robin/IP Hash/URL Hash
- Cache - Client side cache/CDN/Server side cache, Global cache/Distributed Cache (with Request Cache), Local cache
- Cache Invalidation - Write Through (cache and DB at the same time)/Write Around (write only to the DB)/Write Back (write to the cache first and then the DB)/Cache Aside (read from the cache first; if not found, read from the DB and then write to the cache)
- Cache Eviction - FIFO/LIFO/LRU/LFU/MRU/RR
- Update the DB then update the cache vs. delete the cache then update the DB vs. update the DB then delete the cache (choose according to the use case)
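A minimal sketch of cache-aside reads combined with the "update the DB, then delete the cache entry" write order from the list above. Two dicts stand in for the database and for a cache such as Redis; the class name is made up for illustration.

```python
class CacheAsideStore:
    """Cache-aside reads; writes update the DB first, then invalidate the cache."""

    def __init__(self, db, cache):
        self.db = db        # dict standing in for the database
        self.cache = cache  # dict standing in for Redis/Memcached

    def read(self, key):
        if key in self.cache:            # cache hit
            return self.cache[key]
        value = self.db.get(key)         # cache miss: fall back to the DB
        if value is not None:
            self.cache[key] = value      # populate the cache for later reads
        return value

    def write(self, key, value):
        self.db[key] = value             # 1) update the database first
        self.cache.pop(key, None)        # 2) then delete (not update) the cache entry


db = {"user:1": "v1"}
cache = {}
store = CacheAsideStore(db, cache)
first_read = store.read("user:1")
cached_after_read = dict(cache)
store.write("user:1", "v2")
```

Deleting instead of updating means the next read reloads from the DB, which avoids two concurrent writers leaving the cache with the older of their two values.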
Meeting Date/Time: 02/19/2023
Speaker: Xiuyi
Attendees: Joy, Yu, Ning, Zhengda, Oli, Jasmine, Isabella
Meeting Notes:
- Container - hotfix mechanism; which to control this process
- Docker uses Linux Containers on Windows (LCOW), which relies on Hyper-V, to run Linux containers on Windows; this is not LXC (Linux Containers)
- TODO: K8S
Meeting Date/Time: 02/26/2023
Speaker: Joy
Attendees: Joy, Yu, Xiuyi, Ning, Oli, Jasmine, Isabella
Meeting Notes:
- Functional Requirements: User location service, Navigation service including ETA, Map rendering
- Non-functional Requirements: Accuracy, Smoothness, Data and Battery usage, General Availability and scalability
- Map 101: Geocoding vs Reverse Geocoding; Geohashing, Map rendering (hierarchical routing tiles: zoom in/out, 4 directions, the number of tiles is 4 to the power n)
- Road data processing for navigation service: Dijkstra's algorithm, A* algorithm
- Google Map Architecture:
- Frontend: Map rendering, Navigation, Search, User location service
- Backend: Map data, Navigation data, Search data, User location data
- Map data: Map tiles, Map features, Map labels, Map images
- Navigation Service: Navigation routes, Navigation labels, Navigation images
- Search data: Search index, Search labels, Search images
- User location Service: User location, User location labels, User location images
- CDN for location service
- Data Models: routing tiles / object store (S3); user location data / column-based (Cassandra); geocoding data (road information) / Redis; map tiles (image data)
- Shortest-path service - A* algorithm, BFS
- ETA service, Ranking service, Updator service
- Delivery protocol: WebSocket, Server-Sent Events (SSE)
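For the shortest-path items in these notes, a small Dijkstra sketch over a weighted road graph. The graph, node names, and weights are invented for the example; a real navigation service would run this (or A*) over routing tiles rather than one in-memory dict.

```python
import heapq


def dijkstra(graph, source):
    """graph: {node: [(neighbor, weight), ...]} with non-negative weights.

    Returns {node: shortest distance from source} for every reachable node.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Weights could be travel minutes per road segment (illustrative values).
roads = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 3)],
    "C": [("B", 2), ("D", 7)],
    "D": [],
}
distances = dijkstra(roads, "A")
```

A* follows the same loop but orders the heap by distance so far plus a heuristic estimate to the destination, such as straight-line distance.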
Meeting Date/Time: 03/05/2023
Speaker: Joy
Attendees: Yu, Xiuyi, Isabella, Ruichen, Zhengda, Joy, Ning, Oli
Meeting Notes:
- SOLID
- Proxy Pattern
- Command Pattern
Meeting Date/Time: 03/12/2023
Speaker: Isabella
Attendees: Yu, Xiuyi, Zhengda, Ning, Joy
Meeting Notes:
- Recovery Time Objective (RTO) vs Recovery Point Objective (RPO)
- Weak consistency vs Strong consistency
- Reverse Proxy vs Forward Proxy
- Redis vs Other in-memory cache
- supplement - Crash course in Caching
Meeting Date/Time: 03/19/2023
Speaker: Xiuyi
Attendees: Isabella, Yu, Ruichen, Ning, Joy, Oli
Meeting Notes:
- Is K8s Node same as distributed system Node?
- How are K8s nodes isolated from each other?
- sidecar pattern
Meeting Date/Time: 03/26/2023
Speaker: Joy
Attendees: Isabella, Xiuyi, Yu, Zhengda, Oli
Meeting Notes:
- Design Distributed MQ
- Benefits of Distributed MQ
Meeting Date/Time: 04/02/2023
Speaker: Joy
Attendees: Isabella, Xiuyi, Ning, Oli, Ruichen
Meeting Notes:
- Delayed Message Delivery
Meeting Date/Time: 04/09/2023
Speaker: Xiuyi
Attendees: Isabella, Oli, Ruichen, Zhengda
Meeting Notes: 1)
Meeting Date/Time: 04/23/2023
Speaker: Isabella
Attendees: Xiuyi, Ning, Oli, Ruichen, Zhengda, Joy
Meeting Notes:
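- 1) Money-transfer transaction between different accounts: debit first and then credit, or credit first and then debit? The discussion favored debit first, but the example in DDIA ch. 7 credits first and then debits
- 2) Isolation level: Read Uncommitted vs Read Committed vs Repeatable Read (Snapshot Isolation) vs Serializable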
Meeting Date/Time: 04/30/2023
Speaker: Joy
Attendees: Isabella, Xiuyi, Ning, Oli, Ruichen, Zhengda
Meeting Notes:
- Topic: 2023-04-30 Metrics Monitoring & Alerting System
- Metrics Collections
- Alerting Systems
Data retention policy: raw data for 7 days, 1-minute resolution for 30 days, 1-hour resolution for 1 year
Non-functional requirements: Scalability, Low latency, Reliability, Flexibility
Pull v.s. Push
Aggregations: Flink or Spark is the usual choice for aggregation (Kafka offers very few features for this and is rarely used to aggregate; community discussion and support are both thin). Kafka Streams natively supports "incremental" aggregation functions, in which the aggregation result is updated based on the values captured by each window. Incremental functions include count(), sum(), min(), and max().
Handle storage: downsampling, cold storage
PagerDuty v.s. Prometheus
Architectures: Lambda v.s. KAPPA
Real-time Streaming v.s. Batching
Online v.s. Offline aggregations
Scaling: YARN, and the Cassandra built-in
Partition tolerance: refer to the Swimlane Diagram > keep a record of your offset in the HDFS, and also group the (send aggregated result + save new offset + reply ACK to the Kafka) transaction.
Fault tolerance: keep the partial results.
Correctness: Reconciliation
Meeting Date/Time: 05/07/2023
Speaker: Joy
Attendees: Isabella, Xiuyi, Ning, Oli, Zhengda, Jeff
Meeting Notes:
- Ad Click Event Aggregation
Architectures: Lambda v.s. KAPPA
Real-time Streaming v.s. Batching
Online v.s. Offline aggregations
Scaling: YARN, and the Cassandra built-in
Partition tolerance: refer to the Swimlane Diagram > keep a record of your offset in the HDFS, and also group the (send aggregated result + save new offset + reply ACK to the Kafka) transaction.
Fault tolerance: keep the partial results.
Correctness: Reconciliation
Meeting Date/Time: 05/14/2023
Speaker: Isabella
Attendees: Joy, Xiuyi, Ning, Oli, Ruichen, Yu
Meeting Notes:
- Lost update - write-write conflict - Atomic write operation/Explicit locking
- Write skew and phantoms - constraint locking / predicate locking
- Serializability Isolation - Two phase locking
- shared mode / exclusive mode
- growing phase - locks can be acquired but not released
- shrinking phase - locks can be released but not acquired
- bad for performance
- serializable snapshot isolation (SSI) - read and write locks are separated, optimistic concurrency control (OCC)
Meeting Date/Time: 05/21/2023
Speaker: Xiuyi
Attendees: Isabella, Joy, Oli, Ruichen, Yu, Zhengda
Meeting Notes: 1)
Meeting Date/Time: 05/28/2023
Speaker: Xiuyi
Attendees: Isabella, Joy, Oli, Yu, Zhengda
Meeting Notes:
- System Design Components
- DNS (Domain Name System)
- Load Balancer
- API Gateway
- CDN (Content Delivery Network)
- Forward Proxy v.s. Reverse Proxy
- Caching
- Data Partitioning
- Sharding
- Database Replication (Async v.s. semi-sync v.s. synchronous), Quorum
- Distributed Messaging System
- Distributed File Systems
- Notification System
- Full-text Search
- Data Warehouse
Meeting Date/Time: 06/04/2023
Speaker: Joy
Attendees: Xiuyi, Isabella, Ning, Yu, Zhengda
Meeting Notes:
- System Design - Hotel Reservation System (Airbnb)
- instead of room id, use room type
- reservation id
- idempotence for reservation id, idempotent API
- pessimistic locking v.s. optimistic locking
- pessimistic locking - lock the resource before you use it
- optimistic locking - version number, compare and check outdated or not
- 2 phase commit and 3 phase commit
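The optimistic-locking bullet above (version number, compare, and reject if outdated) can be sketched for a room-type inventory as follows. The class and function names are hypothetical; in a database the check would typically be an UPDATE whose WHERE clause includes the expected version.

```python
class RoomTypeInventory:
    """Inventory row for one room type on one date, guarded by a version number."""

    def __init__(self, total_rooms):
        self.total_rooms = total_rooms
        self.reserved = 0
        self.version = 0

    def read(self):
        return self.reserved, self.version

    def reserve(self, count, expected_version):
        """Apply the update only if nobody changed the row since it was read."""
        if expected_version != self.version:
            return False  # stale read: the caller must re-read and retry
        if self.reserved + count > self.total_rooms:
            raise ValueError("not enough rooms")
        self.reserved += count
        self.version += 1
        return True


def reserve_with_retry(inventory, count, attempts=3):
    for _ in range(attempts):
        _, version = inventory.read()
        if inventory.reserve(count, version):
            return True
    return False


inventory = RoomTypeInventory(total_rooms=2)
_, seen_version = inventory.read()
first = inventory.reserve(1, seen_version)   # succeeds and bumps the version
second = inventory.reserve(1, seen_version)  # same stale version: rejected
retried = reserve_with_retry(inventory, 1)   # re-reads the version, then succeeds
```

No lock is held between read and write, which suits low contention; under heavy contention the retries pile up and pessimistic locking may behave better.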
Meeting Date/Time: 06/11/2023
Speaker: Isabella
Attendees: Xiuyi, Yu, Ruichen, Oli, Zhengda
Meeting Notes:
- Distributed transaction - 2 phase commit and 3 phase commit
- Atomic transaction protocol
Meeting Date/Time: 06/18/2023
Speaker: Xiuyi
Attendees: Isabella, Yu, Ruichen, Oli, Zhengda
Meeting Notes: 1)
Meeting Date/Time: 06/25/2023
Speaker: Xiuyi
Attendees: Isabella, Yu, Joy, Ruichen, Oli, Zhengda
Meeting Notes: 1)
Meeting Date/Time: 07/02/2023
Speaker: Ruichen
Attendees: Isabella, Xiuyi, Wenruo
Meeting Notes:
Meeting Date/Time: 07/09/2023
Speaker: Xiuyi
Attendees: Isabella, Ning, Oli, Ruichen, Yu, Zhengda
Meeting Notes:
- Senior+ Engineer
- Treat your team as a company, treat other teams as customers;
- Divide and conquer: 5 years -> 1 year -> 6 months -> 1 month -> 1 week -> 1 day; soft goals;
- How to persuade other teams to adopt your solutions in real work?
- In Interviews?
Meeting Date/Time: 07/16/2023
Speaker: Xiuyi
Attendees: Joy, Oli, Ruichen, Yu, Zhengda
Meeting Notes: 31 Startup business models you must know - with examples
Meeting Date/Time: 07/23/2023
Speaker: Joy
Attendees: Isa, Xiuyi, Oli, Yu, Ning
Meeting Notes:
Meeting Date/Time: 07/30/2023
Speaker: Xiuyi
Attendees: Isa, Joy, Oli, Yu, Ruichen, Yu, Zhengda
Meeting Notes: 1) Review Scenarios for System Design Terms 2) Improve your visibility
Meeting Date/Time: 08/06/2023
Speaker: Isa
Attendees: Xiuyi, Joy, Ning, Ruichen, Yu, Zhengda
Meeting Notes:
- encoding vs encryption
- delete longURL?
Meeting Date/Time: 08/13/2023
Speaker: Joy
Attendees: Xiuyi, Isa, Oli, Ruichen, Ning
Meeting Notes:
- Real-time Gaming Leaderboard
- Requirement: Display top 10 players on the leaderboard.
- Show a user’s specific rank.
- Display players who are four places above and below the desired user (bonus).
- Non-functional requirements Real-time update on scores.
- 5M DAU; QPS: if each user plays 10 games per day on average, score updates average about 5,000,000 × 10 / 86,400 ≈ 580 per second.
- API Design POST /v1/scores [user id, points]
- GET /v1/scores
- GET /v1/scores/{:userId}
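The three leaderboard requirements above (top 10, a user's rank, and the players around a user) map onto a structure kept sorted by score. The sketch below uses a sorted Python list with `bisect` as an in-memory stand-in; a sorted set in a store such as Redis is a common production choice, but that is an assumption, as are all the names here.

```python
import bisect


class Leaderboard:
    """Scores kept in a list sorted best-first as (-points, user_id) tuples."""

    def __init__(self):
        self._scores = {}  # user_id -> total points
        self._sorted = []  # sorted list of (-points, user_id)

    def add_points(self, user_id, points):
        """Roughly what POST /v1/scores would do."""
        old = self._scores.get(user_id)
        if old is not None:
            self._sorted.pop(bisect.bisect_left(self._sorted, (-old, user_id)))
        new = (old or 0) + points
        self._scores[user_id] = new
        bisect.insort(self._sorted, (-new, user_id))

    def top(self, n=10):
        """Roughly GET /v1/scores."""
        return [(user, -neg) for neg, user in self._sorted[:n]]

    def rank(self, user_id):
        """1-based rank (GET /v1/scores/{userId}); None if the user has no score."""
        if user_id not in self._scores:
            return None
        return bisect.bisect_left(self._sorted, (-self._scores[user_id], user_id)) + 1

    def around(self, user_id, span=4):
        """The user plus up to `span` players above and below."""
        rank = self.rank(user_id)
        if rank is None:
            return []
        low = max(0, rank - 1 - span)
        return [(user, -neg) for neg, user in self._sorted[low:rank + span]]


board = Leaderboard()
board.add_points("alice", 5)
board.add_points("bob", 3)
board.add_points("bob", 4)
board.add_points("carol", 1)
```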
Meeting Date/Time: 09/10/2023
Speaker: Xiuyi
Attendees: Joy, Isa, Oli, Yu, Ruichen, Ning, Zhengda
Meeting Date/Time: 09/17/2023
Speaker: Joy
Attendees: Isa, Xiuyi, Yu, Ning
Meeting Date/Time: 09/24/2023
Speaker: Isa
Attendees: Xiuyi, Yu, Joy, Zhengda
Meeting Date/Time: 10/01/2023
Speaker: Xiuyi
Attendees: Isa, Yu, Oli, Ruichen, Ning
Meeting Date/Time: 10/08/2023
Speaker: Ning, Yu, Isa, Ruichen
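Attendees: Xiuyi, Joy
- 1) What factors should be considered when designing idempotent APIs in a distributed system?
- 2) How is ordering between requests guaranteed in a distributed system? (For example, using an MQ.)
- 3) What are ZooKeeper's use cases, and what are its competing products?
- 4) Briefly describe the relationships and differences among cookie, session, and token. Explain distributed sessions and sticky sessions.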
Meeting Date/Time: 10/15/2023
Speaker: Joy
Attendees: Xiuyi,Oli, Yu, Isa, Ruichen, Zhengda
Meeting Date/Time: 10/22/2023
Speaker: Isa
Attendees: Xiuyi,Oli, Joy, Yu, Ning
- Linearizability
- Serializability
Meeting Date/Time: 11/05/2023
Speaker: Ning, Yu, Sophia, Norman
Attendees: Xiuyi, Ruichen, Joy
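- 2 Phase Commit - XA approach - participants block while waiting to commit; the coordinator is a single point of failure; consistency is not guaranteed under a network partition or coordinator failure; pessimistic locking
- 3 Phase Commit - TCC approach - pessimistic locking, addresses the coordinator single-point problem - use cases?
- Briefly describe implementing distributed locks with MySQL, Redis, and ZK - how does ZK solve the service deadlock problem?
- Common distributed-lock solutions in industry - Chubby, Curator, ZK, DynamoDB, Redis, etcd, etcd-recipes, Consul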
Meeting Date/Time: 11/12/2023
Speaker: Joy
Attendees: Xiuyi, Isa, Ning, Yu, Sophia, Norman, Oli, Ruichen
- In LSM, how to know which segments are new and which are old? - TimeStamp, Serial Number, Tiered Compaction, Bloom Filter Tier, etc
- sequential write v.s. random write, use WAL to protect data
- B-Tree v.s. LSM
- B+ Tree, many DBs have 3~4 levels, use "latch" or "copy on write" to protect the tree
Meeting Date/Time: 11/19/2023
Speaker: Isa
Attendees: Xiuyi, Joy, Ning, Yu, Sophia, Norman, Oli, Ruichen
Meeting Date/Time: 11/26/2023
Speaker: Xiuyi
Attendees: Joy, xiao, jeff, Ning, Sophia, Norman, Oli, Ziwei
- how many users/sessions can 1 websocket handle? 1 websocket connection for 1 user
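Meeting Date/Time: 12/03/2023
Speaker: Ning, Yu, Sophia, Ruichen
Attendees: Xiuyi, Joy, Zhengda, Norman, Ziwei
- 1) @柠檬: Compare the API architecture styles SOAP, REST, RPC, and GraphQL across aspects such as use cases, format, efficiency, and security;
- 2) @yugege: Drawing on your own work experience, describe how one enterprise system syncs data to another, e.g., which scenarios used RPC, why RPC, and why JSON/protocol buffer/avro was chosen as the data transfer format;
- 3) @Sophia Lu: When we design customer-facing APIs, what measures do we usually take to protect them? E.g., OAuth2, Versioning, Rate Limiting, Input Validation, API Gateway;
- 4) @Ruichen Rong: What are the common API testing methods, and which scenarios does each suit? E.g., Integration Testing, Functional Testing, Smoke Testing, Load Testing, Regression Testing.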
Meeting Date/Time: 12/10/2023
Speaker: Joy
Attendees: Isa, Xiuyi, Ning, Sophia, Norman, Oli, Ziwei, Zhengda, Ruichen
- Replication - single leader v.s. multi-leader v.s. leaderless
- leaders to followers -> logs
Meeting Date/Time: 12/17/2023
Speaker: Joy
Attendees: Xiuyi, Sophia, Ziwei, Zhengda
Meeting Notes: 1)
Meeting Date/Time: 12/24/2023
Speaker: Guilin
Attendees: Joy, Ruichen
Meeting Notes: 1)
Meeting Date/Time: 01/07/2024
Speaker:
Attendees: Guilin, Ruichen
Meeting Notes: 1)
Meeting Date/Time: 01/14/2024
Speaker: Guilin
Attendees: Sophia, Joy, Ruichen, Ziwei, Zhengda
Meeting Notes: 1)
Meeting Date/Time: 01/21/2024
Speaker:
Attendees: Norman, Sophia, Guilin, Oli, Ziwei, Zhengda
- Bloom Filter
- Count-Min Sketch
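A short Bloom filter sketch for the item above: `k` hash positions are set per item, so a lookup that finds any position unset means "definitely absent", while all positions set only means "possibly present". The size, the number of hashes, and the SHA-256-derived positions are illustrative assumptions.

```python
import hashlib


class BloomFilter:
    """Set membership with possible false positives and no false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes positions by prefixing the hash index to the item.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


seen = BloomFilter()
seen.add("https://a.example/")
```

There is no delete operation here, because clearing a bit could remove evidence of another item that shares it.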
Meeting Date/Time: 02/04/2024
Speaker: Guilin
Attendees: Ning, Oli, Sophia, Wenruo, Ruichen, Ziwei, Zhengda
- Monolith v.s. Microservice
- Separate Frontend and Backend
- A BQ template
Meeting Date/Time: 03/10/2024
Speaker: Guilin
Attendees: Ning, Norman, Isabella, Ziwei, Zhengda
Meeting Date/Time: 03/17/2024
Speaker: Guilin
Attendees: Sophia, Zhengda
Meeting Date/Time: 03/24/2024
Speaker: Guilin
Attendees: Sophia, Zhengda, Norman, Oli
- https://www.vultr.com/
- https://get.docker.com
- https://github.com/lobehub/lobe-chat
- https://aistudio.google.com/app/apikey
- https://platform.openai.com/api-keys
- https://console.anthropic.com/settings/keys
- https://vercel.com/
Meeting Date/Time: 03/31/2024
Speaker: Guilin
Attendees: Zhengda, Ning
Meeting Date/Time: 04/07/2024
Speaker: Guilin
Attendees: Zhengda, Ning, Norman, Oli
Meeting Date/Time: 04/14/2024
Speaker: Guilin
Attendees: Zhengda, Ning, Oli
Meeting Date/Time: 04/21/2024
Speaker: Guilin
Attendees: Zhengda, Ning, Oli
- https://www.meta.ai/
- https://lmstudio.ai/
- https://jan.ai/
- https://www.youtube.com/watch?v=bc6uFV9CJGg
Meeting Date/Time: 05/05/2024
Speaker: Guilin
Attendees: Oli
- Logging v.s. Metrics v.s. Tracing
- Prometheus v.s. Zabbix vs Nightingale
Meeting Date/Time: 05/12/2024
Speaker: Guilin
Attendees: Oli, Zhengda
- Metrics - 3 identifiers: global unique name, timestamp, influxdb way
- Metrics - Counter, Gauge, Histogram, Summary
Meeting Date/Time: 05/19/2024
Speaker: Guilin
Attendees: Oli, Ning
- Exporters, TSDB, Alerting, Visualization
Meeting Date/Time: 06/02/2024
Speaker: Guilin
Attendees: Oli, Zhengda
- Start Prometheus cluster and key concepts
Meeting Date/Time: 06/23/2024
Speaker: Guilin
Attendees: Oli, Norman
- AI RAG and Langchain
Meeting Date/Time: 06/30/2024
Speaker: Guilin
Attendees: Zhengda, Ning
Meeting Date/Time: 07/14/2024
Speaker: Guilin
Attendees: Norman, Ruicheng, Will
- 1) GraphRAG
Meeting Date/Time: 07/21/2024
Speaker: Guilin
Attendees: Oli