A Survey of Distributed ID Solutions

June 11, 2024 Distributed Architecture, Sequence Center

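About Distributed IDs

Questions to consider when researching and evaluating a distributed ID solution:
(1) Uniqueness: the most basic requirement. Uniqueness here is scoped to a single business domain: order IDs must be unique among orders, payment-record IDs must be unique among payment records, and so on.
(2) QPS: the throughput the generator must sustain. This is the performance guarantee.
(3) Monotonic increase: assess whether the business strictly requires monotonically increasing IDs. Leaf does not support monotonic increase.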

References:
An analysis of distributed UID solutions

Mainstream Solutions

Leaf

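Leaf supports both a database segment mode (Leaf-segment) and an improved Snowflake mode (Leaf-snowflake). It has been validated at scale inside Meituan, and its performance has held up.
In both modes, the IDs Leaf generates are not monotonically increasing. Leaf guarantees only uniqueness and a generally increasing trend.
Leaf ships with an HTTP interface for issuing IDs. To maximize performance, it can be integrated with whatever RPC framework the project actually uses.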

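Database Segment Mode

Leaf-segment is an optimization over using a database auto-increment ID directly as the distributed ID: it reduces how often the database is touched, which lowers database load.
Leaf-segment has the following characteristics:

Segment allocation moves ID issuance from per-ID SQL statements into memory, which greatly improves throughput while cutting the number of database operations and therefore the load on the database.
A Leaf service may restart, and any IDs in the in-memory segment that have not yet been issued are lost on restart, so the IDs are not guaranteed to be consecutive. Different Leaf instances also hold different segments, so the IDs cannot be monotonically increasing either; only a generally increasing trend is guaranteed.
The IDs are not opaque. When they are exposed as unique IDs to external parties, competitors may be able to infer business volume from them.
It depends on the database. Leaf's double-buffer design tolerates a database outage for a limited time, but if the database stays completely unavailable, no IDs can be issued.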
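
The segment idea above can be sketched as follows. This is a minimal illustration, not Leaf's actual code: `SegmentIdAllocator`, `SegmentStore`, and `InMemoryStore` are hypothetical names, the in-memory store stands in for the database row that tracks the highest reserved ID, and the double-buffer prefetch is left out for brevity.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Minimal sketch of segment-mode ID allocation (illustrative, not Leaf's code).
 * IDs are handed out from memory; the store is touched once per segment.
 */
final class SegmentIdAllocator {

    /** Hypothetical stand-in for the database: atomically reserves a block of `step` IDs. */
    interface SegmentStore {
        /** Returns the first ID of the newly reserved segment. */
        long reserve(int step);
    }

    /** In-memory store used only for illustration; a real one would update a row in a transaction. */
    static final class InMemoryStore implements SegmentStore {
        private final AtomicLong maxId = new AtomicLong(0);
        int calls = 0; // number of times the "database" was hit

        @Override
        public synchronized long reserve(int step) {
            calls++;
            return maxId.getAndAdd(step) + 1;
        }
    }

    private final SegmentStore store;
    private final int step;
    private long next = 1;   // next ID to hand out
    private long limit = 0;  // last ID of the current segment (inclusive); empty at start

    SegmentIdAllocator(SegmentStore store, int step) {
        this.store = store;
        this.step = step;
    }

    /** Returns the next ID, fetching a new segment only when the current one is used up. */
    synchronized long nextId() {
        if (next > limit) {
            next = store.reserve(step);
            limit = next + step - 1;
        }
        return next++;
    }
}
```

With a step of 1000 and two allocators sharing one store, the first allocator issues 1, 2, 3, ... while the second issues 1001, 1002, ..., so the combined stream is unique and trends upward but is not monotonic. If an allocator is discarded midway through its segment, the unissued IDs in that segment are simply skipped.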

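Snowflake Mode

Leaf-snowflake is an improved implementation of the Snowflake algorithm. Specifically:

A standard Snowflake ID is 64 bits long: 1 reserved bit, 41 bits of timestamp, 10 bits of WorkerID, and 12 bits of sequence number.
Leaf's variant is still 64 bits long: 1 reserved bit, 41 bits of timestamp, 5 bits of data center ID, 5 bits of machine ID, and 12 bits of sequence number.

In other words, with a millisecond timestamp and a 12-bit sequence, a node can in theory issue 2^12 = 4096 IDs per millisecond, so both the standard layout and Leaf's variant can reach millions of IDs per second.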
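
To make the bit layout concrete, here is a small sketch that packs and unpacks the four fields of the Leaf-style layout described above. The class and method names are illustrative and are not taken from Leaf.

```java
/**
 * Sketch of the layout described above:
 * 1 reserved bit | 41-bit timestamp | 5-bit data center | 5-bit machine | 12-bit sequence.
 * Names are illustrative only.
 */
final class SnowflakeLayout {
    static final int SEQUENCE_BITS = 12;
    static final int MACHINE_BITS = 5;
    static final int DATACENTER_BITS = 5;
    static final int TIMESTAMP_BITS = 41;

    static final int MACHINE_SHIFT = SEQUENCE_BITS;                         // 12
    static final int DATACENTER_SHIFT = MACHINE_SHIFT + MACHINE_BITS;       // 17
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS;  // 22

    private static long mask(int bits) {
        return (1L << bits) - 1;
    }

    /** Packs the four fields into one long; the reserved top bit stays 0. */
    static long compose(long timestamp, long datacenterId, long machineId, long sequence) {
        if (timestamp < 0 || timestamp > mask(TIMESTAMP_BITS)
                || datacenterId < 0 || datacenterId > mask(DATACENTER_BITS)
                || machineId < 0 || machineId > mask(MACHINE_BITS)
                || sequence < 0 || sequence > mask(SEQUENCE_BITS)) {
            throw new IllegalArgumentException("field out of range");
        }
        return (timestamp << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (machineId << MACHINE_SHIFT)
                | sequence;
    }

    static long timestampOf(long id)  { return id >>> TIMESTAMP_SHIFT; }
    static long datacenterOf(long id) { return (id >>> DATACENTER_SHIFT) & mask(DATACENTER_BITS); }
    static long machineOf(long id)    { return (id >>> MACHINE_SHIFT) & mask(MACHINE_BITS); }
    static long sequenceOf(long id)   { return id & mask(SEQUENCE_BITS); }

    /** Theoretical ceiling for one node: 2^12 sequence values in each of 1000 milliseconds. */
    static long maxIdsPerSecondPerNode() {
        return (1L << SEQUENCE_BITS) * 1000L;
    }
}
```

`maxIdsPerSecondPerNode()` evaluates to 4,096,000, which is where the "millions per second" figure comes from. Because the timestamp occupies the highest usable bits, IDs from one node sort by time, while IDs from different nodes within the same millisecond sort by data center and machine ID.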

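Leaf-snowflake has the following characteristics:

It depends on ZooKeeper: the ID of a sequential node created in ZooKeeper is used as the WorkerID.
If the IP address of a Leaf service changes frequently, a large number of stale sequential nodes accumulate in ZooKeeper.
Because Snowflake depends on the system clock, clock rollback is a risk. Leaf only handles rollbacks of up to 5 ms; a rollback of more than 5 ms results in an error.
The generated ID is a 64-bit (8-byte) integer. Because the reserved top bit is always 0, the value is non-negative even in a signed 64-bit type, so it is suitable as a database primary key.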
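
The rollback policy described above (tolerate a small rollback, fail on a large one) might look roughly like the sketch below. It is not Leaf's implementation: the class name, the injectable clock, the choice to sleep for twice the offset, and the standard 10-bit worker field are all assumptions made for illustration, and the custom epoch is omitted.

```java
import java.util.function.LongSupplier;

/**
 * Sketch of a Snowflake-style generator that waits out small clock rollbacks
 * and refuses to issue IDs after large ones. Illustrative only.
 */
final class RollbackAwareSnowflake {
    private static final long MAX_TOLERATED_ROLLBACK_MS = 5;
    private static final int SEQUENCE_BITS = 12;
    private static final int WORKER_BITS = 10;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final LongSupplier clock; // injectable so the behavior can be exercised in tests
    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    RollbackAwareSnowflake(LongSupplier clock, long workerId) {
        this.clock = clock;
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastTimestamp) {
            long offset = lastTimestamp - now;
            if (offset > MAX_TOLERATED_ROLLBACK_MS) {
                throw new IllegalStateException("clock moved backwards by " + offset + " ms");
            }
            sleepQuietly(offset * 2); // small rollback: wait it out, then re-read the clock
            now = clock.getAsLong();
            if (now < lastTimestamp) {
                throw new IllegalStateException("clock still behind after waiting");
            }
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted within this millisecond: spin until the clock advances.
                do {
                    now = clock.getAsLong();
                } while (now <= lastTimestamp);
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return (now << (WORKER_BITS + SEQUENCE_BITS)) | (workerId << SEQUENCE_BITS) | sequence;
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the clock", e);
        }
    }
}
```

In production the clock would be `System::currentTimeMillis`. Passing a scripted clock instead makes it easy to check both branches: a 3 ms rollback is absorbed by waiting, while an 11 ms rollback throws.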

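References:
Meituan (Leaf) distributed ID algorithm (hands-on)
Leaf issues
Discrepancies between the Leaf source code and its technical articles
 Leaf article from 2017
 Leaf article from 2019
 Interview question: how does Snowflake handle clock rollback?
 Snowflake explained in detail, with solutions for clock rollback
 Algorithms illustrated (2): Snowflake (6,000 words, with code and a comparison of mainstream solutions)
A very handy numeric ID generator based on Snowflake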

tinyid

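Tinyid extends the leaf-segment algorithm with support for multiple databases (masters), and provides a Java client (SDK) that generates IDs locally, which gives better performance and availability.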

References:
How to deploy and install a distributed sequence-number generator system

uid-generator

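uid-generator is Baidu's ID generation scheme, an improvement on Twitter's Snowflake algorithm. It reallocates the 64 bits, and the workerId value is taken from a database primary-key ID.
It offers two ways to generate IDs: computing the ID at request time (DefaultUidGenerator), or computing IDs in advance (CachedUidGenerator).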

DefaultUidGenerator

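The ID is computed when it is requested. Because the current system time is read on the spot as an input to the ID, this approach cannot avoid the clock rollback problem either. In DefaultUidGenerator, a clock rollback of more than 1 second throws an exception immediately. A rollback within 1 second is absorbed partly by incrementing the sequence number and partly by waiting for the next second.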

CachedUidGenerator

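It neatly sidesteps Snowflake's clock rollback problem by "borrowing future time".
"Borrowing future time" means that when the list of IDs for the "next second" is computed, the "next second" is obtained by incrementing the current second, without reading the system time.
Why can it "borrow future time"?
Because the IDs are computed ahead of time in memory, the current system time cannot be used when computing the IDs for the "next second"; the "next second" timestamp has to be derived by incrementing the current one.
This approach is fast because all IDs for a given second are computed in advance, and each request simply reads a value from memory.
When this approach is used in a standalone service, business volume may be easy to guess (a standalone service is rarely restarted, so the IDs it generates are consecutive).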
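
A heavily simplified sketch of "borrowing future time" follows. It is not uid-generator's code: the real component is described above only at the level of the idea, so the class name `FutureTimeIdBuffer`, the plain `ArrayDeque` buffer, and the synchronous refill are all assumptions for illustration. The bit widths match the configuration used in this article (21 worker bits, 13 sequence bits).

```java
import java.util.ArrayDeque;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Sketch of pre-computing IDs one second at a time. The "second" used for each
 * batch is the previous second plus one and is never re-read from the system
 * clock, so a clock rollback cannot produce duplicates. Illustrative only.
 */
final class FutureTimeIdBuffer {
    private static final int SEQ_BITS = 13;
    private static final int WORKER_BITS = 21;
    private static final long MAX_SEQ = (1L << SEQ_BITS) - 1;

    private final long workerId;
    private final AtomicLong lastSecond;                 // advanced only by incrementing
    private final ArrayDeque<Long> buffer = new ArrayDeque<>();

    /** startSecond would normally be the current time in seconds since the chosen epoch. */
    FutureTimeIdBuffer(long workerId, long startSecond) {
        this.workerId = workerId;
        this.lastSecond = new AtomicLong(startSecond);
    }

    /** Fills the buffer with every ID of the next (possibly future) second. */
    private void refill() {
        long second = lastSecond.incrementAndGet();
        long base = (second << (WORKER_BITS + SEQ_BITS)) | (workerId << SEQ_BITS);
        for (long seq = 0; seq <= MAX_SEQ; seq++) {
            buffer.addLast(base | seq);
        }
    }

    /** Takes a pre-computed ID from memory, refilling when the buffer runs dry. */
    synchronized long take() {
        if (buffer.isEmpty()) {
            refill();
        }
        return buffer.pollFirst();
    }
}
```

If consumers drain the buffer faster than 8192 IDs per second, the generator simply moves on to later seconds ahead of the wall clock, which is the "borrowing". The flip side is the one noted above: a long-running instance issues consecutive values, so the IDs leak volume information.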

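Bit Allocation

What most distinguishes uid-generator from other distributed ID solutions is that it can be embedded directly in an application as a component instead of being deployed as a separate service. This comes from its reworked Snowflake design, in which the widths of timeBits, workerBits, and seqBits can be adjusted to the project's needs (the configuration used in this article sets them to 29, 21, and 13 bits). With that configuration and the use-once-and-discard WorkerIdAssigner strategy, it supports about 2.1 million machine starts (2^21 = 2097152), up to 8192 IDs per second per node (2^13 = 8192), and about 17 years of operation (2^29 / (60 × 60 × 24 × 365) ≈ 17).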

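Example of a 64-bit allocation:
Assume {"timeBits":31,"workerBits":23,"seqBits":9}. The total number of restarts supported is 2^23 = 8388608, and each node can issue 2^9 = 512 IDs per second.
With 28 nodes each restarting 12 times a day, there are 28 × 12 = 336 restarts per day, so the worker IDs last 8388608 / (336 × 365) ≈ 68 years, and overall throughput is 28 × 512 = 14336 IDs per second.
With 30 nodes each restarting 12 times a day, there are 30 × 12 = 360 restarts per day, so the worker IDs last 8388608 / (360 × 365) ≈ 63 years (rounded down), and overall throughput is 30 × 512 = 15360 IDs per second.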
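
The capacity arithmetic in this section is easy to get wrong by hand, so here is a small helper that reproduces it. The class and method names are made up for this sketch; the formulas are the ones used in the text.

```java
/** Capacity arithmetic for a timeBits / workerBits / seqBits split. Illustrative helper. */
final class UidCapacity {
    static final long SECONDS_PER_YEAR = 60L * 60 * 24 * 365;

    /** Years until the seconds field overflows: 2^timeBits seconds. */
    static double yearsByTimeBits(int timeBits) {
        return (double) (1L << timeBits) / SECONDS_PER_YEAR;
    }

    /** Years until worker IDs run out when every restart consumes a fresh one. */
    static double yearsByWorkerBits(int workerBits, int nodes, int restartsPerNodePerDay) {
        long restartsPerYear = (long) nodes * restartsPerNodePerDay * 365;
        return (double) (1L << workerBits) / restartsPerYear;
    }

    /** Cluster-wide ceiling: 2^seqBits IDs per second on each node. */
    static long idsPerSecond(int seqBits, int nodes) {
        return (1L << seqBits) * nodes;
    }

    public static void main(String[] args) {
        System.out.printf("29 time bits: %.1f years%n", yearsByTimeBits(29));
        System.out.printf("23 worker bits, 28 nodes, 12 restarts/day: %.1f years%n",
                yearsByWorkerBits(23, 28, 12));
        System.out.printf("9 seq bits, 28 nodes: %d IDs/s%n", idsPerSecond(9, 28));
    }
}
```

Note that the effective lifetime of a configuration is the smaller of the two year figures: the time field and the worker ID pool can each run out first.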

How to Use It

The following describes how to use the uid-generator component to generate unique IDs in a Spring Boot project.
Step 1: Add the dependencies.

<!-- Baidu uid-generator component: it must first be installed into the local Maven repository -->
<dependency>
    <groupId>com.baidu.fsg</groupId>
    <artifactId>uid-generator</artifactId>
    <version>1.0.0-SNAPSHOT</version>
</dependency>

<!-- Database driver -->
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.18</version>
</dependency>

<!-- Druid as the database connection pool -->
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>druid</artifactId>
    <version>1.0.19</version>
</dependency>

Step 2: Add the configuration files.

application.properties

spring.application.name=test-uid-generator
spring.autoconfigure.exclude=org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration,org.springframework.boot.autoconfigure.transaction.TransactionAutoConfiguration

cached-uid-spring.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="
		http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.1.xsd">

	<!-- UID generator -->
	<bean id="disposableWorkerIdAssigner" class="com.baidu.fsg.uid.worker.DisposableWorkerIdAssigner" />

	<bean id="cachedUidGenerator" class="com.baidu.fsg.uid.impl.CachedUidGenerator">
		<property name="workerIdAssigner" ref="disposableWorkerIdAssigner" />

		<!-- Specified bits & epoch as your demand. No specified the default value will be used -->
		<property name="timeBits" value="29"/>
		<property name="workerBits" value="21"/>
		<property name="seqBits" value="13"/>
		<property name="epochStr" value="2016-05-20"/>
	</bean>

	<!-- Import mybatis config -->
	<import resource="classpath:/mybatis-spring.xml" />
</beans>

Then import the XML configuration in the SpringBootApplication class:

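import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.ImportResource;

@SpringBootApplication
@ImportResource({
    "classpath:/cached-uid-spring.xml"  // load the uid-generator bean definitions
})
public class TestUidGeneratorApplication {
    public static void main(String[] args) {
        SpringApplication.run(TestUidGeneratorApplication.class, args);
    }
}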

mybatis-spring.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xmlns:aop="http://www.springframework.org/schema/aop" xmlns:context="http://www.springframework.org/schema/context"
	xmlns:jdbc="http://www.springframework.org/schema/jdbc" xmlns:tx="http://www.springframework.org/schema/tx"
	xsi:schemaLocation="
		http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.1.xsd
		http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop-3.1.xsd
		http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-3.1.xsd
		http://www.springframework.org/schema/jdbc http://www.springframework.org/schema/jdbc/spring-jdbc-3.1.xsd
		http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-3.1.xsd">

	<!-- Make sure resource properties can be resolved in @Value with SpEL expressions -->
	<bean id="propertyConfigurer" class="org.springframework.context.support.PropertySourcesPlaceholderConfigurer">
		<property name="properties" ref="configProperties" />
	</bean>

	<bean id="configProperties" class="org.springframework.beans.factory.config.PropertiesFactoryBean">
		<property name="locations">
			<list>
				<value>classpath:/mysql*.properties</value>
			</list>
		</property>
	</bean>

	<!-- Spring annotation scanning -->
	<context:component-scan base-package="com.baidu.fsg.uid" />

	<!-- Create the SqlSessionFactory and specify its data source -->
	<bean id="sqlSessionFactory" class="org.mybatis.spring.SqlSessionFactoryBean">
		<property name="dataSource" ref="dataSource" />
		<property name="mapperLocations" value="classpath:/META-INF/mybatis/mapper/WORKER*.xml" />
	</bean>

	<!-- Transaction configuration -->
	<tx:annotation-driven transaction-manager="transactionManager" order="1" />

	<bean id="transactionManager" class="org.springframework.jdbc.datasource.DataSourceTransactionManager">
		<property name="dataSource" ref="dataSource" />
	</bean>

	<!-- MyBatis mapper scanning -->
	<bean class="org.mybatis.spring.mapper.MapperScannerConfigurer">
		<property name="annotationClass" value="org.springframework.stereotype.Repository" />
		<property name="basePackage" value="com.baidu.fsg.uid.worker.dao" />
		<property name="sqlSessionFactoryBeanName" value="sqlSessionFactory" />
	</bean>

	<!-- Data source configuration -->
	<bean id="dataSource" parent="abstractDataSource">
		<property name="driverClassName" value="${mysql.driver}" />
		<property name="maxActive" value="${jdbc.maxActive}" />
		<property name="url" value="${jdbc.url}" />
		<property name="username" value="${jdbc.username}" />
		<property name="password" value="${jdbc.password}" />
	</bean>

	<bean id="abstractDataSource" class="com.alibaba.druid.pool.DruidDataSource" destroy-method="close">
		<property name="filters" value="${datasource.filters}" />
		<property name="defaultAutoCommit" value="${datasource.defaultAutoCommit}" />
		<property name="initialSize" value="${datasource.initialSize}" />
		<property name="minIdle" value="${datasource.minIdle}" />
		<property name="maxWait" value="${datasource.maxWait}" />
		<property name="testWhileIdle" value="${datasource.testWhileIdle}" />
		<property name="testOnBorrow" value="${datasource.testOnBorrow}" />
		<property name="testOnReturn" value="${datasource.testOnReturn}" />
		<property name="validationQuery" value="${datasource.validationQuery}" />
		<property name="timeBetweenEvictionRunsMillis" value="${datasource.timeBetweenEvictionRunsMillis}" />
		<property name="minEvictableIdleTimeMillis" value="${datasource.minEvictableIdleTimeMillis}" />
		<property name="logAbandoned" value="${datasource.logAbandoned}" />
		<property name="removeAbandoned" value="${datasource.removeAbandoned}" />
		<property name="removeAbandonedTimeout" value="${datasource.removeAbandonedTimeout}" />
	</bean>

	<bean id="batchSqlSession" class="org.mybatis.spring.SqlSessionTemplate">
		<constructor-arg index="0" ref="sqlSessionFactory" />
		<constructor-arg index="1" value="BATCH" />
	</bean>
</beans>

mysql.properties

#datasource db info
mysql.driver=com.mysql.jdbc.Driver
jdbc.url=jdbc:mysql://localhost:3306/uid_generator
jdbc.username=xxx
jdbc.password=yyy
jdbc.maxActive=2

#datasource base
datasource.defaultAutoCommit=true
datasource.initialSize=2
datasource.minIdle=0
datasource.maxWait=5000
datasource.testWhileIdle=true
datasource.testOnBorrow=true
datasource.testOnReturn=false
datasource.validationQuery=SELECT 1 FROM DUAL
datasource.timeBetweenEvictionRunsMillis=30000
datasource.minEvictableIdleTimeMillis=60000
datasource.logAbandoned=true
datasource.removeAbandoned=true
datasource.removeAbandonedTimeout=120
datasource.filters=stat

Step 3: Use UidGenerator

// Inject the ID generator bean defined in cached-uid-spring.xml
@Autowired
CachedUidGenerator cachedUidGenerator;

// Generate a unique ID
long id = cachedUidGenerator.getUID();

Done!

References:
Distributed unique ID generation: Baidu's UidGenerator
Baidu UidGenerator source code walkthrough

[See also]
Assigning a fixed Pod IP in Kubernetes
Ways to run a K8s pod on a fixed IP