尧图网站建设 尧图网络
  • 首页
  • 关于我们
  • 服务项目
  • 案例展示
  • 建站流程
  • 资讯中心
  • 联系我们
首页/资讯中心/详情

深入解析:Flink 状态和 CheckPoint 的区别和联系(附源码)

深入解析:Flink 状态和 CheckPoint 的区别和联系(附源码)
📅 发布时间:2026/6/20 5:33:37

Flink 状态和checkpoint的区别和联系(附源码

  • 1. 本质区别:运行时 vs 持久化
    • 1.1 State(状态):运行时的"工作内存"
    • 1.2 Checkpoint:状态的"快照存档"
  • 2. 形象类比
  • 3. 源码层面的关系
    • 3.1 CheckpointableKeyedStateBackend:连接两者的桥梁
    • 3.2 StateHandle:Checkpoint 的元数据
    • 3.3 HeapKeyedStateBackend:实际的实现
  • 4. 完整的生命周期
    • 4.1 正常运行时
    • 4.2 Checkpoint 触发时
    • 4.3 故障恢复时
  • 5 关键区别对比表
  • 6. 源码中的协作机制
    • 6.1 Checkpoint 选项配置
    • 6.2不同类型的 Checkpoint
  • 7. 实战示例:完整流程
  • 8. 核心联系总结
    • 8.1 依赖关系
    • 8.2 协作关系
    • 8.3 性能权衡
    • 8.4 统一抽象
  • 9. 关键要点

1. 本质区别:运行时 vs 持久化

1.1 State(状态):运行时的"工作内存"

package org.apache.flink.api.common.state;
import org.apache.flink.annotation.PublicEvolving;
/**
* Interface that different types of partitioned state must implement.
*
* <p>The state is only accessible by functions applied on a {@code KeyedStream}. The key is
* automatically supplied by the system, so the function always sees the value mapped to the key of
* the current element. That way, the system can handle stream and state partitioning consistently
* together.
*/
@PublicEvolving
public interface State {

/** Removes the value mapped under the current key. */
void clear();
}

State 的特征:

  • 位置:存储在 TaskManager 的内存或本地磁盘(RocksDB)
  • 目的:算子处理数据时的"工作记忆"
  • 访问:频繁、实时、微秒级
  • 生命周期:作业运行期间一直存在
  • 可变性:每处理一条数据可能就会更新

1.2 Checkpoint:状态的"快照存档"

@Internal
public interface Snapshotable<S extends StateObject> {/*** Operation that writes a snapshot into a stream that is provided by the given {@link* CheckpointStreamFactory} and returns a @{@link RunnableFuture} that gives a state handle to* the snapshot. It is up to the implementation if the operation is performed synchronous or* asynchronous. In the later case, the returned Runnable must be executed first before* obtaining the handle.** @param checkpointId The ID of the checkpoint.* @param timestamp The timestamp of the checkpoint.* @param streamFactory The factory that we can use for writing our state to streams.* @param checkpointOptions Options for how to perform this checkpoint.* @return A runnable future that will yield a {@link StateObject}.*/@NonnullRunnableFuture<S> snapshot(long checkpointId,long timestamp,@Nonnull CheckpointStreamFactory streamFactory,@Nonnull CheckpointOptions checkpointOptions)throws Exception;}

Checkpoint 的特征:

  • 位置:持久化存储(HDFS、S3、OSS 等)
  • 目的:容错恢复的"存档点"
  • 访问:低频、定期(如每分钟)
  • 生命周期:独立于作业运行,故障恢复时使用
  • 不可变性:一旦完成就不再改变

2. 形象类比

State = 你正在编辑的 Word 文档(内存中)
↓ 每隔一段时间
Checkpoint = 保存到磁盘的文档副本(硬盘上)
↓ 如果程序崩溃
Recovery = 从最近的保存恢复(重新加载到内存)

3. 源码层面的关系

3.1 CheckpointableKeyedStateBackend:连接两者的桥梁

/**
* Interface that combines both, the {@link KeyedStateBackend} interface, which encapsulates methods
* responsible for keyed state management and the {@link Snapshotable} which tells the system how to
* snapshot the underlying state.
*
* <p><b>NOTE:</b> State backends that need to be notified of completed checkpoints can additionally
* implement the {@link CheckpointListener} interface.
*
* @param <K> Type of the key by which state is keyed.
*/
public interface CheckpointableKeyedStateBackend<K>
extends KeyedStateBackend<K>, Snapshotable<SnapshotResult<KeyedStateHandle>>, Closeable {/** Returns the key groups which this state backend is responsible for. */KeyGroupRange getKeyGroupRange();/*** Returns a {@link SavepointResources} that can be used by {@link SavepointSnapshotStrategy} to* write out a savepoint in the common/unified format.*/@NonnullSavepointResources<K> savepoint() throws Exception;}

设计理念:

  • KeyedStateBackend:管理运行时状态的读写
  • Snapshotable:提供状态快照能力
  • CheckpointableKeyedStateBackend:同时具备两种能力

3.2 StateHandle:Checkpoint 的元数据

/**
* Base for the handles of the checkpointed states in keyed streams. When recovering from failures,
* the handle will be passed to all tasks whose key group ranges overlap with it.
*/
public interface KeyedStateHandle extends CompositeStateHandle {

/** Returns the range of the key groups contained in the state. */
KeyGroupRange getKeyGroupRange();
/**
* Returns a state over a range that is the intersection between this handle's key-group range
* and the provided key-group range.
*
* @param keyGroupRange The key group range to intersect with, will return null if the
*     intersection of this handle's key-group and the provided key-group is empty.
*/
@Nullable
KeyedStateHandle getIntersection(KeyGroupRange keyGroupRange);
/**
* Returns a unique state handle id to distinguish with other keyed state handles.

相关新闻

  • XCPC 竞赛 Ubuntu 环境 DOMjudge Server 完整配置指南
  • Python迭代器_高级
  • 字符编码体系详解:从ASCII到UTF-8的演进与实践

最新新闻

  • 3分钟掌握Web Audio API声音变换:Voice Change-O-Matic终极指南
  • 深入解析MC9S08QG8内部时钟源(ICS)模块:FLL原理、七种工作模式与实战配置
  • 如何永久保存微信聊天记录:3步完成数据备份的完整指南
  • 第36章:PagedAttention Kernel 与 KV Cache 内存布局
  • React Native Map Link测试策略:单元测试与集成测试最佳实践
  • (2026新)烟台正规防水补漏公司口碑榜TOP5权威推荐!卫生间/厨房/阳台/屋顶/天花板/地下室渗漏水检测维修攻略-靠谱漏水检测维修师傅推荐 - 安佳防水

日新闻

  • 信任的进化:技术实现详解——如何用JavaScript构建博弈论模拟器
  • Terrakube自定义工作流:如何集成OPA、Infracost等工具扩展IaC能力
  • grunt-concurrent快速入门:5分钟学会并行运行Grunt任务

周新闻

  • 3步解锁iOS设备:applera1n激活锁绕过完全指南
  • 39 2026 人工智能证书终极盘点,普通人选 AI 证书可以从这些方向入手
  • Redis 暴露公网有多危险?从端口检查到补救步骤

月新闻

  • 【总结】入门篇:50句话让你记住架构核心概念
  • WeChatMsg技术方案解析:实现Mac微信数据自主管理的完整解决方案
  • WeChatMsg:革新性微信数据备份方案,打造你的专属数字记忆库

关于尧图

  • 公司简介
  • 团队介绍
  • 企业文化
  • 荣誉资质

服务项目

  • 定制开发
  • 电商建站
  • UI 设计
  • 运维服务

快速链接

  • 案例展示
  • 建站流程
  • 常见问题
  • 资讯中心

联系方式

  • 📍北京市朝阳区互联网产业园 A 座 10 层
  • 📞400-888-8888
  • ✉️contact@rkmt.cn
  • 🕐周一至周日 9:00-21:00

© 2024 北京尧图网络科技有限公司 版权所有 | 京 ICP 备 XXXXXXXX 号